INTRODUCTION

History shows Galileo to be much more than an astronomical hero, though. His clear and careful record keeping and publication style not only let Galileo understand the Solar System, they continue to let _anyone_ understand _how_ Galileo did it. Galileo’s notes directly integrated his DATA (drawings of Jupiter and its moons), key METADATA (timing of each observation, weather, telescope properties), and TEXT (descriptions of methods, analysis, and conclusions). Critically, when Galileo included the information from those notes in _Sidereus Nuncius_, this integration of text, data, and metadata was preserved, as shown in Figure 1. Galileo’s work advanced the “Scientific Revolution,” and his approach to observation and analysis contributed significantly to the shaping of the modern “Scientific Method.”

[The original data (blue curve) has been fit by a model (red curve) consisting of the band structure, superconducting gap, and self-energy.]

A MORE ADVANCED EXAMPLE

We produce an aggregate mood vector \(m_d\) for the set of tweets submitted on a particular date \(d\), denoted \(T_d \subset T\), by simply averaging the mood vectors of the tweets submitted that day, i.e.

\[ m_d = \frac{\sum_{t \in T_d} m_t}{||T_d||} \]

where \(m_t\) denotes the mood vector of an individual tweet \(t\). The time series of aggregated, daily mood vectors \(m_d\) for a particular period of time \([i, i+k]\), denoted \(\theta_{m_d}[i,k]\), is then defined as:

\[ \theta_{m_d}[i,k] = [ m_{i}, m_{i+1}, m_{i+2}, \cdots, m_{i+k} ] \]

A different number of tweets is submitted on any given day. Each entry of \(\theta_{m_d}[i,k]\) is therefore derived from a different sample of \(N_d = ||T_d||\) tweets.
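The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `daily_mood_vectors` and `mood_time_series`, the representation of a tweet as a `(day, mood_vector)` pair, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_mood_vectors(tweets):
    """Average per-tweet mood vectors by day.

    `tweets` is an iterable of (day, mood_vector) pairs (an assumed
    representation); the result maps each day d to its mean vector m_d.
    """
    grouped = defaultdict(list)
    for day, vector in tweets:
        grouped[day].append(vector)
    # m_d = (sum of the mood vectors of the tweets in T_d) / ||T_d||
    return {
        day: [sum(component) / len(vectors) for component in zip(*vectors)]
        for day, vectors in grouped.items()
    }


def mood_time_series(daily, start, k):
    """Return [m_i, m_{i+1}, ..., m_{i+k}] for consecutive days from `start`.

    Raises KeyError if a day in the range has no tweets.
    """
    return [daily[start + timedelta(days=offset)] for offset in range(k + 1)]


# Hypothetical toy data: two-dimensional mood vectors on two days.
tweets = [
    (date(2008, 11, 1), [1.0, 0.0]),
    (date(2008, 11, 1), [0.0, 1.0]),
    (date(2008, 11, 2), [0.5, 0.5]),
]
daily = daily_mood_vectors(tweets)
series = mood_time_series(daily, date(2008, 11, 1), 1)
```

In this toy data the first day's vector is averaged over two tweets and the second day's over one, so the entries of `series` rest on samples of different sizes.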
The probability that the terms extracted from the tweets submitted on any given day match the given number of POMS adjectives \(N_p\) thus varies considerably along the binomial probability mass function:

\[ P(K=n) = \binom{N_p}{||W(T_d)||} \, p^{||W(T_d)||} (1-p)^{N_p-||W(T_d)||} \]

where \(P(K=n)\) represents the probability of obtaining \(n\) POMS term matches, \(||W(T_d)||\) represents the total number of terms extracted from the tweets submitted on day \(d\), and \(N_p\) the total number of POMS mood adjectives. Since the number of tweets per day has increased consistently from Twitter’s inception in 2006 to the present, this leads to systematic changes in the variance of \(\theta_{m_d}[i,k]\) over time. In particular, the variance is larger in the early days of Twitter, when tweets are relatively scarce. As the number of tweets per day increases, the variance of the time series decreases. This effect makes it problematic to compare changes in the mood vectors of \(\theta_{m_d}[i,k]\) over time.
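The dependence of the variance on sample size can be illustrated with a short calculation. The sketch below uses the textbook parametrization of the binomial distribution (`n` independent trials, `k` successes, success probability `p`) rather than the exact symbols in the formula above. It assumes, purely for illustration, that each extracted term matches a POMS adjective independently with the same probability; the value `p = 0.3` and the sample sizes are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p): k matches among n independent terms."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def match_rate_variance(n, p):
    """Exact variance of the match rate K / n, computed from the PMF."""
    mean = sum((k / n) * binomial_pmf(k, n, p) for k in range(n + 1))
    return sum(((k / n) - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))


p = 0.3  # hypothetical per-term match probability
sparse_day = match_rate_variance(10, p)   # few terms, as in Twitter's early days
busy_day = match_rate_variance(200, p)    # many terms, as in later years
```

Under these assumptions the variance of the match rate equals p(1 - p)/n, so `busy_day` is smaller than `sparse_day` by a factor of 20. This is the same shrinking-variance effect that makes early and late portions of the time series hard to compare directly.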