Alberto Pepe edited subsectionDefinitions_delay_and_time.tex  about 11 years ago

Commit id: b76ac0d6a0df63c58620ea47a7056e7e3d045c9f


\subsection{Definitions: delay and time span.}

Twitter mentions and arXiv downloads may follow particular temporal patterns. For example, for some articles downloads and mentions may take weeks to slowly increase after submission, whereas for other articles downloads may increase very swiftly after submission and wane shortly thereafter. The total number of downloads and mentions is orthogonal to these temporal effects, and could differ in either case.

The two parameters that we use to describe the temporal distributions of arXiv downloads and Twitter mentions are the \emph{delay} and the time \emph{span}, which we define as follows. Let $t_0 \in \mathbb{N}^+$ be the date of submission for article $a_i$. We represent both arXiv downloads and Twitter mentions for article $a_i$ as the time series $T$, the value of which at time $t$ is given by the function $\mathbb{T}(a_i,t) \in \mathbb{N}_0$, that is, the number of downloads or mentions recorded at time $t$, which may be zero. We then define the times of the first download, the peak in downloads, and the last download of article $a_i$ as $\mathbb{T}_{\text{first}}(a_i)$, $\mathbb{T}_{\text{max}}(a_i)$, and $\mathbb{T}_{\text{last}}(a_i)$ respectively:
\begin{equation}
\begin{aligned}
\mathbb{T}_{\text{first}}(a_i) &= \min\{t:\mathbb{T}(a_i,t) > 0\}\\
\mathbb{T}_{\text{max}}(a_i) &= \arg\max_{t}\, \mathbb{T}(a_i,t)\\
\mathbb{T}_{\text{last}}(a_i) &= \max\{t:\mathbb{T}(a_i,t) > 0\}
\end{aligned}
\end{equation}
The delay, $\Theta(a_i)$, and span, $\Delta(a_i)$, of the temporal distribution of arXiv downloads for article $a_i$ are then defined as:
\begin{equation}
\begin{aligned}
\Theta(a_i) &= \mathbb{T}_{\text{max}}(a_i) - t_0 \\
\Delta(a_i) &= \mathbb{T}_{\text{last}}(a_i) - \mathbb{T}_{\text{first}}(a_i)
\end{aligned}
\end{equation}
To distinguish between the delay and span of arXiv downloads and Twitter mentions, we write $\Theta_{\text{ax}}(a_i)$ and $\Delta_{\text{ax}}(a_i)$ for downloads and $\Theta_{\text{tw}}(a_i)$ and $\Delta_{\text{tw}}(a_i)$ for mentions, each defined as above.
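
To make these definitions concrete, consider a purely hypothetical article (the numbers below are illustrative and are not taken from the corpus) submitted on day $t_0 = 10$, with daily download counts $\mathbb{T}(a_i,t)$ of $3, 8, 20, 5, 0, 1$ on days $t = 12, \dots, 17$ and zero on all other days. Taking the delay as the time from submission to the peak, and the span as the time from the first to the last download, as described in the text, we would obtain:
\begin{equation}
\begin{aligned}
\mathbb{T}_{\text{first}}(a_i) &= 12, \quad \mathbb{T}_{\text{max}}(a_i) = 14, \quad \mathbb{T}_{\text{last}}(a_i) = 17,\\
\Theta(a_i) &= 14 - 10 = 4 \text{ days},\\
\Delta(a_i) &= 17 - 12 = 5 \text{ days}.
\end{aligned}
\end{equation}
Note that the zero count on day 16 does not shorten the span in this sketch, since the span depends only on the first and last days with a nonzero count.
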
As shown in Figure 2, the delay is thus measured as the time difference between the date of a preprint submission and the subsequent peak in Twitter mentions (the day on which an article receives the highest volume of related tweets) or arXiv downloads (the day on which it receives the highest volume of downloads). The time span is the temporal ``duration'' of the response, measured as the time between the first and the last Twitter mention or download of the article in question.

To illustrate delay and span, we examine in detail the response dynamics for one article in the corpus in Figure 3. The article was submitted to arXiv on October 14, 2010. Time runs horizontally from left to right. Downloads and Twitter mentions are charted over time (weekly for downloads, daily for mentions). As Figure 3 shows, the Twitter response begins within a day of submission, reaches a peak of nearly 40 daily mentions within several days, and then slowly dies out over the course of the following week. The peak of arXiv downloads, with over 16,000 weekly downloads, occurs a couple of weeks after submission, and the article continues to be downloaded for months. From a \textit{post hoc, ergo propter hoc} point of view, the Twitter response in this case occurs almost immediately before the peak in arXiv downloads, suggesting that social media attention may have led to subsequently higher levels of arXiv downloads.