Alberto Pepe deleted file itemize_item_textbfArXiv_downloads_For.tex  about 11 years ago

Commit id: 05e4f0365ca96115ee77c928f6890233b956a3fc


The datasets employed in this study are:
\begin{itemize}
\item \textbf{ArXiv downloads}: For each article in the aforementioned cohort, we retrieved its weekly download numbers from the arXiv logs for the period from October 4, 2010 to May 9, 2011. A total of 2,904,816 downloads were recorded for 4,606 articles.
%
\item \textbf{Twitter mentions}: Our collection of tweets is based on the Gardenhose, a data feed that returns a randomly sampled 10\% of all daily tweets. A Twitter mention of an arXiv article was deemed to have occurred when a tweet contained an explicit or shortened link to an arXiv paper (see the ``Materials'' appendix for more details). Between October 4, 2010 and May 9, 2011 we scanned 1,959,654,862 tweets, in which 4,415 of the 4,606 articles in our cohort were mentioned at least once, i.e., approximately 95\% of the cohort. Such wide coverage of arXiv articles is mostly due to specialized bot accounts that post arXiv submissions daily. The volume of Twitter mentions of arXiv papers was very small compared to the total volume of tweets in this period, with only 5,752 tweets containing mentions of papers in the arXiv corpus. We found that 2,800 of these 5,752 tweets were posted by non-bot accounts. After filtering out all tweets posted by bot accounts, we retained 1,710 of the 4,415 arXiv articles, namely those mentioned on Twitter by non-bot accounts. Whether bot mentions are included or excluded, the distribution of the number of tweets per paper was highly skewed: most papers were mentioned only once, but one paper in the corpus was mentioned as many as 113 times.
\item \textbf{Early citations}: We manually retrieved citation counts from Google Scholar for the 70 most Twitter-mentioned articles in our cohort. Citation counts were retrieved on September 30, 2011 and cover the period since each article's initial submission to arXiv. All 70 articles combined had been cited a total of 431 times at that point. The most cited article in the corpus was cited 62 times, whereas most articles received hardly any citations.
\end{itemize}
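
As a rough arithmetic check, the totals reported above imply the ratios below. These rounded values are derived here from the stated counts and are not figures reported elsewhere in the text. Because the distributions are heavily skewed, the two means should be read as summary indicators only, not as typical per-article values.
\[
\begin{array}{lcl}
\mbox{share of cohort mentioned on Twitter} & = & 4{,}415 / 4{,}606 \approx 0.959 \\
\mbox{share of mentioning tweets from non-bot accounts} & = & 2{,}800 / 5{,}752 \approx 0.487 \\
\mbox{share of mentioned articles retained after bot filtering} & = & 1{,}710 / 4{,}415 \approx 0.387 \\
\mbox{mentioning tweets per scanned tweet} & = & 5{,}752 / 1{,}959{,}654{,}862 \approx 2.9 \times 10^{-6} \\
\mbox{mean downloads per article} & = & 2{,}904{,}816 / 4{,}606 \approx 630.7 \\
\mbox{mean citations per article (top 70)} & = & 431 / 70 \approx 6.2
\end{array}
\]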