
Alberto Pepe edited subsectionData_collection_Our_process.tex about 11 years ago

Commit id: 028c06c319d1aeb11612d8aef91fde0ade80651b


\section{Materials}

\subsection{Data collection}

Our process for determining whether a particular arXiv article was mentioned on Twitter consists of three phases: crawling, filtering, and organization. Tweets are acquired via the Streaming API from the Twitter Gardenhose, which represents roughly 10\% of the total tweets from the public timeline through random sampling. We collected tweets whose timestamps range from 2010-10-01 to 2011-04-30, which results in a sample of 1,959,654,862 tweets.
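
To give a rough sense of scale, the sample size can be converted into approximate daily rates. This is a back-of-the-envelope estimate under two assumptions that are not stated in the text: that the collection window covers all 212 days from 2010-10-01 to 2011-04-30 inclusive, and that the Gardenhose sampling rate is exactly 10\% (it is given above only as approximate).
\[
  \frac{1{,}959{,}654{,}862 \text{ tweets}}{212 \text{ days}} \approx 9.24 \times 10^{6} \text{ sampled tweets per day}
\]
\[
  \frac{9.24 \times 10^{6}}{0.10} \approx 9.24 \times 10^{7} \text{ public tweets per day (implied)}
\]
These figures are illustrative values derived from the numbers above, not measurements.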