Rannie Teodoro added Data Collection Procedures.tex  over 9 years ago

Commit id: e202d35a491778ef7ab4f61aeb62bf72cfc0cb05


\subsection{Data Collection Procedures}

A total of 2,000 pro-ana disclosures were collected for analysis: 1,000 LiveJournal disclosures and 1,000 Twitter disclosures. For the LiveJournal disclosures, a team of three trained research assistants and one of the authors collected every message posted from January 1, 2012 to December 31, 2012 on the Pro-Ana-Nation LiveJournal thread (N = 29,716). Each team member was assigned, via a random number generator, certain months from which to collect data. The data collected included the initial post, the date and time of the post, and user characteristics (e.g., gender, age). Of the conversations collected (N = 6,070) and their initial messages (N = 6,070), one thousand (N = 1,000) were included in the main dataset eligible for analysis.

For Twitter, we obtained the sample of disclosures in three steps: we (1) determined the most popular pro-ana Twitter hashtags, (2) identified the most active users of these hashtags, and (3) established criteria for including their content in the main collection of tweets. First, we sought to determine the pro-ana Twitter hashtags attracting the most attention and interest. International and national news media coverage, such as that in the Huffington Post (Huffington, 2013), TIME (Hasan, 2012), and Business Insider (Edwards, 2012), provided insight into the most relevant and popular social media hashtags for pro-ana content. This coverage, along with informal observations on Twitter, resulted in the selection of the hashtags ``\#proana'' and ``\#thinspo'' (a colloquial blend of `thin' and `inspiration').

Next, using Twitter Search, we identified the 100 users who posted most frequently with each hashtag and downloaded the content of their timelines. One of the authors filtered the timelines for English-only tweets and messages posted in the year 2012. The data collected included username, the initial post, and the date and time of the post.
``Retweets'' (re-postings of another user's tweet), tweets directed at or in response to other users, and image-only tweets were also excluded from the sample. This resulted in a collection of initial, undirected (no specific person listed as the recipient), and public Twitter disclosures (N = 16,363). One thousand (N = 1,000) tweets were randomly selected for analysis.
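
As a rough check on how selective each sample is, and assuming the counts reported above are the complete pools from which the 1,000 disclosures per platform were drawn, the implied sampling fractions are
\[
\frac{1{,}000}{6{,}070} \approx 0.165 \quad \mathrm{(LiveJournal)},
\qquad
\frac{1{,}000}{16{,}363} \approx 0.061 \quad \mathrm{(Twitter)}.
\]
That is, roughly 16.5\% of the LiveJournal initial messages and roughly 6.1\% of the eligible tweets entered the analysis. These fractions are derived here from the reported counts and are not reported separately.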