Twitter Miner (Final Report)



The aim of our project is to collect Twitter streaming data and implement a series of data mining algorithms for three main objectives: event detection, sentiment analysis, and user categorization. By analyzing real-time events, we will survey user reactions and categorize users based on their interests.


Twitter is one of the fastest-growing micro-blogging platforms, with 320 million monthly active users [1]. Every user can report events happening around them. Because of this, Twitter has become a rich source for detecting, monitoring, and analyzing real-time events such as natural disasters, health epidemics, political elections, sports matches, or the release of a new product.

In this work, we aim to detect events from Twitter and then conduct a sentiment analysis based on users’ reactions to each detected event. To detect events, we will apply a method that analyzes the clustering of hashtags or certain keywords. Using the tweets available for a particular event, we will detect the emotions of users and categorize the users based on their common interests.
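As a minimal sketch of how hashtag-based event detection could look, the Python example below compares hashtag frequencies between two consecutive time windows and reports hashtags whose usage jumped. This is only one simple reading of "clustering of hashtags", not necessarily the method used in this project. The function names and the thresholds `min_count` and `min_ratio` are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches a '#' followed by word characters (Unicode-aware in Python 3).
HASHTAG_RE = re.compile(r"#(\w+)")

def hashtag_counts(tweets):
    """Count how many tweets mention each hashtag (case-insensitive)."""
    counts = Counter()
    for text in tweets:
        # Use a set so a hashtag repeated within one tweet counts once.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(text)})
    return counts

def candidate_events(current_window, previous_window, min_count=3, min_ratio=2.0):
    """Return hashtags whose frequency jumped between two time windows.

    min_count and min_ratio are illustrative thresholds, not tuned values.
    """
    now = hashtag_counts(current_window)
    before = hashtag_counts(previous_window)
    bursts = []
    for tag, count in now.items():
        # Add 1 to the baseline so unseen hashtags do not divide by zero.
        if count >= min_count and count / (before[tag] + 1) >= min_ratio:
            bursts.append(tag)
    return sorted(bursts)
```

A hashtag that appears in several tweets of the current window but was rare or absent in the previous one is returned as a candidate event. Everyday hashtags with stable usage are filtered out by the ratio test.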

Progress in Phase I

The main objective of Phase I was to retrieve tweets from the Twitter API. We pursued this goal by distributing different approaches and third-party libraries among the project members, each of whom tried to implement tweet retrieval with the assigned approach or library. Eventually, we managed to retrieve data via the Twitterizer library. We successfully collected tweets posted from the Ankara location, together with the related user information and each user’s last twenty tweets and favorites. We plan to collect this information continuously and store it in XML files.
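To illustrate the planned XML storage, the following Python sketch serializes a list of collected tweets with the standard library's `xml.etree.ElementTree`. The element names (`tweets`, `tweet`, `user`, `location`, `text`) and the dictionary keys are hypothetical. The report does not specify the actual file layout, and the project itself retrieves data through Twitterizer rather than Python.

```python
import xml.etree.ElementTree as ET

def tweets_to_xml(tweets):
    """Build an XML tree from a list of tweet dictionaries.

    Each dictionary is assumed to have the keys 'id', 'user',
    'location' and 'text' (a hypothetical schema).
    """
    root = ET.Element("tweets")
    for tweet in tweets:
        node = ET.SubElement(root, "tweet", id=str(tweet["id"]))
        ET.SubElement(node, "user").text = tweet["user"]
        ET.SubElement(node, "location").text = tweet["location"]
        ET.SubElement(node, "text").text = tweet["text"]
    return ET.ElementTree(root)

def save_tweets(tweets, path):
    """Write the tweets to an XML file encoded as UTF-8."""
    tweets_to_xml(tweets).write(path, encoding="utf-8", xml_declaration=True)
```

Writing the file as UTF-8 keeps non-ASCII characters in the tweet text, including emoji, intact for later processing.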

For sentiment analysis, the text content of tweets will be processed and emoji characters will be extracted. We will parse the Unicode representation of each emoji character and categorize it as a positive or negative reaction.
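The emoji step can be sketched in Python as follows. The two small emoji sets are an illustrative, hand-picked lexicon rather than the mapping used in this project, and the "neutral" label for tweets with no emoji or a tied score is an added assumption. The sketch handles only single-code-point emoji. Sequences built from several code points, such as skin-tone variants, would need extra handling.

```python
# Hypothetical lexicon: a few emoji code points per polarity.
POSITIVE = {"\U0001F600", "\U0001F60D", "\U0001F44D"}  # grinning, heart eyes, thumbs up
NEGATIVE = {"\U0001F620", "\U0001F622", "\U0001F44E"}  # angry, crying, thumbs down

def extract_emoji(text):
    """Return the known emoji characters found in the text, in order."""
    return [ch for ch in text if ch in POSITIVE or ch in NEGATIVE]

def emoji_sentiment(text):
    """Classify a tweet as 'positive', 'negative' or 'neutral' from its emoji."""
    score = 0
    for ch in extract_emoji(text):
        # Each positive emoji adds one point, each negative one subtracts one.
        score += 1 if ch in POSITIVE else -1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Iterating over a Python string yields Unicode code points, so each emoji in these sets is matched as a single character regardless of how the tweet was encoded on disk.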