Mircea Trifan edited Introduction.tex  about 10 years ago

Commit id: efa3d8c4533dae782724d950fa39f5db403c8bd7


\section{Introduction}

An ordinary user has access to the following types of Twitter streams: trends; phrase search, which returns at most 1500 tweets; user timelines; streams parametrized by keywords or users; and the spritzer stream, which carries 10\% of all tweets. These can be implemented as tab panels in a spreadsheet-like user interface. One dimension of the spreadsheet is given by the trending entity and the other by the keyword category. A single Twitter stream, defined by all keywords in all categories, is then classified into the individual categories (using a signal processing approach). Another tab can hold the co-occurrence matrix. Each cell contains the most tweeted entity. Clicking a cell displays the top entities (the first 20, for example) in a popup, along with the corresponding tweets, possibly on the right side of the screen.

Fuel UX datagrid and Twitter's Bootstrap are used for the user interface. Big data processing can be integrated in M3Data. The underlying database is Apache Accumulo, and the processing could be done as a pipeline with Cascading. Cascading can run on top of Accumulo (or Storm).
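
The spreadsheet grid described earlier in this section (trending entity on one axis, keyword category on the other, most tweeted entities in each cell) could be prototyped roughly as follows. This is only a minimal sketch under several assumptions that do not come from the design itself: the category names and keywords are invented, plain keyword matching stands in for the signal processing classifier, entities are taken to be hashtags and mentions, and the helper names \texttt{classify}, \texttt{build\_grid} and \texttt{top\_entities} are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical keyword categories; the real lists would come from the application.
CATEGORIES = {
    "sports": {"football", "tennis"},
    "tech": {"python", "linux"},
}


def classify(tweet, categories=CATEGORIES):
    """Return the names of all categories with a keyword in the tweet.

    Assumption: plain keyword matching, used here as a stand-in for the
    signal processing classifier mentioned in the text.
    """
    words = set(tweet.lower().split())
    return {name for name, keywords in categories.items() if words & keywords}


def build_grid(tweets, trends, categories=CATEGORIES):
    """Count entity mentions for every (trend, category) cell."""
    grid = defaultdict(Counter)
    for tweet in tweets:
        words = tweet.lower().split()
        # Assumption: an entity is any hashtag or user mention.
        entities = [w for w in words if w.startswith(("#", "@"))]
        for trend in trends:
            if trend.lower() not in words:
                continue
            for category in classify(tweet, categories):
                # Skip the trend itself so the cell shows related entities.
                grid[(trend, category)].update(
                    e for e in entities if e != trend.lower()
                )
    return grid


def top_entities(grid, cell, n=20):
    """Return up to n most tweeted entities of a cell, most frequent first."""
    return [entity for entity, _ in grid[cell].most_common(n)]


tweets = [
    "#worldcup football with @fifa",
    "football fans #worldcup @fifa @uefa",
    "python at #worldcup #pycon",
]
grid = build_grid(tweets, ["#worldcup"])
print(top_entities(grid, ("#worldcup", "sports"), n=1))
```

The first entry of \texttt{top\_entities} is what a cell would display, and the full list of up to 20 entries corresponds to the popup shown on cell click. Splitting on whitespace does not strip punctuation, so a real implementation would need a proper tokenizer.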