Big Data Digital Signal Processing on Social Networks Graphs

Abstract

Twitter-related problems:

  1. Named Entity Recognition (NER) (Li 2012), (Caputo 2009)

  2. Relation Extraction (Wang 2011)

  3. Classification (Jain 2014)

  4. Event Detection (Dong 2014), (Nurwidyantoro 2013), (Pohl 2013), (Gao 2013)

  5. Event Tracking (Pohl 2013)

  6. Geo search and visualization (Gao 2013)

  7. Recommender (Arru 2013), (Costa 2010)

  8. Trend Mining (Desmier 2013)

  9. Diffusion of topics (Guille 2013), (Altshuler 2012), (Choudhury 2010)

  10. Prediction (Guille 2013), (Symeonidis 2013), (Altshuler 2012), (Sizov 2010), (Choudhury 2010)

  11. Emergence (Miller 2013), (Jain 2014)

Theoretical aspects:

  1. Attributed Graph Model (Miller 2013), (Kim 2010)

  2. Residual Analysis of Attributed Graphs (Miller 2013)

  3. Sub-graph matching (Miller 2013), (Kriege 2012)

  4. Diffusion wavelets transform (Wang 2009), (Jain 2014)

  5. Detection Theory (Miller 2013)

  6. Complex networks

  7. Spectral analysis of graphs

  8. Signal Processing

  9. Predictive Analytics

  10. Tensors (Miller 2013)

Problems addressed in the framework of DSP on graphs:

  1. Mathematical model of Twitter as a dynamic attributed graph with streams attached to vertices.

  2. Subgraph matching.

  3. Twitter diffusion of topics or hash terms.

  4. Tweet classification.

  5. Querying a corpus of dependency parses of sentences viewed as graphs (Miller 2012).

  6. Integrated search of documents, multimedia archives and geographic data.

  7. Detection Theory on Twitter graphs.

  8. Recommender Systems (Arru 2013), combined with NER (Li 2012).

  9. Event Detection. (Dong 2014)

  10. Event Prediction

  11. Event Summarization

  12. Event Association

  13. Early Warning System

  14. Big Data Implementation.
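
Item 1 of the list above (Twitter as a dynamic attributed graph with streams attached to vertices) can be illustrated with a minimal data-structure sketch. This is only one possible reading of the model: the class names `Vertex` and `DynamicAttributedGraph`, the method names, and the choice to timestamp every edge event are assumptions made for illustration and do not come from the cited works.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A user vertex: static attributes plus an attached tweet stream."""
    attributes: dict = field(default_factory=dict)
    stream: list = field(default_factory=list)  # (timestamp, text) pairs


class DynamicAttributedGraph:
    """Hypothetical container: attributed vertices, timestamped attributed edges."""

    def __init__(self):
        self.vertices = {}
        # (src, dst) -> list of (timestamp, attributes) events on that edge
        self.edges = defaultdict(list)

    def add_vertex(self, name, **attributes):
        self.vertices[name] = Vertex(attributes=dict(attributes))

    def post(self, name, timestamp, text):
        """Append one item to the stream attached to a vertex."""
        self.vertices[name].stream.append((timestamp, text))

    def add_edge(self, src, dst, timestamp, **attributes):
        """Record an interaction (follow, retweet, mention, ...) at a given time."""
        self.edges[(src, dst)].append((timestamp, dict(attributes)))

    def snapshot(self, t):
        """Edges whose first recorded event happened at or before time t."""
        return sorted(
            edge
            for edge, events in self.edges.items()
            if min(ts for ts, _ in events) <= t
        )
```

Taking snapshots at successive times gives the sequence of static graphs on which the subgraph matching and detection steps listed above would operate.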

Big Data Application Layer for Graphs:

  1. Search/Query

    1. Graph Analytics

    2. PageRank

    3. Subgraph Detection

    4. Belief Propagation

    5. Clustering/Classification
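
As a small illustration of the PageRank entry in the list above, the following is a textbook power-iteration sketch in plain Python. It is not a big-data implementation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the uniform redistribution of rank from vertices without outgoing links are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {vertex: [successor, ...]}."""
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Toy follower graph: an edge u -> v means "u points to v".
example = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(example)
```

In this toy graph, vertex `d` has no incoming links and therefore ends up with the lowest rank, while `a`, which is pointed to by both `c` and `d`, ends up with the highest.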

Introduction

Multiuser selection

Remove high frequency words as in Lucene

Adapteva or GPU/MATLAB (http://www.mathworks.com/discovery/gpu-signal-processing.html). Use data from SEMEVAL 2014 for sentence semantic relatedness. Treat dependency-parsing-based links as Walsh codes, and capture the relation between words as a vector (word2vec). Unified search over RDF graphs and unstructured text. Use an iconic environment (M3Data or Simulink). Study the Deeplearning4j Twitter application. To draw dependency graphs: DependenSee, a dependency parse visualisation tool by Awais Athar that makes pictures of Stanford Dependency output (http://nlp.stanford.edu/software/lex-parser.shtml#Sample). Form a document signal by concatenating the signals associated with its sentences. The source of a dependency graph link is encoded by a Walsh code, and the destination by the code obtained by a rotation of -90 degrees. Treat Question-Answering as a decoding-encoding problem, or as filter(docs)/Fourier(search). Apply to multimedia annotation or unified searching, encryption and watermarking.
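
The Walsh-code encoding of dependency links described above can be sketched as follows. This is one possible reading of the note, not a settled design: it assumes that each word is assigned an integer index into a Walsh code table, that the "rotation of -90 degrees" means multiplication by -j in the complex plane, and that a sentence signal is the concatenation of its link signals. The function names (`hadamard`, `encode_link`, `encode_sentence`, `decode_link`) are hypothetical. The codes are the rows of a Sylvester Hadamard matrix, so they appear in natural rather than sequency order.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; rows serve as Walsh codes."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def encode_link(codes, src_id, dst_id):
    """Source word on the real axis, destination word rotated by -90 degrees."""
    source = [complex(c) for c in codes[src_id]]
    destination = [-1j * c for c in codes[dst_id]]  # assumption: rotation = times -j
    return [s + d for s, d in zip(source, destination)]


def encode_sentence(codes, links):
    """Concatenate the signals of all dependency links of one sentence."""
    signal = []
    for src_id, dst_id in links:
        signal.extend(encode_link(codes, src_id, dst_id))
    return signal


def decode_link(codes, chunk):
    """Recover (src_id, dst_id) by correlating a chunk with every Walsh code."""
    n = len(chunk)
    corr = [sum(c * x for c, x in zip(code, chunk)) / n for code in codes]
    src_id = max(range(n), key=lambda k: corr[k].real)
    dst_id = max(range(n), key=lambda k: -corr[k].imag)
    return src_id, dst_id


codes = hadamard(8)                 # supports a toy vocabulary of 8 word ids
links = [(1, 3), (3, 5)]            # hypothetical dependency links (head, dependent)
sentence_signal = encode_sentence(codes, links)
document_signal = sentence_signal + encode_sentence(codes, [(2, 7)])
```

Because the Walsh codes are mutually orthogonal, correlating each chunk of the signal with the code table recovers the source index from the real part and the destination index from the imaginary part, which is the sense in which a query can be viewed as a decoding step.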

Jive search with relations represented as database tables in M3Data (Campbell 2013), community detection, leadership... implemented as M3Data big data (AROM) by Apache Crunch on top of Spark, with a collaborative interface by NoFlo.

Represent a topic graph (hash terms + NER) as in (Sizov 2010), then apply detection theory (Miller 2013) to identify emergence.

There are four types of Twitter streams that an ordinary user has access to: trends, search phrase, which returns at most 1500 tweets, user timeline, stre