NOV 8 - NOV 15 Principal Components Supposing we now have N hashtag point processes. Here we present analysis based on the daily count process. We denote the daily counts for hashtag i as xi(t), where t takes integer values in [tmin, tmax]. In our data, we have the year-long time window of 2012. In the following, we show the daily counts with aligned time axis for a few top-trending hashtags.
MAIN IDEA Objective The data matrix is generated as Y = A₀X₀, where Y ∈ ℝn × p, and the following conditions are assumed. 1. The dictionary A ∈ ℝn × n is complete, namely square and invertible. 2. The elements in X₀ is independently drawn from Bernoulli-Gaussian (BG) model. X0[ij] = ΩijVij, where Ωij ∼ Ber(θ) and Vij ∼ N(0, 1). The goal is to recover A₀ and X₀ from Y. The main result is that when θ ∈ (0, 1/3), A₀ and X₀ can be exactly recovered with a polynomial time algorithm with high probability, if number of samples p is no less than a polynomial of (n, 1/θ, κ(A₀),1/μ), where κ(A₀) is the conditional number of the dictionary.