# Main Idea

## Objective

The data matrix is generated as $$\mathbf{Y} = \mathbf{A}_0 \mathbf{X}_0$$, where $$\mathbf{Y} \in \mathbb{R}^{n \times p}$$, and the following conditions are assumed.

1. The dictionary $$\mathbf{A} \in \mathbb{R}^{n \times n}$$ is complete, namely square and invertible.

2. The elements in $$\mathbf{X}_0$$ is independently drawn from Bernoulli-Gaussian (BG) model. $$X_{0{[ij]}} = \Omega_{ij} V_{ij}$$, where $$\Omega_{ij} \sim \text{Ber}(\theta)$$ and $$V_{ij} \sim N(0,1)$$.

The goal is to recover $$\mathbf{A}_0$$ and $$\mathbf{X}_0$$ from $$\mathbf{Y}$$.

The main result is that when $$\theta \in (0, 1/3)$$, $$\mathbf{A}_0$$ and $$\mathbf{X}_0$$ can be exactly recovered with a polynomial time algorithm with high probability, if number of samples $$p$$ is no less than a polynomial of $$(n, 1/\theta, \kappa(\mathbf{A}_0), 1/\mu)$$, where $$\kappa(\mathbf{A}_0)$$ is the conditional number of the dictionary.

## Non-convex optimization formulation

Since $$\mathbf{A_0}$$ is complete, the row space of $$\mathbf{Y}$$ is equivalent to the row space of $$\mathbf{X}_0$$. By the result from (Spielman 2013), under the generative model, rows of $$\mathbf{X}_0$$ corresponds to the $$n$$ sparsest vectors in the row space. Once $$\mathbf{X}_0$$ is recovered, $$\mathbf{A}_0$$ could be solved from least square $$\mathbf{A}_0 = \mathbf{Y} \mathbf{X}_0 (\mathbf{X}_0 \mathbf{X}_0^T)^{-1}$$.

A straightforward formulation for recovering one row of $$\mathbf{X}_0$$ is $\min \| \mathbf{q}^{T} \mathbf{Y} \|_0 \quad s.t.\ \mathbf{q} \neq \mathbf{0}.$ $$\mathbf{X}_0$$ can b