### Inference in Bayesian TLG Models

The result of the models above is a probability density. The catch is that, unlike with simpler models, the integral representing our density is intractable. One solution exploits the fact that while integrating over the whole measure space is difficult, plugging in values and retrieving an (un-normalised) probability is not. We can take advantage of this by performing an intelligent random walk over the surface of the density: if we walk for long enough, we obtain a reasonable representation of the surface. This is the basis for a family of inference methods called Markov chain Monte Carlo (MCMC) sampling; a minimal sketch of such a sampler is given at the end of this section.

All current Bayesian TLG formulations use MCMC-based inference methods \cite{Ahmed2011, Hong:2011du, Wang2013, Ahmed:2012vh, Li2013}. They are simple to implement and offer an intuitive way to explore an unknown density. On the other hand, they tend to scale poorly with both dataset size and dimensionality \cite{wainwright2008graphical, Grimmer_2010}. NLP data is generally sparse and high-dimensional, and TLG is a summarisation task whose value grows with the size of the underlying data. Exploring additional inference methods is therefore an important goal for further research.

One alternative to sampling-based methods is variational inference. In broad strokes, this method defines a family of densities capable of approximating the generating distribution of a given dataset, then performs an iterative optimisation to find the member of this family that best matches our data. Variational inference has been shown to produce models of similar quality to MCMC methods several orders of magnitude faster \cite{wainwright2008graphical, Grimmer_2010}.

No TLG implementation to date uses this form of inference. Variational methods are highly specific to the underlying model, and deriving a new variational formulation is an involved task. Nevertheless, they could provide a large increase in scalability. For comparison, consider the work of Wang et al. and Bryant et al. \cite{Wang2011, Bryant2012}, who analyse the performance of variational inference for hierarchical Dirichlet processes. This process is the foundation of our nonparametric topic models, which justifies the comparison. Wang et al. apply variational methods to datasets of over 400,000 articles \cite{Wang2011}. In contrast, the largest TLG model with MCMC inference to date used a dataset of only 10,000 articles \cite{Wang2013}.
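To make the random-walk idea concrete, the sketch below shows a generic random-walk Metropolis-Hastings sampler over an unnormalised log-density. It is purely illustrative: the cited TLG systems use samplers tailored to their model structure (typically collapsed Gibbs), and the toy target density here is a hypothetical stand-in. The key point is that only pointwise evaluation of the unnormalised density is needed, because the normalising constant cancels in the acceptance ratio.

```python
import numpy as np

def metropolis_hastings(log_unnorm_density, x0, n_samples=5000, step=0.5, seed=0):
    """Random walk over an unnormalised log-density.

    log_unnorm_density: callable returning log p~(x) for a point x
    x0: starting point (1-D numpy array)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_p = log_unnorm_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + step * rng.standard_normal(x.shape)  # local random step
        log_p_new = log_unnorm_density(proposal)
        # Accept with probability min(1, p~(proposal) / p~(x)); the unknown
        # normalising constant cancels in this ratio.
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x.copy())
    return np.asarray(samples)

# Toy example: an unnormalised 2-D standard Gaussian target.
log_density = lambda x: -0.5 * np.sum(x ** 2)
draws = metropolis_hastings(log_density, x0=np.zeros(2))
print(draws.mean(axis=0), draws.std(axis=0))  # roughly 0 and 1
```

In practice the walk is run for many iterations, early "burn-in" draws are discarded, and the remaining samples are used as the representation of the density; the cost of this per-datum sampling is what drives the scaling concerns raised above.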
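For contrast with sampling, the following sketch illustrates the iterative-optimisation view of variational inference. It fits a mean-field Gaussian approximation by gradient ascent on the ELBO using the reparameterisation trick. This is not the coordinate-ascent scheme derived by Wang et al. or Bryant et al. for hierarchical Dirichlet processes; it is a simplified, model-agnostic sketch, and the target density (and its gradient) are hypothetical.

```python
import numpy as np

def fit_gaussian_vi(grad_log_p, dim, n_iters=2000, n_mc=32, lr=0.05, seed=0):
    """Maximise the ELBO for q(x) = N(mu, diag(sigma^2)) by gradient ascent,
    given the gradient of the unnormalised log-density, grad_log_p."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    log_sigma = np.zeros(dim)
    for _ in range(n_iters):
        eps = rng.standard_normal((n_mc, dim))
        sigma = np.exp(log_sigma)
        x = mu + sigma * eps            # reparameterised samples from q
        g = grad_log_p(x)               # gradient of log p~ at each sample
        grad_mu = g.mean(axis=0)
        grad_log_sigma = (g * eps).mean(axis=0) * sigma + 1.0  # +1 from entropy term
        mu += lr * grad_mu              # ascend the ELBO
        log_sigma += lr * grad_log_sigma
    return mu, np.exp(log_sigma)

# Toy target: an unnormalised standard Gaussian, so grad log p~(x) = -x.
mu, sigma = fit_gaussian_vi(lambda x: -x, dim=2)
print(mu, sigma)  # should approach mu ~ 0, sigma ~ 1
```

The trade-off discussed above shows up directly in this structure: each model requires its own derivation of the objective (or its gradients), but once derived, inference reduces to an optimisation loop that is far cheaper per iteration than sampling the full posterior.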