## Contributions to the State of the Art: Inference

### State of the Art

The result of the models above is a probability density. The catch is that, unlike with simpler models, the integral representing our density is intractable. One solution makes use of the fact that while integrating over the whole measure space is difficult, plugging in values and retrieving an (un-normalised) probability is not. We can exploit this property by performing an intelligent random walk over the surface of the density. The idea is that if we walk for long enough, we will obtain a reasonable representation of the surface. This is the basis for a family of inference methods called Markov-Chain Monte-Carlo (MCMC) sampling.
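To make the random-walk idea concrete, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler, the simplest member of the MCMC family. The one-dimensional bimodal toy density, proposal width and step count are illustrative assumptions and not part of any TLG formulation; the point is only that the sampler evaluates the un-normalised density, so the intractable normalising constant cancels in the acceptance ratio.

```python
import numpy as np

def unnormalised_log_density(x):
    """Toy un-normalised log-density: a mixture of two Gaussian bumps.
    In a TLG model this would be the joint density over data and latent
    topic/timeline assignments, known only up to a constant."""
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def metropolis_hastings(log_density, n_steps=10_000, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings: propose a nearby point and accept it
    with probability min(1, p(new)/p(old)). Only density ratios are used,
    so the normalising constant is never needed."""
    rng = np.random.default_rng(seed)
    x = 0.0                                       # arbitrary starting point
    samples = []
    for _ in range(n_steps):
        proposal = x + step_size * rng.normal()   # random-walk proposal
        log_accept = log_density(proposal) - log_density(x)
        if np.log(rng.uniform()) < log_accept:    # accept or reject the move
            x = proposal
        samples.append(x)
    return np.array(samples)

samples = metropolis_hastings(unnormalised_log_density)
print("posterior mean estimate:", samples.mean())
```

The price of this simplicity is that consecutive samples are highly correlated, so many steps are needed before the walk covers the surface; this is one source of the scaling problems discussed below.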
All current Bayesian TLG formulations use MCMC-based inference methods \cite{Ahmed2011, Hong:2011du, Wang2013, Ahmed:2012vh, Li2013}. They are simple to implement and offer an intuitive way of exploring an unknown density. On the other hand, they tend to scale poorly with both data-set size and dimensionality \cite{wainwright2008graphical, Grimmer_2010}. NLP tasks in general involve large amounts of sparse, high-dimensional data. Furthermore, TLG is a summarisation task, and the value of summarisation grows with the size of the underlying data. Because of this, exploring additional inference methods is an important goal for further research.

### Contributions

Inference in nonparametric Bayesian formulations can largely be divided into sampling and expectation-maximisation (EM) like approaches. The former has been applied to TLG, but as yet no attempt has been made to apply the latter. The work of Wang et al. \cite{Wang2011} and Bryant et al. \cite{Bryant2012} on variational inference is a step in this direction. They develop a variational framework for the hierarchical Dirichlet process, a fundamental part of all nonparametric TLG formulations. As such, we seek to build on their work and apply it specifically to the TLG case. Our goal is motivated by the excellent performance of variational inference, both in general \cite{Grimmer_2010} and on this class of model in particular: Wang et al. \cite{Wang2011} report excellent performance on a dataset of 400,000 articles, an order of magnitude larger than any sampling-based inference on the TLG problem.

One alternative to sampling-based methods is variational inference. In broad strokes, this method defines a family of densities capable of approximating the generating distribution of any given dataset. We then perform an iterative optimisation process to find the distribution in this family that best matches our data; a toy sketch of this optimisation is given at the end of this section. Variational inference has been shown to generate models of similar quality to MCMC methods several orders of magnitude faster \cite{wainwright2008graphical, Grimmer_2010}.

No implementations of TLG to date use this form of inference. Variational methods are highly specific to the underlying model, and developing a new variational inference formulation is an involved task. Nevertheless, they could provide a large increase in scalability. For the sake of comparison, consider the work of Wang et al. and Bryant et al. \cite{Wang2011, Bryant2012}, who analyse the performance of variational inference methods for hierarchical Dirichlet processes. This process is the foundation of our nonparametric topic models, justifying the comparison. Wang et al. employ variational methods on datasets of over 400,000 articles \cite{Wang2011}. In contrast, the largest TLG model with MCMC inference to date has used a dataset of only 10,000 articles \cite{Wang2013}.
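To make the iterative optimisation referred to above concrete, the sketch below fits a single Gaussian variational family to the same toy un-normalised density used in the sampling sketch, by maximising a Monte-Carlo estimate of the evidence lower bound (ELBO). The Gaussian family, fixed noise sample and off-the-shelf optimiser are illustrative assumptions only; the variational formulation we propose for TLG would instead be built around the hierarchical Dirichlet process, following \cite{Wang2011, Bryant2012}.

```python
import numpy as np
from scipy.optimize import minimize

def unnormalised_log_density(x):
    """Same toy target as in the sampling sketch: known only up to a constant."""
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

# Variational family: Gaussians q(x) = N(mu, sigma^2), parameterised by (mu, log_sigma).
rng = np.random.default_rng(0)
eps = rng.normal(size=2000)   # fixed noise, so the ELBO estimate is a smooth function of the parameters

def negative_elbo(params):
    """Monte-Carlo estimate of -ELBO, where ELBO = E_q[log p~(x)] + entropy(q).
    Maximising the ELBO minimises the KL divergence from q to the target."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    x = mu + sigma * eps                              # reparameterised samples from q
    expected_log_p = unnormalised_log_density(x).mean()
    entropy = 0.5 * np.log(2 * np.pi * np.e) + log_sigma   # entropy of a Gaussian
    return -(expected_log_p + entropy)

# Iterative optimisation over the family's parameters.
result = minimize(negative_elbo, x0=np.array([0.5, 0.0]), method="Nelder-Mead")
mu_opt, log_sigma_opt = result.x
print(f"best Gaussian in the family: mean={mu_opt:.2f}, sd={np.exp(log_sigma_opt):.2f}")
```

Replacing the toy optimiser with stochastic, minibatch parameter updates is what allows online variational schemes such as Wang et al.'s to reach the 400,000-article scale mentioned above \cite{Wang2011}.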