Xavier Holt edited Inference_Our_inference_method_is__.md  almost 8 years ago

Commit id: aadc5c369a5fb3537e413b22c6b59e1a5ddd93dd


Our inference method is one of our primary contributions to the state of the art.

## Current Approaches

Underlying all of our timeline models is a probabilistic density. In order to do anything of interest, we have to be able to perform inference in this space. The catch is that, unlike with simpler models, the integral representing our density is intractable. In general, the way this is handled in nonparametric Bayesian models can largely be divided into two camps: sampling and expectation-maximisation (EM) approaches.

1. The first approach makes use of the fact that while integrating over the whole measure space is difficult, plugging in values and retrieving an (un-normalised) probability is not. We can exploit this property by performing an intelligent random walk over the surface of the density. The idea is that if we walk for long enough, we'll obtain a reasonable representation of the surface. This is the basis for a family of inference methods called Markov-Chain Monte-Carlo (MCMC) sampling (see the sketch after this list).

2. In broad strokes, the second approach defines a family of densities capable of approximating the generating distribution of any given dataset. An iterative optimisation process then finds the distribution in this family that best matches our data (the objective is written out after this list).
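As a concrete illustration of the first camp, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler. The names `metropolis_hastings` and `log_unnorm` are hypothetical; the target density here is a stand-in for our model's un-normalised log-density, not the TLG model itself.

```python
# Minimal sketch of the random-walk idea behind MCMC sampling
# (Metropolis-Hastings with a symmetric Gaussian proposal).
import numpy as np

def metropolis_hastings(log_unnorm, x0, n_steps=10_000, step_size=0.5, seed=0):
    """Random-walk Metropolis-Hastings over an un-normalised log-density."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_p = log_unnorm(x)
    samples = []
    for _ in range(n_steps):
        # Propose a nearby point and accept it with probability
        # min(1, p(proposal) / p(current)). Only the ratio is needed,
        # so the normalising constant never has to be computed.
        proposal = x + step_size * rng.standard_normal(x.shape)
        log_p_prop = log_unnorm(proposal)
        if np.log(rng.uniform()) < log_p_prop - log_p:
            x, log_p = proposal, log_p_prop
        samples.append(x.copy())
    return np.array(samples)

# Example: sample from an (un-normalised) standard Gaussian in 2D.
draws = metropolis_hastings(lambda x: -0.5 * np.sum(x ** 2), x0=np.zeros(2))
print(draws.mean(axis=0), draws.std(axis=0))  # should be roughly 0 and 1
```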
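For the second camp, the standard variational objective (written here with generic symbols rather than the specific TLG posterior) selects the member of a tractable family $\mathcal{Q}$ that minimises the KL divergence to the true posterior, or equivalently maximises the evidence lower bound (ELBO):

$$
q^{*} = \operatorname*{arg\,min}_{q \in \mathcal{Q}} \mathrm{KL}\!\left( q(\theta) \,\|\, p(\theta \mid x) \right)
      = \operatorname*{arg\,max}_{q \in \mathcal{Q}} \; \mathbb{E}_{q}\!\left[ \log p(x, \theta) \right] - \mathbb{E}_{q}\!\left[ \log q(\theta) \right].
$$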
## Contributions

All current Bayesian TLG formulations use MCMC-based inference methods \cite{Ahmed2011, Hong:2011du, Wang2013, Ahmed:2012vh, Li2013}. They are simple to implement and an intuitive way to explore an unknown density. On the other hand, they tend to scale poorly with both dataset size and dimensionality \cite{wainwright2008graphical, Grimmer_2010}. NLP problems in general involve large amounts of sparse, high-dimensional data. Furthermore, TLG is a summarisation task, and the value of summarisation grows with the size of the underlying data. Because of this, exploring additional inference methods is an important goal for further research.

Variational inference (a type of EM algorithm) has been shown to generate models of similar quality to MCMC methods several orders of magnitude faster \cite{wainwright2008graphical, Grimmer_2010}, yet no implementation of TLG to date uses this form of inference. For the sake of comparison, we can look to the work of Wang et al. and Bryant et al. \cite{Wang2011, Bryant2012}, who analyse variational inference for hierarchical Dirichlet processes, the foundation of our nonparametric topic models. Wang et al. apply variational methods to a dataset of over 400,000 articles \cite{Wang2011}; in contrast, the largest TLG model with MCMC inference to date used a dataset of 10,000 articles \cite{Wang2013}. Variational methods are highly specific to the underlying model, and developing a new variational inference formulation is an involved task. Nevertheless, they have the potential to massively increase the scalability of TLG models, and as such we develop our framework below.

## Inference

Derived update rules: