Xavier Holt edited Inference_Our_inference_method_is__.md
almost 8 years ago
Commit id: aadc5c369a5fb3537e413b22c6b59e1a5ddd93dd
Our inference method is one of our primary contributions to the state of the art.
## Current Approaches
Underlying all of our timeline models is a probabilistic density. In order to do anything of interest, we have to be able to perform inference in this space. The catch is that, unlike with simpler models, the integral representing our density is intractable.
In general, the way this is handled in nonparametric Bayesian models can largely be divided into two camps: sampling and expectation-maximisation (EM) approaches.
1. The first makes use of the fact that while integrating over the whole measure space is difficult, plugging in values and retrieving an (un-normalised) probability is not. We can exploit this property by performing an intelligent random walk over the surface of the density. The idea is that if we walk for long enough, we will obtain a reasonable representation of the surface. This is the basis for a family of inference methods called Markov-Chain Monte-Carlo (MCMC) sampling.
2. The second, in broad strokes, defines a family of densities capable of approximating the generating distribution of any given dataset. An iterative optimisation process then finds the member of this family that best matches our data.
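The random walk in the first camp can be illustrated with a minimal Metropolis-Hastings sampler. This is a generic sketch, not our model's actual sampler: `log_unnorm` is a hypothetical stand-in for the un-normalised log-density, here just an unnormalised Gaussian so the answer is checkable.

```python
import math
import random

def log_unnorm(x):
    # Hypothetical un-normalised log-density (an unnormalised Gaussian
    # centred at 3.0); only point evaluations are required.
    return -0.5 * (x - 3.0) ** 2

def metropolis(log_p, x0, n_steps, step=1.0):
    """Random-walk Metropolis: propose a local move, accept with
    probability min(1, p(proposal)/p(current)). The intractable
    normalising constant cancels in the ratio."""
    samples, x = [], x0
    for _ in range(n_steps):
        proposal = x + random.gauss(0.0, step)
        if math.log(random.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples.append(x)
    return samples

random.seed(0)
draws = metropolis(log_unnorm, x0=0.0, n_steps=20000)
est_mean = sum(draws[5000:]) / len(draws[5000:])  # discard burn-in
```

Walked long enough, the empirical distribution of the draws approximates the target surface, so the post-burn-in mean lands near the true mean (3.0) without the normalising integral ever being computed.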
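The second camp can be sketched with the textbook EM algorithm for a two-component Gaussian mixture; this is an illustrative toy, not the family used by our models. The mixtures form the approximating family, and each E/M iteration moves to a member that better matches the data:

```python
import math
import random

def em_gmm(data, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture."""
    mu = [min(data), max(data)]  # crude initialisation
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            w = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-0.5 * (x - mu[k]) ** 2 / var[k])
                 for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk + 1e-6
            pi[k] = nk / len(data)
    return mu, var, pi

random.seed(1)
data = ([random.gauss(-2, 0.5) for _ in range(200)]
        + [random.gauss(4, 0.5) for _ in range(200)])
mu, var, pi = em_gmm(data)
```

After a few dozen iterations the fitted means sit near the generating means (-2 and 4) and the mixing weights near 0.5 each, showing the "pick a family, then iteratively optimise" pattern in miniature.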
## Contributions
All current Bayesian TLG formulations use MCMC-based inference methods \cite{Ahmed2011, Hong:2011du, Wang2013, Ahmed:2012vh, Li2013}. These methods are simple to implement and an intuitive way of exploring an unknown density. On the other hand, they tend to scale poorly with both dataset size and dimensionality \cite{wainwright2008graphical, Grimmer_2010}. In contrast, variational inference (a type of EM algorithm) has been shown to produce models of similar quality to MCMC methods several orders of magnitude faster \cite{wainwright2008graphical, Grimmer_2010}.
No implementations of TLG to date use this form of inference. Variational methods are highly specific to the underlying model, and developing a new variational formulation is an involved task. Nevertheless, they have the potential to massively increase the scalability of TLG models. For the sake of comparison, consider the work of Wang et al. \cite{Wang2011} and Bryant et al. \cite{Bryant2012}, who analyse the performance of variational inference for hierarchical Dirichlet processes; this process is the foundation of our nonparametric topic models, justifying the comparison. Wang et al. employ variational methods on datasets of over 400,000 articles \cite{Wang2011}, whereas the largest TLG model with MCMC inference to date used a dataset of only 10,000 articles \cite{Wang2013}. As such, we develop a variational framework for TLG below.
## Variational Inference
Derived update rules: