Authorea

Xavier Holt edited Contributions_Contributions_to_the_State__.md almost 8 years ago

Commit id: adac1e9fa288ad43c585d082dc19a44ae9b773db

Our evaluation process relies upon the existence of gold-standard timelines. As discussed in the literature review, there is currently no dataset that matches this description. Our final contribution will be to develop such a corpus. We propose to select twenty figures who are central to the US presidential election. The current front-runners Bernie Sanders, Donald Trump, Hillary Clinton, John Kasich and Ted Cruz will be included. Additionally, we will select a number of figures with less news coverage to allow evaluation of models on sparser input. Gold-standard generation will involve presenting crowd workers with a list of news URLs. For each article, they will be asked to indicate whether or not it should be included in a gold-standard timeline.

### Contributions to the State of the Art: Inference

Our inference method is one of our primary contributions to the state of the art. Inference in nonparametric Bayesian formulations can largely be divided into sampling approaches and expectation-maximisation (EM)-like approaches. The former have been applied to TLG, but as yet no attempt has been made to apply the latter. The work of Wang et al. \cite{Wang2011} and Bryant et al. \cite{Bryant2012} on variational inference is a step in this direction. They develop a variational framework for the hierarchical Dirichlet process, a fundamental part of all nonparametric TLG formulations. As such, we seek to build on their work and apply it specifically to the TLG case.

Our goal is motivated by the excellent performance of variational inference, both in general \cite{Grimmer_2010} and for this model specifically: Wang et al. \cite{Wang2011} reported excellent performance on a dataset of 400,000 articles, an order of magnitude larger than any dataset to which sampling-based inference has been applied for the TLG problem. This approach is particularly exciting for a number of reasons. Firstly, the datasets where we would want to apply our model