Xavier Holt added Contributions_Contributions_to_the_State__.md  almost 8 years ago

Commit id: 71e0dd48b5bcf213e95a0cd82b5c923330904fda


## Contributions

### Contributions to the State of the Art: Inference

Our inference method is one of our primary contributions to the state of the art. Inference in nonparametric Bayesian formulations can largely be divided into sampling approaches and expectation-maximisation (EM)-like approaches. The former have been applied to TLG, but as yet no attempt has been made to apply the latter. The work of Wang et al. \cite{Wang2011} and Bryant et al. \cite{Bryant2012} on variational inference is a step in this direction. They develop a variational framework for the hierarchical Dirichlet model, a fundamental part of all nonparametric TLG formulations. We therefore seek to build on their work and apply it specifically to the TLG case. Our goal is motivated by the excellent performance of variational inference, both in general \cite{Grimmer_2010} and for the hierarchical Dirichlet model in particular: Wang et al. \cite{Wang2011} report excellent performance on a dataset of 400,000 articles, an order of magnitude larger than any dataset used for sampling-based inference on the TLG problem.

### Contributions to the State of the Art: Evaluation Methodology

Our second major contribution is in consolidating the timeline evaluation problem. There is currently no standard evaluation method. Evaluation is a difficult task consisting of several different sub-problems: classification, summarisation and selection. As argued above, this makes it unrealistic to rely on purely numeric measures of performance. Crowd voting, on the other hand, provides evaluation of much higher quality but is costly. We therefore propose a framework in which numeric measures are used for exploratory analysis and model comparison. Specifically, we will use the ROUGE family of metrics, as well as perplexity, when first developing our model and during hyperparameter selection. Once we have used these automatic methods to determine our ideal model, we will use a crowd vote to determine overall performance. This will take the form of a binary preference task between a collection of system-generated timelines and one gold-standard timeline. This hybrid approach provides the advantages of both numeric and crowd methods while mitigating their downsides. To our knowledge, this is the first such method of evaluation for the TLG problem.

Our evaluation process relies upon the existence of gold-standard timelines. As discussed in the literature review, there is currently no dataset that matches this description. Our final contribution will be to develop such a corpus. We propose to select twenty figures who are central to the US presidential election. The current front-runners Bernie Sanders, Donald Trump, Hillary Clinton, John Kasich and Ted Cruz will be included. Additionally, we will select a number of figures with less news coverage to allow evaluation of models on sparser input. Gold-standard generation will involve presenting crowd workers with a list of news URLs. They will then be asked to indicate whether or not each article should be included in a gold-standard timeline.
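
To make the automatic stage of the evaluation framework above more concrete, the following is a minimal sketch of ROUGE-N recall, one member of the ROUGE family. It is illustrative only and not the implementation we will use. It assumes lower-cased whitespace tokenisation and a single reference timeline, and the function names `ngrams` and `rouge_n_recall` are hypothetical. Established ROUGE toolkits add steps such as stemming and multi-reference handling that are omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap over the reference n-gram count.

    Assumption: text is tokenised by lower-casing and splitting on whitespace.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        # The reference is shorter than n tokens, so recall is undefined;
        # this sketch returns 0.0 by convention.
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipping).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    # Made-up sentences, used purely to exercise the function.
    system = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n_recall(system, gold, n=1))
    print(rouge_n_recall(system, gold, n=2))
```

In the proposed framework, a score of this kind would be computed between each system-generated timeline and the gold-standard timeline during model development, and the crowd vote would then be reserved for the final comparison.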