# Contributions

Timeline generation is a tool with incredible potential: it provides a broad and general framework for intelligently summarising heterogeneous data. However, as it stands, all current TLG models fall short in two areas: evaluation and inference. This is the gap in the literature we seek to fill, and the methodologies we develop to address it make up our two primary contributions to the state-of-the-art.

1. We provide a scientifically rigorous framework for evaluating the quality of a timeline. No current approach in the literature is satisfactory. This is concerning, as the evaluative process is in some sense the fundamental pillar upon which our models rest: any conclusion we draw about our models depends on the correctness of our evaluation framework. Yet, we argue below that all current methods fall short in one of several ways. We use these shortcomings to motivate the development of our evaluation pipeline, and argue that our approach balances cost and correctness.

2. We also present a novel method for performing inference on timeline models. No current implementation uses more than a few thousand articles. This is unsatisfactory considering the wealth of information available to us. The primary reason for this is the inference bottleneck: current methods all use some form of Gibbs sampling, and these methods (especially simpler implementations thereof) are known to scale poorly. We seek to improve performance in this area by leveraging an alternative approach: variational Bayesian inference. This family of techniques has been shown to outclass Gibbs methods in scalability while maintaining comparable performance \cite{Grimmer_2010}; Wang et al. \cite{Wang2011}, for example, achieved excellent performance on a dataset of 400,000 articles, an order of magnitude larger than any sampling-based inference on the TLG problem. The catch is that, unlike the generalisability and ease of understanding that characterise sampling-based approaches, applying variational methods to a new domain is an involved, technical endeavour (the sketch after this list illustrates the contrast on a toy model).
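
To make the Gibbs-versus-variational contrast concrete, here is a minimal sketch — not drawn from any TLG implementation — comparing Gibbs sampling with coordinate-ascent variational inference (CAVI) on a toy one-dimensional Gaussian mixture with unit observation variance and component-mean priors of N(0, sigma2). All names and parameter choices (`K`, `sigma2`, the iteration counts) are illustrative assumptions, not values from the literature cited above.

```python
# Toy contrast of the two inference strategies discussed above.
# Gibbs: draw cluster assignments and means from their conditionals.
# CAVI: iterate closed-form updates on a factorised approximation.
import numpy as np

rng = np.random.default_rng(0)
K, sigma2 = 3, 10.0                       # illustrative choices
true_mu = np.array([-4.0, 0.0, 5.0])
x = np.concatenate([rng.normal(m, 1.0, 200) for m in true_mu])

def gibbs(x, iters=200):
    mu = rng.normal(0, 1, K)
    for _ in range(iters):
        # Sample each assignment c_i | mu; one draw per datum per sweep.
        logp = -0.5 * (x[:, None] - mu[None, :]) ** 2
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        c = np.array([rng.choice(K, p=pi) for pi in p])
        # Sample mu_k | c from its conjugate Gaussian conditional.
        for k in range(K):
            nk = (c == k).sum()
            var = 1.0 / (1.0 / sigma2 + nk)
            mu[k] = rng.normal(var * x[c == k].sum(), np.sqrt(var))
    return mu

def cavi(x, iters=50):
    m = rng.normal(0, 1, K)               # variational means of q(mu_k)
    s2 = np.ones(K)                       # variational variances of q(mu_k)
    for _ in range(iters):
        # Update q(c_i): phi_ik proportional to exp(E_q[log N(x_i; mu_k, 1)]),
        # dropping terms constant across k; E_q[mu_k^2] = s2_k + m_k^2.
        logphi = x[:, None] * m[None, :] - 0.5 * (s2 + m ** 2)[None, :]
        phi = np.exp(logphi - logphi.max(axis=1, keepdims=True))
        phi /= phi.sum(axis=1, keepdims=True)
        # Update q(mu_k): deterministic, fully vectorised, no sampling loop.
        denom = 1.0 / sigma2 + phi.sum(axis=0)
        m = (phi * x[:, None]).sum(axis=0) / denom
        s2 = 1.0 / denom
    return m

print("Gibbs mu estimates:", np.sort(gibbs(x)))
print("CAVI  mu estimates:", np.sort(cavi(x)))
```

Note how the CAVI updates reduce to deterministic matrix operations: this is what lets variational schemes exploit batched linear algebra at scale, whereas the per-datum draws inside the Gibbs sweep dominate its cost. The flip side, as argued above, is that the update equations had to be derived by hand for this specific model, while the Gibbs sampler follows mechanically from the conditionals.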