Research Methods in Timeline Modelling

Several approaches have been used to model timelines and the TLG problem. They can be roughly divided into optimisation-based approaches and probabilistic approaches. Our focus, and the majority of our analysis, is on the latter: probabilistic approaches are the more prevalent of the two and have several highly desirable attributes in terms of post-hoc querying and representation.

Optimisation-Based Approaches

Optimisation-based approaches develop a function that measures sentence importance. This function has to account for relevance and diversity, both in terms of content and temporality. Once it has been formulated, inference occurs through maximisation.

Simple Optimisation

Several approaches formulate TLG as a simple optimisation problem. Althoff et al. and Chieu et al. both define objective functions on the sentence level which include signals of relevance and diversity \cite{Althoff:2015dg, Chieu:2004id}. Numerical methods are used to find the sentences maximising this objective. This approach forgoes the topic and clustering structure present in other methodologies. The advantage is that model construction and timeline generation are relatively simple. The disadvantage is that, without any additional structure on the data, it can be hard to determine a valid and representative objective function. Such structure can also be useful in and of itself, as discussed in the subsequent sections.
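As a rough illustration of this style of inference, the sketch below greedily selects sentences under an objective that trades relevance against redundancy. It is not the objective of Althoff et al. or Chieu et al.: the Jaccard similarity, the `trade_off` parameter, and the function names are all assumptions made for the example, and temporal diversity is left out for brevity.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for any similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_sentences(sentences: List[str], query: str, k: int,
                     trade_off: float = 0.5) -> List[int]:
    """Greedily pick up to k sentence indices.

    Hypothetical objective: at each step, add the sentence maximising
    trade_off * relevance - (1 - trade_off) * redundancy, where redundancy
    is the highest similarity to any sentence already selected.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    query_tokens = set(query.lower().split())
    selected: List[int] = []
    while len(selected) < min(k, len(sentences)):
        best_idx, best_score = -1, float("-inf")
        for i, toks in enumerate(tokens):
            if i in selected:
                continue
            relevance = jaccard(toks, query_tokens)
            redundancy = max((jaccard(toks, tokens[j]) for j in selected),
                             default=0.0)
            score = trade_off * relevance - (1 - trade_off) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected


sentences = [
    "the earthquake struck the city",
    "the earthquake struck the city centre",
    "rescue teams arrived later",
]
chosen = select_sentences(sentences, query="earthquake city", k=2)
```

In this toy input the second sentence is nearly a duplicate of the first, so the redundancy penalty pushes the selection towards the third sentence even though it is less relevant to the query. The difficulty noted above shows up directly: the result depends heavily on how relevance, redundancy, and their weighting are defined.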

PageRank Framework

In a similar manner, some authors have applied a PageRank-style framework to TLG \cite{Yan2011, Yan2011a, Tran:2015ws}. These models also operate by maximising an objective function, but are characterised by an underlying graph structure. Sentences are linked to one another and ranked based on the 'votes' or 'preferences' they receive from their neighbours. Often inter- and intra-document sentence links are represented and weighted differently to improve performance \cite{wan2007manifold}. Temporal dependence is induced through the particulars of graph construction; Yan et al., for example, model this component through intra- and inter-date edges \cite{Yan2011, Yan2011a}. The PageRank approach is more structured than simple optimisation, and this comes at the cost of additional model-building complexity. In return, the graph structure is argued to be an intuitive way of representing the problem \cite{wan2007manifold}.
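The ranking step can be sketched as a weighted PageRank computed by power iteration over a sentence graph. This is a minimal illustration rather than the model of any cited paper: the Jaccard edge weights, the `same_date_weight` and `cross_date_weight` scaling factors, and the function names are assumptions introduced here to show how intra- and inter-date edges might be weighted differently.

```python
from typing import List


def build_graph(sentences: List[str], dates: List[str],
                same_date_weight: float = 1.0,
                cross_date_weight: float = 0.5) -> List[List[float]]:
    """Edge weight = token overlap, scaled by whether the two dates match.

    The two scaling factors are hypothetical; they only illustrate
    treating intra-date and inter-date edges differently.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            union = tokens[i] | tokens[j]
            sim = len(tokens[i] & tokens[j]) / len(union) if union else 0.0
            scale = same_date_weight if dates[i] == dates[j] else cross_date_weight
            weights[i][j] = scale * sim
    return weights


def pagerank(weights: List[List[float]], damping: float = 0.85,
             iterations: int = 200, tol: float = 1e-10) -> List[float]:
    """Weighted PageRank by power iteration; returned scores sum to 1."""
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            rank = 0.0
            for i in range(n):
                if out_sums[i] > 0:
                    # Node i passes its score on in proportion to edge weight.
                    rank += scores[i] * weights[i][j] / out_sums[i]
                else:
                    # A node with no outgoing weight spreads its score uniformly.
                    rank += scores[i] / n
            new_scores.append((1 - damping) / n + damping * rank)
        converged = sum(abs(a - b) for a, b in zip(new_scores, scores)) < tol
        scores = new_scores
        if converged:
            break
    return scores


# Toy graph: sentence 0 is linked to both others, which are not linked to each other.
toy_weights = [[0.0, 1.0, 1.0],
               [1.0, 0.0, 0.0],
               [1.0, 0.0, 0.0]]
toy_scores = pagerank(toy_weights)
```

In the toy graph the first sentence receives 'votes' from both neighbours and so ranks highest. A timeline would then be assembled by taking the top-ranked sentences for each date, a step omitted here.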