Critique of Mining Topics in Documents: Standing on the Shoulders of Big Data

Abstract

In this paper I review and evaluate the work of Zhiyuan Chen and Bing Liu of the University of Illinois at Chicago, entitled Mining Topics in Documents: Standing on the Shoulders of Big Data, presented at the 2014 Conference on Knowledge Discovery and Data Mining. After a brief summary of the paper and its findings, I present the authors' background and related previous work, and find that they have pioneered transfer-learning models for topic mining. Subsequently, I give a brief overview of related work and possible improvements, followed by my personal reflections and suggested directions for future work. I finish with a brief conclusion.

Keywords: topic modeling, lifelong learning, transfer learning, AMC, LDA

Summary

In this paper I review and evaluate the work of Zhiyuan Chen and Bing Liu of the University of Illinois at Chicago, entitled Mining Topics in Documents: Standing on the Shoulders of Big Data (MTD), presented at the 2014 Conference on Knowledge Discovery and Data Mining (Chen 2014).

MTD is a methodological paper that aims to improve and extend the field of topic modeling (i.e. automatically discovering and clustering the topics in an existing text snippet, called a document - the typical outcome is a set of topics, each uniquely define