Paf Paris edited untitled.tex  about 8 years ago

Commit id: 64a0cc4556ebc84256b145c8bffe08e228202c64


\title{Extracts from various articles}

The extracts appear in a rather random order and will be tidied up later.

The problem was initially referred to as Extract-Transform-Load (ETL). The basic methodology was to:
\begin{itemize}
\item{construct a local schema}
\item{write a connector to do the extraction}
\item{write transformations to clean up the data}
\item{load the data into the data warehouse}
\end{itemize}

Zachary G. Ives in \cite{cidr2015-Ives} says that a view \textit{at scale} yields many benefits, and this is evident in \cite{ieee-3-googlers}. ``Follow the data. Choose a representation that can use unsupervised learning on unlabeled data, which is so much more plentiful than labeled data. Represent all the data with a nonparametric model rather than trying to summarize it with a parametric model, because with very large data sources, the data holds a lot of detail.''

Georgia Kapitsaki in \cite{kapitsaki-2015} proposes a technique for extracting context from existing datasets.
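
The four Extract-Transform-Load steps listed earlier can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the cited articles: the sample CSV data, the \texttt{sales} table and its columns, the function names, and the use of an in-memory SQLite database as a stand-in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an external system.
RAW_CSV = """id,name,amount
1, Alice ,10.5
2,Bob,
3, carol,7
"""

# Step 1: construct a local schema (table and columns are assumed).
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS sales ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, amount REAL NOT NULL)"
)


def extract(source_text):
    """Step 2: the connector; here it simply parses CSV text into dicts."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Step 3: clean up the data.

    Strips whitespace, normalizes name capitalization, and drops rows
    that have no amount.
    """
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["id"]), row["name"].strip().title(), float(amount))
        )
    return cleaned


def load(conn, rows):
    """Step 4: load the cleaned rows into the (stand-in) warehouse."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO sales (id, name, amount) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def run_etl(source_text, conn):
    """Run extract, transform, and load, then return the loaded rows."""
    load(conn, transform(extract(source_text)))
    return conn.execute(
        "SELECT id, name, amount FROM sales ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    for record in run_etl(RAW_CSV, warehouse):
        print(record)
```

Keeping each step in its own function mirrors the list: a different source only requires a new \texttt{extract}, and additional cleanup rules only touch \texttt{transform}.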