
Paf Paris edited untitled.tex about 8 years ago

Commit id: 6851d056cd469fe1e839c87b355d7b36fd1d569a


\item{write transformations to clean up the data}
\item{load the data into the data warehouse}
\end{itemize}
Experience \cite{red-book} has shown that this methodology does not scale. Zachary G. Ives argues in \cite{cidr2015-Ives} that taking a view \textit{at scale} yields many benefits, a point that \cite{ieee-3-googlers} also makes: ``Follow the data. Choose a representation that can use unsupervised learning on unlabeled data, which is so much more plentiful than labeled data. Represent all the data with a nonparametric model rather than trying to summarize it with a parametric model, because with very large data sources, the data holds a lot of detail.''