Paf Paris edited untitled.tex  about 8 years ago

Commit id: c4f2368763258b33aa2cef02b1d98b6a54a56448


From Michael Stonebraker's \textit{Red Book} \cite{red-book}: data integration consists of the following steps:
\begin{enumerate}
  \item{\textbf{Ingest}:} Locate and capture the data source. Parse whatever data structure is used for storage.
  \item{\textbf{Clean}:} Find and rectify data errors.
  \item{\textbf{Transform}:} Convert values, e.g.\ euros to dollars, airport code to city name, date of birth to age.
  \item{\textbf{Schema integration}:} Match equivalent attributes across sources, e.g.\ wages and salary, likes and hobbies, person and employee.
  \item{\textbf{Entity consolidation} (deduplication):} Find and merge duplicates.
\end{enumerate}
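
The five steps above can be illustrated with a minimal Python sketch. It is not taken from the \textit{Red Book}: the record format (semicolon-separated key=value pairs), the exchange rate, the lookup tables, and all function names are hypothetical and chosen only to mirror the examples in the list. Real systems would need far more robust cleaning and duplicate detection.

```python
from datetime import date

# Hypothetical constants, for illustration only.
EUR_TO_USD = 1.25                                     # assumed fixed rate
AIRPORT_CITY = {"BOS": "Boston", "CDG": "Paris"}      # assumed lookup table
SCHEMA_MAP = {"wages": "salary", "likes": "hobbies"}  # assumed attribute matches


def ingest(raw_line):
    """Ingest: parse one 'key=value;key=value' line into a dict."""
    return dict(field.split("=", 1) for field in raw_line.split(";"))


def clean(record):
    """Clean: lowercase keys and collapse stray whitespace in values."""
    return {k.strip().lower(): " ".join(v.split()) for k, v in record.items()}


def transform(record, today):
    """Transform: EUR to USD, airport code to city, date of birth to age."""
    out = {}
    for key, value in record.items():
        if key == "airport":
            out["city"] = AIRPORT_CITY.get(value, value)
        elif key == "dob":
            born = date.fromisoformat(value)
            # Subtract one year if this year's birthday has not happened yet.
            not_yet = (today.month, today.day) < (born.month, born.day)
            out["age"] = today.year - born.year - not_yet
        elif value.endswith(" EUR"):
            out[key] = round(float(value[:-4]) * EUR_TO_USD, 2)
        elif value.endswith(" USD"):
            out[key] = float(value[:-4])
        else:
            out[key] = value
    return out


def integrate_schema(record):
    """Schema integration: rename attributes to one shared vocabulary."""
    return {SCHEMA_MAP.get(k, k): v for k, v in record.items()}


def consolidate(records):
    """Entity consolidation: merge records with the same lowercased name."""
    merged = {}
    for rec in records:
        entity = merged.setdefault(rec["name"].lower(), {})
        for key, value in rec.items():
            entity.setdefault(key, value)  # first value seen wins
    return list(merged.values())


def integrate(raw_lines, today):
    """Run the five steps in the order listed above."""
    records = []
    for line in raw_lines:
        rec = integrate_schema(transform(clean(ingest(line)), today))
        records.append(rec)
    return consolidate(records)
```

Matching entities on the lowercased name alone is a deliberate simplification; in practice deduplication usually relies on several attributes and approximate matching.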