Authorea

Wen Jenny Shi edited sectionSequential_Cl.tex over 9 years ago

Commit id: faa7e9ee09d65f62f61897ceaccd83b70f2efa36


The first step of our procedure is combining the datasets collected at different times and consolidating the invariant read sites. As a demonstration of this processing step, we provide a toy example for a virus of length 5 (Table \ref{tab:joindata}). The three tables in the first row of Table \ref{tab:joindata} present the read counts obtained at times $t_1,\;t_2,$ and $t_3$; the table in the second row provides the combined data, obtained by merging all the sites with the same homogeneous read type. The first few columns of the joint data matrix (second row of Table \ref{tab:joindata}) are the consolidations of columns with a single read type A, C, G, T, M, respectively. The sites with non-homogeneous reads are copied to the joint data matrix after all the combined invariant sites ($Y_1^{all\;t},Y_2^{all\;t},Y_3^{all\;t}$ in the toy example). In particular, $Y_1^{all \; t}$ in the joint data matrix is formed by merging columns $Y_1^{t_1}$ and $Y_1^{t_2}$. Similarly, $Y_2^{all\;t}$ and $Y_3^{all\;t}$ are formed from the sites with homogeneous reads of $C$ and $T$, respectively: $Y_2^{all\;t} = Y_3^{t_1} + Y_3^{t_2},$ $Y_3^{all\;t} = Y_2^{t_1} + Y_5^{t_1} + Y_2^{t_2} + Y_5^{t_2} + Y_2^{t_3} + Y_4^{t_3} + Y_5^{t_3}.$ The following columns in the second row are $Y_4^{all\;t} = Y_4^{t_1}, ... $ The exact mapping is shown in the third row of the table.
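
The consolidation rule used in the toy example can be sketched in general form. The index set $S_k^{t_i}$ below is notation introduced here only for illustration: it denotes the sites at time $t_i$ whose reads are all of the $k$-th homogeneous read type. We also assume that ``merging'' columns means summing them, as the sums in the toy example suggest. Under these assumptions, the $k$-th consolidated column is
\[
Y_k^{all\;t} = \sum_{i} \sum_{j \in S_k^{t_i}} Y_j^{t_i}.
\]
For instance, taking $S_3^{t_1} = \{2,5\}$, $S_3^{t_2} = \{2,5\}$, and $S_3^{t_3} = \{2,4,5\}$ recovers the expression for $Y_3^{all\;t}$ in the toy example, while $S_2^{t_1} = S_2^{t_2} = \{3\}$ and $S_2^{t_3} = \emptyset$ recover $Y_2^{all\;t}$.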

\begin{table}
%\centering
\resizebox{16cm}{!}{
$\begin{array}{cccccc}
\hline

M&0& 0& 1&0 & 1 \\
\hline\\
\end{array}$}
$\Downarrow$
\begin{center}
\resizebox{10cm}{!}{
$\begin{array}{ccccccccc}

\hline\\ \end{array}$} \end{center} \begin{center} \resizebox{8cm}{!}{ $\begin{array}{| c c l |}

\hline
\end{array}$}
\end{center}
\vspace{.2in}
\caption{Toy example of joining and preprocessing three $5\times 5$ data matrices. The first few columns of the joint data matrix (second row) are the consolidations of columns with a single nucleotide read type in the sampled data panels (first row). The remaining columns of the joint data matrix are copies of the sites with non-homogeneous reads in the sampled data panels (first row). The details of the consolidation process are given in the panel in the third row.}
\label{tab:joindata}