Authorea

Wen Jenny Shi edited sectionSequential_Cl.tex over 9 years ago

Commit id: 17a56d89c8e30980dfc198e2185b3cc60b780fb2


The first step of our procedure is combining the datasets collected at different times and consolidating the invariant read sites. As a demonstration of this processing step, we provide a toy example for a virus of length 5 (Table \ref{tab:joindata}). The three tables in the first row of Table \ref{tab:joindata} present the read counts obtained at times $t_1,\;t_2,$ and $t_3$; the table in the second row provides the combined data, obtained by merging all the sites of the first row that share the same homogeneous read type. The first few columns of the joint data matrix are the consolidation of the columns with a single read type A, C, G, T, M, respectively. The sites with non-homogeneous reads are copied to the joint data matrix after all the combined invariant sites ($Y_1^{all\;t},Y_2^{all\;t},Y_3^{all\;t}$ in the toy example). In particular, $Y_1^{all\;t}$ in the joint data matrix (second row in Table \ref{tab:joindata}) is formed by merging columns $Y_1^{t_1}$ and $Y_1^{t_2}$. Similarly, $Y_2^{all\;t}$ and $Y_3^{all\;t}$ are formed from the sites with homogeneous reads of $C$ and $T$, respectively:
$$Y_2^{all\;t} = Y_3^{t_1} + Y_3^{t_2},$$
$$Y_3^{all\;t} = Y_2^{t_1} + Y_5^{t_1} + Y_2^{t_2} + Y_5^{t_2} + Y_2^{t_3} + Y_4^{t_3} + Y_5^{t_3}.$$
The following columns in the second row are copies of the sites with non-homogeneous reads:
$$Y_4^{all\;t} = Y_4^{t_1}, \ldots $$
The exact mapping is shown in the third row of the table.
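For illustration only, the consolidation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for our analysis: the function name \texttt{join\_datasets}, the list-of-columns data layout, and the rule that a site is homogeneous when exactly one read type has a non-zero count are all assumptions. The counts in \texttt{toy} are hypothetical and are not those of Table \ref{tab:joindata}; only the pattern of homogeneous sites is chosen to mimic the toy example, with read type A assumed for the first merged column.

```python
READ_TYPES = ("A", "C", "G", "T", "M")  # read-type order taken from the text


def join_datasets(datasets):
    """Merge homogeneous sites by read type and copy the remaining sites.

    datasets: one entry per time point; each entry is a list of site columns,
              and each column is a list of counts ordered as READ_TYPES.
    Returns (columns, mapping), where mapping[k] lists the 0-based
    (time_index, site_index) pairs combined into joint column k.
    """
    merged = {}         # read-type index -> (summed column, source sites)
    heterogeneous = []  # (column copy, [source site]) in encounter order
    for t, sites in enumerate(datasets):
        for s, col in enumerate(sites):
            nonzero = [i for i, c in enumerate(col) if c > 0]
            if len(nonzero) == 1:
                # Assumed rule: exactly one read type observed -> homogeneous.
                r = nonzero[0]
                total, sources = merged.setdefault(r, ([0] * len(col), []))
                total[r] += col[r]
                sources.append((t, s))
            else:
                # Everything else (including empty sites) is copied as is.
                heterogeneous.append((list(col), [(t, s)]))
    # Merged columns first, in READ_TYPES order, then the copied sites.
    combined = [merged[r] for r in sorted(merged)] + heterogeneous
    columns = [col for col, _ in combined]
    mapping = [src for _, src in combined]
    return columns, mapping


# Hypothetical counts for a virus of length 5 observed at three time points.
toy = [
    [[10, 0, 0, 0, 0], [0, 0, 0, 8, 0], [0, 7, 0, 0, 0], [3, 0, 5, 0, 0], [0, 0, 0, 9, 0]],
    [[12, 0, 0, 0, 0], [0, 0, 0, 6, 0], [0, 5, 0, 0, 0], [2, 0, 6, 0, 0], [0, 0, 0, 7, 0]],
    [[9, 1, 0, 0, 0], [0, 0, 0, 5, 0], [0, 4, 0, 2, 0], [0, 0, 0, 4, 0], [0, 0, 0, 8, 0]],
]
columns, mapping = join_datasets(toy)
```

With these hypothetical counts, \texttt{mapping[2]} lists the seven sites with homogeneous reads of $T$ in the same order as the sum defining $Y_3^{all\;t}$, and the fourth joint column is a copy of the fourth site at $t_1$.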