Wen Jenny Shi edited sectionIntroduction_.tex  over 9 years ago

Commit id: cfaffd0ce2036419a020f71f3d280252f2a2e8eb


This need for new annotation-free analytical tools has been amplified by the wealth of new viral sequence data made possible by recent advances in sequencing technology \citep{Jabara2011}. Increasingly, populations of thousands of viruses are sampled and sequenced from an infected individual; this approach captures a snapshot of the viral genetic variation within that individual. Several studies have combined this approach with traditional passage experiments or with sampling during the course of an infection \citep{Eriksson2008, Kuroda2010, Leitner1993, Wright2010}. This powerful combination reveals how a population of viruses responds, at the genomic level, to evolutionary pressure. With the ever-decreasing cost of sequencing, such studies are expected to become commonplace.

Our motivating dataset is a time series of influenza A H1N1 samples, collected over multiple passages in the presence and absence of oseltamivir, a neuraminidase inhibitor, with two biological replicates in total. We are interested in identifying the genomic regions of the virus that are affected by the inhibitor. The influenza A virus (IVA) was first adapted from chicken egg to Madin-Darby canine kidney (MDCK) cells for three passages. The samples were then serially passaged in MDCK cells in either the absence or the presence of oseltamivir in replicated experiments (Figure \ref{fig:flow_flu1}). At the end of each passage, whole-genome high-throughput sequencing data were collected. In the figure, the size of each oval roughly corresponds to the average total read count per site.