Michael Walker deleted EPI View.md almost 9 years ago

Commit id: d28e64afcd41cf0027997a8b1bef6add5f73e6b3

Epiview
-----

Matt Ritchie, Walter & Eliza Hall Institute ([email protected])

Michael Walker, PhD student, MSB-group, Pathology, MDHS, Uni of Melbourne ([email protected])

Philip Goebel, Amateur Dev, Physiotherapist ([email protected])

John Kavadias, Software Engineer+Architect, Distributed Computing, Machine Learning ([email protected]) ([email protected])

Maia Sauren, ThoughtWorks Australia, Open Knowledge Foundation ([email protected])

The Problem
----------

This team developed a simple, visually effective software package that produces a combined genetic and epigenetic display. It lets the user view many layers of information at once and switch easily between a single region of one gene, many genes, and a whole chromosome.

Many publicly available datasets could feed such visualisation software. Several systems for viewing this kind of data already existed (e.g. Galaxy, UCSC Genome Browser, IGV, SeqMonk), each with its own pros and cons, but none provided a holistic view of the data. Because differences in gene expression between cell types are due to epigenetic control, for example via methylation, there is a particular incentive to visualise and study the epigenetic marks of a cell alongside its genetic profile.

So what controls whether a gene is turned on or off? We know this is influenced by how accessible the gene is to the factors required to turn it on: the more tightly packaged the gene is, the less accessible it is to the factors that bind and activate it, and the less likely it is to be turned on.

We now know that the different packaging of the DNA can be correlated with various marks made to the DNA, called ‘epigenetic marks’. These epigenetic marks can be thought of as punctuation marks in the genome: they allow the cell to interpret how to read the information contained in the DNA sequence.

Thanks to transformational changes in how these marks are measured across the whole genome, biologists are generating a wealth of data that reports not only the DNA sequence but also the abundance of various epigenetic marks and of the different factors bound to the DNA. This landslide of data is highly complex and difficult for bench biologists to interrogate. Visualising the genome, and in particular its epigenetic profile, in a Circos plot lets biologists interrogate the data through an intuitive interface.

The web-based implementation allows the user to select the desired dataset, the chromosomes, and which parts of the genome data to display. BAM files are currently the only supported data file type, but support for more compact types is planned for the future.

Datasets
-------

BAM files from the ENCODE project (http://genome.ucsc.edu/encode/)

Links
-----

Available from the git repository (http://github.com/mritchie/epiview)

Tech stack
------

Epiview is based on the R libraries _RCircos_, which generates the Circos plots, and _Shiny_, which makes the tool operable via the web. _D3.js_ adds interactivity.
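
To make the idea of a circular genome layout concrete, here is a minimal sketch of the two steps such a view relies on: binning read positions into fixed-size windows, and mapping positions on the user-selected chromosomes to angles on a circle. It is written in plain Python for illustration only and is not the Epiview code, which uses R with RCircos and Shiny. The chromosome lengths are toy values, the `(chromosome, position)` tuples stand in for alignments that would be read from a BAM file, and all function names are hypothetical.

```python
from collections import Counter

# Toy chromosome lengths in base pairs (assumed values, not real genome sizes).
CHROM_LENGTHS = {"chr1": 1000, "chr2": 600, "chr3": 400}


def bin_reads(reads, window):
    """Count reads per (chromosome, window index).

    `reads` is an iterable of (chromosome, start position) tuples, a
    stand-in for alignment records taken from a BAM file.
    """
    counts = Counter()
    for chrom, pos in reads:
        counts[(chrom, pos // window)] += 1
    return counts


def circular_layout(selected, lengths, gap_deg=0.0):
    """Give each selected chromosome an angular span proportional to its length.

    Returns {chromosome: (start_angle, end_angle)} in degrees, with
    `gap_deg` degrees of empty space after each chromosome.
    """
    if not selected:
        raise ValueError("select at least one chromosome")
    total = sum(lengths[c] for c in selected)
    usable = 360.0 - gap_deg * len(selected)
    spans = {}
    angle = 0.0
    for chrom in selected:
        width = usable * lengths[chrom] / total
        spans[chrom] = (angle, angle + width)
        angle += width + gap_deg
    return spans


def position_to_angle(chrom, pos, spans, lengths):
    """Map a base-pair position on a chromosome to an angle in degrees."""
    start, end = spans[chrom]
    return start + (end - start) * pos / lengths[chrom]


# Usage: the user selects two of the three chromosomes to display.
reads = [("chr1", 10), ("chr1", 90), ("chr1", 150), ("chr2", 5)]
coverage = bin_reads(reads, window=100)
spans = circular_layout(["chr1", "chr2"], CHROM_LENGTHS)
midpoint = position_to_angle("chr1", 500, spans, CHROM_LENGTHS)
```

Restricting the layout to the selected chromosomes means the remaining ones expand to fill the full circle, which is one plausible way to support zooming from a whole genome down to a few chromosomes.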