
Michael Walker edited Gene Machine.tex over 8 years ago

Commit id: d80b0570d55ef7b037f23e9ded58978bfe9fcd8c


\section{Effective visualisation of data}

\subsection{Team Members}
\begin{itemize}
\item Andy Kitchen (Developer, Silverpond, @auastro)
\item Jono Chang (Developer, Silverpond, @jonochang)
\item Toby Sargeant (Data scientist, Walter and Eliza Hall Institute)
\item Sally Hunter (Biologist, Peter MacCallum Cancer Centre)
\item Maria Doyle (Bioinformatician, Peter MacCallum Cancer Centre)
\item Les Kitchen (Developer, Computing and Information Systems, University of Melbourne)
\end{itemize}

\subsection{The Problem}

Researchers lack tools that let them easily and effectively assess their experimental data. Targeted genetic screening enables the detection of changes (mutations) in selected regions of an individual's DNA sample. New and rapidly evolving technologies are revolutionising this area, enabling a massive scale-up of genetic testing. New tests for medical research are released regularly, and the field moves so quickly that it is often difficult to know how robust a new test is. Samples may also perform differently on these tests: some perform worse than others, either across the entire test or only for certain genomic regions. Knowing which data can be trusted and which are bad or unreliable is critically important. We need a tool that helps medical researchers quickly and easily assess the performance of both tests and samples.

\subsection{The Solution}
After considering various alternatives, we settled on a web-based data-visualisation tool. The experimental data is fetched from the server in JSON format, rendered as SVG, and manipulated interactively using the D3 library. On the server, Flask (Python) handles the data wrangling into JSON.
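
The server-side wrangling step described above can be illustrated with a minimal sketch. It uses only the Python standard library and groups flat per-sample records into a nested structure that a D3 client could draw as one series per sample. The record fields (sample, region, depth) and the function names are assumptions made for illustration; they are not taken from the project's actual schema or code.

```python
import json
from collections import defaultdict

# Hypothetical input: one record per (sample, region) pair.
# Field names and values are illustrative assumptions, not real project data.
RAW_ROWS = [
    {"sample": "S1", "region": "GENE_A:exon1", "depth": 120},
    {"sample": "S1", "region": "GENE_A:exon2", "depth": 15},
    {"sample": "S2", "region": "GENE_A:exon1", "depth": 98},
    {"sample": "S2", "region": "GENE_A:exon2", "depth": 101},
]


def wrangle(rows):
    """Group flat rows by sample so the client can draw one series per sample."""
    by_sample = defaultdict(list)
    for row in rows:
        by_sample[row["sample"]].append(
            {"region": row["region"], "depth": row["depth"]}
        )
    # Sort by sample name so the output order is deterministic.
    return [
        {"sample": name, "regions": regions}
        for name, regions in sorted(by_sample.items())
    ]


def to_json(rows):
    """Serialise the grouped structure to a JSON string for the client."""
    return json.dumps(wrangle(rows))


payload = to_json(RAW_ROWS)
```

In a Flask application, a view function would typically return a payload like this in response to the client's request; the routing is omitted here so that the sketch stays self-contained.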

\textbf{MVP Criteria:}
\begin{enumerate}
\item Visualisation/effective representation of the data
\begin{itemize}
\item Meaningful information can be derived
\item Aesthetically pleasing
\end{itemize}
\item Interactivity
\begin{itemize}
\item Ability to zoom and scroll
\item Ability to sort
\item Ability to filter on specified criteria
\item Ability to select individual samples
\end{itemize}
\end{enumerate}

\subsection{Application/Relevance}

An in-house targeted sequencing dataset was used. Because this data is sensitive, the publicly deployed sample webpage uses dummy data.

\subsection{Links}
\begin{itemize}
\item Our GitHub repository
\item A running example on dummy data (requires JavaScript)
\end{itemize}

\subsection{Tech stack}

\subsection{Future functionality}
\begin{enumerate}
\item Additional, more detailed views for individual genes, with:
\begin{itemize}
\item Individual sample, multi-sample and aggregate view capabilities
\item Genome location
\item Dynamic summary stats, e.g.\ quartiles
\end{itemize}
\item Simultaneously view $\geq 2$ subgroups, e.g.\ cases vs.\ controls
\item Ability to switch between views/data levels
\item Links out to existing tools, e.g.\ IGV, to visualise raw data
\item Export data summaries
\item Export graphics (JPG, PNG, PDF etc.)
\item Save analyses as session/project
\item Expand to other data types
\end{enumerate}