Jianguo Xia

The initial motivation for developing MetaboAnalyst was to save myself time. I started my PhD with Dr. David Wishart at the University of Alberta. During that period, the main focus of the lab was, of course, the Human Metabolome Database (HMDB). The development of a metabolomics core facility was also proceeding at full speed. As part of my PhD training, I was involved in a metabolomics study on urine samples from cancer cachexia patients. At that time, the only bioinformatics tool for metabolomics data analysis was a commercial software package, SIMCA-P (Umetrics). We purchased a copy of the tool, which came with a comprehensive manual. Although I could perform some “standard” data analysis to produce the numbers and graphics seen in many metabolomics publications, I soon realized its limitations: many approaches I wanted to try were not supported. I then experimented with Weka (https://www.cs.waikato.ac.nz/ml/weka/), a widely used Java-based machine learning tool, for classification and regression analysis. However, it lacked many features specifically needed for metabolomics data analysis. In the end, I taught myself R to perform data analysis. This worked well for a short time: I analyzed the data the way I wanted, generated impressive graphics, and produced analysis reports using Sweave and LaTeX. However, the process soon became less enjoyable when more collaborators requested that their data be analyzed in a similar fashion. A better way would be to let someone else in the lab do it. The best way would be to let researchers analyze their own data, since most of them are highly educated and understand the basic principles behind most analysis methods. At that time, I was the only one in the lab who knew R and statistics, so how could I enable other people with some basic knowledge to perform the same analysis I would do? In 2008, I started thinking seriously about developing a biologist-friendly tool for metabolomics data analysis. 
One of the advantages of being last in the “omics” race is the benefit of hindsight. Many of the approaches developed in other omics fields are not domain-specific and can be adapted for metabolomics. For instance, the GenePattern tool suite \citep{Reich_2006} developed by the Broad Institute gave me a lot of inspiration. Other important considerations were that the tool should be web-based, respond in real time, and be implemented in the languages I knew (Perl, Java and R). During a lab meeting in the summer of 2008, I proposed this idea to David. He was a bit uncertain, as he knew that I had no formal training in developing web-based applications (note: I obtained my MSc in Immunology after I graduated from a 5-year Medicine program). I was very enthusiastic and said I could get this done by the end of the year. He smiled and encouraged me to pursue this direction. As most analysis methods and graphics were already implemented in R, the key challenge was to put these functions on the web through a user-friendly interface. I wanted to use a technology that would not expire soon. The Perl CGI based web framework was losing ground at that time. Java had a lot to offer in terms of web frameworks. However, many of them were too “heavy” for me to learn in a short time. Eventually, I chose the then relatively new JavaServer Faces (JSF) technology. The next technical challenge was how to communicate efficiently between R and Java while dealing with concurrency (i.e. supporting multiple users performing data analysis at the same time). Rserve (https://www.rforge.net/Rserve), developed by Simon Urbanek, came to my rescue. I spent around three months completing the first prototype, which captured all the steps I would perform for metabolomics data analysis. The web interface was designed to be quite “conversational” and acted as a playground in which users could freely explore many useful statistical analysis methods once their data had passed certain sanity checks, processing and normalization. 
MetaboAnalyst (version 1.0) was published in 2009 in Nucleic Acids Research \citep{Xia_2009}. It enables a researcher with a basic understanding of metabolomics and statistics to perform data analysis and generate a comprehensive analysis report. It was also heavily used by other members of our metabolomics group and saved me a lot of time. My next focus was on functional analysis of metabolomics data. Using the same infrastructure, I developed tools for metabolite set enrichment analysis \citep{Xia_2010}, metabolomic pathway analysis \citep{12235}, as well as time-series data analysis \citep{Xia2011}. They were eventually merged under the umbrella of MetaboAnalyst (version 2.0) for ease of use and convenience of maintenance \citep{Xia_2012}. While I was pursuing my PhD on bioinformatics for metabolomics, the next-generation sequencing revolution was in full swing. In 2012, I received two postdoctoral fellowships, from the Canadian Institutes of Health Research (CIHR) and the Killam Trust, to work on next-generation sequencing in Bob Hancock’s laboratory at the University of British Columbia (UBC). While I was at UBC, MetaboAnalyst saw a steady increase in user traffic, and I felt obligated to maintain it and to keep addressing user requests. For instance, I added a biomarker analysis module to support a variety of common approaches clinicians would like to perform. With growing popularity, there were signs of performance issues: many colleagues experienced very slow responses when they used MetaboAnalyst for teaching a large class. I eventually decided to re-implement the software completely, with a particular focus on addressing the performance bottlenecks in both the Java and R functions. I also switched to the Google Compute Engine (GCE) for hosting the web application. The result is MetaboAnalyst 3.0 \citep{Xia_2015}. The impact of this update turned out to be very significant. 
Google Analytics showed that submitted analysis jobs jumped from 500 to 800 jobs/day to 5000 to 8000 jobs/day, and server downtime was also reduced significantly. We are actively developing MetaboAnalyst 4.0 at the time of writing. The key features will be more transparent and reproducible analysis, better support for untargeted metabolomics, and integration with other omics through advanced statistics and network analysis.

Richard Frankham

The critical event that eventually led to the first of my meta-analysis papers on genetic rescue occurred in February 2007 at a book writing session on the second edition of “Introduction to Conservation Genetics” \citep{Frankham} at Jonathan Ballou’s house in the Washington, D.C. area. Upon reaching the topic of outbreeding depression (where crossing populations has harmful effects on the fitness of the progeny), we both expressed serious disquiet that the risks of outbreeding depression were being overplayed, while the potential fitness benefits of crossing (genetic rescue) were largely being ignored. One of us said “we must be able to predict the risk of outbreeding depression”. A few days later inspiration struck and we had the key to doing this: harmful effects on fitness of crossing populations typically arise when the crossed populations have fixed chromosomal differences, and/or are adapted to different environments. We subsequently recruited Katherine Ralls, Mark Eldridge, Michele Dudash, Charles Fenster and Robert Lacy and jointly transformed this insight into a paper that was published in Conservation Biology \cite{FRANKHAM_2011}. That work was critical to the ability to use genetic rescue (variously called outcrossing or augmentation of gene flow) as a tool to save small inbred population fragments from extinction, and thereby reduce population and species extinction risks. 
As genetic rescue had been attempted in very few cases, we decided to write a book on “Genetic Management of Fragmented Animal and Plant Populations” in an attempt to create a paradigm shift. Under the new paradigm, the discovery of genetically differentiated populations would be followed, not by the conclusion that separate management of fragments was required, but by asking whether any of the populations were suffering genetic erosion (inbreeding, loss of genetic variation, reduced fitness, reduced ability to evolve and elevated extinction risk) and, if so, whether a genetic rescue attempt was justified. I drafted Chapter 6 on genetic rescue for the book, and then decided that it needed some examples, which I put into a table. At this point, I finally recognized that a fully-fledged meta-analysis was required, as there was no overview of the effects of outcrossing in a conservation context, i.e. when an inbred population fragment with low genetic diversity was crossed to another population and where the risk of outbreeding depression in the resulting progeny was low. The meta-analysis was done without external research funding, as I have been officially retired since 2002 (but am still scientifically active) and do not have grant money for any of the work described here. I am a great fan of meta-analyses: not only can they be done without research funds, but they are typically highly cited, similar to reviews, and are scientifically superior to them. By mining the literature, I found 156 relevant comparisons of inbred parents and their outcrossed progeny, and 145 had beneficial effects on fitness. Only one of the cases where crossing was harmful was a convincing case of outbreeding depression (in a selfing nematode), the others likely being chance observations due to low statistical power. The median fitness benefit from augmenting gene flow was 148% in wild/stressful conditions and 45% in benign/captive ones. 
Consequently, there are huge potential benefits from augmenting gene flow into population fragments suffering from genetic erosion, provided the risk of outbreeding depression in proposed crosses is low. Thus, the two main impediments to genetic rescue attempts have been removed. This paper was published in Molecular Ecology \cite{Frankham_2015} (currently 123 citations in Google Scholar), and was accompanied by a commentary from Donald Waller \cite{Waller_2015}. He praised the paper, but was not convinced about the persistence of the benefits over generations. Consequently, I did further analyses on my database to compare the effects of crossing on fitness in the F1, F2 and F3 generations, and this confirmed that the benefits persisted to an extent that was, if anything, better than expected. This led to the publication of a second genetic rescue meta-analysis paper in Biological Conservation \cite{Frankham_2016}. Writing of our book continued (with Paul Sunnucks being added as another author) and it was submitted to Oxford University Press in December 2016. However, during the subsequent copy editing I realised that the second genetic rescue meta-analysis paper was incomplete, as the persistence of fitness benefits following crossing is expected to depend on the breeding system. Persistence of fitness benefits across generations is expected for outbreeders, but habitual selfing after crossing will lead to loss of benefits, while mixed mating species should experience only partial persistence of fitness benefits. I subsequently extended the analyses of my database from F3 to F13 and found no significant decline in fitness benefits for outbreeding species. Further, \citet{Bijlsma_2010} found no significant change in fitness between F10 and F15 generations in outbreeding Drosophila flies. The updated findings were included in the published version of our “Genetic Management of Fragmented Animal and Plant Populations” book \citep{Frankham_2017}. 
This was followed by a related paper calling for a paradigm shift in the genetic management of fragmented populations \citep{Ralls_2017}.

Backstories

Backstory of "Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi" \cite{Gantz2015}.
Developing gene-drive technology: In late 2014 Valentino Gantz, then a graduate student in my lab at the University of California, San Diego (UCSD), described to me an idea for a CRISPR-based strategy to uncover the effect of recessive mutations in a single generation by creating a self-propagating mutagenic genetic element. In our discussion, it became clear that this element might also propagate itself through mating and be passed on to the next generation at frequencies much higher than predicted by standard "Mendelian" inheritance, a phenomenon often referred to as gene-drive\textsuperscript{1}. Following a series of additional discussions within the lab and with advisory faculty members at UCSD, including Marty Yanofsky (then chair of CDB), Bill McGinnis (Dean of Biological Sciences), Steve Wasserman, and Joseph Vinetz (member of the UCSD Institutional Biosafety Committee), we planned, constructed, and safely tested (in Joe Vinetz's ACL2 insectary) this genetic system in the fruit fly Drosophila melanogaster. We called this method the mutagenic chain reaction (MCR) in analogy to the polymerase chain reaction (PCR). We now refer broadly to the use of self-propagating CRISPR-based genetic elements as "active genetics". We submitted a manuscript for publication in Science on December 31, 2014, describing a simple proof-of-principle experiment in which an MCR element inserted into the Drosophila yellow gene, which is required for body pigmentation, was transmitted to nearly all offspring. This paper, which was published online on March 19, 2015 \cite{Gantz_2015}, attracted considerable media attention and is credited with being the first to experimentally demonstrate a CRISPR-based gene-drive in a multicellular organism with a dedicated germline. 
The study was cited as one of the primary reasons for CRISPR being chosen as the breakthrough of the year in 2015 by Science magazine. This discovery was also ranked 11th overall and 2nd in biology in the 100 top discoveries of 2015 by Discover Magazine.
Applying a gene-drive system to combat malaria: One application of gene-drive technology, under discussion since the mid-1960s \cite{CURTIS_1968}, is to use one of many possible strategies to generate an inheritance bias and spread a desired trait into a target population. For example, a gene-drive element could be used either to kill mosquitoes as a kind of genetic insecticide \cite{Burt_2003}, or to spread a trait throughout a mosquito population that renders them incapable of transmitting malarial parasites \cite{2006}. Anthony (Tony) James at the University of California at Irvine (UCI) is one researcher who has dedicated his career to combatting malaria. Tony was the first person ever to clone and describe a protein-encoding gene from any mosquito \cite{James_1989}, and his lab was the first to develop reliable methods for producing transgenic mosquitoes \cite{Jasinskiene_1998}. His group then engineered sophisticated genetic cassettes producing single chain antibodies that were capable of fully blocking transmission of malarial parasites \cite{Isaacs_2012,Isaacs_2011}. These antibodies bind to and neutralize malarial parasites after a female mosquito feeds on blood (only female mosquitoes bite). James proposed to use a gene-drive approach to spread the immunizing antibody cassettes into mosquito populations, which should render those mosquitoes incapable of transmitting malaria to humans. The final prescient sentence of his most recent paper on that topic, published in 2012, stated: "If coupled with a mechanism for gene spread, antibody-expressing, malaria-resistance transgenes could become a self-sustaining disease control tool" \cite{Isaacs2012}. 
On January 9, 2015, shortly after submission of our paper to Science describing efficient transmission of the MCR element in Drosophila, Valentino and I contacted Tony to ask whether he might be interested in a collaboration in which we hooked up our gene-drive system to his "immunizing" gene cassette. Tony enthusiastically agreed. Based on subsequent discussions, Valentino designed and constructed an immunizing gene-drive element targeting insertion into a gene called kynurenine hydroxylase (kh), which is required for normal eye pigmentation. Valentino sent the completed kh-MCR construct to the James group, who then injected this synthetic DNA into mosquitoes. On July 29, 2015 the James team recovered 2 transgenic individuals carrying the kh-MCR construct (out of approximately 25,000 injected larvae screened). The memorable email that Tony sent me reporting the first results on how the kh-MCR was transmitted to progeny was entitled: "Friar Mendel takes a break". After analyzing the transmission of the kh-MCR element over four generations, we prepared a manuscript entitled "Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi" \cite{Gantz08122015}, which is the subject of this Backstory. This paper, published on November 23, 2015 in the Proceedings of the National Academy of Sciences (PNAS), described the highly efficient transmission of the kh-MCR element to >99% of offspring via male parents. There was significant media coverage of this first demonstration of efficient CRISPR-based gene-drive in mosquitoes, which is still ongoing over two years later. 
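The inheritance bias described above can be illustrated with a simple worked recursion. This is only a minimal sketch under assumptions that are not taken from the paper: random mating, discrete generations, no fitness cost, no resistance alleles, and a single transmission rate $d$ from heterozygous parents of either sex. If $q_t$ denotes the frequency of the drive element in generation $t$, then
\begin{equation}
q_{t+1} = q_t^2 + 2 d \, q_t (1 - q_t).
\end{equation}
With Mendelian inheritance ($d = 1/2$) this reduces to $q_{t+1} = q_t$, so the frequency does not change. With $d = 0.99$ (a hypothetical value chosen to echo the transmission rate reported above) and $q_0 = 0.01$, the recursion gives $q_1 = 0.0001 + 2(0.99)(0.01)(0.99) \approx 0.0197$, so under these assumptions a rare element nearly doubles in frequency each generation.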
Addressing safety, ethical, and societal aspects of gene-drives: Given the potential of gene-drive systems to alter wild populations, Valentino, Tony, and I have devoted significant effort to engaging scientists, social scientists, governmental regulators, and the public to initiate discussion regarding the ethical use of gene-drive systems to benefit society. These efforts have spanned ethics, political science, sociology, and health policy, and have included government regulatory agencies, with the aim of providing transparent and informative explanations of active genetics technology and its applications. For example, we co-authored an opinion piece in Science with 23 other colleagues suggesting precautionary measures for safe use of gene-drive systems in the laboratory \cite{Akbari_2016} and organized a UCSD/JCVI workshop on regulation of gene-drives \cite{Adelman_2017}. We also provided briefs to the NAS and federal risk assessment agencies (e.g., an Obama administration risk assessment panel comprising members of the FBI, CIA and Department of Homeland Security, as well as presentations to the JASON study group) and gave a series of presentations to university biosafety (e.g., IBC) and bioethics groups. We continue to participate in forums for scientific and community engagement and contribute actively to ongoing discussions on these important topics.
A future vision for active genetics: In January 2016, Valentino Gantz and I published our vision for the future of active genetics in a BioEssays review \cite{BIES:BIES201500102}. 
In this review, we outlined applications of active genetics to gene-drive systems, auxiliary updating drive elements such as CHACRs, reversal drives such as ERACRs and eCHACRs, bi-partite trans-complementing drives, split-drive CopyCat site-directed transgenesis vectors, potential uses for human cell therapy applications, and a summary of the ethical and safety considerations associated with gene-drive technology.
Acknowledgements: The author thanks Anthony (Tony) James and Valentino Gantz for helpful comments on this Backstory and for the joy of our shared adventure.
\textsuperscript{1} Early geneticists referred to this phenomenon as ‘meiotic drive’.