
Estimating phylogenies from genomes: A review of commonly used genomic data in phylogenomics
  • Javan Carter,
  • Garth Spellman,
  • Rebecca Kimball,
  • Rebecca Safran,
  • Erik Funk,
  • Drew Schield,
  • Nolan Kane
Javan Carter
University of Colorado at Boulder
Garth Spellman
Denver Museum of Nature & Science
Rebecca Kimball
Department of Biology
Rebecca Safran
University of Colorado Boulder
Erik Funk
University of Colorado
Drew Schield
University of Colorado Boulder
Nolan Kane
University of British Columbia

Abstract

Despite the increasing feasibility of sequencing whole genomes from diverse taxa, a persistent problem in phylogenomics is the selection of appropriate markers or loci for a given taxonomic group or research question. In this review, we aim to streamline the process of selecting data types for phylogenomic studies by introducing commonly used types of genomic data, their characteristics, and their associated uses in phylogenomics. Specifically, we review the uses and features of ultraconserved elements (UCEs; including flanking regions), anchored hybrid enrichment (AHE) loci, conserved non-exonic elements (CNEEs), untranslated regions (UTRs), introns, exons, mitochondrial DNA (mtDNA), single nucleotide polymorphisms (SNPs), and anonymous regions (nonspecific regions that are evenly or randomly distributed across the genome). These data types differ in their mutation rates, their likelihood of being neutral or strongly linked to loci under selection, and their mode of inheritance, each of which is an important consideration in phylogenomic reconstruction. As a result, each genomic region or data type has distinct advantages and disadvantages, depending on the biological question, number of taxa, evolutionary timescale, and analytical methods used. We provide a concise outline (Table 1) as a resource for efficiently comparing the key aspects of each data type. Because there are many factors to consider when designing phylogenomic studies, this review may serve as a primer for weighing the options among multiple potential phylogenomic data types.