Introduction
Accurate sex determination is an integral aspect of conservation
genomics research, particularly when studying parameters such as
relatedness, dispersal, and philopatry. Sexing of individuals used in
conservation genomics studies typically takes place in the field at the
time of collection. However, sex assignments recorded in the field are
not always reliable and there is a wide margin for human error,
particularly for species that do not demonstrate sexual dimorphism or
when researchers are working in difficult conditions. Further, field
records can easily be lost or incorrectly transcribed during trapping
and monitoring. Genetic sex determination is a favourable alternative or
complement to field identification, as it is an objective, highly
standardized, and accurate approach that eliminates the possibility of
upstream sex misidentification confounding genomic studies (Hrovatin &
Kunej, 2017).
While PCR-based sex identification methods have been used for several
decades to identify and amplify sex chromosomes in individual samples
(Akane et al., 1992; Clapcote & Roder, 2005; McFarlane et al., 2013),
such processes can be time-consuming and expensive. In addition, they
require taxon-specific primers that are not always available or
applicable to the target species. With the advent of high-throughput
sequencing (HTS) technology it is now possible to produce
high-resolution genomic data that may allow researchers to determine the
sex of sequenced individuals bioinformatically. For example, single
nucleotide polymorphisms (SNPs) in the genome can often be linked to the
sex chromosomes in model organisms, allowing sex to be determined on a
chromosomal presence-absence basis (Fowler & Buonaccorsi, 2016; Lambert
et al., 2016). For non-model organisms where a well-assembled and
well-annotated reference genome is unavailable, the overall “dosage”
of sequencing reads mapping to the sex chromosomes can be assessed to
determine whether the individual is heterogametic or homogametic (Bover
et al., 2018; Gamble, 2016; Gower et al., 2019; Pečnerová et al., 2017).
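
As a rough illustration of the read-dosage idea described above, the sketch below compares read density on the X chromosome with read density on the autosomes. It assumes an XY system in which males are heterogametic, so a male is expected to show roughly half the X dosage of a female. The function names, the length normalisation, and the cut-off values (0.6 and 0.8) are hypothetical choices made for this example and are not taken from any of the cited studies.

```python
def x_dosage_ratio(x_reads, autosomal_reads, x_length, autosomal_length):
    """Return reads-per-base on the X divided by reads-per-base on the autosomes.

    All four arguments are illustrative inputs: mapped read counts and the
    total reference length (in bases) that those reads were mapped against.
    """
    if x_reads < 0 or autosomal_reads <= 0:
        raise ValueError("read counts must be non-negative, with some autosomal reads")
    if x_length <= 0 or autosomal_length <= 0:
        raise ValueError("reference lengths must be positive")
    x_density = x_reads / x_length
    autosomal_density = autosomal_reads / autosomal_length
    return x_density / autosomal_density


def assign_sex(ratio, male_max=0.6, female_min=0.8):
    """Classify a dosage ratio under an assumed XY system.

    The thresholds are placeholder values; in practice they would need to be
    calibrated against individuals of known sex.
    """
    if ratio <= male_max:
        return "male"          # about one X copy per autosomal pair
    if ratio >= female_min:
        return "female"        # about two X copies per autosomal pair
    return "undetermined"      # ambiguous dosage, left unassigned


if __name__ == "__main__":
    # Toy numbers chosen only to show the calculation.
    ratio = x_dosage_ratio(x_reads=50, autosomal_reads=2000,
                           x_length=100, autosomal_length=2000)
    print(ratio, assign_sex(ratio))
```

Leaving an explicit "undetermined" band between the two thresholds is a deliberate design choice here: samples with low coverage or contamination can produce intermediate ratios, and it is usually safer to flag them than to force an assignment.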
Read-dosage-based approaches to sex determination have so far been applied
only to shotgun sequencing data, where molecules are randomly sampled and
sequenced (Flamingh et al., 2020; Motahari et al., 2013; Skoglund et
al., 2013). However, many conservation programs employ
reduced-representation sequencing approaches (e.g. RADseq), where
sequenced molecules belong to a subset of genomic loci. One commercial
provider of reduced-representation sequencing that is growing in
popularity in the conservation genomics field is Diversity Arrays
Technology (DArT) (Cummins et al., 2019; Ewart et al., 2019; Pazmiño et
al., 2018; Sansaloni et al., 2011; Schultz et al., 2018; van Deventer et
al., 2020). The DArT workflow uses restriction enzymes to reduce genomic
complexity, allowing identification of informative markers that are
subsequently sequenced for all submitted samples (Kilian et al., 2012).
However, despite the growing popularity of DArT for conservation
genomics projects, no simple and widely applicable sex-determination
framework has emerged that can be applied to DArT data. In the present
study we apply a read-dosage sex-determination approach to DArT data
from an Australian rodent, the Greater Stick-Nest Rat (Leporillus
conditor), and demonstrate that, despite being originally designed for
application to shotgun data, this method remains robust when applied to
FASTQ files generated as part of the DArT workflow.