1.2 Function infer.sex
Purpose: Identify the genetic sex of individuals.
Input: The output of function filter.sex.linked (a list of six elements), a user-specified parameter that declares the sex-determination system of the species (‘zw’ or ‘xy’), and a seed number.
How it works: This function uses the types of loci available in the input (W-linked/Y-linked, Z-linked/X-linked and gametologous loci) to make one preliminary sex assignment for each type of sex-linked locus:
W-linked/Y-linked loci. For a ZW-system, it preliminarily assigns ‘M’ (male) to an individual that has more loci with NA (i.e., missing data) than loci with a called genotype (i.e., ‘0’, ‘1’ or ‘2’), and ‘F’ (female) otherwise. For an XY-system, the assignment is the opposite.
Z-linked/X-linked loci. It uses the matrix of genotypes for all individuals to perform k-means clustering with two centers (using the provided seed number). The rationale is that individuals should form two distinct clusters, one per sex, so each individual is assigned to one of two sex clusters. The individual with the most loci scored as heterozygous is used to identify the sex of its cluster (‘M’ for a ZW-system, and ‘F’ for an XY-system), and the other cluster is identified as the opposite sex.
Gametologs. It follows the same method as for Z-linked/X-linked loci: it performs k-means clustering in which individuals are assigned to one of two sex clusters, and it uses the individual with the most loci scored as heterozygous to identify the sex of its cluster (‘F’ for a ZW-system, and ‘M’ for an XY-system).
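The three preliminary rules above can be illustrated with a minimal sketch. It is not the code of infer.sex: the function names, the use of None for NA, the coding of heterozygotes as 1, and the way the two initial centers are drawn are all assumptions made for illustration, and the clustering step assumes a genotype matrix without missing values.

```python
import random

# ASSUMPTION: genotypes are coded 0/1/2 with 1 = heterozygous, and None = NA.

def w_or_y_linked_sex(genotypes, system):
    """Preliminary sex from W-/Y-linked loci: missing vs called genotypes."""
    missing = sum(g is None for g in genotypes)
    called = len(genotypes) - missing
    mostly_missing = missing > called
    if system == "zw":
        return "M" if mostly_missing else "F"
    return "F" if mostly_missing else "M"  # 'xy': the opposite assignment


def two_means(rows, seed, n_iter=100):
    """Plain k-means with two centers; returns one 0/1 label per row."""
    rng = random.Random(seed)
    # ASSUMPTION: start from two distinct individuals drawn with the seed.
    first = rng.choice(rows)
    others = [r for r in rows if r != first]
    second = rng.choice(others) if others else first
    centers = [list(first), list(second)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min((0, 1), key=lambda k: sum((x - c) ** 2
                                          for x, c in zip(row, centers[k])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Sex given to the cluster that contains the most heterozygous individual.
HET_RICH_SEX = {
    ("z_or_x", "zw"): "M", ("z_or_x", "xy"): "F",
    ("gametolog", "zw"): "F", ("gametolog", "xy"): "M",
}


def clustered_sex(rows, locus_type, system, seed):
    """Preliminary sex from Z-/X-linked loci or gametologs via two clusters."""
    labels = two_means(rows, seed)
    het_counts = [sum(g == 1 for g in row) for row in rows]
    anchor = max(range(len(rows)), key=het_counts.__getitem__)
    anchor_sex = HET_RICH_SEX[(locus_type, system)]
    other_sex = "F" if anchor_sex == "M" else "M"
    return [anchor_sex if lab == labels[anchor] else other_sex
            for lab in labels]
```

For example, `clustered_sex(rows, "z_or_x", "zw", seed=42)` labels as ‘M’ the cluster that holds the individual with the most heterozygous Z-linked loci, and labels the other cluster ‘F’.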
If a type of sex-linked locus was not available (e.g., zero gametologs), it assigns NA to that preliminary assignment. The function uses the preliminary assignments to output a final sex assignment: ‘F’ or ‘M’ if all preliminary assignments match, ‘*F’ or ‘*M’ if they do not.
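The combination step can be sketched as follows. The text states only that an asterisk marks disagreement. How the letter after the asterisk is chosen is not described, so the majority rule used here (with ties going to the first available assignment) is an assumption, as are the function name and the use of None for NA.

```python
from collections import Counter


def final_sex(w_or_y, z_or_x, gametolog):
    """Combine up to three preliminary assignments ('F', 'M' or None)."""
    calls = [c for c in (w_or_y, z_or_x, gametolog) if c is not None]
    if not calls:
        return None  # no sex-linked loci of any type were available
    if len(set(calls)) == 1:
        return calls[0]  # all available preliminary assignments match
    # ASSUMPTION: report the majority call; Counter.most_common keeps
    # first-seen order for equal counts, so a 1-1 tie goes to the first call.
    majority = Counter(calls).most_common(1)[0][0]
    return "*" + majority
```

Under these assumptions, `final_sex("F", "M", "F")` gives ‘*F’, which flags the individual for manual inspection.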
Output: A table with the three preliminary sex assignments and the final sex assignment for each individual. The table also includes the raw data on which the preliminary assignments were based: number of W-linked/Y-linked loci with missing/called genotype, number of Z-linked/X-linked loci scored as homozygous/heterozygous, and number of gametologs scored as homozygous/heterozygous.
Recommended use: We created this function with the explicit intent that a person inspects the final sex assignments for which not all three preliminary assignments agree (denoted as ‘*M’ or ‘*F’). Some individuals may have ambiguous genotypes for one type of sex-linked loci and, given the nature of k-means clustering, they may be assigned the wrong preliminary sex. We recommend that the user checks the output table to make a definitive final assignment, and that the function is used straight after function filter.sex.linked.
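As a small illustration of this inspection step, the flagged individuals can be pulled out of the output table for manual review. The sketch assumes the table has been loaded as a list of records, and the column names "id" and "final" are hypothetical, not the actual column names of the infer.sex output.

```python
def needs_review(table):
    """Return the rows whose final assignment is flagged with an asterisk."""
    return [row for row in table if str(row["final"]).startswith("*")]


# Hypothetical output rows, for illustration only.
example = [
    {"id": "ind1", "final": "F"},
    {"id": "ind2", "final": "*M"},
    {"id": "ind3", "final": "M"},
]
flagged_ids = [row["id"] for row in needs_review(example)]
```

Here `flagged_ids` contains only "ind2", the one individual whose preliminary assignments disagree.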