Results

Homozygosity and Inbreeding in Norfolk Island Pedigree

Inbreeding reduces genetic diversity within a population. The level of inbreeding (also known as consanguinity) is quantified by the inbreeding coefficient (F). Pedigree-based inbreeding (FPED) was estimated using the reconstructed core-pedigree (N=1388). The mean FPED was 0.011, with a maximum FPED of 0.28. Of the 1388 individuals in this pedigree, 400 were estimated to have an inbreeding coefficient greater than zero, indicating that inbreeding has occurred at some point in their ancestry. Genetic markers can also be used to estimate inbreeding. Using data from 502 SNP-genotyped individuals from the core NI pedigree, we calculated FMARKER values. Figure 1 shows average FMARKER values per chromosome. Of the 502 genotyped individuals, 87% (N=439) exhibited an F value greater than 0 (using a cut-off at 4 decimal places; see Methods for detail). The global average FMARKER was 0.011, with a maximum of 0.215. The mean FPED and mean FMARKER are identical (0.011), which supports the accuracy of the updated core-pedigree and of the approach used in the cleaning and reconstruction process.
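The summary statistics reported above (mean F, maximum F, and the number of individuals with F greater than zero) can be illustrated with a minimal Python sketch. The function name `summarise_inbreeding` is hypothetical, the input values are toy numbers rather than study data, and the 4-decimal cut-off is assumed here to mean rounding each value before the comparison with zero; the actual procedure is the one described in the Methods.

```python
def summarise_inbreeding(f_values, decimals=4):
    """Summarise per-individual inbreeding coefficients.

    Assumption: the cut-off is applied by rounding each F to `decimals`
    places before testing whether it is greater than zero.
    """
    if not f_values:
        raise ValueError("f_values must not be empty")
    rounded = [round(f, decimals) for f in f_values]
    n_inbred = sum(1 for f in rounded if f > 0)
    return {
        "mean_f": sum(f_values) / len(f_values),
        "max_f": max(f_values),
        "n_inbred": n_inbred,
        "prop_inbred": n_inbred / len(f_values),
    }


# Toy values only (not study data); the negative value is included purely
# to show that it is not counted as F > 0.
toy_f = [0.0, 0.00002, 0.0156, 0.215, -0.003]
summary = summarise_inbreeding(toy_f)
print(summary)
```

In this toy example the value 0.00002 rounds to zero at 4 decimal places and is therefore not counted as inbred.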
The next step was to characterise the patterns of inbreeding across the genome of the core-pedigree individuals. This was performed by calculating runs of homozygosity by descent (HBD), i.e. areas of the genome that show reduced genetic diversity due to inheritance of identical alleles from a common ancestor. Figure 2 shows a genome-wide profile of HBD probability across genotyped core-pedigree members, split into two separate plots. Figure 2A displays an average locus-specific homozygosity profile for those individuals exhibiting high HBD probability (defined as at least one locus with HBD > 0.75). The average HBD for this subset of individuals was 0.042. Peaks arising from individuals with higher than normal levels of HBD may indicate smaller groups of more closely related individuals within the wider NI pedigree structure. These groups could be of interest for investigating disease associations, or could identify smaller sub-pedigrees to facilitate the tracking of complex traits such as migraine and ocular disorders (glaucoma), both of which show increased prevalence in the NI population [27–29]. Figure 2B shows the average genome-wide locus-specific homozygosity profile for all 502 genotyped core-pedigree individuals. The average genome-wide HBD for all genotyped individuals was 0.011, the same value as the estimated FMARKER for these individuals. This agreement is expected because both methods measure inbreeding, with FMARKER being a genome average and HBD being locus-specific. This broader profile shows numerous areas of greatly increased HBD; several of the longer genomic regions (HBD > 0.011) are detailed in Table 2. The largest observed 'peak' of HBD probability, on chromosome 6, reached 0.13 and lies within a region spanning 18Mb with an average HBD of 0.028.
This peak on chromosome 6 was unique when compared to other areas of increased HBD, showing multiple peaks within the same run of HBD (Figure 3). Interestingly, the area of highest HBD lies over the well-defined human leukocyte antigen (HLA) region, a highly variable area of the genome that is well studied and known for its role in the immune response and disease. Another region of high HBD was observed as two peaks on chromosome 11. The second peak lies over a large family of olfactory receptor genes. These genes are important in the detection and interpretation of odours [30], and are reported to show increased genetic variation in order to account for the potentially limitless number of detectable odours [31]. Additional File 1 shows the same data as the genome-wide figures, visualised in smaller blocks of chromosomes in order to better display the location and extent of HBD across a given chromosomal region.
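The two averaged HBD profiles described above (all genotyped individuals, and the subset with at least one locus with HBD > 0.75) can be sketched as follows. This is a minimal illustration that assumes the HBD probabilities are available as one row per individual and one column per locus; the function names and the toy matrix are hypothetical and are not taken from the study data or software.

```python
def mean_profile(hbd_rows):
    """Average HBD probability at each locus across individuals."""
    n = len(hbd_rows)
    return [sum(col) / n for col in zip(*hbd_rows)]


def high_hbd_subset(hbd_rows, threshold=0.75):
    """Keep individuals with at least one locus above the threshold."""
    return [row for row in hbd_rows if max(row) > threshold]


# Toy matrix: 4 individuals x 4 loci (illustrative values only).
toy_hbd = [
    [0.00, 0.10, 0.80, 0.00],
    [0.00, 0.00, 0.00, 0.00],
    [0.90, 0.20, 0.00, 0.00],
    [0.00, 0.05, 0.10, 0.00],
]

subset = high_hbd_subset(toy_hbd)        # individuals for the high-HBD profile
profile_all = mean_profile(toy_hbd)      # locus-specific average, all individuals
profile_high = mean_profile(subset)      # locus-specific average, high-HBD subset

# Genome-wide average HBD is the mean of the locus-specific profile.
genome_avg_all = sum(profile_all) / len(profile_all)
print(len(subset), profile_high, genome_avg_all)
```

Averaging the locus-specific profile over all loci gives a single genome-wide value, which is why it is comparable to a genome-average inbreeding estimate.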

Correlation between inbreeding/HBD and CVD endophenotypes

An exploratory correlation analysis was conducted to investigate relationships between genomic inbreeding and 10 CVD risk traits (endophenotypes). Table 3 shows that all 10 traits exhibited some evidence of association with inbreeding (P<0.05). The strongest correlation was between CVD risk and FMARKER (Pearson's r=0.389, P<2.4×10⁻¹¹). These new results are consistent with previously reported associations between inbreeding and CVD-related traits in the NI population [17]. The current study therefore supports these findings and builds upon them by identifying previously unreported trait relationships, which could indicate important areas for future research.
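For reference, Pearson's r between an inbreeding estimate and a trait can be computed as in the minimal sketch below. The variable names and values are toy placeholders rather than study data, and the calculation of the associated P-value is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Toy placeholders: per-individual FMARKER and a hypothetical risk score.
toy_f_marker = [0.000, 0.004, 0.010, 0.020, 0.031]
toy_risk = [1.0, 2.0, 2.0, 4.0, 5.0]
r = pearson_r(toy_f_marker, toy_risk)
print(round(r, 3))
```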

Discussion 

This section explored the unique genomic structure that underlies the NI pedigree. This structure has resulted from the rich history of the original Bounty Mutineers and Polynesian founders, and has been shaped by genetic bottlenecks, founder effects and population admixture over a span of approximately 200 years. Both admixture and inbreeding coefficients have previously been estimated in the NI population, and these metrics have been used to establish correlations between ancestry and CVD risk in NI [17]. More specific population effects upon genomic structure, such as locus-specific admixture and runs of homozygosity, have not previously been explored in the NI population. These indices have potential implications for disease association and will provide important foundations for future studies in NI, especially in the investigation of disease phenotypes that differ in frequency between the European and Polynesian ancestral populations.
Using a dense set of SNPs, an average inbreeding coefficient of F=0.011 was calculated for the NI population. An F of this value indicates an average relationship level somewhere between second cousins (0.0156) and second cousins once removed (0.0078). This estimate is substantially smaller than that calculated previously by Macgregor et al. (F=0.044), and there are several potential explanations for this. Firstly, this study was able to use a more accurate, updated pedigree. Secondly, a dense set of genotype data (SNPs) was available, as opposed to a small set of microsatellites (STRs). This genomic data and the updated genealogical information facilitated the reconstruction of a more accurate representation of the core NI pedigree, which should improve the accuracy of such calculations.
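The two reference values quoted above (0.0156 and 0.0078) follow from Wright's path-counting formula, F = Σ (1/2)^(n1+n2+1) (1 + F_A), summed over the common ancestors of the two parents, where n1 and n2 are the numbers of generations from each parent to the common ancestor and F_A is that ancestor's own inbreeding coefficient. The sketch below is a minimal illustration that assumes non-inbred common ancestors (F_A = 0) and two shared ancestors per relationship; the function name is hypothetical.

```python
def inbreeding_from_paths(paths):
    """Wright's path-counting formula.

    `paths` holds one (n1, n2, f_ancestor) tuple per common ancestor:
    n1 and n2 are the generations from each parent up to that ancestor.
    """
    return sum(0.5 ** (n1 + n2 + 1) * (1 + fa) for n1, n2, fa in paths)


# Second cousins share two great-grandparents: 3 generations up on each side.
second_cousins = [(3, 3, 0.0), (3, 3, 0.0)]
# Once removed: one parent is a generation further from the shared ancestors.
second_cousins_once_removed = [(3, 4, 0.0), (3, 4, 0.0)]

f_2c = inbreeding_from_paths(second_cousins)                  # 1/64
f_2c1r = inbreeding_from_paths(second_cousins_once_removed)   # 1/128
print(f_2c, f_2c1r)
```

The observed mean of F=0.011 lies between these two values (1/128 < 0.011 < 1/64).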
As HBD is related to inbreeding, genomic regions that show increased HBD probabilities could identify loci with reduced genetic diversity. There are several such areas with above-average HBD in the genotyped core-pedigree NI population. This drop in genetic diversity has major implications, especially in areas such as the HLA region on chromosome 6, which was identified as exhibiting the largest average HBD. It is well established that the HLA region contains a large number of immune-function-related genes, many coding for immune cell receptors that bind and recognise antigens (foreign peptides). High genetic variation is critical to the function of the HLA region because it allows near limitless variation to be introduced to the receptor's antigen 'recognition site' from a limited number of genes, which in turn increases the potential number of foreign antigens that can be detected. Decreased variation in this region could potentially be detrimental.
Chromosome 11 also showed a concentrated region of increased HBD; one particular peak was observed over a large region of olfactory receptor genes. Olfactory receptors determine the way in which an organism interprets odours [31, 32]. As with the HLA region, the olfactory receptor genes are limited in number, so increased variation within the gene family is required to enable detection and interpretation of a near limitless range of possible odours [33–35]. Decreased genetic variation across this region could therefore lead to a reduction in odour-detecting abilities. Interestingly, HLA may also be related to the detection and perception of odour, with several studies observing an association between HLA variation and odour preference. This may be involved in mate selection [36, 37], as at least one study found a lower than expected rate of HLA similarity between spouses in an isolated community [36]. Additionally, research has shown that more married couples have distinctly different HLA (MHC) genomic backgrounds than would be expected by chance alone, suggesting that selection is potentially driving differentiation within the immune systems of offspring so that they are able to adapt to the threat of new diseases. Another reason for this could be avoidance of inbreeding in an attempt to maintain a higher amount of genetic diversity within a population. In this context it is interesting that this study has identified decreased variation, in the form of increased HBD, at both the HLA and olfactory receptor gene loci, which warrants further exploration.
The initial aim was to identify relationships between disease-related endophenotypes and indices of the unique structural properties measured in the NI population. Using updated estimates of both marker-derived admixture and inbreeding, numerous significant relationships were observed between both measures and a range of metabolic and CVD-related traits, including a robustly calculated risk score for CVD and clinical diagnosis of Metabolic Syndrome. This is not the first study to identify these factors showing increased prevalence within the NI cohort. An initial study by Bellis et al. (2006) established the basis of increased risk for CVD-related disorders and outlined baseline phenotype data. This was followed by several linkage analyses using STR data, which established initial genomic maps and identified loci showing significant associations with CVD traits [38, 39]. This study builds upon the previous work and identifies novel findings. It is the first study to examine high-density SNP data in association with CVD/metabolic-related traits, and to use an integrative genomics approach. These strong relationships between population structure and risk traits provide evidence that the reconstruction of the NI core-pedigree is robust and that the genomic data (SNPs) are concordant with it. This provides confidence for future disease gene mapping studies in this population, including association, linkage and 'admixture-mapping'.
 

Conclusion

The study of isolated populations requires an understanding of their unique population history and admixture, which have led to unique genomic structure both in terms of long 'runs' of homozygosity (due to inbreeding) and extensive stretches of locus-specific admixture (heterozygosity introduced via population admixture). Genomic structure in populations has the potential to influence genetic associations with disease, and is therefore important to consider in future study design. This knowledge can then be used as a valuable tool in disease mapping and association studies. This work increases the accuracy of previous estimates of inbreeding and documents for the first time runs of HBD in the NI population. Additionally, this study identified significant correlations between these unique structural components and disease risk traits for Metabolic Syndrome and CVD. Importantly, both increased prevalence of Metabolic Syndrome and underlying population/genetic-based associations with it have been identified in the NI pedigree. This provides strong justification for further examination of the NI population in the context of Metabolic Syndrome risk and prevalence. Future research should focus on the identified areas of locus-specific admixture and HBD in light of the correlations with MetS and related traits.

Competing interests

The authors have no competing interests to declare.

Authors' Contributions

Contributions here....

Funding

This research was supported by funding from a National Health and Medical Research Council of Australia (NHMRC) Project Grant. It was also supported by infrastructure purchased with Australian Government EIF Super Science Funds as part of the Therapeutic Innovation Australia - Queensland Node project. MCB was supported by a Corbett Postgraduate Research Scholarship. The SOLAR statistical genetics computer package is supported by a grant from the US National Institute of Mental Health (MH059490). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgements

We would like to acknowledge Amanda Miotto and also QUT for providing computational support for this project. Additionally, we extend our appreciation to the Norfolk Islanders who volunteered for this study.