Inferring genome-wide variation in mutation and selection from polymorphism and divergence data



Write abstract here .

Author summary for PLOS Genetics


Despite half a century spent scrutinizing levels of naturally occurring polymorphism and, more recently, levels of between-species divergence at the molecular level, we know embarrassingly little about which forces explain these patterns (see a recent review (Leffler ) for an overview of this “old riddle”), although some recent analyses of large comparative datasets suggest that some life-history traits may have a pervasive effect on the amount of neutral nucleotide diversity at a very large phylogenetic scale. What we do know from previous studies is that:

  • Mutation rates vary substantially throughout the genome, and between-species divergence at presumably neutrally evolving regions or sites can reveal that variation;

  • The amount of drift experienced by a particular locus may also vary because of selection at linked sites. There is mounting evidence that this phenomenon is quite widespread, although the effect of positive and/or negative selection at linked sites has rarely been quantified.

  • Last, the amount of apparent molecular adaptation experienced by genes varies substantially from gene to gene. There is also substantial variation from site to site in the strength of selection experienced.

To make progress on this question, we propose a statistical framework to infer some of the key evolutionary parameters that drive the joint patterns of polymorphism (within species) and divergence (between a pair of species). The model is parametrized in a way that allows joint estimation of two key evolutionary factors that are central to many population genetics problems but are notoriously difficult to estimate from data:

  • Mutation rates and their variation across the genome;

  • The distribution of fitness effects of new mutations.

Our approach also accounts for demography by jointly estimating nuisance parameters that capture the overall effect of the past (and generally unknown) demographic history of the sample on the expected neutral site frequency spectrum (SFS).
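
To make the role of demographic nuisance parameters concrete, the sketch below shows one common way such parameters can enter the expected neutral SFS. It is an illustration under stated assumptions and not necessarily the parametrization used in this work: under the standard neutral model, the expected number of sites at which the derived allele appears i times in a sample of n chromosomes is theta / i, and we assume here that each frequency class is rescaled by a multiplier r_i. The function name `expected_neutral_sfs` and the arguments `theta`, `n` and `r` are hypothetical.

```python
import numpy as np


def expected_neutral_sfs(theta, n, r=None):
    """Expected unfolded neutral SFS for a sample of n chromosomes.

    theta : population-scaled mutation rate of the locus (assumed known here).
    n     : number of sampled chromosomes.
    r     : optional nuisance multipliers, one per frequency class (length n - 1).
            This rescaling is an assumption made for illustration only.
    """
    counts = np.arange(1, n)        # derived-allele counts 1 .. n-1
    sfs = theta / counts            # standard neutral expectation: theta / i
    if r is not None:
        r = np.asarray(r, dtype=float)
        if r.shape != (n - 1,):
            raise ValueError("r must have length n - 1")
        sfs = sfs * r               # distort each class by its nuisance multiplier
    return sfs


# Standard neutral expectation for theta = 2 and n = 5
neutral = expected_neutral_sfs(2.0, 5)

# Same locus with an excess of doubletons, encoded by r_2 = 2
distorted = expected_neutral_sfs(2.0, 5, r=[1.0, 2.0, 1.0, 1.0])
```

In a parametrization of this kind, the multipliers absorb departures from the standard neutral SFS without requiring an explicit demographic model, which is why they can be treated as nuisance parameters rather than quantities of direct interest.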