2.6.2 Epigenetic analyses
We calculated the methylation level at each SMP and individual sample
and estimated the mean and standard deviation of DNA methylation per
group, for each separate sequence context (i.e. CG, CHG and CHH), and
across all contexts, using the complete SMP matrix (n=43,365 SMPs) as
well as the complete and polymorphic SMP matrix (n=3,769 SMPs). All
other analyses were performed for each of the two common garden
experiments by comparing control vs. Cd-treated and control vs.
Cu-treated plants from all four populations using the matrix of complete
and polymorphic SMPs.
To evaluate response to each heavy metal, we carried out a
distance-based RDA (dbRDA 94) to test the effect of
population (n=4), treatment (n=2; C vs. Cd or C vs. Cu), and their
interaction on genome wide DNA methylation with the model: epigenetic
distance ~ Population * Treatment (functioncapscale in the vegan package; Oksanen et al., 2020). This
constrained multivariate ordination approach provides an estimate of the
amount of variation in epigenetic distances explained by our predictors,
as well as the constrained ordination axes (RDA axes) used to visualize
the structure of the data in a two-dimensional space. Pairwise
epigenetic distances were estimated following Wang et al.96, where the distance (dst) between
any two samples, s and t, was calculated either based on only the level
of methylation or including the variance in methylation as follows:
\begin{equation}
d_{\text{st}}=\ \sqrt[2]{\sum_{j=1}^{n}\left\{\frac{1}{2n}\left(x_{\text{sj}}^{m}-\ x_{\text{tj}}^{m}\right)^{2}\ +\ \frac{1}{2n}\left(x_{\text{sj}}^{v}-\ x_{\text{tj}}^{v}\right)^{2}\right\}}\nonumber \\
\end{equation}Where n is the number of cytosine positions (SMPs);\(x_{\text{sj}}^{m}\) and \(x_{\text{tj}}^{m}\) represent the
methylation level of cytosine j in samples s and t
respectively; and \(x_{\text{sj}}^{v}\) and \(x_{\text{tj}}^{v}\)represent the methylation variance of cytosine j in
samples s and t respectively. The methylation variance
(\(x_{\text{ij}}^{v}\)) was estimated as the squared difference between
the methylation level of cytosine j in sample i (\(x_{\text{ij}}^{m}\))
and the average methylation of cytosine j across all samples in the
group to which sample i belongs
(\({\overset{\overline{}}{x}}_{j}^{m}\)). Using the formula above, we
also accounted for the variation in methylation profiles within
treatment groups.
We ran separate dbRDA models for each common garden experiment (i.e. C
vs. Cd and C vs. Cu) for all sequence contexts together and separately
for each sequence context. In each of these cases, we first ran the
model using the epigenetic distance matrix calculated using only
differences in methylation levels between each pair of samples (using
only the left-hand side under the square root of the formula above).
Then, we ran the model using the distance matrix calculated from both
methylation level and methylation variance (full formula). We tested the
overall significance, as well as the significance of each predictor and
their interaction using a permutation test with 9999 permutations. We
calculated the proportion of total variance explained by our models,
adjusted by the number of predictors included in the model (adj.
R2), with the function RsquareAdj (vegan
package) which provides a unique adj. R2 for the full
model. We also calculated individual adj. R2 for each
predictor by modelling the effect of each predictor while accounting for
the others using partial constrained dbRDA with the following models: i)
epigenetic distance ~ Treatment + Condition(Population);
ii) epigenetic distance ~ Population +
Condition(Treatment); iii) epigenetic distance ~
Population:Treatment + Condition(Treatment) + Condition(Population).
We identified differentially methylated cytosine positions (DMPs) as
cytosines with significant differences in their mean methylation level
between control and Cu-treated, and control and Cd-treated plants using
the R package DSS 97. First, we modeled the
methylation frequency at each cytosine position within each group using
a beta-binomial distribution with arcsine link function and the formula
“~ 0 + group” (function DMLfit.multifactor ).
This transformation proved useful to model DNA methylation because it
reduces the heteroscedasticity of the data by stabilizing its variance98,99. Second, we performed Wald tests to detect
differential methylation between groups at each position using the
function DMLtest.multifactor which reports adjusted p-values by
the Benjamini-Hochberg method (i.e. FDR). Cytosines were considered to
be differentially methylated (i.e. DMPs) when the FDR ≤ 0.05 and the
methylation change was ≥ 10%.