3. Results
The final dataset consisted of 1,801 individual sequences across 526
localities (Figure 1A). A total of 6,350 pairwise values of
DXY were derived from this dataset. DXY values ranged from 0 to 56.33 (mean = 9.685; median = 4.5; see Figure S1
in Supporting File 1 for the range of DXY values per
locus for each species). When the observed DXY values
are plotted in space (Figure 1B), it becomes evident that genetic breaks
(represented by midpoints between localities with relatively high values
of DXY) accumulate around three regions of the Atlantic
Forest: 1) lowland valleys within the Serra do Mar mountain range and
Paraíba do Sul river, in the southern range of the forest; 2) the Doce
river and nearby regions; 3) northern regions near the São Francisco
river.
Global models including only environmental predictors performed worse on
average than models that included both environmental predictors and
ecological traits (Figure 2A). Models based solely on environmental
predictors had mean R2 = 0.14 (ranging from 0.0007 to
0.45), whereas those that included environmental and dispersal data had mean
R2 = 0.53 (ranging from 0.04 to 0.81) and those that
included environmental and demographic data had mean R2 = 0.43 (ranging from 0.003 to 0.77). Finally, models including
environmental data and both types of ecological traits (i.e., dispersal
and demographic traits) had mean R2 = 0.54 (ranging
from 0.06 to 0.81). A Kruskal-Wallis test suggests that the distribution
of R2 differs among the four sets of predictors
(χ2 = 170.81, p-value < 0.01), and
Wilcoxon tests suggest that all models including traits have
consistently higher predictive accuracy than models based solely on
environmental data (p-value < 0.001 for each set of
predictors including ecological traits). In addition, the inclusion of
dispersal traits led to a larger increase in R2 values
(when compared to models based solely on environmental data) than the
inclusion of demographic traits (Figure 2B).
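As an illustration of the comparison above, the following is a minimal sketch of how a Kruskal-Wallis H statistic can be computed for R2 values grouped by predictor set. It is not the analysis code used in this study: the R2 values and group labels are hypothetical placeholders, the function names are our own, and no tie correction is applied.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (no tie correction)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical R2 values per predictor set (NOT the values reported here).
r2_by_predictor_set = {
    "environment": [0.1, 0.2],
    "environment + demography": [0.3, 0.4],
    "environment + dispersal": [0.5, 0.6],
    "environment + both": [0.7, 0.8],
}

h = kruskal_wallis_h(list(r2_by_predictor_set.values()))
print(f"H = {h:.3f}")
```

Under the null hypothesis, H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom, which is why the statistic is reported as a chi-squared value. In practice a library routine such as `scipy.stats.kruskal`, which also applies a tie correction, would normally be preferred over a hand-written version.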
Correlation coefficients among predictor variables revealed that geographic,
topographic and bioclimatic resistance distances were highly correlated
(Table S2). Additionally, body size was highly correlated with wing
length (ρ = 0.873) and adult survival (ρ = 0.872). Environmental
distances, represented mainly by temperature seasonality and
precipitation of the coldest quarter, consistently had the greatest impact on
model accuracy (Figure 3). Morphological traits, represented mainly by
wing length, were equally important whenever they were included. Adult
survival and longevity were important ecological traits in models based
solely on environmental data and demographic traits, but were surpassed
by environmental data and dispersal traits whenever those were also
present. Finally, the mtDNA locus used to calculate DXY values was always present among the five most important variables across
all models.
Species-specific predictions show a larger variation in
R2 within each set of predictors (values ranging from
0.0001 to 0.9; Figure 4). However, models including ecological traits
tend to have higher mean R2 (Table 2; Figure 5A). A
Kruskal-Wallis test moderately supports that the distribution of
R2 differs among the four sets of predictors
(χ2 = 9.53, p-value = 0.02). Similar to global
models, Wilcoxon tests of R2 values for
species-specific models suggest that all models including traits have
consistently higher predictive accuracy than models based solely on
environmental data (p-value < 0.001 for each set of
predictors including ecological traits). When considering only the model
with highest predictive power for each combination of species and locus,
it becomes clear that models including only environmental data tend to
have low predictive power (R2 < 0.17) even
when they are the best model across the four sets of predictors (Figure
5B). An exception to this pattern is the Cytb dataset for the species Sclerurus scansor, where the model based solely on environmental
data was the best model and also showed high accuracy
(R2 = 0.71; Figure 4). Finally, similar to global
models, the inclusion of dispersal traits led to a larger increase in
R2 values (when compared to models based solely on
environmental data) than the inclusion of demographic traits (Figure
S2).
Maps of the interpolated values of predicted DXY reveal
that, although models generally agree with maps of observed values
(Figure 6A), model uncertainty is higher in the northern Atlantic Forest
(hereinafter, northern AF), especially in models based solely on
environmental data (Figure 6B). Additionally, models tend to overpredict
genetic differentiation in northern AF (i.e., above the Doce River) and
underpredict differentiation in the southern Atlantic Forest
(hereinafter, southern AF; Figure 6C). Both over- and underprediction
decrease when ecological traits are added.
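The notion of over- and underprediction can be made concrete with a signed error. The sketch below assumes that bias is measured as predicted minus observed DXY, averaged per region; the text does not state how the maps in Figure 6C were computed, so this definition, the function name, and all values shown are hypothetical.

```python
from collections import defaultdict


def mean_signed_error(records):
    """Mean of (predicted - observed) per region.

    Positive values indicate overprediction, negative values underprediction.
    Each record is a (region, observed_dxy, predicted_dxy) tuple.
    """
    totals = defaultdict(lambda: [0.0, 0])  # region -> [error sum, count]
    for region, observed, predicted in records:
        totals[region][0] += predicted - observed
        totals[region][1] += 1
    return {region: err_sum / count for region, (err_sum, count) in totals.items()}


# Hypothetical observed/predicted DXY pairs (NOT data from this study).
records = [
    ("northern AF", 4.0, 6.0),
    ("northern AF", 10.0, 11.0),
    ("southern AF", 12.0, 9.0),
    ("southern AF", 8.0, 7.0),
]

bias = mean_signed_error(records)
for region, value in bias.items():
    print(f"{region}: {value:+.2f}")
```

With these placeholder values the northern region has a positive mean error (overprediction) and the southern region a negative one (underprediction), which mirrors the direction of the pattern described above.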