2.1 Data selection
Species list were initially obtained from floristic data networks from
Cerrado areas (http://cerrado.rbge.org.uk; Ratter et al., 2011; Castro
et al., 2010; Vieira et al., 2019). In addition, twenty-one areas were
sampled in the northern region of Cerrado (in the states of Piauí,
Maranhão, Pará and Amapá), summary data available at
https://doi.org/10.5061/dryad.9cnp5hqd4). To answer our question about
possible differences in the effect of climate change on different
species groups, from the 235 species recorded in our previous list, we
selected five species of wide national distribution in the Biome (SpG);
and five species with marginal distribution of the northern Cerrado,
typical of the peripheral areas investigated and absent or rare in the
central region of Brazil (SpM): Bowdichia virgilioides (754
occurrences), Byrsonima crassifolia (523), Curatella
americana (956), Himatanthus articulatus (441), Parkia
platycephala (259), Plathymenia reticulata (604), Qualea
grandiflora (858), Q. parviflora 727), Salvertia
convallariodora (561) and Vatairea macrocarpa (413) (Tab. S1 in
Appendix S1). Our choice was based on the largest data set, highest
constancy values in the national savanna areas and highest importance
value [IV] indices in the inventoried northern savannas. Finally, in
addition to inventory and field survey data, occurrence records of these
species were obtained from the Global Biodiversity Information Facility
open access platform (http://gbif.org) using the rgbif package
(Chamberlain et al., 2019) in R (R Development Core Team, 2019;
S1 File in Appendix S3).
Thus, the total known distribution area of each species in South America
was used for the model construction process (see below) and then
projected to the focal area. We opted for the smallest spatial
aggregation, restricting the presence records to a distance of 1 km.
Non-georeferenced fields, those with duplicate records within the
minimum distance (1 km²), and those outside the expected distribution
for the species were excluded. Although typical of Cerrado, they are
plants randomly recorded in ecotonal areas under the influence of
neighboring domains, and for this reason the filters were based on the
checklist of the present study, in the revised floristic lists of Castro
et al. (2010), Ratter et al. (2011), Vieira et al. (2019) and the
Species List of the Brazilian Flora (BFG, 2018;
http://floradobrasil.jbrj.gov.br). The final database included 6,106
presence records (ranging from 256 to 956; Tab. S1 in Appendix S1) for
the ten native species of Brazil, which were significantly represented
in Cerrado sites (Fig. S1 in Appendix S2).
WorldClim data on current climate and projections for 2050
(http://worldclim.org; Fick & Hijmans, 2017), at a resolution of
30 arcseg (about 1 km²) within the spatial boundaries of South America
(12°N, 56°S, 91°O, 34°O) were used for the purpose of creating
ecological niche models (ENMs). This extension was used to capture the
spectrum of climate variation for the entire known distribution of
species. Among the 20 variables (altitude and 19 bioclimatic variables),
the least collinear variables were selected using theremoveCollinearity tool (multicollinearity.cutoff = 0.75,
nb.points = 10000) from the virtualspecies package (Leroy,
Meynard, Bellard, & Courchamp, 2015): altitude, annual mean
temperature, max temperature of warmest month, isothermality, mean
diurnal range, precipitation of coldest quarter, annual precipitation,
precipitation of warmest quarter, precipitation seasonality, and
precipitation of driest quarter (Fig. S2 in Appendix S2). Future climate
projections were derived from two global atmosphere-ocean circulation
model (AOGCM) - the Community Climate System Model (CCSM4) and the
Hadley Center Global Environmental Model (HadGEM2-CC) - in two
greenhouse gas concentration situations (RCPs 4.5 and 8.5) foreseen by
the Fifth Assessment (CMIP5) of the Intergovernmental
Panel on Climate Change (IPCC, 2014), representing an optimistic
achievable scenario and a “business-as-usual” scenario.
2.2 Modeling runs and analyses
To estimate the potential distribution of species, ENMs were generated
in the biomod2 package (Thuiller, Georges, Engler, & Breiner, 2019) by
running four different algorithms: Generalized Linear Model (GLM),
Artificial Neural Network (ANN), Random Forest (RF), and Maximum Entropy
(MAXENT). Species datasets (presence + pseudo-absence records) were
randomly divided into 75% for calibration (training points) and 25%
for evaluation (test points) and this procedure was repeated 10 times
for each set for each species. To assess the predictive power of the
models, the True Skill Statistics (TSS; threshold-dependent) values
(Allouche et al., 2006) and the area under the receiver operating
characteristic (ROC) curve (Phillips, Anderson, & Schapire, 2006) were
measured. Sensitivity (proportion of correctly predicted presences/true
positives) and specificity (proportion of correctly predicted
absences/true negatives) levels were also reported. The consensus maps
were generated by the committee averaging method to represent the
concordances between the different runs. The lower performing models
(TSS < 0.5) were eliminated from the consensus building
process (ensemble). Thus, five consensus maps were generated per species
for each scenario (one current, two RCP 4.5 and two RCP 8.5) overlapping
the projected presences of species from present day to 2050 to estimate
where the environmental characteristics will resemble those of current
sites of occurrence, and where these species will lose, gain or maintain
their distribution. Continuous species suitability values were
transformed into binary data (presence/absence) using a cut-off
threshold that maximized TSS (Cutoff, Tab. S1 in Appendix S1) and the
projected area of occurrence was calculated by multiplying the cell area
by the cell count. The modeling results were visualized through the QGIS
Geographic Information System (QGIS, 2019) and analyzed using the R
Statistical environment and the raster library (File S1 in
Appendix S3).
We also measured the potential dynamics of change in the range extension
(area gain/loss) of the species based on the differences between the
projected area for each current and future climate scenario. This
analysis was made for each species separately, to recognize ecological
functional differences between plant populations. The size of the areas
with greater climate stability (referred to as refuges in the present
study) was calculated from the number of pixels considered as presence
when overlapping the models of all scenarios – individually, by status
(generalist [SpG] and marginal [SpM]), and altogether. Based on
these refuge maps, the future effectiveness of present CUs from a
climate change perspective was measured, accounting for the size of
refuge areas for SpG and SpM and both which overlap with the CUs. The
data of the Brazilian CUs were obtained from the website of the Ministry
of Environment (http://mapas.mma.gov.br).