Copepod data
Copepods are identified to species whenever possible, and shared
training and staff exchanges amongst surveys ensure comparable data.
Names of all copepod taxa were updated using the World Register of
Marine Species (WoRMS) (http://www.marinespecies.org/). Data were
available for 388 taxa and from 100,326 samples.
To test Bergmann’s Rule, we calculated the mean length of copepods in
each sample and related this to environmental drivers. Maximum and
minimum lengths of different copepod taxa were obtained from the Marine
Planktonic Copepod Database from the Observatoire Océanologique de
Banyuls-sur-Mer (https://copepodes.obs-banyuls.fr/en/index.php,
Razouls et al. (2020)) and from Richardson et al. (2006). Because
juveniles are more difficult to identify to species and reliably assign
a size than adults, they were not included in the estimate of mean
copepod size for each sample, although adults tend to be more common in
CPR samples (Richardson et al. 2006). Each taxon was assigned a
single size (the midpoint between their minimum and maximum lengths). We
calculated the mean length of copepods in a CPR sample by multiplying
abundance of adults of each taxon in the sample by their assigned
length, and then dividing their sum by the total abundance of all adults
within the sample.
Bergmann’s Rule could also potentially be influenced by spatial
differences in the trophic structure of communities because body size is
linked to the ecological role of a species (Woodward et al. 2005;
Yvon‐Durocher et al. 2011). Because carnivorous copepods are
generally larger than omnivorous ones (Mauchline 1998; Fig. S1 in
Supporting information), which could affect observed relationships, we
distinguished obvious differences in diets among taxa. We assigned a
diet (carnivore or omnivore and herbivore combined, hereafter called
omnivore) for each copepod taxon using a combination of dietary studies
from the literature (see Richardson and Schoeman (2004)) and
morphological differences in copepod mouthparts (Huys & Boxshall 1991).
To calculate the proportion of omnivores we divided the summed abundance
of omnivore within each sample by the total abundance of copepods within
samples.
Environmental data
We used sea surface temperature (SST) as an estimate of ocean
temperature for the near-surface CPR samples, and chlorophyll-aconcentration (Chl-a ) as a proxy for food availability to
copepods. Chl-a is correlated with copepod growth and fecundity
in herbivorous and omnivorous species (Richardson & Verheye 1998, 1999;
Hirst & Bunker 2003; Bunker & Hirst 2004), and more Chl-a leads
to more of these grazers and thus more carnivorous zooplankton
(Richardson & Schoeman 2004). As in situ SST and Chl-aare not collected routinely on all CPR tows, we used remotely-sensed
data. We matched the location and time of collection of CPR samples with
estimates of SST and Chl-a averaged over eight days prior to
sampling to limit loss of data due to cloud cover and to represent
feeding conditions over the recent past. For SST, we used daily Group
for High Resolution SST data, a cloud-free global product based on
satellite, buoy and ship data, and interpolated at 0.2°×0.2° resolution
(www.ghrsst.org). For surface Chl-afrom September 1997 until the end of 2016, we used daily data from the
European Space Agency Ocean Colour Climate Change Initiative data
(esa-oceancolour-cci.org).
Because this is a merged product from MERIS, Aqua-MODIS, SeaWiFS and
VIIRS satellites, these data provide better coverage than products from
a single satellite, and are more accurate globally (Mélin et al.2017). For 2017 and 2018, we combined data from Aqua-MODIS and VIIRS,
the only available satellites. Chl-a data were retrieved at 4 km
resolution and aggregated to 0.2°×0.2° resolution to limit loss of data
due to cloud cover and match the resolution of SST data. As there are no
robust satellite Chl-a data before 1997, all analyses used CPR
samples collected after this time.
We estimated predation rates on copepods by calculating the abundance of
larger predatory invertebrates in CPR samples. We summed abundances of
the primary invertebrate predators of copepods, including chaetognaths,
amphipods, cnidarians, decapods, euphausiids and siphonophores. Although
invertebrate predation is a crude indicator of total predation, it is
collected concomitantly with the copepod data we used for the size
analysis. We could not estimate potential fish predation because there
is no global abundance dataset over time and space scales matching the
CPR dataset.
To investigate whether oxygen is a driver of Bergmann’s Rule (Forsteret al. 2012; Rollinson & Rowe 2018), we used the World Ocean
Atlas 2018 (WOA-18)
(www.nodc.noaa.gov/OC5/woa18/).
We had to use much coarser-resolution data (monthly climatology averaged
at 1°×1° resolution) because we could find no observed global oxygen
product at a finer resolution.
Statistical
analyses
To test Bergmann’s Rule, we used a model-building approach. We fitted a
weighted generalised linear mixed-effect model (GLMM, Bates et al.
2011), with fixed effects for all predictors (SST, Latitude, Oxygen,
Chl-a , predator abundance, and the proportion of omnivores (Table
S1). We included the proportion of omnivores in the model because
omnivorous copepods are generally smaller than carnivorous ones (Fig.
S1). Upon visual inspection of model residuals, their magnitude
generally increased at higher fitted values. We thus used a gamma error
structure with a log-link function (there were no zero values for size).
Because estimates of mean length of copepods in a sample are more
precise when there are more specimens, we weighted observations in the
GLMM by the square-root of total copepod abundance within the sample.
To account for temporal and spatial correlation in the data, we used
three random effects associated with the time and location of sampling.
The first was a random intercept associated with differences amongst CPR
surveys, which accounts for the use of different vessels, large-scale
regional effects, and any minor methodological differences. The second
was a random intercept of Tow within Survey, which adjusts for both
temporal and spatial differences amongst tows. The last was a random
slope for days elapsed on Tow within Survey, which accounts for
spatiotemporal autocorrelation amongst samples on tows, which arises due
to differing weather and collection conditions throughout the period of
the tow. This helped remove any general linear trends over the period of
the tow.
Due to the large number of samples (>100,000) and thus high
statistical power, we did not assess the significance of predictors.
Rather, we selected the best model using the Bayesian information
criterion (BIC), which is based on the goodness of fit (log-likelihood
ratio) relative to model complexity (number of parameters). BIC is
suitable to fitting heuristic models with large numbers of observations;
it more harshly penalising overfitting than commonly-used Akaike
information criterion (Schwarz 1978; Aho et al. 2014).
From a preliminary analysis we found that SST, Latitude and Oxygen were
strongly correlated (all r>0.59, Fig. S2) and could not be
included together in the model. We thus first assessed their relative
importance in separate models with the other predictors (Chl-a ,
predator abundance, and proportion of omnivores), and then retained the
most important variable amongst SST, Latitude and Oxygen for inclusion
in subsequent analysis.
We used the pseudo R-squared described in Nakagawa & Schielzeth (2013)
to estimate the proportion of variation explained by the gamma GLMMs
(Nakagawa et al. 2017). To interpret the ecological relevance of
the drivers, we evaluated effect sizes using expected relationships of
the mean length with predictors (Nakagawa & Cuthill 2007; Sullivan &
Feinn 2012). We also evaluated ecological relevance by converting
copepod length estimates to body mass (using Wet-Weight (mg) = 0.03493 ×
Length (mm)2.9878, Pearre Jr., 1980) because the food
that is available to higher-level predators such as fish is related to
body mass rather than size.
To assess the effect of influential points within the model, we
performed sensitivity tests by iteratively identifying and removing
groups of outliers and high-leverage observations; these procedures had
no impact on overall results. Results presented thus include all
available data. By mapping residuals globally both with and without
random effects (Fig. S5), we confirmed that the use of random effects
reduced spatial autocorrelation among samples; plotting residuals
through time within Surveys and on Tows confirmed that our models
reduced temporal autocorrelation. The code for the statistical analyses
is publicly available on GitHub
(www.github.com/maxcampb/Bergmann-Rule-Copepods).