2.4 Connectivity Strength Metrics
To quantify hydrologic connectivity, we identified a source site
(Inflow) and considered the magnitude of connectivity between this
source and multiple target sites. We first analyzed connectivity
information using relative stage dynamics. To do so, we used a graphical
analysis approach by plotting the mean daily Inflow stage against the
relative stage as represented by stage z-scores (described in section
2.2) at the target sites. Strongly co-varying stage levels between
source and target may suggest the presence of connectivity while
inflection points in source-target stage relationships can help identify
thresholds at which connectivity dynamics shift (Cabezas et al., 2011).
To locate inflection points in source-target stage relationships and the
corresponding threshold stages at Inflow (\(I_{\text{stage}}\)), we
fit broken-line linear regression models using the segmented package in
R, which estimates a user-defined number of inflection points (Muggeo,
2008). Because hysteresis was observed in the source-target relationship
at several sites, we removed the rising limb from the inflection point
identification process. For improved interpretability, we constrained
the analysis to either a linear fit model (zero inflection points) or a
one inflection point model and chose the model that minimized the
Bayesian information criterion (BIC). In all cases, the single
inflection point model was chosen over the linear fit. It should be
noted that while coherent hydrologic fluctuations between sites can be
useful for confirming connectivity, they can also be subject to false
positives when other factors act similarly on both sites (Rinderer et
al., 2018).
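The model choice described above (linear fit versus a single inflection point, selected by BIC) can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the segmented package estimates breakpoints iteratively, whereas this stand-in uses a plain grid search over observed stage values. The function names, the parameter counts used in the BIC (3 for the linear model, 5 for the one-breakpoint model), and the synthetic data are all assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_rss(design, y):
    """Ordinary least squares via the normal equations; returns the residual sum of squares."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bk * v for bk, v in zip(beta, r))) ** 2
               for r, yi in zip(design, y))

def bic(rss, n, k):
    """Gaussian BIC up to an additive constant; the floor avoids log(0) for perfect fits."""
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)

def choose_model(stage, zscore):
    """Compare a linear fit with a one-breakpoint fit and keep the lower-BIC model."""
    n = len(stage)
    bic_linear = bic(fit_rss([[1.0, s] for s in stage], zscore), n, 3)
    best_psi, best_rss = None, float("inf")
    # Candidate breakpoints: observed stages, leaving at least two distinct
    # stage values on each side so the hinge term is identifiable.
    for psi in sorted(set(stage))[2:-2]:
        rss = fit_rss([[1.0, s, max(s - psi, 0.0)] for s in stage], zscore)
        if rss < best_rss:
            best_psi, best_rss = psi, rss
    if best_psi is None:
        return {"model": "linear", "breakpoint": None, "bic": bic_linear}
    bic_segmented = bic(best_rss, n, 5)
    if bic_segmented < bic_linear:
        return {"model": "segmented", "breakpoint": best_psi, "bic": bic_segmented}
    return {"model": "linear", "breakpoint": None, "bic": bic_linear}

# Synthetic example (not study data): slope changes from 0.1 to 1.0 at stage 5.
stage = [0.5 * i for i in range(21)]
zscore = [(0.1 * s if s < 5 else 0.5 + 1.0 * (s - 5)) + 0.01 * (-1) ** i
          for i, s in enumerate(stage)]
result = choose_model(stage, zscore)
```

In this sketch the breakpoint is restricted to observed stage values, so its resolution depends on the sampling density of the stage record; an iterative estimator such as the one in segmented does not share that limitation.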
We developed an approach to quantify the connectivity magnitude between
source and target sites using both geochemical and microbial indicators.
For both metrics, we quantified the magnitude, defined hereafter as
connectivity strength (σ), as a continuous variable ranging from 0 to
1. Connectivity strength denotes the degree of influence of the source
on the target. To measure connectivity strength, we assumed that when
strong hydrologic connectivity was present, source and target water
compositions would be more similar than when connectivity was weak or
absent. This is a commonly used assumption embedded in source water
mixing approaches which use aqueous geochemistry to assess hydrologic
connectivity (Cabezas et al., 2009; C. N. Jones et al., 2014). For
microbial communities, we expected that when hydrologic connectivity was
strong, the membership of the water column microbiome would be more
similar because the target community would be strongly influenced by
immigration from the source community. Conversely, when hydrologic
connectivity was weak or absent, we expected inter-species interactions
would be the dominant influence on microbiome membership and the source
and target would become less similar over time.
To calculate connectivity strength using aqueous geochemistry
(\(\sigma_{g}\)), we first normalized ion concentrations by their
means and standard deviations and conducted a principal component
analysis (PCA) on all major ions present, including sodium, chloride,
calcium, magnesium, potassium, and sulfate ions. Analytical results
included several outlying values for chloride and potassium that were
removed due to suspected contamination. To maintain a balanced dataset,
we replaced the removed outliers by linearly interpolating reported
values from the previous and subsequent weeks at the same site. We
examined PCA eigenvalues and eigenvectors (Figure 1, Table S1), and
based on variable loadings chose to include two principal components
(PCs) for further analysis that represented two major water source
components. At each sampling date, within the 2-dimensional PC space
(PC1 and PC2), the log-transformed Euclidean distance was calculated
between a given target site geochemical composition and the geochemical
composition at Inflow (i.e., source site) (Eq. 1). This value was then
rescaled to between 0 and 1 using a min-max normalization and reversed
to calculate a chemical similarity score (Eq. 2), as follows.
\(\text{ED}_{i}=\log\left(\sqrt{\left(\text{PC1}_{s_{i}}-\text{PC1}_{t_{i}}\right)^{2}+\left(\text{PC2}_{s_{i}}-\text{PC2}_{t_{i}}\right)^{2}}\right)\) (Eq. 1)
\(\sigma_{i}=1-\left(\frac{\text{ED}_{i}-\min\left(\text{ED}\right)}{\max\left(\text{ED}\right)-\min\left(\text{ED}\right)}\right)\) (Eq. 2)
Where \(\text{ED}_{i}\) is the log-transformed Euclidean distance within the
PCA space on a given sampling date, the subscripts \(s_{i}\) and \(t_{i}\) refer
respectively to PC scores at Inflow (i.e., the source) and a target
site, \(\sigma_{i}\) is the connectivity strength on a given
sampling date, and ED is the set of \(\text{ED}_{i}\) values across the
complete dataset.
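As a worked illustration of Eqs. 1 and 2, the following Python sketch computes the log-transformed distance and the reversed min-max score from PC scores that are assumed to be already available. The function names and the example coordinates are hypothetical, the natural logarithm is assumed (the text does not state the base), and the sketch assumes that no target coincides exactly with the source, since the log of a zero distance is undefined.

```python
import math

def log_euclidean_distance(source_pc, target_pc):
    """Eq. 1: log of the Euclidean distance between (PC1, PC2) scores."""
    distance = math.hypot(source_pc[0] - target_pc[0],
                          source_pc[1] - target_pc[1])
    return math.log(distance)  # natural log is an assumption

def connectivity_strength(ed_values):
    """Eq. 2: min-max rescale over the full set of ED values, then reverse."""
    ed_min, ed_max = min(ed_values), max(ed_values)
    return [1.0 - (ed - ed_min) / (ed_max - ed_min) for ed in ed_values]

# Hypothetical PC scores: one source and three source-target pairs.
source = (0.0, 0.0)
targets = [(1.0, 0.0), (0.0, math.e), (math.e ** 2, 0.0)]
ed = [log_euclidean_distance(source, t) for t in targets]
sigma = connectivity_strength(ed)
```

Because min-max rescaling is unchanged when every ED value is multiplied by the same positive constant, the choice of logarithm base does not affect the resulting connectivity strength.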
To calculate connectivity strength using microbiome membership
(\(\sigma_{m}\)), on each sample date, we calculated a similarity
score using the Bray-Curtis similarity index (BC) between microbiome
membership at a given target site and Inflow (i.e., the source), as
follows (Eq. 3).
\(\text{BC}_{\text{st}}=\frac{2C_{\text{st}}}{S_{s}+S_{t}}\) (Eq. 3)
Where \(C_{\text{st}}\) is the sum of the lower of the two counts of each OTU found at
both sites, \(S_{s}\) is the total number of sequence
reads at Inflow, and \(S_{t}\) is the total number of sequence
reads at the target site. We also conducted a principal coordinate
analysis (PCoA) using the BC dissimilarity index (i.e., \(1-\text{BC}_{\text{st}}\)) to visualize microbiome
membership in lower-dimensional space (Figure 1c).
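Eq. 3 can be illustrated with a short Python sketch that takes OTU read counts for the source and a target as dictionaries. The function name, the OTU labels, and the counts are hypothetical, and the sketch makes no assumption about any rarefaction or other preprocessing of the counts, which the text does not describe.

```python
def bray_curtis_similarity(source_counts, target_counts):
    """Eq. 3: BC = 2 * C_st / (S_s + S_t) from per-OTU read counts."""
    # C_st: sum of the lower of the two counts for OTUs present at both sites.
    shared = sum(min(count, target_counts[otu])
                 for otu, count in source_counts.items()
                 if otu in target_counts)
    # S_s and S_t: total sequence reads at the source and the target.
    total_reads = sum(source_counts.values()) + sum(target_counts.values())
    return 2.0 * shared / total_reads

# Hypothetical counts: two OTUs are shared, one is unique to each site.
inflow = {"OTU_1": 10, "OTU_2": 5, "OTU_3": 0}
target = {"OTU_1": 4, "OTU_2": 5, "OTU_4": 6}
bc = bray_curtis_similarity(inflow, target)
```

Here the shared count is 4 + 5 = 9 and each site has 15 reads, so the similarity is 18 / 30 = 0.6; the corresponding dissimilarity is 1 minus this value.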
To identify the relationship between Inflow stage and site-level
connectivity, at each site, we fit natural cubic spline regression
equations between Inflow stage and connectivity strength for both
geochemical and microbial metrics using the splines package in R
(R Core Team, 2016). As with relative stage (i.e., stage z-scores),
because hysteresis was observed at two sites, we only used the peak
through recession period for the model fitting procedure. At Inflow
stages outside the range over which connectivity strength was measured
in the field (i.e., very high or very low stages), we assigned a
constant connectivity strength equal to the mean of the values measured
at the four sampling dates with the highest or lowest Inflow stages,
respectively. Using these models, we then
generated daily time series of connectivity strength at each site using
the Inflow stage record for 2018.
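The extrapolation rule described above can be sketched in Python as a wrapper around any fitted stage-to-strength model. The sketch does not fit a natural cubic spline itself: `fitted` is a hypothetical callable standing in for the spline model, and the function name, the `n_edge` parameter, and the example values are likewise assumptions for illustration.

```python
def clamped_predictor(fitted, stages, strengths, n_edge=4):
    """Wrap a fitted model so predictions outside the sampled stage range
    return the mean strength of the n_edge lowest- or highest-stage dates."""
    pairs = sorted(zip(stages, strengths))  # order sampling dates by stage
    low_edge, high_edge = pairs[:n_edge], pairs[-n_edge:]
    low_const = sum(s for _, s in low_edge) / len(low_edge)
    high_const = sum(s for _, s in high_edge) / len(high_edge)
    min_stage, max_stage = pairs[0][0], pairs[-1][0]

    def predict(stage):
        if stage < min_stage:
            return low_const
        if stage > max_stage:
            return high_const
        return fitted(stage)  # inside the sampled range, use the model

    return predict

# Hypothetical sampling dates and a placeholder model that always returns 0.5.
sampled_stage = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sampled_sigma = [0.1, 0.2, 0.3, 0.4, 0.9, 1.0]
predict = clamped_predictor(lambda stage: 0.5, sampled_stage, sampled_sigma)
daily_sigma = [predict(stage) for stage in [0.5, 3.0, 7.0]]
```

Applying `predict` to each value in a daily stage record would yield a daily connectivity strength series in the manner described, with the constant values taking over only beyond the sampled stage range.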