Introduction

A central goal in ecology is to identify and understand the processes that influence the distributions of species in space and time. Often, these assembly processes are not directly observable over feasible time scales and must instead be inferred through pattern \cite{Levin_1992}. One increasingly popular approach is to use the values and abundances of species traits in a community as evidence for the influence of particular assembly processes \cite{Cavender_Bares_2004,Ackerly_2007,Kraft_2008}. Trait-based approaches have several advantages over strictly taxonomic approaches in that they are quantitative, easily generalizable, and have explicit ties to ecological strategy and performance \cite{McGill2006,Violle_2007}.

Unfortunately, inferring process from community trait patterns is not always straightforward because different processes can lead to similar patterns, multiple processes can operate simultaneously on multiple traits, and patterns can be affected by exogenous forces. For example: community assembly is sometimes depicted as a balance between environmental filtering, in which species unable to tolerate environmental conditions are filtered out resulting in a clustering of trait values, and niche differentiation, in which competition and limiting similarity result in trait values that are more evenly spaced than expected by chance \cite{Cavender_Bares_2004,Kraft_2007}. But recent work has shown that environmentally-filtered communities can result in random or overdispersed trait patterns (e.g. when there is sufficient within-community environmental heterogeneity) \cite{DAndrea2016}, and competition-structured communities can result in clustering patterns \cite{Mayfield_2010}. In addition, pattern-based evidence of assembly processes can be obfuscated by propagule pressure from adjacent communities \cite{Leibold_2004}, or by fluctuating environmental conditions that favor different species over time \cite{Chesson_1981,Chesson_1994}.

Although it is unlikely that a single pattern-based test will ever provide incontrovertible evidence for niche differentiation, analysis of community trait structure can still shed light on assembly processes if used properly. Different metrics should be used in complementary ways to provide more detailed, and thus more interpretable, characterizations of community trait structure. In one recent study, \cite{DAndrea2017} suggest a stepwise analysis pipeline in which potential niches along trait axes are identified using a clustering algorithm and, if clusters are identified, the fine-scale abundance structure within each cluster is examined for evidence of distance-based competition. Next, tests of community trait structure should be conducted along environmental gradients where they can potentially be tied to mechanistic predictions derived from existing ecological theory \cite{Webb_2010}. Lastly, analyses of community trait structure should be used to develop and select hypotheses for experimental testing in the field, rather than be considered as compelling standalone evidence.

Here, we apply a suite of newly developed and classical metrics of community trait structure to a network of twelve grasslands positioned along temperature and precipitation gradients in southern Norway. Our tests include measures of clustering, fine-scale trait abundance structure, and whole-community trait abundance structure. We look for community-level patterns in four traits: leaf area, maximum potential canopy height, seed mass, and specific leaf area (SLA). Based on our knowledge of the system, we predict a gradual shift in importance from competitive interactions at the coldest sites to environmental filtering at the most stressful sites. We expect that competition for light will be the strongest competitive factor at the warmest sites, and thus that there will be competition-derived clustering in maximum height and leaf area. We expect there to be niche differentiation in SLA at the coldest sites, where there could be a tradeoff between risky fast-growth strategies and the ability to tolerate or avoid early-season frosts. Ultimately, our work uses trait-based predictions of community assembly processes to glean information about the relative influence of assembly mechanisms on grassland community composition.

Methods

Data

We measured four traits: leaf area (LFA), specific leaf area (SLA), maximum plant height (MXH), and seed mass (SDM). We standardize our traits by taking the logarithm of the trait value and rescaling the logarithms to range between 0 and 1¹. We applied our tests on each trait individually, as well as on the Euclidean space formed by these traits, which is a four-dimensional hypercube of side 1.
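The standardization described above can be sketched as follows. This is a minimal illustration, not the code used in the study; the function name `standardize_log` and the example seed masses are hypothetical.

```python
import math

def standardize_log(values):
    """Log-transform trait values, then rescale the logs to the range [0, 1]."""
    logs = [math.log(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        raise ValueError("all trait values are identical; cannot rescale")
    return [(x - lo) / (hi - lo) for x in logs]

# Hypothetical seed masses for five species (arbitrary units)
seed_mass = [0.1, 1.0, 10.0, 100.0, 1000.0]
scaled = standardize_log(seed_mass)
```

Because the example values are evenly spaced on a logarithmic scale, the rescaled values are evenly spaced between 0 and 1.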

Metrics

For each site we calculate its Rao quadratic entropy, defined as \(Q=\sum_{i=1}^{S-1}\sum_{j=i+1}^S d_{ij}p_i p_j\), where \(S\) is the number of species, \(p_i\) and \(p_j\) are the relative abundances of species \(i\) and \(j\), \(d_{ij}\) is the absolute trait difference between them, and the sum is over all species pairs. Because each pair is counted once, \(Q\) equals half the expected trait difference between two individuals randomly sampled (with replacement) from the community. We also used the functional dispersion metric proposed in \cite{Laliberte2010}, defined as the abundance-weighted mean distance \(d_i\) between each species \(i\) and the community trait centroid. That is, \(\text{FDis}=\sum_i p_i d_i\). When a single trait is considered, this is simply \(\sum_i p_i |x_i-\sum_j p_j x_j|\), where \(x_i\) is the trait value of species \(i\). Both indices have been used to quantify community functional diversity \cite{Botta-Dukat2005,Laliberte2010,Ricotta2011}. A high value indicates trait overdispersion, i.e. species cover a wider region of trait space than expected by chance. In contrast, a low value suggests that species are being filtered toward a particular trait value, possibly due to selection for optimal tolerance to local environmental conditions \cite{Keddy1992}.
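For a single trait, the two dispersion indices can be computed directly from the formulas above. The sketch below is illustrative only; the function names `rao_q` and `fdis` and the three-species example are hypothetical, and the multi-trait (Euclidean) case is not covered.

```python
from itertools import combinations

def rao_q(traits, abund):
    """Rao quadratic entropy for one trait: sum over species pairs of d_ij * p_i * p_j."""
    total = sum(abund)
    p = [a / total for a in abund]
    return sum(abs(traits[i] - traits[j]) * p[i] * p[j]
               for i, j in combinations(range(len(traits)), 2))

def fdis(traits, abund):
    """Functional dispersion for one trait: abundance-weighted mean distance to the centroid."""
    total = sum(abund)
    p = [a / total for a in abund]
    centroid = sum(pi * xi for pi, xi in zip(p, traits))
    return sum(pi * abs(xi - centroid) for pi, xi in zip(p, traits))

# Hypothetical community: three species with standardized trait values and cover
traits = [0.0, 0.5, 1.0]
cover = [1, 2, 1]
q_value = rao_q(traits, cover)
fdis_value = fdis(traits, cover)
```

In this example the relative abundances are 0.25, 0.5, and 0.25, each of the three pairs contributes 0.0625 to \(Q\), and the centroid lies at 0.5.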

In addition to test statistics based on trait dispersion, we also used a measure of the degree of even spacing between adjacent species on the trait axis. The metric is defined as \(\text{CV}=\sigma/\mu\), where \(\mu\) and \(\sigma\) are respectively the mean and standard deviation of the distances between closest neighbors in trait space. When a single trait is considered, species can be ordered by trait value, and the distance vector is \(d_i=|x_i-x_{i+1}|\) between adjacent species \(i\) and \(i+1\). A low CV indicates even spacing. Even spacing has been proposed as indicative of niche differentiation, as it maximizes exploration of niche space \cite{Mason2005}, and minimizes competitive interactions caused by trait similarity \cite{MacArthur1967}. On the other hand, recent work has raised the possibility that resource partitioning may actually lead to species clustering on the trait axis \cite{Scheffer2006}. In particular, clusters in trait space are expected if competitive exclusion is slow or if immigration replenishes species that are not niche-differentiated \cite{DAndrea2016}. Given this possibility, the coefficient of variation may actually be higher than expected by chance.
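As a rough illustration of the single-trait case, the CV of gaps between adjacent species can be computed as below. The function name `spacing_cv` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics

def spacing_cv(traits):
    """Coefficient of variation of the gaps between species adjacent on the trait axis."""
    if len(traits) < 2:
        raise ValueError("need at least two species")
    xs = sorted(traits)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    mean_gap = statistics.fmean(gaps)
    if mean_gap == 0:
        raise ValueError("all trait values are identical")
    # Assumption: population standard deviation (statistics.pstdev)
    return statistics.pstdev(gaps) / mean_gap

# Hypothetical examples: evenly spaced species versus two tight clumps
cv_even = spacing_cv([0.0, 0.25, 0.5, 0.75, 1.0])
cv_clumped = spacing_cv([0.0, 0.01, 0.02, 0.98, 0.99, 1.0])
```

Evenly spaced species give a CV of zero, while the clumped example gives a much larger value because one large gap dominates several small ones.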

Although species may be clustered, they may still sort into niches that in turn are evenly spaced. This could occur if competition is caused by trait similarity \cite{Scheffer2006,DAndrea2017}. In that case, the most abundant species in the community might be expected to be evenly spaced even though the community as a whole is clustered. Based on these considerations, we used the CV in two metrics. First, we considered all species in the community without regard for abundance. A similar test statistic, the variance divided by the range, is commonly used to quantify evenness \cite{Stubbs2004,Kraft2008,Ingram2009}. Second, we gradually remove species from the community in increasing order of abundance, at each step calculating the CV among the remaining species. If the CV declines as the least abundant species are progressively removed, this suggests even spacing between niches concomitant with clustering between species.
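The removal procedure can be sketched as follows. The text does not say how the trend in CV is summarized, so the least-squares slope against removal step used here is an assumption, as are the function names, the `min_species` stopping rule, and the example community.

```python
import statistics

def spacing_cv(traits):
    """CV of gaps between adjacent species (population standard deviation assumed)."""
    xs = sorted(traits)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    return statistics.pstdev(gaps) / statistics.fmean(gaps)

def cv_removal_curve(traits, abund, min_species=3):
    """CV after removing 0, 1, 2, ... of the least abundant species."""
    order = sorted(range(len(traits)), key=lambda i: abund[i])  # rarest first
    curve = []
    for n_removed in range(len(traits) - min_species + 1):
        kept = order[n_removed:]
        curve.append(spacing_cv([traits[i] for i in kept]))
    return curve

def ols_slope(y):
    """Least-squares slope of y against the step index 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (yi - y_mean) for i, yi in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

# Hypothetical community: three abundant, evenly spaced species,
# each with a rare close neighbor on the trait axis
traits = [0.10, 0.12, 0.50, 0.52, 0.88, 0.90]
cover = [50, 2, 40, 3, 1, 45]
curve = cv_removal_curve(traits, cover)
slope = ols_slope(curve)
```

Here the CV falls at every step and reaches approximately zero once only the three abundant species remain, so the slope is negative.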

Finally, we test for the presence of clusters directly by applying a cluster-finding method. Our metric uses a k-medoid clustering algorithm, which partitions trait space into groups (clusters) of species, each group with a specific medoid, i.e. the species that is closest to all other members of its group. It is an iterative process which alternately decides cluster membership and medoid identity by minimizing the average distances in trait space between species and the medoids of their clusters \cite{Kaufman1990}. We implement the algorithm using the function clara in R package cluster \cite{Maechler2016}. For each community-year, we find the number of clusters that best fits the data using R’s optim function for Markov chain Monte Carlo optimization \cite{RCoreTeam2015}. The quantity being optimized is the average silhouette width, a measure of how similar individuals are to their own cluster compared to neighboring clusters \cite{Kaufman1990}. Once the optimal number of clusters is found, the test statistic is the optimized average silhouette width. We then test for clustering by comparing the test statistic against the set of null communities.
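The idea of choosing the number of clusters by maximizing the average silhouette width can be sketched in a few lines. This is not the `clara` and `optim` workflow described above: it swaps in a simple alternating k-medoids routine with random restarts and an exhaustive search over the number of clusters, works on a single trait only, and uses hypothetical function names and example data.

```python
import random

def k_medoids(points, k, rng, n_iter=100):
    """Alternate between assigning points to the nearest medoid and re-picking medoids."""
    medoids = rng.sample(range(len(points)), k)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: abs(p - points[medoids[c]]))
                  for p in points]
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # possible only with duplicate trait values
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(
                members,
                key=lambda i: sum(abs(points[i] - points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels

def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute zero."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, x in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        a = sum(abs(x - points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(abs(x - points[j]) for j in other) / len(other)
                for lab, other in clusters.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_clustering(points, k_max, n_starts=50, seed=0):
    """Return (best average silhouette width, number of clusters achieving it)."""
    rng = random.Random(seed)
    best_score, best_k = -1.0, None
    for k in range(2, k_max + 1):
        for _ in range(n_starts):
            score = mean_silhouette(points, k_medoids(points, k, rng))
            if score > best_score:
                best_score, best_k = score, k
    return best_score, best_k

# Hypothetical trait values forming three tight groups
points = [0.05, 0.08, 0.10, 0.48, 0.50, 0.53, 0.90, 0.93, 0.95]
score, k = best_clustering(points, k_max=5)
```

The returned silhouette width plays the role of the test statistic: it would then be compared against the same quantity computed on each null community.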

Null model

In order to create null communities against which to compare our data, we used a mainland-island approach, where each site undergoes zero-sum birth-death neutral dynamics and immigration from a fixed regional species pool \cite{Hubbell2001}. For each site, the regional pool includes all species falling within the observed trait range, with the regional abundance of each species calculated as the mean across all sites. For each site we estimated immigration rates by fitting a neutral model to the observed relative species cover, and estimated community size by matching the neutral simulated communities to observed species richness. Estimated community size ranged from 215 individuals for Fauske to 567 for Gudmedalen, and immigration rate ranged from 0.03 for Ovstedal to 0.53 for Lavisdalen. For each site we simulated 1,000 neutral communities.
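A zero-sum neutral community with immigration from a fixed pool can be simulated along the following lines. This is a simplified sketch rather than the fitting procedure used in the study: the function name, the number of steps, the toy regional pool, and the choice to let a dying individual be its own replacement parent are all assumptions. The community size and immigration rate in the example are taken from within the ranges reported above.

```python
import random
from collections import Counter

def simulate_neutral(pool_abund, J, m, n_steps, seed=None):
    """Zero-sum birth-death dynamics with immigration from a fixed regional pool.

    pool_abund: dict mapping species name to regional abundance
    J: local community size (number of individuals, held constant)
    m: probability that a death is replaced by an immigrant
    """
    rng = random.Random(seed)
    species = list(pool_abund)
    weights = [pool_abund[s] for s in species]
    # Start from a random draw from the regional pool
    community = rng.choices(species, weights=weights, k=J)
    for _ in range(n_steps):
        dead = rng.randrange(J)  # one individual dies
        if rng.random() < m:
            # replaced by an immigrant drawn in proportion to regional abundance
            community[dead] = rng.choices(species, weights=weights, k=1)[0]
        else:
            # replaced by the offspring of a randomly chosen local individual
            community[dead] = community[rng.randrange(J)]
    return Counter(community)

# Hypothetical regional pool of five species
pool = {"sp_a": 40, "sp_b": 25, "sp_c": 20, "sp_d": 10, "sp_e": 5}
community = simulate_neutral(pool, J=215, m=0.03, n_steps=20000, seed=1)
```

Repeating the simulation with different seeds yields a set of null communities on which each metric can be recomputed.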

To test for significance, for each of our sites in a given year we compare the metric value to quantiles of the corresponding set of null communities: the \((1-\alpha)\)-quantile for significantly high values and, in two-tailed tests, also the \(\alpha\)-quantile for significantly low values. Of our five metrics, three (Rao, FDis, CV) are two-tailed, as both low and high values can be interpreted to suggest specific community assembly processes, while the other two (CVtrend, Clara) are one-tailed. We use significance level \(\alpha=0.025\) for the two-tailed tests and \(\alpha=0.05\) for the one-tailed tests.
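The comparison against null quantiles can be sketched as below. The function names and the linear-interpolation quantile are assumptions; the text does not state which quantile estimator was used.

```python
def empirical_quantile(sorted_vals, q):
    """Quantile of a sorted list using linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def classify(observed, null_values, alpha, two_tailed):
    """Return 'high', 'low', or 'ns' for an observed metric value against its nulls."""
    nulls = sorted(null_values)
    if observed > empirical_quantile(nulls, 1 - alpha):
        return "high"
    if two_tailed and observed < empirical_quantile(nulls, alpha):
        return "low"
    return "ns"

# Hypothetical null distribution: 1,000 values spread evenly between 0 and 1
nulls = [i / 999 for i in range(1000)]
result_high = classify(0.99, nulls, alpha=0.025, two_tailed=True)
result_low = classify(0.01, nulls, alpha=0.025, two_tailed=True)
result_one_tailed = classify(0.01, nulls, alpha=0.05, two_tailed=False)
```

With these settings a two-tailed test flags values outside the central 95% of the nulls, and a one-tailed test flags only values above the 95th percentile.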

Results

Fig. 2 summarizes our results for the 2009 census. Bar plots show the percentage of the 12 sites that tested significant against the set of null communities. We focus on the 2009 census, but our results were consistent across the years (see Figure S2 in the Supplement), indicating that deterministic factors are playing a role in the trait structure of our communities.

Leaf area and SLA, which are related traits, had similar results across most tests. Between 30% and 50% of sites were significantly overdispersed according to Rao and FDis. A smaller percentage (20%) of sites were significantly underdispersed in SLA. The CV was significantly high for leaf area in 50% of the sites, indicating uneven spacing between adjacent species. Results were weaker and more ambiguous for SLA: spacing between adjacent species was significantly even in 20% of sites, and significantly uneven in another 10%.

In contrast, seed mass showed the strongest indication of underdispersion: 30% and 50% of sites had significantly low Rao and FDis indices, respectively. Furthermore, there was no significant evenness in any of the sites according to the CV metric. In addition, 25% of the sites showed a significant negative trend in CV as low-abundance species were removed.

Results were ambiguous for maximum plant height. Rao and FDis results were relatively strong but split between significant overdispersion and underdispersion, with the latter being a slight majority. Our CV result was also ambivalent, with 30% of sites indicating even spacing between species while another 20% indicated the opposite pattern. In 20% of sites there was a significant negative trend in CV as rare species were removed.

When all four traits were considered together in a Euclidean trait space, results were somewhat ambiguous for the functional dispersion metrics but tended towards overdispersion (30% overdispersion against 20% underdispersion). According to the CV, species were evenly spaced in this multidimensional space in 20% of the sites, and were not significantly uneven in any site.

Rao and FDis results were largely consistent with each other for all traits and the Euclidean space, corroborating previous results that indicate these two statistics are related \cite{Laliberte2010}.

A low percentage of sites, between 10% and 25%, showed evidence of significant clustering according to the CV trend and Clara metrics. Particularly for Clara, numbers were consistently low across traits and the Euclidean space, averaging just above 10% detection of significance. Given the null expectation of significance in 5% of the sites because of our \(\alpha=0.05\) significance cutoff, these results suggest that species are not sorting into distinguishable clusters in our sites.

Figure 3 shows the variation in the standard score of our Rao results against mean summer temperature of our sites². We see a significant trend in Rao scores against temperature for SLA and max height, plus the Euclidean space. The trend is negative in all cases, indicating that colder sites tended to be more overdispersed than warmer sites.
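The standard score and a simple trend against temperature can be computed as in the sketch below. The function names and the example numbers are hypothetical, the sample standard deviation is an assumption, and the significance test applied to the trend is not shown because the text does not specify it.

```python
import statistics

def standard_score(observed, null_values):
    """z = (x - mu) / sigma, with mu and sigma taken over the null communities."""
    mu = statistics.fmean(null_values)
    sigma = statistics.stdev(null_values)  # assumption: sample standard deviation
    return (observed - mu) / sigma

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
    num = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    den = sum((xi - x_mean) ** 2 for xi in x)
    return num / den

# Hypothetical values: one observed metric against three null values,
# and standard scores for three sites of increasing summer temperature
z = standard_score(3.0, [1.0, 2.0, 3.0])
slope = ols_slope([6.0, 8.0, 10.0], [1.0, 0.0, -1.0])
```

A negative slope, as in this toy example, corresponds to higher standard scores at colder sites.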

Results for the other metrics across the years are shown in Figure S3 in the Supplement and summarized in Table 1. Aside from FDis, which showed trends similar to those of Rao for the same traits, we found a negative trend in CV for leaf area in two years and for seed mass in one year, and a positive trend in Clara for SLA and the Euclidean space in one year. We also see that for SLA the positive slope in CV as low-abundance species are removed was slightly steeper at higher temperatures, whereas for leaf area, max height, and the Euclidean space the opposite was observed. It should be noted that, although consistent across years, those trends were weak and the standard scores involved had small magnitude.

Trends were for the most part consistent across years. No trait showed opposite trends in different years, and many trends were observed in all four years, while some occurred in one, two, or three years (Table 1, see also Fig. S3). We also checked for trends against mean annual precipitation, but found largely nonsignificant results (Fig. S3).

Discussion

Our results indicate that, relative to the regional pool, the leaf traits were often overdispersed in our local alpine communities, in the sense that species with extreme trait values tend to be more abundant than would be expected from a random draw from the pool. We found no evidence that species in local communities are evenly spaced on the leaf trait axes; on the contrary, species tended to be unevenly dispersed in leaf area. There was some suggestion, however, that even spacing occurred between the most abundant species. Lastly, species rarely seemed to form recognizable groups in these leaf traits. The trait overdispersion, concomitant with the lack of even spacing, is compatible with the hypothesis that species are being selected into distinct functional groups or niches, but that within each niche species either compete neutrally or are selected for a particular trait value.

Seed mass showed the opposite behavior of leaf traits, as a sizeable fraction of our local communities were underdispersed in seed mass: species with a particular seed mass tended to occur more frequently or be more abundant than those deviating from the optimum, possibly because they are better adapted to local conditions or because they are better competitors. There was some suggestion of

Even spacing between adjacent species was distinctly observed in the Euclidean space formed by leaf area, SLA, maximum plant height, and seed mass. Spacing seemed more even between the more abundant species in about one in five sites. Trait dispersion results were ambiguous in the Euclidean space, with the number of significantly underdispersed communities roughly matching that of overdispersed communities. Overall, these results are compatible with the classical idea that species avoid competition by maximizing interspecies distances in niche space.

Figures

\label{fig:Fig1}

Example data from the site of Lavisdalen from the 2009 census. Species are arranged by trait value on the x-axis, and species percentage cover is shown on the y-axis. Trait values are logged and then normalized to range between 0 and 1. Maximum height values are jittered to show species sharing the same binned value.

\label{fig:Fig2}

Summary of metric results across sites from the 2009 census. For each test, the percentage of sites with statistically significant results is shown for leaf area (LFA), specific leaf area (SLA), maximum plant height (MXH), seed mass (SDM), and the four-dimensional Euclidean space formed by these traits (EUC). Rao, FDis, and CV are two-tailed: bars show the percentage of sites, out of 12, whose metric values were lower than the 2.5% null percentile (red bars) or higher than the 97.5% null percentile (blue bars). CV trend and Clara are one-tailed: bars show the percentage of sites with metric values exceeding the 95% null percentile.

\label{fig:Fig3}

Standard scores of the Rao metric for each site, plotted against the site’s mean summer temperature. Significant negative trends were observed for SLA, maximum plant height, and the 4-d Euclidean space. Results shown for the 2009 census.

\label{table:Table1}

¹ Mathematically, we standardize by defining \(y_i=(\log(x_i)-\log(x_\text{min}))\,/\,(\log(x_\text{max})-\log(x_\text{min}))\), where \(x_i\) is the trait value measured for species \(i\), and \(x_\text{min}\) and \(x_\text{max}\) are the lowest and highest trait value observed in the data.
² The standard score measures the difference between the data and the null communities relative to the variation across the nulls. If the test score in a site was \(x\), and the mean and standard deviation of the null scores were respectively \(\mu\) and \(\sigma\), then the standard score is \(z=(x-\mu)/\sigma\).