Rafael D'Andrea edited untitled.tex  almost 8 years ago

Commit id: 798c93acf25c609bb6e373b86a50df464303c8a2


\subsection*{Data}

\rem{Explain sampling design}

We measured four traits: leaf area (LFA), specific leaf area (SLA), maximum plant height (MXH), and seed mass (SDM). \rem{Explain what competitive strategies each trait is associated with.} We standardized our traits by taking the logarithm of the trait value and rescaling the logarithms to range between 0 and 1\footnote{Mathematically, we standardize by defining $y_i=(\log(x_i)-\log(x_\text{min}))\,/\,(\log(x_\text{max})-\log(x_\text{min}))$, where $x_i$ is the trait value measured for species $i$, and $x_\text{min}$ and $x_\text{max}$ are the lowest and highest trait values observed in the data.}. We applied our tests to each trait individually, as well as to the Euclidean space formed by these traits, which is a four-dimensional hypercube of side 1.

\subsection*{Metrics}

For each site we calculated its Rao quadratic entropy, defined as $Q=\sum_{i=1}^{S-1}\sum_{j=i+1}^S d_{ij}p_i p_j$, where $S$ is the number of species, $p_i$ and $p_j$ are the relative abundances of species $i$ and $j$, $d_{ij}$ is the absolute trait difference between them, and the sum is over all species pairs. It equals half the expected trait difference between two individuals randomly sampled (with replacement) from the community. We also used the functional dispersion metric proposed in \cite{Laliberte2010}, defined as the abundance-weighted mean distance $d_i$ between each species $i$ and the community trait centroid. That is, $\text{FDis}=\sum_i p_i d_i$. When a single trait is considered, this is simply $\sum_i p_i |x_i-\sum_j p_j x_j|$, where $x_i$ is the trait value of species $i$. Both indices have been used to quantify community functional diversity \cite{Botta-Dukat2005, Laliberte2010, Ricotta2011}. A high value indicates trait overdispersion, i.e.,
species cover a wider region of trait space than expected by chance. In contrast, a low value suggests that species are being filtered toward a particular trait value, possibly due to selection for optimal tolerance to local environmental conditions \cite{Keddy1992}.

In addition to test statistics based on trait dispersion, we also used a measure of the degree of even spacing between adjacent species on the trait axis. The metric is defined as $\text{CV}=\sigma/\mu$, where $\mu$ and $\sigma$ are respectively the mean and standard deviation of the distances between closest neighbors in trait space. When a single trait is considered, species can be ordered by trait value, and the distance vector is $d_i=|x_i-x_{i+1}|$ between adjacent species $i$ and $i+1$. A low CV indicates even spacing. Even spacing has been proposed as indicative of niche differentiation, as it maximizes exploration of niche space \cite{Mason2005}, and minimizes competitive interactions caused by trait similarity \cite{MacArthur1967}.

On the other hand, recent work has raised the possibility that resource partitioning may actually lead to species clustering on the trait axis \cite{Scheffer2006}. In particular, clusters in trait space are expected if competitive exclusion is slow or if immigration replenishes species that are not niche-differentiated \cite{DAndrea2016}. Given this possibility, the coefficient of variation may actually be higher than expected by chance.
%as exclusion is actually faster in the gaps between niches than in their immediate vicinity

Although species may be clustered, they may still sort into niches that in turn are evenly spaced.
This could occur if competition is caused by trait similarity \cite{Scheffer2006, DAndrea2017}. In that case, the most abundant species in the community might be expected to be evenly spaced even though the community as a whole is clustered. Based on these considerations, we used the CV in two metrics. First, we considered all species in the community without regard for abundance. A similar test statistic, the variance divided by the range, is commonly used to quantify evenness \cite{Stubbs2004, Kraft2008, Ingram2009}. Second, we gradually removed species from the community in increasing order of abundance, at each step calculating the CV among the remaining species. If the CV declines as the least abundant species are progressively removed, this suggests even spacing between niches concomitant with clustering between species.
%To our knowledge, this is the first use of this metric to describe trait pattern in species assemblages.

Finally, we tested for the presence of clusters directly by applying a cluster-finding method. Our metric uses a k-medoid clustering algorithm, which partitions trait space into groups (clusters) of species, each group with a specific medoid, i.e.\ the species with the smallest average distance to all other members of its group. It is an iterative process which alternately decides cluster membership and medoid identity by minimizing the average trait distance between species and the medoids of their clusters \cite{Kaufman1990}. We implemented the algorithm using the function \textit{clara} in the R package \textit{cluster} \cite{Maechler2016}.
For each community-year, we found the number of clusters that best fits the data using R's \textit{optim} function for Markov chain Monte Carlo optimization \cite{RCoreTeam2015}. The quantity being optimized is the average silhouette width, a measure of how similar individuals are to their own cluster compared to neighboring clusters \cite{Kaufman1990}. Once the optimal number of clusters is found, the test statistic is the optimized average silhouette width. We then tested for clustering by comparing the test statistic against the set of null communities.

\subsection*{Null model}

To create null communities against which to compare our data, we used a mainland-island approach, where each site undergoes zero-sum birth-death neutral dynamics and immigration from a fixed regional species pool \cite{Hubbell2001}. For each site, the regional pool includes all species falling within the observed trait range, with the regional abundance of each species calculated as the mean across all sites. For each site we estimated the immigration rate by fitting a neutral model to the observed relative species cover, and estimated community size by matching the neutral simulated communities to observed species richness. Estimated community size ranged from 215 individuals for Fauske to 567 for Gudmedalen, and immigration rate ranged from 0.03 for Ovstedal to 0.53 for Lavisdalen. For each site we simulated 1,000 neutral communities.
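
The neutral update underlying these simulations can be sketched as follows. This is a minimal illustration assuming the standard zero-sum formulation of \cite{Hubbell2001}; the symbols $J$, $m$, $n_i$ and $P_i$ are notation introduced here for illustration, and the exact update rule used in our simulations may differ in detail. At each step one individual, chosen at random from the $J$ individuals in the site, dies and is replaced by an individual of species $i$ with probability
\[
\Pr(i) = m\,P_i + (1-m)\,\frac{n_i}{J-1},
\]
where $m$ is the immigration rate, $P_i$ is the relative abundance of species $i$ in the regional pool, and $n_i$ is the local abundance of species $i$ after the death event (so that $\sum_i n_i = J-1$). Trait values do not enter this expression, which is what makes the resulting communities a suitable null for trait-based patterns.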

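As a worked illustration of the test statistics defined under Metrics (the numbers are hypothetical and not drawn from our data), consider a community of three species with standardized trait values $x=(0,\,0.5,\,1)$ and relative abundances $p=(0.5,\,0.3,\,0.2)$. The Rao quadratic entropy is
\[
Q = d_{12}\,p_1 p_2 + d_{13}\,p_1 p_3 + d_{23}\,p_2 p_3 = (0.5)(0.15) + (1)(0.10) + (0.5)(0.06) = 0.205 .
\]
The abundance-weighted centroid is $\sum_j p_j x_j = 0.35$, so that
\[
\text{FDis} = 0.5\,|0-0.35| + 0.3\,|0.5-0.35| + 0.2\,|1-0.35| = 0.175 + 0.045 + 0.130 = 0.35 .
\]
The distances between adjacent species are both $0.5$, so $\sigma=0$ and $\text{CV}=0$, the value expected under perfectly even spacing.
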
\section{Results}

Fig.\ 2 summarizes our results for the 2009 census. Bar plots show the percentage of the 12 sites that tested significant against the set of null communities. We focus on the 2009 census, but our results were consistent across years (see Figure S2 in the Supplement), indicating that deterministic factors are playing a role in the trait structure of our communities.

Leaf area and SLA, which are related traits, had similar results across most tests. Between 30\% and 50\% of sites were significantly overdispersed according to Rao and FDis. A smaller percentage (20\%) of sites were significantly underdispersed in SLA. The CV was significantly high for leaf area in 50\% of the sites, indicating uneven spacing between adjacent species. Results were weaker and more ambiguous for SLA: spacing between adjacent species was significantly even in 20\% of sites, and significantly uneven in another 10\%.

In contrast, seed mass showed the strongest indication of underdispersion. Rao and FDis indices were significantly low in 30\% and 50\% of sites, respectively. Furthermore, there was no significant evenness in any of the sites according to the CV metric. In addition, 25\% of the sites showed a significant negative trend in CV as low-abundance species were removed.

Results were ambiguous for maximum plant height. Rao and FDis results were relatively strong but split between significant overdispersion and underdispersion, with the latter being a slight majority. Our CV result was also ambivalent, with 30\% of sites indicating even spacing between species while another 20\% indicated the opposite pattern. A significant negative trend in CV as rare species were removed was found in 20\% of sites.

When all four traits were considered together in a Euclidean trait space, results were somewhat ambiguous for the functional dispersion metrics but tended towards overdispersion (30\% overdispersion against 20\% underdispersion).
According to the CV, species were evenly spaced in this multidimensional space in 20\% of the sites, and were not significantly uneven in any site.

Rao and FDis results were largely consistent with each other for all traits and the Euclidean space, corroborating previous results that indicate these two statistics are related \cite{Laliberte2010}.

A low percentage of sites, between 10\% and 25\%, showed evidence of significant clustering according to the CV trend and Clara metrics. Particularly for Clara, numbers were consistently low across traits and the Euclidean space, averaging just above 10\% detection of significance. Given the null expectation of significance in 5\% of the sites because of our $\alpha=0.05$ significance cutoff, these results suggest that species are not sorting into distinguishable clusters in our sites.

Figure 3 shows the variation of our Rao results against mean summer temperature of our sites. Points correspond to sites and are plotted by temperature on the x-axis, while the standard score of the test statistic obtained for that site is plotted on the y-axis\footnote{The standard score measures the difference between the data and the null communities scaled by the variation across the nulls. In other words, if the test score in a site was $x$, and the mean and standard deviation of the null scores were respectively $\mu$ and $\sigma$, then the standard score is $z=(x-\mu)/\sigma$.}. We see a significant trend in Rao scores against temperature for two of the four traits, SLA and maximum height, plus the Euclidean space. The trend is negative in all cases, indicating that colder sites tended to be more overdispersed than warmer sites. The data shown are from the 2009 census, but results were consistent across years.

Results for the other metrics are shown in Figure S3 in the Supplement. Aside from FDis, which showed the same trends as Rao, the CV trend metric showed a trend against temperature in SLA and the Euclidean space.
Those trends were probably incidental, as they were weak and the standard scores were low. We also checked for trends against mean annual precipitation but only found weak negative trends in clustering (Clara) in seed mass and the Euclidean space.

\section{Discussion}

\newpage

\section*{Figures}

\begin{figure}[h]
\centering
\caption{Example data from the site of Lavisdalen from the 2009 census. Species are arranged by trait value on the x-axis, and species percentage cover is shown on the y-axis. Trait values are logged and then normalized to range between 0 and 1. Maximum height values are jittered to show species sharing the same binned value.}
\label{fig:Fig1}
\includegraphics[width=1\textwidth,angle=0]{Fig1}
\end{figure}

\begin{figure}[h]
\centering
\caption{Summary of metric results across sites from the 2009 census. For each test, the percentage of sites with statistically significant results is shown for leaf area (LFA), specific leaf area (SLA), maximum plant height (MXH), seed mass (SDM), and the four-dimensional Euclidean space formed by these traits (EUC). Rao, FDis, and CV are two-tailed: bars show the percentage of sites, out of 12, whose metric values were lower than the 2.5\% null percentile (red bars) or higher than the 97.5\% null percentile (blue bars). CV trend and Clara are one-tailed: bars show the percentage of sites with metric values exceeding the 95\% null percentile.}
\label{fig:Fig2}
\includegraphics[width=1\textwidth,angle=0]{Fig2}
\end{figure}

\begin{figure}[h]
\centering
\caption{Standard scores of the Rao metric for each site, plotted against the site's mean summer temperature. Significant negative trends were observed for SLA, maximum plant height, and the four-dimensional Euclidean space. Results shown for the 2009 census.}
\label{fig:Fig3}
\includegraphics[width=1\textwidth,angle=0]{Fig3}
\end{figure}

\end{document}