Data analysis
For each waterbody category (small lakes, large lakes, and rivers), we determined the species richness and the total number of species in each IUCN status category. We used a non-parametric Kruskal-Wallis test to assess differences in species richness and in the number of species per IUCN category among waterbody categories. Post hoc multiple comparisons were conducted with Dunn’s test to identify which waterbody categories differed significantly. We generated a species accumulation curve for the water bodies to assess whether most of the species had been captured in the data. We quantified the rarity of each species by summing its frequency of occurrence across the water bodies where it was found. We used the Bray-Curtis dissimilarity measure to compute the ranks for water bodies and species, and then used non-metric multidimensional scaling (nMDS) to visualize the species and water bodies in two-dimensional ordination space. We then performed an analysis of similarities (ANOSIM) to test for statistical differences among waterbody categories. A similarity percentage analysis (SIMPER) was used to evaluate the contribution of each species to the dissimilarities between waterbody categories.
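As an illustration of the dissimilarity measure named above, the following minimal Python sketch computes pairwise Bray-Curtis dissimilarities from abundance vectors. It is not the R workflow used in this study; the function names, the site labels, and the counts are hypothetical and serve only to show the calculation, assuming non-negative abundances.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("dissimilarity is undefined when both sites are empty")
    # Sum of absolute differences divided by the total abundance at both sites.
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def pairwise_bray_curtis(sites):
    """Return {(site_i, site_j): dissimilarity} for every pair of sites."""
    return {
        (i, j): bray_curtis(sites[i], sites[j])
        for i, j in combinations(sorted(sites), 2)
    }


# Hypothetical counts for three species at three water bodies (not study data).
example_sites = {
    "large_lake": [10, 0, 5],
    "river": [0, 8, 0],
    "small_lake": [5, 5, 5],
}
distances = pairwise_bray_curtis(example_sites)
```

For non-negative counts the value ranges from 0 (identical composition) to 1 (no shared species); an ordination such as nMDS then works on the rank order of these pairwise values.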
For the index, we log-transformed the CPIw values and used Shapiro-Wilk and Levene tests to check for normality and equality of variance, respectively. We then used a parametric Welch two-sample t-test to evaluate the difference between the mean CPIw values of large and small lakes. We processed the data with predefined functions in R, including specaccum, diversity, metaMDS, simper, and anosim from the vegan package (Oksanen et al. 2019) and dunn.test from the dunn.test package (Dinno, 2017).
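The log-transformation and Welch comparison described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' R code: the natural logarithm is assumed because the base is not given, the function names are hypothetical, and only the test statistic and the Welch-Satterthwaite degrees of freedom are computed (obtaining a p-value would additionally require the t distribution).

```python
import math
from statistics import mean, variance


def log_transform(values):
    """Natural-log transform; the log base is an assumption of this sketch."""
    return [math.log(v) for v in values]


def welch_t(x, y):
    """Return (t, df) for Welch's two-sample t-test with unequal variances."""
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each sample (sample variance / n).
    sx, sy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(sx + sy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (sx + sy) ** 2 / (sx ** 2 / (nx - 1) + sy ** 2 / (ny - 1))
    return t, df


def compare_log_values(group_a, group_b):
    """Log-transform two groups of positive values, then apply welch_t."""
    return welch_t(log_transform(group_a), log_transform(group_b))
```

Unlike Student's t-test, the Welch form does not pool the two variances, so it remains usable when the two groups differ in spread.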