Clinical Data Analysis
Continuous variables were reported as means and standard deviations (SD) or medians and interquartile ranges (IQR); categorical variables were reported as frequencies and percentages. Differences in baseline characteristics between cohorts were evaluated using t-tests for continuous variables and Pearson chi-square or Fisher’s exact tests for categorical variables. Spearman correlation analyses were conducted to assess the direction and strength of the relations between the score of each item in the OABSS (e.g., daytime frequency, nighttime frequency, urgency, urgency incontinence) and bacterial abundance, and between clinical data and indices of bacterial alpha diversity. These analyses were performed using the Statistical Package for the Social Sciences (SPSS, version 21, USA). Differentially abundant taxa between cohorts were identified with the Wilcoxon rank sum test, followed by Benjamini-Hochberg false discovery rate correction, in R (version 3.4.1, stats package). All statistical tests were two-tailed, and results were considered significant when the P value was less than 0.05.
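The Benjamini-Hochberg step can be illustrated with a short Python sketch. It is not the code used in this study, which ran the correction in R; it follows the same step-up adjustment that R's `p.adjust(p, method = "BH")` computes. The function name and the P values below are hypothetical and serve only to show how a raw P value under 0.05 can lose significance after correction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        candidate = p_values[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds the one
        # assigned to the next-larger raw p-value.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-taxon P values from Wilcoxon rank sum tests (not study data).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adjusted_p]
```

In this made-up example, the third and fourth taxa have raw P values below 0.05 but adjusted values above it, so only the first two would be reported as differentially abundant.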
The wrapper package Quantitative Insights Into Microbial Ecology (QIIME) was used to process the raw reads and create an operational taxonomic unit (OTU) table. Using an open-reference selection strategy with Uclust, the sequences were clustered into OTUs at the default similarity level of 97%, and chimera detection was then performed using the program UCHIME. A single representative sequence from each clustered OTU was aligned to the Greengenes database using the Ribosomal Database Project Classifier.
Alpha diversity, including the Observed species, Chao1, Shannon, Simpson, Abundance-Based Coverage Estimator (ACE) and Pielou’s indices, was evaluated using QIIME. Chao1, ACE and Observed species were used to estimate richness; samples with larger values are richer. Evenness was calculated with Pielou’s index, which ranges from 0 to 1, with 1 indicating a completely even community and smaller values indicating that certain species are more abundant than others. The Shannon and Simpson indices combine richness and evenness: larger Shannon values indicate more diverse communities with greater richness and/or evenness, whereas larger Simpson values indicate less diverse communities. Differences in alpha diversity between groups were evaluated with the Wilcoxon rank sum test. Beta diversity, measured as the Bray-Curtis, weighted UniFrac and unweighted UniFrac distances, was used to compare microbial composition between samples. Taxa summaries were reformatted and input into linear discriminant analysis effect size (LEfSe) via the Huttenhower Lab Galaxy Server to identify significantly different bacteria as biomarkers between groups at the genus level. In LEfSe,11 significantly different bacteria were identified using the Mann-Whitney U test, and their effect sizes were estimated via linear discriminant analysis (LDA). The threshold on the logarithmic LDA score for discriminative features was 2.0.
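To make the interpretation of these indices concrete, the following Python sketch computes several of them from a vector of OTU counts. It is an illustration under stated assumptions, not the QIIME implementation: it uses the natural logarithm for the Shannon index, the dominance form of the Simpson index (the sum of squared relative abundances, consistent with larger values indicating lower diversity), and a textbook form of Chao1. QIIME's conventions may differ. All function names and the two example communities are hypothetical.

```python
import math


def relative_abundances(counts):
    """Proportions of the non-zero OTU counts in one sample."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]


def observed_species(counts):
    """Richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index, natural logarithm (assumption of this sketch)."""
    return -sum(p * math.log(p) for p in relative_abundances(counts))


def simpson_dominance(counts):
    """Simpson index in its dominance form: larger means less diverse."""
    return sum(p * p for p in relative_abundances(counts))


def pielou_evenness(counts):
    """Pielou's index: Shannon index divided by its maximum, ln(richness)."""
    s = observed_species(counts)
    # Undefined for a single OTU; this sketch returns 0.0 in that case.
    return shannon(counts) / math.log(s) if s > 1 else 0.0


def chao1(counts):
    """Chao1 richness estimate from singleton and doubleton counts."""
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    s_obs = observed_species(counts)
    if doubletons > 0:
        return s_obs + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + singletons * (singletons - 1) / 2


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same OTUs."""
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    denominator = sum(a + b for a, b in zip(counts_a, counts_b))
    return numerator / denominator


# Two hypothetical communities with the same richness but different evenness.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]

for name, sample in (("even", even), ("skewed", skewed)):
    print(name, observed_species(sample), round(shannon(sample), 3),
          round(simpson_dominance(sample), 4),
          round(pielou_evenness(sample), 3), chao1(sample))
print("Bray-Curtis:", bray_curtis(even, skewed))
```

Both example communities contain four observed OTUs, yet the even community has the higher Shannon and Pielou values and the lower Simpson dominance. This shows why richness and evenness indices are reported separately.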