2.4 Bioinformatics and statistical analysis
Raw sequences were processed using QIIME (version 1.9.1) (Caporaso et al., 2011). Adaptors and primers were removed using AdapterRemoval (Lindgreen, 2012), and PhiX contamination was removed using DeconSeq (Schmieder & Edwards, 2011). Reads were merged and filtered by size (according to primer set) and quality (Phred quality score > 2). The sequences were then clustered into operational taxonomic units (OTUs) at 97% identity using an open-reference strategy, with the GreenGenes database (v13_5; DeSantis et al., 2006) as the reference. Taxonomy was assigned with the RDP classifier (Wang et al., 2007), retrained with SILVA (Release 115, http://www.arb-silva.de) for bacterial 16S rRNA gene sequences and with UNITE (v7.2, https://unite.ut.ee/) for fungal ITS sequences. OTUs assigned to chloroplasts and mitochondria were filtered out of the data set. Chimeric OTUs were identified using UCHIME (version 4.2, http://drive5.com/usearch/manual/uchime_algo.html) and removed from the OTU table.
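The removal of organelle-derived OTUs can be illustrated with a minimal Python sketch. It is not the QIIME implementation used in the study: the function name, the dictionary-based OTU table, and the example lineages are all hypothetical, and the match is a simple case-insensitive substring test on the taxonomy string.

```python
def filter_organelle_otus(otu_table, taxonomy,
                          excluded=("chloroplast", "mitochondria")):
    """Return a new OTU table without OTUs whose lineage names an excluded term.

    otu_table: dict mapping OTU id -> list of per-sample read counts (hypothetical layout)
    taxonomy:  dict mapping OTU id -> lineage string (hypothetical layout)
    """
    kept = {}
    for otu_id, counts in otu_table.items():
        # OTUs without a taxonomy entry are kept (assumption of this sketch).
        lineage = taxonomy.get(otu_id, "").lower()
        if any(term in lineage for term in excluded):
            continue  # drop chloroplast / mitochondrial OTUs
        kept[otu_id] = counts
    return kept


# Made-up example data, for illustration only.
otu_table = {
    "OTU_1": [120, 98, 143],
    "OTU_2": [15, 0, 7],
    "OTU_3": [4, 22, 9],
}
taxonomy = {
    "OTU_1": "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria",
    "OTU_2": "k__Bacteria; p__Cyanobacteria; c__Chloroplast",
    "OTU_3": "k__Bacteria; p__Proteobacteria; o__Rickettsiales; f__mitochondria",
}

filtered = filter_organelle_otus(otu_table, taxonomy)
print(sorted(filtered))  # only OTU_1 remains
```

A substring match keeps the sketch short; a real pipeline would more likely match specific taxonomic ranks so that unrelated names containing these words are not discarded.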
All statistical analyses were carried out in R. The Chao and ACE estimators (http://www.mothur.org) were calculated to characterize community richness, and the Shannon and Simpson indices were calculated to characterize community diversity. Rarefaction curves, reflecting the sequencing depth, were calculated using custom R scripts. To characterize richness within a specific rhizosphere community, custom R scripts were also used to produce Shannon-Wiener curves, Venn diagrams, and microbial community bar plots. For the β-diversity analyses, the R package vegan was used to generate the heat map. Principal coordinate analyses (PCoA) based on weighted UniFrac distances were performed using the pcoa() function of the R package ape (Paradis, Claude, & Strimmer, 2004). Plots and figures were generated with R (version 3.2.1) using the vegan, plyr, beanplot, ggplot2, and vcd packages. Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) and FunGuild (Nguyen et al., 2016) were used to predict the metabolic functions of the microbial community, including nitrogen, methane, and energy metabolism. All data were analyzed using one-way ANOVA and were transformed when necessary. Post hoc tests were used to investigate relationships between main factors when interactions were not significant.
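For readers unfamiliar with the α-diversity measures named above, the following Python sketch shows how three of them can be computed from a vector of OTU counts for one sample. It is only an illustration under stated assumptions, not the R or mothur code used in the study: the bias-corrected form of Chao1, the natural logarithm for the Shannon index, and the Simpson index expressed as the sum of squared relative abundances are choices made here, and the text does not say which variants were applied. ACE is omitted for brevity, and the example counts are invented.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)   # F1: OTUs seen once
    doubletons = sum(1 for c in counts if c == 2)   # F2: OTUs seen twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index D = sum(p_i ** 2); lower values indicate higher diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Invented counts for a single sample (one entry per OTU).
sample = [10, 5, 3, 1, 1, 2, 2, 0]

print("Chao1:  ", round(chao1(sample), 2))
print("Shannon:", round(shannon(sample), 3))
print("Simpson:", round(simpson(sample), 3))
```

In this example the sample contains 7 observed OTUs, 2 singletons, and 2 doubletons, so the bias-corrected Chao1 estimate is 7 + (2 × 1) / (2 × 3) ≈ 7.33.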