Sequences were processed with the cluster-free filtering scripts provided by Tikhonov et al., using default parameters. In essence, the scripts estimate the error rates of specific one-nt substitutions directly from the data. These error rates are then used to calculate the probability that a given sequence was generated by sequencing errors from its more abundant neighbors (the 'null model'). Only sequences with abundances above a threshold of 10 counts (larger than 10 in at least two samples) and exceeding the null-model prediction by at least 10-fold were kept as candidates (Tikhonov 2014).
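The filtering criterion can be illustrated with a minimal sketch. This is not the Tikhonov et al. code: the function names, the data layout, and the choice to apply the fold test to abundances pooled across samples are assumptions made for illustration, and the substitution error rates are taken as given rather than estimated from the data.

```python
from typing import Dict, Iterator, List, Tuple

def one_nt_neighbors(seq: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (neighbor, neighbor_nt, seq_nt) for every one-nt substitution of seq."""
    for i, seq_nt in enumerate(seq):
        for neighbor_nt in "ACGT":
            if neighbor_nt != seq_nt:
                yield seq[:i] + neighbor_nt + seq[i + 1:], neighbor_nt, seq_nt

def null_expectation(seq: str,
                     totals: Dict[str, int],
                     error_rates: Dict[Tuple[str, str], float]) -> float:
    """Expected count of seq if it arose only by one-nt errors
    from neighbors that are more abundant than seq itself."""
    expected = 0.0
    for neighbor, src, dst in one_nt_neighbors(seq):
        n = totals.get(neighbor, 0)
        if n > totals[seq]:
            # error_rates[(src, dst)]: assumed per-read rate of reading src as dst
            expected += n * error_rates.get((src, dst), 0.0)
    return expected

def filter_sequences(counts: Dict[str, List[int]],
                     error_rates: Dict[Tuple[str, str], float],
                     min_count: int = 10,
                     min_samples: int = 2,
                     min_fold: float = 10.0) -> List[str]:
    """Keep sequences with more than min_count reads in at least min_samples
    samples and a pooled abundance at least min_fold times the null prediction."""
    totals = {s: sum(c) for s, c in counts.items()}  # pooling is an assumption
    kept = []
    for seq, per_sample in counts.items():
        if sum(1 for c in per_sample if c > min_count) < min_samples:
            continue
        if totals[seq] >= min_fold * null_expectation(seq, totals, error_rates):
            kept.append(seq)
    return kept
```

Under these assumptions, a low-abundance sequence that sits one substitution away from a dominant sequence is discarded unless its observed count is at least ten times what the estimated error rate alone would produce.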