usearch manual
QIIME grossly over-estimates mock community diversity

See also
 
Defining and interpreting OTUs
  Alpha diversity
  Problems with closed- and open-reference clustering

Below are rarefaction curves generated by QIIME from Illumina V4 reads of a mock community with 22 strains; data from (Bokulich et al. 2013). There are four samples, two Even and two Staggered. The figures were generated by the core_diversity_analyses.py script, following the procedure recommended by the QIIME Illumina tutorial (http://qiime.org/tutorials/otu_picking.html, accessed 25 Apr 2017), except for the sampling depth (-e option), which is left for the user to decide. I tried a range of read depths from one thousand to one million.

The curves show that richness is inflated by a factor ranging from ~5x (~100 OTUs at a depth of one thousand reads per sample, which is very shallow by today's standards) to ~200x (~5,000 OTUs at a depth of one million). No convergence is seen in the rarefaction curves, indicating that almost all OTUs are due to errors, which accumulate at a roughly constant rate as the number of reads increases. Thus, the reported "alpha diversity" mostly measures the diversity of uncorrected experimental artifacts rather than of biologically meaningful groups.
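
The inflation factors quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the OTU counts are the approximate values read off the curves (~100 and ~5,000), the helper names are hypothetical, and the "spurious fraction" assumes that at most one OTU per strain is genuine, which is a simplification.

```python
# Rough check of the inflation factors, using the approximate numbers from the text.
KNOWN_STRAINS = 22  # strains in the mock community

# Approximate OTU counts read off the rarefaction curves: {reads per sample: OTUs}
observed_otus = {1_000: 100, 1_000_000: 5_000}


def inflation_factor(otus, strains=KNOWN_STRAINS):
    """Ratio of reported OTUs to the known number of strains."""
    return otus / strains


def spurious_fraction(otus, strains=KNOWN_STRAINS):
    """Fraction of OTUs in excess of the known strains.

    Assumption (not from the source): each strain accounts for at most one
    genuine OTU, so every OTU beyond `strains` is counted as spurious.
    """
    return max(0.0, (otus - strains) / otus)


if __name__ == "__main__":
    for depth, otus in observed_otus.items():
        print(
            f"depth={depth:>9,}  OTUs={otus:>5,}  "
            f"inflation={inflation_factor(otus):.1f}x  "
            f"spurious>={spurious_fraction(otus):.1%}"
        )
```

Under these assumptions, the ratios come out close to the ~5x and ~200x figures, and the large majority of OTUs at either depth cannot correspond to a real strain.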