USEARCH manual

Alpha diversity metrics

Alpha diversity metrics are calculated using the alpha_div command. It is more accurate to say that alpha_div calculates single-sample metrics because several of the metrics are not diversity metrics.

Some metrics, such as richness, consider only the presence or absence of an OTU, but most are based on OTU frequencies. Interpreting frequencies is difficult because amplification bias causes the number of reads to correlate poorly with the number of cells; for example, the OTU with the highest read frequency is often not the most abundant species. Because of cross-talk, even the presence or absence of a given OTU in a given sample cannot be reliably established when the OTU has low abundance. For these reasons, diversity metrics from traditional numerical ecology are difficult to interpret when applied to next-generation marker gene sequencing.

Chao-1 attempts to estimate the total number of OTUs in the community, including those that were not observed. In my opinion, estimators have little value in amplicon sequencing experiments because low-abundance OTUs are often spurious, which makes reliable extrapolation impossible.

Confusingly, some metrics use different units and so cannot be compared with each other. For example, the popular Shannon index is a measure of entropy whose unit is bits of information if the logarithms are base 2, but people sometimes use natural logarithms (base e) or base 10 logarithms. None of these variants of the Shannon index has an obvious connection to the number of OTUs, and people often do not say which variant they used, so the numerical values are difficult to interpret. Metrics in unfamiliar units can be interpreted by converting them to an effective number of OTUs. The effective number of OTUs for the Shannon index is the Jost index of order 1.
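The unit conversion described above can be illustrated with a minimal Python sketch. It is not the alpha_div implementation; the function names `shannon` and `effective_otus` and the read counts are hypothetical, chosen only for illustration. It assumes the usual definition of Shannon entropy, H = -sum of f log f over OTU frequencies f, and converts back to an effective number of OTUs by raising the log base to the power H.

```python
import math

def shannon(counts, base=math.e):
    """Shannon entropy of per-OTU read counts, using logs to the given base."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]  # OTUs with zero reads contribute nothing
    return -sum(f * math.log(f, base) for f in freqs)

def effective_otus(entropy, base=math.e):
    """Convert a Shannon index back to an effective number of OTUs (base ** H)."""
    return base ** entropy

counts = [50, 30, 15, 5]  # hypothetical read counts for four OTUs in one sample
for name, base in (("shannon_2", 2), ("shannon_e", math.e), ("shannon_10", 10)):
    h = shannon(counts, base)
    print(f"{name}: {h:.4f} -> effective OTUs {effective_otus(h, base):.4f}")
```

The three Shannon values differ because the units differ, but the effective number of OTUs is the same for all three bases, which is why it is the easier quantity to compare between reports.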

Diversity metrics

Name	Units	Description
richness	OTUs	Number of OTUs with at least one read for the sample.
chao1	OTUs	Chao-1 estimator, calculated as N + S² / (2 D) where N is the number of OTUs, S is the number of singleton OTUs, i.e. OTUs with abundance 1, and D is the number of doublet OTUs, i.e. OTUs with abundance 2.
shannon_2	bits	Shannon index (logs to base 2).
shannon_e	nats	Shannon index (logs to base e).
shannon_10	dits	Shannon index (logs to base 10).
jost	OTUs	Jost index of order q where q is specified by the -jostq command-line option, default 1.5.
jost1	OTUs	Jost index of order 1, the effective number of species given by the Shannon index.
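
As a rough illustration of how richness and the Jost index relate, here is a minimal Python sketch. It assumes the Jost index of order q is the standard "true diversity" formula, (sum of f^q)^(1 / (1 - q)), with the q = 1 case taken as the exponential of the Shannon entropy; whether alpha_div computes it in exactly this way is an assumption. The function names and the read counts are hypothetical.

```python
import math

def richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

def jost(counts, q=1.5):
    """Jost index (effective number of OTUs) of order q; 1.5 mirrors the -jostq default."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use the limit,
        # which is the exponential of the Shannon entropy (natural logs).
        return math.exp(-sum(f * math.log(f) for f in freqs))
    return sum(f ** q for f in freqs) ** (1.0 / (1.0 - q))

counts = [50, 30, 15, 5]  # hypothetical read counts for four OTUs in one sample
print("richness:", richness(counts))
for q in (0, 1, 1.5, 2):
    print(f"jost (q={q}): {jost(counts, q):.4f}")
```

Under these assumptions, order 0 reproduces richness, and higher orders give progressively less weight to low-abundance OTUs, so the value falls as q increases for any uneven sample.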

Evenness metrics

Name	Units	Description
simpson	Probability	Simpson index, calculated as the sum over OTUs of f² where f is the frequency of the OTU. It is the probability that two randomly selected reads will belong to the same OTU. A value close to 1 indicates that a single large OTU dominates the sample; small values indicate that the reads are distributed over many OTUs.
dominance	Probability	Probability that two randomly selected reads will belong to different OTUs. Calculated as 1 – simpson.
equitability	Ratio	Entropy (Shannon index) divided by the logarithm of the number of OTUs, using the same logarithm base for both. A value of 1 indicates perfectly even (equal) abundances; small values indicate a highly skewed abundance distribution.
robbins	Frequency	Robbins index, calculated as S / (N + 1) where S is the number of singleton OTUs and N is the total number of OTUs.
berger_parker	Frequency	Berger-Parker index. Frequency of the most abundant OTU. A value close to 1 indicates that a single large OTU dominates the sample; small values indicate that the reads are distributed over many OTUs.
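
The evenness metrics can be sketched in a few lines of Python by following the descriptions in the table literally. This is an illustrative sketch, not the alpha_div implementation: the function name `evenness_metrics` and the example counts are hypothetical, and the handling of a sample with a single OTU (equitability reported as NaN) is an assumption.

```python
import math

def evenness_metrics(counts):
    """Evenness metrics for one sample, following the table's definitions.

    Assumes the sample has at least one read.
    """
    present = [c for c in counts if c > 0]  # ignore OTUs absent from the sample
    total = sum(present)
    freqs = [c / total for c in present]
    n_otus = len(present)
    singletons = sum(1 for c in present if c == 1)

    simpson = sum(f * f for f in freqs)
    entropy = -sum(f * math.log(f) for f in freqs)
    # log(1) = 0, so equitability is undefined for a single OTU (assumption: NaN).
    equitability = entropy / math.log(n_otus) if n_otus > 1 else float("nan")

    return {
        "simpson": simpson,
        "dominance": 1.0 - simpson,
        "equitability": equitability,
        "robbins": singletons / (n_otus + 1),  # S / (N + 1) as the table defines it
        "berger_parker": max(freqs),
    }

# Hypothetical samples: one perfectly even, one dominated by a single OTU.
for label, counts in (("even", [25, 25, 25, 25]), ("skewed", [97, 1, 1, 1])):
    print(label, {k: round(v, 4) for k, v in evenness_metrics(counts).items()})
```

Comparing the two samples shows the expected pattern: the even sample has low simpson and berger_parker values and equitability of 1, while the skewed sample has simpson and berger_parker close to 1 and low equitability.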

Other

Name	Units	Description
reads	Reads	Total number of reads for the sample.