otutab_core command

Identifies a possible "core microbiome" of OTUs which are present in more samples than others.

Input is an OTU table in QIIME classic format.

The presence of an OTU in some or many samples can be spurious because of cross-talk or because the OTU itself is spurious. To enable manual review, the otutab_core command generates a report indicating cases where the presence of an OTU may be spurious due to cross-talk, and where an OTU may be spurious due to sequence errors.

If a sintax tabbed file is provided using the -sintaxin option, then the taxonomy of the core OTUs is included in the report.

If a distance matrix is provided using the -distmxin option, it is used to identify possible dominant OTUs, i.e. high-abundance OTUs which are similar to a low-abundance OTU in the report. If a low-abundance OTU has a dominant OTU, this may indicate that the low-abundance OTU is spurious.

The -tabbedout option specifies the output file. OTUs are sorted in order of decreasing number of samples where they are present. Fields are:

#1. OTU = name of the OTU.
#2. Samples = number of samples where the OTU has a non-zero count.
#3. Size = total number of reads assigned to this OTU.
#4. DomOTU = high-abundance "dominant" OTU which is very similar to this OTU, if any.
#5. DomSize = total number of reads assigned to the dominant OTU.
#6. DomId = sequence identity between the dominant OTU and this OTU.
#7. Min = minimum count for this OTU.
#8. LoQ = low quartile count for this OTU.
#9. Med = median count for this OTU.
#10. HiQ = high quartile count for this OTU.
#11. Max = maximum count for this OTU.
#12. Taxonomy = condensed taxonomy prediction.

If the minimum or LoQ count is much smaller than the maximum count, this suggests that the smaller counts may be due to cross-talk.
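
The Min/LoQ versus Max check can be applied to the report with a short filter. The following is a minimal sketch, not part of usearch: flag_crosstalk is a hypothetical helper, the default ratio of 100 is an arbitrary assumption to be tuned per data set, and the report is assumed to be tab-separated with the twelve fields in the order listed above. Rows whose Samples field is not an integer (for example a header line, if one is present) are skipped, and zero counts are ignored because a zero cannot be caused by cross-talk.

```shell
# Hypothetical helper, not part of usearch.
# Usage: flag_crosstalk core.txt [ratio]
# Prints OTUs whose Max count is at least `ratio` times Min or LoQ.
flag_crosstalk() {
  awk -F'\t' -v ratio="${2:-100}" '
    # Skip header or comment lines: the Samples field must be an integer.
    $2 !~ /^[0-9]+$/ { next }
    {
      min = $7 + 0; loq = $8 + 0; max = $11 + 0
      # Only non-zero low counts are compared against Max.
      if ((min > 0 && max >= ratio * min) || (loq > 0 && max >= ratio * loq))
        printf "%s\tMin=%s\tLoQ=%s\tMax=%s\n", $1, $7, $8, $11
    }
  ' "$1"
}
```

Flagged OTUs are candidates for manual review only; a wide range of counts can also reflect real biological variation.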

If the size of an OTU is much smaller than the size of a neighboring "dominant" OTU, then the OTU itself may be spurious due to sequence errors.
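
The Size versus DomSize comparison can be sketched in the same way. Here flag_dominated is a hypothetical helper, not a usearch command, and the default ratio of 10 is an arbitrary assumption. The sketch assumes a tab-separated report with the fields in the order listed above, and that rows without a dominant OTU carry a non-numeric or empty DomSize field; such rows are skipped.

```shell
# Hypothetical helper, not part of usearch.
# Usage: flag_dominated core.txt [ratio]
# Prints OTUs whose dominant OTU has at least `ratio` times as many reads.
flag_dominated() {
  awk -F'\t' -v ratio="${2:-10}" '
    # Keep only data rows where Size and DomSize are both integers;
    # rows with no dominant OTU are assumed to hold a non-numeric placeholder.
    $3 !~ /^[0-9]+$/ || $5 !~ /^[0-9]+$/ { next }
    ($3 + 0) > 0 && ($5 + 0) >= ratio * ($3 + 0) {
      printf "%s\tSize=%s\tDomOTU=%s\tDomSize=%s\tDomId=%s\n", $1, $3, $4, $5, $6
    }
  ' "$1"
}
```

The DomId field is printed alongside the sizes so that near-identical pairs, which are the most likely to differ only by sequence errors, can be reviewed first.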

Example

usearch -calc_distmx otus.fa -tabbedout distmx.txt \
-sparsemx_minid 0.9 -termid 0.8

usearch -sintax otus.fa -strand both -db ref16s.txt \
-tabbedout sintax.txt

usearch -otutab_core otutab.txt -distmxin distmx.txt \
-sintaxin sintax.txt -tabbedout core.txt