See also
Defining and interpreting OTUs
OTU clustering
UPARSE pipeline
UNOISE pipeline
OTU commands
The otutab command generates an OTU table by mapping reads to OTUs.
OTU table output
See OTU table output options.
Normalizing the table
After generating the table, you should use the otutab_norm command to normalize all samples to the same number of reads.
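To illustrate what normalizing to the same number of reads means, here is a minimal Python sketch that rescales each sample's counts proportionally to a common total. It is an illustration only and is not claimed to reproduce what otutab_norm does internally; the function name normalize_otu_table, the target parameter, and the dictionary layout are all assumptions made for this example.

```python
def normalize_otu_table(table, target=10000):
    """Scale each sample's counts so they sum to roughly `target` reads.

    `table` maps sample name -> {OTU id -> read count}.
    Hypothetical helper for illustration, not part of usearch.
    """
    normalized = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        if total == 0:
            # An empty sample cannot be rescaled; keep it unchanged.
            normalized[sample] = dict(counts)
            continue
        normalized[sample] = {
            otu: round(n * target / total) for otu, n in counts.items()
        }
    return normalized


# Two samples with very different sequencing depth.
table = {
    "S1": {"Otu1": 50, "Otu2": 150},
    "S2": {"Otu1": 900, "Otu2": 100},
}
norm = normalize_otu_table(table, target=1000)
```

After scaling, counts from S1 and S2 can be compared directly, although rounding means that the totals of real samples may differ from the target by a few reads.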
Query dataset
The query file can be in FASTQ or FASTA format. Every query sequence must be labeled with a sample identifier. The fastx_get_sample_names command can be used to check that your sample names are formatted correctly.
Query sequences are typically raw reads, i.e. reads after paired read merging, if applicable, but before quality filtering. Low-quality reads and singletons can often be mapped successfully to an OTU, so including them accounts for a larger fraction of the reads. The fastx_uniques_persample command can be used to find the unique sequences and abundances for all samples. This compresses the input data and makes the otutab command somewhat faster but probably not as much as you might expect (typically, the compression is only ~2x).
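The idea of finding unique sequences and their abundances per sample can be sketched in a few lines of Python. This is a conceptual illustration under assumed inputs (a list of (sample, sequence) pairs), not the fastx_uniques_persample implementation; the function name uniques_per_sample is hypothetical.

```python
from collections import Counter, defaultdict


def uniques_per_sample(reads):
    """Count identical sequences separately within each sample.

    `reads` is an iterable of (sample, sequence) pairs.
    Returns {sample: Counter({sequence: abundance})}.
    """
    per_sample = defaultdict(Counter)
    for sample, seq in reads:
        # Compare sequences case-insensitively.
        per_sample[sample][seq.upper()] += 1
    return per_sample


reads = [
    ("S1", "ACGT"),
    ("S1", "ACGT"),
    ("S1", "ACGA"),
    ("S2", "ACGT"),
]
uniq = uniques_per_sample(reads)
n_unique = sum(len(counts) for counts in uniq.values())
```

Here four reads collapse to three unique (sample, sequence) entries. Because the same sequence must be kept separately for every sample in which it occurs, the compression is modest.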
OTU database
The search database is either a set of OTU sequences or "ZOTU" sequences, i.e. denoised sequences. Each query sequence is mapped to the closest database sequence. Ties are broken systematically by picking the first in database file order. A udb database can be used. Database sequences must be labeled with OTU identifiers. The database file is specified by the -otus or -zotus option. Use -zotus if the OTUs are denoised, -otus otherwise.
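The tie-breaking rule described above can be sketched as follows. The example assumes that the identity of a read to each database sequence has already been computed and is listed in database file order; the function name closest_otu and the input layout are hypothetical and do not reflect how usearch performs the search.

```python
def closest_otu(identities):
    """Return the (otu_id, identity) pair with the highest identity.

    `identities` is a list of (otu_id, fractional_identity) pairs given in
    database file order. A strict '>' comparison means that, on a tie, the
    entry that appears first in the database is kept.
    """
    best = None
    for otu_id, ident in identities:
        if best is None or ident > best[1]:
            best = (otu_id, ident)
    return best


# Otu2 and Otu3 tie at 0.99; Otu2 comes first in the database.
hits = [("Otu1", 0.98), ("Otu2", 0.99), ("Otu3", 0.99)]
best = closest_otu(hits)
```

With these inputs the read is assigned to Otu2, so the result depends on the order of sequences in the database file whenever two OTUs are equally close.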
Identity threshold for mapping
The -id option sets the minimum fractional identity. Default is 0.97, corresponding to 97% identity. Denoised OTUs also use a 97% identity threshold by default to allow for sequencing and PCR error. See defining and interpreting OTUs for discussion.
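As a rough illustration of how a minimum identity turns per-read best hits into an OTU table, the sketch below counts a read towards an OTU only if its identity is at least the threshold. The function name build_otu_table, the min_id parameter, and the input tuples are assumptions for this example, and treating the threshold as inclusive is an interpretation of "minimum" rather than a documented detail.

```python
from collections import defaultdict


def build_otu_table(best_hits, min_id=0.97):
    """Build {otu_id: {sample: count}} from one best hit per read.

    `best_hits` is an iterable of (sample, otu_id, identity) tuples.
    Reads whose identity is below `min_id` are counted as unmatched.
    """
    table = defaultdict(lambda: defaultdict(int))
    unmatched = 0
    for sample, otu_id, ident in best_hits:
        if ident >= min_id:
            table[otu_id][sample] += 1
        else:
            unmatched += 1
    return table, unmatched


best_hits = [
    ("S1", "Otu1", 1.00),
    ("S1", "Otu1", 0.97),  # exactly at the threshold: counted
    ("S2", "Otu1", 0.96),  # below the threshold: unmatched
    ("S2", "Otu2", 0.99),
]
otu_table, n_unmatched = build_otu_table(best_hits)
```

Raising min_id makes the mapping stricter and increases the number of unmatched reads, while lowering it assigns more reads at the cost of tolerating larger differences.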
By default, reads are assumed to be on the same strand as the OTU sequences. You can use -strand both to search both strands.
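Searching both strands amounts to comparing each read and its reverse complement against the OTU sequences. The helper below is a minimal sketch of reverse-complementing a nucleotide sequence; it handles only A, C, G and T (other characters are left unchanged), and the name reverse_complement is introduced here for illustration.

```python
# Translation table for the four standard bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


rc = reverse_complement("AACG")
```

A read sequenced from the opposite strand to the OTU sequences would match poorly as given but well after this transformation.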
The -notmatched option specifies a FASTA filename for sequences which are not assigned to an OTU.
The -notmatchedfq option specifies a FASTQ file for unassigned sequences (input must be FASTQ).
Annotations are stripped from the OTU sequence labels unless the -keep_annots option is specified.
Multithreading and standard output files are supported.
Example
usearch -otutab reads.fq -otus otus.udb -otutabout otutab.txt -biomout otutab.json \
  -mapout map.txt -notmatched unmapped.fa