See also
UTAX reference data downloads
utax command
makeudb_utax command
Taxonomy predictions
Taxonomy confidence
Taxonomy training
Taxonomy benchmark results
The cluster_otus_utax command generates OTUs based on predicted taxonomies for a set of query sequences in FASTA or FASTQ format.
A reference database in UDB format is required. The makeudb_utax command is used to create the database.
See the UTAX downloads page for available reference files.
A taxonomic level, e.g. genus or family, must be specified. Each taxon at that level defines an OTU. If the confidence value falls below the threshold, or there is no prediction, then the query is assigned to an "unclassified" OTU.
The taxonomic level is specified by the -utax_level option, e.g. -utax_level g for genus. See taxonomy annotations for supported levels.
The confidence threshold is specified by the -utax_cutoff option (default 0.9).
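The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the usearch implementation: the in-memory prediction format (a dict from level letter to a name and confidence pair), the function names assign_otu and group_by_otu, and the sample reads are all assumptions made for the example.

```python
from collections import defaultdict

UNCLASSIFIED = "unclassified"


def assign_otu(prediction, level, cutoff=0.9):
    """Return the OTU name for one query sequence.

    prediction is a hypothetical in-memory form of a taxonomy prediction:
    a dict mapping a level letter (e.g. "g") to (taxon_name, confidence).
    """
    hit = prediction.get(level)
    if hit is None:
        # No prediction at the requested level.
        return UNCLASSIFIED
    name, confidence = hit
    if confidence < cutoff:
        # Confidence falls below the threshold.
        return UNCLASSIFIED
    return name


def group_by_otu(predictions, level, cutoff=0.9):
    """Group query labels by the taxon predicted at the given level."""
    otus = defaultdict(list)
    for label, prediction in predictions.items():
        otus[assign_otu(prediction, level, cutoff)].append(label)
    return dict(otus)


# Made-up predictions for four queries (illustrative values only).
predictions = {
    "read1": {"f": ("Lachnospiraceae", 0.99), "g": ("Blautia", 0.95)},
    "read2": {"f": ("Lachnospiraceae", 0.97), "g": ("Blautia", 0.62)},
    "read3": {"f": ("Lachnospiraceae", 0.93), "g": ("Roseburia", 0.91)},
    "read4": {},
}

genus_otus = group_by_otu(predictions, "g")
family_otus = group_by_otu(predictions, "f")
```

With these sample values, read2 falls into the "unclassified" OTU at genus level because its genus confidence is below 0.9, but it joins the Lachnospiraceae OTU at family level.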
The -utaxotusout option specifies a tabbed text output file with one record per OTU.
The -tabbedout option specifies an output file in utax output format.
The -otus option specifies a FASTA file name. One representative sequence for each OTU is written to this file. Fields are appended to the sequence labels giving otu= and tax= annotations corresponding to fields 1 and 6 in the utaxout file. The representative sequence is the first one found in the query set for that OTU, so the input should be sorted appropriately, typically in order of decreasing abundance of the unique sequences obtained by fastx_uniques.
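To see why input order matters, here is a minimal Python sketch of "first found wins" representative selection. The function name pick_representatives, the (label, OTU, abundance) tuples, and the sample values are hypothetical and exist only for this illustration; usearch itself reads sequences from the input file.

```python
def pick_representatives(queries):
    """Map each OTU to the first query label seen for it.

    queries is an iterable of (label, otu_name) pairs in input order.
    """
    representatives = {}
    for label, otu in queries:
        # setdefault keeps the first label and ignores later ones.
        representatives.setdefault(otu, label)
    return representatives


# Made-up unique sequences: (label, assigned OTU, abundance).
uniques = [
    ("uniq7", "Blautia", 3),
    ("uniq2", "Blautia", 120),
    ("uniq5", "Roseburia", 41),
]

# Input in arbitrary order: a rare sequence becomes the representative.
unsorted_reps = pick_representatives((label, otu) for label, otu, _ in uniques)

# Input sorted by decreasing abundance: the most abundant sequence wins.
by_abundance = sorted(uniques, key=lambda record: record[2], reverse=True)
sorted_reps = pick_representatives((label, otu) for label, otu, _ in by_abundance)
```

In the unsorted case the Blautia OTU is represented by uniq7 (abundance 3). After sorting it is represented by uniq2 (abundance 120), which is usually the better choice.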
Taxonomy predictions for all query sequences are written to a utaxout file if the -utaxout option is specified.
The -strand option must be specified.
Multithreading is not supported.
Example
usearch -cluster_otus_utax reads.fq -db 16s_ref.udb -utax_level f -otus otus.fa \
  -strand plus -utaxotusout otus.txt -utaxout out.utax