utax command

See also
  Should I use UTAX or SINTAX? Which database?
  UTAX reference data downloads
  UTAX algorithm
  cluster_otus_utax command
  makeudb_utax command
  Taxonomy predictions
  Taxonomy confidence
  Taxonomy training
  Case study: RDP predicts genus, UTAX predicts phylum
  Taxonomy benchmark results
  How to classify 16S data
  How to classify ITS data
  How to train UTAX on your own reference data
  Misop tutorial with UTAX taxonomy assignment

The utax command uses the UTAX algorithm to predict taxonomy for query sequences in FASTA or FASTQ format.

A reference database in UDB format is required. Use the makeudb_utax command to create the database. See the UTAX downloads page for available reference files. See How to train UTAX on user data if you want to use your own reference data.

Taxonomy predictions are written to the file specified by the -utaxout option. Each prediction in the -utaxout file is written twice: once with confidence values, and once after applying a confidence threshold, which keeps only the ranks with high enough confidence. The threshold is specified by the -utax_cutoff option (default 0.9).
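The effect of the threshold can be illustrated with a minimal sketch. The prediction string below is hypothetical, and its layout (rank:name followed by a confidence in parentheses) is an assumption for illustration, not a specification of the -utaxout format. The sketch also assumes that ranks are dropped from the first one whose confidence falls below the cutoff; check the Taxonomy confidence page for the exact rule.

```shell
# Hypothetical prediction with per-rank confidences (layout is an assumption).
pred='d:Bacteria(1.0000),p:Firmicutes(0.9800),c:Bacilli(0.9300),g:Bacillus(0.6100)'
cutoff=0.9   # same value as the -utax_cutoff default

# Walk the ranks from highest to lowest and stop at the first one below the cutoff.
kept=$(printf '%s\n' "$pred" | awk -F',' -v cutoff="$cutoff" '{
  out = ""
  for (i = 1; i <= NF; i++) {
    conf = $i
    sub(/^.*\(/, "", conf)   # drop everything up to the opening parenthesis
    sub(/\).*$/, "", conf)   # drop the closing parenthesis
    if (conf + 0 < cutoff + 0) break
    name = $i
    sub(/\(.*$/, "", name)   # strip the confidence from the rank name
    out = (out == "" ? name : out "," name)
  }
  print out
}')

echo "$kept"   # prints d:Bacteria,p:Firmicutes,c:Bacilli
```

With a cutoff of 0.9, the genus (confidence 0.61) is dropped and the prediction stops at class.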

The -rdpout option specifies an output file in the format used by the RDP Naive Bayesian Classifier stand-alone program. This is for compatibility with existing scripts designed for RDP (supported in usearch v8.1.1786 and later).

The -alnout option specifies a human-readable output file showing the alignment to the top hit and identities with the nearest neighbors at each rank (see example file here). This is useful for manual review of predictions. Note that this option will cause the utax command to execute more slowly as more calculations are needed to create the alignments.

The -fastaout option specifies a FASTA file. Output sequences are written with a tax= annotation in the labels specifying the taxonomy prediction with confidences.
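As a sketch, the output options described above might be requested together in one run. The output file names reads.rdp and reads_tax.fa are placeholders chosen for illustration, and combining these particular options in a single command is an assumption; the command is printed first and only executed if usearch and the input files are present.

```shell
# Placeholder file names; reads.rdp and reads_tax.fa are illustrative only.
cmd="usearch -utax reads.fastq -db 16s.udb -strand both -utaxout reads.utax -rdpout reads.rdp -fastaout reads_tax.fa"

# Print the command so it can be reviewed before running.
echo "$cmd"

# Run only when usearch is on PATH and both input files exist.
if command -v usearch >/dev/null 2>&1 && [ -f reads.fastq ] && [ -f 16s.udb ]; then
  $cmd
fi
```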

The -strand option must be specified, e.g. -strand both.

Multithreading is supported.
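A hedged sketch of setting the thread count: it assumes the general usearch -threads option applies to utax, which this page does not state, so check the usearch documentation for your version. The value 4 is arbitrary.

```shell
# Assumption: -threads is the usearch option that sets the number of threads.
threads=4
cmd="usearch -utax reads.fastq -db 16s.udb -utaxout reads.utax -strand both -threads $threads"

# Print the command so it can be reviewed before running.
echo "$cmd"

# Run only when usearch is on PATH and both input files exist.
if command -v usearch >/dev/null 2>&1 && [ -f reads.fastq ] && [ -f 16s.udb ]; then
  $cmd
fi
```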


usearch -utax reads.fastq -db 16s.udb -utaxout reads.utax -strand both -alnout aln.txt