usearch manual
unoise command

See also: UNOISE paper

Uses the UNOISE algorithm to perform denoising (error-correction) of amplicon reads.

Input is a set of quality-filtered unique read sequences with size=nnn; abundance annotations. See UNOISE pipeline for details of how reads should be pre-processed. The input should be the complete set of reads with no clustering applied other than finding uniques; for example, you should not use 97% OTUs as input. In other words, unoise cannot error-correct the output of cluster_otus or a subset of the FASTQ reads. It is OK to run unoise on the FASTQs for a single sample, though I generally recommend pooling samples before denoising.
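Since unoise relies on the size=nnn; annotations, it can be worth checking that every label in the input carries one before running the command. The following is a minimal Python sketch, not part of usearch; the helper names parse_size and labels_missing_size are hypothetical, and it assumes the annotation appears as a semicolon-delimited field in the FASTA label.

```python
import re

# Assumption: the abundance is a "size=nnn" field delimited by semicolons
# (or by the start/end of the label).
SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")

def parse_size(label):
    """Return the abundance from a FASTA label, or None if it is missing."""
    m = SIZE_RE.search(label)
    return int(m.group(1)) if m else None

def labels_missing_size(fasta_lines):
    """Collect labels (without the leading '>') that lack a size= annotation."""
    missing = []
    for line in fasta_lines:
        if line.startswith(">"):
            label = line[1:].strip()
            if parse_size(label) is None:
                missing.append(label)
    return missing

# Hypothetical records for illustration only.
example = [">Uniq1;size=1234;", "ACGT", ">Uniq2", "ACGA"]
print(labels_missing_size(example))  # prints ['Uniq2']
```

An empty list means every record has an abundance annotation; any label reported here suggests the uniques were written without size annotations.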

See Tutorials for example scripts & data.

Errors are corrected as follows:
  - Reads with sequencing errors are identified and removed.
  - Abundances are corrected (when the OTU table is generated).
  - Chimeras are removed.
  - PhiX sequences are removed.
  - Low-complexity sequences due to Illumina artifacts are removed.

The algorithm is designed for Illumina reads, not other technologies such as 454 pyrosequencing.

Corrected amplicon sequences are written to the -fastaout file.

The -relabel prefix option specifies a prefix for sequence labels in the output file. An integer 1, 2, 3... is appended to the prefix (requires v9.0.2140 or later).

The -minampsize option specifies the minimum abundance (size= annotation) for an error-corrected amplicon. Default is 4 in v9.0.2159 and later (it was 8 in previous versions).
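To see what the change of default from 8 to 4 means in practice, the threshold comparison can be sketched in Python. This only illustrates the abundance cutoff, not the denoising itself; the helper names and the example labels are hypothetical, and the sketch assumes the threshold is inclusive (abundance >= minampsize), as "minimum" suggests.

```python
import re

def size_of(label):
    """Extract the integer abundance from a size= annotation."""
    m = re.search(r"size=(\d+)", label)
    if m is None:
        raise ValueError("no size= annotation in label: " + label)
    return int(m.group(1))

def meets_minampsize(labels, minampsize=4):
    """Return the labels whose abundance is at least minampsize (assumed inclusive)."""
    return [lab for lab in labels if size_of(lab) >= minampsize]

# Hypothetical labels for illustration only.
labels = ["Amp1;size=12;", "Amp2;size=8;", "Amp3;size=5;", "Amp4;size=4;", "Amp5;size=3;"]

print(meets_minampsize(labels, 8))  # older default: Amp1 and Amp2 only
print(meets_minampsize(labels, 4))  # newer default: Amp1 through Amp4
```

With these labels, lowering the threshold from 8 to 4 admits two additional low-abundance sequences.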

An OTU table can be generated using the usearch_global command. Reads must have sample identifiers in the labels. I suggest using 97% identity for matching reads to denoised amplicon sequences (this is not a clustering identity; rather, using 97% allows for up to 3% read errors).
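As a rough guide to what a 3% error allowance means for a given read length, the arithmetic can be sketched in Python. This is an approximation under a stated assumption: identity is treated as the fraction of matching positions over the read length, which may differ from the exact definition usearch applies. The function name max_mismatches is hypothetical.

```python
def max_mismatches(read_length, identity_pct=97):
    """Largest whole number of differing positions that still meets identity_pct.

    Integer arithmetic is used to avoid floating-point rounding.
    """
    return read_length * (100 - identity_pct) // 100

for n in (150, 250):
    print(n, max_mismatches(n))
# prints:
# 150 4
# 250 7
```

So under this assumption a 250 nt read could differ from its denoised amplicon at up to 7 positions and still match at 97%.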

Example

usearch -unoise uniques.fa -tabbedout out.txt -fastaout denoised.fa