usearch manual
cluster_fast command
 
Clusters sequences using a variant of the UCLUST algorithm designed to maximize speed.

Sequences are automatically sorted by decreasing length before clustering. If this ordering is not appropriate, use the cluster_smallmem command instead. See UCLUST sort order.
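To judge whether length ordering suits a data set, it can help to preview the order in which sequences would be considered. The following is a minimal sketch using standard awk and sort, not a usearch feature: fasta_lengths_desc is a hypothetical helper name, and the way it breaks ties between equal-length sequences (input order) is an assumption that may differ from what cluster_fast does.

```shell
# Hypothetical helper (not part of usearch): list FASTA records as
# "length<TAB>label", longest first. Handles sequences wrapped over several lines.
fasta_lengths_desc() {
    # $1 = FASTA file
    awk '
        /^>/ {
            if (seen) print len "\t" label   # emit the previous record
            label = substr($0, 2)            # label without the leading ">"
            len = 0
            seen = 1
            next
        }
        seen { len += length($0) }           # accumulate sequence letters
        END  { if (seen) print len "\t" label }
    ' "$1" | sort -s -k1,1nr                 # stable sort: ties keep input order (assumption)
}
```

For example, fasta_lengths_desc query.fasta | head shows the longest sequences, which are the first candidates to become centroids.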

An identity threshold must be specified using the -id option.

The -idprefix option can give significant speed improvements on multi-core CPUs (see accept options). At high identities, sequences will probably share their first few letters, especially in next-gen sequencing applications where the first few bases are primer sequence. Using, say, -idprefix 2 or -idprefix 4 should therefore not change the results much, but it can give a large speed improvement.
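Before choosing a value for -idprefix, it may be worth checking how many input sequences actually begin with the same letters. The sketch below is a hypothetical helper built from standard awk, sort and uniq; prefix_tally is not part of usearch. It assumes each FASTA record has at least N letters on its first sequence line.

```shell
# Hypothetical helper (not part of usearch): count how often each
# N-letter prefix occurs among the sequences in a FASTA file.
prefix_tally() {
    # $1 = FASTA file, $2 = prefix length N
    awk -v n="$2" '
        /^>/ { want = 1; next }              # header: the next non-empty line starts a sequence
        want && length($0) > 0 {
            print toupper(substr($0, 1, n))  # first N letters, case-insensitive
            want = 0
        }
    ' "$1" | sort | uniq -c | sort -k1,1nr   # most common prefix first
}
```

For example, prefix_tally query.fasta 4 prints one line per distinct 4-letter prefix with its count. If a single prefix accounts for nearly all sequences, requiring that prefix to match is unlikely to exclude many true hits.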

Reverse-complemented matching (-strand both) is not supported. For this, you can use cluster_smallmem (v6.0.289 and later).

See also
  -centroids option
  -consout option
  Standard output file options
 
  Accept options
  Termination options
  Indexing options
  Masking options
  Multithreading
  Alignment parameters
  Alignment heuristics

  Cluster sizes
  Memory requirements

Example

usearch -cluster_fast query.fasta -id 0.9 -centroids nr.fasta -uc clusters.uc
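
After a run like the one above, the clusters.uc file can be summarized with standard tools. The sketch below is not a usearch command: uc_cluster_sizes is a hypothetical helper, and it assumes the tab-separated .uc layout in which column 1 is the record type (S for a centroid, H for a hit) and column 2 is the cluster number.

```shell
# Hypothetical helper (not part of usearch): print "cluster<TAB>members"
# for each cluster in a .uc file, largest cluster first.
uc_cluster_sizes() {
    # $1 = .uc file; assumes column 1 = record type, column 2 = cluster number
    awk -F'\t' '
        $1 == "S" || $1 == "H" { n[$2]++ }   # one S or H record per input sequence
        END { for (c in n) print c "\t" n[c] }
    ' "$1" | sort -k2,2nr -k1,1n             # by size (descending), then cluster number
}
```

For example, uc_cluster_sizes clusters.uc | head lists the largest clusters.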