cluster_smallmem command

Clusters sequences using a variant of the UCLUST algorithm designed to minimize memory use.

Sequences must be sorted in an appropriate order prior to clustering, for example by using the sortbylength or sortbysize commands.

An identity threshold must be specified using the -id option.

Multithreading is not supported as this would require significant memory overhead.

By default, input sequences are expected to be sorted by decreasing length. If some other sort order is used, the -usersort option should be specified.
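The default sort-order expectation can be checked before clustering. The Python sketch below is an illustration only and is not part of usearch: the helper names read_fasta, is_sorted_by_decreasing_length and sort_by_decreasing_length are hypothetical, and the sort may break ties between equal-length sequences differently from the sortbylength command.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_sorted_by_decreasing_length(records):
    """Return True if no sequence is longer than the one before it."""
    lengths = [len(seq) for _, seq in records]
    return all(a >= b for a, b in zip(lengths, lengths[1:]))


def sort_by_decreasing_length(records):
    """Return records ordered longest first (hypothetical stand-in, not usearch)."""
    # sorted() is stable, so equal-length sequences keep their input order.
    return sorted(records, key=lambda rec: len(rec[1]), reverse=True)


# Small in-memory example; a real file would be opened with open(path).
demo = io.StringIO(">a\nACGT\n>b\nACGTACGT\n>c\nAC\n")
records = list(read_fasta(demo))
print(is_sorted_by_decreasing_length(records))  # False
print([name for name, _ in sort_by_decreasing_length(records)])  # ['b', 'a', 'c']
```

If the check fails, either sort the input first or, when the order is intentional, pass -usersort as described above.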

By default, nucleotide matching is done on the forward strand only. For matching on both strands, use -strand both (supported in v6.0.290 and later).

Example

usearch -cluster_smallmem query.fasta -id 0.9 -centroids nr.fasta -uc clusters.uc
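
When the command is assembled from a script, the options described on this page can be combined as in the sketch below. This is a hedged illustration: build_cluster_smallmem_cmd is a hypothetical helper, it assumes the executable is named usearch and is on the PATH, and it assumes the identity is given as a fraction between 0 and 1. It uses only the options mentioned above.

```python
import shlex


def build_cluster_smallmem_cmd(fasta, identity, centroids, uc,
                               usersort=False, both_strands=False,
                               exe="usearch"):
    """Build an argument list for cluster_smallmem (hypothetical helper)."""
    # Assumption: -id takes a fractional identity such as 0.9.
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be between 0 and 1")
    cmd = [exe, "-cluster_smallmem", fasta,
           "-id", str(identity),
           "-centroids", centroids,
           "-uc", uc]
    if usersort:
        # Input is in some order other than decreasing length.
        cmd.append("-usersort")
    if both_strands:
        # Nucleotide matching on both strands (v6.0.290 and later).
        cmd += ["-strand", "both"]
    return cmd


cmd = build_cluster_smallmem_cmd("query.fasta", 0.9, "nr.fasta", "clusters.uc")
print(shlex.join(cmd))
# To run it: subprocess.run(cmd, check=True)
```

Passing the list directly to subprocess.run avoids shell quoting problems with unusual file names.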