Applications
Non-redundant databases |
A "non-redundant" (NR) database contains only one representative of a given type
of sequence. Dereplication removes identical sequences. |
Reduced redundancy databases |
Clustering at an identity threshold below 100%, e.g. 90%, can reduce the database
size further, enabling faster searches with only a small loss in sensitivity. |
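
To make the idea of a reduced-redundancy database concrete, here is a minimal Python sketch of greedy centroid selection at an identity threshold. It is an illustration only and is not presented as the UCLUST algorithm or as USEARCH code: the function names `ungapped_identity` and `greedy_reduce` are hypothetical, and the identity measure is a toy position-by-position comparison without alignment, whereas a real tool would align the sequences.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.

    Assumption for illustration only: no alignment is performed, so
    insertions and deletions are not handled the way a real tool would.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / longest


def greedy_reduce(seqs, threshold=0.9):
    """Keep a sequence only if no already-kept sequence is similar enough.

    Sequences are visited longest first; each kept sequence acts as the
    representative (centroid) of everything later found within the threshold.
    """
    centroids = []
    for s in sorted(seqs, key=len, reverse=True):
        if not any(ungapped_identity(s, c) >= threshold for c in centroids):
            centroids.append(s)
    return centroids


if __name__ == "__main__":
    db = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
    # At 90% the second sequence is absorbed by the first.
    print(greedy_reduce(db, threshold=0.9))
    # At 100% only exact duplicates would be absorbed, so all three remain.
    print(greedy_reduce(db, threshold=1.0))
```

Lowering the threshold merges more sequences into each representative, which is why the database shrinks and searches speed up, at the cost of discarding some distinct sequences.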
Algorithms
UCLUST |
UCLUST is a general-purpose clustering algorithm that is significantly faster
and more sensitive than CD-HIT and other alternative algorithms (see
benchmarks). The UCLUST algorithm is implemented in the cluster_fast and
cluster_smallmem commands. |
Dereplication |
Dereplication reports one copy of every unique sequence in the input data.
USEARCH supports both full-length and prefix dereplication, which are
implemented in the derep_fulllength and derep_prefix commands, respectively. |
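
The two kinds of dereplication can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the USEARCH implementation: the helper names `dereplicate_full_length` and `dereplicate_prefix` are hypothetical, sequences are plain strings, and prefix dereplication is assumed to mean discarding any sequence that is identical to, or a prefix of, a longer sequence.

```python
from collections import Counter


def dereplicate_full_length(seqs):
    """Return (sequence, count) pairs, one per unique sequence.

    Two sequences are duplicates only if they are identical over their
    full length. Pairs are ordered from most to least abundant.
    """
    return Counter(seqs).most_common()


def dereplicate_prefix(seqs):
    """Return unique sequences that are not a prefix of a longer kept one.

    Sequences are visited longest first, so a shorter sequence is always
    compared against every longer candidate that could contain it.
    This simple version is quadratic in the number of unique sequences.
    """
    kept = []
    for s in sorted(set(seqs), key=lambda x: (-len(x), x)):
        if not any(k.startswith(s) for k in kept):
            kept.append(s)
    return kept


if __name__ == "__main__":
    reads = ["ACGT", "ACG", "ACGT", "TTA", "AC"]
    print(dereplicate_full_length(reads))
    print(dereplicate_prefix(reads))
```

In this sketch full-length dereplication keeps "ACG" and "AC" as separate unique sequences, while prefix dereplication folds both into "ACGT" because each is a prefix of it.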