USEARCH searches a database for the
top global hit, or top few hits. This is distinctly different from
traditional search algorithms such as BLAST, which use local alignments and
report all hits at a given E-value threshold.
The advantages of the USEARCH algorithm include its speed, which can be orders of magnitude greater than that of BLAST, and reduced data output, since in most applications only the top hit or the top few hits are relevant. Also, in some applications, e.g., taxonomy assignment, global alignments give better biological predictions because local alignments may give short hits with misleadingly high identity.
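The point about misleadingly high identity can be illustrated with a small sketch. The two aligned rows below are made-up data, and the identity function is a hypothetical helper, not part of USEARCH. It only shows how a short local segment can score 100% identity while the full-length alignment scores far lower.

```python
def identity(row_a: str, row_b: str) -> float:
    """Fraction of alignment columns where both rows have the same residue.

    Both rows must already be aligned (equal length, '-' for gaps).
    Gap columns never count as matches.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")
    return matches / len(row_a)


# Made-up example: the first 10 columns agree, the last 10 do not.
query = "ACGTACGTAC" + "GGGGGGGGGG"
target = "ACGTACGTAC" + "TTTTTTTTTT"

global_identity = identity(query, target)            # all 20 columns
local_identity = identity(query[:10], target[:10])   # matching segment only

print(f"global identity: {global_identity:.0%}")
print(f"local identity:  {local_identity:.0%}")
```

A local search would report the 10-column segment as a perfect hit, whereas the global alignment makes the overall divergence visible.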
The USEARCH algorithm is implemented in the usearch_global command. A local alignment variant is also provided (usearch_local), though this is rarely used in practice. USEARCH is not effective on very large databases, say those much larger than 1M sequences or 1Gb as a FASTA file. For huge databases, UBLAST is recommended even if global hits are required.
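The size guideline above can be turned into a rough pre-flight check. This is a minimal sketch under stated assumptions: the function names are hypothetical, "1Gb" is read as roughly 10^9 bytes, and the cutoffs are treated as hard limits even though the text only says "much greater than", so the result is a hint rather than a rule.

```python
import os

# Assumed cutoffs taken from the guideline in the text.
MAX_SEQS = 1_000_000         # "1M sequences"
MAX_BYTES = 1_000_000_000    # "1Gb as a FASTA file", read as about 1e9 bytes


def count_fasta_records(path: str) -> int:
    """Count sequences in a FASTA file by counting header lines ('>')."""
    with open(path, "rb") as handle:
        return sum(1 for line in handle if line.startswith(b">"))


def suggest_command(n_seqs: int, n_bytes: int) -> str:
    """Return 'ublast' for databases above either cutoff, else 'usearch_global'."""
    if n_seqs > MAX_SEQS or n_bytes > MAX_BYTES:
        return "ublast"
    return "usearch_global"


def suggest_for_fasta(path: str) -> str:
    """Apply the heuristic to a FASTA file on disk."""
    return suggest_command(count_fasta_records(path), os.path.getsize(path))


print(suggest_command(50_000, 20_000_000))
print(suggest_command(5_000_000, 3_000_000_000))
```

Databases near the cutoffs are a judgment call; benchmarking both commands on a sample of queries is the more reliable guide.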
For both the USEARCH and UBLAST commands,
the accept parameters provide a rich set of options for filtering hits.
UBLAST searches a database for all
local hits meeting the given accept
criteria, which include an E-value threshold and other options. Hits that are
global or approximately global alignments can be selected if needed.
UBLAST achieves much higher speed than BLAST on large databases with comparable
sensitivity in most cases. UBLAST consistently achieves better sensitivity and
speed compared to MEGABLAST. On protein and translated searches, UBLAST
can achieve speeds hundreds of times faster than BLASTP or BLASTX with
comparable sensitivity to identities of 50% or less.
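Selecting approximately global hits from a set of local hits, as mentioned above, can also be sketched as a post-processing step. The example assumes the hits were written in the 12-column BLAST-style tabular format and that query lengths are known from elsewhere. The function names, the 90% coverage cutoff, and the sample lines are illustrative assumptions and do not represent the built-in accept options.

```python
def parse_hit(line: str) -> dict:
    """Parse one line of 12-column BLAST-style tabular output."""
    f = line.rstrip("\n").split("\t")
    return {
        "query": f[0],
        "target": f[1],
        "pident": float(f[2]),
        "qstart": int(f[6]),
        "qend": int(f[7]),
        "evalue": float(f[10]),
    }


def approx_global(hit: dict, query_len: int,
                  min_cov: float = 0.9, max_evalue: float = 1e-5) -> bool:
    """Keep hits that pass the E-value cutoff and cover most of the query.

    Only query coverage is checked here; a stricter test would also
    require high coverage of the target sequence.
    """
    covered = abs(hit["qend"] - hit["qstart"]) + 1
    return hit["evalue"] <= max_evalue and covered / query_len >= min_cov


# Made-up hits: one spans the whole query, one covers a short segment.
lines = [
    "q1\tt1\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-50\t180",
    "q1\tt2\t100.0\t20\t0\t0\t40\t59\t1\t20\t1e-6\t40",
]
query_lengths = {"q1": 100}  # assumed to be known in advance

kept = [h for h in map(parse_hit, lines)
        if approx_global(h, query_lengths[h["query"]])]
for h in kept:
    print(h["query"], h["target"], h["pident"])
```

The second hit has 100% identity but covers only 20 of 100 query positions, so it is dropped despite passing the E-value cutoff.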