search_exact command

See also
fastx_uniques command
Dereplication

The search_exact command searches for exact, full-length matches to a database sequence. The algorithm is faster than usearch_global and uses less memory. It uses no heuristics, so it is guaranteed to find all correct matches.

The underlying algorithm is the same as for the fastx_uniques command.

The query sequences may be in FASTA or FASTQ format.

The database file must be in FASTA format. Indexed (udb) databases are not supported.

The -strand option is required for nucleotide databases. If -strand both is specified, then reverse-complemented exact matches will also be reported.
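
The matching behaviour described above can be pictured as a hash-table lookup keyed on the full sequence. The Python sketch below is illustrative only and is not USEARCH's implementation. The function names (reverse_complement, search_exact), the in-memory dict inputs, and the choice to skip palindromic queries on the minus strand are all assumptions made for the example.

```python
# Illustrative sketch only; not USEARCH's actual implementation.
# Complement table for A, C, G, T and N; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def search_exact(queries, db, strand="plus"):
    """Yield (query_label, target_label, sign) for exact, full-length matches.

    queries and db map labels to sequences (hypothetical in-memory inputs).
    """
    # Index the database by full sequence so each lookup is one hash probe.
    index = {}
    for label, seq in db.items():
        index.setdefault(seq, []).append(label)

    for qlabel, qseq in queries.items():
        for target in index.get(qseq, []):
            yield qlabel, target, "+"
        if strand == "both":
            rc = reverse_complement(qseq)
            # Assumption: skip palindromic queries so a hit is not reported twice.
            if rc != qseq:
                for target in index.get(rc, []):
                    yield qlabel, target, "-"
```

Because the whole sequence is the lookup key, a query that is only a prefix or substring of a database sequence does not match, which is what "full-length" means here.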

Standard output file options are supported. For example, the -notmatched option saves query sequences that are not found in the database, and the -dbnotmatched option saves database sequences that are not present in the query set. Together, these outputs enable incremental updating of dereplicated sequence sets.
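
As a rough picture of how the -notmatched and -dbnotmatched outputs partition the data, and how the former can extend a dereplicated set, consider the Python sketch below. It is a conceptual model only: partition_exact and update_uniques are hypothetical helpers rather than USEARCH functions, only the plus strand is considered, and labels are assumed to be unique across both inputs.

```python
# Illustrative sketch only; these helpers are hypothetical, not a USEARCH API.
def partition_exact(queries, db):
    """Model the -notmatched and -dbnotmatched outputs (plus strand only).

    queries and db map labels to sequences.
    Returns (notmatched, dbnotmatched) as dicts of label -> sequence.
    """
    db_seqs = set(db.values())
    query_seqs = set(queries.values())
    # Queries with no identical database sequence.
    notmatched = {k: s for k, s in queries.items() if s not in db_seqs}
    # Database sequences with no identical query.
    dbnotmatched = {k: s for k, s in db.items() if s not in query_seqs}
    return notmatched, dbnotmatched


def update_uniques(uniques, new_reads):
    """Extend a dereplicated set with sequences from new_reads that it lacks."""
    notmatched, _ = partition_exact(new_reads, uniques)
    updated = dict(uniques)
    seen = set(uniques.values())
    for label, seq in notmatched.items():
        # new_reads may contain duplicates; keep only the first copy of each.
        if seq not in seen:
            seen.add(seq)
            updated[label] = seq
    return updated
```

In this model, the unmatched queries are exactly the sequences that need to be added to an existing dereplicated set, so the existing entries never have to be recomputed.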

Multithreading is supported.

Example

usearch -search_exact seqs.fa -db db.fa -blast6out matches.b6 -strand plus