See also
fastx_uniques command
Dereplication
The search_exact command searches for query sequences that are identical to a database sequence over their full length. It is faster than usearch_global and uses less memory. Because it uses no heuristics, it is guaranteed to find every exact match.
The underlying algorithm is the same as for the fastx_uniques command.
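To illustrate what exact, full-length matching without heuristics means, here is a minimal Python sketch. It assumes a hash-table lookup keyed on the whole sequence, which is one common way to do this; it is not USEARCH's actual implementation, and the names build_index and search_exact are hypothetical helpers used only for this illustration.

```python
from collections import defaultdict

def build_index(db):
    """Map each complete database sequence to the labels that carry it."""
    index = defaultdict(list)
    for label, seq in db.items():
        index[seq].append(label)
    return index

def search_exact(queries, index):
    """Return (query_label, target_label) pairs for identical sequences.

    Assumption: a match means the two strings are equal over their full
    length, so a query that is only a prefix of a target does not match.
    """
    hits = []
    for qlabel, qseq in queries.items():
        for tlabel in index.get(qseq, []):
            hits.append((qlabel, tlabel))
    return hits

# Toy data, invented for this example.
db = {"ref1": "ACGTACGT", "ref2": "TTGACA"}
queries = {"q1": "ACGTACGT", "q2": "ACGTACG", "q3": "TTGACA"}

hits = search_exact(queries, build_index(db))
print(hits)
```

In this toy data q2 is one base shorter than ref1, so it is not reported. Every query is looked up directly, so no match can be missed by a filtering step.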
The query sequences may be in FASTA or FASTQ format.
The database file must be in FASTA format. Indexed (udb) databases are not supported.
The -strand option is required for nucleotide databases. If -strand both is specified, then reverse-complemented exact matches will also be reported.
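The effect of -strand both can be sketched in Python as follows. This is an illustration of the concept only, assuming plain A, C, G, T sequences. The function names and the "+" / "-" return values are hypothetical and do not describe USEARCH's output format.

```python
# Translation table for complementing unambiguous nucleotides.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def match_strand(query, db_seqs, strand="plus"):
    """Return '+' for a plus-strand match, '-' for a reverse-complement
    match (only checked when strand is 'both'), or None for no match."""
    if query in db_seqs:
        return "+"
    if strand == "both" and reverse_complement(query) in db_seqs:
        return "-"
    return None

# Toy database with a single sequence, invented for this example.
db_seqs = {"AACCGT"}

print(match_strand("AACCGT", db_seqs, "plus"))
print(match_strand("ACGGTT", db_seqs, "plus"))
print(match_strand("ACGGTT", db_seqs, "both"))
```

Here ACGGTT is the reverse complement of AACCGT, so it is found only when both strands are searched.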
Standard output file options are supported. For example, the -notmatched option saves query sequences that are not found in the database, and the -dbnotmatched option saves database sequences that are not present in the query set. Together these enable incremental updating of a dereplicated sequence set: query sequences that are not yet in the set can be identified and added to it.
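The relationship between the two "not matched" sets, and how they support an incremental update, can be sketched in Python. This is a conceptual model under stated assumptions: plus-strand matching only, a new batch that is already dereplicated, and labels that do not collide. The partition function and the variable names are hypothetical.

```python
def partition(queries, db):
    """Split sequences into the two 'not matched' sets.

    notmatched:   query sequences absent from the database
    dbnotmatched: database sequences absent from the query set
    """
    db_seqs = set(db.values())
    query_seqs = set(queries.values())
    notmatched = {l: s for l, s in queries.items() if s not in db_seqs}
    dbnotmatched = {l: s for l, s in db.items() if s not in query_seqs}
    return notmatched, dbnotmatched

# Toy data: an existing set of unique sequences and a new batch.
db = {"u1": "ACGT", "u2": "GGCC"}
batch = {"r1": "ACGT", "r2": "TTAA"}

notmatched, dbnotmatched = partition(batch, db)

# Incremental update: keep the existing set and append only new sequences.
updated = {**db, **notmatched}
print(sorted(updated))
```

Only r2 is new, so the updated set grows by one sequence, while u2 is reported as a database sequence that the batch did not contain.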
Multithreading is supported.
Example
usearch -search_exact seqs.fa -db db.fa -blast6out matches.b6 -strand plus