Database trimming

Where possible, sequences in database files should be trimmed to minimize terminal gaps. This reduces the memory required to store the database, improves the correlation between word count and sequence similarity for the USEARCH algorithm, and increases search speed by reducing the number of spurious word matches that must be counted or extended.

For example, in next-generation 16S sequencing, it is common to sequence a region of the 16S gene between a pair of primers. In this case, it is recommended to trim each database sequence to the region bounded by the sequencing primers.
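
As an illustration only, primer-based trimming could be sketched in Python as follows. This is not part of USEARCH; the function names `reverse_complement` and `trim_to_primers` are hypothetical. The sketch assumes exact primer matches, primers without degenerate IUPAC codes, and a reverse primer given in its own 5' to 3' orientation, so that its reverse complement is what appears in the database sequence. It also assumes the primers themselves should be excluded from the trimmed region. Real primers often need mismatch-tolerant matching.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def trim_to_primers(seq, fwd_primer, rev_primer):
    """Return the region of seq between the two primers.

    Assumes exact matches (hypothetical simplification). Returns None
    if either primer site is not found, so the caller can decide
    whether to keep or discard the untrimmed sequence.
    """
    # Locate the forward primer and start just after it.
    start = seq.find(fwd_primer)
    if start < 0:
        return None
    start += len(fwd_primer)

    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    end = seq.find(reverse_complement(rev_primer), start)
    if end < 0:
        return None

    return seq[start:end]
```

Applied to every sequence in a FASTA database, a function like this would discard the flanking regions that the reads can never cover. Sequences for which it returns None would need separate handling.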