New in v11

fastx_trim_primer command

The fastx_trim_primer command searches for a primer sequence close to the start of a sequence. If a match is found, the sequence is oriented onto the matching strand and the sequence up to the last matching base is deleted. If no match is found, the sequence is discarded.

This command is useful when reads may be on either strand, so that the primer may appear at the start or at the end of the read, or when the position or length of the primer varies. When the primer has a fixed length and always appears at the start or another fixed position in the read, it may be easier to use the fastx_truncate command to delete a fixed number of bases. However, requiring a match to the primer can be a useful quality filter, because if the primer is missing or has many mismatches, the rest of the read is also likely to be of poor quality.

The -db option specifies a database file containing one or more primer sequences. Usually this is in FASTA format; UDB format is also supported but gives no speed advantage.

The -strand option must be given.

The -width option specifies the maximum number of bases before the first base which matches the primer sequence. Default 8.

The -maxdiffs option specifies the maximum allowed number of mismatches. Default 2.

The -fastaout and -fastqout options specify output files in FASTA and FASTQ formats.

The -tabbedout option specifies a tabbed text file with these fields:

#1 Query sequence label.
#2 Primer label.
#3 Zero-based position of first matching base.
#4 Number of mismatches.
#5 Strand.
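
The tabbed file can be summarized with standard shell tools. The following is a minimal sketch, not part of usearch: it assumes the file is tab-separated with the five fields in the order listed above, and the function name summarize_primer_hits and the file name hits.txt are hypothetical. For each primer and strand it reports the number of matched reads and the mean number of mismatches.

```shell
# Summarize a file written with the -tabbedout option of fastx_trim_primer.
# Assumed column order (from the field list above):
#   1 query label, 2 primer label, 3 zero-based start, 4 mismatches, 5 strand
# Output columns: primer label, strand, number of reads, mean mismatches.
# The function name is illustrative only; it is not a usearch feature.
summarize_primer_hits() {
    LC_ALL=C awk -F'\t' '
        NF >= 5 {
            key = $2 "\t" $5      # group by primer label and strand
            n[key]++              # reads matched for this group
            diffs[key] += $4      # running total of mismatches
        }
        END {
            for (key in n)
                printf "%s\t%d\t%.2f\n", key, n[key], diffs[key] / n[key]
        }
    ' "$1" | LC_ALL=C sort
}

# Example call, assuming hits.txt was written with -tabbedout hits.txt:
# summarize_primer_hits hits.txt
```

A primer with a high mean mismatch count, or one that matches far fewer reads than expected, may indicate a wrong primer sequence in the database or a -maxdiffs setting that is too strict.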

Example

usearch -fastx_trim_primer reads.fq -db primers.fa -strand both \
-maxdiffs 1 -width 0 -fastqout trimmed_reads.fq