 New in v11 

fastx_trim_primer command

See also
  fastx_truncate command
  orient command

The fastx_trim_primer command searches for a primer sequence close to the start of a sequence. If a match is found, the sequence is oriented onto the matching strand and the sequence up to the last matching base is deleted. If no match is found, the sequence is discarded.
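
The orient-then-trim behaviour described above can be illustrated with a small Python sketch. It is not USEARCH's implementation: it uses exact matching only (no mismatches, no wildcard letters), and the function names, the `max_offset` parameter and the example sequences are hypothetical.

```python
# Simplified illustration only: exact matching, plain ACGT letters.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_primer_exact(read, primer, max_offset=8):
    """Return the oriented, trimmed read, or None if the read is discarded.

    The primer must start within the first max_offset + 1 positions of the
    read, on either strand. Everything up to and including the primer is
    deleted.
    """
    for oriented in (read, reverse_complement(read)):
        # Limiting the search end keeps the match start <= max_offset.
        pos = oriented.find(primer, 0, max_offset + len(primer))
        if pos != -1:
            return oriented[pos + len(primer):]
    return None  # no match on either strand


primer = "GTGCCAGC"                      # made-up primer for illustration
plus_read = "AC" + primer + "TTTAGGG"    # primer near the start, plus strand
minus_read = reverse_complement(plus_read)

print(trim_primer_exact(plus_read, primer))               # TTTAGGG
print(trim_primer_exact(minus_read, primer))              # TTTAGGG
print(trim_primer_exact("ACGTACGTACGTACGTACGT", primer))  # None
```

Both orientations of the same read give the same trimmed output, and a read without the primer is dropped.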

This command is useful when reads are oriented on both strands, so that the primer may appear at the start or end of the read, or when the position or length of the primer varies. When the primer has a fixed length and always appears at the start or other fixed position in the read, it may be easier to use the fastx_truncate command to delete a fixed number of bases. However, requiring a match to the primer can be a useful quality filter, because if the primer is missing or has many mismatches, the rest of the read is likely to be of poor quality too.

The -db option specifies a database file containing one or more primer sequences. Usually this is in FASTA format; udb is also supported but gives no speed advantage.

The -strand option must be given.

The -width option specifies the maximum number of bases allowed before the first base that matches the primer sequence, so -width 0 requires the primer to begin at the first base of the read. Default 8.

The -maxdiffs option specifies the maximum allowed number of mismatches. Default 2.
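
How -width and -maxdiffs interact can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual USEARCH algorithm: mismatches are counted position by position without gaps or wildcard letters, the hit with the fewest mismatches is preferred, and the name `find_primer` and the example read are made up.

```python
def find_primer(read, primer, width=8, maxdiffs=2):
    """Return (position, mismatches) of the best hit, or None if there is none.

    The primer may start at positions 0..width. A hit needs at most
    maxdiffs mismatches. Ties go to the leftmost position (an assumption).
    """
    best = None
    for pos in range(width + 1):
        window = read[pos:pos + len(primer)]
        if len(window) < len(primer):
            break  # read too short for the primer to fit here
        diffs = sum(a != b for a, b in zip(window, primer))
        if diffs <= maxdiffs and (best is None or diffs < best[1]):
            best = (pos, diffs)
    return best


primer = "GTGCCAGC"       # made-up primer
read = "ACGTGCCTGCTTT"    # primer at offset 2 with one mismatch (T for A)

print(find_primer(read, primer))               # (2, 1)
print(find_primer(read, primer, width=0))      # None: primer must start at 0
print(find_primer(read, primer, maxdiffs=0))   # None: one mismatch is too many
```

Tightening either option, as in the last two calls, turns an accepted read into a discarded one.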

The -fastaout and -fastqout options specify output files in FASTA and FASTQ format, respectively.

The -tabbedout option specifies a tabbed text file with these fields:

#1 Query sequence label.
#2 Primer label.
#3 Zero-based position of first matching base.
#4 Number of mismatches.
#5 Strand.
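
The five fields listed above could be read into Python along these lines. This is a minimal sketch: `PrimerHit` and `read_hits` are hypothetical names, the sample rows are invented, and the strand symbols in them are an assumption, since the exact strings written by the command are not specified here.

```python
from collections import Counter
from typing import NamedTuple


class PrimerHit(NamedTuple):
    query: str        # field 1: query sequence label
    primer: str       # field 2: primer label
    position: int     # field 3: zero-based position of first matching base
    mismatches: int   # field 4: number of mismatches
    strand: str       # field 5: strand


def read_hits(lines):
    """Yield one PrimerHit per tab-separated line with at least five fields."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        yield PrimerHit(fields[0], fields[1], int(fields[2]),
                        int(fields[3]), fields[4])


# Invented rows for illustration only, not real USEARCH output.
sample = [
    "read1\tfwd_primer\t0\t0\t+\n",
    "read2\tfwd_primer\t3\t1\t-\n",
    "read3\tfwd_primer\t0\t1\t+\n",
]

hits = list(read_hits(sample))
# Count reads per (primer label, mismatch count).
summary = Counter((h.primer, h.mismatches) for h in hits)
print(summary[("fwd_primer", 1)])   # 2
```

In practice the lines would come from the file given to -tabbedout, for example by passing an open file object to `read_hits`.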


usearch -fastx_trim_primer reads.fq -db primers.fa -strand both \
  -maxdiffs 1 -width 0 -fastqout trimmed_reads.fq