ITS amplicons have large variations in length due to the biology of the region -- some of the sequence evolves neutrally, and long indels are common.
The goal of global trimming is to ensure that reads from the same species, or closely related species, have few or no terminal gaps when aligned to each other. If you don't do this, then two reads of the same biological sequence may have different lengths. They are then treated as different unique sequences, which causes problems in calculating the abundance of unique sequences.
The appropriate strategy for global trimming depends on your reads.
Paired reads which always overlap
If the read length is long enough that the longest ITS sequence will give an overlap of at least, say, 32 bases, then you don't need any additional trimming: fastq_mergepairs does everything you need. Short amplicons will create "staggered" pairs which are correctly truncated during the merging.
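A rough way to check which case applies is to compute the overlap from the read length and the longest amplicon you expect. The sketch below is illustrative Python, not part of USEARCH; the function names are hypothetical, and it assumes both reads have the same length and that no bases were removed by quality trimming before merging.

```python
def overlap_length(read_len: int, amplicon_len: int) -> int:
    """Estimate the number of bases shared by the two reads of a pair.

    A negative result means the reads leave a gap and cannot be merged.
    If the amplicon is shorter than the read, the pair is "staggered"
    and the overlap is capped at the amplicon length.
    """
    return min(2 * read_len - amplicon_len, amplicon_len)


def always_overlaps(read_len: int, longest_amplicon: int,
                    min_overlap: int = 32) -> bool:
    """True if even the longest amplicon gives at least min_overlap bases.

    The default of 32 is the "say, 32 bases" figure from the text,
    not a hard requirement.
    """
    return overlap_length(read_len, longest_amplicon) >= min_overlap


# 2x250 reads, longest amplicon 450 bases: overlap is 50, so pairs always merge.
print(always_overlaps(250, 450))  # True
# 2x250 reads, longest amplicon 480 bases: overlap is only 20.
print(always_overlaps(250, 480))  # False
```

If the check fails for the longest amplicons in your data, the "sometimes or never overlap" case applies.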
Paired reads which sometimes or never overlap
If the read length is not long enough to get overlaps on longer ITS sequences, then you can't use the reverse reads. The best strategy is simply to discard the reverse reads (R2s) and make OTUs from the forward (R1) reads alone. See below under "Unpaired reads" for the appropriate trimming strategy.
Unpaired reads which never reach the reverse primer
If you have unpaired reads which never reach the reverse primer then they should be trimmed to a fixed length. If the reads are already fixed length (e.g. forward Illumina reads), then no trimming is necessary. You might choose to trim to a shorter length if the read quality is poor towards the end of the read (see fastq_eestats2 and fastx_truncate).
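To make the idea of fixed-length trimming concrete, here is a minimal Python sketch. It is an illustration only, not how fastx_truncate is implemented; the function name and the record layout (label, sequence, quality) are hypothetical. It assumes that reads shorter than the chosen length should be discarded, so that every surviving read has exactly the same length.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical record layout: (label, sequence, quality string).
Record = Tuple[str, str, str]


def truncate_fixed(records: Iterable[Record], trunc_len: int) -> Iterator[Record]:
    """Truncate every read to trunc_len bases.

    Reads shorter than trunc_len are dropped (an assumption of this
    sketch), so all reads that are yielded have identical length.
    """
    for label, seq, qual in records:
        if len(seq) < trunc_len:
            continue  # too short to reach the fixed length
        yield label, seq[:trunc_len], qual[:trunc_len]


reads = [
    ("read1", "ACGTACGTAC", "IIIIIIIIII"),
    ("read2", "ACGTAC", "IIIIII"),  # shorter than 8, so it is discarded
]
for label, seq, qual in truncate_fixed(reads, 8):
    print(label, seq, qual)
```

The sequence and quality strings are cut at the same position so that they stay in step.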
Unpaired reads which sometimes or always reach the reverse primer
If a read continues past the reverse primer, then it will include
adapter sequence and then random junk. The adapter and junk must be
discarded. It is probably also a good idea to delete the primer sequence
since PCR tends to force the primer-binding locus to match the primer.
Unfortunately, there is currently no easy way to do this in USEARCH. You can use search_oligodb to find the reverse primer, but you will need to write your own script to truncate the reads. If this is a real problem for you, let me know and I'll look into making a new command for you.
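As a starting point for such a script, the Python sketch below truncates a read at the reverse primer, which removes the primer itself along with the adapter and junk that follow it. It is a rough illustration under several assumptions: it does its own naive mismatch search rather than reading search_oligodb output, reads are in the forward orientation (so the reverse primer appears as its reverse complement), the primer contains only A, C, G and T (no degenerate codes), and indels inside the primer are not handled. All function names and the primer sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence made of A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_primer(seq: str, primer: str, max_diffs: int = 2) -> int:
    """Return the 0-based start of the first window of seq that matches
    primer with at most max_diffs mismatches, or -1 if there is none."""
    n = len(primer)
    for start in range(len(seq) - n + 1):
        window = seq[start:start + n]
        diffs = sum(1 for a, b in zip(window, primer) if a != b)
        if diffs <= max_diffs:
            return start
    return -1


def truncate_at_reverse_primer(seq: str, qual: str, reverse_primer: str,
                               max_diffs: int = 2):
    """Cut the read just before the reverse primer site.

    Reads where the primer is not found are returned unchanged.
    """
    target = reverse_complement(reverse_primer).upper()
    pos = find_primer(seq.upper(), target, max_diffs)
    if pos < 0:
        return seq, qual
    return seq[:pos], qual[:pos]


# Hypothetical reverse primer, for illustration only.
REVERSE_PRIMER = "AACCGGTTAC"
read = "ACGTACGTAC" + "GTAACCGGTT" + "AGATCGGAAG"  # insert + primer site + junk
seq, qual = truncate_at_reverse_primer(read, "I" * len(read), REVERSE_PRIMER)
print(seq)  # ACGTACGTAC
```

Reads that do not reach the primer come back unchanged, so they still need the fixed-length treatment described above for reads which never reach the reverse primer.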