global trimming
Global trimming addresses a problem that occurs in next-generation amplicon sequencing. The issue is related to terminal gaps in cluster alignments. I'll use 16S as an example since this is the most common application where it arises, but similar considerations apply for most amplicon sequencing applications. Recommendations are summarized in the table, with explanation below.
Reads should be globally alignable with no terminal gaps
Typical 16S reads are derived from amplicon sequences. Amplicons are obtained by PCR using a pair of primers. It is important to consider whether the reads cover full or partial amplicons. In both cases, read lengths vary. With full-coverage reads, lengths vary primarily because amplicon lengths vary (due to hypervariable regions in the gene). Minor variations can also occur due to indel errors in the reads (common in pyrosequencing reads due to homopolymers, but very rare with Illumina). With partial-coverage reads, lengths vary primarily due to quality trimming. Since quality tends to fall towards the end of a read, the last bases tend to be less reliable. This can produce an alignment with unreliable bases towards the end, as shown in the figure below.
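To make the partial-coverage case concrete, here is a minimal Python sketch of one way to get reads that align globally with no terminal gaps: truncate every read to the same fixed length and discard reads that are shorter. This strategy is an assumption for illustration and is not taken from the text above. The function name `trim_to_fixed_length` and the example sequences are hypothetical, and this is not USEARCH's own implementation.

```python
def trim_to_fixed_length(reads, length):
    """Truncate each read to `length` bases and drop reads shorter than that.

    Assumption for illustration: all reads start at the same position
    (e.g. right after the forward primer), so equal-length reads cover
    the same region and can be aligned without terminal gaps.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [seq[:length] for seq in reads if len(seq) >= length]


# Hypothetical reads whose lengths differ because of quality trimming.
reads = ["ACGTACGTAC", "ACGTACG", "ACGTACGTACGT"]
trimmed = trim_to_fixed_length(reads, 8)
print(trimmed)
```

The 7-base read is discarded rather than padded, because padding would reintroduce the terminal gaps the trimming is meant to avoid. The choice of length trades retained reads against retained bases per read.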