
Abundance skew

A chimeric amplicon usually has much lower abundance in the reads than its parent sequences. A chimera starts as a single strand at the moment it forms, so it can be duplicated only in the remaining PCR cycles, whereas the parent sequences can be amplified in every cycle. In addition, the parent sequences very likely exist in multiple copies in the original sample.
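The size of this gap can be illustrated with a deliberately idealized calculation. The sketch below assumes perfect doubling in every cycle, which real PCR does not achieve, and all of the numbers (30 cycles, chimera formed in cycle 20, 10 starting copies of the parent) are hypothetical values chosen only for illustration.

```python
def copies_after(initial_copies: int, cycles_amplified: int) -> int:
    """Copies present after amplification, assuming perfect doubling per cycle.

    This is an idealization: real PCR efficiency is below 100% and plateaus.
    """
    if initial_copies < 0 or cycles_amplified < 0:
        raise ValueError("copies and cycles must be non-negative")
    return initial_copies * 2 ** cycles_amplified


# Hypothetical values, not taken from any real experiment.
TOTAL_CYCLES = 30          # total PCR cycles
FORMED_IN_CYCLE = 20       # cycle in which the chimeric strand forms
PARENT_START_COPIES = 10   # copies of the parent in the original sample

# The parent is amplified in all cycles.
parent_copies = copies_after(PARENT_START_COPIES, TOTAL_CYCLES)

# The chimera starts as one strand and is amplified only in the remaining cycles.
chimera_copies = copies_after(1, TOTAL_CYCLES - FORMED_IN_CYCLE)

ratio = parent_copies // chimera_copies
print(f"parent: {parent_copies}, chimera: {chimera_copies}, ratio: {ratio}")
```

Under these assumptions the parent ends up about ten million times more abundant than the chimera. Real ratios are far smaller, but the direction of the effect is the same.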

The "abundance skew" of a putative chimera is defined as the abundance of the least abundant putative parent sequence divided by the abundance of the putative chimeric sequence. If this ratio is large, the prediction is more likely to be a true positive; if it is small, the prediction is more likely to be a false positive.
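The definition above can be sketched in a few lines of Python. The function names and the MIN_SKEW threshold are hypothetical, introduced here only for illustration; they are not options or defaults of any particular tool.

```python
from typing import Sequence


def abundance_skew(parent_abundances: Sequence[int], chimera_abundance: int) -> float:
    """Abundance of the least abundant putative parent divided by the
    abundance of the putative chimera."""
    if not parent_abundances:
        raise ValueError("at least one parent abundance is required")
    if chimera_abundance <= 0:
        raise ValueError("chimera abundance must be positive")
    return min(parent_abundances) / chimera_abundance


# Hypothetical threshold, chosen only for this example.
MIN_SKEW = 2.0


def passes_skew_filter(
    parent_abundances: Sequence[int],
    chimera_abundance: int,
    min_skew: float = MIN_SKEW,
) -> bool:
    """Keep a chimera prediction only if its skew reaches the threshold."""
    return abundance_skew(parent_abundances, chimera_abundance) >= min_skew


# Parents seen 120 and 45 times, putative chimera seen 3 times: skew = 45 / 3.
print(abundance_skew([120, 45], 3))
# Same parents, but the putative chimera is seen 40 times: skew = 45 / 40.
print(passes_skew_filter([120, 45], 40))
```

In the first case the skew is 15, so the chimera prediction is plausible. In the second case the skew is just above 1, so with this assumed threshold the prediction would be discarded as a likely false positive.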

Using abundance skew as a filter helps reduce the number of false positives due to "fake models", i.e. sequences that can be constructed from two parents but are in fact valid biological sequences. Fake models are surprisingly common, as shown in the UCHIME2 paper.

Reference (please cite)
R.C. Edgar (2016), UCHIME2: improved chimera prediction for amplicon sequencing, https://doi.org/10.1101/074252
  • UCHIME2 algorithm, improved chimera detection
  • "Fake" chimeras are common, valid biological sequences matching two-parent model
  • Perfect chimera filtering impossible even with complete and correct reference
  • Realistic chimera benchmark