About UCHIME
UCHIME is an algorithm for detecting chimeric sequences. It was
developed in collaboration with Brian Haas, Jose Carlos Clemente, Chris
Quince and Rob Knight. Chimeras are commonly created during DNA sample
amplification by PCR, especially in community sequencing experiments
using single regions such as the 16S rRNA gene in bacteria or the fungal
ITS region. UCHIME can detect
chimeras using a reference database or de novo using abundance
information on the assumption that chimeras are less abundant than their
parents because they must have undergone fewer rounds of amplification.
OTU clustering
UCHIME is most often used in the context of OTU clustering for
community sequencing experiments based on rRNA genes such as 16S, 18S
and ITS. I recommend the otupipe
script for OTU clustering. This script uses several algorithms,
including UCLUST and UCHIME, to generate OTUs from next-generation
reads.
Sensitivity and error rates
In our tests, UCHIME is more sensitive than ChimeraSlayer,
the best previous method using a reference database, especially with
short, noisy sequences and when database sequences are diverged from a
chimera's true parent sequences. The de novo mode of UCHIME has
comparable sensitivity to Perseus.
UCHIME has a lower average error rate than ChimeraSlayer. The error rate
is harder to measure for de novo mode, but appears to be
comparable to Perseus.
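The de novo mode rests on the abundance assumption described earlier: a chimera should be less abundant than its parents. The following is a minimal sketch of that idea only, not the UCHIME implementation. The function name `candidate_parents`, the `min_skew` parameter and its value of 2.0 are illustrative assumptions, and the sketch does not perform the alignment and scoring that actually decide whether a sequence is chimeric.

```python
from typing import Dict, List


def candidate_parents(
    abundance: Dict[str, int], query: str, min_skew: float = 2.0
) -> List[str]:
    """Return ids of sequences abundant enough to be parents of `query`.

    Hypothetical helper: a sequence qualifies if its abundance is at least
    `min_skew` times that of the query (the threshold is an assumption).
    Results are ordered from most to least abundant.
    """
    query_abundance = abundance[query]
    parents = [
        seq_id
        for seq_id, count in abundance.items()
        if seq_id != query and count >= min_skew * query_abundance
    ]
    # Sort by decreasing abundance; break ties by id for a stable order.
    return sorted(parents, key=lambda seq_id: (-abundance[seq_id], seq_id))


# Made-up read counts for four unique sequences.
abundance = {"seqA": 1200, "seqB": 450, "seqC": 30, "seqD": 20}

print(candidate_parents(abundance, "seqC"))  # ['seqA', 'seqB']
print(candidate_parents(abundance, "seqA"))  # []
```

In this sketch the most abundant sequence has no candidate parents, so it could never be flagged as chimeric, while rare sequences are compared against everything sufficiently more abundant than themselves.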
Speed
There are two implementations of UCHIME. One is open-source
(strictly, public domain). This version is >1000x faster than
ChimeraSlayer and >100x faster than Perseus. A faster version of the
same algorithm is implemented in the USEARCH
package v4.1 and later. This is at least an order of magnitude faster
than the open-source version, and can be even faster with large
datasets.
Paper
Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R (2011) UCHIME
improves sensitivity and speed of chimera detection. Bioinformatics,
doi: 10.1093/bioinformatics/btr381 [PMID 21700674].
Downloads and documentation
Test data, precompiled binaries and source code for the public
domain version can be downloaded
here.
The USEARCH package, which includes the faster implementation of UCHIME,
is available at no charge for academic use. For more information about
licensing, please visit the USEARCH
home page.
News
I'm looking for new projects -- collaboration and consulting.
USEARCH 4.0 released.
USEARCH 3.0 released.
UCLUST v2.1 supports very large MUSCLE alignments in < 1 Gb memory.
USEARCH and UCLUST released. Search and clustering hundreds of times faster than BLAST.
MUSCLE v3.8 released.
Blog
Send me your big sets!
Multiple protein alignment is dead
Big alignments -- do they make sense?
An unemployed gentleman
Fishing for significance