Karlin-Altschul statistics

Karlin-Altschul statistics provide a theory for computing the probability that a local alignment with at least a given score will be found between two random sequences of the same lengths as the query and database sequences. For an introduction, see this page on the NCBI web site. The result is usually expressed as an expectation value, abbreviated to E-value: the number of alignments with that score or better that are expected to occur by chance.

K-A statistics apply to local alignments only; E-values cannot be computed for global alignments.

According to K-A statistics, the expectation value E for a local alignment with score S is:

  E = K q d exp(-LS)

Here, q is the query sequence length, d is the database size in letters, exp is the exponential function and K and L are parameters derived from the alignment scoring parameters. L is called the Lambda parameter. USEARCH does not automatically adjust the K and Lambda parameters if the alignment scoring parameters are changed -- they must be provided on the command line.
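As a rough illustration of the formula above, the following Python sketch evaluates E for a raw alignment score S. The function name ka_evalue and the example query length, database size and score are hypothetical, chosen only for illustration. K and Lambda are set to the first pair of gapped default values listed on this page. This is a sketch of the formula, not USEARCH's own implementation.

```python
import math


def ka_evalue(score, query_len, db_size, k, lam):
    """Evaluate E = K q d exp(-L S) for a local alignment.

    score     -- raw alignment score S
    query_len -- query sequence length q, in letters
    db_size   -- database size d, in letters
    k, lam    -- Karlin-Altschul K and Lambda parameters
    """
    return k * query_len * db_size * math.exp(-lam * score)


if __name__ == "__main__":
    # Hypothetical inputs; K and Lambda are the first pair of gapped
    # defaults from the option table (0.041 and 0.267).
    e = ka_evalue(score=100, query_len=250, db_size=1_000_000,
                  k=0.041, lam=0.267)
    print(f"E = {e:.3g}")
```

E falls exponentially as the score rises and grows linearly with q and d, so the same score is less significant against a larger database.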

Option               Default (amino acid)  Default (nucleotide)  Description
-ka_gapped_lambda    0.267                 1.280                 Lambda parameter for gapped alignments.
-ka_ungapped_lambda  0.311                 1.330                 Lambda parameter for ungapped alignments (HSPs).
-ka_gapped_k         0.041                 0.460                 K parameter for gapped alignments.
-ka_ungapped_k       0.128                 0.621                 K parameter for ungapped alignments (HSPs).
-ka_dbsize           actual size           actual size           Effective total database size in letters. The most common use is when a database is split into pieces; -ka_dbsize should then be set to the size of the original database.
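Because E is proportional to d, an E-value computed against one piece of a split database is too small by the ratio of the full size to the piece size. The Python sketch below shows this under stated assumptions: the database sizes, query length and score are hypothetical, ka_evalue is an illustrative helper rather than a USEARCH function, and the K and Lambda values are the first pair of gapped defaults from the table.

```python
import math


def ka_evalue(score, query_len, db_size, k, lam):
    """Evaluate E = K q d exp(-L S)."""
    return k * query_len * db_size * math.exp(-lam * score)


# Hypothetical sizes: a database of 40 million letters split into
# four equal pieces of 10 million letters each.
FULL_DB_SIZE = 40_000_000
PIECE_SIZE = 10_000_000

K, LAMBDA = 0.041, 0.267  # gapped defaults, first value pair in the table

# Same hit (same score and query), evaluated with two database sizes.
e_piece = ka_evalue(120, 300, PIECE_SIZE, K, LAMBDA)
e_full = ka_evalue(120, 300, FULL_DB_SIZE, K, LAMBDA)

# The ratio of the two E-values equals the ratio of the database sizes.
ratio = e_full / e_piece
print(f"E (piece) = {e_piece:.3g}, E (full) = {e_full:.3g}, ratio = {ratio:.1f}")
```

Setting -ka_dbsize to the size of the original database when searching each piece corresponds to using FULL_DB_SIZE here, so E-values from the separate searches are comparable.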