Karlin-Altschul statistics
 
Karlin-Altschul (K-A) statistics provide a theory for computing the probability that a local alignment with a given score or higher will be found between two random sequences of the same lengths as the query and database sequences. For an introduction, see this page on the NCBI web site. The result is usually reported as an expectation value, abbreviated to E-value: the number of alignments with at least that score expected to occur by chance.

K-A statistics apply to local alignments only; E-values cannot be computed for global alignments.

According to K-A statistics, the expectation value E for a local alignment with score S is:

  E = K q d exp(-LS)

Here, q is the query sequence length, d is the database size in letters, exp is the exponential function, and K and L are parameters derived from the alignment scoring parameters. L is called the Lambda parameter. USEARCH does not automatically adjust the K and Lambda parameters when the alignment scoring parameters are changed; in that case, suitable values must be provided on the command line.
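As a rough illustration of the formula, the following Python sketch evaluates E for a single alignment. The function name ka_evalue and the example score, query length and database size are hypothetical; K and Lambda are the protein defaults for gapped alignments listed in the options table. This is not how USEARCH itself is implemented, only the arithmetic of the equation above.

```python
import math

def ka_evalue(score, query_len, db_size, k, lam):
    """Evaluate E = K * q * d * exp(-Lambda * S)."""
    return k * query_len * db_size * math.exp(-lam * score)

# Protein defaults for gapped alignments, taken from the options table.
K_GAPPED_PROTEIN = 0.041
LAMBDA_GAPPED_PROTEIN = 0.267

# Hypothetical inputs: score 100, a 250-letter query,
# and a database of one million letters.
e_value = ka_evalue(
    score=100,
    query_len=250,
    db_size=1_000_000,
    k=K_GAPPED_PROTEIN,
    lam=LAMBDA_GAPPED_PROTEIN,
)
print(f"E = {e_value:.3g}")
```

Because the score appears in the exponent, a small increase in S lowers E sharply, while E grows only linearly with the query length and database size.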

Option               Default (proteins)   Default (nucleotides)   Description
-ka_gapped_lambda    0.267                1.280                   Lambda parameter for gapped alignments.
-ka_ungapped_lambda  0.311                1.330                   Lambda parameter for ungapped alignments (HSPs).
-ka_gapped_k         0.041                0.460                   K parameter for gapped alignments.
-ka_ungapped_k       0.128                0.621                   K parameter for ungapped alignments (HSPs).
-ka_dbsize           actual size          actual size             Effective total database size in letters. Most common use is when a database is split into pieces. Then -ka_dbsize should be set to the size of the original database.
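
To see why -ka_dbsize matters for a split database, the sketch below compares the E-value computed from the size of one piece with the E-value computed from the size of the original database. The function name, the database size, the number of pieces and the alignment score are all hypothetical; K and Lambda are the protein defaults for gapped alignments from the table above.

```python
import math

def ka_evalue(score, query_len, db_size, k=0.041, lam=0.267):
    """Evaluate E = K * q * d * exp(-Lambda * S) with protein gapped defaults."""
    return k * query_len * db_size * math.exp(-lam * score)

# Hypothetical setup: a 40-million-letter database split into 4 equal pieces.
full_db_size = 40_000_000
n_pieces = 4
piece_size = full_db_size // n_pieces

# The same alignment (score 90, 300-letter query) evaluated two ways.
e_piece = ka_evalue(score=90, query_len=300, db_size=piece_size)
e_full = ka_evalue(score=90, query_len=300, db_size=full_db_size)

# E is linear in d, so using the piece size understates E
# by a factor equal to the number of pieces.
print(f"piece size: E = {e_piece:.3g}")
print(f"full size:  E = {e_full:.3g}")
print(f"ratio:      {e_full / e_piece:.1f}")
```

Setting -ka_dbsize to the size of the original database makes the E-values from searches against individual pieces comparable with those from a search against the whole database.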