CD-HIT-EST alignment parameters

 
The default nucleotide alignment parameters used by CD-HIT-EST, USEARCH and BLASTN are shown in the table below. The CD-HIT (scaled) column gives equivalent values scaled to a match score of +1, as used by the other programs.
 
Parameter                   CD-HIT   CD-HIT (scaled)   BLASTN   USEARCH
Match score                 +2       +1                +1       +1
Mismatch score              -2       -1                -2       -2
Internal gap open penalty   6        3                 5        10
End gap open penalty        0        0                 n/a      0.5
Gap extend penalty          1        0.5               2        0.5
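
The scaled column can be reproduced by dividing each CD-HIT value by the CD-HIT match score. The snippet below is a minimal sketch of that arithmetic; the dictionary keys and the function name scale_to_unit_match are illustrative names chosen here and are not CD-HIT option names.

```python
from fractions import Fraction

# CD-HIT-EST defaults as listed in the table (unscaled).
# The key names are hypothetical labels, not CD-HIT command-line options.
CDHIT = {
    "match": 2,
    "mismatch": -2,
    "gap_open": 6,
    "end_gap_open": 0,
    "gap_extend": 1,
}

def scale_to_unit_match(params):
    """Divide every parameter by the match score so that match becomes +1."""
    match = Fraction(params["match"])
    return {name: Fraction(value) / match for name, value in params.items()}

scaled = scale_to_unit_match(CDHIT)
for name, value in scaled.items():
    print(f"{name:13s} {float(value):g}")
```

Dividing by the match score leaves the ratios between parameters unchanged, which is why the scaled values can be compared directly with programs that use a match score of +1.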

Compared to BLASTN (5) and USEARCH (10), the scaled CD-HIT gap open penalty (3) is lower, which tends to produce gappier alignments and higher sequence identities.
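
To illustrate the effect, the sketch below scores two alternative alignments of the same pair of sequences, one ungapped and one with two internal gaps, under each parameter set from the table. Several things here are assumptions made for illustration: the sequences are made up, a gap of length k is assumed to cost gap open + k x gap extend under all three parameter sets, end gaps are not treated specially, and identity is taken to be matches divided by alignment columns. The programs themselves may differ on these points.

```python
# Parameter sets from the table (CD-HIT values are the scaled ones).
PARAMS = {
    "CD-HIT (scaled)": {"match": 1, "mismatch": -1, "gap_open": 3, "gap_extend": 0.5},
    "BLASTN": {"match": 1, "mismatch": -2, "gap_open": 5, "gap_extend": 2},
    "USEARCH": {"match": 1, "mismatch": -2, "gap_open": 10, "gap_extend": 0.5},
}

def score_alignment(row_a, row_b, p):
    """Score a pairwise alignment given as two equal-length gapped rows.

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    Only internal gaps are expected; end gaps get no special treatment.
    """
    assert len(row_a) == len(row_b)
    score = 0.0
    in_gap = None  # which row currently holds the gap ("a" or "b"), or None
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            gap_row = "a" if a == "-" else "b"
            if gap_row != in_gap:
                score -= p["gap_open"]  # a new gap starts in this column
                in_gap = gap_row
            score -= p["gap_extend"]
        else:
            score += p["match"] if a == b else p["mismatch"]
            in_gap = None
    return score

def identity(row_a, row_b):
    """Matches divided by alignment columns (an assumed definition)."""
    matches = sum(a == b and a != "-" for a, b in zip(row_a, row_b))
    return matches / len(row_a)

# Two alignments of the same (made-up) sequence pair.
UNGAPPED = ("GATTACAGTGACGTCCTTGGAA",
            "GATTACAGACGTCACCTTGGAA")
GAPPED = ("GATTACAGTGACGT--CCTTGGAA",
          "GATTACAG--ACGTCACCTTGGAA")

for name, p in PARAMS.items():
    u = score_alignment(*UNGAPPED, p)
    g = score_alignment(*GAPPED, p)
    print(f"{name:16s} ungapped={u:6.1f}  gapped={g:6.1f}")

print(f"identity ungapped={identity(*UNGAPPED):.3f}  gapped={identity(*GAPPED):.3f}")
```

Under these assumptions the scaled CD-HIT parameters score the gapped alignment higher (12 against 10), while the BLASTN and USEARCH parameters score the ungapped alignment higher (4 against 2, and 4 against -2). The gapped alignment also has the higher identity, 20 matches in 24 columns against 16 in 22.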

Examples
Gappy CD-HIT alignment that has 98% id according to CD-HIT but 95% id according to USEARCH.

Gappy CD-HIT alignment that has 86% id according to CD-HIT but 86% according to USEARCH.

Gross misalignment due to CD-HIT's use of banded dynamic programming.
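
The following sketch shows in general terms how banding can cause a misalignment of this kind. It is not CD-HIT's implementation: it is a plain Needleman-Wunsch score calculation with a linear gap penalty (match +1, mismatch -1, gap -1, all chosen for simplicity), the band is assumed to be the set of cells with |i - j| <= band, and the sequences are made up so that the best alignment needs a shift of 8 positions.

```python
NEG_INF = float("-inf")

def global_score(a, b, band=None, match=1, mismatch=-1, gap=-1):
    """Global alignment score with a linear gap penalty.

    If band is given, only cells with |i - j| <= band are filled;
    cells outside the band are treated as unreachable.
    """
    n, m = len(a), len(b)
    if band is None:
        band = max(n, m)  # wide enough to cover the whole matrix
    # Row 0: leading gaps, limited to the band.
    prev = [j * gap if j <= band else NEG_INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [NEG_INF] * (m + 1)
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if j == 0:
                cur[0] = i * gap
                continue
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                         prev[j] + gap,      # a[i-1] against a gap
                         cur[j - 1] + gap)   # b[j-1] against a gap
        prev = cur
    return prev[m]

# The shared region is offset by 8 positions between the two sequences.
shared = "AAAACCCCGGGGTTTT"
seq_a = "TTTTTTTT" + shared
seq_b = shared + "AAAAAAAA"

full = global_score(seq_a, seq_b)            # unrestricted dynamic programming
narrow = global_score(seq_a, seq_b, band=1)  # band too narrow for the shift
print("full:", full, "banded:", narrow)
```

With the full matrix the shared region is aligned to itself at the cost of 16 gap positions, for a score of 0. With band=1 the path cannot reach that diagonal, so every column is a mismatch and the score falls to -24.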

These parameters apply to CD-HIT v4 through at least v4.5.7.