These parameters determine the score of an alignment. They include substitution scores and gap penalties. They are distinct from heuristic parameters, which control fast but approximate methods for finding the alignment with the highest score. Ideally, changing heuristic parameters would not change the reported alignment (because the best alignment would always be found). By contrast, changing alignment scoring parameters will tend to change the alignment; for example, increasing gap penalties will reduce the number of gaps. All scoring parameters are floating-point values and may be specified as integers or real numbers.

If local alignment parameters are changed, then the Karlin-Altschul K and Lambda parameters must also be changed in order to get correct E-values.
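
To see why K and Lambda are tied to the scoring parameters, it may help to recall the standard Karlin-Altschul estimate of the E-value. The symbols m, n and S are notation assumed here, not taken from this page: m and n are the query and database lengths, and S is the raw local alignment score.

```latex
E = K \, m \, n \, e^{-\lambda S}
```

Because S is computed from the substitution scores and gap penalties, values of K and Lambda estimated for one scoring scheme do not in general give meaningful E-values for another.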
Option | Local/Global (L/G), Protein/Nucleotide (P/N) | Default | Description
-lopen | L PN | 10.0 | Local gap open
-lext | L PN | 1.0 | Local gap extend
-match | L N | +1.0 | Match score
-mismatch | L N | -2.0 | Mismatch score
-matrix filename | LG PN | BLOSUM62 (aa), +1/-2 (nt) | Substitution matrix in NCBI BLAST format. See BLOSUM62 for an example.
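
As an illustration of how these parameters combine, the following minimal sketch scores a gapped nucleotide alignment using the default local values above. It is not the program's own code. The function name and the row-based input format are hypothetical, and it assumes that a gap of length n costs open + (n - 1) × extend.

```python
def score_local_alignment(query_row, target_row, match=1.0, mismatch=-2.0,
                          gap_open=10.0, gap_ext=1.0):
    """Score a gapped alignment given as two equal-length rows ('-' = gap).

    Assumption: the first position of a gap is charged gap_open and every
    further position of the same gap is charged gap_ext.
    """
    if len(query_row) != len(target_row):
        raise ValueError("alignment rows must have equal length")
    score = 0.0
    in_gap = None  # row that currently holds an open gap: "q", "t" or None
    for q, t in zip(query_row, target_row):
        if q == "-" and t == "-":
            raise ValueError("a column cannot contain two gap characters")
        if q == "-" or t == "-":
            side = "q" if q == "-" else "t"
            # Extend an existing gap in the same row, otherwise open a new one.
            score -= gap_ext if in_gap == side else gap_open
            in_gap = side
        else:
            score += match if q == t else mismatch
            in_gap = None
    return score


# 5 matches (+5), one gap of length 2 (-10 - 1), one mismatch (-2)
print(score_local_alignment("ACGTACGT", "ACG--CGA"))  # -8.0
```

Raising `gap_open` makes each new gap more expensive, which is why higher gap penalties tend to produce alignments with fewer gaps.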
Gap penalties for global alignments

With global alignments, gap penalties are specified using the -gapopen and -gapext options. Up to 12 separate penalties can be specified: every combination of query / target, left / interior / right, and open / extend can be assigned a different penalty. Default penalties are shown in the following table.
Penalty | Default
Interior gap open | 10.0 (nucleotides), 17.0 (proteins)
End gap open | 1.0
Interior gap extend | 1.0
End gap extend | 0.5
The nucleotide defaults would be set using these options:
-gapopen 10.0I/1.0E
-gapext 1.0I/0.5E
A numerical value for a penalty is optionally followed by one or more letters that specify particular types of gap. Here, "10.0I" means "Interior gap=10.0", and "1.0E" means "End gap=1.0". If no letters are given after the numerical value, then the penalty applies to all gaps. More than one letter can be specified, so for example "0.5IE" means "Interior and End gap=0.5", which is the same as all gaps. The valid letters are: I=Interior, E=End, L=Left, R=Right, Q=Query and T=Target.

If more than one numerical value is specified, then the values must be separated by a slash character '/'. White space is not allowed.

If a star (*) is used as the numerical value, then the gap is forbidden. Using * in an open penalty means that the gap will never be allowed; using * in an extension penalty means that gaps longer than one will be forbidden. So, for example, *LQ in -gapopen means "left end-gaps in the query are not allowed".

A sign (plus or minus) is not allowed in the numerical value, which can be an integer or a floating-point number (in which case a period '.' must be used for the decimal point).

The -gapopen and -gapext options are interpreted by first setting the defaults, then scanning the string left-to-right. Later values override previous values.
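
The rules above can be sketched as a small parser. This is an illustrative reading of the syntax, not the program's own implementation. The names `default_penalties` and `apply_gap_spec` are hypothetical. It assumes that letters on the same axis (I, E, L, R) are combined as a union, that letters on different axes (position versus Q/T) are combined as an intersection, and that a number is written as digits with an optional fractional part.

```python
import re

POSITIONS = ("L", "I", "R")  # left end, interior, right end
SEQUENCES = ("Q", "T")       # query, target

# One term: a number or '*', followed by zero or more gap-type letters.
# Signs and white space are rejected because they cannot match.
_TERM = re.compile(r"(\*|\d+(?:\.\d+)?)([IELRQT]*)")


def default_penalties(interior, end):
    """Build the starting table, keyed by (sequence, position)."""
    return {(s, p): (interior if p == "I" else end)
            for s in SEQUENCES for p in POSITIONS}


def apply_gap_spec(spec, penalties):
    """Apply a spec such as '10.0I/1.0E' on top of an existing table.

    A forbidden gap ('*') is stored as None. Terms are applied left to
    right, so later terms override earlier ones.
    """
    result = dict(penalties)
    for term in spec.split("/"):
        m = _TERM.fullmatch(term)
        if m is None:
            raise ValueError(f"invalid gap penalty term: {term!r}")
        value_text, letters = m.groups()
        value = None if value_text == "*" else float(value_text)
        positions = set()
        for letter in letters:
            if letter == "E":            # E is shorthand for both ends
                positions.update(("L", "R"))
            elif letter in POSITIONS:
                positions.add(letter)
        sequences = {c for c in letters if c in SEQUENCES}
        # An axis with no letters is unrestricted (assumption stated above).
        for s in sequences or SEQUENCES:
            for p in positions or POSITIONS:
                result[(s, p)] = value
    return result


# Nucleotide gap-open defaults from the table, then forbid left end gaps
# in the query only.
gap_open = apply_gap_spec("10.0I/1.0E/*LQ", default_penalties(10.0, 1.0))
print(gap_open[("Q", "L")], gap_open[("T", "L")], gap_open[("Q", "I")])
```

Keeping the six (sequence, position) entries per option in one table gives the 12 separate penalties once both -gapopen and -gapext are parsed.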