See also: Search flowchart, Termination options
Accept criteria determine whether an alignment is a hit, also called an
accept. See also weak hits.
Hits are written to the output files. The -maxhits N and -top_hits_only
options specify that only the best hits are to be reported. Note that two or
more hits may be tied for the best score or identity. Accepted hits are
written to an output file sorted by decreasing alignment score (local
alignments) or by decreasing identity (global alignments).
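As a rough illustration of the ordering described above, the following Python sketch sorts accepted hits best-first and optionally keeps only the first few. The `Hit` record, the `report_hits` function and its `max_hits` parameter are hypothetical names invented for this example, not part of USEARCH. Keeping tied hits in their original order is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    target: str
    score: float     # alignment score (sort key for local alignments)
    identity: float  # fractional identity (sort key for global alignments)


def report_hits(hits: List[Hit], local: bool,
                max_hits: Optional[int] = None) -> List[Hit]:
    """Sort accepted hits best-first and optionally keep the first max_hits."""
    if local:
        key = lambda h: h.score
    else:
        key = lambda h: h.identity
    # sorted() is stable, so tied hits keep their original relative order.
    ordered = sorted(hits, key=key, reverse=True)
    return ordered if max_hits is None else ordered[:max_hits]
```

Because ties are possible, truncating to a fixed number of hits can drop a hit that is as good as the last one kept.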
In clustering commands based on UCLUST (cluster_fast and cluster_smallmem),
accept options determine whether or not a sequence matches a cluster centroid
and should be assigned to that cluster. A sequence can match only one centroid.
This is usually the first accepted centroid, but if -maxaccepts is increased
it is instead the accepted centroid with the highest identity
(see termination options).
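The assignment rule can be sketched in Python as follows. This is a simplified model, not the UCLUST implementation: `assign_centroid`, `identity_fn`, `min_id` and `max_accepts` are hypothetical names, centroids are simply examined in the order given, and other termination conditions are ignored.

```python
from typing import Callable, Optional, Sequence


def assign_centroid(
    query: str,
    centroids: Sequence[str],
    identity_fn: Callable[[str, str], float],
    min_id: float,
    max_accepts: int = 1,
) -> Optional[int]:
    """Return the index of the centroid the query is assigned to, or None.

    The search stops once max_accepts centroids have been accepted. The
    accepted centroid with the highest identity wins (the first on ties).
    """
    best_index = None
    best_id = -1.0
    accepts = 0
    for i, centroid in enumerate(centroids):
        ident = identity_fn(query, centroid)
        if ident < min_id:
            continue  # rejected: fails the accept criterion
        accepts += 1
        if ident > best_id:
            best_index, best_id = i, ident
        if accepts >= max_accepts:
            break  # enough accepts, stop searching
    return best_index
```

With `max_accepts=1` the loop stops at the first accepted centroid, even if a later centroid would have had a higher identity.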
Accept criteria do not have default values. If a given accept option is not
specified, then the corresponding value is not computed or tested. For example,
if -id is the only option given, then identity is the only value that is
calculated from the alignment.
If more than one accept option is
specified, they are combined with AND, so all of them must be
satisfied.
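A minimal Python sketch of this AND logic is shown below. The `AcceptOptions` class and `is_accepted` function are hypothetical and cover only a few of the options. `None` stands for an option that was not given. For simplicity the sketch receives values that are already computed, whereas the text above says that unspecified values are not computed at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AcceptOptions:
    # None means "option not given": that value is not tested.
    min_id: Optional[float] = None         # plays the role of -id
    max_id: Optional[float] = None         # plays the role of -maxid
    min_query_cov: Optional[float] = None  # plays the role of -query_cov
    max_diffs: Optional[int] = None        # plays the role of -maxdiffs


def is_accepted(opts: AcceptOptions, identity: float,
                query_cov: float, diffs: int) -> bool:
    """Accept only if every specified criterion is satisfied (logical AND)."""
    if opts.min_id is not None and identity < opts.min_id:
        return False
    if opts.max_id is not None and identity > opts.max_id:
        return False
    if opts.min_query_cov is not None and query_cov < opts.min_query_cov:
        return False
    if opts.max_diffs is not None and diffs > opts.max_diffs:
        return False
    return True
```

For example, `AcceptOptions(min_id=0.965, max_id=0.975)` accepts an identity of 0.97 but rejects 0.96 and 0.98, whatever the coverage or number of differences.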
Criteria that do not require an alignment, e.g. -idprefix and -minqt, are
tested before an alignment is computed; these can give significant
improvements in speed because a target can be rejected without the overhead
of computing an alignment. Most of these are not supported by local search
commands (ublast, usearch_local and search_local).
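The benefit of testing cheap criteria first can be sketched in Python as below. The function names and keyword parameters are hypothetical, `align_fn` stands in for whatever alignment routine is used, and sequences are assumed to be non-empty. Comparing prefixes by slicing when a sequence is shorter than N is an assumption of this sketch, not documented behavior.

```python
from typing import Callable, Optional


def passes_prefilters(query: str, target: str,
                      id_prefix: Optional[int] = None,
                      min_qt: Optional[float] = None,
                      max_qt: Optional[float] = None) -> bool:
    """Tests that need only the raw sequences, not an alignment."""
    if id_prefix is not None and query[:id_prefix] != target[:id_prefix]:
        return False  # first N letters differ
    qt = len(query) / len(target)  # query_seq_length / target_seq_length
    if min_qt is not None and qt < min_qt:
        return False
    if max_qt is not None and qt > max_qt:
        return False
    return True


def search_one(query: str, target: str,
               align_fn: Callable[[str, str], object],
               **prefilters) -> Optional[object]:
    """Run the cheap tests first; call align_fn only if they all pass."""
    if not passes_prefilters(query, target, **prefilters):
        return None  # rejected without computing an alignment
    return align_fn(query, target)
```

The expensive `align_fn` call is skipped entirely for any target that fails a pre-alignment test, which is where the speed improvement comes from.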
The -acceptall option specifies that all hits should be
accepted, overriding any other accept options.
Option | Real/Integer | Global/Local | Need aln? | Description |
-evalue | R | L | Y | Maximum E-value. Required for most commands that use local alignments. |
-id | R | GL | Y | Minimum identity. Required for most commands that use global alignments. |
-query_cov | R | GL | Y | Fraction of the query sequence that is aligned, in the range 0.0 to 1.0. With local alignments, this test is applied AFTER a local alignment is already created, so the effect is to reject local alignments that are too short, NOT to extend them further. With global alignments, columns containing terminal gaps are discarded before the test is applied. |
-target_cov | R | GL | Y | Fraction of the target sequence that is aligned, in the range 0.0 to 1.0. With local alignments, this test is applied AFTER a local alignment is already created, so the effect is to reject local alignments that are too short, NOT to extend them further. With global alignments, columns containing terminal gaps are discarded before the test is applied. |
-idprefix | I | G | N | First N letters are identical. |
-idsuffix | I | G | N | Last N letters are identical. |
-minqt | R | G | N | Minimum value of query_seq_length / target_seq_length. |
-maxqt | R | G | N | Maximum value of query_seq_length / target_seq_length. |
-minsl | R | G | N | Minimum value of shorter_seq_length / longer_seq_length. |
-maxsl | R | G | N | Maximum value of shorter_seq_length / longer_seq_length. |
-leftjust | | G | Y | No terminal gaps at start of alignment. |
-rightjust | | G | Y | No terminal gaps at end of alignment. |
-self | | GL | N | Reject if labels are identical (i.e., reject self-hits). |
-selfid | | G | N | Reject if sequences are identical (i.e., self-hits are not wanted). |
-maxid | R | GL | Y | Reject if identity is greater. Example: to select hits that are 97% identical to two significant figures, use -id 0.965 -maxid 0.975. |
-minsizeratio | R | GL | N | Minimum query_size / target_size (see size annotations). |
-maxsizeratio | R | GL | N | Maximum query_size / target_size (see size annotations). |
-maxdiffs | I | GL | Y | Maximum number of differences between the sequences, i.e. the maximum edit distance. A difference is defined to be an alignment column containing a gap or a substitution. |
-maxsubs | I | GL | Y | Maximum number of alignment columns containing substitutions. |
-maxgaps | I | GL | Y | Maximum number of alignment columns containing gaps. |
-mincols | I | GL | Y | Minimum alignment length, i.e. minimum number of columns in the alignment. |
-maxqsize | I | GL | N | Maximum query size annotation. |
-mintsize | I | GL | N | Minimum target size annotation. |
-mid | R | GL | Y | Minimum match percent identity, defined as (number of columns containing identities) / (number of columns containing letter pairs). Gapped columns are ignored. This is percent identity, not fractional identity like -id, so it is in the range 0.0 to 100.0. |
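To make the column-based definitions in the table concrete, the Python sketch below counts the quantities tested by -maxdiffs, -maxsubs, -maxgaps, -mincols and -mid from a pairwise alignment given as two gapped rows of equal length. The `column_stats` function is hypothetical. It counts every column, including any terminal gaps, and compares letters case-sensitively; both are assumptions that may differ from what the tool does.

```python
def column_stats(query_row: str, target_row: str) -> dict:
    """Count column types in a pairwise alignment given as two gapped rows."""
    if len(query_row) != len(target_row):
        raise ValueError("alignment rows must have equal length")
    matches = subs = gaps = 0
    for q, t in zip(query_row, target_row):
        if q == "-" or t == "-":
            gaps += 1      # column containing a gap
        elif q == t:
            matches += 1   # letter pair, identical
        else:
            subs += 1      # letter pair, substitution
    pairs = matches + subs  # columns containing letter pairs
    return {
        "cols": len(query_row),   # alignment length
        "subs": subs,
        "gaps": gaps,
        "diffs": subs + gaps,     # gap or substitution columns
        # Percent identity over letter pairs only; gapped columns ignored.
        "mid": 100.0 * matches / pairs if pairs else 0.0,
    }
```

For the rows `ACGT-ACGTA` and `ACGTTACCTA` this gives 10 columns, 1 gap column, 1 substitution and 2 differences, with 8 identities out of 9 letter pairs.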