The -search command searches one or more query structures against a database.
NOTE: In v2.2 and later, -dbsize is required to get good E-value estimates.
The value of -dbsize is the number of chains in the database. If the database is "chunked", then
it should be the total number of chains in all chunks.
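For a chunked database, the total can be obtained by summing the per-chunk chain counts. The following is a minimal sketch, assuming the per-chunk counts are already known. The helper name total_dbsize and the counts shown are hypothetical and are not part of reseek.

```shell
# total_dbsize: sum integer chain counts read from standard input.
# Hypothetical helper; reseek does not provide it.
total_dbsize() {
  # Add every whitespace-separated field on every line; print 0 for empty input.
  awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s + 0 }'
}

# Hypothetical counts for a database split into three chunks.
printf '%s\n' 150000 150000 150000 | total_dbsize
```

The printed total is the value to pass to -dbsize for every search against any of the chunks.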
The -evalue option specifies the maximum E-value. The default is 10, except with -verysensitive (default 1e+6).
See also: How to choose your E-value threshold, The -dbsize option.
One of the following options must be specified (see speed vs. sensitivity):
-fast # faster and more sensitive at E<10 than Foldseek
-sensitive # ~3x slower, recommended for most studies
-verysensitive # ~20x slower, many false positives
The query and database (-db) are each specified by a STRUCTS option.
If the -db option is not specified, then an all-vs-all search on the query is performed.
See databases for information on creating databases.
See output files for hit reporting.
See output columns for fields in hits tsv.
Examples
# Search one structure against a database
reseek -search 1abc.pdb -db scop40.cal -fast \
-dbsize 12000 -output hits.tsv -columns query+target+evalue
# Search several structures against a database
reseek -search Cas9.cal -db PDB.bca \
-dbsize 450000 -sensitive -evalue 0.001 -output hits.tsv \
-columns query+target+qlo+qhi+ql+tlo+thi+tl+pctid+evalue
# All-vs-all search for SCOP40 test
reseek -search scop40.cal -sensitive \
-dbsize 12000 -output scop40.tsv
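Hits can also be filtered by E-value after a search. The following is a minimal sketch, assuming a tab-separated hits file written with -columns query+target+evalue as in the first example, so that the E-value is the third field. The helper name filter_hits is hypothetical, and the sample rows are made up for illustration. Rows whose third field is not numeric are skipped, in case the file has a header line.

```shell
# filter_hits MAX_EVALUE: keep rows from standard input whose third
# tab-separated field is a number less than or equal to MAX_EVALUE.
# Hypothetical helper; assumes the column order query, target, evalue.
filter_hits() {
  awk -F '\t' -v max="$1" '
    # Skip rows whose third field is not numeric (for example, a header line).
    $3 !~ /^[0-9.+eE-]+$/ { next }
    # Force numeric comparison by adding zero to both sides.
    ($3 + 0) <= (max + 0) { print }
  '
}

# Made-up sample rows, for illustration only.
printf 'q1\tt1\t1e-05\nq1\tt2\t3.5\n' | filter_hits 0.001
```

With a real hits file, the call would look like: filter_hits 0.001 < hits.tsv > filtered.tsv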