The -convert command converts structure files to other file formats.
Input file(s) are specified by a STRUCTS option.
See structure files for supported formats.
The format of an input structure file is recognized by its extension. Gzip-compressed
files (.gz) are supported, e.g. .pdb.gz.
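As a rough illustration of extension-based recognition, the sketch below strips an optional trailing .gz and prints the remaining extension. It is not reseek's own logic. The function name effective_extension is hypothetical, and the mapping from extensions to formats is deliberately left out.

```shell
# Hypothetical helper, not part of reseek: print the extension that
# identifies the format, ignoring a trailing .gz.
effective_extension() {
    name=${1##*/}      # drop any directory part
    name=${name%.gz}   # drop a trailing .gz if present
    case $name in
        *.*) printf '%s\n' "${name##*.}" ;;   # text after the last dot
        *)   printf '\n' ;;                   # no extension
    esac
}
```

For example, `effective_extension 1abc.pdb.gz` and `effective_extension 1abc.pdb` both yield the same extension, so a compressed and an uncompressed copy of a file would be treated as the same format.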
The format of an output file is specified by its command-line option.
Multiple output options may be given in one command:
-fasta # Amino acid sequences in FASTA format
-cal # C-alpha format
-bca # Binary C-alpha format
-feature_fasta # Mega feature in FASTA format
C-alpha formats cannot be converted to full structure formats (PDB or CIF) because side chain atoms are not included.
Extracting one chain or a subset
Use -label to extract a single chain by its label, or -labels to extract the chains whose labels are listed in a text file, as in the last two commands under Examples.
HETATM CA conversion
Like most protein structure alignment software, reseek discards HETATM records whose atom name is "CA" (C-alpha), even when the residue is a standard amino acid. It is not clear what the right behavior is. As best I can tell, a C-alpha HETATM record is not valid for one of the 20 common amino acid types: a hetero atom belongs either to a non-standard amino acid or to something that is not part of the backbone, whereas C-alpha is backbone by definition. Nevertheless, such records are quite common in the data. Most protein alignment software excludes them, though Foldseek includes them. If you want an option to include them, please let me know.
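To see how many residues in a file are affected, a small check along these lines may help. It is a sketch for uncompressed PDB-format files only, not mmCIF, and it is not part of reseek. The function name hetatm_ca_count is hypothetical. It relies on the fixed-column PDB layout, where the record name occupies columns 1-6 and the atom name columns 13-16. A C-alpha atom is written " CA " in that field, while a calcium ion is written "CA  ", so calcium is not counted.

```shell
# Hypothetical helper, not part of reseek: count HETATM records whose
# atom name field (columns 13-16) is " CA ", i.e. C-alpha atoms.
# Calcium ions use "CA  " in the same field and are therefore skipped.
hetatm_ca_count() {
    awk 'substr($0, 1, 6) == "HETATM" && substr($0, 13, 4) == " CA " { n++ }
         END { print n + 0 }' "$1"
}
```

Running `hetatm_ca_count 1abc.pdb` prints the number of C-alpha HETATM records in the file; a non-zero count means the converted output will contain that many fewer residues than the input might suggest.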
Examples
reseek -convert 1abc.pdb -fasta 1abc.fa
reseek -convert 1abc.mmcif.gz -cal 1abc.cal -fasta 1abc.fa
reseek -convert virus.files -cal virus.cal
reseek -convert PDB_mirror/ -bca PDB.bca -fasta PDB.fa
reseek -convert PDB.bca -label 1abc_A -cal 1abc_A.cal
reseek -convert PDB.bca -labels myfamilylabels.txt \
-fasta myfamily.fasta -cal myfamily.cal
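The directory example above writes all sequences to a single file. If one FASTA file per structure is wanted instead, a loop like the following could be used. This is a sketch that uses only the -convert and -fasta options shown above. The function name convert_dir_to_fasta and the choice of a .fa output name next to each input are assumptions made for illustration.

```shell
# Hypothetical wrapper: convert every .pdb file in a directory to its
# own FASTA file, written next to the input with a .fa extension.
convert_dir_to_fasta() {
    dir=$1
    for f in "$dir"/*.pdb; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        reseek -convert "$f" -fasta "${f%.pdb}.fa"
    done
}
```

Calling `convert_dir_to_fasta PDB_mirror` would run one reseek command per .pdb file found directly in that directory; subdirectories and other extensions are not visited.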