PalmDB data sources
For each data source, we provide three sets of sequences: 'raw', 'palmprint' and 'palmcore'.
Raw sequences are untrimmed and unannotated sequences predicted to contain RdRp by the original data source.
These are typically contigs, but there are exceptions: for example, raw Wolf2018 is the full set of
trimmed sequences that appear in that paper's multiple alignment.
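As a rough illustration of working with a raw set, the sketch below reads a FASTA file and reports how many sequences it holds and their length range. It assumes the raw files are plain or gzip-compressed FASTA, which is inferred only from the .fa and .fna.gz extensions of the files linked on this page. The function names are hypothetical and are not part of PalmDB or palm_annot.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) pairs from an open FASTA text stream."""
    label = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if label is not None:
                yield label, "".join(chunks)
            label = line[1:]
            chunks = []
        elif label is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def open_fasta(path: str) -> TextIO:
    """Open a plain or gzip-compressed FASTA file as text (assumed layout)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize(handle: TextIO) -> Tuple[int, int, int]:
    """Return (number of sequences, shortest length, longest length)."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    if not lengths:
        return 0, 0, 0
    return len(lengths), min(lengths), max(lengths)
```

To use it on a downloaded file, open it with open_fasta("raw.fa") and pass the handle to summarize; here raw.fa is a placeholder name, not a file distributed with PalmDB. The parser makes no assumption about whether the sequences are nucleotide or protein.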
GenBank
RdRps identified using palm_annot to search the NCBI NR protein database.
Also includes RdRps from Wolf2018 and from a palm_annot search of ICTV genomes (see below).
Wolf2018
Paper
https://doi.org/10.1128/mBio.02329-18
Data
ftp://ftp.ncbi.nlm.nih.gov/pub/wolf/_suppl/rnavir18/
ICTV
Home page
https://ictvonline.org
Genomes
https://ictv.global/vmr
Wolf2020
Paper
https://doi.org/10.1038/s41564-020-0755-4
Data
ftp://ftp.ncbi.nih.gov/pub/wolf/_suppl/yangshan
Edgar2022
Paper
https://doi.org/10.1038/s41586-021-04332-2
Data
https://github.com/ababaian/serratus/wiki/
Raw1 (micro contigs, May 2021):
https://lovelywater.s3.amazonaws.com/assembly/micro/rdrp1.mu.fa
Raw2 (full contigs, June 2021):
https://serratus-public.s3.amazonaws.com/rdrp_contigs/rdrp_contigs.tar.gz
Zayed2022
Paper
https://doi.org/10.1126/science.abm5847
Data
44779_RdRP_contigs.fna.gz
Neri2022
Paper
https://doi.org/10.1016/j.cell.2022.08.023
Data
https://ftp.ncbi.nih.gov/pub/wolf/misc/JGI-TAU/
Hou2023
Paper
https://doi.org/10.1101/2023.04.18.537342
Data
http://47.93.21.181/