PalmDB

PalmDB data sources

For each data source, we provide three sets of sequences: 'raw', 'palmprint' and 'palmcore'. Raw sequences are the untrimmed, unannotated sequences that the original data source predicted to contain RdRp. These are typically contigs, but there are exceptions: for example, raw Wolf2018 is the full set of trimmed sequences appearing in that paper's multiple alignment.

GenBank
RdRps identified by using palm_annot to search the NCBI NR protein database. Also includes RdRps from Wolf2018 and from a palm_annot search of ICTV genomes (see below).

Wolf2018
Paper https://doi.org/10.1128/mBio.02329-18
Data ftp://ftp.ncbi.nlm.nih.gov/pub/wolf/_suppl/rnavir18/

ICTV
Home page https://ictvonline.org
Genomes https://ictv.global/vmr

Wolf2020
Paper https://doi.org/10.1038/s41564-020-0755-4
Data ftp://ftp.ncbi.nih.gov/pub/wolf/_suppl/yangshan

Edgar2022
Paper https://doi.org/10.1038/s41586-021-04332-2
Data https://github.com/ababaian/serratus/wiki/
Raw1 (micro contigs, May 2021): https://lovelywater.s3.amazonaws.com/assembly/micro/rdrp1.mu.fa
Raw2 (full contigs, June 2021): https://serratus-public.s3.amazonaws.com/rdrp_contigs/rdrp_contigs.tar.gz

Zayed2022
Paper https://doi.org/10.1126/science.abm5847
Data 44779_RdRP_contigs.fna.gz

Neri2022
Paper https://doi.org/10.1016/j.cell.2022.08.023
Data https://ftp.ncbi.nih.gov/pub/wolf/misc/JGI-TAU/

Hou2023
Paper https://doi.org/10.1101/2023.04.18.537342
Data http://47.93.21.181/
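
Several of the raw files listed above have .fa or .fna.gz extensions, which suggests FASTA format, although that is an assumption based on the file names and is not stated by the sources. Under that assumption, a minimal Python sketch for reading such a file after download might look like the following. The helper names read_fasta and open_text are hypothetical, and the demo labels and sequences are made up for illustration.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) pairs from FASTA-formatted lines."""
    label = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header line closes the previous record, if any.
            if label is not None:
                yield label, "".join(chunks)
            label = line[1:]
            chunks = []
        elif label is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def open_text(path: str):
    """Open a plain or gzip-compressed file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


# Tiny in-memory example (made-up data, not taken from any source above).
demo = [">contig1 example", "ACGU", "ACG", "", ">contig2", "GGCC"]
records = list(read_fasta(demo))
```

For a downloaded file, the same parser could be applied as `with open_text("44779_RdRP_contigs.fna.gz") as f: n = sum(1 for _ in read_fasta(f))`, assuming the file is in the current directory.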