The table below lists some available ultra-large benchmarking datasets (>50 million compounds), each consisting of released compound data and the corresponding docking scores.
N.B. This list is not exhaustive.
Last checked and updated 2025-02-17
Campaign reference | Approx. compound library size | Data source | Docking tool | Available format(s) | Target(s)
---|---|---|---|---|---
Sivula et al. (2023) J Chem Inf Model 63(18): 5773-5783. DOI 10.1021/acs.jcim.3c01239 | 1.56 billion | DOI 10.23729/2170dc9c-4905-43c3-aeee-a574d360737f | Glide-HTVS | SMILES, docking scores (csv) | SurA, GAK |
Rogers et al. (2023) Sci Data 10: 173. DOI 10.1038/s41597-023-01984-9 | 1.4 billion | DOI 10.13139/OLCF/1783186 | AutoDock-GPU | SMILES, docking scores, rescoring results (parquet) | 5 SARS-CoV-2 targets |
Luttens et al. (2023) Preprint. DOI 10.26434/chemrxiv-2023-w3x36 | 235 million | DOI 10.5281/zenodo.7903160 | DOCK3.7 | SMILES, docking scores (tsv) | 8 diverse targets |
Lyu et al. (2019) Nature 566: 224-229. DOI 10.1038/s41586-019-0917-9 | 138 million | DOI 10.6084/m9.figshare.7359401.v3 | DOCK3.7 | SMILES, docking scores | D4 dopamine receptor |
Lyu et al. (2019) Nature 566: 224-229. DOI 10.1038/s41586-019-0917-9 | 99 million | DOI 10.6084/m9.figshare.7359626.v2 | DOCK3.7 | SMILES, docking scores | AmpC beta-lactamase |
Yang et al. (2021) J Chem Theory Comput 17(11): 7106-7119. DOI 10.1021/acs.jctc.1c00810 | 138+99 million | https://s3.amazonaws.com/content.schrodinger.com/Resources/paper_data_share.zip | Glide | SMILES, docking scores | Lyu et al. targets (D4, AmpC) |
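
Because these files are far too large to load into memory at once, a streaming pass is usually the practical way to inspect them. The following is a minimal sketch, not a method taken from any of the listed campaigns: it reads a delimited SMILES/score file row by row and keeps only the best-scoring compounds. The function name `top_scoring` and the column names `smiles` and `score` are hypothetical and must be adapted to the header of each dataset. It also assumes that more negative docking scores are better, which should be verified for each tool.

```python
import csv
import heapq


def top_scoring(path, n=10, delimiter=",", smiles_col="smiles", score_col="score"):
    """Stream a delimited file and return the n best (score, SMILES) pairs.

    Assumption: more negative docking scores are better. Column names are
    placeholders and must match the header of the actual file.
    """
    best = []  # min-heap of (-score, smiles); the root is the worst score kept
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            try:
                score = float(row[score_col])
            except (TypeError, ValueError):
                continue  # skip rows with missing or non-numeric scores
            item = (-score, row[smiles_col])
            if len(best) < n:
                heapq.heappush(best, item)
            elif item > best[0]:
                # New compound beats the worst one currently kept
                heapq.heapreplace(best, item)
    # Best (most negative) score first
    return [(-neg, smi) for neg, smi in sorted(best, reverse=True)]
```

For tab-separated releases, pass `delimiter="\t"`. Parquet releases need a dedicated reader and are not covered by this sketch. Memory use stays proportional to `n` rather than to the library size.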