
Knowledge4COVID-19: A resource

Maria-Esther Vidal edited this page May 25, 2022 · 5 revisions
  • Novelty: Knowledge4COVID-19 introduces a novel infrastructure to transform heterogeneous data sources into a knowledge graph (KG). The mappings between the data sources and the unified schema are defined declaratively in RDF. Moreover, the methods implemented in SDM-RDFizer allow for the efficient execution of the KG creation process: the pipeline executes 178 RML triples maps over 1.35 GB of raw data in 10 minutes and produces a KG of 5.34 GB. Additionally, novel prediction methods are used to predict interactions between drugs. We hope that these results encourage the community to create declarative pipelines for KG creation that can scale up to the avalanche of data expected in the coming years.

  • Availability: Knowledge4COVID-19 is released publicly by the Scientific Data Management (SDM) group at TIB, Hannover. TIB is one of the largest libraries for science and technology in the world. In line with its policy of promoting open access to scientific artifacts, TIB will support Knowledge4COVID-19 as a source of knowledge for SARS-CoV-2 and other viruses.
    The Knowledge4COVID-19 DE is open source, written in Python 3, and uses RML; it is available under the Apache License 2.0. It will be regularly updated with new data sources, triples maps, and APIs for exploration. More importantly, following good open science practices, Knowledge4COVID-19 is registered at Zenodo. Thus, users and practitioners can use and cite a specific version, ensuring reproducibility and traceability of any experimental evaluation.

  • Utility: A Docker image of Knowledge4COVID-19 is available and enables local access to the KG.

  • Predicted Impact: From 24 to 26 April 2020, Knowledge4COVID-19 participated in the Pan-European hackathon #EUvsVirus, organized to connect experts, investors, and civilian organizations so that they could jointly devise innovative solutions to the coronavirus outbreak. A blog post describing Knowledge4COVID-19 is available and has received great attention. Given the number of scientific publications and open data about drugs, disorders, adverse events, and drug interactions, we are confident that Knowledge4COVID-19 will be the starting point of future developments. Furthermore, the pipeline for KG creation is domain agnostic and can be applied in other use cases.

  • Adoption and Reusability: Several projects in which the authors participate follow the same approach implemented in Knowledge4COVID-19. The EU H2020 projects iASiS and BigMedilytics (lung cancer pilot) are examples. The iASiS RDF KG comprises more than 1.2B RDF triples collected from more than 40 heterogeneous sources; more than 1,300 RML triples maps are used to create a lung cancer KG with 500M RDF triples.
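
To make the idea of declarative KG creation from the Novelty item more concrete, the following is a minimal sketch in plain Python of what applying one triples map to tabular data amounts to. It is not the SDM-RDFizer API and does not use RML syntax: the `TRIPLES_MAP` dictionary, the `apply_triples_map` helper, the `example.org` IRIs, and the sample CSV columns are all hypothetical and chosen only for illustration.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping, loosely modelled on an RML triples map.
# All IRIs and column names are invented for this sketch.
TRIPLES_MAP = {
    "subject_template": "http://example.org/drug/{drug_id}",
    "class": "http://example.org/vocab/Drug",
    "predicate_object_maps": [
        # Literal object taken from a column.
        {"predicate": "http://www.w3.org/2000/01/rdf-schema#label",
         "column": "name"},
        # IRI object built from a column value.
        {"predicate": "http://example.org/vocab/interactsWith",
         "column": "interacts_with",
         "iri_template": "http://example.org/drug/{}"},
    ],
}

# Hypothetical raw data; the last row duplicates the first on purpose.
RAW_CSV = """drug_id,name,interacts_with
D1,DrugA,D2
D2,DrugB,
D1,DrugA,D2
"""


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def apply_triples_map(triples_map, rows):
    """Apply one mapping to an iterable of dict rows.

    Returns a set of N-Triples lines, so duplicate triples are dropped.
    Values are not percent-encoded; the sketch assumes IRI-safe input.
    """
    triples = set()
    for row in rows:
        subject = "<" + triples_map["subject_template"].format(**row) + ">"
        triples.add(f"{subject} <{RDF_TYPE}> <{triples_map['class']}> .")
        for pom in triples_map["predicate_object_maps"]:
            value = row[pom["column"]]
            if not value:
                continue  # an empty cell generates no triple
            if "iri_template" in pom:
                obj = "<" + pom["iri_template"].format(value) + ">"
            else:
                obj = '"' + escape_literal(value) + '"'
            triples.add(f"{subject} <{pom['predicate']}> {obj} .")
    return triples


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
triples = apply_triples_map(TRIPLES_MAP, rows)
for line in sorted(triples):
    print(line)
```

Because the mapping is data rather than code, supporting a new source in this sketch means adding another dictionary instead of writing another loop. Collecting the output in a set also means the duplicated input row contributes no additional triples.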
