The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles are intended to facilitate the discovery and reuse of data, not only by people but also by machines. Read the full paper here.

You can find a list of tools for assessing whether a resource is FAIR at https://fairassist.org

Findable 🔎

The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.

URL to access your dataset metadata. Ideally, use a persistent identifier system such as w3id.org
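As a rough illustration of this check, the sketch below tests whether a metadata URL goes through a persistent identifier resolver. The host list `PID_HOSTS` and the function name `uses_persistent_identifier` are assumptions made for this example, not an official registry or API.

```python
from urllib.parse import urlparse

# Hypothetical, non-exhaustive list of persistent identifier resolvers.
PID_HOSTS = {"w3id.org", "doi.org", "purl.org", "hdl.handle.net"}


def uses_persistent_identifier(url: str) -> bool:
    """Return True if the URL's host is one of the listed PID resolvers."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www.w3id.org" the same as "w3id.org".
    if host.startswith("www."):
        host = host[4:]
    return host in PID_HOSTS
```

For instance, `uses_persistent_identifier("https://w3id.org/example/dataset")` passes the check, while a plain `https://example.org/...` URL does not. The check only looks at the host name; it does not verify that the identifier actually resolves.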

See Reusable

Does your dataset metadata properly refer to your data identifier? Paste an example of a metadata property that points to a data identifier
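One possible shape for such a property is sketched below, assuming the metadata is published as JSON-LD with schema.org terms. The record itself (name, identifier, and download URL) is a made-up placeholder, and `get_data_identifier` is a hypothetical helper.

```python
import json

# Hypothetical JSON-LD metadata record using schema.org terms;
# the name and URLs are placeholders, not a real dataset.
metadata_jsonld = """
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example dataset",
  "identifier": "https://w3id.org/example/dataset/1",
  "distribution": {
    "@type": "DataDownload",
    "contentUrl": "https://example.org/files/dataset-1.csv"
  }
}
"""


def get_data_identifier(record: dict):
    """Return the value of the 'identifier' property, or None if it is missing."""
    return record.get("identifier")


record = json.loads(metadata_jsonld)
print(get_data_identifier(record))
```

Here the `identifier` property is the one that points to the data identifier. Other vocabularies, such as DCAT with Dublin Core terms, express the same idea with different property names.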

Link to the resource/portal where the metadata about your dataset can be found

Accessible 📂

Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.

Which protocol is used to access the data? HTTPS, HTTP, FTP, SSH?

Do you need a specific authorization to access this data? Login/password?

How can you get this authorization? If applicable
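The protocol and authorization questions above can be partly answered from the data URL and from the HTTP status code an unauthenticated request returns. The following is a minimal sketch under that assumption; `access_protocol` and `requires_authorization` are hypothetical helper names, and no network request is made.

```python
from urllib.parse import urlparse


def access_protocol(url: str) -> str:
    """Return the protocol (URL scheme) in upper case, e.g. 'HTTPS'."""
    return urlparse(url).scheme.upper()


def requires_authorization(status_code: int) -> bool:
    """Interpret the HTTP status code of an unauthenticated request.

    401 (Unauthorized) and 403 (Forbidden) suggest that credentials
    or specific permissions are needed to access the data.
    """
    return status_code in (401, 403)
```

A status of 200 only shows that the resource was reachable without credentials at the time of the request. How to obtain authorization still has to be documented by hand.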

Link to the long-term storage of the metadata (e.g. a metadata portal), independent of where the data is accessible

Interoperable ⚙️

The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.

In which format are your data and metadata available? RDF, JSON, CSV, SPSS, XLSX?
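When a dataset has many files, the format question can be answered roughly from file extensions. The sketch below assumes a small hand-written mapping; `FORMATS` and `guess_format` are illustrative names, and the mapping is neither complete nor authoritative.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the format names used above.
FORMATS = {
    ".ttl": "RDF",
    ".rdf": "RDF",
    ".nt": "RDF",
    ".jsonld": "RDF",
    ".json": "JSON",
    ".csv": "CSV",
    ".sav": "SPSS",
    ".xlsx": "XLSX",
}


def guess_format(url: str) -> str:
    """Guess the format from the extension of the URL path, ignoring any query string."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return FORMATS.get(suffix, "unknown")
```

An extension is only a hint. The media type returned by the server, or the content itself, is a more reliable indicator of the actual format.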

Link to the data model you use, e.g. OWL ontologies, SHACL/ShEx shapes, schema files

Feel free to provide human-readable details here to explain which schema you use and why

Specify whether your dataset builds on another dataset, whether additional datasets are needed to complete the data, or whether complementary information is stored in a different dataset

Reusable ♻️

The ultimate goal of FAIR is to optimize the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

  • R2. Source code to produce and publish the data is reusable

    • R2.1. Code to produce and publish the data is versioned and accessible

    Git repository URL (GitHub, GitLab, Bitbucket...)

    • R2.1. The code is published with a license

    Link to the license file

    • R2.2. Code can easily be executed in containers or using a common packaging system

    Link to the documentation explaining how to install the requirements and run the code (Docker image, pip packages, Java JAR file...)

    • R2.3. Architecture and concept behind the data transformation is briefly explained

    Link to the documentation with those explanations
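Some of the R2 checks above can be approximated automatically on a local clone of the repository. The sketch below is one possible way to do that; the file names it looks for and the function `check_repository` are assumptions for illustration, and a missing file does not prove that the corresponding check fails.

```python
from pathlib import Path

# Hypothetical file names to look for; adjust to your repository's conventions.
LICENSE_NAMES = ("LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING")
INSTALL_HINTS = ("Dockerfile", "pyproject.toml", "setup.py", "requirements.txt", "pom.xml")


def check_repository(repo_dir) -> dict:
    """Report which reusability hints are present at the top level of a repository."""
    root = Path(repo_dir)
    return {
        # A license file published with the code
        "has_license": any((root / name).is_file() for name in LICENSE_NAMES),
        # A container or packaging file that hints at how to run the code
        "has_install_hint": any((root / name).is_file() for name in INSTALL_HINTS),
        # A README, where installation and architecture notes usually live
        "has_readme": any(root.glob("README*")),
    }
```

Calling `check_repository(".")` from the root of a clone returns a dictionary of three booleans. These can serve as a starting point before filling in the links by hand.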

From the GO-FAIR website.