The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles are intended to facilitate the discovery and reuse of data, not only by people but also by machines. Read the full paper: Wilkinson et al., 2016, Scientific Data, https://doi.org/10.1038/sdata.2016.18.
You can find a list of websites and tools to assess whether a resource is FAIR at https://fairassist.org
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
URL to access your dataset metadata; ideally, use a persistent identifier system such as w3id.org
See Reusable
Does your dataset metadata properly refer to your data identifier? Paste an example of a metadata property that points to a data identifier (see the sketch below)
Link to the resource/portal where the metadata about your dataset can be found
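As an illustration of a metadata record that points to a persistent data identifier, here is a minimal sketch using Python and rdflib to build a DCAT description. The w3id.org and example.org URLs are placeholders, not real identifiers.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

g = Graph()
# Hypothetical persistent identifier for the dataset (placeholder URL)
dataset = URIRef("https://w3id.org/example/dataset/my-dataset")
distribution = URIRef("https://w3id.org/example/dataset/my-dataset/csv")

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("My example dataset")))
# The metadata property below is the one pointing to the data identifier
g.add((dataset, DCAT.distribution, distribution))
g.add((distribution, RDF.type, DCAT.Distribution))
g.add((distribution, DCAT.downloadURL, URIRef("https://example.org/data/my-dataset.csv")))

print(g.serialize(format="turtle"))
```

Serialising the metadata as Turtle (or JSON-LD) keeps it machine-readable, which supports the automatic discovery mentioned above.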
Once the user finds the required data, the user needs to know how they can be accessed, possibly including authentication and authorisation.
Which protocol is used to access the data? HTTPS, HTTP, FTP, SSH?
Do you need specific authorisation to access these data (e.g. login/password)?
If applicable, how can you obtain this authorisation?
Link to the metadata long-term storage (e.g. a metadata portal), independent of where the data is accessible
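To make the access questions above concrete, here is a minimal sketch of retrieving dataset metadata over HTTPS with an optional bearer token, using Python and the requests library. The URL and token are placeholders for whatever protocol and authorisation your portal actually uses.

```python
import requests

# Placeholder metadata URL; replace with your dataset's persistent identifier
METADATA_URL = "https://w3id.org/example/dataset/my-dataset"
# Set to an API token if the portal requires authorisation, otherwise leave as None
TOKEN = None

headers = {"Accept": "application/ld+json"}
if TOKEN:
    headers["Authorization"] = f"Bearer {TOKEN}"

response = requests.get(METADATA_URL, headers=headers, timeout=30)
response.raise_for_status()  # fail loudly if the metadata is not accessible
print(response.json())
```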
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
In which formats are your data and metadata available? RDF, JSON, CSV, SPSS, XLSX?
Link to the data model you use (e.g. OWL ontologies, SHACL/ShEx shapes, schema files)
Feel free to provide human-readable details here to explain which schema you use and why
Specify if your dataset builds on another dataset, if additional datasets are needed to complete the data, or if complementary information is stored in a different dataset
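If your data model is expressed as SHACL shapes, one way to check that your data conforms to it is the pySHACL library. The sketch below assumes a data file `dataset.ttl` and a shapes file `shapes.ttl`, which are placeholder names.

```python
from rdflib import Graph
from pyshacl import validate

# Placeholder file names; point these at your own data and shapes
data_graph = Graph().parse("dataset.ttl", format="turtle")
shapes_graph = Graph().parse("shapes.ttl", format="turtle")

conforms, report_graph, report_text = validate(
    data_graph,
    shacl_graph=shapes_graph,
    inference="rdfs",
)
print("Conforms:", conforms)
print(report_text)
```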
The ultimate goal of FAIR is to optimize the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
- R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
Link to the data license file
Link to an example of provenance metadata for the data
Short explanation of why your data model choice meets the domain community standards, with links to relevant websites about the standard.
- R1.4. Complementary: a graph summary is available to browse your data
Create and publish a descriptive graph summary for your data: number of classes, instances, and relations. We recommend using the HCLS dataset description. Provide the URL to access and browse those statistics, if applicable (a sketch for computing such statistics follows below).
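As a sketch of how such statistics can be computed, the Python snippet below counts triples, classes, typed entities, and properties with SPARQL via rdflib, then publishes them as VoID statistics (which the HCLS dataset description profile builds on). The file name and dataset URI are placeholders.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

VOID = Namespace("http://rdfs.org/ns/void#")

# Placeholder file name; replace with your own data dump
g = Graph().parse("dataset.ttl", format="turtle")

def count(query: str) -> int:
    """Run a SPARQL COUNT query and return the result as an int."""
    return int(next(iter(g.query(query)))[0])

n_triples = count("SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }")
n_classes = count("SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }")
n_entities = count("SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s a ?c }")
n_properties = count("SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }")

# Placeholder dataset URI; use your persistent identifier here
dataset = URIRef("https://w3id.org/example/dataset/my-dataset")
summary = Graph()
summary.bind("void", VOID)
summary.add((dataset, RDF.type, VOID.Dataset))
summary.add((dataset, VOID.triples, Literal(n_triples, datatype=XSD.integer)))
summary.add((dataset, VOID.classes, Literal(n_classes, datatype=XSD.integer)))
summary.add((dataset, VOID.entities, Literal(n_entities, datatype=XSD.integer)))
summary.add((dataset, VOID.properties, Literal(n_properties, datatype=XSD.integer)))
print(summary.serialize(format="turtle"))
```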
- R2. Source code to produce and publish the data is reusable
- R2.1. Code to produce and publish the data is versioned and accessible
Git repository URL (GitHub, GitLab, BitBucket...)
- R2.2. The code is published with a license
Link to the license file
- R2.3. Code can easily be executed in containers or using a common packaging system
Link to the documentation to install requirements and run the code (Docker image, pip packages, Java JAR file...)
- R2.4. The architecture and concepts behind the data transformation are briefly explained
Link to the documentation with those explanations
From the GO-FAIR website.