Replace SPARQL queries and OWLTools by a ROBOT plugin #1174

gouttegd · 2025-02-03T20:10:07Z

This PR is intended to experiment with using a ROBOT plugin to replace both

(1) SPARQL queries (as suggested in #1169), and

(2) OWLTools (still used in standard workflows for two things: normalising an OBO source file, and creating subsets -- #622).

For now, this is using my own “experimental ROBOT plugin”. If we are happy with the experiment, we can then create a proper ODK plugin for ROBOT (and/or push some of the features in upstream ROBOT).

Install my experimental ROBOT plugin as a built-in plugin under the name "odk". This is for experimentation only -- I use this plugin to trial the use of pluggable commands in the ODK workflows. If we go on with that route, we will create a dedicated ODK plugin later.

When preparing import modules, we do a few things: (1) add a dc:source ontology annotation, derived from the version IRI of the original ontology; (2) remove all other ontology annotations, keeping only the newly added dc:source; (3) inject proper SubAnnotationPropertyOf axioms for properties representing subsets and synonym types.

All those steps are currently performed by SPARQL queries. Here we replace those queries by calls to the `odk:annotate` command, which takes care of (1) and (2), and to the `odk:normalize` command, which takes care of (3).

Of note, the fact that we are no longer going through a SPARQL processing step means that we could end up with duplicated axioms with different sets of annotations. Those were automatically merged as a side-effect of the SPARQL processing (which involves dumping the output of the SPARQL processing and re-parsing it into OWLAPI objects). Since we no longer benefit from that side-effect, we must explicitly include a step in which we merge duplicated axioms (theoretically this could be done with `robot repair --merge-axiom-annotations`, but unfortunately this command does not behave exactly like we would want [1]).

[1] ontodev/robot#1239
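To make "merging duplicated axioms" concrete, here is a minimal sketch on a toy model. It does not use OWLAPI or the plugin's code: axioms are plain strings, annotations are sets of strings, and `merge_duplicate_axioms` is a hypothetical helper written only for this illustration.

```python
from collections import defaultdict


def merge_duplicate_axioms(axioms):
    """Merge axioms that are identical except for their annotations.

    `axioms` is an iterable of (axiom, annotations) pairs, where `axiom` is a
    hashable stand-in for the axiom without its annotations (an assumption of
    this toy model) and `annotations` is a set of annotation strings.
    """
    merged = defaultdict(set)
    for axiom, annotations in axioms:
        # Same axiom seen again: take the union of the annotation sets.
        merged[axiom].update(annotations)
    return {axiom: frozenset(annots) for axiom, annots in merged.items()}


axioms = [
    ("A SubClassOf B", {"source: X"}),
    ("A SubClassOf B", {"source: Y"}),  # duplicate with different annotations
    ("C SubClassOf D", set()),
]
result = merge_duplicate_axioms(axioms)
```

After merging, the two copies of `A SubClassOf B` collapse into a single axiom carrying both annotations, which is the outcome the SPARQL round-trip used to produce as a side-effect.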

We are still using OWLTools for two things: (1) creating ontology subsets; (2) merging duplicated axioms in the source file. Those tasks can now be done by the `odk:subset` command and the `odk:normalize` command, respectively.

The inject-subset-declaration.ru and inject-synonymtype-declaration.ru SPARQL queries are no longer used in any standard workflows.

Now that the standard workflows no longer use OWLTools, there is no longer any need for OWLTools to be present in ODKLite (whose purpose is to contain all the tools needed by the standard workflows, and only those tools). We thus move it to ODKFull.

By default, the `odk:subset` command does _not_ send the generated subset down the ROBOT pipeline, unless the `--replace true` option is used.

We add a new option in the 'robot_report' section called 'upper_ontology'. If set, it should be the (resolvable) IRI of an upper ontology (such as http://purl.obolibrary.org/obo/cob.owl). When set, a new report is added to the list of ROBOT reports, one that tests whether all classes of the ontology are classified under one of the classes of the upper ontology. The new report uses the same parameters as the standard ROBOT reports regarding the file to perform the check on ('report_on') and whether the check should be limited to classes within the project's namespaces or not ('use_base_iris'). See #1175
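The idea behind the alignment check can be sketched as follows. This is only an illustration under simplifying assumptions: it walks an asserted subclass hierarchy given as a plain dict (where every class to be checked must appear as a key), whereas the actual report runs on an ontology file via ROBOT. The function name, the dict layout, and the `ex:`/`cob:` identifiers are all hypothetical.

```python
def unaligned_classes(parents, upper_classes, base_iris=None):
    """Return the classes that have no ancestor in the upper ontology.

    `parents` maps each class to its direct superclasses. If `base_iris` is
    given, only classes whose identifier starts with one of those prefixes
    are checked (loosely mirroring the 'use_base_iris' option).
    """
    upper = set(upper_classes)

    def is_aligned(cls):
        seen, stack = set(), [cls]
        while stack:
            current = stack.pop()
            if current in upper:
                return True
            if current in seen:
                continue  # guard against cycles in the hierarchy
            seen.add(current)
            stack.extend(parents.get(current, ()))
        return False

    candidates = [c for c in parents if c not in upper]
    if base_iris is not None:
        candidates = [c for c in candidates if c.startswith(tuple(base_iris))]
    return sorted(c for c in candidates if not is_aligned(c))


parents = {
    "ex:cell": ["cob:material_entity"],
    "ex:neuron": ["ex:cell"],
    "ex:orphan": [],
    "other:thing": [],
}
all_unaligned = unaligned_classes(parents, {"cob:material_entity"})
own_unaligned = unaligned_classes(parents, {"cob:material_entity"}, base_iris=["ex:"])
```

Here `all_unaligned` contains both `ex:orphan` and `other:thing`, while restricting the check to the `ex:` prefix leaves only `ex:orphan`.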

This commit fixes several issues with the generated Makefile rule that performs the alignment check:

* Only include the alignment report when an upper ontology is defined.
* When asked to perform the check on the -edit file, actually perform it on the $(SRCMERGED) file (for consistency with other reports).
* Use the reasoner defined in the project, if any.
* Fix formatting so that the generated rules are somewhat readable.

The status of the `project.robot_report.upper_ontology` field cannot simply be tested with either 'is defined' or 'is not none', because it will depend on whether a `robot_report` section exists at all:

* without a `robot_report` section, the field is *defined* but is None (default value);
* with a `robot_report` section but no `upper_ontology` field, the field is *not defined*.

So to cover both cases, we need to test both for the existence of the field, and whether it is None or not.
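The two cases can be mimicked in plain Python. This is an analogy rather than the template code itself: the `robot_report` section is modelled as a dict, a missing key stands in for "not defined", and `upper_ontology_is_set` is a hypothetical helper.

```python
def upper_ontology_is_set(robot_report):
    """Return True only if the field exists AND holds a non-None value."""
    is_defined = "upper_ontology" in robot_report
    return is_defined and robot_report["upper_ontology"] is not None


# Case 1: no robot_report section in the config, so the default section is
# used and the field is defined but None.
default_section = {"upper_ontology": None}

# Case 2: a robot_report section exists but omits the field entirely.
section_without_field = {"report_on": ["edit"]}

# Case 3: the field is explicitly set.
section_with_field = {"upper_ontology": "http://purl.obolibrary.org/obo/cob.owl"}

checks = [
    upper_ontology_is_set(default_section),
    upper_ontology_is_set(section_without_field),
    upper_ontology_is_set(section_with_field),
]
```

Testing only for existence would wrongly accept case 1, and testing only for None would fail on case 2 (a `KeyError` here, an undefined value in the template), which is why both tests are combined.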

matentzn

this is sooooo awesome. I have a few questions (in code) and 1 general concern.

Is there any way you would agree to move the ODK robot plugin into the INCATools org? I would feel a bit better if that component that is shaping up to be a core component of the ODK build system would live a bit more visibly here..

matentzn · 2025-02-08T11:41:41Z

template/src/sparql/inject-synonymtype-declaration.ru.jinja2

We have to be careful about one thing here: not to delete this file from repos in some future implementation of an update_repo cleanup. Currently I think this is ok.

The reason: I know for a fact that this file (and the subset one) is used in at least 8 or so custom.Makefile files (basically every time someone had to customise an imports/*_import.owl goal).

matentzn · 2025-02-08T11:42:07Z