Handle or disallow Solution Modifiers for SPARQL inputs #126
Comments
Possible predicate function for checking a query for solution modifiers:

    import re

    def query_has_solution_modifiers(query: str) -> bool:
        """Predicate for checking if a SPARQL query has a solution modifier."""
        pattern = r"}[^}]*\w+$"
        result = re.search(pattern, query)
        return bool(result)

The regex matches the last '}' and everything after it, provided the query ends in word characters. So this only checks for the existence of a solution modifier; it does not cleanly extract possible solution modifiers.
Note: Handling of nested queries with solution modifiers is much more complicated and very likely not even necessary.
Consider a pattern that combines a positive lookbehind with an absolute anchor. It can match the solution modifier of the outermost SPARQL clause regardless of whether multiline mode is active, and its group captures the solution modifier more cleanly.
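A pattern with these properties might look like the sketch below. It is a hypothetical illustration, not necessarily the pattern meant above: the function name `get_outer_solution_modifier` and the exact regex are assumptions, and the approach does not account for '}' characters inside string literals, IRIs, or comments.

```python
import re
from typing import Optional

# Hypothetical pattern: the lookbehind requires a '}' directly before the match,
# and \Z only matches at the very end of the string, so re.MULTILINE has no effect.
_MODIFIER_PATTERN = re.compile(r"(?<=\})\s*(\w[^}]*?)\s*\Z")


def get_outer_solution_modifier(query: str) -> Optional[str]:
    """Return the text after the outermost closing brace, or None if there is none."""
    match = _MODIFIER_PATTERN.search(query)
    return match.group(1) if match else None
```

Because `[^}]` cannot cross a closing brace, modifiers of nested subqueries are skipped and only the trailing text of the outermost clause is captured.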
Actually, there is a very trivial solution for allowing incoming queries with solution modifiers: provide a very thin SPARQL wrapper whenever a solution modifier is detected. E.g. a query with a solution modifier

    select ?s
    where {
      ?s ?p ?o .
    }
    limit 10

can be thinly wrapped like

    select ?s
    where {
      {
        select ?s
        where {
          ?s ?p ?o .
        }
        limit 10
      }
    }
    limit 2
    offset 0

LIMIT/OFFSET in the outer query represent the RDFProxy query modifications, which can then be applied without problems. A warning should be emitted if a query with solution modifiers is detected, though; there is rarely, if ever, a need for passing queries with solution modifiers to RDFProxy.
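The wrapping could be sketched roughly as follows. This is a minimal illustration, not RDFProxy code: `wrap_query` and its parameters are hypothetical, the caller is assumed to have already detected a solution modifier, and the projection is passed in explicitly because extracting it from the query would require actual SPARQL parsing.

```python
import textwrap
import warnings


def wrap_query(query: str, projection: str, limit: int, offset: int) -> str:
    """Wrap `query` as a subquery and apply pagination modifiers to the outer query."""
    # Assumption: this function is only called once a solution modifier was detected.
    warnings.warn(
        "Query defines its own solution modifiers; wrapping it in a subquery.",
        stacklevel=2,
    )
    inner = textwrap.indent(query.strip(), "    ")
    return (
        f"select {projection}\n"
        "where {\n"
        "  {\n"
        f"{inner}\n"
        "  }\n"
        "}\n"
        f"limit {limit}\n"
        f"offset {offset}"
    )
```

Calling `wrap_query(query, "?s", 2, 0)` on the `limit 10` query reproduces the shape of the wrapped query, with the original modifier confined to the subquery.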
This solution is only applicable to ungrouped models. Grouped models are significantly more complicated, and I am currently having trouble even imagining a sane use case for allowing solution modifiers with grouped models.
Actually, I am in favor of generally disallowing solution modifiers. Everything that can be done with solution modifiers can be done with RDFProxy parameters, and elaborately nesting and wrapping queries when solution modifiers are detected would very likely lead to unexpected results for users.
Given a grouped model, for a simple query with solution modifier like

    select ?x ?y
    where {
      values (?x ?y) {
        (1 2)
        (1 3)
        (2 4)
        (3 5)
      }
    }
    limit 2

the following steps are necessary for handling the query:
So for the query above, the following query would need to get generated:

    select ?x ?y
    where {
      {
        select ?x ?y
        where {
          values (?x ?y) {
            (1 2)
            (1 3)
            (2 4)
            (3 5)
          }
        }
        limit 2
      }
      # injected
      {
        select distinct ?x
        where {
          values (?x ?y) {
            (1 2)
            (1 3)
            (2 4)
            (3 5)
          }
        }
        order by ?x limit 2 offset 0
      }
    }

Again, I wonder if generally disallowing outer solution modifiers would not be the better option here.
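To make the shape of that generated query concrete, here is a rough sketch of assembling it from its parts. Everything in it is an assumption for illustration: `build_grouped_query` is a hypothetical helper, and the projection and the graph pattern are supplied by the caller, since deriving them from the input query would require real SPARQL analysis.

```python
import textwrap


def build_grouped_query(
    query: str, projection: str, pattern: str, group_var: str, limit: int, offset: int
) -> str:
    """Join `query` (kept as a subquery) with a pagination subquery over `group_var`."""
    original = textwrap.indent(query.strip(), "    ")
    body = textwrap.indent(pattern.strip(), "      ")
    return (
        f"select {projection}\n"
        "where {\n"
        "  {\n"
        f"{original}\n"
        "  }\n"
        "  # injected\n"
        "  {\n"
        f"    select distinct ?{group_var}\n"
        "    where {\n"
        f"{body}\n"
        "    }\n"
        f"    order by ?{group_var} limit {limit} offset {offset}\n"
        "  }\n"
        "}"
    )
```

Even in this simplified form, the caller has to supply the pattern separately from the query, which hints at how much query analysis a real implementation would need.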
rdfproxy.SPARQLModelAdapter implements pagination by dynamically modifying the supplied SPARQL query. For ungrouped models, OFFSET and LIMIT modifiers are injected; for grouped models, a subquery (based on the initial query) is generated and injected into the initial query.
E.g. for a given query and a group_by definition in the respective model specifying "parent" as the grouping value, a query with such an injected subquery gets generated. See #114.
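For the ungrouped case, the injection amounts to appending computed modifiers. The following is a minimal sketch under stated assumptions, not the actual rdfproxy implementation: `add_pagination` is a hypothetical name, pages are assumed to be 1-based, and the input query is assumed to carry no solution modifiers of its own.

```python
def add_pagination(query: str, page: int, size: int) -> str:
    """Append LIMIT/OFFSET for a 1-based `page` of `size` rows to an ungrouped query."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive integers")
    offset = (page - 1) * size
    return f"{query.rstrip()}\nlimit {size}\noffset {offset}"
```

If the input already ended in its own LIMIT or OFFSET, the result would contain two competing sets of modifiers, which is exactly the unhandled case this issue is about.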
This raises the question, though, of how to handle input queries that themselves define solution modifiers.
rdfproxy currently simply does not handle that case; passing a query with solution modifiers is undefined behavior and will almost certainly lead to unintended effects. So the case urgently MUST be handled.
The easiest and probably best solution to that problem is to simply disallow solution modifiers for SPARQL inputs.
First of all, I cannot really think of a reason to use SPARQL queries with solution modifiers in rdfproxy. LIMIT and OFFSET are handled by rdfproxy pagination, and GROUP BY is handled by rdfproxy's mapping and grouping mechanism.
Secondly, proper handling of that case would come with significant overhead and difficulties. Apart from the lack of proper SPARQL query analysis facilities for Python, one would also need to decide which cases of solution modifiers in an inbound query could even be meaningful in the context of rdfproxy.
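Disallowing could be as simple as rejecting such queries up front. The sketch below reuses the regex heuristic proposed in the comments; the exception class `SolutionModifierError` and the function `check_query` are hypothetical names and not part of rdfproxy.

```python
import re


class SolutionModifierError(ValueError):
    """Hypothetical exception for queries that carry their own solution modifiers."""


def check_query(query: str) -> str:
    """Return `query` unchanged, or raise if it appears to end in a solution modifier."""
    # Heuristic from the comments: word characters after the last '}' are taken
    # to indicate a solution modifier.
    if re.search(r"}[^}]*\w+$", query):
        raise SolutionModifierError(
            "SPARQL queries passed to rdfproxy must not define solution modifiers."
        )
    return query
```

Running such a check once, when the query is first received, would surface the problem before any pagination logic touches the query.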