-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of EXISTS
and NOT EXISTS
#1703
Conversation
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Add the workflows.....
Signed-off-by: Johannes Kalmbach <[email protected]>
Another test...
Signed-off-by: Johannes Kalmbach <[email protected]>
More sparql conformance stuff...
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1703 +/- ##
==========================================
+ Coverage 90.07% 90.12% +0.04%
==========================================
Files 396 399 +3
Lines 38021 38199 +178
Branches 4266 4281 +15
==========================================
+ Hits 34247 34426 +179
+ Misses 2486 2473 -13
- Partials 1288 1300 +12 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
As a next step, I want to write some comments. Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
The only thing that is missing, is some corner case tests, and maybe cleaning up the parsing of the active dataset clauses. Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
EXISTS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great and I have made a thorough pass now and committed some changes. A few more questions and then this is ready to merge:
-
In the code for
ExistsExpression::getCacheKey
you mention that it can happen that theExistsJoin
is not yet set up because the query planning is not yet finished, in which case you set a random cache key. For which scenario is this relevant? In particular, when does it ever happen that query processing starts before query planning is completed? -
In
SparqlAntlrParserTest.cpp
, you define aselectABarFooMatcher
, which can takeFROM
andFROM NAMED
clauses as arguments. But then you only use it with the default arguments. -
At the end of
ExistsJoinTest.cpp
there are two TODOs by yourself suggesting a few more tests, how about adding them? -
Is it hard to use
ad_utility::callFixedSize
for theExistsJoin
or should that be a separate PR? -
Can you check the one SonarCloud issue that is not a TODO?
-
Are you fine with the patch coverage reported by CodeCov?
EXISTS
EXISTS
and NOT EXISTS
# Conflicts: # .github/workflows/sparql-conformance.yml
Signed-off-by: Johannes Kalmbach <[email protected]>
# Conflicts: # src/engine/CMakeLists.txt # src/engine/sparqlExpressions/ExistsExpression.h # test/engine/ExistsJoinTest.cpp
…l changes. Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joka921 This looks great now, thanks a lot. I tried it on a few queries on Wikidata and they worked fine. I have committed some minor comment improvements and then this should be almost ready to merge, provided that all the tests run through.
What I don't understand is the addition of https://github.com/ad-freiburg/qlever/blob/87ba7cad5b1124b4095beb9fa31b040e85a7de17/.github/workflows/upload-sparql-conformance.yml . In the current master, we have https://github.com/ad-freiburg/qlever/blob/master/.github/workflows/sparql-conformance-uploader.yml, which looks a bit different though. Please clarify.
EXISTS
and NOT EXISTS
EXISTS
and NOT EXISTS
It got lost in the merge conflict resolution.
Conformance check passed ✅Test Status Changes 📊
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted the involuntary changes in .github/workflows
(regarding the SPARQL conformance test) and fixed another small merge bug. Everything looks great and I will happily merge this now!
PS: The failing macOS check has nothing to do with this PR in particular (the packet installation step fails).
This implements
EXISTS
via a newExistsJoin
operation. Namely, for each operation involving an expression (for example, aBIND
or aFILTER
), anExistJoin
operation is added for each occurence ofEXISTS
in that expression. TheExistsJoin
adds an additional column (with a unique variable name) that contains the result of theEXISTS
and is used to evaluate the expression. This can can be seen in the "Analysis" tree when executing a query. The current implementation has the following limitations, which will be addressed in future PRs:ExistsJoin
operation is not yet lazyEXISTS
is currently handled completely independent from the rest of the query (except forFROM
ANDFROM NAMED
clauses, which are inherited from the outer query); it is an ongoing debate whether that is correct when theEXISTS
containsFILTER
s that refer to variables from outside theEXISTS
.UNDEF
values in the join columns, theExistsJoin
uses our generic zipper join, which is not particularly efficient