Unescape Unicode sequences in the SPARQL parser #1770

Merged 13 commits on Feb 7, 2025

Conversation

Collaborator

@RobinTF RobinTF commented Feb 6, 2025

This PR ensures that Unicode escape sequences are unescaped before the query string is passed to ANTLR for the actual parsing step (see the SPARQL 1.1 specification for details). UTF-16 surrogate pairs are handled correctly. The ctre version is also bumped so that search_all (the non-deprecated variant of range) can be used.
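
For readers unfamiliar with this pre-processing step, here is a minimal sketch of the idea, not the code from this PR. It assumes the two codepoint escape forms from the SPARQL 1.1 grammar (`\u` followed by 4 hex digits, `\U` followed by 8 hex digits) and UTF-8 output. The names `parseHexSketch`, `appendUtf8Sketch` and `unescapeUnicodeSketch` are hypothetical, and the sketch scans by hand instead of using ctre. Rejecting unpaired surrogates and incomplete escapes with an exception is also an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: read exactly `n` hex digits from the start of `s`.
std::optional<std::uint32_t> parseHexSketch(std::string_view s, std::size_t n) {
  if (s.size() < n) return std::nullopt;
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < n; ++i) {
    char c = s[i];
    std::uint32_t digit;
    if (c >= '0' && c <= '9') digit = c - '0';
    else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
    else return std::nullopt;
    value = (value << 4) | digit;
  }
  return value;
}

// Hypothetical helper: append the UTF-8 encoding of code point `cp`.
void appendUtf8Sketch(std::string& out, std::uint32_t cp) {
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
}

// Replace \uXXXX and \UXXXXXXXX by the UTF-8 bytes of the code point.
// Simplification: every other character, including other backslash
// escapes, is copied through unchanged.
std::string unescapeUnicodeSketch(std::string_view in) {
  std::string out;
  out.reserve(in.size());
  std::size_t i = 0;
  while (i < in.size()) {
    bool isEscape = in[i] == '\\' && i + 1 < in.size() &&
                    (in[i + 1] == 'u' || in[i + 1] == 'U');
    if (!isEscape) {
      out += in[i++];
      continue;
    }
    std::size_t digits = in[i + 1] == 'u' ? 4 : 8;
    auto parsed = parseHexSketch(in.substr(i + 2), digits);
    if (!parsed) throw std::runtime_error("incomplete Unicode escape");
    std::uint32_t cp = *parsed;
    i += 2 + digits;
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: combine with a directly following \uXXXX low surrogate.
      std::optional<std::uint32_t> low;
      if (in.substr(i, 2) == "\\u") low = parseHexSketch(in.substr(i + 2), 4);
      if (!low || *low < 0xDC00 || *low > 0xDFFF) {
        throw std::runtime_error("unpaired high surrogate");
      }
      cp = 0x10000 + ((cp - 0xD800) << 10) + (*low - 0xDC00);
      i += 6;
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      throw std::runtime_error("unpaired low surrogate");
    }
    if (cp > 0x10FFFF) throw std::runtime_error("code point out of range");
    appendUtf8Sketch(out, cp);
  }
  return out;
}
```

With this sketch, `unescapeUnicodeSketch("\\U0001F600")` and `unescapeUnicodeSketch("\\uD83D\\uDE00")` both produce the same four UTF-8 bytes, which is the surrogate-pair behaviour the description refers to.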

@RobinTF RobinTF force-pushed the implement-utf-8-pre-parsing branch from 30f7865 to f42bdf1 Compare February 6, 2025 11:15

codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.01%. Comparing base (a307781) to head (777224d).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1770      +/-   ##
==========================================
+ Coverage   90.00%   90.01%   +0.01%     
==========================================
  Files         395      395              
  Lines       37838    37904      +66     
  Branches     4258     4263       +5     
==========================================
+ Hits        34055    34120      +65     
- Misses       2484     2486       +2     
+ Partials     1299     1298       -1     

Member

@joka921 joka921 left a comment

Thank you very much.

Review threads (all resolved):
src/parser/SparqlParserHelpers.cpp: 6 threads, 3 of them outdated
test/SparqlParserTest.cpp: 3 threads

@sparql-conformance

Conformance check passed ✅

Test Status Changes 📊

Number of Tests | Previous Status | Current Status
2               | Failed          | Passed

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=777224de8c38a47e06818d1acb752c08250ade1d&prev=a307781592842f82bdfef78fc47fc832fd37368d

sonarqubecloud bot commented Feb 6, 2025

@joka921 joka921 changed the title from "Unescape Unicode sequences before parsing" to "Unescape Unicode sequences in the SPARQL parser" Feb 7, 2025
@joka921 joka921 merged commit d7a70c7 into ad-freiburg:master Feb 7, 2025
24 checks passed
@RobinTF RobinTF deleted the implement-utf-8-pre-parsing branch February 7, 2025 12:52