Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support terminal lookbehind #1356

Merged
merged 1 commit into from
Feb 13, 2024
Merged

Support terminal lookbehind #1356

merged 1 commit into from
Feb 13, 2024

Conversation

msujew
Copy link
Member

@msujew msujew commented Jan 25, 2024

Closes #1355

Allows adopters to write positive/negative lookbehind in both EBNF and Regex notation, with full support in the lexer/parser phase.

@msujew msujew added the parser Parser related issue label Jan 25, 2024
@snarkipus
Copy link
Contributor

Yay - this cleans up my request from #591!

Copy link
Contributor

@sailingKieler sailingKieler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me! 🚀

@@ -49,7 +49,7 @@ export class DefaultTokenBuilder implements TokenBuilder {

protected buildTerminalToken(terminal: TerminalRule): TokenType {
const regex = terminalRegex(terminal);
const pattern = regex.flags.includes('u') ? this.regexPatternFunction(regex) : regex;
const pattern = this.requiresCustomPattern(regex) ? this.regexPatternFunction(regex) : regex;
Copy link
Contributor

@sailingKieler sailingKieler Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right that this distinction and the corresponding RegExp wrapping in regexPatternFunction is just there to prevent Chevrotain from applying validation rules to the given regex pattern?

Copy link
Member Author

@msujew msujew Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does have something to do with Chevrotain, yes. Basically, Chevrotain will apply a few optimizations on token types based on the regex you pass. If you pass a non-trivial regex (usually those with the /u flag or lookbehind), those optimizations cannot be performed. Instead of ignoring optimizations, it will instead throw an error. To disable those errors, we wrap the regex inside of an custom pattern function.

@szarnekow
Copy link

Do you plan to filter code completion based on the look-behind patterns, too?

@msujew
Copy link
Member Author

msujew commented Jan 30, 2024

@szarnekow Can you give an example? I don't think the completion takes the specific token pattern into account at all right now.

@szarnekow
Copy link

I see. The completion engine doesn't use the terminal rules to check inserted tokens for validity at all.

@msujew
Copy link
Member Author

msujew commented Jan 30, 2024

Yes, it will currently just insert whatever the scope provider returns, ignoring any potential parser/lexer error that this might result in.

@szarnekow
Copy link

szarnekow commented Jan 30, 2024

grammar test
entry Main: things=AorB+;
AorB: A|B;
A: ('A' | 'C') b=[B:B_TERMINAL]? ';';
B: name=B_TERMINAL ';'
terminal B_TERMINAL: (?<!'A')'B';
hidden terminal WS: /\\s+/;

Model with cursor positions <1> and <2> where I'd invoke CA:

B;
A<1>;
C<2>;

I wouldn't expect B to show up at <1> but at <2> - though I wouldn't get any proposals in either case because of the default reg filter in the completion engine.

@msujew
Copy link
Member Author

msujew commented Jan 31, 2024

I see, thanks for the example. I have created a separate issue, see #1368 👍

@msujew msujew force-pushed the msujew/support-lookbehind branch from 9a2d1ad to dca2ee3 Compare February 13, 2024 15:40
@msujew msujew merged commit 60934de into main Feb 13, 2024
5 checks passed
@msujew msujew deleted the msujew/support-lookbehind branch February 13, 2024 15:57
@msujew msujew added this to the v3.0.0 milestone Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parser Parser related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support lookbehind in terminal tokens
4 participants