Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update binary_operator_definitions regex to not stylize commented definitions #362

Merged
merged 2 commits into from
Jan 24, 2025

Conversation

FedericoPonzi
Copy link
Collaborator

@FedericoPonzi FedericoPonzi commented Jan 21, 2025

As per title, this seems to be the culprit.
Diagnosed by removing the definitions from the grammar file one at a time, reload the plugin and checked against the very helpful MRE provided in #361. Removing binary_operator_definitions made it working again.
Unescaped regex:

(\w+\s*(\(?[-<:>=&@\/%#!X\$\*\+\.\|\?\^\\]+\)?|\\[a-z]+\s)\s*\w+)\s*==(?!==)

From regex101:
image

Fix: I included a negative look-behind to exclude matching whenever this string starts with the sequence "*". It should still match \ and * separately.

I included a test for the grammar that was failing before my fix, identifying the definition in the comment as something else. Please give it a try and let me know if anything else is breaking for you.

Highlight before the patch:
image

Highlight after the patch:
image

Unescaped regex:

(\w+\s*(\(?(?!\\\*)[-<:>=&@\/%#!X\$\*\+\.\|\?\^\\]+\)?|\\[a-z]+\s)\s*\w+)\s*==(?!==)

From regex101:
image


(just some utils) Original unescaped regex:

const fs = require('fs');
const jsonData = JSON.parse(fs.readFileSync('regex.json', 'utf8'));
const regex = new RegExp(jsonData.match, 'g');
console.log("Regex as a string:", regex.source);
// Write the regex back to a new JSON file
const outputJsonData = { match: regex.source };
console.log(JSON.stringify(outputJsonData, null, 2));

@FedericoPonzi FedericoPonzi marked this pull request as draft January 22, 2025 07:38
@FedericoPonzi
Copy link
Collaborator Author

FedericoPonzi commented Jan 22, 2025

This is the before:
image
the --- x part is styled differently, I'm not sure if it's expected or not. I will take another look later.

@lemmy
Copy link
Member

lemmy commented Jan 22, 2025

@FedericoPonzi Do you know if it is possible to use a real parser for syntax highlighting in VSCode instead of relying on regular expressions?

@FedericoPonzi
Copy link
Collaborator Author

After some research: https://code.visualstudio.com/api/language-extensions/syntax-highlight-guide

Syntax highlight can be implemented via LSP, but this is usually an additional layer on top called Semantic Highlight: https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide

See rust syntax overview for reference: https://marketplace.visualstudio.com/items?itemName=dustypomerleau.rust-syntax

So my understanding is that you can implement it using LSP, but the regex based highlight should still be used for the first pass.

@FedericoPonzi FedericoPonzi force-pushed the issue-361 branch 2 times, most recently from 0a6e2d8 to 4f1acf7 Compare January 23, 2025 19:14
@FedericoPonzi FedericoPonzi marked this pull request as ready for review January 23, 2025 19:15
@FedericoPonzi
Copy link
Collaborator Author

I had some issues due to the escaping lol. I've updated the util code for unescaping to use javascript, in this way regex101 was correctly showing the wrong matches.

@FedericoPonzi FedericoPonzi merged commit 558b3ae into tlaplus:master Jan 24, 2025
4 checks passed
@FedericoPonzi FedericoPonzi deleted the issue-361 branch January 24, 2025 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants