[python3] Remove obsolete grammar python3 #3865

Closed
wants to merge 7 commits into from

kaby76 (Contributor) commented Nov 30, 2023

This is a fix for #3833.

  • The tests from python3 have been copied into python3_12_0. Everything parses with the new grammar, apparently correctly, although I have not checked expression.
  • The directory python/python3/ containing grammar has been removed.
  • The super pom.xml has been updated.

To do:

  • Bring python3_12_0/ grammars into standard target agnostic format.
    • Rename "self" to "this".
    • Include comments for Cpp target.
  • Add ports for Cpp, Dart, Go, JavaScript, PHP, TypeScript.
  • Add missing Reset() methods to the parser and lexer base classes so they can be reinitialized. Critical for the standard template drivers' -tokens argument.
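
Why the Reset() item matters can be shown with a minimal sketch. It assumes the lexer base class keeps extra state, such as an indentation stack, and that a driver scans the same input twice (once to dump tokens, once to parse). `IndentState` and `see_indent` are hypothetical stand-ins, not names from the grammar's base classes.

```python
class IndentState:
    """Hypothetical stand-in for the extra state a Python lexer base class keeps."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        # Everything accumulated while scanning must return to its initial
        # value, otherwise a second pass over the input starts dirty.
        self.indent_stack = [0]  # indentation widths currently open

    def see_indent(self, width: int) -> str:
        """Classify a line's leading width as INDENT, DEDENT, or SAME."""
        if width > self.indent_stack[-1]:
            self.indent_stack.append(width)
            return "INDENT"
        if width < self.indent_stack[-1]:
            self.indent_stack.pop()
            return "DEDENT"
        return "SAME"


state = IndentState()
first_pass = [state.see_indent(w) for w in (0, 4, 4)]
state.reset()  # what a driver would do between the token dump and the parse
second_pass = [state.see_indent(w) for w in (0, 4, 4)]
```

Without the `reset()` call, the second pass would begin with a stale stack and report a DEDENT for the first line. In a real base class, the override would presumably also call the runtime's own reset.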

RobEin (Contributor) commented Dec 1, 2023

Hi,
I am working on the ports and modifications; I hope to have more time for them soon.

kaby76 (Contributor, Author) commented Dec 1, 2023

> Hi, I am working on the ports and modifications; I hope to have more time for them soon.

Not necessary. I'll be done with the remaining ports by the end of the weekend.

Also, I want to review this grammar more carefully. For example, my static checks flagged useless parentheses. I was going to correct that, but then realized that in the PEG grammar those parentheses enclosed negative lookahead checks, which you did not carry over. Did you check these cases carefully? I also noticed there are no .tree files to test parse trees.

RobEin (Contributor) commented Dec 1, 2023

Thanks for the porting help.

I started to investigate these lookaheads and came to the conclusion that they exist only as speedups in the PEG grammar. In ANTLR4 they would instead cause a slowdown, because they have to be written as semantic predicates.
Speedups are needed in CPython because of the many special alternatives.
By special alternatives, I mean those alternatives that contain invalid rules or actions.
Since these are only needed by CPython, I combined these alternatives where possible and eliminated the lookaheads associated with them.
I haven't checked all the lookaheads I've deleted, but it passed all the tests I ran.

RobEin (Contributor) commented Dec 1, 2023

Sorry, but I don't know what you mean by testing .tree files.

kaby76 (Contributor, Author) commented Dec 1, 2023

The .tree files are the parse trees as represented by the toStringTree() method in the Antlr runtime. The files test the consistency of the tree across targets, across OSes, between versions of the grammar, and between versions of the Antlr tool and runtime.
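
A minimal sketch of such a check, assuming the tree string has already been produced by the parser (for example via toStringTree()). The helper names and the `.tree` file path are hypothetical; this is not the repository's actual test harness.

```python
from pathlib import Path


def normalize(tree_text: str) -> str:
    """Normalize line endings and trailing whitespace so OS differences do not matter."""
    lines = tree_text.replace("\r\n", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip()


def check_tree(actual_tree: str, tree_file: Path) -> bool:
    """Return True when the freshly produced tree matches the stored .tree file.

    If no .tree file exists yet, write one (the "mastering" step) and report success.
    """
    if not tree_file.exists():
        tree_file.write_text(actual_tree, encoding="utf-8")
        return True
    expected = tree_file.read_text(encoding="utf-8")
    return normalize(actual_tree) == normalize(expected)
```

Once a tree is mastered, any change to the grammar, target, or runtime that alters the tree for the same input makes `check_tree` return False.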

RobEin (Contributor) commented Dec 1, 2023

I understand. So this means comparing the text tree of the same input from an older Python parser with the one from the current python3_12_0.
In principle, they should be the same.
Good idea. I haven't done this yet, but I'll check it out.

RobEin (Contributor) commented Dec 1, 2023

Unfortunately, this is not feasible, because the Python parser rules in older versions have different names than the rules in python3_12_0.

kaby76 (Contributor, Author) commented Dec 1, 2023

> Unfortunately, this is not feasible, because the Python parser rules in older versions have different names than the rules in python3_12_0.

The trees aren't compared between different versions of Python and the .peg. After all, when a new version of Python and its .peg come along, an entirely new grammar will be created. The tests can be copied over and the trees re-mastered.

RobEin (Contributor) commented Dec 19, 2023

Can I help with any of the translations?

kaby76 (Contributor, Author) commented Dec 19, 2023

> Can I help with any of the translations?

@RobEin I'm working on the C++ port. I haven't had time to visit this lately because of the many issues with the builds for the repo. I am planning to continue this port soon.

RobEin (Contributor) commented Dec 19, 2023

I could start the JavaScript port, if that does not collide with your work.

kaby76 (Contributor, Author) commented Dec 19, 2023

> I could start the JavaScript port, if that does not collide with your work.

That'll be great. Thanks.

RobEin (Contributor) commented Jan 10, 2024

The JavaScript port is ready.
In addition:

  • self references renamed to this in the parser grammar and transformGrammar.py added to Python3 port
  • added reset() methods to the PythonLexerBase.*

I will also make a PR to the antlr/grammars-v4 soon.
I will continue the porting with TypeScript.
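
The renaming mentioned above can be illustrated with a short sketch. It assumes the grammar's actions and predicates are written with `this.` and that a script rewrites them to `self.` for the Python target. The function name and the regular expressions are assumptions for illustration, not the contents of the actual transformGrammar.py.

```python
import re


def to_python_target(grammar_text: str) -> str:
    """Rewrite target-agnostic `this` references for the Python target."""
    # A negated predicate such as `!this.x()` becomes `not self.x()`;
    # handle it before the plain rename so the `!` is not left behind.
    text = re.sub(r"!\s*this\.", "not self.", grammar_text)
    # Remaining `this.` references become `self.`.
    return re.sub(r"\bthis\.", "self.", text)
```

For instance, a predicate written as `{this.isX()}?` in the grammar would come out as `{self.isX()}?`, where `isX` is a made-up method name.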

kaby76 closed this by deleting the head repository on Feb 24, 2024