Compatibility/Integration with other systems #26
Replies: 3 comments 6 replies
-
This is a very important topic. ANTLR is a "parser generator", but it needs to work with tools beyond the scope of a parser generator. The first thing that has to be clarified is the grammar for ANTLR5 itself, because nothing can proceed on reading and analyzing grammars without the meta-meta model. The other thing to keep in mind is to think of the parse tree as a "DOM" (and hopefully to have round-trip import/export), because this is the standard many tools use. Intertoken text should be placed in the parse tree as attributes. I would probably add line/column information as well, although this duplicates information that can be derived from the frontier of the parse tree. What query language do you intend for LionWeb to support? It doesn't say much. I have been writing a requirements spec for the next version of Trash, which is a command-line grammar toolkit. Among other things, my plan is to include the Z3 SMT theorem prover for analysis of grammars. The other part I intend to do is to use an up-to-date XQuery engine. Trash passes around parse trees in a JSON format, but it includes other information about the char buffer, and minimal information about the grammar used in the parse. Passing around just a parse tree (actually a collection of many) is not enough.
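To make the "parse tree as a DOM" idea concrete, here is a minimal sketch in Python using only the standard library's `xml.etree.ElementTree`. The element names, the `before`/`line`/`column` attribute names, and the helper functions `leaf` and `to_source` are all made up for illustration; this is not the format Trash or LionWeb uses. It only shows that once intertoken text is stored as an attribute, the source can be rebuilt from the frontier after an export/import round trip.

```python
import xml.etree.ElementTree as ET


def leaf(parent, kind, text, before="", line=0, column=0):
    """Attach a token leaf; 'before' holds the intertoken text preceding it."""
    node = ET.SubElement(
        parent, kind,
        {"before": before, "line": str(line), "column": str(column)},
    )
    node.text = text
    return node


def to_source(node):
    """Rebuild the source text by walking the frontier (leaves) left to right."""
    if len(node) == 0:
        return node.get("before", "") + (node.text or "")
    return "".join(to_source(child) for child in node)


# Hypothetical parse tree for the input "x = 1;"
root = ET.Element("assignment")
leaf(root, "ID", "x", before="", line=1, column=0)
leaf(root, "EQ", "=", before=" ", line=1, column=2)
expr = ET.SubElement(root, "expr")
leaf(expr, "INT", "1", before=" ", line=1, column=4)
leaf(root, "SEMI", ";", before="", line=1, column=5)

# Export to XML text and import again (the round trip).
xml_text = ET.tostring(root, encoding="unicode")
restored = ET.fromstring(xml_text)

# The original text is recoverable from the restored tree alone.
assert to_source(restored) == "x = 1;"
```

Because the result is ordinary XML, generic tooling applies directly: for example, `restored.find(".//INT")` locates the integer token with the limited XPath subset that ElementTree supports, and a full XPath or XQuery engine could be used on the same serialization.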
-
@KvanTTT you are correct that Wasm will not resolve all performance problems with ANTLR. The idea is that it will significantly improve performance with JS/TS and Python. But other optimizations are more than welcome, and thanks to the unified Wasm runtime, they will become available for all targets; at least that's the proposed strategy.
-
The above pattern also appears in scotty.g4 as a right-recursion/de-Kleene operator pattern. This pattern forces a large max-k lookahead to decide when to finish the rule, causing full-context fallbacks, which are very, very expensive for ANTLR. Simple cases are easy to detect with XPath expressions, but more complicated ones that invoke string rewrites are harder.
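As a rough illustration of detecting the simple cases, here is a sketch in Python. It assumes that "de-Kleene" means a rule written with right recursion, such as `args : arg COMMA args | arg ;`, where `args : arg (COMMA arg)* ;` would do. The XML encoding of the grammar, the rule names, and the function `right_recursive_rules` are hypothetical and are not taken from scotty.g4 or from Trash. It uses the limited XPath subset in the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of a tiny grammar:
#   args : arg COMMA args | arg ;
#   arg  : ID ;
GRAMMAR_XML = """
<grammar>
  <rule name="args">
    <alt><sym>arg</sym><sym>COMMA</sym><sym>args</sym></alt>
    <alt><sym>arg</sym></alt>
  </rule>
  <rule name="arg">
    <alt><sym>ID</sym></alt>
  </rule>
</grammar>
"""


def right_recursive_rules(grammar):
    """Return names of rules with an alternative whose last symbol is the rule itself."""
    found = []
    for rule in grammar.findall("./rule"):
        name = rule.get("name")
        # XPath: the last <sym> child of every <alt> in this rule.
        last_symbols = rule.findall("./alt/sym[last()]")
        if any(sym.text == name for sym in last_symbols):
            found.append(name)
    return found


grammar = ET.fromstring(GRAMMAR_XML)
candidates = right_recursive_rules(grammar)
print(candidates)
```

This only flags direct right recursion. Indirect recursion through other rules, or cases where the fix requires rewriting the rule text, would need more than a single query.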
-
This is a bit of a broad topic but I would be curious to hear your opinions about this.
In general we all use parsers as components of some applications: a smart editor, a transpiler, a compiler, an interpreter, a code analysis system, etc.
So parsers need to be integrated with these other systems, which basically need to consume the parse trees returned by ANTLR (and possibly a list of issues), transform the parse trees, store them, generate output from them, etc.
Sometimes these systems also need to learn about the structure of the language: here I do not mean the syntax, but the list of the different kinds of nodes and the structure of each kind, which some people would call the meta-model (or M2). When I have this need I either parse the grammar or use reflection to derive the structure of the language from the structure of the Context classes.
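To give an idea of the reflection approach, here is a minimal sketch in Python. It does not use the real ANTLR runtime: `ParserRuleContext` and `DemoParser` below are stand-ins that only mimic the general shape of generated code (nested `...Context` classes with one accessor method per child), and `derive_metamodel` and `IGNORED` are names I made up for the example.

```python
import inspect


class ParserRuleContext:
    """Stand-in for the runtime base class (assumption, not the real one)."""


class DemoParser:
    """Hypothetical generated parser with two rule contexts."""

    class AssignmentContext(ParserRuleContext):
        def ID(self): ...
        def expr(self): ...
        def getRuleIndex(self): ...

    class ExprContext(ParserRuleContext):
        def INT(self): ...
        def getRuleIndex(self): ...


# Bookkeeping methods that do not describe children of the node.
IGNORED = {"getRuleIndex", "enterRule", "exitRule", "accept"}


def derive_metamodel(parser_cls):
    """Map each node kind to the names of its child accessors."""
    model = {}
    for name, cls in inspect.getmembers(parser_cls, inspect.isclass):
        if issubclass(cls, ParserRuleContext) and name.endswith("Context"):
            children = sorted(
                member for member, value in vars(cls).items()
                if inspect.isfunction(value)
                and not member.startswith("_")
                and member not in IGNORED
            )
            model[name[: -len("Context")]] = children
    return model


metamodel = derive_metamodel(DemoParser)
print(metamodel)
```

The result is a plain dictionary from node kind to child names. It says nothing about multiplicity or optionality, which is part of why parsing the grammar itself is the other option.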
This is a problem that I am seeing over and over again with different systems in the language engineering field and one possible answer is this project called LionWeb. In essence the idea is to define a series of formats and protocols for interoperability.
In the case of ANTLR, if ANTLR had a compatibility layer with LionWeb, we could benefit from all the infrastructure that currently exists in LionWeb and from whatever is eventually produced.
In the future one could also share the effort of building common infrastructure, for example code generators that take LionWeb models as input, or tree-rewriting/model-transformation systems that work on any LionWeb model (here the alternative is to create a tree-rewriting system that is specific to ANTLR5).
Besides LionWeb, I am curious to hear whether you have encountered cases where you wanted to import a parse tree into some other system, serialize it, or process it with existing tools and libraries, and had to write some sort of adapter to do so.
I hope this does not sound too confused or broad.