PyRD

PyRD is a recursive descent parser generator with backtracking implemented in Python.

Installation

PyRD can be installed by running python setup.py install from the top level of the repository.

Command Line Usage

PyRD can be invoked as either a python module with python -m pyrd <input.grammar> <output.py> or as a standalone program with pyrdg <input.grammar> <output.py>. The syntax of <input.grammar> is described in the next section, and <output.py> will be generated as an importable Python file.

`.grammar` Syntax

.grammar files consist of 2 sections, separated by the marker %%. The first section consists of a list of the parser's production rules, and the second section consists of raw Python code that will be appended to the end of the generated parser. Prepending to the generated parser is currently not supported. Production rules are specified as follows:

production_id :: list of space-separated subparsers {raw python code}
    | another list of subparsers {raw python code}
    | more subparsers {raw python code};

Production rules are tried sequentially, so if the first rule belonging to an id succeeds the second will not be run. The values returned by any subparser in a production rule are populated by name into that production rule's python code, eg:
list_index :: list_ "[" index "]" { return list_[index] };
Two special parsers for string literals and regexes can be accessed with "characters-to-match" and /regex-to-match/, respectively. Their results can be accessed by indexing the special variable parsed in their production rule's python code, eg:
int_in_brackets :: "[" /-?[0-9]+/ "]" {return int(parsed[1])};
The python code between brackets is run whenever the production rule succeeds, and is used as that parser's value when called from another production rule. It is substituted as-is into the generated parser, so care must be taken to comply with Python's indentation rules.

An example parser that finds the maximum or minimum of a comma-separated list of integers might look like:

list     :: operator "(" items ")" {return ops[operator](items[::-1])};
operator :: /(min|max)/ {return parsed[0]};
items    :: num "," items {items.append(num); return items}
          | num {return [num]};
num      :: /-?[0-9]+/ {return int(parsed[0])};
%%
ops = {
  "min":min,
  "max":max
}
if __name__ == '__main__':
    parsed = List().parse(input())
    print(parsed)

The complete description for .grammar files can be found in examples/grammar.grammar

Examples

The examples/ directory contains several example grammars for PyRD. Examples can be compiled and tested as follows:

$ pyrdg examples/calc.grammar calc.py
Grammar parsed successfully.
$ echo "1 + 2 * 3" | python calc.py
7

Benchmarking

PyRD does not generate efficient parsers. The following parse times were achieved with a json parser generated by PyRD (examples/json.grammar):

File Size (KB)	Runtime (PyRD)	Runtime (builtin JSON parser)
10	0.120s	< 0.1s
30	0.257s	< 0.1s
70	0.548s	< 0.1s
140	1.193s	< 0.1s
400	5.521s	< 0.1s
1000	44.8s	0.1s

Future Work

Support more flexible naming of production rule ids.
Document PyRD Classes.
Improve runtime by implementing some form of LL(k) prediction.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
examples		examples
pyrd		pyrd
scripts		scripts
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PyRD

Installation

Command Line Usage

`.grammar` Syntax

Examples

Benchmarking

Future Work

About

Releases

Packages

Languages

westphallm1/PyRD

Folders and files

Latest commit

History

Repository files navigation

PyRD

Installation

Command Line Usage

.grammar Syntax

Examples

Benchmarking

Future Work

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

`.grammar` Syntax

Packages