rapidyaml: add parsing of ys to integer-based events #209

Open
wants to merge 1 commit into
base: rapidyaml
@biojppm biojppm commented Jan 2, 2025

@biojppm requested a review from ingydotnet January 2, 2025 20:14
@biojppm self-assigned this January 2, 2025
@biojppm marked this pull request as draft January 2, 2025 20:14
@biojppm changed the base branch from rapidyaml to rapidyaml-static-linking January 2, 2025 20:17
@biojppm force-pushed the rapidyaml-int-events branch 3 times, most recently from eb417b3 to 721953a January 4, 2025 18:15
@biojppm force-pushed the rapidyaml-static-linking branch from f08cc57 to 4cadd1f January 4, 2025 20:12
@biojppm force-pushed the rapidyaml-int-events branch from 721953a to c8e9440 January 4, 2025 20:36
@biojppm changed the title from "WIP Rapidyaml int events" to "rapidyaml: add parsing of ys to integer-based events" January 4, 2025
@biojppm marked this pull request as ready for review January 4, 2025 21:25
@biojppm changed the base branch from rapidyaml-static-linking to rapidyaml January 4, 2025 21:25
@biojppm force-pushed the rapidyaml-int-events branch 2 times, most recently from 2d34d2c to 06db894 January 5, 2025 00:53
@ingydotnet force-pushed the rapidyaml branch 2 times, most recently from 298e77f to b530cc8 January 5, 2025 15:20
@biojppm force-pushed the rapidyaml-int-events branch 2 times, most recently from 88fab31 to 6c5925d January 5, 2025 16:23
@biojppm force-pushed the rapidyaml-int-events branch from 6c5925d to 5d04f15 January 5, 2025 16:25
Adds a parser that produces integer events, with strings represented
as integers that index into the parsed YS string.

Each event is provided as an integer bitmask. The event bits are
defined on both the C++ side (`ysparse_evt_handler.hpp`) and the Java
side (`org.rapidyaml.Evt`), and they must stay consistent on both
ends. When a string is associated with an event, it is provided as an
integer offset and length.

For example, the YAML `say: 2 + 2` produces the following sequence of
integers:

```c++
BSTR,
BDOC,
VAL|BMAP|BLCK,
KEY|SCLR|PLAI, 0, 3, // "say"
VAL|SCLR|PLAI, 5, 5, // "2 + 2"
EMAP,
EDOC,
ESTR,
```

Note that the scalar events, i.e. "say" and "2 + 2", are each followed
by two extra integers encoding the offset and length of the scalar's
string. These two extra integers are present whenever the event has
any of the bits SCLR, ALIA, ANCH or TAG. For ease of use, there is a
bitmask HAS_STR, which enables quick testing by a simple `flags & HAS_STR`.
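
As an illustration of the layout described above, here is a minimal sketch
of a consumer walking the flat integer sequence. The bit values below are
made up for the example (the real ones live in `ysparse_evt_handler.hpp`
and `org.rapidyaml.Evt`), and `Event` and `decode()` are hypothetical names
that are not part of this PR.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical bit values, for illustration only. The real ones are
// defined in ysparse_evt_handler.hpp and org.rapidyaml.Evt.
enum : int32_t {
    BSTR = 1 << 0,  ESTR = 1 << 1,
    BDOC = 1 << 2,  EDOC = 1 << 3,
    BMAP = 1 << 4,  EMAP = 1 << 5,
    KEY  = 1 << 6,  VAL  = 1 << 7,
    SCLR = 1 << 8,  ALIA = 1 << 9,  ANCH = 1 << 10,  TAG = 1 << 11,
    PLAI = 1 << 12, BLCK = 1 << 13,
    HAS_STR = SCLR | ALIA | ANCH | TAG,
};

// Hypothetical decoded event: the flags plus a view into the source.
struct Event {
    int32_t flags;
    std::string_view str; // empty unless (flags & HAS_STR)
};

// Walk the flat sequence. An event without a string occupies one slot;
// an event with a string occupies three (flags, offset, length).
std::vector<Event> decode(std::string_view src, const std::vector<int32_t>& evts)
{
    std::vector<Event> out;
    for (std::size_t i = 0; i < evts.size(); ) {
        const int32_t flags = evts[i++];
        std::string_view str;
        if (flags & HAS_STR) {
            assert(i + 1 < evts.size()); // offset and length must follow
            const auto offset = static_cast<std::size_t>(evts[i++]);
            const auto length = static_cast<std::size_t>(evts[i++]);
            assert(offset + length <= src.size());
            str = src.substr(offset, length);
        }
        out.push_back({flags, str});
    }
    return out;
}
```

Feeding this the sequence shown above together with the source `say: 2 + 2`
yields eight events, of which only the two scalar events carry a string.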

Also, where a string requires filtering, the parser filters it
in-place in the input string, and returns the offset and length of
the resulting filtered string.
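
To make the in-place idea concrete, here is a minimal sketch for one case
only: a single-quoted scalar, where each `''` pair collapses to a single
`'`. This is not the parser's actual filtering code, which covers more
cases than this; `Span` and `filter_single_quoted()` are hypothetical names
used for illustration.

```c++
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical return type: the two extra integers of an event.
struct Span {
    int offset;
    int length;
};

// Filter the contents of a single-quoted scalar in place. `offset` and
// `length` delimit the text between the quotes. The filtered text is
// written over the start of the same region, so it can only shrink.
Span filter_single_quoted(std::string& buf, int offset, int length)
{
    assert(offset >= 0 && length >= 0);
    assert(static_cast<std::size_t>(offset) + static_cast<std::size_t>(length)
           <= buf.size());
    int w = offset; // write cursor, never ahead of the read cursor
    const int end = offset + length;
    for (int r = offset; r < end; ++r) {
        const char c = buf[r];
        buf[w++] = c;
        if (c == '\'' && r + 1 < end && buf[r + 1] == '\'')
            ++r; // skip the second quote of the '' pair
    }
    return {offset, w - offset};
}
```

Because the output is never longer than the input, no allocation is needed
and the offset stays the same; only the reported length changes.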

The existing EDN-producing parser was not removed, and the
EVT-producing parser was added both on C++ and on the Java JNI bridge.

Tests were extended to cover the new event parsing, both for C++ and
Java.

To benefit from this approach, the YS side must implement a mechanism
to convert the int event sequence into its internal data structure,
using the symbols in the class `org.rapidyaml.Evt`.
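
The YS side would do this on the JVM, but the shape of such a conversion
can be sketched in C++ to match the snippet above. This is only an
assumption about how a consumer might look: the bit values are made up,
`Node`, `Builder` and `build()` are hypothetical, and the sketch handles
only block maps with plain scalar keys, ignoring sequences, anchors, tags
and aliases.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical bit values, for illustration only.
enum : int32_t {
    BSTR = 1 << 0,  ESTR = 1 << 1,
    BDOC = 1 << 2,  EDOC = 1 << 3,
    BMAP = 1 << 4,  EMAP = 1 << 5,
    KEY  = 1 << 6,  VAL  = 1 << 7,
    SCLR = 1 << 8,  PLAI = 1 << 12, BLCK = 1 << 13,
};

// Hypothetical target structure: a scalar, or a map of key -> node.
struct Node {
    bool is_map = false;
    std::string scalar;
    std::vector<std::string> keys; // parallel to `values`
    std::vector<Node> values;
};

struct Builder {
    std::string_view src;
    const std::vector<int32_t>& evts;
    std::size_t pos = 0;

    // Consume the offset and length that follow a string-bearing event.
    std::string_view read_str() {
        assert(pos + 1 < evts.size());
        const auto off = static_cast<std::size_t>(evts[pos]);
        const auto len = static_cast<std::size_t>(evts[pos + 1]);
        pos += 2;
        return src.substr(off, len);
    }

    // Build one value whose flags have already been consumed.
    Node value(int32_t flags) {
        Node n;
        if (flags & BMAP) {
            n.is_map = true;
            while (pos < evts.size()) {
                const int32_t f = evts[pos++];
                if (f & EMAP) break;
                assert((f & KEY) && (f & SCLR)); // sketch: scalar keys only
                n.keys.emplace_back(read_str());
                assert(pos < evts.size());
                const int32_t vflags = evts[pos++];
                n.values.push_back(value(vflags));
            }
        } else {
            assert(flags & SCLR);
            n.scalar = std::string(read_str());
        }
        return n;
    }
};

// Skip the stream/document begin markers and build the root value.
Node build(std::string_view src, const std::vector<int32_t>& evts)
{
    Builder b{src, evts};
    while (b.pos < evts.size() && (evts[b.pos] & (BSTR | BDOC))) ++b.pos;
    assert(b.pos < evts.size());
    const int32_t flags = evts[b.pos++];
    return b.value(flags);
}
```

For the `say: 2 + 2` sequence this produces a map node with the single key
`say` whose value is the scalar `2 + 2`. Nested maps fall out of the
recursion in `value()`.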

Other changes:

- rapidyaml->ysparse: start renaming the C++ library providing the
  JNI. ysparse is a more accurate description of what the library does.
- Directly use only the required rapidyaml source files instead of
  amalgamating into a single header.
- Makefile:
  - Add target to generate the JNI header
  - Improve target dependencies
@biojppm force-pushed the rapidyaml-int-events branch from 5d04f15 to 7f43cb6 January 5, 2025 21:09