Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepares to support reading binary Ion 1.1. #639

Merged
merged 6 commits into from
Nov 21, 2023

Conversation

tgregg
Copy link
Contributor

@tgregg tgregg commented Nov 15, 2023

Description of changes:

  • Adds FlexUInt reading logic to IonCursorBinary. This is used to read the length of variable-length types, as in calculateEndIndex_1_1.
  • Augments IonTypeID with Ion 1.1 support. These are pre-computed, so the big ugly method is not on the performance path. That said, beautification ideas are welcome.
  • Enables the reader to recognize the Ion 1.1 IVM.
  • Refactors test utilities that will be used by Ion 1.1 decoding tests from IonEncoder_1_1Test into TestUtils.

This will be covered by tests that will be added in future PRs as support for the Ion types is added incrementally. I've intentionally left some performance-related TODOs in the code for future investigation.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tgregg tgregg requested a review from popematt November 15, 2023 22:19
@@ -567,8 +571,18 @@ public static boolean symbolTableEquals(final SymbolTable first, final SymbolTab
public static class BinaryIonAppender {
private final ByteArrayOutputStream out = new ByteArrayOutputStream();

public BinaryIonAppender(int minorVersion) throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just have an enum of the different Ion versions so that it's not possible to provide an invalid version, or are there performance implications to doing that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no performance implications in this test code, but it's just us writing the tests, so I don't think it's super important for us to protect ourselves from providing an invalid version to this utility.

If the question is intended to apply broadly to the source, there are some performance implications when we get into, say, branching vs. dynamic dispatch. For example, if we wanted a non-trivial enum that contained abstract methods to be overridden by each version. This is similar to how I wrote the first Ion 1.1 prototype, and I found simple branching to be much more efficient. We could have a simple enum that just enumerates the acceptable versions, but on the read side we still have to validate the version that was declared to the encoding, then map to the enum. After validation, I'm not sure if it really matters whether we compare an int or an enum value on the branches. Maybe an enum starts being more clean once we have another major version, but we're a ways off from that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An enum seems like a good idea to me.

type = IonType.STRUCT;
} else if (lowerNibble == 0x5) {
type = IonType.INT;
} else if (lowerNibble == 0x6) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an OpCodes class, so you could write:

Suggested change
} else if (lowerNibble == 0x6) {
} else if (id == OpCodes.VARIABLE_LENGTH_DECIMAL) {

...and similarly for others where it's a single value. I don't think it's any more expensive to compare a whole byte instead of a nibble, so it would be nice to do this for readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, #645

@tgregg tgregg changed the base branch from deprecated-ion-11-encoding-branch to ion-11-encoding November 20, 2023 22:45
@popematt popematt force-pushed the ion-11-encoding-read-pr1 branch from 8c111e1 to c0392c0 Compare November 21, 2023 02:08
@popematt popematt merged commit 53ee450 into ion-11-encoding Nov 21, 2023
2 of 4 checks passed
@tgregg tgregg deleted the ion-11-encoding-read-pr1 branch November 21, 2023 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants