-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for macros to IonRawBinaryWriter_1_1 #668
Conversation
@@ -55,7 +55,7 @@ class IonRawBinaryWriter_1_1 internal constructor( | |||
/** | |||
* Clears this [ContainerInfo] of old data and initializes it with the given new data. | |||
*/ | |||
fun reset(type: ContainerType? = null, position: Long = -1, isDelimited: Boolean = false) { | |||
fun reset(type: ContainerType, position: Long, isDelimited: Boolean = false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Incidental change—I realize that the default values are always supplied for both type
and position
, so I removed the default values.
@@ -448,6 +467,9 @@ class IonRawBinaryWriter_1_1 internal constructor( | |||
List, SExp, Struct -> { | |||
val contentLength = currentContainer.length | |||
if (contentLength <= 0xF && !currentContainer.usesFlexSym) { | |||
// TODO: Right now, this is skipped if we switch to FlexSym after starting a struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This is incidental. I was aware of this when I implemented structs, but I had since forgotten it, and then I rediscovered this while I was writing tests for E-Exps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a little more detail here? What are the implications? That we can't reclaim space for FlexSym structs? Why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scenario is when we have a struct that starts out as a SID struct, has at least one SID field before switching to FlexSyms, and has a total length of less than 16 bytes. The consequence of this limitation is that we're wasting a byte. E.g.:
FC | Variable length SID Struct ]____ Could be replaced with
17 | Length = 11 ] just `CB` to save a byte.
81 | SID 64
5E | true
01 | switch to FlexSym encoding
FB 66 6F 6F | FlexSym 'foo'
5E | true
02 01 | FlexSym SID 64
5E | true
Ideally, we could write this starting with CB
but we don't store enough state right now to track whether it's safe to do that. We only know that the struct is currently using FlexSyms, which means that we don't know whether the OpCode at the start is FC
or FD
. If it's FC
, then it's safe to collapse FC 17
to CB
(as in the example above). If it actually started with FD
, then going to CB
would cause the struct to be malformed.
It's not difficult to solve this—it just requires a way to check how the struct started out and a few more branches for the different scenarios—but I didn't really want to address it while the struct encoding is still in a bit of flux.
// Instead of allocating iterator, we share one iterator instance within the scope of the container stack and | ||
// reset the cursor every time we track back to the ancestors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Incidental formatting change so that the line isn't ridiculously long.
@@ -1044,12 +1044,12 @@ private void patchSingleByteTypedOptimisticValue(final byte type, final Containe | |||
if (info.length <= 0xD) | |||
{ | |||
// we fit -- overwrite the type byte | |||
buffer.writeUInt8At(info.position - 1, type | info.length); | |||
buffer.writeByteAt(info.position - 1, type | info.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ The writeUInt8At
method got renamed to writeByteAt
because I found it misleading to use writeUInt8
to write FlexInt.ZERO
. I checked all usages of the method, and the new name writeByteAt
works for all of them. None of the existing usages were specifically trying to write an 8-bit integer; rather they were writing either a sentinel (such as an op code) or 8 bits of a larger value.
while (this.index != index) { | ||
blocks.remove(this.index--); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This fixes a bug in WriteBuffer
that could potentially lead to corrupted data. Basically, if we change the index
without either removing or resetting the blocks higher than index
, then when we start writing again, and we get to one of those blocks, we wouldn't have started at position 0. For example:
| Block 0 | Block 1 |
|=========|=========|
| | | (1) write "ABCDEF" to the buffer
| A B C D | E F _ _ |
| | | (2) truncate to position 3
| A B C _ | E F _ _ | <-- Notice that while block 0 is reset to the correct position, there
| | | was no clean up done in block 1
| | |
| | | (3) write "123" to the buffer
| A B C 1 | E F 2 3 | !!! Block 1 resumed at offset 2, so we still
| | | have the "EF" that we tried to truncate earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long has this bug existed? Does it require patching in master and/or previous versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's been there all along, but I haven't looked into the potential impact. Since it's been there all along, it's even possible that the current implementation has a way to avoid the buggy behavior by cleaning the blocks at some other time.
@@ -1300,7 +1301,11 @@ public void writeVarUIntDirect2At(long position, long value) | |||
block.data[offset + 1] = (byte) ((value & VAR_INT_MASK) | VAR_INT_FINAL_OCTET_SIGNAL_MASK); | |||
} | |||
|
|||
public void writeUInt8At(final long position, final long value) | |||
public void writeByteAt(final long position, final byte value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Convenience overload. If it's a perf problem we can remove it, but I bet the JIT is smart enough to optimize this out.
5032e52
to
77b5f5e
Compare
src/main/java/com/amazon/ion/impl/bin/IonRawBinaryWriter_1_1.kt
Outdated
Show resolved
Hide resolved
@@ -448,6 +467,9 @@ class IonRawBinaryWriter_1_1 internal constructor( | |||
List, SExp, Struct -> { | |||
val contentLength = currentContainer.length | |||
if (contentLength <= 0xF && !currentContainer.usesFlexSym) { | |||
// TODO: Right now, this is skipped if we switch to FlexSym after starting a struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a little more detail here? What are the implications? That we can't reclaim space for FlexSym structs? Why not?
Co-authored-by: Tyler Gregg <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## ion-11-encoding #668 +/- ##
==================================================
Coverage ? 65.05%
Complexity ? 5828
==================================================
Files ? 200
Lines ? 25029
Branches ? 4468
==================================================
Hits ? 16282
Misses ? 7433
Partials ? 1314 ☔ View full report in Codecov by Sentry. |
Issue #, if available:
None
Description of changes:
Updates the signature of
stepInEExp()
and implements it.There are a couple of incidental changes that I've called out with a 🗺️ comment.
Potentially controversial... this PR uses the end delimiter to signal the end of a macro's args if that macro has rest-parameter in it's signature.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.