-
Notifications
You must be signed in to change notification settings - Fork 473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sstable: Fix restarts integer overflow #4232
base: master
Are you sure you want to change the base?
sstable: Fix restarts integer overflow #4232
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a test with a very large block and verify we can iterate it.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @EdwardX29 and @jbowens)
sstable/rowblk/rowblk_iter.go
line 116 at r3 (raw file):
// offset is the byte index that marks where the current key/value is // encoded in the block. offset int64
You can consider adding a type offsetInBlock int64
which would make it evident what fields are offsets.
sstable/rowblk/rowblk_iter.go
line 200 at r3 (raw file):
type blockEntry struct { offset int64 keyStart uint32
Aren't the rest of these also offsets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @RaduBerinde)
sstable/rowblk/rowblk_iter.go
line 116 at r3 (raw file):
Previously, RaduBerinde wrote…
You can consider adding a
type offsetInBlock int64
which would make it evident what fields are offsets.
Good idea—we can add a comment on the type discussing the motivation for using an int64
, including our expectation that an offset may be > math.MaxUint32
(even though restart points are serialized as uint32s, because the offset fields are sometimes updated to point after the last KV in the block)
sstable/rowblk/rowblk_iter.go
line 200 at r3 (raw file):
Previously, RaduBerinde wrote…
Aren't the rest of these also offsets?
+1, I think they're all offsets except for valSize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status: 2 of 3 files reviewed, 6 unresolved discussions (waiting on @EdwardX29 and @RaduBerinde)
sstable/rowblk/rowblk_iter_test.go
line 481 at r8 (raw file):
var largeKeySize int64 = 2 << 30 // 2GB key size var largeValueSize int64 = 2 << 30 // 2GB value size
nit: these can be const
instead of var
, and if you omit the int64
too, then down below we won't have to cast into an int
when calling bytes.Repeat
sstable/rowblk/rowblk_iter_test.go
line 534 at r8 (raw file):
} KVTestPairs := func() []KVTestPair {
curious why use an anonymous function here?
sstable/rowblk/rowblk_iter_test.go
line 538 at r8 (raw file):
for i := 0; i < numKVs; i++ { key := fmt.Sprintf("key-%04d", i) value := strings.Repeat("a", valueSize)
nit: since we're memory constrained, maybe lift this to the outer scope as:
value4MB := bytes.Repeat([]byte("a"), valueSize)
so that every KV we add is reusing the same constructed byte slice rather than re-creating it
sstable/rowblk/rowblk_iter_test.go
line 584 at r8 (raw file):
numKVs := 63 valueSize := 4 * (1 << 20) var FourGB int64 = 4 * (1 << 30)
nit: use const
s for those of these that are constant
sstable/rowblk/rowblk_iter.go
line 198 at r8 (raw file):
} // Type representing an offset in a block
nit: it's idiomatic to start a Go comment on an identifier with the name of the identifier, in this case:
// offsetInBlock represents an offset in a block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 3 files reviewed, 2 unresolved discussions (waiting on @jbowens and @RaduBerinde)
sstable/rowblk/rowblk_iter_test.go
line 481 at r8 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: these can be
const
instead ofvar
, and if you omit theint64
too, then down below we won't have to cast into anint
when callingbytes.Repeat
32 bit architectures are not happy with large constants being used as int
s in bytes.Repeat
as constant expressions are evaluated at compile time. Adding a check for strconv.IntSize == 32
doesn't solve the issue either. CI run. However, using var
and int64
solves this issue as overflow is at runtime instead; run here
sstable/rowblk/rowblk_iter_test.go
line 534 at r8 (raw file):
Previously, jbowens (Jackson Owens) wrote…
curious why use an anonymous function here?
Originally was going to have both Multi large KV tests in one test so I wrote a lambda to generate KVs for both. switched to standard initialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to squash the commits before merging
Reviewable status: 1 of 3 files reviewed, 2 unresolved discussions (waiting on @EdwardX29 and @RaduBerinde)
sstable/rowblk/rowblk_iter_test.go
line 481 at r8 (raw file):
Previously, EdwardX29 (Edward Xiao) wrote…
32 bit architectures are not happy with large constants being used as
int
s inbytes.Repeat
as constant expressions are evaluated at compile time. Adding a check forstrconv.IntSize == 32
doesn't solve the issue either. CI run. However, usingvar
andint64
solves this issue as overflow is at runtime instead; run here
Ah, got it.
fix restarts overflow add comment for int64 change update blockEntry offset members to uint32 add check for when block is greater than MaximumSize add offsetInBlock type to represent offsets add tests for integer overflow in SeekGE() and writing large blocks adjust integers in tests for 32 bit machines Skip block overflow tests for CI mem constraints Use constants in tests and strconv.IntSize check revert tests to int64 constants and int cast revert tests to int64 vars add explicit check for CI in tests
10f183e
to
1b3ca39
Compare
Updates block and iter structs to use int64 for restarts, offset, and nextOffset. Changes prevent integer overflow found in this issue (#4216). Tested using
go test ./...