Currently the S3 Loader's buffers operate on a per-shard basis. This means that if your rotation buffer is 64 MB, each shard allocated to the consumer can consume that much memory. If the number of shards suddenly scales, you risk needing not 64 MB for this buffer but N x MaxByteBuffer, on top of the overhead of the JVM and the processing being done.

This behaviour makes it impossible to auto-scale consistently, as you never know how much memory an individual consumer might end up needing.
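A quick worked example of the worst case (the buffer size and shard counts below are illustrative numbers, not taken from a real config):

```scala
// Illustrative worst-case arithmetic: per-shard rotation buffers for one consumer.
object BufferMath extends App {
  val rotationBufferMb = 64
  val shardsBefore     = 4
  val shardsAfter      = 16

  println(s"before reshard: ${shardsBefore * rotationBufferMb} MB") // 256 MB
  println(s"after reshard:  ${shardsAfter * rotationBufferMb} MB")  // 1024 MB
}
```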
A good way to fix this is to rewrite the loader as an fs2 app, using fs2-aws as the Kinesis source. With that library, the shards share a single buffer, so memory usage should stay approximately constant even as the number of input shards increases. A rough sketch of the idea is below.

This would be a big rewrite of the app, but it's a change we want to make anyway, and it is consistent with how we are now writing all other Snowplow apps.
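A minimal sketch of what this buys us (not the real loader, and not the actual fs2-aws API): `shardStreams` below stands in for the fs2-aws Kinesis source, which already merges all shards into one stream, and `groupWithin` stands in for the byte/time-based rotation logic. The point is that there is a single shared buffer downstream of the merge, so memory is bounded by the buffer settings rather than by the shard count.

```scala
import cats.effect.{IO, IOApp}
import fs2.Stream
import scala.concurrent.duration._

object SingleBufferSketch extends IOApp.Simple {

  // Stand-in for records arriving from N Kinesis shards (hypothetical source;
  // in the real app this would come from fs2-aws).
  def shardStreams(shards: Int): Stream[IO, String] =
    Stream
      .emits(0 until shards)
      .map(shard => Stream.awakeEvery[IO](10.millis).map(t => s"shard-$shard-$t"))
      .parJoinUnbounded // merge every shard into ONE stream

  def run: IO[Unit] =
    shardStreams(shards = 16)
      // One shared buffer: rotate every 500 records or 5 seconds,
      // independent of how many shards feed the stream. The real loader
      // would rotate on bytes instead of record count.
      .groupWithin(500, 5.seconds)
      .evalMap(chunk => IO.println(s"flush ${chunk.size} records to S3"))
      .interruptAfter(30.seconds)
      .compile
      .drain
}
```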