Fix the issue of high RAM usage and long runtime for huge bootstrap_balances.json file #365
Fix the issue of high RAM usage and long runtime for huge bootstrap_balances.json file.
Motivation
A user reported two issues when bootstrapping from a bootstrap_balances.json file with 101K accounts for IOTA:
With memory_limit_disable=false, the bootstrap process stops at around the 35,000th account because of the transaction storage limit.
This ticket aims to resolve both issues.
Links to the issues:
coinbase/mesh-cli#259
coinbase/mesh-cli#260
Investigation
In general, bootstrapping balances consists of 3 steps:
I experimented on a machine with 32 GB of RAM, bootstrapping different numbers of accounts and measuring the duration of steps 2) and 3). The results are as follows:
When running the bootstrap script with 101K accounts, the process stopped at 35K records with 6.4% memory usage, which is roughly 2 GB.
Solution
Based on the investigation, the solution is to split the accounts into batches of 600 records each and commit each batch before reading the next one. Reducing the number of accounts committed per transaction lowers memory usage (101K records used 4% of total RAM, roughly 1.3 GB) and greatly shortens the running time (about 60 s for 101K records).
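For illustration, the batching idea can be sketched in Go roughly as follows. This is a minimal sketch and not the code in this PR: the Account type, the commitInBatches helper, and the commit callback are hypothetical names, and the sketch assumes the accounts are already held in a slice, whereas the actual change commits each batch before reading the next one.

```go
package main

import (
	"errors"
	"fmt"
)

// Account is a hypothetical stand-in for one entry in bootstrap_balances.json.
type Account struct {
	Address string
	Value   string
}

// commitInBatches splits accounts into chunks of at most batchSize and calls
// commit once per chunk, stopping at the first error. The commit callback is
// an assumption: in a real store it would write the chunk in one transaction.
func commitInBatches(accounts []Account, batchSize int, commit func([]Account) error) error {
	if batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	for start := 0; start < len(accounts); start += batchSize {
		end := start + batchSize
		if end > len(accounts) {
			end = len(accounts)
		}
		if err := commit(accounts[start:end]); err != nil {
			return fmt.Errorf("batch starting at account %d: %w", start, err)
		}
	}
	return nil
}

func main() {
	accounts := make([]Account, 101000) // same order of magnitude as the reported file
	batches := 0
	err := commitInBatches(accounts, 600, func(batch []Account) error {
		batches++ // placeholder for a real per-batch database commit
		return nil
	})
	fmt.Println(batches, err)
}
```

With 101,000 accounts and a batch size of 600, the callback runs 169 times (168 full batches plus one batch of 200), so no single transaction holds more than 600 records.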