wazevo(frontend): simple bounds check elimination on mem access #1883
This patch changes the lowering of memory accesses and introduces a very
simple bounds check elimination that is performed per basic block.
For example, the following assembly, which is extracted from
Zig's memset implementation without the bulk-memory feature,
needs the bounds check only at the first i32.store8 inside the loop, because
the ranges accessed by the later stores are statically known to lie below the range already checked.
In short, with this patch the frontend compiler caches the maximum of such
"already checked memory bounds" and uses it to optimize out entire
bounds check sequences.
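The caching idea described above can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: the names `value`, `knownSafeBounds`, `needsCheck`, and `reset` are hypothetical, and the sketch assumes that an SSA value ID always denotes the same base address within a basic block.

```go
package main

// value is a hypothetical stand-in for an SSA value ID that holds the
// dynamic base address of a memory access.
type value uint32

// knownSafeBounds remembers, per base address value, the largest end offset
// (constant offset + access size) that has already been bounds-checked in
// the current basic block. All names here are illustrative assumptions.
type knownSafeBounds struct {
	checked map[value]uint64
}

func newKnownSafeBounds() *knownSafeBounds {
	return &knownSafeBounds{checked: make(map[value]uint64)}
}

// needsCheck reports whether an access of `size` bytes at `base + offset`
// requires a bounds check. If an earlier access in the same block already
// checked an equal or larger end offset for the same base, the check is
// redundant. Otherwise the new maximum is cached and a check is required.
func (k *knownSafeBounds) needsCheck(base value, offset, size uint64) bool {
	end := offset + size
	if prev, ok := k.checked[base]; ok && end <= prev {
		return false
	}
	k.checked[base] = end
	return true
}

// reset forgets all cached bounds. In this sketch it would be called when
// lowering enters a new basic block, since the cache is per block only.
func (k *knownSafeBounds) reset() {
	for key := range k.checked {
		delete(k.checked, key)
	}
}
```

With this sketch, a run of one-byte stores at offsets 3, 2, 1, and 0 from the same base would need a check only for the first store, since each later end offset is no larger than the cached maximum.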
The example is now lowered like
vs previously
As a result, running the entire Zig stdlib is 1.5x faster, and the resulting binary
shrank from 80MB to 65MB. The CoreMark benchmark score also improved
from 12843.565 to 13535.463 on my local run.
As future work, we can expand this beyond a single basic block and make it CFG-aware
to eliminate bounds checks more aggressively. But this simple version has already
improved the baseline considerably!