forked from katef/libfsm
CDATA codegen updates, part 3 #33
Open
silentbicycle wants to merge 5 commits into main from sv/cdata-codegen-updates-part-3
These fields are only used at the end of input, and moving them into a different struct will make the per-state data accessed during DFA execution more compact.
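A minimal sketch of this kind of split, under stated assumptions: the struct names, the `on_a`/`on_b` transition fields, and the toy DFA (matching `a+b+`) are hypothetical and are not the layout the CDATA codegen emits. Only `end` and `endid_offset` come from the description above. The point is that the inner loop touches only the compact per-state table, and the end-of-input table is read once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEAD 0xFF

/* Hypothetical "hot" per-state data: read on every input byte. */
struct state_hot {
	uint8_t on_a;   /* next state on 'a', or DEAD */
	uint8_t on_b;   /* next state on 'b', or DEAD */
};

/* "Cold" data: only read once, after the input is exhausted.
 * Indexed by the same state ID as the hot table. */
struct state_end {
	uint8_t end;            /* nonzero if this is an end state */
	uint16_t endid_offset;  /* illustrative value only */
};

static const struct state_hot hot_states[] = {
	{ 1, DEAD },    /* 0: start */
	{ 1, 2 },       /* 1: seen a+ */
	{ DEAD, 2 },    /* 2: seen a+b+ */
};

static const struct state_end end_states[] = {
	{ 0, 0 }, { 0, 0 }, { 1, 7 },
};

/* Returns 1 and writes *endid_offset on a match, 0 otherwise. */
int match_ab(const char *s, size_t len, uint16_t *endid_offset)
{
	assert(s != NULL || len == 0);
	uint8_t cur = 0;

	for (size_t i = 0; i < len; i++) {
		const struct state_hot *st = &hot_states[cur];
		if (s[i] == 'a') {
			cur = st->on_a;
		} else if (s[i] == 'b') {
			cur = st->on_b;
		} else {
			cur = DEAD;
		}
		if (cur == DEAD) { return 0; }
	}

	/* The end table is only consulted here. */
	const struct state_end *e = &end_states[cur];
	if (!e->end) { return 0; }
	*endid_offset = e->endid_offset;
	return 1;
}
```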
Each state has two 256-bit bitsets, each stored as a uint64_t[4], but the individual words in those have a lot of duplication. Add a table with every unique word, sorted descending by frequency, and replace the per-state labels and label_group_starts arrays with arrays of offsets into the label_word table. Typically these offsets will fit in a uint8_t (though the code generation will switch to a uint16_t when necessary), making the per-state data much smaller. The label_word table's most commonly used entries are all grouped together and should stay in cache.
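The deduplication step can be sketched roughly as follows. This is an illustration and not the code generator's implementation: `build_word_table`, `struct word_count`, and `MAX_UNIQUE` are hypothetical names, the bitset words are passed as one flat array (four consecutive words per bitset), and only the `uint8_t` case is handled. The function reports failure when there are too many unique words, which is where a real generator would switch to `uint16_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_UNIQUE 256   /* offsets 0..255 fit in a uint8_t */

struct word_count {
	uint64_t word;
	size_t count;
};

/* Sort by use count, descending; break ties by word value so the
 * result is deterministic. */
static int cmp_count_desc(const void *pa, const void *pb)
{
	const struct word_count *a = pa;
	const struct word_count *b = pb;
	if (a->count != b->count) { return a->count < b->count ? 1 : -1; }
	return (a->word > b->word) - (a->word < b->word);
}

/* Build a table of unique words (most used first) and, for every input
 * word, its offset into that table. `table` needs room for MAX_UNIQUE
 * entries and `offsets` for `nwords` entries. Returns 1 on success, or
 * 0 if a uint8_t offset is not wide enough. */
int build_word_table(const uint64_t *words, size_t nwords,
    uint64_t *table, size_t *table_len, uint8_t *offsets)
{
	struct word_count counts[MAX_UNIQUE];
	size_t used = 0;

	/* Count how often each distinct word appears (linear search is
	 * fine for a sketch). */
	for (size_t i = 0; i < nwords; i++) {
		size_t j;
		for (j = 0; j < used; j++) {
			if (counts[j].word == words[i]) { break; }
		}
		if (j == used) {
			if (used == MAX_UNIQUE) { return 0; }
			counts[used].word = words[i];
			counts[used].count = 0;
			used++;
		}
		counts[j].count++;
	}

	qsort(counts, used, sizeof counts[0], cmp_count_desc);
	for (size_t j = 0; j < used; j++) { table[j] = counts[j].word; }

	/* Replace each word with its offset into the shared table. */
	for (size_t i = 0; i < nwords; i++) {
		size_t j = 0;
		while (table[j] != words[i]) { j++; }
		offsets[i] = (uint8_t)j;
	}

	*table_len = used;
	return 1;
}
```

With three bitsets (twelve words) where most words are zero, the table has only a few entries and each bitset shrinks from 32 bytes to 4 bytes of offsets.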
The edge sets leak when halting with FSM_DETERMINISE_WITH_CONFIG_STATE_LIMIT_REACHED.
Previously, the CDATA codegen wrote eager_outputs directly into the caller's match bit buffer as they were encountered. Instead, set them in a stack-allocated buffer, and then copy them to the caller's buffer if the DFA match succeeds overall. In order to avoid repeatedly checking whether an eager_output has already been set (in the buffer), this collects the set of all distinct eager_output IDs and then remaps the array of eager output IDs to offsets into the unique set. This condenses the (sparse) set into a dense series 0..n that can be represented by flags in a stack-allocated bit vector (with a size known at compile time), and redundant eager_outputs harmlessly set flag bits that are already set. If the overall match succeeds, that bit vector is matched up with the unique ID array and the sparse values are written into the caller's buffer. Because the unique ID array is sorted, the relative ordering of the sparse and dense IDs is preserved (and 0 stays 0), so using non-ascending values as terminators still works.
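A rough sketch of the remapping, with hypothetical helper names (`collect_unique`, `dense_index`, `commit_eager_outputs`) that are not taken from libfsm. In the generated code the unique set and the dense offsets would be computed at code generation time and the bit vector size would be a compile-time constant; here everything happens at runtime to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int cmp_u32(const void *pa, const void *pb)
{
	const uint32_t a = *(const uint32_t *)pa;
	const uint32_t b = *(const uint32_t *)pb;
	return (a > b) - (a < b);
}

/* Copy `ids` into `unique`, sort ascending, and drop duplicates.
 * `unique` needs room for `n` entries. Returns the unique count. */
size_t collect_unique(const uint32_t *ids, size_t n, uint32_t *unique)
{
	for (size_t i = 0; i < n; i++) { unique[i] = ids[i]; }
	qsort(unique, n, sizeof unique[0], cmp_u32);

	size_t used = 0;
	for (size_t i = 0; i < n; i++) {
		if (used == 0 || unique[used - 1] != unique[i]) {
			unique[used++] = unique[i];
		}
	}
	return used;
}

/* Map a sparse ID to its dense index (its position in `unique`). */
size_t dense_index(const uint32_t *unique, size_t n, uint32_t id)
{
	size_t lo = 0, hi = n;
	while (lo < hi) {
		const size_t mid = lo + (hi - lo) / 2;
		if (unique[mid] < id) { lo = mid + 1; } else { hi = mid; }
	}
	assert(lo < n && unique[lo] == id);
	return lo;
}

void set_bit(uint64_t *bits, size_t i)
{
	bits[i / 64] |= (uint64_t)1 << (i % 64);
}

int get_bit(const uint64_t *bits, size_t i)
{
	return (int)((bits[i / 64] >> (i % 64)) & 1);
}

/* After a successful match: for every dense flag that is set, set the
 * corresponding sparse ID's bit in the caller's buffer. */
void commit_eager_outputs(const uint64_t *dense_bits,
    const uint32_t *unique, size_t n, uint64_t *caller_bits)
{
	for (size_t i = 0; i < n; i++) {
		if (get_bit(dense_bits, i)) { set_bit(caller_bits, unique[i]); }
	}
}
```

Setting the same dense flag twice is a no-op, so the match loop never has to check whether an output was already recorded, and a failed match simply discards the local bit vector without touching the caller's buffer.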
deg4uss3r approved these changes on Dec 11, 2024
Admittedly, I am coming into the code base pretty cold and C isn't my preferred language, but I do not see anything here that would prevent this from being approved.
Because of the for loop init condition, this was checking bits `1 << 1..64` in every word -- it was unintentionally ignoring the least significant bit. It should check `1 << 0..64`.
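The loop in question is not shown here, so the following is only a sketch of this class of off-by-one, using a hypothetical `count_set_bits_from` helper. The `first_bit` parameter exists purely to contrast the two starting points: starting at 1 silently skips bit 0 of every word, while starting at 0 visits all 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits in each word, visiting bit positions
 * first_bit..63. Passing first_bit = 0 checks every bit;
 * passing 1 reproduces the kind of bug described above. */
size_t count_set_bits_from(const uint64_t *words, size_t nwords,
    unsigned first_bit)
{
	assert(first_bit < 64);
	size_t count = 0;

	for (size_t w = 0; w < nwords; w++) {
		for (unsigned b = first_bit; b < 64; b++) {
			if (words[w] & ((uint64_t)1 << b)) { count++; }
		}
	}
	return count;
}
```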
Some further updates, for a new Device Detection canary. This still isn't quite ready to go upstream, but it's getting closer.
Changes:
- Move the `.end` and `.endid_offset` fields from the state struct table to a separate table. Those fields aren't used until the end of input, so moving them out makes the state table more dense and improves locality.
- Replace the per-state label bitset words with `uint8_t` or `uint16_t` offsets into a shared word table, so the overall binary ends up considerably smaller. The words are sorted by use (descending), so the most frequently used ones are likely to stay in cache.
- Fix the edge set leak in `fsm_determinise_with_config`'s early exit (katef/libfsm#504).