forked from google/gematria
Experiments: add missing #include <iterator>
#1
This patch fixes the C++ formatting which was causing the clang-format job to fail. Very minor change, but if we're going to have C++ formatting in CI, we should probably keep it green.
The bazel files were also changed after the recent push and no longer are formatted correctly according to buildifier. This patch does the necessary reformatting.
This patch adds support for running the bazel build in CI to easily catch bazel build regressions. Running the bazel tests will be added in a future patch.
Currently the github actions workflow runs whenever something is pushed to a branch or against any pull request. This means that if a branch is pushed to the main google/gematria repository and a PR is opened against google/gematria:main, there will be duplicate jobs, one for the push event and one for the pull request event. This patch fixes this behavior by restricting the push event to the main branch.
For some reason, using an older version of pyink causes odd failures in the CI complaining about missing functions in a vendored dependency. Upgrading to the most recent version fixes the issue. We should probably make this more hermetic at some point, for example with a lockfile, but for now this is a simple fix.
Use the "relative-to-llvm-root" includes for all LLVM headers.
The previous commit reworked all of the llvm includes, but broke the code formatting job in the meantime. This patch fixes the code formatting to get the CI back to green.
This prevents vim's .swp files from ending up in the repository and commits during development.
Currently the Bazel build is failing due to the CI trying to pull in bazel v7.0.0. This patch pins the bazel version to v6.4.0 which gets the CI green again, allowing further time to work on performing the necessary migration steps to get everything working properly with v7.0.0.
This patch adds in an alternative to the existing find_accessed_addrs infrastructure that uses llvm-exegesis as a backend. We believe this will more closely match the execution environment of the benchmarking environment and should avoid some of the pitfalls of the existing memory annotation infrastructure (at the current expense of speed).
Added a script `dataset/convert_bhive_to_llvm_exegesis_input.cc` that takes as input a BHive CSV file containing a set of x86 basic blocks in hex format and an `llvm-exegesis` template that defines initial register values (`dataset/llvm-exegesis_wrapper.S` is used by default). It outputs code snippets that can be executed by `llvm-exegesis` into the target `output_dir`. Each code snippet (output file) contains one x86 basic block.

### How to run it?

```bash
# build the latest repo
bazel build ...
# create an output directory
mkdir output
# run script
./bazel-bin/gematria/datasets/convert_bhive_to_llvm_exegesis_input --bhive_csv=gematria/testing/testdata/test.csv --output_dir=output
```
Nothing currently wrong with the old bazelisk version, but we're two minor versions out of date at this point and there's no real point in not updating.
This patch bumps the LLVM version to a recent ToT commit and patches the Exegesis annotator so that everything compiles and works correctly. Mostly wanting to get this in as I remember debugging weird failures due to things compiling despite a constructor changing within upstream Exegesis.
This is the canonical path to use for the CMake build according to the documentation. Add this to the gitignore so git add --all works with a CMake build setup within the repository.
A couple additional cases were recently added into the BB Address Map section in llvm/llvm-project#74128. This broke llvm-cm as a couple method names changed. This patch provides the quick fix to get everything working. Eventually llvm-cm will need some work to support multiple BBRanges within a single function.
This patch sets up a ninja target for testing llvm-cm, akin to the tool specific test targets in the monorepo. This allows for actually running the tests. Documentation has been added to the README.
Now that the check target is wired up through CMake, we should test this through CI to ensure that there aren't any regressions.
…oogle#33) This patch makes the exegesis annotator work with mappings that are not page aligned. Currently if a segfault occurs at an address that isn't page aligned, Exegesis will try and map an address at that address and then the mmap call will silently fail as there is no error handling wired up for that and it will segfault again, thus creating a loop within the annotator.
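The alignment step described above can be sketched as follows. This is a minimal illustration, not the annotator's actual code: the function name `AlignDownToPage` is hypothetical, and the page size is passed in explicitly rather than queried from the system.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds `address` down to the start of the page that
// contains it. `page_size` must be a power of two (4096 is typical on
// x86-64 Linux, but that is an assumption, not something the patch states).
inline std::uintptr_t AlignDownToPage(std::uintptr_t address,
                                      std::uintptr_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  // Clearing the low bits yields the enclosing page boundary.
  return address & ~(page_size - 1);
}
```

With a 4096-byte page, a faulting address of `0x12345` would be mapped at `0x12000`, which is a valid argument for a fixed-address mapping, whereas the unaligned address is not.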
This patch adjusts the constants in the BHive conversion script from the default values added in with the script to the values used within the original BHive paper. These constants were experimentally determined within the BHive paper to be high enough to not underflow/access addresses in the first page yet low enough to not access any addresses above the virtual address space ceiling. Anecdotally, these constants have also seemed to work better than the original ones. In addition, this patch prefixes the memory value with zeroes so that llvm-exegesis is able to assume the correct bit width.
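As an illustration of the zero-prefixing idea, the sketch below pads a hexadecimal value to a fixed number of digits so that its textual width matches the intended bit width. The helper name `ZeroPaddedHex` and the example value are hypothetical; the exact format the conversion script emits is not shown in this commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: formats `value` as lowercase hexadecimal, left-padded
// with zeroes to exactly bit_width / 4 digits. Digits that do not fit in
// `bit_width` are dropped, so callers should pass a wide enough width.
std::string ZeroPaddedHex(std::uint64_t value, unsigned bit_width) {
  assert(bit_width % 4 == 0 && bit_width >= 4 && bit_width <= 64);
  static constexpr char kDigits[] = "0123456789abcdef";
  std::string result(bit_width / 4, '0');
  // Fill from the least significant digit towards the front.
  for (std::size_t i = result.size(); i > 0 && value != 0; --i) {
    result[i - 1] = kDigits[value & 0xF];
    value >>= 4;
  }
  return result;
}
```

For example, `ZeroPaddedHex(0x12345600, 64)` produces a 16-digit string, so a reader of the output can tell that a 64-bit value was intended.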
…e#37) This patch fixes the register class that the BHive to Exegesis conversion script uses. Currently it is only pulling in registers that don't have a REX encoding, which doesn't even include R9-R15. This patch fixes that behavior by using the more generic LLVM 64-bit GPR Register class, but specifically looking at registers that don't require a REX2 encoding, as there is no hardware in the wild yet that supports APX.
This patch adds a max_bb_count flag to the exegesis converter which makes it easier to generate only a subset of basic blocks from a large CSV for testing purposes. This reduces the need for manually splitting CSVs at the cost of little additional complexity.
This patch adds a JSON output option to the BHive converter. This makes it significantly easier to implement other scripts down the line that ingest this data. This also cuts down the number of inodes that a large data set will use by a significant amount, which can be a problem on some file systems.
Before this patch, --blocks-per-json-file was set up as unsigned in the flag definitions, but the value was assigned to a normal int later on. This resulted in the check that the value is <= triggering with the default value of the maximum unsigned 32-bit integer. This patch fixes that by assigning the value of the flag to an unsigned value rather than a normal int.
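The signedness problem can be reproduced in isolation. The sketch below is an assumption-laden illustration: the names are hypothetical, and it assumes the validity check compares against zero, which the commit message does not spell out. Converting the maximum unsigned value to `int` yields -1 on mainstream two's-complement targets (and by definition since C++20).

```cpp
#include <limits>

// Hypothetical stand-in for the flag's default value.
constexpr unsigned kDefaultBlocksPerFile =
    std::numeric_limits<unsigned>::max();

// Buggy pattern: the unsigned flag value is stored in a plain int before the
// check, so the default wraps to a negative number and is rejected.
constexpr bool RejectedWhenStoredAsInt(unsigned flag_value) {
  return static_cast<int>(flag_value) <= 0;
}

// Fixed pattern: the value stays unsigned, so only zero is rejected.
constexpr bool RejectedWhenStoredAsUnsigned(unsigned flag_value) {
  return flag_value == 0;
}
```

Keeping the local variable's type identical to the flag's type avoids the conversion entirely, which is the approach the patch describes.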
This patch bumps the LLVM version to the latest (as of writing the patch) LLVM commit. This is necessary to pull in new llvm-exegesis functionality like the middle half repetition mode.
This patch adds a progress reporting flag to the annotator script. This can be quite useful when using a slower implementation of FindAccessedAddrs like FindAccessedAddrsExegesis to get a gauge on what sort of progress is being made and to show that the application isn't stuck somewhere when processing a large number of blocks.
PiperOrigin-RevId: 576065680
PiperOrigin-RevId: 580863794
`CreateRandomLinkedList` is not initializing `value` for the last link. Switch to a backwards list creation, which makes the code much simpler. Add test. PiperOrigin-RevId: 582251222
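Building a list from the tail towards the head can be sketched as below. This is not the code of `CreateRandomLinkedList`; the `Node` type and both function names are hypothetical and only illustrate why the backwards order is simpler: each node's successor already exists when the node is created, so every field, including the last node's `value`, is set in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node type used only for this illustration.
struct Node {
  int value;
  std::unique_ptr<Node> next;
};

// Builds a list holding `values` in order by iterating the input backwards.
// The new node always becomes the head, so no special case is needed for the
// last link.
std::unique_ptr<Node> BuildList(const std::vector<int>& values) {
  std::unique_ptr<Node> head;  // Empty list.
  for (auto it = values.rbegin(); it != values.rend(); ++it) {
    auto node = std::make_unique<Node>();
    node->value = *it;
    node->next = std::move(head);
    head = std::move(node);
  }
  return head;
}

// Returns the value stored `index` links after `head`.
int ValueAt(const Node* head, std::size_t index) {
  for (; index > 0; --index) {
    assert(head != nullptr);
    head = head->next.get();
  }
  assert(head != nullptr);
  return head->value;
}
```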
PiperOrigin-RevId: 582887102
PiperOrigin-RevId: 582887618
By default it's all zeroes. So anything read from memory will be zero. This causes issues with pointers to pointers, as we won't be able to map the inner zero address. And also causes floating point exceptions when we divide by the read value. PiperOrigin-RevId: 582989630
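One way to avoid all-zero memory is to fill each mapped region with a repeating non-zero word, so that any aligned 8-byte read returns that word. The sketch below shows the idea on an ordinary buffer; the function names and the pattern value used in the example are hypothetical, and the commit message does not say which pattern the tool actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: fills `buffer` with the bytes of `pattern`, repeated
// in native byte order.
void FillWithPattern(std::vector<std::uint8_t>& buffer,
                     std::uint64_t pattern) {
  assert(pattern != 0);  // An all-zero pattern would defeat the purpose.
  unsigned char bytes[sizeof(pattern)];
  std::memcpy(bytes, &pattern, sizeof(pattern));
  for (std::size_t i = 0; i < buffer.size(); ++i) {
    buffer[i] = bytes[i % sizeof(pattern)];
  }
}

// Reads one 64-bit word from `buffer` starting at byte `offset`.
std::uint64_t ReadWord(const std::vector<std::uint8_t>& buffer,
                       std::size_t offset) {
  assert(offset + sizeof(std::uint64_t) <= buffer.size());
  std::uint64_t word;
  std::memcpy(&word, buffer.data() + offset, sizeof(word));
  return word;
}
```

A block that divides by a value loaded from such a buffer no longer divides by zero, and a block that dereferences the loaded value touches a single predictable address instead of address 0.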
This doesn't matter now as there's no padding, but will matter later once we add extra fields. PiperOrigin-RevId: 584862339
0 isn't mappable, so this test only works right now because we don't treat it as a hard error when we fail to map an address accessed in a previous iteration. We're going to change this, so change the test. PiperOrigin-RevId: 586481033
PiperOrigin-RevId: 592152755
…the pipe. PiperOrigin-RevId: 592213983
The tool relies on low-level system libraries and process manipulation to detect the addresses, and it is incompatible with sanitizers that expect a well-behaved C++ program and use a lot of hacks of their own. PiperOrigin-RevId: 592216550
This is the address which will turn up when reading from newly mapped sections, due to how we initialize them. If a block dereferences this pointer, we want it to segfault, so that we can detect the read. On my machine it's unmapped, so all is well. On some machines, it isn't. Really we'd like to unmap everything we can, as blocks can access arbitrary addresses. But this is difficult to do without hitting things which turn out to be necessary and break things in weird ways when unmapped. But we can at least easily fix this one issue with a single targeted munmap. PiperOrigin-RevId: 596433115
Notably we now return an error when the child is unable to map all of the given addresses. With this change we now only succeed on ~85% of blocks from BHive, as before this error wasn't propagated and this case was counted as a success. PiperOrigin-RevId: 596435740
So that accessing memory at a small negative offset from a register will be possible.
On the BHive dataset:
Before this change 46746 / 330016 blocks failed (~15%)
After this change 808 / 330016 blocks fail (~0.25%)
PiperOrigin-RevId: 596435824
We still set them all to the same initial register value as before, but this lets us change that in the future, and also lets the caller know what they should be set to without having to depend on our implementation details. PiperOrigin-RevId: 599693469
PiperOrigin-RevId: 599693592
This seems to fix about half of the remaining failures:
Before: 329211 / 330016
After: 329605 / 330016
The exact number can differ, as this commit introduces randomness. PiperOrigin-RevId: 602572368
This will make it easier to iterate on things that address failures. PiperOrigin-RevId: 602622698
Updates LLVM usage to match [cca9f9b78fc6](llvm/llvm-project@cca9f9b78fc6) PiperOrigin-RevId: 603664366
PiperOrigin-RevId: 604844084
PiperOrigin-RevId: 606143069
…asier PiperOrigin-RevId: 606784084
SIGFPE can come from (for example) dividing one register by another, and the denominator is zero. So, randomising can potentially help. This only fixes ~20 of the remaining failures, but it's an easy change. PiperOrigin-RevId: 610920806
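The randomisation idea can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the explicit seed is added here for reproducibility, and excluding zero from the range is a choice made in this sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper: produces `count` pseudo-random 64-bit values, none of
// which is zero, so a division by any of them cannot raise SIGFPE because of
// a zero denominator.
std::vector<std::uint64_t> RandomNonZeroValues(std::size_t count,
                                               std::uint64_t seed) {
  std::mt19937_64 generator(seed);
  std::uniform_int_distribution<std::uint64_t> distribution(
      1, std::numeric_limits<std::uint64_t>::max());
  std::vector<std::uint64_t> values;
  values.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    const std::uint64_t value = distribution(generator);
    assert(value != 0);  // Guaranteed by the distribution's lower bound.
    values.push_back(value);
  }
  return values;
}
```

Other causes of SIGFPE, such as a quotient that overflows the destination register, are not addressed by this sketch, which is consistent with the change fixing only some of the remaining failures.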
The changes are due to changes in behavior of FindAccessedAddrs() imported by the recent merge.
This probably "just works" because we're using a compiler that predates D127675. An up-to-date compiler would require the missing include.
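The general pattern behind this kind of fix is to include the header that declares each name a file uses, rather than relying on another standard header pulling it in transitively. The sketch below is a generic illustration, not the code touched by this patch: `std::back_inserter` is declared in `<iterator>`, and the function `CopyEven` is hypothetical.

```cpp
#include <algorithm>
#include <iterator>  // Declares std::back_inserter; do not rely on <vector>.
#include <vector>

// Hypothetical example: copies the even elements of `input` into a new
// vector, appending through an insert iterator from <iterator>.
std::vector<int> CopyEven(const std::vector<int>& input) {
  std::vector<int> output;
  std::copy_if(input.begin(), input.end(), std::back_inserter(output),
               [](int value) { return value % 2 == 0; });
  return output;
}
```

Whether such code compiles without the explicit include depends on which headers the standard library implementation happens to include internally, and that can change between releases.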
#include <iterator>