forked from llvm/llvm-project
[AutoBump] Merge with 5dac691b (Oct 11) (11) #448
Merged
…rs, buffer.*.pN (llvm#110714)" v2 (llvm#111708)" This reverts commit 4b4a0d4. New test fails on buildbots https://lab.llvm.org/buildbot/#/builders/63/builds/2039 https://lab.llvm.org/buildbot/#/builders/127/builds/1055
…mm_cvtsi64_ss SSE1 intrinsics Followup to llvm#111001
…vm#111746) The libcxx/benchmarks directory was moved to libcxx/test/benchmarks, which is already checked by that grep command.
This is a dependency of llvm#80007.
…#80007) This patch always defines the cxx_shared, cxx_static & other top-level targets. However, they are marked as EXCLUDE_FROM_ALL when we don't want to build them. Simply declaring the targets should be of no harm, and it allows other projects to mention these targets regardless of whether they end up being built or not. This patch basically moves the definition of e.g. cxx_shared out of the `if (LIBCXX_ENABLE_SHARED)` and instead marks it as EXCLUDE_FROM_ALL conditionally on whether LIBCXX_ENABLE_SHARED is passed. It then does the same for libunwind and libc++abi targets. I purposefully avoided reformatting the files (which now have inconsistent indentation) because I wanted to keep the diff minimal, and I know this is an area of the code where folks may have downstream diffs. I will re-indent the code separately once this patch lands. This is a reapplication of 79ee034, which was reverted in a353909 because it broke the TSAN and the Fuchsia builds. Resolves llvm#77654 Differential Revision: https://reviews.llvm.org/D134221
…m#105664) Default to Global address space for memrefs that do not have an explicit address space set in the IR. --------- Co-authored-by: Victor Perez <[email protected]> Co-authored-by: Jakub Kuderski <[email protected]> Co-authored-by: Victor Perez <[email protected]>
We need to use the MaterializeTemporaryExpr here so the checks in ExprConstant.cpp do the right thing.
…change) (llvm#111816) This makes tests more portable. Make variables for LLVM utils are passed to `make` on Darwin as well. Co-authored-by: Vladimir Vereschaka <[email protected]>
…M_TARGETS_TO_BUILD. (llvm#111382) From llvm#111356
…m#111127) [template.bitset.general] indicates that `bitset` shouldn't have member typedef-names `iterator` and `const_iterator`. Currently libc++'s typedef-names are causing ambiguity in name lookup, which isn't conforming. As these iterator types are themselves useful, I think we should just use __uglified member typedef-names for them. Fixes llvm#111125
…1803) Fixes llvm#110265 Adding check-all causes us to run some tests twice if a project-specific target like check-clang is also added. check-pstl is an alternative, but as far as I can tell check-all does not include it, so we have not been running the tests in CI anyway. When I tried to run check-pstl locally I got a lot of compiler errors, and I have not found any instructions on how to set up a correct build environment. Even if such instructions exist, it's probably more than we want to do in CI. According to Louis Dionne, the project is probably not active. So if it's ever revived, it'll be up to the new contributors to enable testing.
) Derived type results of BIND(C) functions should be returned according to the C ABI for returning the related C struct type. This currently did not happen since the abstract-result pass was forcing the Fortran ABI for all derived type results. Use the bind_c attribute that was added on call/func/dispatch in FIR to prevent such a rewrite in the abstract-result pass, and update the target-rewrite pass to deal with the struct return ABI. So far, the target-specific part of the target-rewrite is only implemented for X86-64 according to the "System V Application Binary Interface AMD64 v1"; the other targets will hit a TODO, just like for BIND(C), VALUE derived type arguments. This intends to deal with llvm#102113.
Fixes: llvm#111815 This patch replaces usage of the python `imp` library, which is deprecated since python3.4 and removed in python3.12, with the `importlib` library. As part of this update the repeated find_module+load_module pattern is moved into a utility function, since the importlib equivalent is much more verbose.
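As a rough illustration of the kind of utility function described above, here is a minimal sketch of an `importlib`-based replacement for the old `imp.find_module` + `imp.load_module` pair. The helper name `load_module` and its signature are hypothetical and are not taken from the patch; only the `importlib` calls themselves are standard library APIs.

```python
import importlib.machinery
import importlib.util
import sys


def load_module(name, search_paths):
    """Find `name` in the given directories and import it.

    Hypothetical helper: a stand-in for the find_module + load_module
    pattern, not the function added by the patch.
    """
    # PathFinder.find_spec searches only the directories we pass in,
    # much like imp.find_module(name, [dir, ...]) did.
    spec = importlib.machinery.PathFinder.find_spec(name, search_paths)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {name!r} in {search_paths!r}")

    # Create the module object, register it, then execute its code.
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

A call such as `load_module("my_plugin", ["/path/to/plugins"])` (both names are placeholders) would then replace each repeated find/load pair.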
) It turns out that {s,u}int_to_fp nodes get their operation action from their operand's type, not the result type, so we don't need to set it for fp16 or bf16. vp_{s,u}int_to_fp uses the result type though, so we need to keep it. This also means that we can lower int_to_fp for fixed-length bf16 vectors already, so this adds tests for that. The cost model test changes are due to BasicTTIImpl's getCastInstrCost not taking into account that int_to_fp needs its legal type swapped. This can be fixed in a later patch, but it's worth noting that the affected types in the tests currently crash when lowered anyway (due to them needing to be split at LMUL > 8).
Reverts llvm#111163, as this was merged prematurely.
…lvm#111129) Before this patch, redundant COPY couldn't be removed for the following case:
```
%reg1 = COPY %const-reg
... // There is a def of %const-reg
%reg2 = COPY killed %reg1
```
where this can be optimized to:
```
... // There is a def of %const-reg
%reg2 = COPY %const-reg
```
This patch allows for such optimization by not invalidating defined constant registers. This is safe, as architectures like AArch64 and RISCV replace a dead definition of a GPR with a zero constant register for certain instructions.
This change extends the current method for creating ABI objects to allow users (plugin libraries) to create custom ABI objects for their needs. This is accomplished by inheriting from one of the common ABIs and overriding one or more of the methods to create a custom ABI. To use a custom ABI for a given coroutine, the coro.begin.custom.abi intrinsic is used in place of the coro.begin intrinsic. It takes an additional i32 arg that specifies the index of an ABI generator for the custom ABI object in a SmallVector passed to the CoroSplitPass ctor.

The detailed changes include:
* Add the llvm.coro.begin.custom.abi intrinsic used to specify the index of the custom ABI to use for the given coroutine.
* Add constructors to CoroSplit that take a list of generators that create the custom ABI object.
* Extend the CreateNewABI function used by CoroSplit to return a unique_ptr to an ABI object.
* Add has/getCustomABI methods to CoroBeginInst class.
* Add a unittest for a custom ABI.

See doc update here: llvm#111781
Summary: This had some leftover references to the old namespace and didn't put restrict on it.
…lvm#111760) The previous error test line uses a 16-bit instruction to indicate an error. However, this is a poor pick: the 16-bit instructions on AMDGPU are under development, so some downstream branches do not show this exact error message. Change it to another error dasm code.
After the refactor in ed22913, the `args_in` and `args_out` attributes are no longer used by `linalg.generic`. This patch removes most of the remaining references. I've left out BufferDeallocationInternals.md, which doesn't seem to be maintained anymore and is quite out of sync with other bits of MLIR (e.g. `test.generic` instead of `linalg.generic`).
A follow-up for llvm#111816. This is to fix buildbot failure https://lab.llvm.org/staging/#/builders/195/builds/4242. TestSymbolFileJSON.py doesn't pass with llvm-strip on macOS. Apparently, llvm-strip/llvm-objcopy can't clean symbols from Mach-O nlists.
In `--icf=safe_thunks` mode, the linker differentiates `keepUnique` functions by creating thunks during a post-processing step after Identical Code Folding (ICF). While this ensures that `keepUnique` functions themselves are not incorrectly merged, it overlooks functions that reference these `keepUnique` symbols. If two functions are identical except for references to different `keepUnique` functions, the current ICF algorithm incorrectly considers them identical because it doesn't account for the future differentiation introduced by thunks. This leads to incorrect deduplication of functions that should remain distinct. To address this issue, we modify the ICF comparison to explicitly check for references to `keepUnique` functions during deduplication. By doing so, functions that reference different `keepUnique` symbols are correctly identified as distinct, preventing erroneous merging and ensuring the correctness of the linked output.
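The failure mode described above can be sketched with a toy model. This is not lld's ICF implementation: the `Func` record, the `fold_key` and `icf_groups` helpers, and the single-pass comparison are all hypothetical simplifications, and `keepUnique` functions are left out of the grouping on the assumption that the thunk step handles them separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    name: str
    body: str          # stand-in for the function's instruction bytes
    refs: tuple = ()   # names of the functions this one references


def fold_key(func, funcs, keep_unique, check_keep_unique):
    """Key under which two functions are considered identical."""
    ref_keys = []
    for ref in func.refs:
        if check_keep_unique and ref in keep_unique:
            # Fixed behaviour: a keepUnique target is compared by identity.
            ref_keys.append(("unique", ref))
        else:
            # Naive behaviour: targets are compared by content only.
            ref_keys.append(("body", funcs[ref].body))
    return (func.body, tuple(ref_keys))


def icf_groups(funcs, keep_unique, check_keep_unique):
    """Return the sets of non-keepUnique functions that would be folded."""
    groups = {}
    for f in funcs.values():
        if f.name in keep_unique:
            continue  # assumed to be handled by the later thunk step
        key = fold_key(f, funcs, keep_unique, check_keep_unique)
        groups.setdefault(key, []).append(f.name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Two identical keepUnique functions and two callers that differ only in
# which of them they reference.
funcs = {f.name: f for f in [
    Func("a", "ret"),
    Func("b", "ret"),
    Func("call_a", "call", ("a",)),
    Func("call_b", "call", ("b",)),
]}
keep_unique = {"a", "b"}

print(icf_groups(funcs, keep_unique, check_keep_unique=False))
print(icf_groups(funcs, keep_unique, check_keep_unique=True))
```

In this model the first call folds `call_a` with `call_b`, while the second keeps them distinct, which mirrors the behavior the patch is after.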
…lvm#111858) Reverts llvm#111678. Causes ARM failure in test suite. TYPE(C_PTR) results should not regress even if the struct ABI is not implemented for the target. https://lab.llvm.org/buildbot/#/builders/143/builds/2731 I need to revisit this.
Removes a dependency on LLVM in `xray_interface.cpp` by replacing `llvm_unreachable` with compiler-rt's `UNREACHABLE`. Applies clang-format to some unformatted changes. Original PR: llvm#90959
…ges (llvm#111562) The bulk of this change is new tests to check that we get a "Not yet implemented: *some stuff here*" message when using some not-yet-supported OpenMP functionality. For some of these cases, this also means adding additional clauses to a filter list in OpenMP.cpp. This changes nothing [to the best of my understanding] other than allowing the clause to get to the point where it can be rejected in a TODO with a clearer message. One of the TODO filters was missing the Mergeable clause, so this was also added and the existing test updated for the new, more specific error message. There is no functional change intended here.
The original patch had a reasonably significant bug. You could not use `.insn` to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that `getMachineOpValue` was truncating the immediate value, and I did not commit enough tests of useful cases. This changes the result of `getMachineOpValue` to be able to return the 48-bit and 64-bit immediates needed for the wider `.insn` directives. I took the opportunity to move some of the test cases around in the file to make looking at the output of `llvm-objdump` a little clearer.
…#111538) Introduce a description of late forwarding to the Neoverse-V1 Scheduling model.
…#111982) These are already in target specific test directories.
…lvm#111733) Add example to document that single statement `else` needs a brace if the associated `if` needs a brace.
…lvm#111752) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation for adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e., just lookup, no creation).
…rc.b (llvm#111828) This patch generalizes the DAG combine for `(sub (shl X, 8), X) => (orc.b X)` into the more general form of `(sub (shl X, 8 - Y), (srl X, Y)) => (orc.b X)`. Alive2 generalized proof: https://alive2.llvm.org/ce/z/dFcf_n Related issue: llvm#96595 Related PR: llvm#96680
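The generalized identity can be sanity-checked by brute force. The sketch below assumes a 64-bit register and assumes the precondition that every byte of X is either zero or has only bit Y set; the message above does not spell the precondition out, so treat that as one reading of it (the Alive2 link is the authoritative proof). The helper names are made up for this example.

```python
MASK64 = (1 << 64) - 1


def orc_b(x):
    """Reference model of orc.b: each nonzero byte becomes 0xFF."""
    out = 0
    for i in range(8):
        if (x >> (8 * i)) & 0xFF:
            out |= 0xFF << (8 * i)
    return out


def sub_shl_srl(x, y):
    """(sub (shl X, 8 - Y), (srl X, Y)) with 64-bit wraparound."""
    return ((x << (8 - y)) - (x >> y)) & MASK64


def bytes_with_bit(pattern, y):
    """Build X whose byte i is (1 << y) if bit i of pattern is set, else 0."""
    x = 0
    for i in range(8):
        if (pattern >> i) & 1:
            x |= (1 << y) << (8 * i)
    return x


def identity_holds():
    """Check every Y in 0..7 against every choice of nonzero bytes."""
    return all(
        sub_shl_srl(bytes_with_bit(p, y), y) == orc_b(bytes_with_bit(p, y))
        for y in range(8)
        for p in range(256)
    )


print(identity_holds())
```

With Y = 0 this reduces to the original `(sub (shl X, 8), X)` form. Inputs that violate the assumed precondition, such as X = 0x03 with Y = 0, do not satisfy the identity.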
This reverts commit 3f9998a. It breaks downstream tests with egregious numerical differences. Unfortunately no upstream tests are broken, but the fact that a prior iteration of the commit (pre-optimization) does work with our downstream tests (coming from the Triton repo) supports the claim that the final version of the commit is incorrect. Reverting now so that the original author can evaluate.
…llvm#111990) Fix build failure from the rename change. Looks like one additional reference sneaked in between pre-commit checks and the commit itself.
…n template calls. (llvm#111457)" See discussion in llvm#111711 This reverts commit 4dadf42.
…llvm#107350)" See discussion in llvm#111711 This reverts commit 224519b.
See discussion in llvm#111711 This reverts commit 6213aa5.
This reduces the total number of TableGen records produced by AMDGPU.td by about 6%.
It would be nice to see what our users think about this change, as this is something that WG21/EWG quite wants to fix a handful of questionable issues with UB. Depending on the outcome of this after being committed, we might instead suggest EWG undeprecate this, and require a bit of 'magic' from the lexer. Additionally, this patch makes it so we emit this diagnostic ALSO in cases where the literal name is reserved. It doesn't make sense to limit that. --------- Co-authored-by: Vlad Serebrennikov <[email protected]>
Fixes 0e91323 / llvm#111531 For reasons I can't explain, a clean build works fine for me, and all the bots are working fine. But if I rebuild in some way the make tool becomes None. Looking at the other variables, they had these extra lines so I've added those for make and it seems to solve the problem.
…lvm#111519) Lowering fixed-size BUILD_VECTORS without Neon may introduce stack spills, leading to more stores/reloads than if the stores were not merged. In some cases, it can also prevent using paired store instructions. In the future, we may want to relax when SVE is available, but currently, the SVE lowerings for BUILD_VECTOR are limited to a few specific cases.
Add a new enumeration `SuppressInlineNamespaceMode` to `PrintingPolicy` that is explicit about how to handle inline namespaces. `SuppressInlineNamespace` uses that enumeration now instead of a Boolean value. Specializing a template from an inline namespace should be transparent. For instance
```
namespace foo {
  inline namespace v1 {
    template<typename A>
    void function(A&);
  }
}
namespace foo {
  template<>
  void function<int>(int&);
}
```
`hasName` should match both declarations of `foo::function`. Makes the behavior of `matchesNodeFullSlow` and `matchesNodeFullFast` consistent, fixing an assert inside `HasNameMatcher::matchesNode`.
…m#111824) This hasn't been used for several years, so it's effectively dead code at this point.
This improves the CI output by providing collapsible sections for sub-parts of our build. This was originally opened as llvm#75233. Co-authored-by: eric <[email protected]>
The purpose of this optimization is to make the VL argument, for instructions that have a VL argument, as small as possible. This is implemented by visiting each instruction in reverse order and checking that if it has a VL argument, whether the VL can be reduced.

By putting this pass before VSETVLI insertion, we see three kinds of changes to generated code:
1. Eliminate VSETVLI instructions
2. Reduce the VL toggle on VSETVLI instructions that also change vtype
3. Reduce the VL set by a VSETVLI instruction

The list of supported instructions is currently whitelisted for safety. In the future, we could add more instructions to `isSupportedInstr` to support even more VL optimization.

We originally wrote this pass because vector GEP instructions do not take a VL, which leads us to emit code that uses VL=VLMAX to implement GEP in the RISC-V backend. As a result, some of the vector instructions will write to lanes, specifically between the intended VL and VLMAX, that will never be read. As an alternative to this pass, we considered adding a vector predicated GEP instruction, but this would not fit well into the intrinsic type system since GEP has a variable number of arguments, each with arbitrary types. The second approach we considered was to put this pass after VSETVLI insertion, but we found that it was more difficult to recognize optimization opportunities, especially across basic block boundaries; the data flow analysis was also a bit more expensive and complex.

While this pass solves the GEP problem, we have expanded it to handle more cases of VL optimization, and there is opportunity for the analysis to be improved to enable even more optimization. We have a few follow-up patches to post, but figured this would be a good start.

---------
Co-authored-by: Craig Topper <[email protected]>
Co-authored-by: Kito Cheng <[email protected]>
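The reverse walk described in the VL optimization message above can be illustrated with a toy model. Everything here is an assumption made for the sketch: the `Inst` record, the `reduce_vls` helper, the `VLMAX` sentinel, and in particular the rule that an instruction only needs to produce as many lanes as its users read (which presumes lane-wise instructions, in the spirit of the supported-instruction whitelist). It is not the pass's actual logic.

```python
from dataclasses import dataclass, field

VLMAX = 1 << 30  # hypothetical sentinel standing in for "all lanes"


@dataclass
class Inst:
    name: str
    vl: int
    srcs: list = field(default_factory=list)  # names of producer instructions


def reduce_vls(block):
    """Shrink each instruction's VL to what its users in the block demand."""
    users = {inst.name: [] for inst in block}
    for inst in block:
        for src in inst.srcs:
            users[src].append(inst)

    # Walk in reverse so every user is visited (and possibly shrunk)
    # before the instruction that feeds it.
    for inst in reversed(block):
        if not users[inst.name]:
            continue  # no known users: keep the VL as written
        demanded = max(user.vl for user in users[inst.name])
        if demanded < inst.vl:
            inst.vl = demanded
    return block


# A chain where only the last instruction carries the intended VL of 4.
block = [
    Inst("a", VLMAX),
    Inst("b", VLMAX, ["a"]),
    Inst("c", 4, ["b"]),
]
print([inst.vl for inst in reduce_vls(block)])
```

In this chain the VL of 4 on the last instruction propagates back to both producers, which is the kind of reduction that would let later VSETVLI insertion emit fewer or smaller VL changes.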
mgehre-amd
approved these changes
Jan 14, 2025
No description provided.