Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoBump] Merge with 5dac691b (Oct 11) (11) #448

Merged
merged 178 commits into from
Jan 20, 2025

Conversation

jorickert
Copy link

No description provided.

metaflow and others added 30 commits October 10, 2024 13:37
…vm#111746)

The libcxx/benchmarks directory was moved to libcxx/test/benchmarks,
which is already checked by that grep command.
…#80007)

This patch always defines the cxx_shared, cxx_static & other top-level
targets. However, they are marked as EXCLUDE_FROM_ALL when we don't want
to build them. Simply declaring the targets should be of no harm, and it
allows other projects to mention these targets regardless of whether
they end up being built or not.

This patch basically moves the definition of e.g. cxx_shared out of the
`if (LIBCXX_ENABLE_SHARED)` and instead marks it as EXCLUDE_FROM_ALL
conditionally on whether LIBCXX_ENABLE_SHARED is passed. It then does
the same for libunwind and libc++abi targets. I purposefully avoided to
reformat the files (which now has inconsistent indentation) because I
wanted to keep the diff minimal, and I know this is an area of the code
where folks may have downstream diffs. I will re-indent the code
separately once this patch lands.

This is a reapplication of 79ee034, which was reverted in
a353909 because it broke the TSAN and the Fuchsia builds.

Resolves llvm#77654

Differential Revision: https://reviews.llvm.org/D134221
…m#105664)

Default to Global address space for memrefs that do not have an explicit address space set in the IR.

---------

Co-authored-by: Victor Perez <[email protected]>
Co-authored-by: Jakub Kuderski <[email protected]>
Co-authored-by: Victor Perez <[email protected]>
We need to use the MaterializeTemporaryExpr here so the checks in
ExprConstant.cpp do the right thing.
…change) (llvm#111816)

This makes tests more portable.
Make variables for LLVM utils are passed to `make` on Darwin as well.

Co-authored-by: Vladimir Vereschaka <[email protected]>
)

There are artificial one-use limitations on foldExtractedCmps. Adjust
the costs to account for multi-use, and strip the one-use matcher,
lifting the limitations.
…m#111127)

[template.bitset.general] indicates that `bitset` shouldn't have member
typedef-names `iterator` and `const_iterator`. Currently libc++'s
typedef-names are causing ambiguity in name lookup, which isn't
conforming.

As these iterator types are themselves useful, I think we should just
use __uglified member typedef-names for them.

Fixes llvm#111125
…1803)

Fixes llvm#110265

Adding check-all causes us to run some tests twice if a project specific
target like check-clang is also added.

check-pstl is an alternative but as far as I can tell, check-all does
not include this so we have not been running the tests in CI anyway.

When I tried to run check-pstl locally I got a lot of compiler errors
but have not found any instructions on how to setup a correct build
environment. Even if such instructions exist, it's probably more than we
want to do in CI.

According to Louis Dionne, the project is probably not active. So if
it's ever revived it'll be up to the new contributors to enable testing.
)

Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.

This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.

So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.

This intends to deal with
llvm#102113.
Fixes: llvm#111815

This patch replaces usage of the python `imp` library, which is
deprecated since python3.4 and removed in python3.12, with the
`importlib` library. As part of this update the repeated
find_module+load_module pattern is moved into a utility function, since
the importlib equivalent is much more verbose.
)

It turns out that {s,u}int_to_fp nodes get their operation action from
their operand's type, not the result type, so we don't need to set it
for fp16 or bf16. vp_{s,u}int_to_fp uses the result type though so we
need to keep it.

This also means that we can lower int_to_fp for fixed length bf16
vectors already, so this adds tests for that.

The cost model test changes are due to BasicTTIImpl's getCastInstrCost
not taking into account that int_to_fp needs its legal type swapped.
This can be fixed in a later patch, but its worth noting that the
affected types in the tests currently crash when lowered anyway (due to
them needing split at LMUL > 8)
…lvm#111129)

Before this patch, redundant COPY couldn't be removed for the following
case:
```
  %reg1 = COPY %const-reg
  ... // There is a def of %const-reg
  %reg2 = COPY killed %reg1
```
where this can be optimized to:
```
  ... // There is a def of %const-reg
  %reg2 = COPY %const-reg
```

This patch allows for such optimization by not invalidating defined
constant registers. This is safe, as architectures like AArch64 and
RISCV replace a dead definition of a GPR with a zero constant register
for certain instructions.
This change extends the current method for creating ABI object to allow
users (plugin libraries) to create custom ABI objects for their needs.
This is accomplished by inheriting one of the common ABIs and overriding
one or more of the methods to create a custom ABI. To use a custom ABI
for a given coroutine the coro.begin.custom.abi intrinsic is used in
place of the coro.begin intrinsic. This takes an additional i32 arg that
specifies the index of an ABI generator for the custom ABI object in a
SmallVector passed to the CoroSplitPass ctor.

The detailed changes include:
* Add the llvm.coro.begin.custom intrinsic used to specify the index of
the custom ABI to use for the given coroutine.
* Add constructors to CoroSplit that take a list of generators that
create the custom ABI object.
* Extend the CreateNewABI function used by CoroSplit to return a
unique_ptr to an ABI object.
* Add has/getCustomABI methods to CoroBeginInst class.
* Add a unittest for a custom ABI.

See doc update here: llvm#111781
Summary:
This had some leftover references to the old namespace and didn't put
restrict on it.
…lvm#111760)

The previous error test line is using a 16bit instruction to indicate an
error. However this is a poor pick.

The 16bit instructions on AMDGPU is under development and thus, some
downstream branches are not showing this exact error message. Changing
it to another error dasm code.
After the refactor in:
  * ed22913,

the `args_in` and `args_out` attributes are no longer used by
`linalg.generic`. This patch removes most the remaining references.
I've left out BufferDeallocationInternals.md, which doesn't seem
maintained anymore and is quite out of sync with other bits of MLIR
(e.g. `test.generic` instead of `linalg.generic`).
A follow-up for llvm#111816.

This is to fix buildbot failure
https://lab.llvm.org/staging/#/builders/195/builds/4242.

TestSymbolFileJSON.py doesn't pass with llvm-strip on macOS. Apparently,
llvm-strip/llvm-objcopy can't clean symbols from Mach-O nlists.
In `--icf=safe_thunks` mode, the linker differentiates `keepUnique`
functions by creating thunks during a post-processing step after
Identical Code Folding (ICF). While this ensures that `keepUnique`
functions themselves are not incorrectly merged, it overlooks functions
that reference these `keepUnique` symbols.

If two functions are identical except for references to different
`keepUnique` functions, the current ICF algorithm incorrectly considers
them identical because it doesn't account for the future differentiation
introduced by thunks. This leads to incorrect deduplication of functions
that should remain distinct.

To address this issue, we modify the ICF comparison to explicitly check
for references to `keepUnique` functions during deduplication. By doing
so, functions that reference different `keepUnique` symbols are
correctly identified as distinct, preventing erroneous merging and
ensuring the correctness of the linked output.
…lvm#111858)

Reverts llvm#111678

Causes ARM failure in test suite. TYPE(C_PTR) result should not regress
even if struct ABI no implemented for the target.
https://lab.llvm.org/buildbot/#/builders/143/builds/2731
I need to revisit this.
sebastiankreutzer and others added 26 commits October 11, 2024 13:11
Removes a dependency on LLVM in `xray_interface.cpp` by replacing
`llvm_unreachable` with compiler-rt's `UNREACHABLE`.
Applies clang-format to some unformatted changes. 

Original PR: llvm#90959
…ges (llvm#111562)

The bulk of this change are new tests to check that we get a "Not yet
implemneted: *some stuff here*" message when using some not yet
supported OpenMP functionality.

For some of these cases, this also means adding additional clauses to a
filter list in OpenMP.cpp - this changes nothing [to the best of my
understanding] other than allowing the clause to get to the point where
it can be rejected in a TODO with a more clear message. One of the TOOD
filters were missing Mergeable clause, so this was also added and the
existing test updated for the new more specific error message.

There is no functional change intended here.
The original patch had a reasonably significant bug. You could not use
`.insn` to assemble encodings that had any bits set above the low 32
bits. This is due to the fact that `getMachineOpValue` was truncating
the immediate value, and I did not commit enough tests of useful cases.

This changes the result of `getMachineOpValue` to be able to return the
48-bit and 64-bit immediates needed for the wider `.insn` directives.

I took the opportunity to move some of the test cases around in the file
to make looking at the output of `llvm-objdump` a little clearer.
…#111538)

Introduce a description of late forwarding to the Neoverse-V1 Scheduling model.
…#111982)

These are already in target specific test directories.
…lvm#111733)

Add example to document that single statement `else` needs a brace if
the associated `if` needs a brace.
…lvm#111752)

Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
…rc.b (llvm#111828)

This patch generalizes the DAG combine for `(sub (shl X, 8), X) =>
(orc.b X)`
into the more general form of `(sub (shl X, 8 - Y), (srl X, Y)) =>
(orc.b X)`.

Alive2 generalized proof: https://alive2.llvm.org/ce/z/dFcf_n
Related issue: llvm#96595
Related PR: llvm#96680
This reverts commit 3f9998a.

It breaks downstream tests with egregious numerical differences.

Unfortunately no upstream tests are broken, but the fact that
a prior iteration of the commit (pre-optimization) does work
with our downstream tests (coming from the Triton repo) supports
the claim that the final version of the commit is incorrect.

Reverting now so that the original author can evaluate.
…llvm#111990)

Fix build failure from the rename change. Looks like one additional
reference sneaked in between pre-commit checks and the commit itself.
This reduces the total number of TableGen records produced by AMDGPU.td
by about 6%.
It would be nice to see what our users think about this change, as this
is something that WG21/EWG quite wants to fix a handful of questionable
issues with UB. Depending on the outcome of this after being committed,
we might instead suggest EWG undeprecate this, and require a bit of
'magic' from the lexer.

Additionally, this patch makes it so we emit this diagnostic ALSO in
cases where the literal name is reserved. It doesn't make sense to limit
that.

---------

Co-authored-by: Vlad Serebrennikov <[email protected]>
Fixes 0e91323 /
llvm#111531

For reasons I can't explain, a clean build works fine for me, and all
the bots are working fine. But if I rebuild in some way the make tool
becomes None.

Looking at the other variables, they had these extra lines so I've added
those for make and it seems to solve the problem.
…lvm#111519)

Lowering fixed-size BUILD_VECTORS without Neon may introduce stack
spills, leading to more stores/reloads than if the stores were not
merged. In some cases, it can also prevent using paired store
instructions.

In the future, we may want to relax when SVE is available, but
currently, the SVE lowerings for BUILD_VECTOR are limited to a few
specific cases.
Add a new enumeration `SuppressInlineNamespaceMode` to `PrintingPolicy` that
is explicit about how to handle inline namespaces. `SuppressInlineNamespace`
uses that enumeration now instead of a Boolean value.

Specializing a template from an inline namespace should be transparent.
For instance

```
namespace foo {
    inline namespace v1 {
        template<typename A>
        void function(A&);
    }
}

namespace foo {
    template<>
    void function<int>(int&);
}
```

`hasName` should match both declarations of `foo::function`.

Makes the behavior of `matchesNodeFullSlow` and `matchesNodeFullFast`
consistent, fixing an assert inside `HasNameMatcher::matchesNode`.
…m#111824)

This hasn't been used for several years, so it's effectively dead code
at this point.
This improves the CI output by providing collapsable sections for
sub-parts of our build.

This was originally opened as llvm#75233.

Co-authored-by: eric <[email protected]>
)

Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.fma
  * vector.reduce
The purpose of this optimization is to make the VL argument, for
instructions that have a VL argument, as small as possible. This is
implemented by visiting each instruction in reverse order and checking
that if it has a VL argument, whether the VL can be reduced.

By putting this pass before VSETVLI insertion, we see three kinds of
changes to generated code:
1. Eliminate VSETVLI instructions
2. Reduce the VL toggle on VSETVLI instructions that also change vtype
3. Reduce the VL set by a VSETVLI instruction

The list of supported instructions is currently whitelisted for safety.
In the future, we could add more instructions to `isSupportedInstr` to
support even more VL optimization.

We originally wrote this pass because vector GEP instructions do not
take a VL, which leads us to emit code that uses VL=VLMAX to implement
GEP in the RISC-V backend. As a result, some of the vector instructions
will write to lanes, specifically between the intended VL and VLMAX,
that will never be read. As an alternative to this pass, we considered
adding a vector predicated GEP instruction, but this would not fit well
into the intrinsic type system since GEP has a variable number of
arguments, each with arbitrary types. The second approach we considered
was to put this pass after VSETVLI insertion, but we found that it was
more difficult to recognize optimization opportunities, especially
across basic block boundaries -- the data flow analysis was also a bit
more expensive and complex.

While this pass solves the GEP problem, we have expanded it to handle
more cases of VL optimization, and there is opportunity for the analysis
to be improved to enable even more optimization. We have a few follow up
patches to post, but figured this would be a good start.

---------

Co-authored-by: Craig Topper <[email protected]>
Co-authored-by: Kito Cheng <[email protected]>
@jorickert jorickert requested a review from mgehre-amd January 13, 2025 17:16
@mgehre-amd mgehre-amd merged commit 88d6c0c into bump_to_a1c9dd7c Jan 20, 2025
11 checks passed
@mgehre-amd mgehre-amd deleted the bump_to_5dac691b branch January 20, 2025 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.