[AutoBump] Merge with 8a9921f5 (Oct 23) (17) #454

jorickert · 2025-01-14T15:54:34Z

No description provided.

This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on. Built by GCC 11. Refactor the code to avoid the false warning in llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp:

    llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp: In function ‘int LLVMFuzzerInitialize(int*, char***)’:
    llvm-project/llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp:141:43: error: ISO C++ forbids zero-size array ‘argv’ [-Werror=pedantic]
      141 | ExitOnError ExitOnErr(std::string(*argv[0]) + ": error:");
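A minimal sketch of the kind of refactor that sidesteps this diagnostic, assuming the warning comes from GCC tentatively reading `std::string(*argv[0])` as a declaration of an array named `argv`. The helper name `errorPrefix` and the local `progName` are hypothetical and are not taken from the actual patch; the idea is only that binding the program name to a named local first leaves nothing that looks like a declarator.

```cpp
#include <string>

// Hypothetical helper, not the upstream fix: build the "<prog>: error:"
// prefix without writing std::string(*argv[0]) as a single expression.
std::string errorPrefix(char ***argv) {
  // (*argv)[0] is the same element as *argv[0], spelled so that it cannot
  // be mistaken for an array declarator.
  const char *progName = (*argv)[0];
  return std::string(progName) + ": error:";
}
```

The resulting string can then be passed to the `ExitOnError` constructor as before.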

Clang uses timestamp files to track the last time an implicitly-built PCM file was verified to be up-to-date with regard to its inputs. With `-fbuild-session-{file,timestamp}=` and `-fmodules-validate-once-per-build-session` this reduces the number of times a PCM file is checked per "build session". The behavior I'm seeing with the current scheme is that when lots of Clang instances wait for the same PCM to be built, they race to validate it as soon as the file lock gets released, causing lots of concurrent IO. This patch makes it so that the timestamp is written by the same Clang instance responsible for building the PCM while still holding the lock. As a result, whenever a PCM file gets compiled, it's never re-validated in the same build session. I believe this is as sound as the current scheme. One thing to be aware of is that there might be a time interval between accessing input file N and writing the timestamp file, where changes to input files 0..<N would not result in a rebuild. Since this is also the case with the current scheme, I'm not too concerned about that. I've seen this speed up `clang-scan-deps` by ~27%.

llvm#112904 will add typechecking to submulticlass arguments, and these ones are currently mistyped.

We already have the .o, there is no reason to go .o -> YAML -> .o

…13350) This corrects a couple of off-by-one errors related to the sampling of **instrumented** counters, and enables setting 100% rates for burst sampling (burst duration = period). Off-by-one errors: Prior to this change it was impossible to set a period of 65535 because this was converted to fast sampling, which rolls over at USHRT_MAX + 1 (65536). Similarly, the burst durations would collect burst duration + 1 counts as they used a ULE comparison. 100% sampling: Although this is not useful for a productionized use case, it does allow for more deterministic testing with the sampling checks in place. After all the off-by-one errors are fixed, allowing for 100% sampling is a matter of letting burst duration = period.
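The burst off-by-one can be illustrated with a small model. This is a sketch under stated assumptions, not LLVM's instrumentation code: `countSampled` and its `inclusive` flag are hypothetical names, and the model simply records the first `burst` events of every `period` events. With an inclusive (ULE-style) comparison each period records one event too many; with a strict comparison, setting `burst == period` gives 100% sampling.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of burst sampling: returns how many of `events`
// consecutive events are recorded. `inclusive` selects the buggy
// "counter <= burst" comparison instead of "counter < burst".
uint32_t countSampled(uint32_t period, uint32_t burst, uint32_t events,
                      bool inclusive) {
  assert(period > 0 && burst <= period);
  uint32_t counter = 0;
  uint32_t recorded = 0;
  for (uint32_t i = 0; i < events; ++i) {
    bool take = inclusive ? (counter <= burst) : (counter < burst);
    if (take)
      ++recorded;
    // Restart the window once a full period has elapsed.
    if (++counter == period)
      counter = 0;
  }
  return recorded;
}
```

For example, `countSampled(100, 10, 100, true)` records 11 events while `countSampled(100, 10, 100, false)` records the intended 10.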

Reverts llvm#68176 Introduced BuildBot failure: llvm#68176 (comment)

With sampled instrumentation (llvm#69535), profile counts may appear corrupt and `fixFuncEntryCount` may assert. In particular a function can have a 0 block count for its entry, while later blocks are non zero. This is only likely to happen for colder functions, so it is reasonable to take any action that does not crash. Here we simply bail from fixing the entry count.

…0569) Extend the logic added in 123c036 (llvm#76612) to support pointers to non-builtin types by using the mangled name of the canonical type. PR: llvm#110569

…cessible outside of Sema (llvm#113206) Moves `IsIntangibleType` from SemaHLSL to Type class and renames it to `isHLSLIntangibleType`. The existing `isHLSLIntangibleType` is renamed to `isHLSLBuiltinIntangibleType` and updated to return true only for the builtin `__hlsl_resource_t` type. This change makes `isHLSLIntangibleType` functionality accessible outside of Sema, for example from clang CodeGen.

Add support for ``llvm.nvvm.fshl.clamp`` and ``llvm.nvvm.fshr.clamp`` intrinsics. These intrinsics are similar to the generic llvm funnel shift, except that the shift value is clamped to the integer width. Currently only ``i32`` is supported and is implemented with the `shf.[rl].clamp.b32` PTX instruction.
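A reference model of the clamped funnel shift for `i32` may help when reading this. It is a sketch that assumes the semantics described above (same as the generic funnel shift on the concatenation `hi:lo`, but with the shift amount clamped to 32 rather than taken modulo 32); the function names `fshlClamp` and `fshrClamp` are illustrative only.

```cpp
#include <algorithm>
#include <cstdint>

// Clamped funnel shift left: shift the 64-bit value hi:lo left by
// min(shift, 32) and return the upper 32 bits.
uint32_t fshlClamp(uint32_t hi, uint32_t lo, uint32_t shift) {
  uint32_t n = std::min<uint32_t>(shift, 32);
  uint64_t concat = (static_cast<uint64_t>(hi) << 32) | lo;
  return static_cast<uint32_t>((concat << n) >> 32);
}

// Clamped funnel shift right: shift hi:lo right by min(shift, 32)
// and return the lower 32 bits.
uint32_t fshrClamp(uint32_t hi, uint32_t lo, uint32_t shift) {
  uint32_t n = std::min<uint32_t>(shift, 32);
  uint64_t concat = (static_cast<uint64_t>(hi) << 32) | lo;
  return static_cast<uint32_t>(concat >> n);
}
```

In this model any shift of 32 or more makes the left shift return `lo` and the right shift return `hi`, which is where it differs from the wrapping behavior of the generic funnel shift.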

…2802) Store Swift mangled names in DW_AT_linkage_name. The Swift compiler emits only the type mangled name in debug information, and LLDB uses those mangled names as keys to look up size, alignment, fields, etc. from either reflection metadata or Swift modules. Additionally, emit linkage names for types into the accelerator table if they exist and they're different from the display name.

… invalid (llvm#104540) Fixes llvm#102945.

…e with flexible array init (llvm#113336) Fixes: llvm#113187 Avoid creating the init function, since clang does not support global variables with flexible array init. Creating it would cause an assertion failure later.

This patch adds functionality for atomically reading `llvm.struct` types. Fixes: llvm#93441

llvm#113260) …tyle

Fixes llvm#113256.

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368

…ds (llvm#113264) Looks like having a constant in `Z` also caused infinite loops. This fixes llvm#113240.

…ructs (llvm#113045) According to OpenMPv5.2 1.2.6, "For Fortran, a scalar variable with intrinsic type, as defined by the base language, excluding character type.". Likewise, section 4.3.1.3 states that atomic operations are on "scalar variables of intrinsic type". This PR hence introduces a check to error out when CHARACTER type is used in atomic operations. Fixes llvm#112918

…lvm#113108) Restricts the verifier for tensor.pack and tensor.unpack Ops so that the following is no longer allowed:

```mlir
%c8 = arith.constant 8 : index
%0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32>
```

Specifically, in line with other Tensor Ops, require:

* a dynamic dimension for each (dynamic) SSA value,
* a static dimension for each static size (attribute).

In the example above, a static dimension (8) is mixed with a dynamic size (%c8). Note that this is mostly deleting existing code. That's because this change simplifies the logic in the verifier.

For more context:

* https://discourse.llvm.org/t/tensor-ops-with-dynamic-sizes-which-behaviour-is-more-correct
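The rule can be sketched outside of MLIR as a plain predicate. This is an illustrative model only, not the verifier's code: `Extent`, `tilesMatchInnerDims`, and the use of `std::nullopt` to stand for a dynamic size (`?` in a type, or an SSA value such as `%c8`) are all assumptions made for the example. The equality check on static values is likewise an assumption about what a matching static pair should mean.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// std::nullopt models a dynamic extent; a value models a static one.
using Extent = std::optional<int64_t>;

// Each inner tile size must agree with the corresponding inner dimension
// of the packed type: dynamic with dynamic, static with the same static value.
bool tilesMatchInnerDims(const std::vector<Extent> &innerTiles,
                         const std::vector<Extent> &packedInnerDims) {
  if (innerTiles.size() != packedInnerDims.size())
    return false;
  for (std::size_t i = 0; i < innerTiles.size(); ++i) {
    const Extent &tile = innerTiles[i];
    const Extent &dim = packedInnerDims[i];
    if (tile.has_value() != dim.has_value())
      return false; // static mixed with dynamic
    if (tile && *tile != *dim)
      return false; // both static but different
  }
  return true;
}
```

The rejected example above corresponds to tiles `{8, dynamic}` against inner dimensions `{8, 8}`, which this predicate reports as a mismatch.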

This fixes the inferred output shape of the TOSA slice op for start/size values that are out-of-bound or -1. Added tests to check:

- size = -1
- size is out of bound
- start is out of bound

Signed-off-by: Tai Ly <[email protected]>

Fix ordering of checks in atomic02.f90.

Reverts llvm#108306

…cl (llvm#113276) This is more similar to the diagnostic output of the current interpreter

The patch adds graceful handling of incorrectly constructed MLIR operations with fewer operands than expected.

…#104764)

…h-abs feature This is to align with GAS. Additionally, there are some minor changes: the definition and expansion process of the TLS_DESC pseudo-instruction were modified in the same style. Reviewed By: heiher Pull Request: llvm#112858

Fixes llvm#113154

The encodings used for llvm.trap() on ARM were all marked as barriers and terminators. This led to stack frame destroy code being inserted before the trap if the trap was the last thing in the function and it had no return statement.

```
void fn() {
  volatile int i = 0;
  __builtin_trap();
}
```

Produced:

```
fn:
  push {r11, lr}    << stack frame create
  <...>
  mov sp, r11
  pop {r11, lr}     << stack frame destroy
  .inst 0xe7ffdefe  << trap
  bx lr
```

All the other targets don't mark them this way, instead they mark them with isTrap. I've changed ARM to do this, which fixes the code generation:

```
fn:
  push {r11, lr}    << stack frame create
  <...>
  .inst 0xe7ffdefe  << trap
  mov sp, r11
  pop {r11, lr}     << stack frame destroy
  bx lr
```

I've updated the existing trap test to force the need for a stack frame, then check that the instruction immediately after the trap is resetting the stack pointer. debugtrap was already working but I've added the same checks for it anyway.

Co-authored-by: Alex Richardson <[email protected]>

…literal in StackAddressEscape

This patch simplifies the diagnostic message in the core.StackAddressEscape checker for stack memory associated with compound literals by removing the redundant "returned to caller" suffix. Example: https://godbolt.org/z/KxM67vr7c

```c
// clang --analyze -Xanalyzer -analyzer-checker=core.StackAddressEscape
void* compound_literal() {
  return &(unsigned short){((unsigned short)0x22EF)};
}
```

warning: Address of stack memory associated with a compound literal declared on line 2 **returned to caller returned to caller** [core.StackAddressEscape]

This PR updates the cast to bool from IntN to treat any non-zero value as TRUE. This makes the cast more resilient to non-generic (i.e. "non 1") TRUE values. Signed-off-by: Dmitriy Smirnov <[email protected]>

…3305) Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.

… docs (llvm#112869)

* Note up front that the author may not have permissions to use the merge button and should ask a reviewer to do those steps.
* Make it clear that a single commit PR can be landed with a single button click.
* There are in fact 3 ways to land a multi-commit PR.
* Order the ways in increasing amount of overhead for the PR author.
* Put them in bullet point sections so they are visually separate.
* Add a note that force pushes can be problematic when the PR has multiple authors, but don't go too much into how to solve that, Git's docs are better here anyway.

Until now, these options have been hardcoded as downstream patches in LLD. Add them to the driver so that the private patches can be removed. PS5 only. The implementation of these behaviours will remain in the proprietary linker on PS4. SIE tracker: TOOLCHAIN-16704

hlfir.assign currently has the `MemoryEffects<[MemWrite]>` trait, which makes it look like it can write to anything. This is good for some cases where the assign effect cannot be precisely described through the MLIR side effect API (e.g., when the LHS is a descriptor and it is not possible to get an OpOperand describing the data address, or when derived types are involved and finalization could be called, or user defined assignment for some components). For the most common case of hlfir.assign on intrinsic types without whole allocatable LHS, this is pessimistic. This patch implements a finer description of the side effects when possible, and also adds the proper read/allocate/free effects when relevant. The ultimate goal is to suppress the generation of a temporary for the LHS address when dealing with an assignment to a vector subscripted LHS where the vector subscript is an array constructor that does not refer to the LHS (as in `x([a,b]) = y`). Two more patches will follow to enable this.

…oca (llvm#113321) See https://reviews.llvm.org/D157626 for the rationale of declare having side effects. The write effect is too scary for passes that look for read/write effects without caring about the resource affected. I know Slava asked for it, but I think the creation of the `DebuggingResource` was enough and that a write is too much. The alloca effect is sufficient to prevent DCE from removing it, which is all we care about currently. The write effect is currently flagged as a reason for creating an LHS temporary in assignment to a vector subscripted entity with an array constructor. There is a lot of read/write side effect analysis in the "lower-hlfir-ordered-assignments" pass, and I feel like we will just keep adding weird "debug resource" bypassing here and there with these side effects.

…nt (llvm#113330) Last patch required to avoid creating a temporary for the LHS when dealing with `x([a,b]) = y`. The code dealing with "ordered assignments" (where, forall, user and vector subscripted assignments) saves the evaluated RHS/LHS and masks if they have write effects, because these write effects should not be re-evaluated when they affect entities that may be written to in other contexts after the evaluation and before the re-evaluation. But when dealing with a write to storage allocated in the region for the expression being evaluated, there is no problem re-evaluating the write: it has no effect outside of the expression evaluation that owns the allocation. In the case of `x([a,b]) = y`, the temporary is created for the vector subscript. Raising the HLFIR abstraction for simple array constructors may be a good idea, but local temps are created in other contexts, so this fix is more generic.

…ons (llvm#113292) This patch adds the zeroing predicate forms (Pg/z) of the following instructions:

- FCVTXNT
- FCVTNT
- FCVTLT
- BFCVTNT

As specified in https://developer.arm.com/documentation/ddi0602.

Co-authored-by: Spencer Abson <[email protected]>

…ls (llvm#113283) On ARM64EC, external function calls emit a pair of weak-dependency aliases: `func` to `#func` and `#func` to the `func` guest exit thunk (instead of a single undefined `func` symbol, which would be emitted on other targets). Allow such aliases to be overridden by lazy archive symbols, just as we would for undefined symbols.

The Intel C++ Compiler (ICX) passes linker flags through the driver unlike MSVC and clang-cl, and therefore needs them to be prefixed with `/Qoption,link` (the equivalent of `-Wl,` for gcc on *nix). Use the `LINKER:` prefix wherever supported by CMake; when that's not possible, fall back to `${CMAKE_CXX_LINKER_WRAPPER_FLAG}`. CMake replaces these with `/Qoption,link` for ICX and with the empty string for MSVC and clang-cl. For `target_link_libraries` neither `LINKER:` (not supported prior to CMake 3.32) nor `${CMAKE_CXX_LINKER_WRAPPER_FLAG}` (which does not begin with `-` and so would be taken as a library name) works, so use `-Qoption,link` directly within a generator expression conditional on linking with ICX. For MSVC and clang-cl no functional change is intended. Tested by compiling with ICX and setting `CMAKE_(EXE|SHARED|STATIC|MODULE)_LINKER_FLAGS_INIT` to `-Werror=unknown-argument`. RFC: https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446

…lazy archive symbol to the symbol table on ARM64EC (llvm#113284) On ARM64EC, a function symbol may appear in both mangled and demangled forms:

- ARM64EC archives contain only the mangled name, while the demangled symbol is defined by the object file as an alias.
- x86_64 archives contain only the demangled name (the mangled name is usually defined by an object referencing the symbol as an alias to a guest exit thunk).
- ARM64EC import files contain both the mangled and demangled names for thunks.

If more than one archive defines the same function, this could lead to different libraries being used for the same function depending on how they are referenced. Avoid this by checking if the paired symbol is already defined before adding a symbol to the table.

…m#112928) Member pointers refer to data or function members of a `CXXRecordDecl` and require a `MSInheritanceAttr` in order to be complete. Without that we cannot calculate their size in memory. The attempt has been causing a crash further down in the clang AST context. In order to implement the feature, DWARF will need a new attribute to convey the information. For the moment, this patch teaches LLDB to handle the situation and avoid the crash.

…lvm#111130) Before this patch, redundant COPY couldn't be removed for the following case:

```
$R0 = OP ...
... // Read of %R0
$R1 = COPY killed $R0
```

This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to:

```
$R1 = OP ...
... // Replace all uses of %R0 with $R1
```

This PR merges large offsets into the base address loading.

llvm#113309) llvm-cxxfilt can demangle names of data symbols, in addition to function names. $ llvm-cxxfilt _ZN6garden5gnomeE garden::gnome And type names too, on request: $ llvm-cxxfilt -t i int Update some overly specific wording in the --help and documentation that suggests otherwise.

This patch adds these new vector sizes for Neon: mfloat8x16_t and mfloat8x8_t, according to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323

llvm#111531) Bot maintainers should be aware and it became too much of a burden for developers. In particular on Windows, where make.exe won't be found in Path typically.

…2867) The Intel C++ Compiler (ICX) passes linker flags through the driver unlike MSVC and clang-cl, and therefore needs them to be prefixed with `/Qoption,link` (the equivalent of -Wl, for gcc on *nix). Use the `LINKER:` prefix for the `/EXPORT:` options in clang-repl, this expands to the correct flag for ICX and nothing for MSVC / clang-cl. RFC: https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see llvm#108980 (comment)).

Note: Both SLEEF and ArmPL state that they do not set `errno`:

- https://developer.arm.com/documentation/101004/2410/General-information/Arm-Performance-Libraries-math-functions
  * "The vector functions in libamath which are available on Linux may not set errno nor raise exceptions"
- https://sleef.org/2-references/libm/
  * "These functions do not set errno nor raise an exception."

…llvm#113167) Define `OmpIteratorSpecifier` and `OmpIteratorModifier` parser classes, and add parsing for them. Those are reusable between any clauses that use iterator modifiers. Add support for iterator modifiers to the MAP clause up to lowering, where a TODO message is emitted.

Reverts llvm#112603

…llvm#113452)

…at container-inserter does (llvm#113103) This patch implements LWG4016: container-insertable checks do not match what container-inserter does.

…lvm#111236) The underlying issue with msan was fixed by llvm#113200

When compiling for an SVE target we can use INDEX to generate constant fixed-length step vectors, e.g.:

```
uint32x4_t foo() {
  return (uint32x4_t){0, 1, 2, 3};
}
```

Currently:

```
foo():
  adrp x8, .LCPI1_0
  ldr q0, [x8, :lo12:.LCPI1_0]
  ret
```

With INDEX:

```
foo():
  index z0.s, #0, #1
  ret
```

The logic for this was already in `LowerBUILD_VECTOR`, though it was hidden under a check for `!Subtarget->isNeonAvailable()`. This patch refactors this to enable the corresponding code path unconditionally for constant step vectors (as long as we can use SVE for them).
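What counts as a "constant step vector" can be sketched as a standalone check. This is an illustration only, not the code in `LowerBUILD_VECTOR`: `StepVector` and `matchStepVector` are hypothetical names, the encodable immediate range of INDEX is not modeled, and treating splats (step 0) as a non-match is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Start value and stride of an arithmetic sequence of lane constants.
struct StepVector {
  int64_t start;
  int64_t step;
};

// Returns {start, step} when the lanes are start, start + step,
// start + 2 * step, ... with a non-zero step; std::nullopt otherwise.
// Assumes lane values small enough that the differences cannot overflow.
std::optional<StepVector> matchStepVector(const std::vector<int64_t> &lanes) {
  if (lanes.size() < 2)
    return std::nullopt;
  int64_t step = lanes[1] - lanes[0];
  if (step == 0)
    return std::nullopt; // a splat, not a step vector
  for (std::size_t i = 2; i < lanes.size(); ++i)
    if (lanes[i] - lanes[i - 1] != step)
      return std::nullopt;
  return StepVector{lanes[0], step};
}
```

For the `{0, 1, 2, 3}` example above this yields start 0 and step 1, matching the operands of `index z0.s, #0, #1`.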

jsji and others added 30 commits October 22, 2024 17:39

[ARM] Use proper types for these records. (llvm#113370)

fe480cf

llvm#112904 will add typechecking to submulticlass arguments, and these ones are currently mistyped.

[NFC] [MTE] Remove useless yaml2obj from test (llvm#113374)

2e0506f

We already have the .o, there is no reason to go .o -> YAML -> .o

Revert "[LLVM] Add IRNormalizer Pass" (llvm#113392)

8a12e01

Reverts llvm#68176 Introduced BuildBot failure: llvm#68176 (comment)

[TBAA] Extend pointer TBAA to pointers of non-builtin types. (llvm#11…

4334f31

…0569) Extend the logic added in 123c036 (llvm#76612) to support pointers to non-builtin types by using the mangled name of the canonical type. PR: llvm#110569

[clang-tidy] Fix cppcoreguidelines-pro-type-union-access if memLoc is…

0fbf91a

… invalid (llvm#104540) Fixes llvm#102945.

[clang codegen] avoid to crash when emit init func for global variabl…

bd6c430

…e with flexible array init (llvm#113336) Fixes: llvm#113187 Avoid creating the init function, since clang does not support global variables with flexible array init. Creating it would cause an assertion failure later.

[llvm][OpenMP] Handle complex types in atomic read (llvm#111377)

645e6f1

This patch adds functionality for atomically reading `llvm.struct` types. Fixes: llvm#93441

[clang-format] Use RemoveEmptyLinesInUnwrappedLines in clang-format s… (

b69ac31

llvm#113260) …tyle

[clang-format] Handle C# goto case constructs (llvm#113257)

d005be3

Fixes llvm#113256.

[X86] Update Model value for Arrow Lake. (llvm#113273)

9e3d465

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368

[X86] combineAndNotOrIntoAndNotAnd - don't fold other constant operan…

49ebe32

…ds (llvm#113264) Looks like having a constant in `Z` also caused infinite loops. This fixes llvm#113240.

[TOSA] bug fix infer shape for slice (llvm#108306)

3b9526b

This fixes the inferred output shape of the TOSA slice op for start/size values that are out-of-bound or -1. Added tests to check:

- size = -1
- size is out of bound
- start is out of bound

Signed-off-by: Tai Ly <[email protected]>

[flang][NFC] Fix failing atomic tests

b39760c

Fix ordering of checks in atomic02.f90.

[LV] Regenerate check-lines for some tests.

ddbb382

Revert "[TOSA] bug fix infer shape for slice" (llvm#113413)

8ad8db9

Reverts llvm#108306

[Bazel][SystemZ] Update for llvm#112975

20c5983

[clang][bytecode] Diagnose non-const initialiers in diagnoseUnknownDe…

46ad7ff

…cl (llvm#113276) This is more similar to the diagnostic output of the current interpreter

[AMDGPU] Add a new target for gfx1153 (llvm#113138)

076aac5

[mlir][ods] Verify access to operands in inferReturnTypes (llvm#112574)

076d3e2

The patch adds graceful handling of incorrectly constructed MLIR operations with fewer operands than expected.

[libc++] <experimental/simd> Add unary operators for class simd (llvm…

2c3d7d5

…#104764)

DavidSpickett and others added 30 commits October 23, 2024 09:06

[RISCV][MC] Support imm symbol in parseCSRSystemRegister (llvm#112007)

6eb93d0

Co-authored-by: Alex Richardson <[email protected]>

[MLIR][SPIRV] Update cast from IntN to Bool (llvm#113329)

27158ed

This PR updates the cast to bool from IntN to treat any non-zero value as TRUE. This makes the cast more resilient to non-generic (i.e. "non 1") TRUE values. Signed-off-by: Dmitriy Smirnov <[email protected]>

[CodeGen][NewPM] Port OptimizePHIs to NPM (llvm#113433)

c4c60c0

[LoongArch] Merge base and offset for large offsets (llvm#113277)

b225b15

This PR merges large offsets into the base address loading.

[CLANG][AArch64]Add Neon vectors for mfloat8_t (llvm#99865)

6dad29a

This patch adds these new vector sizes for Neon: mfloat8x16_t and mfloat8x8_t, according to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323

[lldb][CMake] If make isn't found, print a warning but don't error out (

ba19e98

llvm#111531) Bot maintainers should be aware and it became too much of a burden for developers. In particular on Windows, where make.exe won't be found in Path typically.

Revert "[PowerPC] Expand global named register support" (llvm#113457)

a19f05b

Reverts llvm#112603

[PS5][Driver] Query OPT_r/OPT_shared/OPT_static just once (NFC) (…

5560f7e

…llvm#113452)

[libc++][ranges] LWG4016: container-insertable checks do not match wh…

7c72199

…at container-inserter does (llvm#113103) This patch implements LWG4016: container-insertable checks do not match what container-inserter does.

Reapply "[InstCombine] Folding (icmp eq/ne (and X, -P2), INT_MIN)" (l…

294726d

…lvm#111236) The underlying issue with msan was fixed by llvm#113200

[AutoBump] Merge with 8a9921f (Oct 23)

3f5d12c
