Age  Commit message  Author  Files  Lines
2024-02-05[libc++] Fix ambiguity when using std::scoped_allocator constructor (#80261)Rajveer Singh Bharadwaj2-4/+10
Fixes #78754
2024-02-05[libc++] Rename __bit_reference template parameter to avoid conflict (#80661)Dimitry Andric1-5/+5
As of 4d20cfcf4eb08217ed37c4d4c38dc395d7a66d26, `__bit_reference` contains a template `__fill_n` with a bool `_FillValue` parameter. Unfortunately there is a relatively widely used piece of scientific software called NetCDF, which exposes a (C) macro `_FillValue` in its public headers. When building the NetCDF C++ bindings, this quickly leads to compilation errors when the macro interferes with the template in `__bit_reference`. Rename the parameter to `_FillVal` to avoid the conflict.
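The clash described above can be reproduced in a few lines. This is a minimal sketch, not the libc++ code: the `#define` is a stand-in for the macro that NetCDF's public header exposes (its exact expansion is assumed here), and `fill_word` is a hypothetical helper invented for illustration.

```cpp
// Stand-in for a third-party C header that leaks a macro into user code.
// The expansion shown is an assumption made for this sketch.
#define _FillValue "_FillValue"

// With the macro above in scope, a declaration such as
//   template <bool _FillValue> ...
// is rewritten by the preprocessor into
//   template <bool "_FillValue"> ...
// and no longer compiles. A parameter name the macro does not match is safe.
template <bool FillVal>
constexpr unsigned fill_word(unsigned word, unsigned mask) {
  // Set the masked bits when FillVal is true, clear them otherwise.
  return FillVal ? (word | mask) : (word & ~mask);
}

static_assert(fill_word<true>(0x1u, 0x6u) == 0x7u, "masked bits are set");
static_assert(fill_word<false>(0x7u, 0x6u) == 0x1u, "masked bits are cleared");
```

Because the preprocessor runs before templates are parsed, the library cannot defend against such a macro other than by choosing a name that the macro does not match, which is what the rename does.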
2024-02-05[mlir][openacc] Add legalize data pass for compute operation (#80351)Valentin Clement (バレンタイン クレメン)12-20/+302
This patch adds a simple pass to replace the uses inside compute operations. It replaces the `varPtr` values with their corresponding `accPtr` values gathered through the dataClauseOperands. Private and reduction variables are not included in this pass since they will normally be replaced when they are materialized. Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
2024-02-05[clang][Interp] Handle __assume like __builtin_assume.Timm Bäder2-4/+7
2024-02-05[X86] printZeroUpperMove - add support for mask predicated instructionsSimon Pilgrim2-7/+11
Handle masked predicated movss/movsd in addConstantComments now that we can generically handle the destination + mask register. This will more significantly help improve 'fixup constant' comments from #73509.
2024-02-05[X86] printBroadcast - add support for mask predicated instructionsSimon Pilgrim2-49/+61
Handle masked predicated load/broadcasts in addConstantComments now that we can generically handle the destination + mask register. This will more significantly help improve 'fixup constant' comments from #73509.
2024-02-05[X86] printExtend - add support for mask predicated instructionsSimon Pilgrim2-56/+26
Remove handling from EmitAnyX86InstComments and handle all VPMOVSX/VPMOVZX comments in addConstantComments now that we can generically handle the destination + mask register and shuffle mask comment
2024-02-05[X86] Split up getShuffleComment into printShuffleMask and printDstRegisterName helpers. NFC.Simon Pilgrim1-34/+35
This will allow us to easily use printDstRegisterName for other mask predicate destination registers, and printout shuffle masks from other instruction types.
2024-02-05[libc++abi] Replace usage of raw assert by _LIBCXXABI_ASSERT (#80689)Louis Dionne1-2/+2
We strive not to use raw assert(...) anymore in libc++abi in preparation for using the hardening framework.
2024-02-05[mlir][bufferization][NFC] Pass `DeallocationOptions` instead of flags (#80675)Matthias Springer6-26/+33
Pass `DeallocationOptions` instead of `privateFuncDynamicOwnership`. This will make it easier to add new options in the future.
2024-02-05[libc++] Add missing <errno.h> include in threading support headers (#80311)Louis Dionne1-0/+1
This was incorrectly removed when I split up the header.
2024-02-05[libc] implement insque and remque (#80305)Sirui Mu14-2/+393
This PR implements the `insque` and `remque` entrypoint functions.
2024-02-05[analyzer] Support interestingness in ArrayBoundV2 (#78315)NagyDonat3-125/+607
This commit improves alpha.security.ArrayBoundV2 in two connected areas: (1) It calls `markInteresting()` on the symbolic values that are responsible for the out of bounds access. (2) Its index-is-in-bounds assumptions are reported in note tags if they provide information about the value of an interesting symbol. This commit is limited to "display" changes: it introduces new diagnostic pieces (potentially to bugs found by other checkers), but ArrayBoundV2 will make the same assumptions and detect the same bugs before and after this change. As a minor unrelated change, this commit also updates/removes some very old comments which became obsolete due to my previous changes.
2024-02-05[libc++] Add missing conditionals for feature-test macros (#80168)Louis Dionne7-122/+254
We noticed that some feature-test macros were not conditional on configuration flags like _LIBCPP_HAS_NO_FILESYSTEM. As a result, code attempting to use FTMs would not work as intended. This patch adds conditionals for a few feature-test macros, but more issues may exist. rdar://122020466
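To see why an unconditional feature-test macro is harmful, consider how user code typically consumes one. The sketch below is illustrative only: `has_std_filesystem` is a hypothetical name, and the pattern assumes the standard library defines `__cpp_lib_filesystem` only when `<filesystem>` is actually usable, which is the property this patch restores for restricted configurations.

```cpp
// Pull in the library's feature-test macros when <version> is available.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// User code trusts the macro: if it is defined, <filesystem> must work.
#if defined(__cpp_lib_filesystem)
#  include <filesystem>
constexpr bool has_std_filesystem = true;
#else
// Fallback path, taken when the library reports no filesystem support.
constexpr bool has_std_filesystem = false;
#endif
```

If the macro were defined while the library was built without filesystem support, the first branch would be selected and the code would fail to compile or link instead of falling back.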
2024-02-05[analyzer] Model Microsoft "__assume" in the same way as clang "__builtin_assume"Loïc Joly2-1/+20
2024-02-05[Libomptarget] Remove unused 'SupportsEmptyImages' API function (#80316)Joseph Huber5-17/+1
Summary: This function always returns false in the current implementation and is not even considered required. Just remove it; if someone needs it in the future they can add it back in. This is done to simplify the interface prior to other changes.
2024-02-05[mlir][EmitC] Add support for external functions (#80547)Marius Brehler8-26/+50
This adds a conversion from an externally defined `func.func`, a `func.func` without function body, to an `emitc.func` with an `extern` specifier.
2024-02-05[flang] Improve alias analysis to be precise for box and box.base_addr (#80335)Razvan Lupusoru2-1/+41
After PR#68727 the source for both the fir.box_addr and a box became the same. Thus the detection that only one of the sources was direct and the special logic around it was being skipped. As a result, the test included would show a "MayAlias" result instead of a "NoAlias" result.
2024-02-05[libc] Refactor _build_gpu_objects cmake function. (#80631)lntue8-96/+129
2024-02-05[VPlan] Implement type inference for ICmp.Florian Hahn2-0/+37
This fixes a crash in the attached test case due to missing type inference for ICmp VPInstructions.
2024-02-05[clang][Interp] Fix MemberExpr initializing an existing value (#79973)Timm Baeder3-2/+13
This is similar to c1ad363e6eba308fa94c47374ee98b3c79693a35, but with the additional twist that initializing an existing value from a `MemberExpr` was not working correctly.
2024-02-05[Clang] Fix crash when recovering from an invalid pack indexing type. (#80652)cor3ntin2-0/+21
If the pattern of a pack indexing type did not contain a pack, we would still construct a pack indexing type (to improve error messages) but we would fail to mark the type as dependent, leading to infinite recursion when trying to extract a canonical type.
2024-02-05[InstCombine] Fold ((cst << x) & 1) --> x == 0 when cst is odd (#79772)elhewaty2-3/+90
Fold ((cst << x) & 1) to zext(x == 0) when cst is odd. Fixes: https://github.com/llvm/llvm-project/issues/73384 Alive2: https://alive2.llvm.org/ce/z/5RbaK6
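The identity behind this fold can be checked by brute force for a fixed width. The following is a sketch under stated assumptions, not the InstCombine implementation: it fixes the type at 32 bits, and `original`, `folded`, and `fold_holds` are hypothetical helper names. The Alive2 link above remains the authoritative proof.

```cpp
#include <cstdint>

// Left-hand side of the fold: (cst << x) & 1.
constexpr std::uint32_t original(std::uint32_t cst, unsigned x) {
  return (cst << x) & 1u;
}

// Right-hand side of the fold: zext(x == 0).
constexpr std::uint32_t folded(unsigned x) {
  return x == 0 ? 1u : 0u;
}

// Compare both sides for every in-range shift amount 0..31.
constexpr bool fold_holds(std::uint32_t cst, unsigned x = 0) {
  return x == 32 ? true
                 : (original(cst, x) == folded(x) && fold_holds(cst, x + 1));
}

static_assert(fold_holds(1u), "odd constant: fold is valid");
static_assert(fold_holds(12345u), "odd constant: fold is valid");
static_assert(!fold_holds(2u), "even constant: fold is not valid");
```

The reasoning is that an odd constant has its lowest bit set, any shift by at least one moves a zero into that position, and a shift by zero leaves it set. For an even constant the lowest bit is already zero at `x == 0`, so the two sides disagree there.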
2024-02-05[lldb][Docs] Remove unnecessary colon in titleDavid Spickett1-2/+2
2024-02-05[Clang] Make AMDGPU OpenCL tests require AMD registered targetJoseph Huber2-0/+4
Summary: These tests likely always failed, but the failure was hidden by the expected return value. Simply make them require AMDGPU as a registered target so they don't fail on other machines.
2024-02-05[clang][Interp] Support zero init for complex types (#79728)Timm Baeder2-2/+39
Initialize both elements to 0.
2024-02-05[SPIR-V] Include SPIRV-Tools tests in CI (#80479)Natalie Chouinard1-1/+1
2024-02-05[clang][Interp] Reject bitcasts to atomic typesTimm Bäder5-2/+34
The current interpreter does this, so follow suit to match its diagnostics.
2024-02-05[AMDGPU] Allow w64 ballot to be used on w32 targets (#80183)Joseph Huber2-4/+4
Summary: Currently we cannot compile `__builtin_amdgcn_ballot_w64` on non-wave64 targets even though it is valid. This is relevant for making library code that can handle both without needing to check the wavefront size. This patch relaxes the semantic check for w64 so it can be used normally.
2024-02-05[Offload] Fix entry global names on NVPTX targetJoseph Huber1-4/+10
Summary: The PTX language rejects globals with `.` in the name. We need to change the global name if we are targeting NVPTX to prevent the toolchain from complaining.
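A renaming of this kind can be sketched as a character substitution. This is an illustration only, not the code from the patch: the helper names are hypothetical, and the choice of `$` as the replacement character is an assumption of the sketch.

```cpp
#include <algorithm>
#include <string>

// Map '.' to '$' (assumed replacement); leave every other character untouched.
constexpr char ptx_safe_char(char c) { return c == '.' ? '$' : c; }

// Return a copy of the symbol name with every '.' replaced.
inline std::string ptx_safe_name(std::string name) {
  std::transform(name.begin(), name.end(), name.begin(), ptx_safe_char);
  return name;
}
```

With a hypothetical input such as `.offloading.entry.foo`, `ptx_safe_name` returns `$offloading$entry$foo`. A real implementation would apply the renaming only when targeting NVPTX, so that names on other targets stay unchanged.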
2024-02-05[FPEnv][AMDGPU] Correct strictfp tests.Kevin P. Neal6-30/+32
Correct AMDGPU strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics These tests needed the strictfp attribute added to function calls and some declarations. Some of the tests now pass with D146845; others get farther along but still fail with D146845. The tests revealed that further work is required, mostly in AMDGPU atomics, to get the tests passing. Since I was here anyway I removed the strictfp attribute from some constrained intrinsic declarations. They have this attribute by default. Test changes verified with D146845.
2024-02-05[X86] getShuffleComment - use MI description to determine AVX512 masked predicates instead of src index offsets.Simon Pilgrim1-9/+4
2024-02-05[libc++] Add missing include of <string.h> in POSIX fallbacks for localeLouis Dionne1-0/+1
2024-02-05[RDF] Skip over NoRegister. NFCI. (#80672)Jay Foad1-1/+1
This just avoids useless work of adding NoRegister to BaseSet, for consistency with other places that iterate over all physical registers.
2024-02-05[mlir][ArmSME] Add rewrites to swap extract of extend (#80407)Cullen Rhodes2-1/+197
In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence:
  %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
  %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
  %lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32>
  ... (same for rhs)
  %2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32>
  // x4 chained by accumulator
This chain of 4 outer products can be fused into a single 4-way widening variant, but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts. This patch fixes this with two rewrites that swap extract(extend) into extend(extract). Related to #78975, #79288.
2024-02-05[mlir][ArmSME][nfc] Fix docs for 2-way opsCullen Rhodes1-5/+5
The "Refer to" and table shouldn't be in the example code sequence.
2024-02-05[libc] tiny fix for doc (#80512)Schrodinger ZHU Yifan3-3/+3
2024-02-05[libc++] fix `counting_semaphore` lost wakeups (#79265)Hui3-20/+108
Fixes #77659. Fixes #46357. Picked up from https://reviews.llvm.org/D114119
2024-02-05[flang]Add support for -moutline-atomics and -mno-outline-atomics (#78755)Mats Petersson9-22/+73
This adds support for the target feature that outlines atomic operations (calling the runtime library instead).
2024-02-05[X86] addConstantComments - split VPERMILPS/VPERMILPD handling to reduce repeated switch cases etc. NFC.Simon Pilgrim1-33/+12
2024-02-05[X86] Add common getSrcIdx helper to determine source index after AVX512 masked predicates. NFC.Simon Pilgrim1-20/+14
2024-02-05[RISCV] Add tests for reduce.fmaximum/fminimum. NFC (#80553)Shih-Po Hung3-0/+2810
This adds test coverage for the crash reported in #80340.
2024-02-05AMDGPU: Set max supported div/rem size to 64 (#80669)Matt Arsenault3-6/+5469
This enables IR expansion for i128 divisions. The vector case is still broken because ExpandLargeDivRem doesn't try to handle them. Fixes: SWDEV-426193
2024-02-05[AMDGPU][PromoteAlloca] Support memsets to ptr allocas (#80678)Pierre van Houtryve2-4/+66
Fixes #80366
2024-02-05[CodeGen] Convert tests to opaque pointers (NFC)Nikita Popov457-12419/+12407
2024-02-05AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#80003)Petar Avramovic21-259/+827
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper. Use machine uniformity analysis to find divergent i1 phis and select them as lane mask phis in the same way SILowerI1Copies selects VReg_1 phis. Note that divergent i1 phis include phis created by LCSSA, and all cases of uses outside of a cycle are actually covered by "lowering LCSSA phis". GlobalISel lane masks are registers with sgpr register class and S1 LLT. TODO: The general goal is that instructions created in this pass are fully instruction-selected so that selection of lane mask phis is not split across multiple passes. Patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2024-02-05[AMDGPU] Insert spill codes for the SGPRs used for EXEC copy (#79428)Christudasan Devadasan4-18/+135
The SGPR registers used for preserving the EXEC mask while lowering the whole-wave register spills and copies should be preserved at the prolog and epilog if they are in the CSR range. This does not happen when only a wwm-copy is lowered and there are no wwm-spills. This patch addresses that problem.
2024-02-05[ARM] Convert tests to opaque pointers (NFC)Nikita Popov112-5129/+5129
2024-02-05[AVR] Convert tests to opaque pointers (NFC)Nikita Popov53-583/+573
2024-02-05[libc] Fix generated float128 header for aarch64 target. (#78017)lntue10-28/+56