Age Commit message Author Files Lines
2024-04-17Revert "[RISCV] Support Zama16b1p0 (#88474)"revert-88474-zama16bZijunZhaoCCK7-25/+1
This reverts commit b090569685699abe4a8031ad442a0f81e373146b.
2024-04-17[AArch64] Update latencies for Cortex-A510 scheduling model (#87293)Usman Nadeem209-6481/+6488
Updated according to the Software Optimization Guide for Arm® Cortex®‑A510 Core Revision: r1p3 Issue 6.0.
2024-04-17[RISCV] Add coverage for strength reduction of mul by small negative immediatesPhilip Reames2-0/+186
2024-04-17[CostModel][X86] Recognise vector rotation by uniform constant patternsSimon Pilgrim11-153/+204
Adds suitable costs for AVX512 targets (we still rely on default expansion for AVX2 and earlier)
2024-04-17[github] Add ClangIR to new-prs-labeler.yml (#86088)Nathan Lanza1-0/+6
2024-04-17[InstCombine] Add phase ordering test for #88239. NFCCraig Topper1-0/+55
2024-04-17[InstCombine] Add test case for turning sub into xor using dominating condition. NFCCraig Topper1-0/+23
I plan to disable using dominating conditions for turning sub into xor, but first we need a test that demonstrates it currently happens.
2024-04-17[GlobalISel][AArch64] Add LLRINT support (#88702)David Green18-104/+251
This hooks up G_INTRINSIC_LLRINT instructions, very similar to the lrint nodes that already exist. On AArch64 they are treated the same as lrint with the default return types.
2024-04-17[libc++][pstl] Promote CPU backends to top-level backends (#88968)Louis Dionne40-169/+154
This patch removes the two-level backend dispatching mechanism we had in the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU backend, we now only select a top-level PSTL backend. This greatly simplifies the PSTL configuration layer. While this patch technically removes some flexibility from the PSTL configuration mechanism because CPU backends are not considered separately, it opens the door to a much more powerful configuration mechanism based on chained backends in a follow-up patch. This is a step towards overhauling the PSTL dispatching mechanism.
2024-04-17[InstCombine] Use `auto *` instead of `auto` in `visitSIToFP`; NFCNoah Goldstein1-1/+1
2024-04-17[CostModel][X86] Add basic GFNI target test coverage for shift/rotate costsSimon Pilgrim23-0/+2737
2024-04-17[libc] set cmake dependencies for condattr test (#89103)Nick Desaulniers2-1/+8
The entrypoints are not yet exposed on non-x86. Express this dependency to unbreak post submit. Fixes #88987
2024-04-17[SLP]Attempt to vectorize long stores, if short one failed.Alexey Bataev2-72/+80
We can try to vectorize long store sequences if short ones were unsuccessful because vectorization was not profitable. It should not increase compile time significantly (stores are sorted already, complexity is n x log n), but it vectorizes extra code.

Metric: size..text

Program                                                                  size..text
                                                                         results     results0    diff
test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test    1088012.00  1088236.00   0.0%
test-suite :: SingleSource/UnitTests/matrix-types-spec.test               480396.00   480476.00   0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test       664613.00   664661.00   0.0%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test      664613.00   664661.00   0.0%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test   2041105.00  2040961.00  -0.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test              836563.00   836387.00  -0.0%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test            1035100.00  1032140.00  -0.3%

In all benchmarks extra code gets vectorized.

Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/88563
2024-04-17[MLIR] Update doc comment in ViewLikeInterface.td (NFC) (#89074)Abdul Raheem1-1/+1
Signed-off: Abdul Raheem Beigh <abdulraheembeigh@gmail.com>
2024-04-17[bazel][libc] Add missing dep after b854a2323337be2633b1135f590678a17e9d1adeJorge Gorbe Moya1-3/+4
2024-04-17[RISCV] Fix typo in RISCVScheduleV.td that was introduced in 60a1158Michael Maitland1-1/+1
2024-04-17[bazel][mlir] Add missing dep after 4f88c2311130791cf69da34b743b1b3ba7584a7bJorge Gorbe Moya1-0/+1
2024-04-17[mlir][tosa] Fix tosa.Resize-to-linalg lowering (#88514)fabrizio-indirli2-73/+60
2024-04-17[libc][POSIX][pthreads] implement pthread_condattr_t functions (#88987)Nick Desaulniers22-8/+545
Implement:
- pthread_condattr_destroy
- pthread_condattr_getclock
- pthread_condattr_getpshared
- pthread_condattr_init
- pthread_condattr_setclock
- pthread_condattr_setpshared

Fixes: #88581
2024-04-17[FMV] Remove useless features according the latest ACLE spec. (#88965)Alexandros Lamprineas15-377/+287
As explained in https://github.com/ARM-software/acle/pull/315 we are deprecating features which aren't adding any value. These are: sha1, pmull, dit, dgh, ebf16, sve-bf16, sve-ebf16, sve-i8mm, sve2-pmull128, memtag2, memtag3, ssbs2, bti, ls64_v, ls64_accdata
2024-04-17[flang][cuda] Lower ALLOCATE for device variable (#88980)Valentin Clement (バレンタイン クレメン)2-10/+154
Replace the runtime call to `AllocatableAllocate` for CUDA device variables with the newly added `fir.cuda_allocate` operation.
2024-04-17[flang][cuda] Update memory effect on fir.cuda_allocate op (#88930)Valentin Clement (バレンタイン クレメン)3-3/+3
Add MemRead effect on the box operand as the descriptor might be read when performing the allocation of the data. Also update the expected type of the box operand to be a reference. Check in the verifier that this is a reference to a box or class type. This addresses the comment made post commit on #88586
2024-04-17[compiler-rt] Use __atomic builtins whenever possibleAlexander Richardson8-370/+56
The code in this file dates back to 2012 when Clang's support for atomic builtins was still quite limited. The bugs referenced in the comment at the top of the file have long been fixed and using the compiler builtins directly should now generate slightly better code. Additionally, this allows using the atomic builtin header for platforms where the __sync_builtins are lacking (e.g. Arm Morello). This change does not introduce any code generation changes for __tsan_read*/__tsan_write* or __tsan_func_{entry,exit} on x86, which indicates the previously noted compiler issues have been fixed. We also have to touch the non-clang codepaths here since the only way we can make this work easily is by making the memory_order enum match the compiler-provided macros, so we have to update the debug checks that assumed the enum was always a bitflag. The one downside of this change is that 32-bit MIPS now definitely requires libatomic (but that may already have been needed for RMW ops). Reviewed By: dvyukov Pull Request: https://github.com/llvm/llvm-project/pull/84439
2024-04-17[Clang][Parse] Diagnose requires expressions with explicit object parameters (#88974)Krystian Stasiowski4-1/+25
Clang currently allows the following:
```
auto x = requires (this int) { true; };
```
This patch addresses that.
2024-04-17[libc][c23][fenv] Implement fetestexceptflag (#87828)Robin Caloudis19-10/+110
Provide C23 `fetestexceptflag` function according to 7.6.4.6 in the latest [revision of the C standard](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf) from 2023-04-02. Closes https://github.com/llvm/llvm-project/issues/87565.
2024-04-17Revert "[Clang][AArch64] Warn when calling non/streaming about vector size difference (#79842)"Dinar Temirbulatov6-184/+5
This reverts commit 4e85e1ffcaf161736e27a24c291c1177be865976
2024-04-17[clang][NFC] Refactor `Sema::RedeclarationKind`Vlad Serebrennikov13-88/+118
This patch converts the enum into a scoped enum and moves it into its own header for the time being. Its definition is needed in `Sema.h`, and is going to be needed in the upcoming `SemaObjC.h`. `Lookup.h` can't hold it, because it includes `Sema.h`.
2024-04-17[libc++][chrono] Improves date formatting. (#86127)Mark de Wever4-101/+7
The formatting of years has been done manually since the results of %Y outside the "typical" range may produce unexpected values. The same applies to %F which is identical to %Y-%m-%d. None of these conversion specifiers is affected by the locale used. So it's trivial to manually handle this case. This removes several platform specific ifdefs from the tests.
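Because `%F` is just `%Y-%m-%d` and none of the three fields depends on the locale, the manual path is plain integer formatting. The sketch below is not the libc++ implementation: `format_year` and `format_f` are hypothetical helpers, and the choice to pad the year to at least four digits with a leading minus sign for negative years is an assumption made for illustration.

```python
def format_year(year: int) -> str:
    """Format a year as at least four digits, with '-' for negative years."""
    sign = "-" if year < 0 else ""
    return f"{sign}{abs(year):04d}"


def format_f(year: int, month: int, day: int) -> str:
    """Locale-independent equivalent of %Y-%m-%d."""
    return f"{format_year(year)}-{month:02d}-{day:02d}"
```

For example, `format_f(2024, 4, 17)` gives `2024-04-17`, and a year outside the usual four-digit range such as 12345 simply prints with more digits.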
2024-04-17[VPlan] Check for VPWidenLoadRecipe directly in truncateToMinBW. (NFCI).Florian Hahn1-3/+1
After a separate recipe has been introduced for wide loads in a9bafe91dd0, we can directly check for load recipes in the early bail-out and remove the redundant bail out for stores.
2024-04-17[VectorCombine] Remove single quotes from "-passes=vector-combine"Simon Pilgrim3-3/+3
These confuse the update_test_checks.py script when run by DOS cmd.exe
2024-04-17[CostModel][X86] Update BITREVERSE costs for GFNI targetsSimon Pilgrim5-277/+284
Inspired by the recent patches by @shamithoke - we have real scheduler model numbers for GFNI instructions now, allowing us to calculate an upper-bound cost table instead of deriving the costs analytically.
2024-04-17[lldb] XFAIL TestDetachResumes on windowsPavel Labath1-0/+1
2024-04-17[NFC] Clean dead code in ParsedAttr.h (#89064)yronglin1-6/+1
Signed-off-by: yronglin <yronglin777@gmail.com>
2024-04-17[libc] Replace mentions of `LIBC_FULLBUILD` with `LLVM_LIBC_FULL_BUILD` in 'examples/' (#88657)Rajveer Singh Bharadwaj2-3/+3
Resolves #88328
2024-04-17[mlir][py] Add NVGPU's `TensorMapDescriptorType` in py bindings (#88855)Guray Ozen6-0/+101
This PR adds the NVGPU dialect's `TensorMapDescriptorType` to the Python bindings. This is a follow-up from [this PR](https://github.com/llvm/llvm-project/pull/87153#discussion_r1546193095)
2024-04-17[AMDGPU] Fix predicates for BUFFER_ATOMIC_FMIN/FMAX patterns (#89066)Jay Foad2-1/+73
Use OtherPredicates to avoid interfering with other uses of SubtargetPredicate for GFX12.
2024-04-17[VPlan] Factor out helper to recursively collect all users (NFCI).Florian Hahn2-16/+19
Factor out logic to collect all users recursively to be re-used in https://github.com/llvm/llvm-project/pull/87816.
2024-04-17[C99] Remove WG14 N522 from the C status pageAaron Ballman1-5/+0
This paper is about type compatibility rules that changed in C99, but this is only applicable across translation units and so there's nothing for us to test. The specific change was that C89 allowed different tag types (e.g., struct and union) to be compatible and C99 tightened that restriction. This is a case where the user gets whatever they get if they link two TUs with incompatible tag types.
2024-04-17[RISCV] Explicitly bail if something modifies VL/VTYPE in doLocalPostpassLuke Lau2-1/+5
If an instruction between MI and NextMI uses VL or VTYPE we demand the respective fields so as to not clobber them at their uses. But we don't consider if something might modify VL or VTYPE, and will happily coalesce two vsetvlis when we need to preserve them. This fixes this by skipping to the next vsetvli. Demanding the fields isn't enough, as we need to preserve the VL and VTYPE values even if no fields are demanded. In practice this doesn't happen, presumably due to there not being any instructions that write to VL or VTYPE without reading them. But I noticed this whilst working on a separate patch and split it out.
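The scan described above can be modelled in a few lines. This is a toy sketch, not the pass itself: `Instr`, `scan_between`, and the instruction names are hypothetical, and registers are represented as plain strings.

```python
from dataclasses import dataclass

# The vector state that two neighbouring vsetvlis must not clobber.
VSTATE = frozenset({"vl", "vtype"})


@dataclass
class Instr:
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def scan_between(instrs):
    """Walk the instructions between two vsetvlis.

    Returns (demanded, clobbered). `demanded` collects the parts of the
    vector state that are read; `clobbered` is True as soon as something
    writes VL or VTYPE, in which case coalescing must be skipped even if
    nothing was demanded.
    """
    demanded = set()
    for ins in instrs:
        if ins.writes & VSTATE:
            return demanded, True  # explicit bail-out
        demanded |= ins.reads & VSTATE
    return demanded, False
```

A caller would coalesce only when `clobbered` is False, and would then still have to honour whatever ended up in `demanded`.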
2024-04-17[PowerPC] 32-bit large code-model support for toc-data (#85129)Zaara Syeda7-36/+144
This patch adds the pseudo op ADDItocL for 32-bit large code-model support for toc-data.
2024-04-17[RISCV] Add test for doLocalPostpass issue not checking if VL was modified. NFCLuke Lau1-0/+25
2024-04-17[LLVM][CodeGen] Fix register lane liveness tracking in RegisterPressure (#88892)Krzysztof Parzyszek1-18/+21
Re-enable an old assertion in `decreaseSetPressure`.
2024-04-17[mlir] expose transform dialect symbol merge to python (#87690)Oleksandr "Alex" Zinenko5-2/+120
This functionality is available in C++, make it available in Python directly to operate on transform modules.
2024-04-17[Inline] Regenerate inline-switch-default-2.ll (NFC)DianQK1-176/+0
2024-04-17[AMDGPU][Docs] Fix broken link to HRF memory model reference (#88696)Fabian Ritter1-1/+1
The link to the Heterogeneous-race-free Memory Models ASPLOS'14 paper by Hower et al. pointed to a bogus website, probably because the domain ownership has changed. This patch updates it to a version hosted on research.cs.wisc.edu.
2024-04-17[VP] Correct lowering of predicated fma and faddmul to avoid strictfp. (#85272)Kevin P. Neal4-11/+203
Correct missing cases in a switch that result in @llvm.vp.fma.v4f32 getting lowered to a constrained fma intrinsic. Vector predicated lowering to constrained intrinsics is not supported currently, and there's no consensus on the path forward. We certainly shouldn't be introducing constrained intrinsics into a function that isn't strictfp. Problem found with D146845.
2024-04-17[lldb] Fix evaluation of expressions with static initializers (#89063)Pavel Labath1-2/+6
After 281d71604f418eb952e967d9dc4b26241b7f96a, llvm generates 32-bit relocations, which overflow when we load these objects into high memory. Interestingly, setting the code model to "large" does not help here (perhaps it is the default?). I'm not completely sure that this is the right thing to do, but it doesn't seem to cause any ill effects. I'll follow up with the author of that patch about the expected behavior here.
2024-04-17[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)Quentin Dian2-0/+280
Fixes #78578. Duplicating a BB which has both multiple predecessors and multiple successors will result in a complex CFG and may also create a huge number of PHI nodes. See https://github.com/llvm/llvm-project/issues/78578#issuecomment-1962363580 for a detailed description of the limit.
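A rough way to see the growth: if a block with P predecessors and S successors is copied into every predecessor, each copy keeps all S outgoing edges, so the S edges into the successors become P x S, and each PHI in a successor gains an incoming value per copy. The sketch below is a simplified model with hypothetical names; in particular, the rule "skip only when both counts exceed their limits" and the threshold parameters are assumptions for illustration, not values taken from the patch.

```python
def edges_after_duplication(num_preds: int, num_succs: int) -> int:
    """Edges into the successors once every predecessor has its own copy."""
    return num_preds * num_succs


def should_consider(num_preds: int, num_succs: int,
                    max_preds: int, max_succs: int) -> bool:
    """Skip blocks that exceed both the predecessor and successor limits."""
    return not (num_preds > max_preds and num_succs > max_succs)
```

For instance, a block with 20 predecessors and 20 successors would go from 20 successor edges to 400 after duplication, which is the kind of case such a limit is meant to exclude.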
2024-04-17[RISCV] Fix clang-tidy warning about else after return. NFCLuke Lau1-1/+3
2024-04-17[mlir] transform.apply_patterns support more config options (#88484)Oleksandr "Alex" Zinenko3-1/+41
The greedy rewrite driver has options to control the number of rewrites applied. Expose those via the corresponding transform op.
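The effect of such a cap can be shown with a toy string rewriter. This is only an analogy, not MLIR code: `greedy_rewrite` and its `max_num_rewrites` parameter are hypothetical names chosen for this sketch, and rules are plain (old, new) substring pairs.

```python
def greedy_rewrite(expr, rules, max_num_rewrites=None):
    """Apply rules until nothing changes or the rewrite budget runs out.

    Returns the final string and the number of rewrites performed.
    """
    rewrites = 0
    changed = True
    while changed:
        changed = False
        for old, new in rules:
            # Stop early once the configured budget is used up.
            if max_num_rewrites is not None and rewrites >= max_num_rewrites:
                return expr, rewrites
            if old in expr:
                expr = expr.replace(old, new, 1)
                rewrites += 1
                changed = True
    return expr, rewrites
```

Without a budget, `greedy_rewrite("aaaa", [("aa", "a")])` runs to the fixed point `"a"` in three rewrites; with `max_num_rewrites=2` it stops at `"aa"`.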