rocket-tools/riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message	Author	Files	Lines
2024-04-17	Revert "[RISCV] Support Zama16b1p0 (#88474)" [revert-88474-zama16b]	ZijunZhaoCCK	7	-25/+1
	This reverts commit b090569685699abe4a8031ad442a0f81e373146b.
2024-04-17	[AArch64] Update latencies for Cortex-A510 scheduling model (#87293)	Usman Nadeem	209	-6481/+6488
	Updated according to the Software Optimization Guide for Arm® Cortex®‑A510 Core Revision: r1p3 Issue 6.0.
2024-04-17	[RISCV] Add coverage for strength reduction of mul by small negative immediates	Philip Reames	2	-0/+186
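As a rough illustration of what "strength reduction of mul by small negative immediates" means, the sketch below rewrites a few such multiplications as shifts, adds and subtracts. The function names are hypothetical, and the exact instruction sequences the RISC-V backend selects are not stated in this log; these are only the arithmetic identities such a transform relies on. Unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cstdint>

// x * -3 == -((x << 1) + x)
constexpr std::uint64_t mul_neg3(std::uint64_t x) { return 0 - ((x << 1) + x); }

// x * -5 == -((x << 2) + x)
constexpr std::uint64_t mul_neg5(std::uint64_t x) { return 0 - ((x << 2) + x); }

// x * -7 == x - (x << 3)
constexpr std::uint64_t mul_neg7(std::uint64_t x) { return x - (x << 3); }

// Compile-time checks against the plain multiplication (modulo 2^64).
static_assert(mul_neg3(10) == 10 * static_cast<std::uint64_t>(-3), "x * -3");
static_assert(mul_neg5(10) == 10 * static_cast<std::uint64_t>(-5), "x * -5");
static_assert(mul_neg7(10) == 10 * static_cast<std::uint64_t>(-7), "x * -7");
```

Each rewritten form needs at most one shift and two add/sub/negate operations, which is the kind of trade a backend weighs against the latency of a multiply.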

2024-04-17	[CostModel][X86] Recognise vector rotation by uniform constant patterns	Simon Pilgrim	11	-153/+204
	Adds suitable costs for AVX512 targets (we still rely on default expansion for AVX2 and earlier)
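For readers unfamiliar with the pattern, a "vector rotation by uniform constant" is a rotate in which every lane uses the same compile-time amount. The sketch below shows the scalar shift/or idiom applied lane by lane; `rotl32` and `rotl_v4` are hypothetical names, and the four-lane `std::array` merely stands in for a vector type.

```cpp
#include <array>
#include <cstdint>

// Rotate one 32-bit lane left by a compile-time constant amount.
template <unsigned Amt>
constexpr std::uint32_t rotl32(std::uint32_t x) {
  static_assert(Amt > 0 && Amt < 32, "rotate amount must be in (0, 32)");
  return (x << Amt) | (x >> (32 - Amt));
}

// Apply the same constant rotate to every lane of a 4 x i32 "vector".
template <unsigned Amt>
inline std::array<std::uint32_t, 4> rotl_v4(std::array<std::uint32_t, 4> v) {
  for (auto &lane : v)
    lane = rotl32<Amt>(lane);
  return v;
}
```

Because the amount is identical across lanes and known at compile time, a target with a native vector rotate can cost the whole loop body as a single operation rather than as two shifts and an or.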
2024-04-17	[github] Add ClangIR to new-prs-labeler.yml (#86088)	Nathan Lanza	1	-0/+6

2024-04-17	[InstCombine] Add phase ordering test for #88239. NFC	Craig Topper	1	-0/+55

2024-04-17	[InstCombine] Add test case for turning sub into xor using dominating condition. NFC	Craig Topper	1	-0/+23
	I plan to disable using dominating conditions for turning sub into xor, but first we need a test that demonstrates it currently happens.
2024-04-17	[GlobalISel][AArch64] Add LLRINT support (#88702)	David Green	18	-104/+251
	This hooks up G_INTRINSIC_LLRINT instructions, very similar to the lrint nodes that already exist. On AArch64 they are treated the same as lrint with the default return types.
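As a reminder of the source-level semantics involved, the sketch below calls the C library function `llrint`, which rounds to an integer using the current rounding mode and returns `long long`. The wrapper name is hypothetical, and the comments assume the default round-to-nearest, ties-to-even mode.

```cpp
#include <cmath>

// Round to the nearest integer in the current rounding mode and return the
// result as long long. With the default mode, ties go to the even neighbour.
inline long long round_to_ll(double x) { return std::llrint(x); }
```

Under the default mode, `round_to_ll(2.5)` yields 2 and `round_to_ll(3.5)` yields 4. This differs from `std::llround`, which rounds ties away from zero.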
2024-04-17	[libc++][pstl] Promote CPU backends to top-level backends (#88968)	Louis Dionne	40	-169/+154
	This patch removes the two-level backend dispatching mechanism we had in the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU backend, we now only select a top-level PSTL backend. This greatly simplifies the PSTL configuration layer. While this patch technically removes some flexibility from the PSTL configuration mechanism because CPU backends are not considered separately, it opens the door to a much more powerful configuration mechanism based on chained backends in a follow-up patch. This is a step towards overhauling the PSTL dispatching mechanism.
2024-04-17	[InstCombine] Use `auto *` instead of `auto` in `visitSIToFP`; NFC	Noah Goldstein	1	-1/+1

2024-04-17	[CostModel][X86] Add basic GFNI target test coverage for shift/rotate costs	Simon Pilgrim	23	-0/+2737

2024-04-17	[libc] set cmake dependencies for condattr test (#89103)	Nick Desaulniers	2	-1/+8
	The entrypoints are not yet exposed on non-x86. Express this dependency to unbreak post submit. Fixes #88987
2024-04-17	[SLP]Attempt to vectorize long stores, if short one failed.	Alexey Bataev	2	-72/+80
	We can try to vectorize long store sequences, if short ones were unsuccessful because of the non-profitable vectorization. It should not increase compile time significantly (stores are sorted already, complexity is n x log n), but vectorize extra code.
	Metric: size..text
	Program                                                                 size..text
	                                                                           results    results0   diff
	test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test   1088012.00  1088236.00   0.0%
	test-suite :: SingleSource/UnitTests/matrix-types-spec.test              480396.00   480476.00   0.0%
	test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test      664613.00   664661.00   0.0%
	test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test     664613.00   664661.00   0.0%
	test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test  2041105.00  2040961.00  -0.0%
	test-suite :: MultiSource/Applications/JM/lencod/lencod.test             836563.00   836387.00  -0.0%
	test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test           1035100.00  1032140.00  -0.3%
	In all benchmarks extra code gets vectorized
	Reviewers: RKSimon
	Reviewed By: RKSimon
	Pull Request: https://github.com/llvm/llvm-project/pull/88563
2024-04-17	[MLIR] Update doc comment in ViewLikeInterface.td (NFC) (#89074)	Abdul Raheem	1	-1/+1
	Signed-off: Abdul Raheem Beigh <abdulraheembeigh@gmail.com>
2024-04-17	[bazel][libc] Add missing dep after b854a2323337be2633b1135f590678a17e9d1ade	Jorge Gorbe Moya	1	-3/+4

2024-04-17	[RISCV] Fix typo in RISCVScheduleV.td that was introduced in 60a1158	Michael Maitland	1	-1/+1

2024-04-17	[bazel][mlir] Add missing dep after 4f88c2311130791cf69da34b743b1b3ba7584a7b	Jorge Gorbe Moya	1	-0/+1

2024-04-17	[mlir][tosa] Fix tosa.Resize-to-linalg lowering (#88514)	fabrizio-indirli	2	-73/+60

2024-04-17	[libc][POSIX][pthreads] implement pthread_condattr_t functions (#88987)	Nick Desaulniers	22	-8/+545
	Implement:
	- pthread_condattr_destroy
	- pthread_condattr_getclock
	- pthread_condattr_getpshared
	- pthread_condattr_init
	- pthread_condattr_setclock
	- pthread_condattr_setpshared
	Fixes: #88581
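The six functions listed above are the standard POSIX condition-variable attribute API. A minimal usage sketch follows, assuming a Linux-like system that provides `CLOCK_MONOTONIC`; the struct and helper names are hypothetical and not taken from the patch.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical result type: the values read back from the attribute object.
struct CondAttrSettings {
  bool ok;
  clockid_t clock;
  int pshared;
};

// Initialise an attribute object, set its clock and process-shared flag,
// read both back, then destroy the object.
inline CondAttrSettings configure_condattr() {
  CondAttrSettings out{false, CLOCK_REALTIME, PTHREAD_PROCESS_PRIVATE};
  pthread_condattr_t attr;
  if (pthread_condattr_init(&attr) != 0)
    return out;
  int rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
  if (rc == 0)
    rc = pthread_condattr_setpshared(&attr, PTHREAD_PROCESS_PRIVATE);
  if (rc == 0)
    rc = pthread_condattr_getclock(&attr, &out.clock);
  if (rc == 0)
    rc = pthread_condattr_getpshared(&attr, &out.pshared);
  pthread_condattr_destroy(&attr);
  out.ok = (rc == 0);
  return out;
}
```

In real code the attribute object would be passed to `pthread_cond_init` before being destroyed, so that timed waits on the condition variable are measured against the monotonic clock.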
2024-04-17	[FMV] Remove useless features according to the latest ACLE spec. (#88965)	Alexandros Lamprineas	15	-377/+287
	As explained in https://github.com/ARM-software/acle/pull/315 we are deprecating features which aren't adding any value. These are: sha1, pmull, dit, dgh, ebf16, sve-bf16, sve-ebf16, sve-i8mm, sve2-pmull128, memtag2, memtag3, ssbs2, bti, ls64_v, ls64_accdata
2024-04-17	[flang][cuda] Lower ALLOCATE for device variable (#88980)	Valentin Clement (バレンタインクレメン)	2	-10/+154
	Replace the runtime call to `AllocatableAllocate` for CUDA device variables with the newly added `fir.cuda_allocate` operation.
2024-04-17	[flang][cuda] Update memory effect on fir.cuda_allocate op (#88930)	Valentin Clement (バレンタインクレメン)	3	-3/+3
	Add MemRead effect on the box operand as the descriptor might be read when performing the allocation of the data. Also update the expected type of the box operand to be a reference. Check in the verifier that this is a reference to a box or class type. This addresses the comment made post commit on #88586
2024-04-17	[compiler-rt] Use __atomic builtins whenever possible	Alexander Richardson	8	-370/+56
	The code in this file dates back to 2012 when Clang's support for atomic builtins was still quite limited. The bugs referenced in the comment at the top of the file have long been fixed and using the compiler builtins directly should now generate slightly better code. Additionally, this allows using the atomic builtin header for platforms where the __sync_builtins are lacking (e.g. Arm Morello). This change does not introduce any code generation changes for __tsan_read/__tsan_write or __tsan_func_{entry,exit} on x86, which indicates the previously noted compiler issues have been fixed. We also have to touch the non-clang codepaths here since the only way we can make this work easily is by making the memory_order enum match the compiler-provided macros, so we have to update the debug checks that assumed the enum was always a bitflag. The one downside of this change is that 32-bit MIPS now definitely requires libatomic (but that may already have been needed for RMW ops). Reviewed By: dvyukov Pull Request: https://github.com/llvm/llvm-project/pull/84439
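To make the point about the `memory_order` enum concrete, the sketch below defines an enum whose values come straight from the compiler-provided `__ATOMIC_*` macros and uses the GCC/Clang `__atomic` builtins directly. The enum and wrapper names are hypothetical and are not the identifiers used in compiler-rt. The macros expand to small consecutive integers rather than bit flags, which is why checks that treat the enum as a bitmask need updating.

```cpp
#include <cstdint>

// Hypothetical enum mirroring the compiler-provided memory-order macros.
enum MemoryOrder : int {
  kRelaxed = __ATOMIC_RELAXED,
  kConsume = __ATOMIC_CONSUME,
  kAcquire = __ATOMIC_ACQUIRE,
  kRelease = __ATOMIC_RELEASE,
  kAcqRel = __ATOMIC_ACQ_REL,
  kSeqCst = __ATOMIC_SEQ_CST,
};

// The values are consecutive integers (0..5 on GCC and Clang), not bit flags.
static_assert(kRelaxed == 0 && kSeqCst == 5, "orders are consecutive integers");

// Thin wrappers over the builtins; the order is passed as a constant.
inline std::uint32_t load_acquire(const std::uint32_t *p) {
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

inline void store_release(std::uint32_t *p, std::uint32_t v) {
  __atomic_store_n(p, v, __ATOMIC_RELEASE);
}

// Returns the value held before the addition.
inline std::uint32_t fetch_add_relaxed(std::uint32_t *p, std::uint32_t v) {
  return __atomic_fetch_add(p, v, __ATOMIC_RELAXED);
}
```

On targets without native support for a given width, these builtins may lower to libatomic calls, which matches the downside the commit message notes for 32-bit MIPS.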
2024-04-17	[Clang][Parse] Diagnose requires expressions with explicit object parameters (#88974)	Krystian Stasiowski	4	-1/+25
	Clang currently allows the following:
	```
	auto x = requires (this int) { true; };
	```
	This patch diagnoses such requires expressions.
2024-04-17	[libc][c23][fenv] Implement fetestexceptflag (#87828)	Robin Caloudis	19	-10/+110
	Provide C23 `fetestexceptflag` function according to 7.6.4.6 in the latest [revision of the C standard](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf) from 2023-04-02. Closes https://github.com/llvm/llvm-project/issues/87565.
2024-04-17	Revert "[Clang][AArch64] Warn when calling non/streaming about vector size difference (#79842)"	Dinar Temirbulatov	6	-184/+5
	This reverts commit 4e85e1ffcaf161736e27a24c291c1177be865976
2024-04-17	[clang][NFC] Refactor `Sema::RedeclarationKind`	Vlad Serebrennikov	13	-88/+118
	This patch converts the enum into a scoped enum, and moves it into its own header for the time being. Its definition is needed in `Sema.h`, and is going to be needed in the upcoming `SemaObjC.h`. `Lookup.h` can't hold it, because it includes `Sema.h`.
2024-04-17	[libc++][chrono] Improves date formatting. (#86127)	Mark de Wever	4	-101/+7
	The formatting of years has been done manually since the results of %Y outside the "typical" range may produce unexpected values. The same applies to %F which is identical to %Y-%m-%d. None of these conversion specifiers is affected by the locale used. So it's trivial to manually handle this case. This removes several platform specific ifdefs from the tests.
2024-04-17	[VPlan] Check for VPWidenLoadRecipe directly in truncateToMinBW. (NFCI).	Florian Hahn	1	-3/+1
	After a separate recipe was introduced for wide loads in a9bafe91dd0, we can directly check for load recipes in the early bail-out and remove the redundant bail-out for stores.
2024-04-17	[VectorCombine] Remove single quotes from "-passes=vector-combine"	Simon Pilgrim	3	-3/+3
	These confuse the update_test_checks.py script when run by DOS cmd.exe
2024-04-17	[CostModel][X86] Update BITREVERSE costs for GFNI targets	Simon Pilgrim	5	-277/+284
	Inspired by the recent patches by @shamithoke - we now have real scheduler model numbers for GFNI instructions, allowing us to use an upper-bound cost table instead of computing the costs analytically.
2024-04-17	[lldb] XFAIL TestDetachResumes on windows	Pavel Labath	1	-0/+1

2024-04-17	[NFC] Clean dead code in ParsedAttr.h (#89064)	yronglin	1	-6/+1
	Signed-off-by: yronglin <yronglin777@gmail.com>
2024-04-17	[libc] Replace mentions of `LIBC_FULLBUILD` with `LLVM_LIBC_FULL_BUILD` in 'examples/' (#88657)	Rajveer Singh Bharadwaj	2	-3/+3
	Resolves #88328
2024-04-17	[mlir][py] Add NVGPU's `TensorMapDescriptorType` in py bindings (#88855)	Guray Ozen	6	-0/+101
	This PR adds the NVGPU dialect's `TensorMapDescriptorType` to the Python bindings. It follows up on a review comment in [this PR](https://github.com/llvm/llvm-project/pull/87153#discussion_r1546193095)
2024-04-17	[AMDGPU] Fix predicates for BUFFER_ATOMIC_FMIN/FMAX patterns (#89066)	Jay Foad	2	-1/+73
	Use OtherPredicates to avoid interfering with other uses of SubtargetPredicate for GFX12.
2024-04-17	[VPlan] Factor out helper to recursively collect all users (NFCI).	Florian Hahn	2	-16/+19
	Factor out logic to collect all users recursively to be re-used in https://github.com/llvm/llvm-project/pull/87816.
2024-04-17	[C99] Remove WG14 N522 from the C status page	Aaron Ballman	1	-5/+0
	This paper is about type compatibility rules that changed in C99, but this is only applicable across translation units and so there's nothing for us to test. The specific change was that C89 allowed different tag types (e.g., struct and union) to be compatible and C99 tightened that restriction. This is a case where the user gets whatever they get if they link two TUs with incompatible tag types.
2024-04-17	[RISCV] Explicitly bail if something modifies VL/VTYPE in doLocalPostpass	Luke Lau	2	-1/+5
	If an instruction between MI and NextMI uses VL or VTYPE we demand the respective fields so as to not clobber them at their uses. But we don't consider if something might modify VL or VTYPE, and will happily coalesce two vsetvlis when we need to preserve them. This fixes this by skipping to the next vsetvli. Demanding the fields isn't enough, as we need to preserve the VL and VTYPE values even if no fields are demanded. In practice this doesn't happen, presumably due to there not being any instructions that write to VL or VTYPE without reading them. But I noticed this whilst working on a separate patch and split it out.
2024-04-17	[PowerPC] 32-bit large code-model support for toc-data (#85129)	Zaara Syeda	7	-36/+144
	This patch adds the pseudo op ADDItocL for 32-bit large code-model support for toc-data.
2024-04-17	[RISCV] Add test for doLocalPostpass issue not checking if VL was modified. NFC	Luke Lau	1	-0/+25

2024-04-17	[LLVM][CodeGen] Fix register lane liveness tracking in RegisterPressure (#88892)	Krzysztof Parzyszek	1	-18/+21
	Re-enable an old assertion in `decreaseSetPressure`.
2024-04-17	[mlir] expose transform dialect symbol merge to python (#87690)	Oleksandr "Alex" Zinenko	5	-2/+120
	This functionality is available in C++; make it available in Python directly to operate on transform modules.
2024-04-17	[Inline] Regenerate inline-switch-default-2.ll (NFC)	DianQK	1	-176/+0

2024-04-17	[AMDGPU][Docs] Fix broken link to HRF memory model reference (#88696)	Fabian Ritter	1	-1/+1
	The link to the Heterogeneous-race-free Memory Models ASPLOS'14 paper by Hower et al. pointed to a bogus website, probably because the domain ownership has changed. This patch updates it to a version hosted on research.cs.wisc.edu.
2024-04-17	[VP] Correct lowering of predicated fma and faddmul to avoid strictfp. (#85272)	Kevin P. Neal	4	-11/+203
	Correct missing cases in a switch that result in @llvm.vp.fma.v4f32 getting lowered to a constrained fma intrinsic. Vector predicated lowering to constrained intrinsics is not supported currently, and there's no consensus on the path forward. We certainly shouldn't be introducing constrained intrinsics into a function that isn't strictfp. Problem found with D146845.
2024-04-17	[lldb] Fix evaluation of expressions with static initializers (#89063)	Pavel Labath	1	-2/+6
	After 281d71604f418eb952e967d9dc4b26241b7f96a, llvm generates 32-bit relocations, which overflow when we load these objects into high memory. Interestingly, setting the code model to "large" does not help here (perhaps it is the default?). I'm not completely sure that this is the right thing to do, but it doesn't seem to cause any ill effects. I'll follow up with the author of that patch about the expected behavior here.
2024-04-17	[TailDuplicator] Add maximum predecessors and successors to consider tail duplicating blocks (#78582)	Quentin Dian	2	-0/+280
	Fixes #78578. Duplicating a BB which has both multiple predecessors and multiple successors results in a complex CFG and may also produce a huge number of PHI nodes. See https://github.com/llvm/llvm-project/issues/78578#issuecomment-1962363580 for a detailed description of the limit.
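The growth described above can be sketched with a deliberately simplified model in which a block is described only by its predecessor and successor counts. All names below are hypothetical, and the guard illustrates the idea of a limit rather than the exact condition or default thresholds used by the patch.

```cpp
// Simplified model (an assumption, not LLVM's data structures): a basic
// block is described only by how many predecessors and successors it has.
struct BlockShape {
  unsigned preds;
  unsigned succs;
};

// Edges into the successors before duplication: one per successor.
constexpr unsigned succ_edges_before(BlockShape b) { return b.succs; }

// After the block is duplicated into every predecessor, each copy keeps all
// successor edges, so every successor PHI needs one incoming value per copy.
constexpr unsigned succ_edges_after(BlockShape b) { return b.preds * b.succs; }

// Hypothetical guard: treat duplication as too expensive only when both
// counts exceed their limits.
constexpr bool too_expensive(BlockShape b, unsigned max_preds,
                             unsigned max_succs) {
  return b.preds > max_preds && b.succs > max_succs;
}
```

In this model a block with 20 predecessors and 20 successors goes from 20 successor edges to 400, which is the quadratic growth a limit of this kind is meant to cut off.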
2024-04-17	[RISCV] Fix clang-tidy warning about else after return. NFC	Luke Lau	1	-1/+3

2024-04-17	[mlir] transform.apply_patterns support more config options (#88484)	Oleksandr "Alex" Zinenko	3	-1/+41
	The greedy rewrite driver has options to control the number of rewrites applied. Expose those via the corresponding transform op.