riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
12 days	[IR] NFC: Remove 'experimental' from partial.reduce.add intrinsic (#158637)	Sander de Smalen	1	-4/+2
	The partial reduction intrinsics are no longer experimental, because they've been used in production for a while and are unlikely to change.
2025-08-26	[ComplexDeinterleaving] Use LLVM ADTs (NFC) (#154754)	Benjamin Maxwell	1	-26/+28
	This swaps out STL types for their LLVM equivalents. This is recommended in the LLVM coding standards: https://llvm.org/docs/CodingStandards.html#c-standard-library
2025-08-20	[ComplexDeinterleaving] Use BumpPtrAllocator for CompositeNodes (NFC) (#153217)	Benjamin Maxwell	1	-111/+116
	I was looking over this pass and noticed it was using shared pointers for CompositeNodes. However, all nodes are owned by the deinterleaving graph and are not released until the graph is destroyed. This means a bump allocator and raw pointers can be used, which have a simpler ownership model and less overhead than shared pointers.
	The changes in this PR are to:
	- Add a `SpecificBumpPtrAllocator<CompositeNode>` to the `ComplexDeinterleavingGraph`
	  - This allocates new nodes and will deallocate them when the graph is destroyed
	- Replace `NodePtr` and `RawNodePtr` with `CompositeNode *`
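The ownership model described above can be illustrated with a minimal, self-contained sketch. It does not use LLVM's `SpecificBumpPtrAllocator`; as an assumption for illustration, a `std::deque` stands in for the allocator (it also keeps element addresses stable while growing), and the `Graph` class and the `CompositeNode` fields shown here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for the pass's CompositeNode.
struct CompositeNode {
  int Id = 0;
  CompositeNode *Real = nullptr; // operands are plain, non-owning pointers
  CompositeNode *Imag = nullptr;
};

// Hypothetical graph that owns every node it hands out. Nodes are never
// released individually; they all die together with the graph.
class Graph {
  // std::deque does not move existing elements when it grows at the end,
  // so pointers returned by createNode stay valid for the graph's lifetime.
  std::deque<CompositeNode> Nodes;

public:
  CompositeNode *createNode(int Id) {
    Nodes.emplace_back();
    CompositeNode *N = &Nodes.back();
    N->Id = Id;
    return N;
  }

  std::size_t size() const { return Nodes.size(); }
};
```

Because the graph is the single owner, callers pass `CompositeNode *` around freely without reference counting, which is the simplification the commit message describes.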
2025-08-12	[AArch64] Support symmetric complex deinterleaving with higher factors (#151295)	David Sherwood	1	-143/+325
	For loops such as this:
	```
	struct foo {
	  double a, b;
	};
	void foo(struct foo *dst, struct foo *src, int n) {
	  for (int i = 0; i < n; i++) {
	    dst[i].a += src[i].a * 3.2;
	    dst[i].b += src[i].b * 3.2;
	  }
	}
	```
	the complex deinterleaving pass will spot that the deinterleaving associated with the structured loads cancels out the interleaving associated with the structured stores. This happens even though they are not truly "complex" numbers because the pass can handle symmetric operations too. This is great because it means we can then perform normal loads and stores instead. However, we can also do the same for higher interleave factors, e.g. 4:
	```
	struct foo {
	  double a, b, c, d;
	};
	void foo(struct foo *dst, struct foo *src, int n) {
	  for (int i = 0; i < n; i++) {
	    dst[i].a += src[i].a * 3.2;
	    dst[i].b += src[i].b * 3.2;
	    dst[i].c += src[i].c * 3.2;
	    dst[i].d += src[i].d * 3.2;
	  }
	}
	```
	This PR extends the pass to effectively treat such structures as a set of complex numbers, i.e.
	```
	struct foo_alt {
	  std::complex<double> x, y;
	};
	```
	with equivalence between members:
	```
	foo_alt.x.real == foo.a
	foo_alt.x.imag == foo.b
	foo_alt.y.real == foo.c
	foo_alt.y.imag == foo.d
	```
	I've written the code to handle sets with arbitrary numbers of complex values, but since we only support interleave factors between 2 and 4 I've restricted the sets to 1 or 2 complex numbers. Also, for now I've restricted support for interleave factors of 4 to purely symmetric operations only. However, it could also be extended to handle complex multiplications, reductions, etc.
	Fixes: https://github.com/llvm/llvm-project/issues/144795
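To make the "deinterleaving cancels out the interleaving" observation concrete, here is a small sketch at the source level. It only shows that a symmetric update applied field by field gives the same values as one flat loop over the underlying doubles; it says nothing about how the pass itself is implemented. The names `Foo4`, `updateStructured`, `updateFlat` and `matches` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Foo4 {
  double a, b, c, d;
};

// Field-by-field form, mirroring the factor-4 loop in the commit message.
void updateStructured(Foo4 *dst, const Foo4 *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
    dst[i].c += src[i].c * 3.2;
    dst[i].d += src[i].d * 3.2;
  }
}

// Flat form: because every field gets the same operation, the update can be
// written as one loop over 4 * n consecutive doubles (normal loads/stores).
void updateFlat(double *dst, const double *src, int n) {
  for (int i = 0; i < 4 * n; i++)
    dst[i] += src[i] * 3.2;
}

// Compares a structured array against a flat array of the same values,
// allowing a tiny tolerance for floating-point contraction differences.
bool matches(const std::vector<Foo4> &s, const std::vector<double> &f) {
  if (f.size() != 4 * s.size())
    return false;
  for (std::size_t i = 0; i < s.size(); i++) {
    const double fields[4] = {s[i].a, s[i].b, s[i].c, s[i].d};
    for (std::size_t k = 0; k < 4; k++)
      if (std::fabs(fields[k] - f[4 * i + k]) > 1e-12)
        return false;
  }
  return true;
}
```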
2025-07-29	Fix build warnings after 6fbc397964340ebc9cb04a094fd04bef9a53abc3 (#151100)	David Sherwood	1	-7/+0

2025-07-29	[IR] Add new CreateVectorInterleave interface (#150931)	David Sherwood	1	-10/+7
	This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8. For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp where it can be used by IRBuilder.
2025-06-18	[LLVM][ComplexDeinterleaving] Update splat identification to include vector ConstantInt/FP. (#144516)	Paul Walker	1	-0/+3

2025-05-13	[ComplexDeinterleave] Don't try to combine single FP reductions. (#139469)	Florian Hahn	1	-0/+4
	Currently the pass tries to combine floating point reductions without checking for the correct fast-math flags, and it also creates invalid IR (using llvm.reduce.add for FP types). For now, just bail out for non-integer types. PR: https://github.com/llvm/llvm-project/pull/139469
2025-05-08	Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)	Matt Arsenault	1	-0/+3
	Reapply "IR: Remove uselist for constantdata (#137313)" This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75. Fix checking uselists of constants in assume bundle queries
2025-05-07	Revert "IR: Remove uselist for constantdata (#137313)"	Kirill Stoimenov	1	-3/+0
	Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119 This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-06	IR: Remove uselist for constantdata (#137313)	Matt Arsenault	1	-0/+3
	This is a resurrected version of the patch attached to this RFC:
	https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606
	In this adaptation, there are a few differences. In the original patch, the Use's use list was replaced with an unsigned* to the reference count in the value. This version leaves them as null and leaves the ref counting only in Value.
	Remove use-lists from instances of ConstantData (which are shared across modules and have no operands). To continue supporting most of the use-list API, store a ref-count in place of the use-list; this is for API like Value::use_empty and Value::hasNUses. Operations that actually need the use-list -- like Value::use_begin -- will assert.
	This change has three benefits:
	1. The compiler output cannot in any way depend on the use-list order of instances of ConstantData.
	2. There's no use-list traffic when adding and removing simple constants from operand lists (although there is ref-count traffic; YMMV).
	3. It's cheaper to serialize use-lists (since we're no longer serializing the use-list order of things like i32 0).
	The downside is that you can't look at all the users of ConstantData, but traversals of users of i32 0 are already ill-advised.
	Possible follow-ups:
	- Track if an instance of a ConstantVector/ConstantArray/etc. is known to have all ConstantData arguments, and drop the use-lists to ref-counts in those cases. Callers need to check Value::hasUseList before iterating through the use-list.
	- Remove even the ref-counts. I'm not sure they have any benefit besides minimizing the scope of this commit, and maintaining the counts is not free.
	Fixes #58629
	Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
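The idea of keeping a ref-count in place of a use-list can be sketched as follows. This is a toy model under stated assumptions, not LLVM's `Value` class: the class `ToyValue`, its integer user ids, and its method names are hypothetical and only mirror the behaviour the message describes (count-style queries keep working, iteration requires a real use-list).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value: constant data tracks only how many uses it has, while other
// values keep the full list of users.
class ToyValue {
  bool ConstantData;
  std::vector<int> Users;   // user ids; maintained only when hasUseList()
  std::size_t RefCount = 0; // maintained only for constant data

public:
  explicit ToyValue(bool IsConstantData) : ConstantData(IsConstantData) {}

  bool hasUseList() const { return !ConstantData; }

  void addUse(int UserId) {
    if (hasUseList())
      Users.push_back(UserId);
    else
      ++RefCount; // no list traffic, just a counter bump
  }

  // Count-style queries work for both kinds of value.
  std::size_t numUses() const { return hasUseList() ? Users.size() : RefCount; }
  bool useEmpty() const { return numUses() == 0; }
  bool hasNUses(std::size_t N) const { return numUses() == N; }

  // Queries that need the actual users assert on constant data, so callers
  // are expected to check hasUseList() first.
  const std::vector<int> &users() const {
    assert(hasUseList() && "constant data has no use-list");
    return Users;
  }
};
```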
2025-05-04	[CodeGen] Remove unused local variables (NFC) (#138441)	Kazu Hirata	1	-2/+0

2025-04-21	[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136533)	Kazu Hirata	1	-2/+1

2025-04-19	[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136441)	Kazu Hirata	1	-4/+2

2025-04-18	ComplexDeinterleaving: Avoid using getNumUses (#136354)	Matt Arsenault	1	-1/+1

2025-04-13	[CodeGen] Avoid repeated hash lookups (NFC) (#135540)	Kazu Hirata	1	-2/+3

2025-03-19	[llvm] Fix crash when complex deinterleaving operates on an unrolled loop (#129735)	Nicholas Guy	1	-0/+11
	When attempting to perform complex deinterleaving on an unrolled loop containing a reduction, the complex deinterleaving pass would fail to accommodate the wider types when accumulating the unrolled paths. Instead of trying to alter the incoming IR to fit expectations, the pass should instead decide against processing any reduction that results in a non-complex or non-vector value.
2025-01-09	[llvm] Fix crash caused by reprocessing complex reductions (#122077)	Nicholas Guy	1	-1/+1
	If a complex pattern had the shape of both a complex->complex reduction and a complex->single reduction, the matching would recognise both and deem the graph a valid transformation. Preventing this reprocessing results in only one of these matching, meaning that in the case of an invalid graph, we don't try to transform it anyway.
2025-01-06	Complex deinterleaving/single reductions build fix Reapply "Add support for single reductions in ComplexDeinterleavingPass (#112875)" (#120441)	Nicholas Guy	1	-14/+276
	This reverts commit 76714be5fd4ace66dd9e19ce706c2e2149dd5716, fixing the build failure that caused the revert. The failure stemmed from the complex deinterleaving pass identifying a series of add operations as a "complex to single reduction", so when it tried to transform this erroneously identified pattern, it faulted. The fix applied is to ensure that complex numbers (or patterns that match them) are used throughout, by checking if there is a deinterleave node amidst the graph.
2024-12-18	Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875)"	Florian Hahn	1	-264/+14
	This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a. This has been breaking most AArch64 stage2 builds for 4+ hours, reverting to get the bots back to green. https://lab.llvm.org/buildbot/#/builders/41/builds/4172 https://lab.llvm.org/buildbot/#/builders/4/builds/4281 https://lab.llvm.org/buildbot/#/builders/199/builds/263 https://lab.llvm.org/buildbot/#/builders/198/builds/334 https://lab.llvm.org/buildbot/#/builders/143/builds/4276 https://lab.llvm.org/buildbot/#/builders/17/builds/4725
2024-12-18	Add support for single reductions in ComplexDeinterleavingPass (#112875)	Nicholas Guy	1	-14/+264
	The Complex Deinterleaving pass assumes that all values emitted will result in complex numbers. This patch aims to remove that assumption and adds support for emitting just the real or imaginary components, not both.
2024-11-12	[CodeGen] Remove unused includes (NFC) (#115996)	Kazu Hirata	1	-1/+0
	Identified with misc-include-cleaner.
2024-04-29	Move several vector intrinsics out of experimental namespace (#88748)	Maciej Gabka	1	-17/+13
	This patch moves the following intrinsics out of the experimental namespace:
	* vector.interleave2/deinterleave2
	* vector.reverse
	* vector.splice
	All these intrinsics have existed in LLVM for more than a year and are widely used, so they should not be considered experimental.
2024-03-05	[NFC][RemoveDIs] Always use iterators for inserting PHIs	Jeremy Morse	1	-1/+1
	It's becoming potentially unsafe to insert a PHI instruction using a plain Instruction pointer. Switch all the remaining sites that create and insert PHIs to use iterators instead. For example, the code in ComplexDeinterleavingPass.cpp is definitely at-risk of mixing PHIs and debug-info.
2023-09-06	[ComplexDeinterleaving] Use MapVector to fix codegen non-determinism.	Florian Hahn	1	-1/+2

2023-08-31	[CodeGen] Fix incorrect insertion point selection for reduction nodes in ComplexDeinterleavingPass	Igor Kirillov	1	-1/+11
	When replacing ComplexDeinterleavingPass::ReductionOperation, we can do it either from the Real or Imaginary part. The correct way is to take whichever is later in the BasicBlock, but before the patch, we just always took the Real part. Fixes https://github.com/llvm/llvm-project/issues/65044 Differential Revision: https://reviews.llvm.org/D159209
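The rule "take whichever is later in the BasicBlock" can be sketched with a toy model in which a block is just a list of instruction names in program order. The `Block` alias and the helpers `positionOf` and `laterOf` are hypothetical and are not functions from the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a basic block: instruction names in program order.
using Block = std::vector<std::string>;

// Returns the position of Name in BB, or BB.size() if it is absent.
std::size_t positionOf(const Block &BB, const std::string &Name) {
  for (std::size_t I = 0; I < BB.size(); ++I)
    if (BB[I] == Name)
      return I;
  return BB.size();
}

// Picks whichever of Real and Imag comes later in the block. Inserting the
// replacement there guarantees that both parts are already defined.
std::string laterOf(const Block &BB, const std::string &Real,
                    const std::string &Imag) {
  assert(positionOf(BB, Real) < BB.size() && "Real must be in the block");
  assert(positionOf(BB, Imag) < BB.size() && "Imag must be in the block");
  return positionOf(BB, Real) > positionOf(BB, Imag) ? Real : Imag;
}
```

Always choosing `Real`, as the pre-patch behaviour is described, would pick a point before `Imag` is defined whenever `Imag` comes later in the block.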
2023-08-04	[CodeGen] Improve speed of ComplexDeinterleaving pass	Igor Kirillov	1	-10/+8
	Cache all results of running `identifyNode`, even those that do not identify potential complex operations. This patch prevents ComplexDeinterleaving pass from repeatedly trying to identify Nodes for the same pair of instructions. Fixes https://github.com/llvm/llvm-project/issues/64379 Differential Revision: https://reviews.llvm.org/D156916
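Caching failed identifications as well as successful ones can be sketched like this. The matcher below is a deliberately trivial stand-in: the class `Identifier`, the `MatchResult` struct, and the integer "instruction" ids are hypothetical and only illustrate the memoisation pattern, not the pass's `identifyNode`.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Result of trying to identify a node for a (Real, Imag) pair.
struct MatchResult {
  bool Matched;
  int Node; // meaningful only when Matched is true
};

class Identifier {
  std::map<std::pair<int, int>, MatchResult> Cache;
  int UncachedCalls = 0;

  // Hypothetical expensive matcher: succeeds only for pairs (x, x + 1).
  MatchResult identifyUncached(int Real, int Imag) {
    ++UncachedCalls;
    if (Imag == Real + 1)
      return MatchResult{true, Real};
    return MatchResult{false, 0};
  }

public:
  MatchResult identify(int Real, int Imag) {
    const std::pair<int, int> Key(Real, Imag);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // hit, including cached failures
    MatchResult R = identifyUncached(Real, Imag);
    Cache.insert(std::make_pair(Key, R)); // store failures too
    return R;
  }

  int uncachedCalls() const { return UncachedCalls; }
};
```

Storing the negative result is the important part: without it, every repeated query for a pair that can never match would rerun the expensive matcher.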
2023-07-19	[CodeGen] Extend ComplexDeinterleaving pass to recognise patterns using integer types	Igor Kirillov	1	-53/+121
	AArch64 introduced CMLA and CADD instructions as part of SVE2. This change allows such instructions to be generated when this architecture feature is available. Differential Revision: https://reviews.llvm.org/D153808
2023-07-10	[CodeGen] Fix incorrectly detected reduction bug in ComplexDeinterleaving pass	Igor Kirillov	1	-4/+12
	Using ACLE intrinsics, it is possible to create a loop that the deinterleaving pass incorrectly classified as a reduction loop. For example, for fixed-width vectors the loop was like below:
	    vector.body:
	      %a = phi <4 x float> [ %init.a, %entry ], [ %updated.a, %vector.body ]
	      %b = phi <4 x float> [ %init.b, %entry ], [ %updated.b, %vector.body ]
	      ...
	      ; Does not depend on %a or %b:
	      %updated.a = ...
	      %updated.b = ...
	Differential Revision: https://reviews.llvm.org/D154598
2023-07-05	[CodeGen] Add support for Splats in ComplexDeinterleaving pass	Igor Kirillov	1	-0/+81
	This commit allows generating complex number intrinsics for expressions with constants or loop invariants, which are represented as splats. For instance, after vectorizing loops in the following code snippets, the ComplexDeinterleaving pass will be able to generate complex number intrinsics:
	```
	complex<> x = ...;
	for (int i = 0; i < N; ++i)
	  c[i] = a[i] * b[i] * x;
	```
	or
	```
	for (int i = 0; i < N; ++i)
	  c[i] = a[i] * b[i] * (11.0 + 3.0i);
	```
	Differential Revision: https://reviews.llvm.org/D153355
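As a scalar illustration of what multiplying by a loop-invariant complex value means on interleaved data, the sketch below multiplies an array stored as `[re0, im0, re1, im1, ...]` by one constant `(xr, xi)` that is reused ("splatted") for every element. The function name `mulBySplat` and the memory layout are assumptions for illustration; the pass itself works on vectorized IR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiplies n interleaved complex values in `a` by the invariant (xr, xi)
// and writes the interleaved products to `c`, using
// (re + im*i) * (xr + xi*i) = (re*xr - im*xi) + (re*xi + im*xr)*i.
void mulBySplat(const double *a, double *c, std::size_t n, double xr,
                double xi) {
  for (std::size_t i = 0; i < n; ++i) {
    const double re = a[2 * i];
    const double im = a[2 * i + 1];
    c[2 * i] = re * xr - im * xi;
    c[2 * i + 1] = re * xi + im * xr;
  }
}
```

For example, with the constant from the commit message, `(1 + 2i) * (11 + 3i)` gives `5 + 25i`.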
2023-07-03	[CodeGen] Refactor ComplexDeinterleaving to run identification on Values instead of Instructions	Igor Kirillov	1	-109/+94
	This change will make it easier to add identification of complex constants in future patches. Differential Revision: https://reviews.llvm.org/D153446
2023-06-28	[NFC]Fix possibly derefer nullptr in ComplexDeinterleavingPass.cpp	Wang, Xin10	1	-1/+1
	Fix an issue reported by the static analyzer: add an assert to avoid the analyzer report. Reviewed By: igor.kirillov Differential Revision: https://reviews.llvm.org/D153942
2023-06-27	Fix the ComplexDeinterleaving bug when handling mixed reductions.	Igor Kirillov	1	-0/+4
	Add a missing check that ensures that ComplexDeinterleaving for reduction is only analyzed for Real and Imaginary Instructions of the same type. Differential Revision: https://reviews.llvm.org/D153862
2023-06-23	Revert "Revert "[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication""	Igor Kirillov	1	-0/+59
	Adds the capability to recognize SelectInst that appear in the IR. These instructions are generated during scalable vectorization for reduction and when the code contains conditions inside the loop body or when "-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set. Differential Revision: https://reviews.llvm.org/D152558 This reverts commit ab09654832dba5cef8baa6400fdfd3e4d1495624. Reason: Reapplying after removing unnecessary default case in switch expression.
2023-06-22	Revert "[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication"	Vitaly Buka	1	-63/+0
	ComplexDeinterleavingPass.cpp:1849:3: error: default label in switch which covers all enumeration values This reverts commit 116953b82130df1ebd817b3587b16154f659c013.
2023-06-22	[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication	Igor Kirillov	1	-0/+63
	Adds the capability to recognize SelectInst that appear in the IR. These instructions are generated during scalable vectorization for reduction and when the code contains conditions inside the loop body or when "-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set. Differential Revision: https://reviews.llvm.org/D152558
2023-06-14	[CodeGen] Fix a warning	Kazu Hirata	1	-4/+0
	This patch fixes: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:1790:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
2023-06-14	[CodeGen] Add support for reductions in ComplexDeinterleaving pass	Igor Kirillov	1	-24/+295
	This commit enhances the ComplexDeinterleaving pass to handle unordered reductions in simple one-block vectorized loops, supporting both SVE and Neon architectures. Differential Revision: https://reviews.llvm.org/D152022
2023-05-31	[CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass	Igor Kirillov	1	-35/+547
	Code generated with -Ofast and -O3 -ffp-contract=fast (add -ffinite-math-only to enable vectorization) can differ significantly. Code compiled with -O3 can be deinterleaved using patterns as the instruction order is preserved. However, with the -Ofast flag, there can be multiple changes in the computation sequence, and even the real and imaginary parts may not be calculated in parallel. For more details, refer to llvm/test/CodeGen/AArch64/complex-deinterleaving--fast.ll and llvm/test/CodeGen/AArch64/complex-deinterleaving--contract.ll tests. This patch implements a more general approach and enables handling most -Ofast cases. Differential Revision: https://reviews.llvm.org/D148558
2023-05-30	[CodeGen] Refactor IR generation functions to use IRBuilder in ComplexDeinterleaving pass	Igor Kirillov	1	-15/+15
	This patch updates several functions in LLVM's IR generation code to accept an IRBuilder object as an argument, rather than an Instruction that indicates the insertion point for new instructions. This change is necessary to handle sophisticated -Ofast optimization cases from D148558 where it's unclear which instructions should be used as the insertion point for new operations. Differential Revision: https://reviews.llvm.org/D148703
2023-05-20	[llvm] Reduce ComplexDeinterleavingPass.h includes	Elliot Goodrich	1	-0/+1
	Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from `ComplexDeinterleavingPass.h` and move it to the corresponding source file. Add missing includes that were transitively included by this header to 3 other source files. This reduces the total number of preprocessing tokens across the LLVM source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a reduction of ~1.52%. This should result in a small improvement in compilation time.
2023-05-20	Revert "[llvm] Reduce ComplexDeinterleavingPass.h includes"	Elliot Goodrich	1	-1/+0
	This reverts commit 058ca5c07106d38ad66e3ec4972a613a64e88151.
2023-05-20	[llvm] Reduce ComplexDeinterleavingPass.h includes	Elliot Goodrich	1	-0/+1
	Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from `ComplexDeinterleavingPass.h` and move it to the corresponding source file. Add missing includes that were transitively included by this header to 2 other source files. This reduces the total number of preprocessing tokens across the LLVM source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a reduction of ~1.52%. This should result in a small improvement in compilation time. Differential Revision: https://reviews.llvm.org/D150514
2023-04-21	[CodeGen] Enable AArch64 SVE FCMLA/FCADD instruction generation in ComplexDeinterleaving	Igor Kirillov	1	-119/+169
	This commit adds support for scalable vector types in the ComplexDeinterleaving pass, allowing it to recognize and handle `llvm.vector.interleave2` and `llvm.vector.deinterleave2` intrinsics for both fixed and scalable vectors. Differential Revision: https://reviews.llvm.org/D147451
2023-04-21	Fix uninitialized scalar members in CodeGen	Akshay Khadse	1	-1/+2
	This change fixes some static code analysis warnings. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148811
2023-04-18	[CodeGen] Enable processing of interconnected complex number operations	Igor Kirillov	1	-71/+118
	With this patch, ComplexDeinterleavingPass now has the ability to handle any number of interconnected operations involving complex numbers. For example, the patch enables the processing of code like the following:
	    for (int i = 0; i < 1000; ++i) {
	      a[i] = w[i] * v[i];
	      b[i] = w[i] * u[i];
	    }
	This code has multiple arrays containing complex numbers and a common subexpression `w` that appears in two expressions. Differential Revision: https://reviews.llvm.org/D146988
2023-04-17	Fix uninitialized pointer members in CodeGen	Akshay Khadse	1	-2/+2
	This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303
2023-03-28	[ComplexDeinterleaving] Propagate fast math flags to symmetric operations.	David Green	1	-4/+4
	This is a simple patch to make sure fast math flags are propagated through to the newly created symmetric operations, which can help with later simplifications. Differential Revision: https://reviews.llvm.org/D146409
2023-03-14	[Codegen][ARM][AArch64] Support symmetric operations on complex numbers	Nicholas Guy	1	-6/+95
	Differential Revision: https://reviews.llvm.org/D142482
2023-03-14	Cleanup of Complex Deinterleaving pass (NFCI)	Nicholas Guy	1	-5/+13
	Differential Revision: https://reviews.llvm.org/D143177