riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
8 days	[SimplifyCFG] Allow some switch optimizations early in the pipeline (#158242)	Nikita Popov	1	-0/+2
	While we do not want to form actual lookup tables early, we do want to perform some optimizations, as they may enable inlining of the much simpler form. Builds on https://github.com/llvm/llvm-project/pull/156477, which originally included this change as well. This PR makes two changes on top of it:
	* Do not perform the optimization early if it requires adding a mask check. These make the resulting IR less analyzable.
	* Add a new SimplifyCFG option that controls switch-to-arithmetic conversion separately from switch-to-lookup conversion. Enable the new flag at the end of the function simplification pipeline. This means that we attempt the arithmetic conversion before inlining, but avoid it in the early pipeline, where it may lose information.
9 days	[AllocToken] Introduce AllocToken instrumentation pass (#156838)	Marco Elver	1	-0/+1
	Introduce `AllocToken`, an instrumentation pass designed to provide tokens to memory allocators enabling various heap organization strategies, such as heap partitioning. Initially, the pass instruments functions marked with a new attribute `sanitize_alloc_token` by rewriting allocation calls to include a token ID, appended as a function argument with the default ABI.
	The design aims to provide a flexible framework for implementing different token generation schemes. It currently supports the following token modes:
	- TypeHash (default): token IDs based on a hash of the allocated type
	- Random: statically-assigned pseudo-random token IDs
	- Increment: incrementing token IDs per TU
	For the `TypeHash` mode introduce support for `!alloc_token` metadata: the metadata can be attached to allocation calls to provide richer semantic information to be consumed by the AllocToken pass. Optimization remarks can be enabled to show where no metadata was available.
	An alternative "fast ABI" is provided, where instead of passing the token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the token ID is directly encoded into the name of the called function (e.g., `__alloc_token_0_malloc(size)`). Where the maximum number of tokens is small, this offers more efficient instrumentation by avoiding the overhead of passing an additional argument at each allocation site.
	Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434
	This change is part of the following series:
	1. https://github.com/llvm/llvm-project/pull/160131
	2. https://github.com/llvm/llvm-project/pull/156838
	3. https://github.com/llvm/llvm-project/pull/162098
	4. https://github.com/llvm/llvm-project/pull/162099
	5. https://github.com/llvm/llvm-project/pull/156839
	6. https://github.com/llvm/llvm-project/pull/156840
	7. https://github.com/llvm/llvm-project/pull/156841
	8. https://github.com/llvm/llvm-project/pull/156842
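The two call shapes mentioned in this commit message (token as a trailing argument versus token encoded in the callee name) can be illustrated with a small string-level sketch. This is not the pass's implementation: the helper name `instrumented_call` and its parameters are hypothetical, and only the two resulting spellings are taken from the message.

```python
# Illustrative sketch only; `instrumented_call` is a hypothetical helper,
# not part of LLVM. It models how a rewritten allocation call is spelled.

def instrumented_call(callee, args, token_id, fast_abi=False):
    """Return the textual form of an allocation call carrying a token ID."""
    if fast_abi:
        # Fast ABI: the token ID is part of the function name.
        return f"__alloc_token_{token_id}_{callee}({', '.join(args)})"
    # Default ABI: the token ID is appended as the last argument.
    return f"__alloc_token_{callee}({', '.join(args + [str(token_id)])})"


print(instrumented_call("malloc", ["size"], 0))
print(instrumented_call("malloc", ["size"], 0, fast_abi=True))
```

The fast ABI needs one entry point per token value, which is presumably why the message ties it to a small maximum number of tokens.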
2025-09-25	[llvm] Add `vfs::FileSystem` to `PassBuilder` (#160188)	Jan Svoboda	1	-2/+3
	Some LLVM passes need access to the filesystem to read configuration files and similar. In some places, this is achieved by grabbing the VFS from `PGOOptions`, but some passes don't have access to these and resort to just calling `vfs::getRealFileSystem()`. This PR allows setting the VFS directly on `PassBuilder` that's able to pass it down to all passes that need it.
2025-09-25	[LV] Remove EVLIndVarSimplify pass (#160454)	Luke Lau	1	-1/+0
	Initially this was needed to replace the fixed-step canonical IV with the variable-step EVL IV, but this was eventually superseded by the loop vectorizer doing this transform itself in #147222. The pass was then removed from the RISC-V pipeline in #151483 and the loop vectorizer stopped emitting the metadata used by the pass in #155760, so now there are no users of it.
2025-09-19	[PassBuilder] Add callback invoking to PassBuilder string API (#157153)	Gabriel Baraldi	1	-20/+111
	This is a very rough state of what this can look like, but I didn't want to spend too much time on what could be a dead end. Currently the only way to invoke callbacks is by using the default pipelines. This is an issue if you want to define your own pipeline using the C string API (we do that in LLVM.jl in Julia), so I extended the API to allow invoking those callbacks just like one would call a pass of that kind. There are some questions about the params that these callbacks take, and I'm also missing some of them (some are also invoked by the backend, so we may not want to expose them). Code written with AI help, bugs are mine. (Not sure what the policy for this is on LLVM.)
2025-09-19	[CodeGen][NewPM] Port `ReachingDefAnalysis` to new pass manager. (#159572)	Mikhail Gudim	1	-0/+1
	In this commit: (1) Added new pass manager support for `ReachingDefAnalysis`. (2) Added printer pass. (3) Made old pass manager use `ReachingDefInfoWrapperPass`.
2025-09-18	[DropUnnecessaryAssumes] Add pass for dropping assumes (#159403)	Nikita Popov	1	-0/+1
	This adds a new pass for dropping assumes that are unlikely to be useful for further optimization. It works by discarding any assumes whose affected values are one-use (which implies that they are only used by the assume, i.e. ephemeral). This pass currently runs at the start of the module optimization pipeline, that is post-inline and post-link. Before that point, it is more likely for previously "useless" assumes to become useful again, e.g. because an additional user of the value is introduced after inlining + CSE.
2025-09-18	[NewPM] Remove BranchProbabilityInfo from FunctionToLoopPassAdaptor. NFCI (#159516)	Luke Lau	1	-4/+1
	No loop pass seems to use it now, after LoopPredication stopped using it in https://reviews.llvm.org/D111668
2025-09-10	[AMDGPU] Change expand-fp opt level argument syntax (#157408)	Frederik Harwath	1	-12/+10
	Align the syntax used for the optimization level argument of the expand-fp pass in textual descriptions of pass pipelines with the syntax used by other passes taking a similar argument. That is, use e.g. `expand-fp<O1>` instead of `expand-fp<opt-level=1>`.
2025-09-03	[AMDGPU] Implement IR expansion for frem instruction (#130988)	Frederik Harwath	1	-0/+24
	This patch implements a correctly rounded expansion of the frem instruction in LLVM IR. This is useful for target architectures for which such an expansion is too involved to be implemented in ISel Lowering. The expansion is based on the code from the AMD device libs and has been tested successfully against the OpenCL conformance tests on amdgpu. The expansion is implemented in the preexisting "expand-fp" pass. It replaces the expansion of "frem" in ISel for the amdgpu target; it is enabled for targets which do not directly support "frem" and for which no matching "fmod" LibCall is available.
	Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-08-29	[SCEVDivision] Add SCEVDivisionPrinterPass with corresponding tests (#155832)	Ryotaro Kasuga	1	-0/+1
	This patch introduces `SCEVDivisionPrinterPass` and registers it under the name `print<scev-division>`, primarily for testing purposes. This pass invokes `SCEVDivision::divide` upon encountering `sdiv`, and prints the numerator, denominator, quotient, and remainder. It also adds several test cases, some of which are currently incorrect and require fixing.
	Along with that, this patch adds some comments to clarify the behavior of `SCEVDivision::divide`, as follows:
	- This function does NOT actually perform the division
	- Given the `Numerator` and `Denominator`, find a pair `(Quotient, Remainder)` s.t. `Numerator = Quotient * Denominator + Remainder`
	- The common condition `Remainder < Denominator` is NOT necessarily required
	- There may be multiple solutions for `(Quotient, Remainder)`, and this function finds one of them
	  - Especially, there is always a trivial solution `(0, Numerator)`
	- The following computations may wrap
	  - The multiplication of `Quotient` and `Denominator`
	  - The addition of `Quotient * Denominator` and `Remainder`
	Related discussion: #154745
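The contract described in this commit message can be checked numerically with fixed-width integers. The sketch below is only a model of the stated identity, not of `SCEVDivision::divide` itself; the 8-bit width and the helper name `satisfies_contract` are assumptions chosen for illustration.

```python
# Illustrative model of the documented contract, not LLVM code.
# BITS and satisfies_contract are hypothetical names for this sketch.
BITS = 8
MASK = (1 << BITS) - 1


def satisfies_contract(numerator, denominator, quotient, remainder):
    """Check Numerator == Quotient * Denominator + Remainder modulo 2**BITS."""
    return ((quotient * denominator + remainder) & MASK) == (numerator & MASK)


# The trivial solution (0, Numerator) always holds.
trivial = satisfies_contract(200, 7, 0, 200)

# A conventional solution with Remainder < Denominator: 28 * 7 + 4 = 200.
conventional = satisfies_contract(200, 7, 28, 4)

# A solution where the product wraps: 100 * 5 = 500, which is 244 modulo 256,
# and 244 + 212 = 456, which is 200 modulo 256.
wrapping = satisfies_contract(200, 5, 100, 212)

print(trivial, conventional, wrapping)
```

All three pairs satisfy the identity, which shows why a caller cannot assume the returned pair is the usual Euclidean quotient and remainder.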
2025-08-17	[llvm] Remove unused includes (NFC) (#154051)	Kazu Hirata	1	-1/+0
	These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-07-31	Revert "[PGO] Add `llvm.loop.estimated_trip_count` metadata" (#151585)	Joel E. Denny	1	-1/+0
	Reverts llvm/llvm-project#148758 [As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31	[PGO] Add `llvm.loop.estimated_trip_count` metadata (#148758)	Joel E. Denny	1	-0/+1
	This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As [suggested in the RFC comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4), it adds the new metadata to all loops at the time of profile ingestion and estimates each trip count from the loop's `branch_weights` metadata. As [suggested in the PR #128785 review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036), it does so via a new `PGOEstimateTripCountsPass` pass, which creates the new metadata for each loop but omits the value if it cannot estimate a trip count due to the loop's form. An important observation not previously discussed is that `PGOEstimateTripCountsPass` often cannot estimate a loop's trip count, but later passes can sometimes transform the loop in a way that makes it possible. Currently, such passes do not necessarily update the metadata, but eventually that should be fixed. Until then, if the new metadata has no value, `llvm::getLoopEstimatedTripCount` disregards it and tries again to estimate the trip count from the loop's current `branch_weights` metadata.
2025-07-23	[PGO] Add ProfileInjector and ProfileVerifier passes (#147388)	Mircea Trofin	1	-0/+1
	Adding 2 passes, one to inject `MD_prof` and one to check its presence. A subsequent patch will add these (similar to debugify) to `opt` (and, eventually, a variant of this, to `llc`) Tracking issue: #147390
2025-07-23	[CodeGen] Add a pass for testing finalizeBundle (#149813)	Jay Foad	1	-0/+1
	This allows for unit testing of finalizeBundle with standard MIR tests using update_mir_test_checks.py.
2025-07-21	Reapply "[GVN] memoryssa implies no-memdep (#149473)" (#149767)	Madhur Amilkanthwar	1	-0/+4
	Enabling one of MemorySSA or MD implies the other is off. Already approved in https://github.com/llvm/llvm-project/pull/149473 but I had to revert as I missed updating one test.
2025-07-21	Revert "[GVN] memoryssa implies no-memdep (#149473)" (#149766)	Madhur Amilkanthwar	1	-4/+0
	This reverts commit 60d2d94db253a9fdc7bd111120c803f808564b30.
2025-07-21	[GVN] memoryssa implies no-memdep (#149473)	Madhur Amilkanthwar	1	-0/+4
	Enabling one of MemorySSA or MD implies the other is off.
2025-07-16	[CodeGen][NPM] Port ProcessImplicitDefs to NPM (#148110)	Vikram Hegde	1	-0/+1
	Same as https://github.com/llvm/llvm-project/pull/138829
	Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-15	[CodeGen][NPM] Register Function Passes (#148109)	Vikram Hegde	1	-0/+2
	Same as https://github.com/llvm/llvm-project/pull/138828
	Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-10	[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690)	Vikram Hegde	1	-0/+1

2025-07-09	[CodeGen][NPM] Port InitUndef to NPM (#138495)	Akshat Oke	1	-0/+1

2025-07-09	Utils: Add pass to declare runtime libcalls (#147534)	Matt Arsenault	1	-0/+1
	This will be useful for testing the set of calls for different systems, and eventually the product of context specific modifiers applied. In the future we should also know the type signatures, and be able to emit the correct one.
2025-07-07	[CodeGen][NPM] Allow nested MF pass managers for -passes (#128852)	Akshat Oke	1	-1/+10
	This allows `machine-function(p1,machine-function(...))` instead of erroring. Effectively it is flattened to a single MFPM.
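The flattening behaviour can be pictured on an already-parsed pipeline. The sketch below is a hypothetical model, not PassBuilder's parser: a nested `machine-function(...)` is represented as a `("machine-function", [...])` tuple, and `p2` and `p3` are made-up pass names standing in for the elided `...`.

```python
# Illustrative model only; flatten_mf and the tuple encoding are assumptions,
# not LLVM's data structures.

def flatten_mf(elements):
    """Flatten nested ("machine-function", [...]) nodes into one pass list."""
    flat = []
    for element in elements:
        if isinstance(element, tuple) and element[0] == "machine-function":
            # A nested manager contributes its passes in place, in order.
            flat.extend(flatten_mf(element[1]))
        else:
            flat.append(element)
    return flat


# Models machine-function(p1,machine-function(p2,machine-function(p3))).
nested = ["p1", ("machine-function", ["p2", ("machine-function", ["p3"])])]
print(flatten_mf(nested))
```

Pass order is preserved, so the flattened manager runs the same sequence the nested spelling describes.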
2025-06-30	[PassBuilder][FatLTO] Expose FatLTO pipeline via pipeline string (#146048)	Nikita Popov	1	-0/+35
	Expose the FatLTO pipeline via `-passes="fatlto-pre-link<Ox>"`, similar to all the other optimization pipelines. This is to allow reproducing it outside clang. (Possibly also useful for C API users.)
2025-06-27	[LowerAllowCheckPass] allow to specify runtime.check hotness (#145998)	Florian Mayer	1	-0/+13

2025-06-27	[PassBuilder] Treat pipeline aliases as normal passes (#146038)	Nikita Popov	1	-55/+18
	Pipelines like `-passes="default<O3>"` are currently parsed in a special way. Switch them to work like normal, parameterized module passes.
2025-06-05	[MemProf] Split MemProfiler into Instrumentation and Use. (#142811)	Snehasish Kumar	1	-1/+2
	Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics etc) I figured it would be good to separate these two implementations.
2025-06-04	[llvm] Remove unused includes (NFC) (#142733)	Kazu Hirata	1	-11/+0
	These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-06-02	[HashRecognize] Introduce new analysis (#139120)	Ramkumar Ramachandra	1	-0/+1
	Introduce a fresh analysis for recognizing polynomial hashes, with the rationale that several targets have specific instructions to optimize things like CRC and GHASH (e.g. X86 and RISC-V crypto extension). We limit the scope to polynomial hashes computed in a Galois field of characteristic 2, since this class of operations can also be optimized in the absence of target-specific instructions to use a lookup table. At the moment, we only recognize the CRC algorithm. RFC: https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
2025-05-27	[NFC][LLVM] Minor namespace fixes in PassBuilder (#141288)	Rahul Joshi	1	-4/+2
	- No need to prefix `PointerType` with `llvm::`. - Avoid namespace block to define `PrintPipelinePasses`.
2025-05-27	[NFC][LLVM] Use formatv automatic index assignment in PassBuilder (#141286)	Rahul Joshi	1	-78/+71

2025-05-23	[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101)	Rahul Joshi	1	-14/+13

2025-05-22	Adding IR2Vec as an analysis pass (#134004)	S. VenkataKeerthy	1	-0/+1
	This PR introduces IR2Vec as an analysis pass. The changes include:
	- Logic for generating Symbolic encodings.
	- 75D learned vocabulary.
	- lit tests.
	Here is the link to the RFC: https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings
	Acknowledgements: contributors - https://github.com/IITH-Compilers/IR2Vec/graphs/contributors
	Co-authored-by: svkeerthy <venkatakeerthy@google.com>
	Co-authored-by: Mircea Trofin <mtrofin@google.com>
2025-05-14	[LV][EVL] Introduce the EVLIndVarSimplify Pass for EVL-vectorized loops (#131005)	Min-Yih Hsu	1	-0/+1
	When we enable EVL-based loop vectorization w/ predicated tail-folding, each vectorized loop has effectively two induction variables: one calculates the step using (VF x vscale) and the other one increases the IV by values returned from experimental.get.vector.length. The former, also known as canonical IV, is more favorable for analyses as it's "countable" in the sense of SCEV; the latter (EVL-based IV), however, is more favorable to codegen, at least for targets that support scalable vectors like AArch64 SVE and RISC-V. The idea is that we use canonical IV all the way until the end of all vectorizers, where we replace it with EVL-based IV using EVLIndVarSimplify introduced here, so that we can have the best of both worlds. This Pass is enabled by default in RISC-V. However, since we do not yet vectorize loops with predicated tail-folding by default, this Pass is a no-op at this moment.
2025-05-14	[GlobalISel] Add a GISelValueTracker printing pass (#139687)	David Green	1	-0/+1
	This adds a GISelValueTrackingPrinterPass that can print the known bits and sign bit of each def in a function. It is built on the new pass manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older class to GISelValueTrackingAnalysisLegacy. The first 2 functions from the AArch64GISelMITest are ported over to an mir test to show it working. It also runs successfully on all files in llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can hopefully be used to test GlobalISel known bits analysis more directly in common cases, without jumping through the hoops that the C++ tests require.
2025-04-30	[Passes] Remove extra ';' outside of a function (NFC)	Jie Fu	1	-1/+1
	/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1508:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
	 1508 | };
	      |  ^
	1 error generated.
2025-04-30	[CodeGen][NPM] Port VirtRegRewriter to NPM (#130564)	Akshat Oke	1	-0/+13

2025-04-30	[CodeGen][NewPM] Port "ShrinkWrap" pass to NPM (#129880)	Vikram Hegde	1	-0/+1

2025-04-30	[CodeGen] Port MachineUniformityAnalysis to new pass manager (#137578)	paperchalice	1	-0/+1
	- Add new pass manager version of `MachineUniformityAnalysis`.
	- Query `TargetTransformInfo` in new pass manager version.
	- Use `printAsOperand` when printing machine function name.
2025-04-29	[CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550)	Vikram Hegde	1	-0/+1

2025-04-18	[CodeGen][NPM] Port UnreachableMachineBlockElim to NPM (#136127)	Akshat Oke	1	-0/+1

2025-04-15	[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070)	Akshat Oke	1	-0/+1

2025-04-14	[CodeGen][NPM] Port MachineSanitizerBinaryMetadata to NPM (#130069)	Akshat Oke	1	-0/+1
	Didn't find a test for this (but there are tests for the `Function` version of this pass)
2025-04-14	[CodeGen][NPM] Port RemoveLoadsIntoFakeUses to NPM (#130068)	Akshat Oke	1	-0/+1

2025-04-14	[CodeGen][NPM] Port BranchRelaxation to NPM (#130067)	Akshat Oke	1	-0/+1
	This completes the PreEmitPasses.
2025-04-09	[CodeGen][NPM] Port PostRAHazardRecognizer to NPM (#130066)	Akshat Oke	1	-0/+1

2025-04-01	IRNormalizer: Replace cl::opts with pass parameters (#133874)	Matt Arsenault	1	-0/+25
	Not sure why the "fold-all" option naming didn't match the variable "FoldPreOutputs", but I've preserved the difference. More annoyingly, the pass name "normalize" does not match the pass name IRNormalizer and should probably be fixed one way or the other. Also the existing test coverage for the flags is lacking. I've added a test that shows they parse, but we should have tests that they do something.
2025-04-01	[CodeGen][NPM] Port XRayInstrumentation to NPM (#129865)	Akshat Oke	1	-0/+1