riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
8 hours	[SimplifyCFG] Ensure selects have not been constant folded in ↵	Antonio Frighetto	1	-22/+21
	`foldSwitchToSelect` Make sure selects do exist prior to assigning weights to edges. Fixes: https://github.com/llvm/llvm-project/issues/161137.
3 days	[profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (#159645)	Mircea Trofin	1	-11/+75
	Propagate `!prof` from `switch` instructions. Issue #147390
5 days	[profcheck] Option to inject distinct small weights (#159644)	Mircea Trofin	1	-29/+47
	There are cases where the easiest way to regression-test a profile change is to add `!prof` metadata with small numbers, so as to simplify manual verification. Inserting that metadata by hand to ensure coverage can become tedious. This patch makes `prof-inject` do it for us, if opted in. The weights are drawn from a list of primes, used as a circular buffer. Issue #147390
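The "primes as a circular buffer" idea can be sketched as follows. This is an illustration only, not the `prof-inject` implementation: the class name `PrimeWeightSource`, the method names, and the particular list of eight primes are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed list of small primes; the real pass may use a different list.
constexpr uint32_t Primes[] = {2, 3, 5, 7, 11, 13, 17, 19};
constexpr std::size_t NumPrimes = sizeof(Primes) / sizeof(Primes[0]);

// Hypothetical helper: hands out small weights by cycling through Primes.
class PrimeWeightSource {
  std::size_t Next = 0; // index of the next prime to hand out

public:
  uint32_t next() {
    const uint32_t W = Primes[Next];
    Next = (Next + 1) % NumPrimes; // wrap around: circular buffer
    return W;
  }

  // One weight per successor of a terminator.
  std::vector<uint32_t> weightsFor(std::size_t NumSuccessors) {
    assert(NumSuccessors > 0 && "a terminator needs at least one successor");
    std::vector<uint32_t> Weights;
    Weights.reserve(NumSuccessors);
    for (std::size_t I = 0; I < NumSuccessors; ++I)
      Weights.push_back(next());
    return Weights;
  }
};
```

Because consecutive weights differ and are coprime, a weight that shows up in test output can usually be traced back to the edge it was injected on.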
6 days	[SimplifyCFG] Avoid using isNonIntegralPointerType()	Alexander Richardson	1	-8/+19
	This check is overly broad: the transformation made here can be done safely for pointers with index != repr width. This fixes the codegen regression introduced by https://github.com/llvm/llvm-project/pull/105735 and should be beneficial for AMDGPU code generation once the datalayout there no longer uses the overly strict `ni:` specifier. Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/159890
9 days	[IR] Fix a few implicit conversions from TypeSize to uint64_t. NFC (#159894)	Craig Topper	1	-2/+2

10 days	Reland [BasicBlockUtils] Handle funclets when detaching EH pad blocks (#159379)	Gábor Spaits	1	-28/+69
	Fixes #148052. The last PR did not account for the scenario where more than one instruction uses the `catchpad` label. In that case I deleted uses that had already been "chosen to be iterated over" by the early increment iterator. This issue was not visible in a normal release build on x86, but luckily the address sanitizer build later caught it on the buildbot. Here is the diff from the last version of this PR (#158435):
```diff
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 91e245e5e8f5..1dd8cb4ee584 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -106,7 +106,8 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa<FuncletPadInst>(I)) {
-        for (User *User : make_early_inc_range(I.users())) {
+        SmallPtrSet<BasicBlock *, 4> UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
           Instruction *ReturnInstr = dyn_cast<Instruction>(User);
           // If we have a cleanupret or catchret block, replace it with just an
           // unreachable. The other alternative, that may use a catchpad is a
@@ -114,33 +115,12 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
           if (isa<CatchReturnInst>(ReturnInstr) ||
               isa<CleanupReturnInst>(ReturnInstr)) {
             BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
-            // This catchret or catchpad basic block is detached now. Let the
-            // successors know it.
-            // This basic block also may have some predecessors too. For
-            // example the following LLVM-IR is valid:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //   [cleanupret_block]
-            //
-            // The IR after the cleanup will look like this:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //     [unreachable]
-            //
-            // So regular_block will lead to an unreachable block, which is also
-            // valid. There is no need to replace regular_block with unreachable
-            // in this context now.
-            // On the other hand, the cleanupret/catchret block's successors
-            // need to know about the deletion of their predecessors.
-            emptyAndDetachBlock(ReturnInstrBB, Updates, KeepOneInputPHIs);
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
           }
         }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
       }
     }
```
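The shape of that fix (record the affected blocks first, mutate afterwards) can be shown without any LLVM types. The sketch below is a simplified stand-in: `Block`, `User`, and `detachParentsOnce` are hypothetical names, and `std::set` plays the role that `SmallPtrSet` plays in the real patch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for IR objects; these are not LLVM types.
struct Block {
  std::string Name;
  bool Detached;
};
struct User {
  Block *Parent; // block that contains this user
};

// Phase 1: walk the users without mutating anything, recording each parent
// block once, even when several users live in the same block.
// Phase 2: detach every recorded block exactly once.
// Returns the number of blocks detached.
int detachParentsOnce(const std::vector<User> &Users) {
  std::set<Block *> ToDetach;
  for (const User &U : Users) {
    assert(U.Parent && "every user must live in a block");
    ToDetach.insert(U.Parent);
  }
  int Count = 0;
  for (Block *B : ToDetach) {
    assert(!B->Detached && "each block is visited once");
    B->Detached = true;
    ++Count;
  }
  return Count;
}
```

Separating the two phases means the user list is never modified while it is being walked, which is the hazard the reland describes.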
11 days	[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510)	Florian Hahn	1	-26/+0
	After https://github.com/llvm/llvm-project/pull/153643, there may be a BranchOnCond with constant condition in the entry block. Simplify those in removeBranchOnConst. This removes a number of redundant conditional branches from entry blocks. In some cases, it may also make the original scalar loop unreachable, because we know it will never execute. In that case, we need to remove the loop from LoopInfo, because all unreachable blocks may dominate each other, making LoopInfo invalid. In those cases, we can also completely remove the loop, for which I'll share a follow-up patch. Depends on https://github.com/llvm/llvm-project/pull/153643. PR: https://github.com/llvm/llvm-project/pull/154510
12 days	[SCCP] Relax two-instruction range checks (#158495)	Yingwei Zheng	1	-0/+54
	If we know x in R1, the range check `x in R2` can be relaxed into `x in Union(R2, Inverse(R1))`. The latter one may be more efficient if we can represent it with one icmp. Fixes regressions introduced by https://github.com/llvm/llvm-project/pull/156497. Proof for `(X & -Pow2) == C -> (X - C) < Pow2`: https://alive2.llvm.org/ce/z/HMgkuu Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=ead4f3e271fdf6918aef2ede3a7134811147d276&to=bee3d902dd505cf9b11499ba4f230e4e8ae96b92&stat=instructions%3Au
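The identity `(X & -Pow2) == C -> (X - C) < Pow2` quoted above can be checked exhaustively at a small bit width. The sketch below does so for 8-bit values; it is not a substitute for the linked Alive2 proof. The function name is made up, and the requirement that `C` has no bits set below `Pow2` is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, over all 8-bit X, that
//   (X & -Pow2) == C    is equivalent to    (X - C) <u Pow2.
// Sketch assumption: C is Pow2-aligned (no bits set below Pow2).
bool maskCheckMatchesRangeCheck(uint8_t Pow2, uint8_t C) {
  assert(Pow2 != 0 && (Pow2 & (Pow2 - 1)) == 0 && "Pow2 must be a power of two");
  assert((C & (Pow2 - 1)) == 0 && "C must be Pow2-aligned");
  const uint8_t Mask = static_cast<uint8_t>(-Pow2); // clears the low bits
  for (unsigned V = 0; V < 256; ++V) {
    const uint8_t X = static_cast<uint8_t>(V);
    const bool MaskForm = static_cast<uint8_t>(X & Mask) == C;
    // Unsigned subtraction wraps modulo 256, as an i8 `sub` would.
    const bool RangeForm = static_cast<uint8_t>(X - C) < Pow2;
    if (MaskForm != RangeForm)
      return false;
  }
  return true;
}
```

For example, with `Pow2 = 16` and `C = 0x30` both forms accept exactly the values `0x30` through `0x3F`.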
12 days	Revert "Reland "[BasicBlockUtils] Handle funclets when detaching EH p… ↵	Gábor Spaits	1	-85/+28
	(#159292) …ad blocks" (#158435)" This reverts commit 41cef78227eb909181cb9360099b2d92de8d649f.
13 days	Reland "[BasicBlockUtils] Handle funclets when detaching EH pad blocks" ↵	Gábor Spaits	1	-28/+85
	(#158435) When removing EH Pad blocks, the value defined by them becomes poison. These poison values are then used by `catchret` and `cleanupret`, which is invalid. This commit replaces those unreachable `catchret` and `cleanupret` instructions with `unreachable`.
13 days	Add DebugSSAUpdater class to track debug value liveness (#135349)	Stephen Tozer	2	-0/+391
	This patch adds a class that uses SSA construction, with debug values as definitions, to determine whether and which debug values for a particular variable are live at each point in an IR function. This will be used by the IR reader of llvm-debuginfo-analyzer to compute variable ranges and coverage, although it may be applicable to other debug info IR analyses.
2025-09-14	[SimplifyCFG] Refine metadata handling during instruction hoisting (#158448)	William Moses	1	-1/+1
	Co-authored-by: Nikita Popov <npopov@redhat.com>
2025-09-14	Revert "[BasicBlockUtils] Handle funclets when detaching EH pad blocks" ↵	Arthur Eubanks	1	-43/+1
	(#158364) Reverts llvm/llvm-project#157363 Causes crashes, see https://github.com/llvm/llvm-project/pull/157363#issuecomment-3286783238
2025-09-12	[MemProf] Optionally allow transformation of nobuiltin operator new (#158396)	Teresa Johnson	1	-11/+32
	For cases where we can guarantee the application does not override operator new.
2025-09-12	[NFC] Leave a comment in `Local.cpp` about debug info & sample profiling ↵	Mircea Trofin	1	-2/+5
	(#155296) Issue #152767
2025-09-12	[Utils] Fix a warning	Kazu Hirata	1	-1/+2
	This patch fixes: llvm/lib/Transforms/Utils/SimplifyCFG.cpp:338:6: error: unused function 'isSelectInRoleOfConjunctionOrDisjunction' [-Werror,-Wunused-function]
2025-09-12	[SimplifyCFG] Set `MD_prof` for `select` used for certain conditional ↵	Mircea Trofin	1	-1/+31
	simplifications (#154426) There’s a pattern where a branch is conditioned on a conjunction or disjunction that ends up being modeled as a `select` where the first operand is set to `true` or the second to `false`. If the branch has known branch weights, they can be copied to the `select`. This is worth doing in case later the `select` gets transformed to something else (i.e. if we know the profile, we should propagate it). Issue #147390
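The pattern referred to above, where a conjunction or disjunction is modeled as a `select` with a constant operand, can be written out on plain booleans. This sketch only illustrates the two shapes; it does not model branch weights or `!prof` metadata, and the function names are invented for the example.

```cpp
#include <cassert>

// Models the IR `select Cond, TrueVal, FalseVal` on booleans.
constexpr bool selectBool(bool Cond, bool TrueVal, bool FalseVal) {
  return Cond ? TrueVal : FalseVal;
}

// Disjunction: the first value operand is the constant `true`.
constexpr bool logicalOr(bool A, bool B) { return selectBool(A, true, B); }

// Conjunction: the second value operand is the constant `false`.
constexpr bool logicalAnd(bool A, bool B) { return selectBool(A, B, false); }
```

In both shapes the `select` condition is the first operand of the logical operation, so a probability known for the branch it came from remains meaningful on the `select`.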
2025-09-11	[BasicBlockUtils] Handle funclets when detaching EH pad blocks (#157363)	Gábor Spaits	1	-1/+43
	Fixes #148052. When removing EH Pad blocks, the value defined by them becomes poison. These poison values are then used by `catchret` and `cleanupret`, which is invalid. This commit replaces those unreachable `catchret` and `cleanupret` instructions with `unreachable`.
2025-09-11	[SimplifyCFG] Set branch weights when merging conditional store to address ↵	Mircea Trofin	1	-0/+16
	(#154841)
2025-09-11	[PGO] Add llvm.loop.estimated_trip_count metadata (#152775)	Joel E. Denny	1	-30/+114
	This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As the RFC explains, that metadata enables future patches, such as PR #128785, to fix block frequency issues without losing estimated trip counts.
2025-09-11	[SCEVExp] Remove special-case handling umul_with_overflow by 1 (NFCI).	Florian Hahn	1	-14/+5
	b50ad945dd4faa288 added umul_with_overflow simplifications to InstSimplifyFolder (used by SCEVExpander) and 9b1b93766dfa34ee9 added dead instruction cleanup to SCEVExpander. Remove special handling of umul by 1, handled automatically due to the changes above.
2025-09-10	[DebugInfo][Mem2Reg] Assign uninitialized values with annotated locs (#157716)	Stephen Tozer	1	-2/+5
	In PromoteMem2Reg, we perform a DFS over the CFG and track, for each alloca, its incoming value and its associated incoming DebugLoc, both of which are taken from stores to that alloca; these values and DebugLocs are propagated to PHI nodes when new blocks are reached. In the event that for one incoming edge no store instruction has been seen, we propagate an UndefValue and an empty DebugLoc to the PHI. This is a perfectly valid occurrence, and assigning an empty DebugLoc to the PHI is the correct course of action; therefore, we should pass an annotated DebugLoc instead, so that in DebugLoc coverage tracking we correctly do not expect a valid DebugLoc to be present; we generally mark allocas as having CompilerGenerated locations, so I've chosen to use the same annotation to represent the uninitialized value of that alloca. This change is NFC outside of DebugLoc coverage tracking builds.
2025-09-09	SimplifyCFG: Enable switch replacements in more cases (#156477)	Jessica Del	1	-8/+22
	In some cases, we can replace a switch with simpler instructions or a lookup table. For instance, if every case results in the same value, we can simply replace the switch with that single value. However, lookup tables are not always supported. Targets, function attributes and compiler options can deactivate lookup table creation. Currently, even simpler switch replacements like the single value optimization do not get applied, because we only enable these transformations if lookup tables are enabled. This PR enables the other kinds of replacements, even if lookup tables are not supported. First, it checks if the potential replacements are lookup tables. If they are, then check if lookup tables are supported and whether to continue. If they are not, then we can apply the other transformations. Originally, lookup table creation was delayed until late stages of the compilation pipeline, because it can result in difficult-to-analyze code and prevent other optimizations. As a side effect of this change, we can also enable the simpler optimizations much earlier in the compilation process.
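The ordering described above (decide the kind of replacement first, then consult lookup-table support only if a table is actually needed) might look roughly like the sketch below. It is not SimplifyCFG's code: `ReplacementKind`, `classify`, and `canReplace` are hypothetical names, the switch is assumed to be dense over `0..N-1`, and overflow of the differences is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ReplacementKind { SingleValue, LinearMap, LookupTable };

// Hypothetical classifier. Results[i] is the value produced by case i of a
// dense switch over 0..N-1.
ReplacementKind classify(const std::vector<int64_t> &Results) {
  assert(!Results.empty() && "need at least one case");
  bool AllSame = true;
  for (int64_t R : Results)
    AllSame = AllSame && R == Results.front();
  if (AllSame)
    return ReplacementKind::SingleValue;
  // Not all equal implies at least two entries, so Results[1] exists.
  // Linear: each result differs from the previous one by a constant Step.
  const int64_t Step = Results[1] - Results[0];
  bool Linear = true;
  for (std::size_t I = 1; I < Results.size(); ++I)
    Linear = Linear && Results[I] - Results[I - 1] == Step;
  return Linear ? ReplacementKind::LinearMap : ReplacementKind::LookupTable;
}

// Only a true table depends on target support; the cheaper kinds do not.
bool canReplace(ReplacementKind Kind, bool TargetSupportsLookupTables) {
  return Kind != ReplacementKind::LookupTable || TargetSupportsLookupTables;
}
```

With this ordering, `classify({7, 7, 7})` yields `SingleValue` and is accepted even when lookup tables are disabled, while `classify({1, 4, 9})` yields `LookupTable` and is rejected in that configuration.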
2025-09-09	[AArch64] Enable RT and partial unrolling with reductions for Apple CPUs. ↵	Florian Hahn	1	-2/+4
	(#149699) Update unrolling preferences for Apple Silicon CPUs to enable partial unrolling and runtime unrolling for small loops with reductions. This builds on top of unroller changes to introduce parallel reduction phis, if possible: https://github.com/llvm/llvm-project/pull/149470. PR: https://github.com/llvm/llvm-project/pull/149699
2025-09-09	[LoopUtils] Simplify expanded RT-checks (#157518)	Ramkumar Ramachandra	2	-0/+4
	Follow up on 528b13d ([SCEVExp] Add helper to clean up dead instructions after expansion.) to hoist the SCEVExpander::eraseDeadInstructions call from LoopVectorize into the LoopUtils APIs add[Diff]RuntimeChecks, so that other callers (LoopDistribute and LoopVersioning) can benefit from the patch.
2025-09-09	Reapply "[SCEVExp] Add helper to clean up dead instructions after expansion. ↵	Florian Hahn	1	-0/+21
	(#157308)" This reverts commit eeb43806eb1b40e690aeeba496ee974172202df9. Recommit with a fix for the MSan failure (https://lab.llvm.org/buildbot/#/builders/169/builds/14799), by adding a set to track deleted values. Using the InsertedInstructions set is not sufficient, as it uses asserting value handles as keys, which may dereference the value at construction. Original message: Add new helper to erase dead instructions inserted during SCEV expansion but not being used due to InstSimplifyFolder simplifications. Together with https://github.com/llvm/llvm-project/pull/157307 this also allows removing some specialized folds, e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L2205 PR: https://github.com/llvm/llvm-project/pull/157308
2025-09-08	[SimplifyCFG] Support not in chain of comparisons. (#156497)	Andreas Jonson	1	-0/+3
	Proof: https://alive2.llvm.org/ce/z/cpXuCb
2025-09-08	Revert "[SCEVExp] Add helper to clean up dead instructions after expansion. ↵	Florian Hahn	1	-16/+0
	(#157308)" This reverts commit 528b13df571c86a2c5b8305d7974f135d785e30f. Triggers MSan errors in some configurations, e.g. https://lab.llvm.org/buildbot/#/builders/169/builds/14799
2025-09-08	[SCEVExp] Add helper to clean up dead instructions after expansion. (#157308)	Florian Hahn	1	-0/+16
	Add new helper to erase dead instructions inserted during SCEV expansion but not being used due to InstSimplifyFolder simplifications. Together with https://github.com/llvm/llvm-project/pull/157307 this also allows removing some specialized folds, e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L2205 PR: https://github.com/llvm/llvm-project/pull/157308
2025-09-06	[MemProf] Always add hints to allocations with memprof attributes (#157222)	Teresa Johnson	1	-40/+35
	Apply hints even if the attribute is the default "notcold" or "ambiguous", to enable better tracking through the allocator. Add an option to control the ambiguous allocation hint value.
2025-09-05	[LLD][COFF] Add more `--time-trace` tags for ThinLTO linking (#156471)	Alexandre Ganea	1	-0/+3
	In order to better see what's going on during ThinLTO linking, this PR adds more profile tags when using `--time-trace` on a `lld-link.exe` invocation. After this PR, linking `clang.exe`: [Screenshot: 2025-09-02 082021] Linking a custom (Unreal Engine game) binary gives a completely different picture, probably because of using Unity files and the sheer number of input files (here, over 60 GB of .OBJs/.LIBs). [Screenshot: 2025-09-02 102048]
2025-09-05	[SCEVExp] Fix early exit in ComputeEndCheck. (#156910)	Florian Hahn	1	-2/+9
	ComputeEndCheck incorrectly returned false for unsigned predicates starting at zero and a positive step. The AddRec could still wrap if Step * trunc ExitCount wraps or trunc ExitCount strips leading 1s. Fixes https://github.com/llvm/llvm-project/issues/156849. PR: https://github.com/llvm/llvm-project/pull/156910
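The two hazards named above can be made concrete with small integers. The sketch below uses an 8-bit induction variable and a 16-bit exit count purely for illustration; the widths, the function names, and the way the two conditions are combined are assumptions of this example and do not reproduce `ComputeEndCheck`.

```cpp
#include <cassert>
#include <cstdint>

// True if truncating ExitCount to 8 bits drops set high bits.
bool truncDropsBits(uint16_t ExitCount) {
  return static_cast<uint8_t>(ExitCount) != ExitCount;
}

// True if Step * trunc(ExitCount) does not fit in 8 bits.
bool mulWraps(uint8_t Step, uint16_t ExitCount) {
  const unsigned Product =
      static_cast<unsigned>(Step) * static_cast<uint8_t>(ExitCount);
  return Product > UINT8_MAX;
}

// An induction variable {0,+,Step} may wrap if either hazard is present,
// even though it starts at zero and the step is positive.
bool mayWrap(uint8_t Step, uint16_t ExitCount) {
  assert(Step > 0 && "sketch assumes a positive step");
  return truncDropsBits(ExitCount) || mulWraps(Step, ExitCount);
}
```

For instance, a step of 3 with an exit count of 100 gives 300, which does not fit in 8 bits, and an exit count of `0x100` truncates to 0 and so hides the real iteration count entirely.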
2025-09-05	[Utils] Remove an unnecessary cast (NFC) (#157023)	Kazu Hirata	1	-1/+1
	MergedCounts is of type double.
2025-09-04	[SimplifyCFG] Probabilities associated with same condition are constant ↵	Mircea Trofin	1	-16/+5
	(#155734) The branch weights capture probability. The probability has everything to do with the (SSA) value the condition is predicated on, and nothing to do with the position in the CFG.
2025-09-04	[profcheck] Allow `unknown` function entry count (#155918)	Mircea Trofin	1	-4/+7
	Some passes synthesize functions, e.g. WPD, so we may need to indicate “this synthesized function’s entry count cannot be estimated at compile time” - akin to `branch_weights`. Issue #147390
2025-09-04	[LoopUnroll] Introduce parallel reduction phis when unrolling. (#149470)	Florian Hahn	1	-0/+136
	When partially or runtime unrolling loops with reductions, currently the reductions are performed in-order in the loop, negating most benefits from unrolling such loops. This patch extends unrolling code-gen to keep a parallel reduction phi per unrolled iteration and combine the final result after the loop. For out-of-order CPUs, this allows executing multiple reduction chains in parallel. For now, the initial transformation is restricted to cases where we unroll a small number of iterations (hard-coded to 4, but should maybe be capped by TTI depending on the execution units), to avoid introducing an excessive amount of parallel phis. It also requires single block loops for now, where the unrolled iterations are known to not exit the loop (either due to runtime unrolling or partial unrolling). This ensures that the unrolled loop will have a single basic block, with a single exit block where we can place the final reduction value computation. The initial implementation also only supports parallelizing loops with a single reduction and only integer reductions. Those restrictions are just to keep the initial implementation simpler, and can easily be lifted as follow-ups. With corresponding TTI to the AArch64 unrolling preferences which I will also share soon, this triggers in ~300 loops across a wide range of workloads, including LLVM itself, ffmpeg, av1aom, sqlite, blender, brotli, zstd and more. PR: https://github.com/llvm/llvm-project/pull/149470
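At the source level, the effect of keeping one reduction phi per unrolled iteration resembles the hand-unrolled sum below. This is an analogy written by hand, not output of the unroller; the function names are made up, and the remainder loop stands in for whatever epilogue the real transformation produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// In-order reduction: every add depends on the result of the previous one.
uint64_t sumInOrder(const std::vector<uint32_t> &V) {
  uint64_t Acc = 0;
  for (uint32_t X : V)
    Acc += X;
  return Acc;
}

// Unrolled by 4 with one accumulator per unrolled iteration, so the four
// add chains are independent of each other inside the loop.
uint64_t sumUnrolledBy4(const std::vector<uint32_t> &V) {
  uint64_t Acc0 = 0, Acc1 = 0, Acc2 = 0, Acc3 = 0;
  const std::size_t N = V.size();
  std::size_t I = 0;
  for (; I + 4 <= N; I += 4) {
    Acc0 += V[I];
    Acc1 += V[I + 1];
    Acc2 += V[I + 2];
    Acc3 += V[I + 3];
  }
  assert(N - I < 4 && "fewer than four elements remain");
  // Combine the partial results once, after the loop.
  uint64_t Acc = (Acc0 + Acc1) + (Acc2 + Acc3);
  for (; I < N; ++I) // remainder iterations
    Acc += V[I];
  return Acc;
}
```

Because unsigned integer addition is associative and commutative, both functions return the same value; that property is why restricting the initial implementation to integer reductions keeps it simple.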
2025-09-04	[Utils] Remove an unnecessary cast (NFC) (#156813)	Kazu Hirata	1	-2/+2
	getZExtValue() already returns uint64_t.
2025-09-02	[MemProf] Allow hint update on existing calls to nobuiltin hot/cold new ↵	Teresa Johnson	1	-2/+36
	(#156476) Explicit calls to ::operator new are marked nobuiltin and cannot be elided or updated as they may call user defined versions. However, existing calls to the hot/cold versions of new only need their hint parameter value updated, which does not mutate the call.
2025-09-01	[NFC] SimplifyCFG: Detect switch replacement earlier in `switchToLookup` ↵	Jessica Del	1	-124/+145
	(#155602) This PR is the first part to solve the issue in #149937. The end goal is enabling more switch optimizations on targets that do not support lookup tables. SimplifyCFG has the ability to replace switches with either a few simple calculations, a single value, or a lookup table. However, it only considers these options if the target supports lookup tables, even if the final result is not a LUT, but a few simple instructions like muls, adds and shifts. To enable more targets to use these other kinds of optimization, this PR restructures the code in `switchToLookup`. Previously, code was generated even before choosing what kind of replacement to do. However, we need to know if we actually want to create a true LUT or not before generating anything. Then we can check for target support only if any LUT would be created. This PR moves the code so it first determines the replacement kind and then generates the instructions. A later PR will insert the target support check after determining the kind of replacement. If the result is not a LUT, then even targets without LUT support can replace the switch with something else.
2025-09-01	[RelLookupTableConverter] Make GEP type independent (#155404)	Nikita Popov	1	-44/+61
	This makes the RelLookupTableConverter independent of the type used in the GEP. In particular, it removes the requirement to have a leading zero index.
2025-08-31	[SimplifyCFG] Support trunc nuw in chain of comparisons. (#155087)	Andreas Jonson	1	-0/+9
	proof: https://alive2.llvm.org/ce/z/5PNCds
2025-08-30	[NFC] Fix typos 'seperate' -> 'separate' (#144368)	Roman	1	-1/+1
	Correct a few typos: 'seperate' -> 'separate'.
2025-08-29	[DirectX] Make dx.RawBuffer an op that can't be replaced (#154620)	Farzon Lotfi	1	-1/+1
	Fixes #152348. SimplifyCFG collapses a raw buffer store from an if/else load into a select. This change prevents the TargetExtType dx.RawBuffer from being replaced, thus preserving the if/else blocks. A further change was needed to eliminate the phi node before we process Intrinsic::dx_resource_getpointer in DXILResourceAccess.cpp.
2025-08-28	[SimplifyCFG] Move token type check into canReplaceOperandWithVariable()	Nikita Popov	2	-8/+3
	We cannot form phis/selects of token type, so this should be checked inside canReplaceOperandWithVariable().
2025-08-27	[DebugInfo] Drop extra DIBuilder::finalizeSubprogram() calls (NFC) (#155618)	Vladislav Dzhidzhoev	1	-1/+0
	After #139914, `DIBuilder::finalize()` finalizes both declaration and definition DISubprograms. Therefore, there is no need to call `DIBuilder::finalizeSubprogram()` right before `DIBuilder::finalize()`.
2025-08-26	[Transforms] Allow non-regex Source in SymbolRewriter in case of using ↵	Dmitry Vasilyev	1	-30/+42
	ExplicitRewriteDescriptor (#154319) Do not check that Source is a valid regex in case of Target (explicit) transformation. Source may contain special symbols that may cause an incorrect `invalid regex` error. Note that source and exactly one of [Target, Transform] must be provided. `Target (explicit transformation)`: In this kind of rule `Source` is treated as a symbol name and is matched in its entirety. `Target` field will denote the symbol name to transform to. `Transform (pattern transformation)`: This rule treats `Source` as a regex that should match the complete symbol name. `Transform` is a regex specifying the name to transform to.
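The difference between the two rule kinds can be sketched with the C++ standard library. This is not the SymbolRewriter implementation: `ExplicitRule`, `PatternRule`, and both functions are hypothetical, and `std::regex` (with `$1`-style references in the replacement) is used only as a stand-in for the regex engine LLVM uses.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical rule types mirroring the two kinds described above.
struct ExplicitRule {
  std::string Source; // literal symbol name
  std::string Target; // name to rewrite to
};
struct PatternRule {
  std::string Source;    // regex that must match the complete name
  std::string Transform; // replacement, may refer to capture groups
};

// Explicit: Source is compared as a plain string, so no regex is ever
// built and special characters in the name cannot cause a regex error.
std::string applyExplicit(const ExplicitRule &R, const std::string &Name) {
  assert(!R.Source.empty() && "explicit rule needs a source name");
  return Name == R.Source ? R.Target : Name;
}

// Pattern: Source is compiled as a regex and must match the whole name.
std::string applyPattern(const PatternRule &R, const std::string &Name) {
  const std::regex Re(R.Source); // throws std::regex_error if invalid
  if (!std::regex_match(Name, Re))
    return Name;
  return std::regex_replace(Name, Re, R.Transform);
}
```

A name such as `?foo@@YAXXZ` is a valid input to `applyExplicit`, whereas using it as a pattern source would fail because it is not a valid regular expression.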
2025-08-26	[NFC][SimplifyCFG] Simplify operators for the combined predicate in ↵	Mircea Trofin	1	-10/+10
	`mergeConditionalStoreToAddress` (#155058) This is about code readability. The operands in the disjunction forming the combined predicate in `mergeConditionalStoreToAddress` could sometimes be negated twice. This patch addresses that. 2 tests needed updating because they exposed the double negation and now they don’t.
2025-08-26	[SCEVExp] Check if getPtrToIntExpr resulted in CouldNotCompute.	Florian Hahn	1	-2/+5
	This fixes a crash trying to use SCEVCouldNotCompute, if getPtrToIntExpr failed. Fixes https://github.com/llvm/llvm-project/issues/155287
2025-08-25	[LoopPeel] Address followup comments on #121104 (#155221)	Ryotaro Kasuga	1	-11/+9
	This is a follow-up PR for post-commit comments in #121104. Details: - Rename `mergeTwoCounter` to `mergeTwoCounters` (add trailing `s`). - Avoid duplicated hash lookup. - Use `///` instead of `//`. - Fix typo.
2025-08-24	[NFC][SimplifyCFG] Fix a return value in `ConstantComparesGatherer` (#155154)	Yingwei Zheng	1	-1/+1
	`ICI->getOperand(0)` is non-null.