path: root/llvm/test/Transforms/CodeGenPrepare
Age | Commit message | Author | Files | Lines
13 days | [CodeGenPrepare] Fix infinite loop with same-type bitcasts (#176694) | nataliakokoromyti | 1 | -0/+27
OptimizeNoopCopyExpression was sinking same-type bitcasts (e.g. bitcast i32 to i32) which would then be reintroduced by optimizePhiType, causing an infinite loop. Fix by adding a check (PhiTy == ConvertTy) in optimizePhiType to skip the conversion when types are already identical. Fixes #176688.
2026-01-21 | [IR] Make dead_on_return attribute optionally sized | Aiden Grossman | 1 | -1/+1
This patch makes the dead_on_return parameter attribute optionally require a number of bytes to be passed in to specify the number of bytes known to be dead upon function return/unwind. This is aimed at enabling annotating the this pointer in C++ destructors with dead_on_return in clang. We need this to handle cases like the following:
```
struct X {
  int n;
  ~X() { this[n].n = 0; }
};
void f() { X xs[] = {42, -1}; }
```
Here we are only certain that sizeof(X) bytes are dead upon return of ~X. Otherwise DSE would be able to eliminate the store in ~X, which would not be correct. This patch only does the wiring within IR. Future patches will make clang emit correct sizing information and update DSE to only delete stores to objects marked dead_on_return that are provably in bounds of the number of bytes specified to be dead_on_return.
Reviewers: nikic, alinas, antoniofrighetto
Pull Request: https://github.com/llvm/llvm-project/pull/171712
2026-01-18 | [CGP][AArch64] Do not sink instructions that might read/write memory. (#176182) | David Green | 1 | -6/+92
The test case's call instruction was being sunk past the point where the memory it accessed was valid. Add a check that CGP does not try to sink instructions that might be invalid to move. Fixes #176095
2026-01-06 | [CGP] Use getSigned() for scale during address sinking | Nikita Popov | 1 | -0/+20
The scale is a signed quantity. This avoids an assertion failure with github.com/llvm/llvm-project/pull/171456.
2026-01-04 | [NFC] Delete unnecessary apostrophe at the end of its (#173974) | willmafh | 1 | -1/+1
2026-01-04 | [IR] Reland Optimize PHINode::removeIncomingValue() and PHINode::removeIncomingValueIf() to use the swapping strategy. (#174274) | Mingjie Xu | 1 | -1/+1
Reland #171963, #172639 and #173444, which were reverted in 86b9f90b9574b3a7d15d28a91f6316459dcfa046 because they introduced non-determinism in compiles. The non-determinism has been fixed in 9b8addffa70cee5b2acc5454712d9cf78ce45710.
2025-12-29 | Revert 159f1c048e08a8780d92858cfc80e723c90235e3 (#173893) | Walter Lee | 1 | -1/+1
This causes non-determinism in compiles. From nikic: "FYI the non-determinism is also visible on llvm-opt-benchmark. Maybe repeatedly running test cases from https://github.com/dtcxzyw/llvm-opt-benchmark/commit/299446d99f04024d5f569ce1f7e9338c9bcf55fe could reproduce the issue..." Also revert dependent 796fafeff92fe5d2d20594859e92607116e30a16 and e135447bda617125688b71d33480d131d1076a72.
2025-12-17 | [IR] Optimize PHINode::removeIncomingValue() by swapping removed incoming value with the last incoming value. (#171963) | Mingjie Xu | 1 | -1/+1
Current implementation uses `std::copy` to shift all incoming values after the removed index. This patch optimizes `PHINode::removeIncomingValue()` by replacing the linear shift of incoming values with a swap-with-last strategy. After this change, the relative order of incoming values after removal is not preserved. This improves compile-time for PHI nodes with many predecessors.
Depends:
https://github.com/llvm/llvm-project/pull/171955
https://github.com/llvm/llvm-project/pull/171956
https://github.com/llvm/llvm-project/pull/171960
https://github.com/llvm/llvm-project/pull/171962
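The difference between the two removal strategies can be sketched on a plain `std::vector`. This is only an illustration under stated assumptions: `removeByShift` and `removeBySwapWithLast` are hypothetical names, and the real `PHINode` stores incoming values and blocks in its own operand arrays rather than in a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: removes element Idx by shifting every later element
// down one slot. Linear in the number of trailing elements; order is kept.
template <typename T>
void removeByShift(std::vector<T> &Values, std::size_t Idx) {
  assert(Idx < Values.size() && "index out of range");
  Values.erase(Values.begin() + static_cast<std::ptrdiff_t>(Idx));
}

// Hypothetical helper: removes element Idx by overwriting it with the last
// element and dropping the tail. Constant time; relative order is NOT kept.
template <typename T>
void removeBySwapWithLast(std::vector<T> &Values, std::size_t Idx) {
  assert(Idx < Values.size() && "index out of range");
  if (Idx + 1 != Values.size())
    Values[Idx] = std::move(Values.back());
  Values.pop_back();
}
```

Any caller that relied on the order of the remaining elements sees a behaviour change with the second form, which is the kind of observable difference the commit message warns about.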
2025-10-25 | [CodeGenPrepare] Don't simplify incomplete expression tree in AddrModeCombine (#164628) | Yingwei Zheng | 1 | -0/+38
Since new select/phi instructions may construct loops, the expression tree to be simplified may still be incomplete (i.e., it may contain select with dummy values or phi without incoming values). This patch removes the call to simplifyInstruction for now, as it doesn't break existing tests. Original PR: https://reviews.llvm.org/D36073 Fix the crash reported in https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732.
2025-10-23 | [test][Transforms] Remove unsafe-fp-math uses part 1 (NFC) (#164742) | paperchalice | 3 | -9/+3
Post cleanup for #164534.
2025-10-20 | [IR] Replace alignment argument with attribute on masked intrinsics (#163802) | Nikita Popov | 7 | -55/+55
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter` intrinsics currently accept a separate alignment immarg. Replace this with an `align` attribute on the pointer / vector of pointers argument. This is the standard representation for alignment information on intrinsics, and is already used by all other memory intrinsics. This means the signatures now match llvm.expandload, llvm.vp.load, etc. (Things like llvm.memcpy used to have a separate alignment argument as well, but were already migrated a long time ago.) It's worth noting that the masked.gather and masked.scatter intrinsics previously accepted a zero alignment to indicate the ABI type alignment of the element type. This special case is gone now: If the align attribute is omitted, the implied alignment is 1, as usual. If ABI alignment is desired, it needs to be explicitly emitted (which the IRBuilder API already requires anyway).
2025-10-10 | [CGP] Fix missing sign extension for base offset in optimizeMemoryInst (#161377) | Vladimir Radosavljevic | 1 | -0/+81
If we have integers wider than 64 bits, we need to sign-extend them explicitly; otherwise we will get wrong zero-extended values.
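The effect of picking the wrong extension can be sketched at a smaller scale. The following is an analogy, not the code from the patch: it widens an 8-bit value to 64 bits, standing in for a 64-bit offset being widened to a larger integer type, and `zeroExtend` and `signExtend` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keeps the low FromBits bits of Value and fills the
// upper bits with zeros.
inline std::uint64_t zeroExtend(std::uint64_t Value, unsigned FromBits) {
  assert(FromBits >= 1 && FromBits <= 64 && "unsupported width");
  if (FromBits == 64)
    return Value;
  return Value & ((std::uint64_t{1} << FromBits) - 1);
}

// Hypothetical helper: keeps the low FromBits bits of Value and fills the
// upper bits with copies of the sign bit (bit FromBits - 1).
inline std::int64_t signExtend(std::uint64_t Value, unsigned FromBits) {
  assert(FromBits >= 1 && FromBits <= 64 && "unsupported width");
  if (FromBits == 64)
    return static_cast<std::int64_t>(Value);
  const std::uint64_t Low = zeroExtend(Value, FromBits);
  const std::uint64_t SignBit = std::uint64_t{1} << (FromBits - 1);
  // Flipping the sign bit and subtracting it back propagates it upwards.
  return static_cast<std::int64_t>((Low ^ SignBit) - SignBit);
}
```

An offset of -8 stored in 8 bits is the bit pattern 0xF8. Sign extension recovers -8, whereas zero extension yields 248, which would turn a small negative offset into a large positive one.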
2025-09-16 | [RISCV] Improve fixed vector handling in isCtpopFast. (#158380) | Craig Topper | 1 | -1/+1
Previously we considered fixed vectors fast if Zvbb or Zbb is enabled. Zbb only helps if the vector type will end up being scalarized.
2025-08-08 | [IR] Remove size argument from lifetime intrinsics (#150248) | Nikita Popov | 2 | -8/+8
Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-08-05 | [LLVM][CDP] Move AArch64 test into AArch64 directory. | Paul Walker | 1 | -0/+0
2025-08-05 | [LLVM][CGP] Allow finer control for sinking compares. (#151366) | Paul Walker | 1 | -0/+44
Compare sinking is selectable based on the result of hasMultipleConditionRegisters. This function is too coarse grained by not taking into account the differences between scalar and vector compares. This PR extends the interface to take an EVT to allow finer control. The new interface is used by AArch64 to disable sinking of scalable vector compares, but with isProfitableToSinkOperands updated to maintain the cases that are specifically tested.
2025-07-26 | [CodeGenPrepare] Make sure that `AddOffset` is also a loop invariant (#150625) | Yingwei Zheng | 1 | -0/+34
Closes https://github.com/llvm/llvm-project/issues/150611.
2025-07-03 | [CGP] Update tests to use autogen scripts, and refresh check lines | Philip Reames | 2 | -96/+547
Reducing manual update work required for an upcoming change.
2025-06-28 | [CodeGenPrepare] Filter out unrecreatable addresses from memory optimization (#143566) | Evgenii Kudriashov | 1 | -0/+106
Follow up on #139303
2025-06-26 | [tests] Additional coverage for gather/scatter address optimizations | Philip Reames | 1 | -0/+66
2025-06-06 | [CGP] Bail out if (Base|Scaled)Reg does not dominate insert point. (#142949) | Florian Hahn | 1 | -0/+76
(Base|Scaled)Reg may not dominate the chosen insert point, if there are multiple uses of the address. Bail out if that's the case, otherwise we will generate invalid IR. In some cases, we could probably adjust the insert point or hoist the (Base|Scaled)Reg. Fixes https://github.com/llvm/llvm-project/issues/142830. PR: https://github.com/llvm/llvm-project/pull/142949
2025-05-15 | [CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst (#139303) | weiguozhi | 1 | -0/+44
The function optimizeBlock may optimize a block multiple times. In the first iteration of the loop, MemoryInst1 may generate a sunk instruction and store it in SunkAddrs. In the second iteration, MemoryInst2 may use the same address and reuse the sunk instruction stored in SunkAddrs, but MemoryInst2 may come before MemoryInst1 and the corresponding sunk instruction. To avoid a use-before-def error, we need to find an appropriate insert position for the sunk instruction. Fixes #138208.
2025-05-08 | Propagate DebugLocs on phis in BreakCriticalEdges (#133492) | Orlando Cazalet-Hyams | 1 | -0/+51
The pull request discusses whether this change is needed or not. We leant towards "it can't hurt" on the basis that it's at worst slightly unnecessary (but not incorrect). The motivation for the patch came from reviewing code duplication sites to update for Key Instructions, finding this, trying to generate a test case and seeing that the DebugLocs aren't propagated.
2025-04-23 | [CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731) | Sergei Barannikov | 2 | -0/+208
DAG combiner already does this transformation, but in some cases it does not have a chance because either CodeGenPrepare or SelectionDAGBuilder move icmp to a different basic block. https://alive2.llvm.org/ce/z/ARzh99 Fixes #94829 Pull Request: https://github.com/llvm/llvm-project/pull/102731
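The equivalence this kind of unfolding relies on can be sketched as follows. The function names are hypothetical, and the exact instruction sequence the patch emits is not stated in the commit message, so the two bit-trick forms here are only well-known candidates, not a description of the pass.

```cpp
#include <bitset>
#include <cstdint>

// Reference form: a value is a power of two iff exactly one bit is set.
inline bool isPow2ViaPopcount(std::uint32_t X) {
  return std::bitset<32>(X).count() == 1;
}

// Unfolded form 1: clearing the lowest set bit must leave zero, and the
// value itself must be non-zero.
inline bool isPow2ViaAnd(std::uint32_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Unfolded form 2: a single unsigned compare that also rejects zero.
inline bool isPow2ViaXor(std::uint32_t X) {
  return (X ^ (X - 1)) > (X - 1);
}

// Checks that all three forms agree for every value below Limit.
inline bool formsAgreeBelow(std::uint32_t Limit) {
  for (std::uint32_t X = 0; X < Limit; ++X)
    if (isPow2ViaPopcount(X) != isPow2ViaAnd(X) ||
        isPow2ViaPopcount(X) != isPow2ViaXor(X))
      return false;
  return true;
}
```

On a target without a fast population count, either unfolded form costs a handful of simple ALU operations instead of a multi-instruction popcount expansion.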
2025-04-10 | [Verifier][CGP] Allow integer argument to dbg_declare (#134803) | Nikita Popov | 1 | -0/+33
Relaxes the newly added verifier rule to also allow an integer argument to dbg_declare, which is interpreted as a pointer. Adjust CGP to deal with it gracefully. Fixes https://github.com/llvm/llvm-project/issues/134523. Alternative to https://github.com/llvm/llvm-project/pull/134601.
2025-03-14 | [RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298) | Jeremy Morse | 5 | -5/+0
These date back to when the non-intrinsic format of variable locations was still being tested and was behind a compile-time flag, so not all builds / bots would correctly run them. The solution at the time, to get at least some test coverage, was to have tests opt-in to non-intrinsic debug-info if it was built into LLVM. Nowadays, non-intrinsic format is the default and has been on for more than a year, there's no need for this flag to exist. (I've downgraded the flag from "try" to explicitly requesting non-intrinsic format in some places, so that we can deal with tests that are explicitly about non-intrinsic format in their own commit).
2025-02-06 | [IR] Generalize Function's {set,get}SectionPrefix to GlobalObjects, the base class of {Function, GlobalVariable, IFunc} (#125757) | Mingming Liu | 2 | -4/+4
This is a split of https://github.com/llvm/llvm-project/pull/125756
2025-01-29 | [IR] Convert from nocapture to captures(none) (#123181) | Nikita Popov | 1 | -4/+4
This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.
2025-01-28 | [Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767) | Stephen Tozer | 1 | -1/+1
This patch contains a number of changes relating to the above flag; primarily it updates comment references to the old flag names, "-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names, "-fextend-variable-liveness[={all,this}]". These changes are all NFC. This patch also removes the explicit -fextend-this-ptr-liveness flag alias, and shortens the help-text for the main flag; these are both changes that were meant to be applied in the initial PR (#110000), but due to some user-error on my part they were not included in the merged commit.
2025-01-06 | [AArch64] Improve codegen of vectorised early exit loops (#119534) | David Sherwood | 1 | -0/+189
Once PR #112138 lands we are able to start vectorising more loops that have uncountable early exits. The typical loop structure looks like this:

  vector.body:
    ...
    %pred = icmp eq <2 x ptr> %wide.load, %broadcast.splat
    ...
    %or.reduc = tail call i1 @llvm.vector.reduce.or.v2i1(<2 x i1> %pred)
    %iv.cmp = icmp eq i64 %index.next, 4
    %exit.cond = or i1 %or.reduc, %iv.cmp
    br i1 %exit.cond, label %middle.split, label %vector.body

  middle.split:
    br i1 %or.reduc, label %found, label %notfound

  found:
    ret i64 1

  notfound:
    ret i64 0

The problem with this is that %or.reduc is kept live after the loop, and since this is a boolean it typically requires making a copy of the condition code register. For AArch64 this requires an additional cset instruction, which is quite expensive for a typical find loop that only contains 6 or 7 instructions. This patch attempts to improve the codegen by sinking the reduction out of the loop to the location of its user. It's a lot cheaper to keep the predicate alive if the type is legal and has lots of registers for it. There is a potential downside in that a little more work is required after the loop, but I believe this is worth it since we are likely to spend most of our time in the loop.
2024-12-01 | [CodeGenPrepare] Drop nsw flags in `optimizeLoadExt` (#118180) | Yingwei Zheng | 1 | -0/+86
Alive2: https://alive2.llvm.org/ce/z/pMcD7q Closes https://github.com/llvm/llvm-project/issues/118172.
2024-11-11 | [llvm] Remove `br i1 undef` from some regression tests [NFC] (#115691) | Lee Wei | 2 | -4/+4
This PR aims to remove undefined behavior from tests under the directory `llvm/transforms/CodegenPrepare, ConstantHoisting, Coroutines` etc.
2024-11-06 | [LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) | Paul Walker | 3 | -14/+14
2024-10-31 | [CGP] [CodeGenPrepare] Folding `urem` with loop invariant value plus offset (#104724) | goldsteinn | 1 | -7/+9
This extends the existing fold:
```
for(i = Start; i < End; ++i)
  Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant;
```
->
```
Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
  Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```
so that it works with a non-zero `IncrLoopInvariant`. This is a common usage in cases such as:
```
for(i = 0; i < N; ++i)
  if (((i + 1) % X) == 0)
    do_something_occasionally_but_not_first_iter();
```
Alive2 w/ i4/unrolled 6x (needs to be run locally due to timeout): https://alive2.llvm.org/ce/z/6tgyN3
Exhaust proof over all uint8_t combinations in C++: https://godbolt.org/z/WYa561388
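The idea behind the fold can be checked with a small standalone sketch. It is not the CodeGenPrepare implementation: `remaindersWithUrem` and `remaindersWithCounter` are hypothetical names, and the sketch assumes `RemAmt` is non-zero and that `I + Incr` never wraps, which mirrors the `nuw` requirement in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Original shape: one unsigned remainder per iteration.
inline std::vector<std::uint32_t>
remaindersWithUrem(std::uint32_t Start, std::uint32_t End, std::uint32_t Incr,
                   std::uint32_t RemAmt) {
  assert(RemAmt != 0 && "remainder by zero");
  std::vector<std::uint32_t> Out;
  for (std::uint32_t I = Start; I < End; ++I)
    Out.push_back((I + Incr) % RemAmt);
  return Out;
}

// Folded shape: a single remainder before the loop, then a counter that is
// incremented each iteration and reset to zero when it reaches RemAmt.
inline std::vector<std::uint32_t>
remaindersWithCounter(std::uint32_t Start, std::uint32_t End,
                      std::uint32_t Incr, std::uint32_t RemAmt) {
  assert(RemAmt != 0 && "remainder by zero");
  std::vector<std::uint32_t> Out;
  std::uint32_t Rem = (Start + Incr) % RemAmt; // the only division
  for (std::uint32_t I = Start; I < End; ++I) {
    Out.push_back(Rem);
    ++Rem;
    if (Rem == RemAmt)
      Rem = 0;
  }
  return Out;
}
```

Both functions produce the same sequence for any inputs that satisfy the stated assumptions, so the per-iteration division can be traded for an increment, a compare and a select.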
2024-09-02 | [CGP] Regenerate `revert-constant-ptr-propagation-on-calls.ll` test (NFC) | Antonio Frighetto | 1 | -0/+1
Multiple buildbots were previously failing.
2024-09-02 | [CGP] Undo constant propagation of pointers across calls | Antonio Frighetto | 1 | -1/+1
It may be profitable to revert SCCP propagation of C++ static values, if such constants are pointers, in order to avoid redundant pointer computation, since the method returning the constant is non-removable.
2024-09-02 | [CGP] Introduce test for PR102926 (NFC) | Antonio Frighetto | 1 | -0/+169
2024-08-29 | [ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149) | Stephen Tozer | 2 | -0/+87
This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-22 | [X86] Allow speculative BSR/BSF instructions on targets with CMOV (#102885) | Simon Pilgrim | 1 | -61/+19
Currently targets without LZCNT/TZCNT won't speculate with BSR/BSF instructions in case they have a zero value input, meaning we always insert a test+branch for the zero-input case. This patch proposes we allow speculation if the target has CMOV, and perform a branchless select instead to handle the zero input case. This will predominately help x86-64 targets where we haven't set any particular cpu target. We already always perform BSR/BSF instructions if we were lowering a CTLZ/CTTZ_ZERO_UNDEF instruction.
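The two shapes described above can be sketched at source level. This is only a semantic illustration: `scanForwardRaw` is a hypothetical stand-in for a BSF-style instruction whose result is meaningless for a zero input, and whether a compiler actually emits a branch or a conditional move for either function is not something this sketch controls.

```cpp
#include <cstdint>

// Hypothetical stand-in for a BSF-style scan: correct for non-zero input;
// for zero it returns a value that callers must not rely on.
inline unsigned scanForwardRaw(std::uint32_t X) {
  unsigned N = 0;
  while (N < 31 && ((X >> N) & 1u) == 0)
    ++N;
  return N;
}

// Branchy shape: test the input first and only scan when it is non-zero.
inline unsigned cttzWithBranch(std::uint32_t X) {
  if (X == 0)
    return 32;
  return scanForwardRaw(X);
}

// Speculated shape: scan unconditionally, then select the defined answer
// for the zero-input case.
inline unsigned cttzWithSelect(std::uint32_t X) {
  const unsigned Raw = scanForwardRaw(X); // computed even when X == 0
  return X == 0 ? 32u : Raw;
}
```

The second shape is only safe because the scan has no side effects and its garbage result is discarded by the select, which is the property that makes speculation legal here.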
2024-08-21 | [AArch64] Bail out for scalable vecs in areExtractShuffleVectors (#105484) | Sjoerd Meijer | 1 | -0/+19
The added test triggers the following assert in `areExtractShuffleVectors` that is called from `shouldSinkOperands`: Assertion `(!isScalable() || isZero()) && "Request for a fixed element count on a scalable object"' failed. I don't think scalable types can be extract shuffles, so bail early if this is the case.
2024-08-20 | Recommit "[CodeGenPrepare] Folding `urem` with loop invariant value" | Noah Goldstein | 1 | -13/+33
The original was missing the remainder on the `Start` value. Also changed the logic as nikic suggested (getting the loop from `PN` instead of `Rem`). The prior impl increased the complexity of the code and made debugging it more difficult. Closes #104877
2024-08-20 | [CodeGenPrepare][X86] Add tests for fixing `urem` transform; NFC | Noah Goldstein | 1 | -2/+189
2024-08-18 | Revert "[CodeGenPrepare] Folding `urem` with loop invariant value" | Noah Goldstein | 1 | -29/+8
This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d. Seems to be causing stage2 failures on buildbots. Reverting while I investigate.
2024-08-18 | [CodeGenPrepare] Folding `urem` with loop invariant value | Noah Goldstein | 1 | -8/+29
```
for(i = Start; i < End; ++i)
  Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant;
```
->
```
Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
  Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```
In its current state, the fold applies only if `IncrLoopInvariant` and `Start` are both zero. Alive2 seemed unable to prove this (see: https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still checks out...) so I wrote an exhaustive test here: https://godbolt.org/z/WYa561388
Closes #96625
2024-08-18 | [CodeGenPrepare][X86] Add tests for folding `urem` with loop invariant value; NFC | Noah Goldstein | 1 | -0/+858
2024-08-09 | [AArch64] Sink operands to fmuladd. (#102297) | David Green | 1 | -0/+245
A fmuladd can be treated as a fma when sinking operands to the intrinsic, similar to D126234. Addresses a small part of #102195
2024-06-18 | [CodeGenPrepare] Use MapVector to stabilize iteration order | Fangrui Song | 1 | -1/+3
DenseMap iteration order is not guaranteed to be deterministic. Without the change, llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would fail when `combineHashValue` changes (#95970). Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d
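Why an insertion-ordered map gives stable output can be sketched with the standard library alone. The class below is a simplified illustration, not LLVM's `MapVector`: `InsertionOrderedMap` and `keysInOrder` are hypothetical names, and the real container has a richer interface.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified sketch of an insertion-ordered map: a hash index gives fast
// lookup, while a vector records entries in the order they were first added.
template <typename K, typename V>
class InsertionOrderedMap {
public:
  using Entry = std::pair<K, V>;

  // Returns the value for Key, inserting a default-constructed value first
  // if the key has not been seen before.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order depends only on insertion
  // order and never on the hash function.
  typename std::vector<Entry>::const_iterator begin() const {
    return Entries.begin();
  }
  typename std::vector<Entry>::const_iterator end() const {
    return Entries.end();
  }
  std::size_t size() const { return Entries.size(); }

private:
  std::unordered_map<K, std::size_t> Index;
  std::vector<Entry> Entries;
};

// Collects the keys in iteration order.
template <typename K, typename V>
std::vector<K> keysInOrder(const InsertionOrderedMap<K, V> &Map) {
  std::vector<K> Keys;
  for (const auto &E : Map)
    Keys.push_back(E.first);
  return Keys;
}
```

With this layout, changing the hash function can alter lookup cost but not the order in which entries are visited, which is the property a pass needs when its output depends on iteration order.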
2024-06-14 | [RemoveDIs] Print IR with debug records by default (#91724) | Stephen Tozer | 5 | -34/+115
This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such.
For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-05-30 | [ConstantFold] Remove notional over-indexing fold (#93697) | Nikita Popov | 4 | -4/+4
The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities.
2024-05-14 | [test][LoongArch] Add -mattr=+d option. NFC | wanglei | 1 | -1/+1
Because most of tests assume target-abi=`lp64d`, adding the corresponding feature is reasonable. rg -l loongarch -g '!*.s' | xargs sed -i '/mtriple=loongarch/ {/-mattr=/!{/target-abi/! s/mtriple=loongarch.. /&-mattr=+d /}}'