path: root/llvm/lib/CodeGen
Age  Commit message  Author  Files  Lines
2025-07-15[DAGCombiner][AArch64] Prevent SimplifyVCastOp from creating illegal scalar ↵Craig Topper1-8/+10
types after type legalization. (#148970) Fixes #148949
2025-07-15[IA] Use a single callback for lowerDeinterleaveIntrinsic [nfc] (#148978)Philip Reames1-12/+7
This essentially merges the handling for VPLoad - currently in lowerInterleavedVPLoad, which is shared between shuffle- and intrinsic-based interleaves - into the existing dedicated routine. If we like this factoring, my plan is to do the same for the intrinsic store paths, and then remove the excess generality from the shuffle paths, since we don't need to support both modes in the shared VPLoad/Store callbacks. We can probably even fold the VP versions into the non-VP shuffle variants in an analogous way.
2025-07-15[SelectionDAG] improve error messages for invalid operator bundle (#148945)Florian Mayer1-34/+34
2025-07-15[CodeGen] Use setNoVRegs. NFC. (#148831)Jay Foad1-2/+1
2025-07-15[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)Jeremy Morse1-1/+1
There are no longer debug-info instructions, thus we don't need this skipping. Hooray!
2025-07-15SafeStack: Check if __safestack_pointer_address is available (#147917)Matt Arsenault1-3/+14
Start using RuntimeLibcalls in the base implementation of getSafeStackPointerLocation instead of hardcoding the function names.
2025-07-15[LLVM][DAGCombiner] Fix size calculations in calculateByteProvider. (#148425)Paul Walker1-2/+2
calculateByteProvider only cares about scalars or a single element within a vector. For the latter, there is the VectorIndex parameter to identify the element. All other properties, and specifically Index, are related to the underlying scalar type, and thus when taking the size of a type it's the scalar size that matters. Fixes https://github.com/llvm/llvm-project/issues/148387
2025-07-15SafeStack: Emit __safestack_pointer_address call through RuntimeLibcalls ↵Matt Arsenault1-1/+9
(#147916) Stop using hardcoded function names and check availability. This only fixes the forced usage via command line in the pass itself; the implementations inside of TargetLoweringBase hide additional call emission.
2025-07-15[NFC] Hoist pseudo probe desc emission code for reuse (#148756)Haohai Wen1-22/+1
This PR is part of #123870. The pseudo probe desc emission code can be reused by other targets.
2025-07-15SafeStack: Emit call to __stack_chk_fail through RuntimeLibcalls (#147915)Matt Arsenault1-1/+9
Avoid hardcoding the function name, and query if it's really supported or not.
2025-07-15StackProtector: Use RuntimeLibcalls to query libcall names (#147913)Matt Arsenault1-10/+18
The compiler should not introduce calls to arbitrary strings that aren't defined in RuntimeLibcalls. Previously OpenBSD was disabling the default __stack_chk_fail, but there was no record of the alternative __stack_smash_handler function it emits instead. This also avoids a random triple check in the pass.
2025-07-15[DAG] canCreateUndefOrPoison - add handling for ISD::ABS nodes (#148791)Simon Pilgrim1-0/+5
Unlike the abs intrinsic, the ISD::ABS node defines ABS(INT_MIN) -> INT_MIN, so no undef/poison is created by the node itself
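The wrap-around behaviour described above can be illustrated outside of LLVM. This is a minimal sketch, assuming 32-bit two's complement integers; `wrappingAbs` and `checkWrappingAbs` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): models an abs that wraps, so the
// most negative value maps to itself rather than producing undef/poison.
int32_t wrappingAbs(int32_t X) {
  // Negate in unsigned arithmetic so INT32_MIN wraps instead of overflowing.
  uint32_t Negated = 0u - static_cast<uint32_t>(X);
  // The cast back is modular in C++20 and on mainstream compilers before it.
  return X < 0 ? static_cast<int32_t>(Negated) : X;
}

// Small usage check: ordinary values behave like abs, INT32_MIN is a fixed point.
void checkWrappingAbs() {
  assert(wrappingAbs(-5) == 5);
  assert(wrappingAbs(7) == 7);
  assert(wrappingAbs(std::numeric_limits<int32_t>::min()) ==
         std::numeric_limits<int32_t>::min());
}
```

Because every input has a defined result, a node with these semantics introduces no undef or poison of its own.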
2025-07-14[llvm] Remove unused includes (NFC) (#148768)Kazu Hirata2-2/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-07-14[CodeGen] Remove an unnecessary cast (NFC) (#148764)Kazu Hirata1-1/+1
getExpression() already returns DIExpression *.
2025-07-14[DAGCombiner] Pass SDNodeFlags to getNode instead of modifying nodes. (#148744)Craig Topper1-11/+8
getNode has logic to intersect flags correctly if the new node happens to CSE with an existing node. Setting node flags after getNode bypasses this logic and may change the node for other uses where the flags don't hold.
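The hazard can be sketched with a toy model. Everything below (`ToyDAG`, `ToyNode`, the flag constants) is hypothetical and only mimics the idea of a CSE map that intersects flags; it is not the SelectionDAG implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>

// Toy flag bits, loosely analogous to no-signed-wrap / no-unsigned-wrap.
constexpr uint8_t NoSignedWrap = 1;
constexpr uint8_t NoUnsignedWrap = 2;

struct ToyNode {
  uint8_t Flags;
};

class ToyDAG {
  // Nodes are uniqued by (opcode, lhs id, rhs id).
  std::map<std::tuple<int, int, int>, ToyNode> CSEMap;

public:
  ToyNode &getNode(int Opc, int LHS, int RHS, uint8_t Flags) {
    auto Key = std::make_tuple(Opc, LHS, RHS);
    auto It = CSEMap.find(Key);
    if (It != CSEMap.end()) {
      // The node is shared: keep only flags that hold for every user.
      It->second.Flags &= Flags;
      return It->second;
    }
    return CSEMap.emplace(Key, ToyNode{Flags}).first->second;
  }
};

// Two requests for the same node with different flags end up sharing one
// node whose flags are the intersection of both requests.
uint8_t flagsAfterCSE() {
  ToyDAG DAG;
  DAG.getNode(/*Opc=*/1, /*LHS=*/10, /*RHS=*/11, NoSignedWrap | NoUnsignedWrap);
  ToyNode &Shared = DAG.getNode(1, 10, 11, NoSignedWrap);
  return Shared.Flags;
}
```

Assigning flags to the returned node afterwards would skip the intersection step and could attach flags that are not valid for the node's other users.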
2025-07-14[SelectionDAG] improve error message for invalid op bundles (#148722)Florian Mayer1-6/+23
2025-07-14[SelectionDAG] [KCFI] Allow "kcfi" on invoke (#148742)Florian Mayer1-1/+1
This is handled in CallBase, so it is valid for both call and invoke
2025-07-14[DAGCombiner] Pass SDNodeFlags to getSelect instead of modifying the node ↵Craig Topper1-12/+5
returned. (#148733)
2025-07-14[IA][NFC] Factoring out helper functions that extract (de)interleaving ↵Min-Yih Hsu1-64/+10
factors (#148689) Factoring out and combining `isInterleaveIntrinsic`, `isDeinterleaveIntrinsic`, and `getIntrinsicFactor` into `getInterleaveIntrinsicFactor` and `getDeinterleaveIntrinsicFactor` inside VectorUtils. NFC.
2025-07-14[DAG] SelectionDAG::canCreateUndefOrPoison - add ISD::FCOPYSIGN (#148617)woruyu1-0/+1
### Summary This PR resolves https://github.com/llvm/llvm-project/issues/147694
2025-07-14[GlobalISel] Allow expanding of sdiv -> mul by constant (#146504)jyli01161-20/+114
Allows the sdiv -> mul by constant combine to expand in the general case. Previously this only occurred in the exact case. This is part of the resolution to issue #118090
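As an illustration of the general (non-exact) technique, here is a minimal sketch of signed 32-bit division by the constant 3 using a multiply-high and a sign correction. The helper names are hypothetical and the magic constant is the standard one for this divisor; this is not the code the combine emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes N / 3 (rounding toward zero) without a divide.
int32_t sdivBy3(int32_t N) {
  const int64_t Magic = 0x55555556; // ceil(2^32 / 3)
  // High 32 bits of the 64-bit product. Assumes >> on a negative value is an
  // arithmetic shift (guaranteed in C++20, universal in practice before it).
  int32_t Hi = static_cast<int32_t>((Magic * static_cast<int64_t>(N)) >> 32);
  // The multiply-high rounds toward negative infinity; add the sign bit of N
  // so negative results round toward zero like sdiv does.
  return Hi + static_cast<int32_t>(static_cast<uint32_t>(N) >> 31);
}

// Usage check against the built-in division operator.
void checkSdivBy3() {
  assert(sdivBy3(7) == 7 / 3);
  assert(sdivBy3(-7) == -7 / 3);
  assert(sdivBy3(0) == 0);
}
```

The exact case needs no such rounding fix-up, which is why it is the simpler of the two to support.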
2025-07-14[CodeGen][NPM] VirtRegRewriter: Set VirtReg flag (#148107)Vikram Hegde1-0/+2
Same as https://github.com/llvm/llvm-project/pull/138660. Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-13[AArch64] "Support" debug info for SVE types on Windows. (#147865)Eli Friedman1-2/+9
There isn't any way to encode a variable in an SVE register, and there isn't any way to encode a scalable offset, and as far as I know that's unlikely to change in the near future. So suppress any debug info which would require those encodings. This isn't ideal, but we need to ship something which doesn't crash. Alternatively, for Z registers, we could emit debug info assuming the vector length is 128 bits, but that seems like it would lead to unintuitive results. The change to AArch64FrameLowering is needed to avoid a crash. But we can't actually test that the returned offset is correct: LiveDebugValues performs the query, then discards the result.
2025-07-13DAG: Use fast variants of fast math libcalls (#147481)Matt Arsenault1-26/+99
Hexagon currently has an untested global flag to control fast math variants of libcalls. Add fast variants as explicit libcall options so this can be a flag based lowering decision, and implement it. I have no idea what fast math flags the hexagon case requires, so I picked the maximally potentially relevant set of flags although this probably is refinable per call. Looking in compiler-rt, I'm not sure if the fast variants are anything more than aliases.
2025-07-11[CodeGen] Do not use substituteRegister to update implicit def (#148068)Peiming Liu1-8/+21
It seems `substituteRegister` checks `FromReg == ToReg` instead of `TRI->isSubRegisterEq`. This PR simply reverts the original PR (https://github.com/llvm/llvm-project/pull/131361) to its initial implementation (without using `substituteRegister`). Not sure whether it is a desired fix (and I am by no means an expert on the LLVM backend), but it does fix a numeric error on our internal workload. Original author: @sdesmalen-arm
2025-07-11[X86][GlobalISel] Added support for llvm.get.rounding (#147716)JaydeepChauhan141-0/+3
- This implementation is adapted from SDAG X86TargetLowering::LowerGET_ROUNDING. - llvm.set.rounding will be added later because it involves MXCSR updates currently unsupported.
2025-07-11[DAGCombine] Change isBuildVectorAll* -> isConstantSplatVectorAll* for ↵jjasmine1-4/+4
Vselect (#147305) Change isBuildVectorAll* -> isConstantSplatVectorAll* in VSelect in case the fold happens after BuildVector has been canonically transformed to Splat, or if the Splat is initially in vselect already.
- Fixes #73454
- Update related test cases, add extra tests in wasm

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-07-11[NFC] Correct typo: invertion -> inversion (#147995)Fraser Cormack2-6/+6
2025-07-10[NFC] Split UniqueBBID definition to a separate file. (#148043)Rahman Lavaee3-0/+3
2025-07-11[MachinePipeliner] Add validation for missed loop-carried memory deps (#145878)Ryotaro Kasuga1-40/+154
This patch adds an additional validation step to ensure that the generated schedule does not violate loop-carried memory dependencies. Prior to this patch, incorrect schedules could be produced due to the lack of checks for the following types of dependencies:
- load-to-store backward (from bottom to top within the BB) dependencies
- store-to-load dependencies
- store-to-store dependencies

One possible solution to this issue is to add these dependencies directly to the dependency graph, although doing so may lead to performance degradation. In addition, no known cases of incorrect code generation caused by these missing dependencies have been observed in practice. Given these factors, this patch introduces a post-scheduling validation phase to check for such previously missed dependencies, instead of adding them to the graph before searching for a schedule. Since no actual problems have been identified so far, it is likely that most generated schedules are already valid. Therefore, this additional validation is not expected to cause performance degradation in practice.

Split off from #135148. The remaining tasks are as follows:
- Address other missing loop-carried dependencies (e.g., output dependencies between physical registers, barrier instructions, and instructions that may raise floating-point exceptions)
- Remove code that is currently retained to maintain the existing behavior but is probably unnecessary.
- Eliminate `SwingSchedulerDAG::isLoopCarriedDep` and use `SwingSchedulerDDG` to traverse edges after the dependency analysis part.
2025-07-11RuntimeLibcalls: Add entries for objc runtime calls (#147920)Matt Arsenault1-28/+35
Stop emitting these calls by name in PreISelIntrinsicLowering. This is still kind of a hack. We should be going through the abstract RTLIB::Libcall, and then checking if the call is really supported in this module. Do this as a placeholder until RuntimeLibcalls is a module analysis.
2025-07-10[DAG] Handle truncated splat in isBoolConstant (#145473)David Green2-7/+9
This allows truncated splat / buildvector in isBoolConstant, to allow certain not instructions to be recognized post-legalization, and allow vselect to optimize. An override for x86 avx512 predicated vectors is required to avoid an infinite recursion from the code that detects zero vectors. From:
```
// Check if the first operand is all zeros and Cond type is vXi1.
// If this an avx512 target we can improve the use of zero masking by
// swapping the operands and inverting the condition.
```
2025-07-10[llvm] export private symbols needed by unittests (#145767)Andrew Rogers5-47/+57
## Purpose
Export a small number of private LLVM symbols so that unit tests can still build/run when LLVM is built as a Windows DLL or a shared library with default hidden symbol visibility.

## Background
The effort to build LLVM as a Windows DLL is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307).

Some LLVM unit tests use internal/private symbols that are not part of LLVM's public interface. When building LLVM as a DLL or shared library with default hidden symbol visibility, the symbols are not available when the unit test links against the DLL or shared library. This problem can be solved in one of two ways:
1. Export the private symbols from the DLL.
2. Link the unit tests against the intermediate static libraries instead of the final LLVM DLL.

This PR applies option 1. Based on the discussion of option 2 in #145448, this option is preferable.

## Overview
* Adds a new `LLVM_ABI_FOR_TEST` export macro, which is currently just an alias for `LLVM_ABI`.
* Annotates the sub-set of symbols under `llvm/lib` that are required to get unit tests building using the new macro.
2025-07-10[CodeGen] commuteInstruction should update implicit-def (#131361)Sander de Smalen1-1/+8
When the RegisterCoalescer adds an implicit-def when coalescing a SUBREG_TO_REG (#123632), this causes issues when removing other COPY nodes by commuting the instruction because it doesn't take the implicit-def into consideration. This PR fixes that.
2025-07-10[InlineSpiller] Drop unused elements in Virt2SiblingsMap. NFC (#147866)csstormq1-1/+1
2025-07-10Reland "[CodeGen] Expose the extensibility of PassConfig to plugins (#139059)"Tcc1001-0/+2
Add missing dependencies to unittest target. The original patch broke BUILD_SHARED bots and required revert #147947.
2025-07-10Revert "[CodeGen] Expose the extensibility of PassConfig to plugins" (#147947)Jan Patrick Lehr1-2/+0
Reverts llvm/llvm-project#139059. This broke https://lab.llvm.org/buildbot/#/builders/10/builds/9125/steps/8/logs/stdio. The bot does a SHARED_LIBS=ON build. I can reproduce locally with the CMake cache file in offload/cmake/caches/AMDGPUBot.cmake as the build config.
2025-07-10[CodeGen] Expose the extensibility of PassConfig to plugins (#139059)Tcc1001-0/+2
This PR exposes the backend pass config to plugins via a callback. Plugin authors can register a callback that is triggered before the target backend adds its passes to the pipeline. In the callback they then get access to the `TargetMachine`, the `PassManager`, and the `TargetPassConfig`. This allows plugins to call `TargetPassConfig::insertPass`, which is honored in the subsequent `addPass` of the main backend. We implemented this using the legacy pass manager since backends still use it as the default.
2025-07-10TargetLowering: Avoid a use of PointerType::getUnqual (#147884)Matt Arsenault1-1/+3
Use the default globals address space
2025-07-10[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690)Vikram Hegde2-30/+54
2025-07-10[TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to ↵Boyao Wang2-6/+8
take LLVM Context (#147664) Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering, so that we can use EVT::getVectorVT to generate an EVT type in getOptimalMemOpType. Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
2025-07-09[LegalizeTypes] Preserve disjoint flag when expanding OR. (#147640)Craig Topper1-2/+7
2025-07-09MachineLICM: Merge logic for implicit and explicit definitions.Peter Collingbourne1-15/+5
Anatoly Trosinenko found that when hasSideEffect was set to 0 in the definition of LOADgotAUTH, the MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started to crash. The issue was traced down to the MachineLICM pass placing LOADgotAUTH right after an unrelated copy to x16, rewriting this code:
```
bb.0:
  renamable $x16 = COPY renamable $x12
  B %bb.1
bb.1:
  ...
  /* use $x16 */
  ...
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  /* use $x20 */
  ...
```
into the following:
```
bb.0:
  renamable $x16 = COPY renamable $x12
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  B %bb.1
bb.1:
  ...
  /* use $x16 */
  ...
  /* use $x20 */
  ...
```
The issue was caused by inconsistent logic between implicit and explicit operand definitions, where the implicit side was incorrectly skipping checking RUDefs for dead operands, leading to RuledOut not being set for the X16 operand. Because there isn't really a semantic difference between implicit and explicit operands at this point, let's remove the isImplicit check and adjust the logic to do the same thing in both cases:
- For implicit operands, we now check and update RUDefs in the same way as explicit operands.
- For explicit operands, we now allow dead operands to be skipped.

Reviewers: arsenm, s-barannikov, atrosinenko
Reviewed By: arsenm, s-barannikov
Pull Request: https://github.com/llvm/llvm-project/pull/147624
2025-07-09[IA] Partially revert interface change from 4a66baPhilip Reames1-3/+3
As noted in post commit review, the API change here was not required. I'd apparently confused myself when teasing apart patches from my development branch.
2025-07-09[SHT_LLVM_BB_ADDR_MAP] Emit callsite offsets in the `SHT_LLVM_BB_ADDR_MAP` ↵Rahman Lavaee1-7/+30
section. (#146563) Callsite offsets will help map addresses to the right position in the basic block (before or after a callsite). This PR also bumps the BBAddrMap version to 3. The encoding/decoding ability was already pushed upstream in 8d7a8fcc3ab9f6d4c4a7e4312876fe94ed3d6c4f.
2025-07-09[GlobalISel] Add Saturated Truncate Instructions (#147526)jyli01161-0/+6
Introduces saturated truncate instructions to GlobalISel: G_TRUNC_SSAT_S, G_TRUNC_SSAT_U, G_TRUNC_USAT_U. These were previously introduced to SDAG to reduce redundant code. This patch only introduces the instructions; a later patch will follow to add combines and legalization for each instruction.
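A scalar sketch of what saturating truncation means, narrowing 32 bits to 8 bits. The function names are hypothetical, and the mapping of the three opcodes to "signed input, signed result", "signed input, unsigned result" and "unsigned input, unsigned result" is an assumption based on the opcode suffixes, not something stated in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed analogue of G_TRUNC_SSAT_S: signed input clamped to [-128, 127].
int8_t truncSSatS(int32_t X) {
  return static_cast<int8_t>(std::max<int32_t>(-128, std::min<int32_t>(X, 127)));
}

// Assumed analogue of G_TRUNC_SSAT_U: signed input clamped to [0, 255].
uint8_t truncSSatU(int32_t X) {
  return static_cast<uint8_t>(std::max<int32_t>(0, std::min<int32_t>(X, 255)));
}

// Assumed analogue of G_TRUNC_USAT_U: unsigned input clamped to [0, 255].
uint8_t truncUSatU(uint32_t X) {
  return static_cast<uint8_t>(std::min<uint32_t>(X, 255u));
}

// Usage check: out-of-range inputs saturate instead of wrapping.
void checkSaturatedTruncs() {
  assert(truncSSatS(1000) == 127);
  assert(truncSSatS(-1000) == -128);
  assert(truncSSatU(-5) == 0);
  assert(truncUSatU(300u) == 255);
}
```

Without dedicated instructions, each of these would be expressed as a min/max clamp followed by a plain truncate.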
2025-07-09[IA] Support deinterleave intrinsics w/ fewer than N extracts (#147572)Philip Reames1-9/+13
For the fixed vector cases, we already support this, but the deinterleave intrinsic cases (primarily used by scalable vectors) didn't. Supporting it requires plumbing through the Factor separately from the extracts, as there can now be fewer extracts than the Factor. Note that the fixed vector path handles this slightly differently - it uses the shuffle and indices scheme to achieve the same thing.
2025-07-09[CodeGen][NPM] Port InitUndef to NPM (#138495)Akshat Oke2-16/+36
2025-07-09DAG: Remove dead declaration of ExpandSinCosLibCall (#147673)Matt Arsenault1-1/+0
2025-07-09RuntimeLibcalls: Remove table of soft float compare cond codes (#146082)Matt Arsenault2-35/+87
Previously we had a table of entries for every Libcall for the comparison to use against an integer 0 if it was a soft float compare function. This was only relevant to a handful of opcodes, so it was wasteful. Now that we can distinguish the abstract libcall for the compare from the concrete implementation, we can just directly hardcode the comparison against the libcall impl without this configuration system.