path: root/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
Age  Commit message  Author  Files  Lines
6 days[SelectionDAG] Remove `UnsafeFPMath` in LegalizeDAG (#146316)paperchalice1-1/+1
These global flags hinder further improvements like [[RFC] Honor pragmas with -ffp-contract=fast](https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast) and pass concurrency support. Remove them incrementally.
12 days[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852)Craig Topper1-28/+22
getNode updates flags correctly for CSE. Calling setFlags after getNode may set the flags where they don't apply. I've added a Flags argument to getSelectCC and the signature of getNode that takes an ArrayRef of EVTs.
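The CSE hazard described above can be illustrated with a toy model. This is a sketch, not LLVM code: `ToyDAG`, `ToyNode`, and the flag bits are hypothetical names invented here, and intersecting flags on a CSE hit is an assumption about what "updates flags correctly" means.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical flag bits; the real SDNodeFlags has many more.
enum : std::uint8_t { NoFlags = 0, NoNaNs = 1, NoInfs = 2 };

struct ToyNode {
  int Opcode;
  int LHS, RHS;
  std::uint8_t Flags;
};

class ToyDAG {
  // CSE map: structurally identical requests share a single ToyNode.
  std::map<std::tuple<int, int, int>, std::unique_ptr<ToyNode>> CSEMap;

public:
  // Flags are passed in, so a CSE hit can keep only the flags that hold
  // for every request (assumed here to be a bitwise intersection).
  ToyNode *getNode(int Opcode, int LHS, int RHS, std::uint8_t Flags) {
    auto Key = std::make_tuple(Opcode, LHS, RHS);
    auto It = CSEMap.find(Key);
    if (It != CSEMap.end()) {
      It->second->Flags &= Flags;
      return It->second.get();
    }
    auto Node = std::make_unique<ToyNode>(ToyNode{Opcode, LHS, RHS, Flags});
    ToyNode *Raw = Node.get();
    CSEMap.emplace(Key, std::move(Node));
    return Raw;
  }
};
```

In this model, mutating the flags on the returned pointer after the call would overwrite the flags of a node that other users may already share, which is the situation the commit message warns about.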
2025-07-13DAG: Use fast variants of fast math libcalls (#147481)Matt Arsenault1-26/+99
Hexagon currently has an untested global flag to control fast math variants of libcalls. Add fast variants as explicit libcall options so this can be a flag based lowering decision, and implement it. I have no idea what fast math flags the hexagon case requires, so I picked the maximally potentially relevant set of flags although this probably is refinable per call. Looking in compiler-rt, I'm not sure if the fast variants are anything more than aliases.
2025-07-09DAG: Remove dead declaration of ExpandSinCosLibCall (#147673)Matt Arsenault1-1/+0
2025-07-08[DAG] Add generic expansion for ISD::FCANONICALIZE nodes (#142105)Dominik Steenken1-0/+26
This PR takes the work previously done by @pawan-nirpal-031 on X86 in #106370, and makes it available in common code. This should enable all targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data types. Canonicalization is implemented here as multiplication by `1.0`, as suggested in [the docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
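As a rough illustration of "canonicalize as multiplication by `1.0`", here is a minimal standalone sketch. The helper name `canonicalizeViaMul` is hypothetical, and the `volatile` is only there to discourage the compiler from folding `X * 1.0` back to `X`. On typical IEEE 754 hardware the multiply is expected to quiet a signaling NaN, but the sketch does not check that.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: canonicalize a value by multiplying it by 1.0.
double canonicalizeViaMul(double X) {
  volatile double One = 1.0; // keeps the multiply from being optimized away
  return X * One;
}
```

Ordinary values, signed zeros, and the NaN-ness of NaN inputs pass through the multiply unchanged.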
2025-06-09[CodeGen] Construct SmallVector with ArrayRef (NFC) (#143391)Kazu Hirata1-3/+1
2025-06-06RuntimeLibcalls: Rename fminimum_num/fmaximum_num enums (#143078)Matt Arsenault1-6/+6
Add the underscore to match the libm spelling
2025-06-04Revert "[SDAG] Fix fmaximum legalization errors (#142170)"Nikita Popov1-7/+93
This reverts commit 58cc1675ec7b4aa5bc2dab56180cb7af1b23ade5. I also made the incorrect assumption that we know both values are +/-0.0 here as well. Revert for now.
2025-06-03[SelectionDAG][AArch64] Legalize power of 2 vector.[de]interleaveN (#141513)Luke Lau1-0/+53
After https://github.com/llvm/llvm-project/pull/139893, we now have [de]interleave intrinsics for factors 2-8 inclusive, with the plan to eventually get the loop vectorizer to emit a single intrinsic for these factors instead of recursively deinterleaving (to support scalable non-power-of-2 factors and to remove the complexity in the interleaved access pass). AArch64 currently supports scalable interleaved groups of factors 2 and 4 from the loop vectorizer. For factor 4 this is currently emitted as a series of recursive [de]interleaves, and normally converted to a target intrinsic in the interleaved access pass. However if for some reason the interleaved access pass doesn't catch it, the [de]interleave4 intrinsic will need to be lowered by the backend. This patch legalizes the node and any other power-of-2 factor to smaller factors, so if a target can lower [de]interleave2 it should be able to handle this without crashing. Factor 3 will probably be more complicated to lower so I've left it out for now. We can disable it in the AArch64 cost model when implementing the loop vectorizer changes.
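The idea of lowering a power-of-2 factor to smaller factors can be sketched on plain arrays. This is a simplified model, not the DAG code: it works on one concatenated `std::vector<int>` rather than on N operand vectors, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

// Split V into its even-indexed and odd-indexed elements.
std::pair<Vec, Vec> deinterleave2(const Vec &V) {
  Vec Even, Odd;
  for (std::size_t I = 0; I < V.size(); ++I)
    (I % 2 == 0 ? Even : Odd).push_back(V[I]);
  return {Even, Odd};
}

// Factor-4 deinterleave expressed as two levels of factor-2 deinterleaves.
std::vector<Vec> deinterleave4(const Vec &V) {
  assert(V.size() % 4 == 0 && "length must be a multiple of the factor");
  std::pair<Vec, Vec> Halves = deinterleave2(V);           // {0,2,4,...}, {1,3,5,...}
  std::pair<Vec, Vec> Evens = deinterleave2(Halves.first); // {0,4,...}, {2,6,...}
  std::pair<Vec, Vec> Odds = deinterleave2(Halves.second); // {1,5,...}, {3,7,...}
  // Field k of the result holds the elements whose index is k modulo 4.
  return {Evens.first, Odds.first, Evens.second, Odds.second};
}
```

Note the reordering in the final step: the second-level results come out as fields 0, 2, 1, 3 and have to be permuted back into field order.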
2025-06-02[SDAG] Fix fmaximum legalization errors (#142170)Nikita Popov1-93/+7
FMAXIMUM is currently legalized via IS_FPCLASS for the signed zero handling. This is problematic, because it assumes the equivalent integer type is legal. Many targets have legal fp128, but illegal i128, so this results in legalization failures. Fix this by replacing IS_FPCLASS with checking the bitcast to integer instead. In that case it is sufficient to use any legal integer type, as we're just interested in the sign bit. This can be obtained via a stack temporary cast. There is existing FloatSignAsInt functionality used for legalization of FABS and similar we can use for this purpose. Fixes https://github.com/llvm/llvm-project/issues/139380. Fixes https://github.com/llvm/llvm-project/issues/139381. Fixes https://github.com/llvm/llvm-project/issues/140445.
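A minimal sketch of "look at the sign bit through an integer bitcast" follows. It does not reproduce the patch (which the log shows was later reverted); `signBitViaInt` and `fmaximumSketch` are hypothetical names, and `float` stands in for whatever type is being legalized.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Read the sign by reinterpreting the float's bits as an integer.
bool signBitViaInt(float X) {
  std::uint32_t Bits;
  static_assert(sizeof(Bits) == sizeof(X), "float must be 32 bits");
  std::memcpy(&Bits, &X, sizeof(Bits));
  return (Bits >> 31) != 0;
}

// Hypothetical fmaximum: NaN-propagating, with +0.0 ordered above -0.0.
float fmaximumSketch(float A, float B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<float>::quiet_NaN();
  if (A > B)
    return A;
  if (B > A)
    return B;
  // A == B: the operands only differ when they are zeros of opposite sign,
  // so pick the one whose sign bit is clear.
  return signBitViaInt(A) ? B : A;
}
```

Only the top bit is inspected, which is why any integer type wide enough to hold the sign is sufficient for this purpose.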
2025-05-16[SelectionDAG] Rename MemSDNode::getOriginalAlign to getBaseAlign. NFC (#139930)Craig Topper1-19/+18
This matches the underlying function in MachineMemOperand and how it is printed when BaseAlign differs from Align.
2025-04-23[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#101786)Sergei Barannikov1-20/+62
This is a reland of #99752 with the bug fixed (see test diff in the third commit in this PR). All `popcount` libcalls return `int`, but `ISD::CTPOP` returns the type of the argument, which can be wider than `int`. The fix is to make DAG legalizer pass the correct return type to `makeLibCall` and sign-extend the result afterwards. Original commit message: The main change is adding CTPOP to `RuntimeLibcalls.def` to allow targets to use LibCall action for CTPOP. DAG legalizers are changed accordingly. Pull Request: https://github.com/llvm/llvm-project/pull/101786
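The return-type mismatch can be sketched in standalone C++. `popcountLibcallSketch` is a hypothetical stand-in for a popcount libcall that returns `int`, and `ctpop64Sketch` models a CTPOP whose result has the (wider) operand type.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a popcount libcall: takes a 64-bit value, returns int.
int popcountLibcallSketch(std::uint64_t X) {
  int Count = 0;
  for (; X != 0; X &= X - 1) // clears the lowest set bit each iteration
    ++Count;
  return Count;
}

// CTPOP-like wrapper: the result has the operand's type, so the int result
// is sign-extended to 64 bits before being returned.
std::uint64_t ctpop64Sketch(std::uint64_t X) {
  int Narrow = popcountLibcallSketch(X);
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(Narrow));
}
```

The important part is that the call is made with the libcall's own return type and the widening happens afterwards, rather than pretending the libcall returns the wider type.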
2025-04-16Fix 'unannotated fall-through between switch labels' warning. (#136000)Jonas Paulsson1-2/+2
2025-04-16[SystemZ] Add support for 16-bit floating point. (#109164)Jonas Paulsson1-0/+19
- _Float16 is now accepted by Clang.
- The half IR type is fully handled by the backend.
- These values are passed in FP registers and converted to/from float around each operation.
- Compiler-rt conversion functions are now built for s390x including the missing extendhfdf2 which was added.
Fixes #50374
2025-04-10Reland "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135056)zhijian lin1-0/+14
A new ISD::POISON SDNode is introduced to represent the poison value in the IR, replacing the previous use of ISD::UNDEF.
2025-04-09Revert "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135060)Jakub Kuderski1-14/+0
Reverts llvm/llvm-project#125883. This PR causes crashes in RISC-V codegen around f16/f64 poison values: https://github.com/llvm/llvm-project/pull/125883#issuecomment-2787048206
2025-04-07[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR. (#125883)zhijian lin1-0/+14
A new ISD::POISON SDNode is introduced to represent the `poison value` in the IR, replacing the previous use of ISD::UNDEF.
2025-03-27Port `NVPTXTargetLowering::LowerCONCAT_VECTORS` to SelectionDAG (#120030)Ethan Kaji1-1/+28
Ports `NVPTXTargetLowering::LowerCONCAT_VECTORS` to `llvm/lib/CodeGen/SelectionDAG` as requested in https://github.com/llvm/llvm-project/issues/116695.
2025-03-25Calculate KnownBits from Metadata correctly for vector loads (#128908)LU-JOHN1-0/+8
Calculate KnownBits correctly from metadata for vector loads. --------- Signed-off-by: John Lu <John.Lu@amd.com>
2025-03-12[NVPTX] Legalize ctpop and ctlz in operation legalization (#130668)Alex MacLean1-1/+2
By pulling the truncates and extensions out of operations during operation legalization we enable more optimization via DAGCombiner. While the test cases show only cosmetic improvements (unlikely to impact the final SASS) in real programs the exposure of these truncates can allow for more optimization.
2025-03-07[SelectionDAG] Clean up some redundant setting of node flags (NFC) (#130307)John Brawn1-5/+2
PR #130124 added a use of FlagInserter to the start of SelectionDAGLegalize::PromoteNode, which makes some of the places where we set flags redundant, so remove them. The places where flags are still set are in non-floating-point operations.
2025-03-07[SelectionDAG] Preserve fast math flags when legalizing/promoting (#130124)John Brawn1-0/+3
When we have a floating-point operation that a target doesn't support for a given type, but does support for a wider type, then there are two ways this can be handled:
* If the target doesn't have any registers at all of this type then LegalizeTypes will convert the operation.
* If we do have registers but no operation for this type, then the operation action will be Promote and it's handled in PromoteNode.
In both cases the operation at the wider type, and the conversion operations to and from that type, should have the same fast math flags as the original operation. This is being done in preparation for a DAGCombine patch which makes use of these fast math flags.
2025-02-28[SelectionDAG][RISCV] Promote VECREDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM} (#128800)Jim Lin1-7/+18
This patch also adds the tests for VP_REDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM}, which have been supported for a while.
2025-02-11[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705)Benjamin Maxwell1-2/+2
This makes the name more consistent with the other helpers.
2025-02-11[IR] Add llvm.sincospi intrinsic (#125873)Benjamin Maxwell1-4/+9
This adds the `llvm.sincospi` intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp).
The `llvm.sincospi` intrinsic takes a floating-point value and returns both the sine and cosine of the value multiplied by pi. It computes the result more accurately than the naive approach of doing the multiplication ahead of time, especially for large input values.
```
declare { float, float } @llvm.sincospi.f32(float %Val)
declare { double, double } @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincospi.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincospi.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float> %Val)
```
Currently, the default lowering of this intrinsic relies on the `sincospi[f|l]` functions being available in the target's runtime (e.g. libc).
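To see why reducing before scaling by pi helps for large inputs, here is a deliberately naive sketch. It is not how a libm implements `sincospi`; `sincospiSketch` and `SinCos` are hypothetical names. It relies only on the fact that `std::fmod` is exact and that sin(pi*x) and cos(pi*x) have period 2 in x.

```cpp
#include <cassert>
#include <cmath>

struct SinCos {
  double Sin, Cos;
};

// Hypothetical sincospi: reduce modulo 2 first (exactly), then scale by pi.
SinCos sincospiSketch(double X) {
  const double Pi = 3.14159265358979323846;
  double R = std::fmod(X, 2.0); // exact remainder, |R| < 2
  return {std::sin(Pi * R), std::cos(Pi * R)};
}
```

For an input such as `1e15 + 0.5`, the reduced argument is exactly `0.5`, whereas computing `X * Pi` first rounds a product of magnitude around 3e15 and loses the fractional phase.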
2025-02-07[IR] Add `llvm.modf` intrinsic (#121948)Benjamin Maxwell1-3/+7
This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp).
The `llvm.modf` intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct).
```
declare { float, float } @llvm.modf.f32(float %Val)
declare { double, double } @llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.modf.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.modf.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float> %Val)
```
This corresponds to the libm `modf` function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.
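The difference between the two calling shapes can be sketched in standalone C++. `ModfResult` and `modfStruct` are hypothetical names, and the field order used here is an illustrative choice, not a statement about the intrinsic's result order.

```cpp
#include <cassert>
#include <cmath>

struct ModfResult {
  double Integral;
  double Fractional;
};

// Wrap the pointer-taking libm modf in a function that returns both parts
// by value, so the call has no memory side effects from the caller's view.
ModfResult modfStruct(double X) {
  double Integral = 0.0;
  double Fractional = std::modf(X, &Integral);
  return {Integral, Fractional};
}
```

A value-returning form like this is what makes element-wise application over a vector straightforward, since there is no output pointer to reason about.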
2025-01-31[LegalizeDAG] Use Base+Offset instead of Offset+Base for jump tablesAlexander Richardson1-5/+4
This is needed for architectures that actually use strict pointer arithmetic instead of integers such as AArch64 with FEAT_CPA (see https://github.com/llvm/llvm-project/pull/105669) or CHERI. Using an index as the first operand of pointer arithmetic may result in an invalid output. While there are quite a few codegen changes here, these only change the order of registers in add instructions. One MIPS combine had to be updated to handle the new node order. Reviewed By: topperc Pull Request: https://github.com/llvm/llvm-project/pull/125279
2025-01-10[SDAG] Set IsPostTypeLegalization flag in LegalizeDAG (#122278)Nikita Popov1-8/+3
This runs after type legalization and as such should set IsPostTypeLegalization when creating libcalls. I don't think this makes any observable difference right now, but I ran into this issue in an upcoming patch.
2024-12-18[SelectionDAG] Rename SDNode::uses() to users(). (#120499)Craig Topper1-2/+2
This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long been annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.
2024-12-03[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587)Craig Topper1-2/+2
I want to use this function for GISel too so Type * is a better common interface. All of the callers already convert EVT to Type * as needed by calling lowering anyway.
2024-12-03[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574)Craig Topper1-1/+1
This is eventually passed to shouldSignExtendTypeInLibCall which calls it IsSigned.
2024-11-25[SelectionDAG] Require last operand of (STRICT_)FP_ROUND to be a TargetConstant. (#117639)Craig Topper1-5/+9
Fix all the places I could find that didn't do this. We were already mostly correct for FP_ROUND after 9a976f36615dbe15e76c12b22f711b2e597a8e51, but not STRICT_FP_ROUND.
2024-11-25[SelectionDAG][RISCV][AArch64] Allow f16 STRICT_FLDEXP to be promoted. Fix integer promotion of STRICT_FLDEXP in type legalizer. (#117633)Craig Topper1-0/+13
A special case in type legalization wasn't accounting for different operand numbering between FLDEXP and STRICT_FLDEXP. AArch64 already asked STRICT_FLDEXP to be promoted, but had no test for it.
2024-11-25[AMDGPU] Use getSignedConstant() where necessary (#117328)Nikita Popov1-6/+6
Create signed constant using getSignedConstant(), to avoid future assertion failures when we disable implicit truncation in getConstant(). This also touches some generic legalization code, which apparently only AMDGPU tests.
2024-11-06[SDAG] Merge multiple-result libcall expansion into DAG.expandMultipleResultFPLibCall() (#114792)Benjamin Maxwell1-46/+11
This merges the logic for expanding both FFREXP and FSINCOS into one method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication and also allows FFREXP to benefit from the stack slot elimination implemented for FSINCOS. This method will also be used in future to implement more multiple-result intrinsics (such as modf and sincospi).
2024-10-31[SDAG] Support expanding `FSINCOS` to vector library calls (#114039)Benjamin Maxwell1-70/+1
This shares most of its code with the scalar sincos expansion. It allows expanding vector FSINCOS nodes to a library call from the specified `-vector-library`. The upside of this is it will mean the vectorizer only needs to handle the sincos intrinsic, which has no memory effects, and this can handle lowering the intrinsic to a call that takes output pointers.
2024-10-30[LegalizeDAG] Use getSignedConstant. NFCCraig Topper1-1/+1
2024-10-31[SDAG] Simplify `SDNodeFlags` with bitwise logic (#114061)Yingwei Zheng1-5/+2
This patch allows using enumeration values directly and simplifies the implementation with bitwise logic. It addresses the comment in https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
2024-10-29DAG: Fix legalization of vector addrspacecasts (#113964)Matt Arsenault1-0/+3
2024-10-29[IR] Add `llvm.sincos` intrinsic (#109825)Benjamin Maxwell1-0/+21
This adds the `llvm.sincos` intrinsic, legalization, and lowering.
The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct).
```
declare { float, float } @llvm.sincos.f32(float %Val)
declare { double, double } @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
```
The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.
2024-10-16[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)Tex Riddell1-0/+7
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Based on example PR #96222 and fix PR #101268, with some differences due to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).
- Add llvm.experimental.constrained.atan2 - Intrinsics.td, ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp, and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp - Expand for vectors, promote f16
- X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC
Part 4 for Implement the atan2 HLSL Function #70096.
2024-10-08Fix comment typo in ExpandFCOPYSIGN (#111489)Ralf Jung1-1/+2
I noticed this while following https://github.com/llvm/llvm-project/pull/111269. It makes little sense that FCOPYSIGN would look at the sign of `x`, right? Surely this must be `y`. Also fix the inconsistency where it's sometimes `x` and sometimes `X`.
2024-10-04[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000)Luke Lau1-3/+0
In https://reviews.llvm.org/D153848, promotion was added for a variety of f16 ops with zvfhmin, including VP reductions. However I don't believe it's correct to promote f16 fadd or fmul reductions to f32 since we need to round the intermediate results.
Today if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two different results depending on whether we compiled with +zvfh or +zvfhmin, for example with a 3 element reduction:
    ; v9 = [0.1563, 5.97e-8, 0.00006104]
    ; zvfh
    vsetivli x0, 3, e16, m1, ta, ma
    vmv.v.i v8, 0
    vfredosum.vs v8, v9, v8
    vfmv.f.s fa0, v8
    ; fa0 = 0.1563
    ; zvfhmin
    vsetivli x0, 3, e16, m1, ta, ma
    vfwcvt.f.f.v v10, v9
    vsetivli x0, 3, e32, m1, ta, ma
    vmv.v.i v8, 0
    vfredosum.vs v8, v10, v8
    vfmv.f.s fa0, v8
    fcvt.h.s fa0, fa0
    ; fa0 = 0.1564
This same thing happens with reassociative reductions e.g. vfredusum.vs, and this also applies for bf16. I couldn't find anything in the LangRef for reductions that suggest the excess precision is allowed. There may be something we can do in Clang with -fexcess-precision=fast, but I haven't looked into this yet. I presume the same precision issue occurs with fmul, but not with fmin/fmax/fminimum/fmaximum.
I can't think of another way of lowering these other than scalarizing, and we can't scalarize scalable vectors, so this just removes the promotion and adjusts the cost model to return an invalid cost. (It looks like we also don't currently cost fmul reductions, so presumably they also have an invalid cost?) I think this should be enough to stop the loop vectorizer or SLP from emitting these intrinsics.
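The rounding difference can be reproduced in portable C++ by moving one step up in precision: `float` plays the role of f16 and `double` the role of the promoted f32. This is an analogy under the assumption that `float` arithmetic is evaluated in `float` (`FLT_EVAL_METHOD == 0`); the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Ordered reduction in the narrow type: every partial sum is rounded.
float reduceNarrow(const std::vector<float> &V) {
  volatile float Acc = 0.0f; // volatile forces each partial sum into a float
  for (float X : V)
    Acc = Acc + X;
  return Acc;
}

// Ordered reduction in the wider type, rounded to the narrow type once.
float reducePromoted(const std::vector<float> &V) {
  double Acc = 0.0;
  for (float X : V)
    Acc += X;
  return static_cast<float>(Acc);
}
```

With the input `{1.0f, 2^-24, 2^-24}`, each narrow add hits a tie that rounds back to `1.0f`, while the promoted sum is exactly `1 + 2^-23` and survives the final rounding, so the two reductions disagree by one ulp.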
2024-09-25[SDAG] Honor signed arguments in floating point libcalls (#109134)Timothy Pearson1-1/+2
In ExpandFPLibCall, an assumption is made that all floating point libcalls that take integer arguments use unsigned integers. In the case of ldexp and frexp, this assumption is incorrect, leading to miscompilation and subsequent target-dependent incorrect operation. Indicate that ldexp and frexp utilize signed arguments in ExpandFPLibCall. Fixes #108904 Signed-off-by: Timothy Pearson <tpearson@solidsilicon.com>
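Why the signedness of the integer argument matters can be shown with a short sketch of widening a 32-bit exponent to 64 bits. The function names are hypothetical; the point is only the difference between zero- and sign-extension for negative values.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit exponent as if it were unsigned (zero-extension).
std::int64_t widenZeroExtended(std::int32_t Exp) {
  return static_cast<std::int64_t>(static_cast<std::uint32_t>(Exp));
}

// Widen a 32-bit exponent as signed (sign-extension).
std::int64_t widenSignExtended(std::int32_t Exp) {
  return static_cast<std::int64_t>(Exp);
}
```

An exponent of `-1`, as in `ldexp(x, -1)`, becomes 4294967295 under zero-extension, which is the kind of value a callee would then misinterpret.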
2024-09-24[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)Benjamin Maxwell1-55/+51
When lowering `FSINCOS` to a library call (that takes output pointers) we can avoid creating new stack allocations if the results of the `FSINCOS` are being stored. Instead, we can take the destination pointers from the stores and pass those to the library call. --- Note: As a NFC this also adds (and uses) `RTLIB::getFSINCOS()`.
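The stack-slot saving can be pictured at the source level. In this sketch `sincosOut` is a hypothetical stand-in for a sincos-style library function that writes through output pointers; the two callers model the "temporaries then store" and "pass the store destinations directly" shapes.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for a sincos-style libcall that writes through output pointers.
void sincosOut(double X, double *SinOut, double *CosOut) {
  *SinOut = std::sin(X);
  *CosOut = std::cos(X);
}

// Results go to temporaries (the extra "stack slots") and are then stored.
void storeViaTemporaries(double X, double *DstSin, double *DstCos) {
  double TmpSin, TmpCos;
  sincosOut(X, &TmpSin, &TmpCos);
  *DstSin = TmpSin;
  *DstCos = TmpCos;
}

// The store destinations are handed to the call, so no temporaries are needed.
void storeDirect(double X, double *DstSin, double *DstCos) {
  assert(DstSin != DstCos && "destinations must not alias");
  sincosOut(X, DstSin, DstCos);
}
```

Both callers leave the same values in memory; the second simply avoids the intermediate copies.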
2024-08-30[AArch64][SelectionDAG] Vector splitting and promotion for histogram intrinsic (#103037)Max Beck-Jones1-0/+5
Adds support for wider-than-legal vector types for the histogram intrinsic (llvm.experimental.vector.histogram.add) by splitting the vector. Also adds integer promotion for the Inc operand.
2024-08-17[SelectionDAG] Use getAllOnesConstant.Craig Topper1-2/+3
2024-08-16[SelectionDAG][X86] Add SelectionDAG::getSignedConstant and use it in a few places. (#104555)Craig Topper1-1/+1
PR #80309 proposes to have users of APInt's uint64_t constructor opt-in to implicit truncation. Currently, that patch requires SelectionDAG::getConstant to opt-in. This patch adds getSignedConstant so we can start fixing some of the cases that require implicit truncation.
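The distinction between a value that fits as unsigned and one that fits as signed can be sketched with two small range checks. These helpers are hypothetical and are not the APInt or SelectionDAG API; they only illustrate why a negative constant needs a signed entry point once implicit truncation is disallowed.

```cpp
#include <cassert>
#include <cstdint>

// True if V is representable as an unsigned BitWidth-bit value.
bool fitsUnsigned(std::uint64_t V, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64);
  return BitWidth == 64 || (V >> BitWidth) == 0;
}

// True if V is representable as a signed BitWidth-bit value.
bool fitsSigned(std::int64_t V, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64);
  if (BitWidth == 64)
    return true;
  const std::int64_t Min = -(std::int64_t{1} << (BitWidth - 1));
  const std::int64_t Max = (std::int64_t{1} << (BitWidth - 1)) - 1;
  return V >= Min && V <= Max;
}
```

For example, `-1` passed through a `uint64_t` parameter is `0xFFFFFFFFFFFFFFFF`, which does not fit in 32 unsigned bits without truncation, yet `-1` is a perfectly valid signed 32-bit constant.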
2024-08-15Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)YunQiang Su1-0/+17
C23 introduced the new functions fminimum_num and fmaximum_num, which follow minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch only introduces support for scalar values. Support for vectors (vp, vp.reduce, vector.reduce) and experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch work out of the box with fcanonical and fmax/fmin. AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but they have no fcanonical support yet; I will add it in future patches. The FMIN/FMAX instructions of RISC-V follow minimumNumber/maximumNumber of IEEE754-2019, so support can simply be added in a future patch.
Background: https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which behave differently on different platforms for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.
The fix to make fminnum/fmaxnum follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.
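The number-preferring behaviour can be sketched in a few lines of standalone C++. `minimumNumSketch` is a hypothetical name; the sketch covers the "return the number when one operand is NaN" and "-0.0 is less than +0.0" rules, and does not model quieting of signaling NaNs or exception flags.

```cpp
#include <cassert>
#include <cmath>

// Sketch of minimumNumber-style semantics for doubles.
double minimumNumSketch(double A, double B) {
  if (std::isnan(A))
    return B; // if both are NaN, the NaN in B is returned
  if (std::isnan(B))
    return A;
  if (A < B)
    return A;
  if (B < A)
    return B;
  // Equal values: prefer -0.0 over +0.0.
  return std::signbit(A) ? A : B;
}
```

The NUM-vs-NaN case is where this differs from a NaN-propagating minimum: a single NaN operand is ignored rather than returned.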
2024-08-13[SelectionDAG] Replace EVTToAPFloatSemantics with MVT/EVT::getFltSemantics. (#103001)Craig Topper1-3/+3