path: root/llvm/lib/CodeGen/TargetLoweringBase.cpp
Age, Commit message, Author, Files, Lines (-removed/+added)
2025-01-20[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810)Graham Hunter1-0/+3
Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder. The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
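As a rough illustration of that split (a scalar C++ model, not LLVM code), finding the last active lane and extracting the element with a passthru fallback can be kept as two separate steps. The names `findLastActive` and `extractLastActive` are hypothetical, and returning `std::nullopt` for an all-false mask is a choice made for this sketch, not a statement about what the ISD node produces.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of "find the index of the last active mask element".
std::optional<std::size_t> findLastActive(const std::vector<bool> &Mask) {
  for (std::size_t I = Mask.size(); I > 0; --I)
    if (Mask[I - 1])
      return I - 1;
  return std::nullopt; // no active lane (modelling choice of this sketch)
}

// Extraction and passthru handling stay outside the index search.
int extractLastActive(const std::vector<int> &Data,
                      const std::vector<bool> &Mask, int PassThru) {
  assert(Data.size() == Mask.size() && "one mask bit per data lane");
  std::optional<std::size_t> Idx = findLastActive(Mask);
  return Idx ? Data[*Idx] : PassThru;
}
```

Keeping the index search separate means the extraction step can reuse whatever element-extract and select logic already exists, which is the design the commit message describes.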
2024-12-13[GISel] Remove unused DataLayout operand from getApproximateEVTForLLT (#119833)Craig Topper1-1/+1
2024-12-09[TargetLowering] Return Align from getByValTypeAlignment (NFC) (#119233)Sergei Barannikov1-6/+3
2024-11-12[X86][BF16] Add libcall for FP128 -> BF16 (#115825)Feng Zou1-0/+2
This is to fix #115710.
2024-11-04SafeStack: Respect alloca addrspace (#112536)Matt Arsenault1-1/+4
Just insert addrspacecast in cases where the alloca uses a different address space, since I don't know what else you could possibly do.
2024-10-29[IR] Add `llvm.sincos` intrinsic (#109825)Benjamin Maxwell1-2/+3
This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct).

```
declare { float, float } @llvm.sincos.f32(float %Val)
declare { double, double } @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.
2024-10-28Check hasOptSize() in shouldOptimizeForSize() (#112626)Ellis Hoag1-1/+0
2024-10-16[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760)Tex Riddell1-3/+4
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Based on example PR #96222 and fix PR #101268, with some differences due to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp).
- Add llvm.experimental.constrained.atan2
- Intrinsics.td, ConstrainedOps.def, LangRef.rst
- Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp
- Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp, and LegalizeVectorTypes.cpp
- Update isKnownNeverNaN in SelectionDAG.cpp
- Update SelectionDAGDumper.cpp
- Update libcalls
- RuntimeLibcalls.def, RuntimeLibcalls.cpp
- TargetLoweringBase.cpp
- Expand for vectors, promote f16
- X86ISelLowering.cpp
- Expand f80, promote f32 to f64 for MSVC

Part 4 of "Implement the atan2 HLSL Function" (#70096).
2024-09-24[SDAG] Avoid creating redundant stack slots when lowering FSINCOS (#108401)Benjamin Maxwell1-0/+5
When lowering `FSINCOS` to a library call (that takes output pointers), we can avoid creating new stack allocations if the results of the `FSINCOS` are being stored. Instead, we can take the destination pointers from the stores and pass those to the library call. Note: as an NFC change, this also adds (and uses) `RTLIB::getFSINCOS()`.
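The effect can be pictured with a small C++ analogy. This is only a sketch of the idea, not the SelectionDAG code. `sincosWithOutPtrs` is a hypothetical stand-in for a sincos-style library call with output pointers, and `SinCosSlots` stands in for the memory the results are stored to.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a library call that returns through pointers.
void sincosWithOutPtrs(double X, double *SinOut, double *CosOut) {
  assert(SinOut && CosOut && "both output pointers are required");
  *SinOut = std::sin(X);
  *CosOut = std::cos(X);
}

// Stand-in for the locations that the FSINCOS results are stored to.
struct SinCosSlots {
  double Sin;
  double Cos;
};

// Before: the call writes to temporaries (extra stack slots), then the
// values are copied to their final destinations.
void storeViaTemporaries(double X, SinCosSlots &Dst) {
  double TmpSin = 0.0, TmpCos = 0.0;
  sincosWithOutPtrs(X, &TmpSin, &TmpCos);
  Dst.Sin = TmpSin;
  Dst.Cos = TmpCos;
}

// After: the store destinations are passed straight to the call.
void storeDirectly(double X, SinCosSlots &Dst) {
  sincosWithOutPtrs(X, &Dst.Sin, &Dst.Cos);
}
```

Both variants leave the same values in `Dst`; the second simply has no intermediate slots to allocate.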
2024-09-19Reland "[X86][BF16] Add libcall for F80 -> BF16 (#109116)" (#109143)Phoebe Wang1-0/+2
This reverts commit ababfee78714313a0cad87591b819f0944b90d09. Add X86 FP80 check.
2024-09-18Revert "[X86][BF16] Add libcall for F80 -> BF16" (#109140)Phoebe Wang1-2/+0
Reverts llvm/llvm-project#109116
2024-09-18[X86][BF16] Add libcall for F80 -> BF16 (#109116)Phoebe Wang1-0/+2
This fixes #108936, but the calling convention doesn't match with GCC. I doubt we have such a lib function for now, so leave the calling convention as is.
2024-08-31Revert "[RISCV] RISCV vector calling convention (2/2)" (#97994)Brandon Wu1-10/+2
This reverts commit 91dd844aa499d69c7ff75bf3156e2e3593a88057. Stacked on https://github.com/llvm/llvm-project/pull/97993
2024-08-21Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054)Sumanth Gundapaneni1-2/+3
The verifier is updated in a separate patch to allow vector types for the llvm.lround and llvm.llround intrinsics.
2024-08-15Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649)YunQiang Su1-0/+1
C23 introduced the new functions fminimum_num and fmaximum_num, which follow minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. Support for vectors (vp, vp.reduce, vector.reduce) and experimental.constrained will be added in future patches.

With this patch, MIPSr6 and LoongArch work out of the box with fcanonical and fmax/fmin. AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but they have no fcanonical support yet; I will add it in future patches. The FMIN/FMAX instructions of RISC-V follow minimumNumber/maximumNumber of IEEE754-2019; we can add that in a future patch.

Background: https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735

Currently we have fminnum/fmaxnum, which behave differently on different platforms for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.

The fix for fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.
2024-08-14[DAG] Support saturated truncate (#99418)hanbeom1-0/+5
A truncate is considered saturated if no additional conversion is required between the target and return values. If the target is saturated when attempting to truncate from a vector, there is an opportunity to optimize it. Previously, each architecture had its own attempt at optimization, leading to redundant code. This patch implements common logic by introducing three new ISDs:
- `ISD::TRUNCATE_SSAT_S`: When the operand is a signed value and the range of values matches the range of signed values of the destination type.
- `ISD::TRUNCATE_SSAT_U`: When the operand is a signed value and the range of values matches the range of unsigned values of the destination type.
- `ISD::TRUNCATE_USAT_U`: When the operand is an unsigned value and the range of values matches the range of unsigned values of the destination type.

These ISDs indicate a saturated truncate. Fixes https://github.com/llvm/llvm-project/issues/85903
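The three cases can be modelled on scalars as "clamp to the destination range, then truncate". The sketch below does this for a 32-bit source and an 8-bit destination. The function names are hypothetical, the widths are chosen only for illustration, and the clamping interpretation is my reading of the descriptions above, not code taken from the patch.

```cpp
#include <cstdint>

// Signed input, result saturated to the signed range of the destination.
constexpr int8_t truncSSatS(int32_t V) {
  return static_cast<int8_t>(
      V < INT8_MIN ? INT8_MIN : (V > INT8_MAX ? INT8_MAX : V));
}

// Signed input, result saturated to the unsigned range of the destination.
constexpr uint8_t truncSSatU(int32_t V) {
  return static_cast<uint8_t>(V < 0 ? 0 : (V > UINT8_MAX ? UINT8_MAX : V));
}

// Unsigned input, result saturated to the unsigned range of the destination.
constexpr uint8_t truncUSatU(uint32_t V) {
  return static_cast<uint8_t>(V > UINT8_MAX ? UINT8_MAX : V);
}
```

For example, under this model `truncSSatS(300)` clamps to the largest `int8_t` value and `truncSSatU(-5)` clamps to zero, while in-range inputs pass through unchanged.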
2024-07-24[AMDGPU] Implement llvm.lrint intrinsic lowering (#98931)Sumanth Gundapaneni1-9/+9
This patch enables the target-independent lowering of llvm.lrint via GlobalISel. For SelectionDAG, the intrinsic is custom lowered for AMDGPU.
2024-07-23[AMDGPU] Implement llvm.lround intrinsic lowering. (#98970)Sumanth Gundapaneni1-7/+10
This patch enables the target-independent lowering of llvm.lround via GlobalISel. For SelectionDAG, the intrinsic is custom lowered for AMDGPU. To support vector floating-point input for llvm.lround, this patch extends the target-independent APIs and provides support for scalarizing. pr98950 is needed to let the verifier allow vector floating-point types.
2024-07-20Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"Joseph Huber1-381/+35
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5. I moved the `ISD` dependencies into the CodeGen portion of the handling, it's a little awkward but it's the easiest solution I can think of for now.
2024-07-20ReformatNAKAMURA Takumi1-3/+3
2024-07-20Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"NAKAMURA Takumi1-3/+384
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69. (llvmorg-19-init-17714-gc05126bdfc3b) See #99610
2024-07-17[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)Lawrence Benson1-0/+3
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress` to "compress" data within a vector based on a selection mask, i.e., it moves all selected values (those where `mask[i] == 1`) to consecutive lanes in the result vector. A `passthru` vector can be provided, from which remaining lanes are filled. The main reason for this is that the existing `@llvm.masked.compressstore` has very strong constraints in that it can only write values that were selected, resulting in guard branches for all targets except AVX-512 (and even there the AMD implementation is _very_ slow). More instruction sets support "compress" logic, but only within registers. So to store the values, an additional store is needed. But this combination is likely significantly faster on many targets as it avoids branches. In follow-up PRs, my plan is to add target-specific lowerings for x86, SVE, and possibly RISCV. I also want to combine this with a store instruction, as this is probably a common case and we can avoid some memory writes in that case. See [discussion in forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663) for initial discussion on the design.
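A scalar C++ model of the described semantics might look like the following. It is a sketch under assumptions: `compress` is a hypothetical name, and filling each remaining lane from the same lane position of `PassThru` is my reading of "remaining lanes are filled", not something this commit message spells out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Moves selected values to consecutive low lanes; lanes that receive no
// selected value keep the corresponding lane of PassThru (assumption).
std::vector<int> compress(const std::vector<int> &Value,
                          const std::vector<bool> &Mask,
                          const std::vector<int> &PassThru) {
  assert(Value.size() == Mask.size() && Value.size() == PassThru.size() &&
         "all operands must have the same lane count");
  std::vector<int> Result = PassThru;
  std::size_t Out = 0;
  for (std::size_t I = 0; I < Value.size(); ++I)
    if (Mask[I])
      Result[Out++] = Value[I];
  return Result;
}
```

Note that the loop body has no store guarded per element in memory; the whole result stays "in registers" and can be written with one ordinary store afterwards, which is the branch-avoiding combination the message argues for.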
2024-07-16[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)Joseph Huber1-384/+3
Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevents internalization of needed runtime calls. However, these currently take all RTLibcalls into account, even if the target does not support them. The target opts out of a libcall if it sets its name to nullptr. This patch pulls this logic out into a class in the header so that LTO / lld can use it to determine if a symbol actually needs to be kept. This is important for targets like AMDGPU that want to be able to use `lld` to perform the final link step but do not want the overhead of uncalled functions. (That overhead adds roughly a second to the link time.)
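The opt-out convention can be sketched in a few lines of C++. Everything here is hypothetical and heavily simplified: `Libcall`, `LibcallTable`, and `mustPreserve` are illustrative names, the three entries are placeholders, and this is not the class that the patch introduces.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical, tiny set of runtime library calls.
enum Libcall : std::size_t { SIN_F32, COS_F32, EXP10_F32, NumLibcalls };

struct LibcallTable {
  // One name per libcall; nullptr means "this target never emits it".
  std::array<const char *, NumLibcalls> Names{{"sinf", "cosf", "exp10f"}};

  void disable(Libcall LC) { Names[LC] = nullptr; }
};

// What a linker-side client could ask: must this symbol be kept because the
// backend may still emit a call to it?
bool mustPreserve(const LibcallTable &Table, std::string_view Symbol) {
  for (const char *Name : Table.Names)
    if (Name && Symbol == Name)
      return true;
  return false;
}
```

Once a target disables an entry, a client that consults the table no longer treats that symbol as needed, so it can be internalized or left out of the link.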
2024-07-12[NVPTX] Disable all RTLib libcalls (#98672)Joseph Huber1-0/+7
Summary: This patch explicitly prevents runtime calls from being emitted by the NVPTX backend. This allows other utilities to know that we do not need to worry about emitting these.
2024-07-11[Darwin] Fix availability of exp10 for watchOS, tvOS, xROS. (#98542)Florian Hahn1-9/+8
Update availability information added in 1eb7f055d9a. exp10 is available on iOS >= 7.0 and macOS >= 10.9. On all other platforms, it is available on any version. Also drop the x86 check, as the availability only depends on the OS version, not the target platform. PR: https://github.com/llvm/llvm-project/pull/98542
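The availability rule stated above is small enough to write down directly. The sketch below encodes only what the message says; `DarwinOS`, `Version`, and `hasExp10` are hypothetical names and do not correspond to the actual LLVM code.

```cpp
// Hypothetical encoding of the Darwin platforms named in the commit title.
enum class DarwinOS { MacOS, IOS, WatchOS, TvOS, XROS };

struct Version {
  unsigned Major;
  unsigned Minor;
};

// True if V is at least Major.Minor.
constexpr bool atLeast(Version V, unsigned Major, unsigned Minor) {
  return V.Major > Major || (V.Major == Major && V.Minor >= Minor);
}

// exp10 availability depends only on the OS and its version.
constexpr bool hasExp10(DarwinOS OS, Version V) {
  switch (OS) {
  case DarwinOS::MacOS:
    return atLeast(V, 10, 9);
  case DarwinOS::IOS:
    return atLeast(V, 7, 0);
  default:
    return true; // watchOS, tvOS, xrOS: available on any version
  }
}
```

There is deliberately no architecture parameter, matching the note that the x86 check was dropped.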
2024-07-11[X86][CodeGen] Add base trig intrinsic lowerings (#96222)Farzon Lotfi1-7/+17
This change is an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This change adds constrained intrinsics and some lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. The only x86-specific change was for f80. https://github.com/llvm/llvm-project/issues/70079 https://github.com/llvm/llvm-project/issues/70080 https://github.com/llvm/llvm-project/issues/70081 https://github.com/llvm/llvm-project/issues/70083 https://github.com/llvm/llvm-project/issues/70084 https://github.com/llvm/llvm-project/issues/95966 The x86 lowering is going to be done in three PRs, with this being the first. A second PR will be put up for Loop Vectorizing and then SLPVectorizer. The constrained intrinsics work is also going to be in multiple parts, but just two. This part covers just the LLVM-specific changes; part 2 will cover clang-specific changes and legalization for backends that have special legalization requirements, like aarch64 and wasm.
2024-07-11[LLVM] Factor disabled Libcalls into the initializer (#98421)Joseph Huber1-0/+138
Summary: These Libcalls represent which functions are available to the backend. If a runtime call is not available, the target sets the name to `nullptr`. Currently, this logic is spread around the various targets. This patch pulls all of the locations that disable libcalls into the initializer. This patch is effectively NFC. The motivation behind this patch is that currently the LTO handling uses the list of all runtime calls to determine which functions cannot be internalized and must be extracted from static libraries. We do not want this to happen for libcalls that are not emitted by the backend. A follow-up patch will move out this logic so the LTO pass can know which rtlib calls are actually used by the backend.
2024-07-04[SelectionDAG] Remove LegalTypes argument from getShiftAmountTy. NFC (#97757)Craig Topper1-2/+2
This argument is no longer used inside the function. Remove it from the interface.
2024-07-04[SelectionDAG] Ignore LegalTypes parameter in TargetLoweringBase::getShiftAmountTy. (#97645)Craig Topper1-2/+1
When this flag was false, `getShiftAmountTy` would return `PointerTy` instead of the target's preferred shift amount type for scalar shifts. This used to be needed when the target's preferred type wasn't large enough to support the shift amount needed for an illegal type. For example, any scalar type larger than i256 on X86 since X86's preferred shift amount type is i8. For a while now, we've had code that uses `MVT::i32` if `LegalTypes` is true, but the target's preferred type is too small. This fixed a repeated cause of crashes where the `LegalTypes` flag wasn't set to false when illegal types could be present. This has made it unnecessary to set the `LegalTypes` flag correctly, and as a result more and more places don't. So I think it's time for this flag to go away. This first patch just disconnects the flag. The interface and all callers will be cleaned up in follow-up patches. The X86 test change is because we now have the same shift type for both shifts in a (srl (sub C, (shl X, 32)), 32) sequence. This makes the shift amounts appear equal in value and type, which is needed to enable a combine.
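The "preferred type is too small" fallback can be illustrated with plain integers. This is a hedged sketch, not the LLVM implementation: `log2Ceil` and `chooseShiftAmountBits` are hypothetical helpers, and the threshold (a shift amount for an N-bit value needs ceil(log2(N)) bits) is an assumption chosen to be consistent with the i8 / i256 example above.

```cpp
// Smallest K such that 2^K >= N.
constexpr unsigned log2Ceil(unsigned N) {
  unsigned K = 0;
  while ((1ull << K) < N)
    ++K;
  return K;
}

// Keep the target's preferred shift amount width when it can represent every
// valid shift amount for a ValueBits-wide value; otherwise fall back to 32.
constexpr unsigned chooseShiftAmountBits(unsigned PreferredBits,
                                         unsigned ValueBits) {
  return PreferredBits < log2Ceil(ValueBits) ? 32u : PreferredBits;
}
```

With an 8-bit preferred type, a 256-bit value still fits (shift amounts 0 to 255), while a 257-bit value triggers the 32-bit fallback.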
2024-06-21Revert "Intrinsic: introduce minimumnum and maximumnum (#93841)"Nikita Popov1-1/+0
As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse. This reverts commit 89881480030f48f83af668175b70a9798edca2fb. This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
2024-06-21Intrinsic: introduce minimumnum and maximumnum (#93841)YunQiang Su1-0/+1
Currently, the behavior of llvm.minnum differs across platforms when one operand is sNaN. When comparing sNaN vs NUM:
- ARM/AArch64/PowerPC: follow IEEE754-2008's minNUM: return qNaN.
- RISC-V/Hexagon: follow IEEE754-2019's minimumNumber: return NUM.
- X86: returns NUM, but not the same as IEEE754-2019's minimumNumber, as +0.0 is not always greater than -0.0.
- MIPS/LoongArch/Generic: return NUM.
- LIBCALL: returns qNaN.

So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber. Half-fix: #93033
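For orientation, the value semantics of IEEE754-2019 minimumNumber can be approximated in portable C++ as below. This is a sketch with stated limits: `minimumNumber` is a hypothetical helper, it does not distinguish signaling from quiet NaNs, and it ignores floating-point exception flags, so it models only which value is returned.

```cpp
#include <cmath>

// Returns the smaller operand, preferring a number over a NaN and treating
// -0.0 as smaller than +0.0. If both operands are NaN, a NaN is returned.
double minimumNumber(double A, double B) {
  if (std::isnan(A))
    return B;
  if (std::isnan(B))
    return A;
  if (A == B) // covers the +0.0 / -0.0 case, which compare equal
    return std::signbit(A) ? A : B;
  return A < B ? A : B;
}
```

The explicit sign check is the part that a plain `A < B ? A : B` gets wrong, which is the X86 discrepancy noted in the list above.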
2024-06-17[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (#91871)Poseydon421-0/+3
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison intrinsics in the SelectionDAG. Some of the expansions/lowerings are not optimal yet.
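As a reminder of what a 3-way comparison computes, here is a scalar C++ sketch for 32-bit operands. The helper names mirror the intrinsic names but are otherwise hypothetical; the result is -1, 0, or 1 depending on whether the first operand is less than, equal to, or greater than the second.

```cpp
#include <cstdint>

// Signed 3-way comparison.
constexpr int scmp(int32_t A, int32_t B) { return (A > B) - (A < B); }

// Unsigned 3-way comparison.
constexpr int ucmp(uint32_t A, uint32_t B) { return (A > B) - (A < B); }
```

The two differ only in how the bit patterns are interpreted: `scmp(-1, 1)` is negative, whereas `ucmp(0xFFFFFFFFu, 1u)` is positive even though the first operand has the same bits.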
2024-06-14[CodeGen] Support vectors across all backends (#95518)Farzon Lotfi1-1/+2
Add a default f16 type promotion
2024-06-05[x86] Add tan intrinsic part 4 (#90503)Farzon Lotfi1-1/+3
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Much of this change was following how G_FSIN and G_FCOS were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations also associate `G_FTAN` with the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - Define tan as an intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for `Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for tan intrinsic

Resolves https://github.com/llvm/llvm-project/issues/70082
2024-05-30[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)Roger Ferrer Ibáñez1-0/+4
The current way of lowering `llvm.clear_cache` is a bit unusual. As suggested by Matt Arsenault we are better off using an ISD node. This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall by default named `__clear_cache` and the default legalisation is a libcall. This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE` needed by RISC-V on some platforms.
2024-05-29[ValueTypes] Remove MVT::MAX_ALLOWED_VALUETYPE. NFC (#93654)Craig Topper1-3/+0
Despite the comment, this isn't used to size bit vectors or tables. That's done by VALUETYPE_SIZE. MAX_ALLOWED_VALUETYPE is only used by some static_asserts that compare it to VALUETYPE_SIZE. This patch removes it and most of the static_asserts. I left one where I compared VALUETYPE_SIZE to token which is the first type that isn't part of the VALUETYPE range. This isn't strictly needed, we'd probably catch duplication error from VTEmitter.cpp first.
2024-05-20CodeGen: Fix libcall names for exp10 on the various darwins (#92520)Matt Arsenault1-0/+28
It's really great that we have the same information duplicated in TargetLibraryInfo and RuntimeLibcalls which both assume everything by default. Should fix issue reported after #92287
2024-05-09[Analysis] Add cost model for experimental.cttz.elts intrinsic (#90720)David Sherwood1-0/+18
In PR #88385 I've added support for auto-vectorisation of some early exit loops, which requires using the experimental.cttz.elts to calculate final indices in the early exit block. We need a more accurate cost model for this intrinsic to better reflect the cost of work required in the early exit block. I've tried to accurately represent the expansion code for the intrinsic when the target does not have efficient lowering for it. It's quite tricky to model because you need to first figure out what types will actually be used in the expansion. The type used can have a significant effect on the cost if you end up using illegal vector types. Tests added here: Analysis/CostModel/AArch64/cttz_elts.ll Analysis/CostModel/RISCV/cttz_elts.ll
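To make the intrinsic's role concrete, here is a scalar C++ model of "count trailing zero elements", which is what yields the index of the first lane that triggered the early exit. The name `cttzElts` is hypothetical, and returning the lane count when no lane is set is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Number of zero lanes before the first set lane, scanning from lane 0.
// Returns the lane count if no lane is set (assumption of this model).
std::size_t cttzElts(const std::vector<bool> &Lanes) {
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    if (Lanes[I])
      return I;
  return Lanes.size();
}
```

A target without an efficient lowering has to expand this search over whole vectors, which is why the types chosen for the expansion matter so much for the cost estimate.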
2024-05-07[Analysis, CodeGen, DebugInfo] Use StringRef::operator== instead of StringRef::equals (NFC) (#91304)Kazu Hirata1-2/+2
I'm planning to remove StringRef::equals in favor of StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of 53 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".
2024-04-16Recommit [RISCV] RISCV vector calling convention (2/2) (#79096) (#87736)Brandon Wu1-2/+10
Bug fix: Handle RVV return type in calling convention correctly. Return values are handled in the same way as function arguments. One thing to mention is that if a type can be broken down into homogeneous vector types, e.g. {<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4 x i32>}}, it is considered a vector tuple type and needs to be handled by the tuple type rule.
2024-03-28[ISel] Move handling of atomic loads from SystemZ to DAGCombiner (NFC). (#86484)Jonas Paulsson1-0/+6
The folding of sign/zero extensions into an atomic load by specifying an extension type is not target specific, and therefore belongs in the DAGCombiner rather than in the SystemZ backend. - Handle atomic loads similarly to regular loads by adding AtomicLoadExtActions with set/get methods. - Move SystemZ extendAtomicLoad() to DagCombiner.cpp.
2024-03-27[FreeBSD] Mark __stack_chk_guard dso_local except for PPC64 (#86665)Justin Cady1-1/+2
Adjust logic of 1cb9f37a17ab to match freebsd/freebsd-src@9a4d48a645a7a. D113443 is the original attempt to bring this FreeBSD patch to llvm-project, but it never landed. This change is required to build FreeBSD kernel modules with -fstack-protector using a standard LLVM toolchain. The FreeBSD kernel loader does not handle R_X86_64_REX_GOTPCRELX relocations. Fixes #50932.
2024-03-20[AArch64] Support scalable offsets with isLegalAddressingMode (#83255)Graham Hunter1-0/+4
Allows us to indicate that an addressing mode featuring a vscale-relative immediate offset is supported.
2024-03-11[CodeGen] Do not pass MF into MachineRegisterInfo methods. NFC. (#84770)Jay Foad1-1/+1
MachineRegisterInfo already knows the MF so there is no need to pass it in as an argument.
2024-03-04[SelectionDAG] Add `STRICT_BF16_TO_FP` and `STRICT_FP_TO_BF16` (#80056)Shilei Tian1-0/+3
This patch adds the support for `STRICT_BF16_TO_FP` and `STRICT_FP_TO_BF16`.
2024-03-04Revert "[SelectionDAG] Add `STRICT_BF16_TO_FP` and `STRICT_FP_TO_BF16` (#80056)"Shilei Tian1-3/+0
This reverts commit b0c158bd947c360a4652eb0de3a4794f46deb88b. The changes in `compiler-rt` broke tests.
2024-03-04[SelectionDAG] Add `STRICT_BF16_TO_FP` and `STRICT_FP_TO_BF16` (#80056)Shilei Tian1-0/+3
This patch adds the support for `STRICT_BF16_TO_FP` and `STRICT_FP_TO_BF16`.
2024-02-13[X86][CodeGen] Restrict F128 lowering to GNU environment (#81664)Pranav Kant1-1/+1
Otherwise it breaks some environments, such as X64 Android, that don't have f128 functions available in their libc. Followup to #79611.
2024-02-13[LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (#81331)Joseph Huber1-0/+3
Summary: This patch adds a new intrinsic and builtin function mirroring the existing `__builtin_readcyclecounter`. The difference is that this implementation targets a separate counter that some targets have, which returns a fixed-frequency clock that can be used to determine elapsed time. This differs from the cycle counter, which often has a variable frequency. This patch only adds support for the NVPTX and AMDGPU targets. This is done as a new and separate builtin rather than an argument to `readcyclecounter` to avoid needing to change existing code and to make the separation more explicit.
2024-02-09[X86][CodeGen] Emit float128 libcalls for math functions (#79611)Pranav Kant1-0/+40
Make LLVM emit libcalls to proper float128 variants for float128 types.