path: root/llvm/lib/Analysis/ConstantFolding.cpp
Age, Commit message, Author, Files, Lines
2021-08-17[NFC] More get/removeAttribute() cleanupArthur Eubanks1-1/+1
2021-08-06Introduce intrinsic llvm.isnanSerge Pavlov1-1/+5
This is a recommit of the patch 16ff91ebccda1128c43ff3cee104e2c603569fb2, reverted in 0c28a7c990c5218d6aec47c5052a51cba686ec5e because it had an error in the call of getFastMathFlags (the base type should be FPMathOperator, not Instruction). The original commit message is duplicated below:

Clang has the builtin function '__builtin_isnan', which implements the C library function 'isnan'. This function is currently implemented entirely in clang codegen, which expands the function into a set of IR operations. There are three mechanisms by which the expansion can be made.

* The most common mechanism uses an unordered comparison made by the instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It is, however, not suitable if floating point exceptions are tracked. The corresponding IEEE 754 operation and C function must never raise an FP exception, even if the argument is a signaling NaN. Compare instructions usually do not have this property; they raise the 'invalid' exception in such a case. So this mechanism is unsuitable when exception behavior is strict. In particular, it could result in unexpected trapping if the argument is an SNaN.

* Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, although some targets can do the check in a more efficient way.

* The solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target-specific code into IR. Currently only SystemZ implements this hook, and it generates a call to a target-specific intrinsic function.

Although these mechanisms allow 'isnan' to be implemented with enough efficiency, expanding 'isnan' in clang has drawbacks:

* The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. This complicates analysis and can prevent some optimizations.

* IR can be created by tools other than clang; in this case the treatment of 'isnan' has to be duplicated in that tool.

Another issue with the current implementation of 'isnan' comes from the use of the options '-ffast-math' or '-fno-honor-nans'. If such an option is specified, 'fcmp uno' may be optimized to 'false'. This is a valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false':

std::isnan(std::numeric_limits<float>::quiet_NaN())

The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption, however, should not be applied to the functions that check FP number properties, including 'isnan'. If such a function returns the expected result instead of actually making the check, it becomes useless in many cases. The option '-ffast-math' is often used for performance-critical code, as it can speed up execution at the expense of manual treatment of corner cases. If 'isnan' returns the assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion that the limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'.

To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by the IEEE-754 and C standards in a target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where it is lowered in a target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so that the resulting code satisfies the requested semantics.

Differential Revision: https://reviews.llvm.org/D104854
2021-08-04Revert "Introduce intrinsic llvm.isnan"Serge Pavlov1-5/+1
This reverts commit 16ff91ebccda1128c43ff3cee104e2c603569fb2. Several errors were reported, mainly in test-suite execution time. Reverted for investigation.
2021-08-04Introduce intrinsic llvm.isnanSerge Pavlov1-1/+5
2021-07-23[ConstantFolding] Fold constrained arithmetic intrinsicsSerge Pavlov1-2/+109
Constant-fold the constrained variants of the operations fadd, fsub, fmul, fdiv, frem, fma and fmuladd. The change also sets up some means to support removal of unused constrained intrinsics. They are declared as accessing memory to model interaction with the floating point environment, so they were not removed, as they have side effects. Now constrained intrinsics that have "fpexcept.ignore" as exception behavior are removed if they have no uses. Intrinsics that have exception behavior other than "fpexcept.ignore" can be removed if it is known that they do not raise floating point exceptions. This happens during constant folding: the attributes of such an intrinsic are changed so that it is no longer claimed to access memory. Differential Revision: https://reviews.llvm.org/D102673
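The folding condition described above can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: it evaluates the operation in the host floating point environment as a stand-in for LLVM's own evaluation, models only two exception behaviors, and ignores rounding modes. The names 'ExceptionBehavior', 'FoldResult' and 'tryFoldFDiv' are hypothetical.

```cpp
#include <cfenv>

// Hypothetical, simplified model of the exception behavior operand.
enum class ExceptionBehavior { Ignore, Strict };

struct FoldResult {
  bool folded;   // true if the operation may be replaced by 'value'
  double value;  // meaningful only when 'folded' is true
};

// Evaluates lhs / rhs and folds it unless exceptions are tracked strictly
// and the evaluation raised one. Assumes the host honors <cfenv>.
FoldResult tryFoldFDiv(double lhs, double rhs, ExceptionBehavior eb) {
  std::fenv_t savedEnv;
  std::feholdexcept(&savedEnv);  // save environment, clear flags, no traps
  volatile double a = lhs;       // volatile keeps the division at run time
  volatile double b = rhs;
  volatile double quotient = a / b;
  const int raised = std::fetestexcept(FE_ALL_EXCEPT);
  std::fesetenv(&savedEnv);      // restore the caller's environment
  if (eb == ExceptionBehavior::Strict && raised != 0)
    return {false, 0.0};         // folding would hide an observable exception
  return {true, quotient};
}
```

Under these assumptions an exact quotient such as 1.0 / 4.0 folds in either mode, while an inexact one such as 1.0 / 3.0 folds only when exceptions are ignored.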
2021-07-20[ConstantFolding] avoid crashing on a fake math library callSanjay Patel1-2/+13
https://llvm.org/PR50960
2021-06-22[ConstantFold] Delay fetching pointer element typeNikita Popov1-8/+6
Don't do this while stripping pointer casts, instead fetch it at the end. This improves compatibility with opaque pointers for the case where the base object is not opaque.
2021-06-22[ConstantFolding] Separate conditions in GEP evaluation (NFC)Nikita Popov1-20/+17
Handle the gep p, 0-v case separately, and not as part of the loop that ensures all indices are constant integers. Those two things are not really related.
2021-06-10[ConstantFolding] Enable folding of min/max/copysign for all floatsSerge Pavlov1-3/+11
Previously such folding was enabled for half, float and double values only. With this change it is allowed for other floating point values also. Differential Revision: https://reviews.llvm.org/D103956
2021-06-01[OpaquePtr] Create API to make a copy of a PointerType with some address spaceArthur Eubanks1-2/+3
Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429
2021-05-28[ConstantFolding] Fix -Wunused-variable warning (NFC)Yang Fan1-1/+1
GCC warning:
```
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp: In function ‘llvm::Constant* llvm::ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*, const llvm::DataLayout&)’:
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp:713:19: warning: unused variable ‘SimplifiedGEP’ [-Wunused-variable]
  713 |         if (auto *SimplifiedGEP = dyn_cast<GEPOperator>(Simplified)) {
      |                   ^~~~~~~~~~~~~
```
2021-05-27[ConstFold] Simplify a load's GEP operand through local aliasesArthur Eubanks1-5/+47
MSVC-style RTTI produces loads through a GEP of a local alias which itself is a GEP. Currently we aren't able to devirtualize any virtual calls when MSVC RTTI is enabled. This patch attempts to simplify a load's GEP operand by calling SymbolicallyEvaluateGEP() with an option to look through local aliases. Differential Revision: https://reviews.llvm.org/D101100
2021-05-24[ConstProp] propagate poison from vector reduction element(s) to resultSanjay Patel1-1/+1
This follows from the underlying logic for binops and min/max. Although it does not appear that we handle this for min/max intrinsics currently. https://alive2.llvm.org/ce/z/Kq9Xnh
2021-05-22[ConstantFolding] Use APFloat for constant folding. NFCSerge Pavlov1-103/+80
Replace use of host floating types with operations on APFloat when it is possible. Use of APFloat makes analysis more convenient and facilitates constant folding in the case of non-default FP environment. Differential Revision: https://reviews.llvm.org/D102672
2021-05-21[APFloat] convertToDouble/Float can work on shorter typesSerge Pavlov1-4/+1
Previously APFloat::convertToDouble could be called only for APFloats that were built using double semantics. Other semantics like single precision were not allowed, although the corresponding numbers could be converted to double without loss of precision. A similar restriction applied to APFloat::convertToFloat. With this change any APFloat that can be precisely represented by double can be handled with convertToDouble. The behavior of convertToFloat was updated similarly. This makes the conversion operations more convenient and adds support for formats like half and bfloat. Differential Revision: https://reviews.llvm.org/D102671
2021-05-14[LowerConstantIntrinsics] reuse isManifestLogic from ConstantFoldingNick Desaulniers1-14/+1
GlobalVariables are Constants, yet should not unconditionally be considered true for __builtin_constant_p. Via the LangRef https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic: This intrinsic generates no code. If its argument is known to be a manifest compile-time constant value, then the intrinsic will be converted to a constant true value. Otherwise, it will be converted to a constant false value. In particular, note that if the argument is a constant expression which refers to a global (the address of which _is_ a constant, but not manifest during the compile), then the intrinsic evaluates to false. Move isManifestConstant from ConstantFolding to be a method of Constant so that we can reuse the same logic in LowerConstantIntrinsics. pr/41459 Reviewed By: rsmith, george.burgess.iv Differential Revision: https://reviews.llvm.org/D102367
2021-05-10[AMDGPU] Constant fold Intrinsic::amdgcn_permStanislav Mekhanoshin1-0/+44
Differential Revision: https://reviews.llvm.org/D102203
2021-04-29[ConstantFolding] propagate poison through vector reduction intrinsicsSanjay Patel1-1/+5
2021-04-29[ConstantFolding] refactor helper for vector reductions; NFCSanjay Patel1-33/+31
We should handle other cases (undef/poison), so reduce the duplication of repeated switches.
2021-04-27[ConstFold] Use const-folded operands in more placesArthur Eubanks1-22/+9
Previously we were const folding operands but not passing them. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101394
2021-04-20[AArch64] Constant fold sve_convert_from_svbool(zero) to zeroJoe Ellis1-17/+33
Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D100463
2021-04-20Explicitly pass type to cast load constant folding resultArthur Eubanks1-13/+9
Previously we would use the type of the pointee to determine what to cast the result of constant folding a load. To aid with opaque pointer types, we should explicitly pass the type of the load rather than looking at pointee types. ConstantFoldLoadThroughBitcast() converts the const prop'd value to the proper load type (e.g. [1 x i32] -> i32). Instead of calling this in every intermediate step like bitcasts, we only call this when we actually see the global initializer value. In some existing uses of this API, we don't know the exact type we're loading from immediately (e.g. first we visit a bitcast, then we visit the load using the bitcast). In those cases we have to manually call ConstantFoldLoadThroughBitcast() when simplifying the load to make sure that we cast to the proper type. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D100718
2021-04-16[WebAssembly] Remove saturating fp-to-int target intrinsicsThomas Lively1-20/+4
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the intrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596
2021-03-31[ConstantFolding] Fixing addo/subo with undefGeorge Mitenkov1-5/+8
When folding addo/subo with undef, the current convention is to use { -1, false } for addo and { 0, false } for subo. This was fixed for InstSimplify in https://reviews.llvm.org/rGf094d65beaa492e845b03561eddd75b5be653a01, but not in ConstantFolding. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D99564
2021-03-21Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()Nikita Popov1-1/+1
There seems to be an impedance mismatch between what the type system considers an aggregate (structs and arrays) and what constants consider an aggregate (structs, arrays and vectors). Adjust the type check to consider vectors as well. The previous version of the patch dropped the type check entirely, but it turns out that getAggregateElement() does require the constant to be an aggregate in some edge cases: For Poison/Undef the getNumElements() API is called, without checking in advance that we're dealing with an aggregate. Possibly the implementation should avoid doing that, but for now I'm adding an assert so the next person doesn't fall into this trap.
2021-03-16Revert "[ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()"Zequan Wu1-0/+5
That commit caused chromium build to crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1188885 This reverts commit edf7004851519464f86b0f641da4d6c9506decb1.
2021-03-12[ConstantFold] Handle undef/poison when constant folding smul_fix/smul_fix_satBjorn Pettersson1-35/+36
Do constant folding according to
  poison * C -> poison
  C * poison -> poison
  undef * C -> 0
  C * undef -> 0
for smul_fix and smul_fix_sat intrinsics (for any scale). Reviewed By: nikic, aqjune, nagisa Differential Revision: https://reviews.llvm.org/D98410
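The folding rules listed above can be written down as a small standalone model. This is only a sketch under stated assumptions: the 'Kind' and 'Operand' types and the function 'foldSMulFix' are hypothetical and do not mirror LLVM's Constant classes, the operands are 32-bit, the scale is below 32, and the scaled result is assumed to fit in 32 bits.

```cpp
#include <cstdint>

// Hypothetical model of a constant operand: poison, undef, or a known value.
enum class Kind { Poison, Undef, Value };

struct Operand {
  Kind kind;
  std::int32_t value;  // meaningful only when kind == Kind::Value
};

// Folds a signed fixed-point multiply with 'scale' fractional bits.
// Precondition (assumed here): scale < 32 and the scaled result fits in i32.
Operand foldSMulFix(Operand lhs, Operand rhs, unsigned scale) {
  // poison * C -> poison, C * poison -> poison
  if (lhs.kind == Kind::Poison || rhs.kind == Kind::Poison)
    return {Kind::Poison, 0};
  // undef * C -> 0, C * undef -> 0 (undef may be chosen to be zero)
  if (lhs.kind == Kind::Undef || rhs.kind == Kind::Undef)
    return {Kind::Value, 0};
  // Both known: widen, multiply, then drop the extra fractional bits.
  // Division rounds toward zero, one of the roundings a fold may pick.
  const std::int64_t product = static_cast<std::int64_t>(lhs.value) * rhs.value;
  const std::int64_t scaled = product / (std::int64_t{1} << scale);
  return {Kind::Value, static_cast<std::int32_t>(scaled)};
}
```

With scale 2, the operands 12 and 8 stand for 3.0 and 2.0, and the fold yields 24, which stands for 6.0.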
2021-03-06[ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()Nikita Popov1-5/+0
There seems to be an impedance mismatch between what the type system considers an aggregate (structs and arrays) and what constants consider an aggregate (structs, arrays and vectors). Rather than adjusting the type check, simply drop it entirely, as getAggregateElement() is well-defined for non-aggregates: It simply returns null in that case.
2021-01-10[ConstantFold] Fold fptoi.sat intrinsicsNikita Popov1-1/+16
The APFloat::convertToInteger() API already implements the desired saturation semantics.
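For reference, the saturation semantics in question can be sketched in plain C++ for one concrete case, double to signed 32-bit. This is an illustration of the semantics only, not the APFloat implementation; the function name 'fpToSI32Sat' is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical illustration of saturating fp-to-signed-int conversion:
// NaN maps to 0, out-of-range values clamp to the nearest limit, and
// everything else is truncated toward zero.
std::int32_t fpToSI32Sat(double x) {
  if (std::isnan(x))
    return 0;
  if (x <= -2147483648.0)
    return std::numeric_limits<std::int32_t>::min();
  if (x >= 2147483647.0)
    return std::numeric_limits<std::int32_t>::max();
  // In range, so the cast is well defined and truncates toward zero.
  return static_cast<std::int32_t>(x);
}
```

Both range limits are exactly representable as double, so the comparisons introduce no rounding of their own.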
2021-01-08[ConstProp] Constant propagation for get.active.lane.mask intrinsicsDavid Green1-0/+20
Similar to the Arm VCTP intrinsics, if the operands of an active.lane.mask are both known, the constant lane mask can be calculated. This can come up after unrolling the loops. Differential Revision: https://reviews.llvm.org/D94103
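How such a constant lane mask could be computed when both operands are known is sketched below. The sketch follows the LangRef description of the intrinsic (lane i is active when base + i is below the limit, and lanes whose index computation would wrap are inactive). The function name 'activeLaneMask' and the use of 64-bit operands are assumptions made for illustration.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: computes the mask for 'lanes' vector lanes given a
// constant base index and a constant limit (for example a trip count).
std::vector<bool> activeLaneMask(std::uint64_t base, std::uint64_t limit,
                                 unsigned lanes) {
  std::vector<bool> mask(lanes, false);
  const std::uint64_t maxIndex = std::numeric_limits<std::uint64_t>::max();
  for (unsigned i = 0; i < lanes; ++i) {
    // A lane whose index computation would wrap around is inactive.
    const bool wraps = base > maxIndex - i;
    mask[i] = !wraps && (base + i < limit);
  }
  return mask;
}
```

For example, with base 4, limit 6 and four lanes, only the first two lanes are active.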
2020-12-30[X86] Add x86_amx type for intel AMX.Luo, Yuanke1-6/+9
The x86_amx type is used for AMX intrinsics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by a load/store instruction. So AMX intrinsics only operate on type x86_amx. This helps to separate AMX intrinsics from LLVM IR instructions (+-*/). Thanks to Craig for the idea. This patch depends on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927
2020-11-19[llvm][IR] Add dso_local_equivalent ConstantLeonard Chan1-3/+18
The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit.

In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references

This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.

Differential Revision: https://reviews.llvm.org/D77248
2020-10-16[AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsicJay Foad1-2/+16
Differential Revision: https://reviews.llvm.org/D89558
2020-10-07[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.Amara Emerson1-36/+36
This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787
2020-09-19[ConstantFolding] add undef handling for fmin/fmax intrinsicsSanjay Patel1-0/+19
The output here may not be optimal (yet), but it should be consistent for commuted operands (it was not before) and correct. We can do better by checking FMF and NaN if needed. Code in InstSimplify generally assumes that we have already folded code like this, so it was not handling 2 constant inputs by commuting consistently.
2020-08-10[WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsicsThomas Lively1-1/+42
Constant fold both the trapping and saturating versions of the WebAssembly truncation intrinsics. The tests are adapted from the WebAssembly spec tests for the corresponding instructions. Requested in PR46982. Differential Revision: https://reviews.llvm.org/D85392
2020-07-31[ConstantFolding] fold abs intrinsicSanjay Patel1-0/+13
The handling for minimum value is similar to cttz/ctlz with 0 just above this case. Differential Revision: https://reviews.llvm.org/D84942
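The special case for the minimum value can be sketched as follows, assuming a 32-bit operand. The intrinsic's second operand says whether the minimum value is a valid input; when it is not, the result has no defined value, which this sketch models with a flag. The names 'AbsResult' and 'foldAbs' are hypothetical.

```cpp
#include <cstdint>
#include <limits>

struct AbsResult {
  bool defined;        // false models "no defined value" for the result
  std::int32_t value;  // meaningful only when 'defined' is true
};

// Hypothetical fold of an integer abs with a flag that declares the
// minimum value an invalid input.
AbsResult foldAbs(std::int32_t x, bool minIsInvalid) {
  if (x == std::numeric_limits<std::int32_t>::min()) {
    if (minIsInvalid)
      return {false, 0};
    // |INT32_MIN| is not representable; the result wraps back to INT32_MIN.
    return {true, x};
  }
  return {true, x < 0 ? -x : x};  // negation cannot overflow here
}
```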
2020-07-31[NFC] Remove unused GetUnderlyingObject parameterVitaly Buka1-1/+1
Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621
2020-07-30[NFC] GetUnderlyingObject -> getUnderlyingObjectVitaly Buka1-1/+1
I am going to touch them in the next patch anyway
2020-07-29[ConstantFolding] fold integer min/max intrinsicsSanjay Patel1-0/+33
If both operands are undef, return undef. If one operand is undef, clamp to limit constant.
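A minimal model of these two rules for the signed variants is sketched below. It assumes 32-bit operands and reads "limit constant" as the signed maximum for smax and the signed minimum for smin, since an undef operand may be chosen to be that value. The names 'MaybeUndef', 'foldSMax' and 'foldSMin' are hypothetical.

```cpp
#include <cstdint>
#include <limits>

struct MaybeUndef {
  bool isUndef;
  std::int32_t value;  // meaningful only when isUndef is false
};

// Hypothetical fold of a signed max with possibly-undef constant operands.
MaybeUndef foldSMax(MaybeUndef a, MaybeUndef b) {
  if (a.isUndef && b.isUndef)
    return {true, 0};  // undef, undef -> undef
  if (a.isUndef || b.isUndef)
    return {false, std::numeric_limits<std::int32_t>::max()};  // clamp
  return {false, a.value > b.value ? a.value : b.value};
}

// Hypothetical fold of a signed min with possibly-undef constant operands.
MaybeUndef foldSMin(MaybeUndef a, MaybeUndef b) {
  if (a.isUndef && b.isUndef)
    return {true, 0};
  if (a.isUndef || b.isUndef)
    return {false, std::numeric_limits<std::int32_t>::min()};
  return {false, a.value < b.value ? a.value : b.value};
}
```

The result does not depend on which operand is undef, so commuted operands fold to the same constant.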
2020-07-26[ConstantFolding] Fold freeze if it is never undef or poisonJuneyoung Lee1-0/+2
This is a simple patch that adds constant folding for freeze instruction. IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze constexpr. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84597
2020-07-22[SVE] Remove calls to VectorType::getNumElements from AnalysisChristopher Tetreault1-5/+5
Reviewers: efriedma, fpetrogalli, c-rhodes, asbirlea, RKSimon Reviewed By: RKSimon Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81504
2020-07-21[ARM] Constant fold VCTP intrinsicsDavid Green1-1/+34
We can sometimes get into the situation where the operand to a vctp intrinsic becomes constant, such as after a loop is fully unrolled. This adds the constant folding needed for them, allowing them to simplify away and hopefully simplifying remaining instructions. Differential Revision: https://reviews.llvm.org/D84110
2020-07-19[ConstantFolding] check applicability of AllOnes constant creation firstJameson Nash1-2/+6
The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870
2020-07-13[GVN] teach ConstantFolding correct handling of non-integral addrspace castsJameson Nash1-1/+11
Here we teach the ConstantFolding analysis pass that it is not legal to replace a load of a bitcast constant (having a non-integral addrspace) with a bitcast of the value of that constant (with a different non-integral addrspace). But also teach it that certain bit patterns are always known and convertible (a fact it already uses elsewhere). This required us to also fix a globalopt test, since, after this change, LLVM is able to realize that the test actually is a valid transform (NULL is always a known bit-pattern) and so it doesn't need to emit the failure remarks for it. Also simplify some of the negative tests for transforms by avoiding a type change in their bitcast, and add positive versions of the same tests, to show that they otherwise should work. Differential Revision: https://reviews.llvm.org/D59730
2020-07-13[GVN] add early exit to ConstantFoldLoadThroughBitcast [NFC]Jameson Nash1-1/+6
And adds some additional test coverage to ensure later commits don't introduce regressions. Differential Revision: https://reviews.llvm.org/D59730
2020-07-09ConstantFoldScalarCall3 - use const APInt& returned by getValue()Simon Pilgrim1-2/+2
Avoids unnecessary APInt copies and silences clang tidy warning.
2020-06-16[NFC] Bail out for scalable vectors before calling getNumElementsChristopher Tetreault1-8/+10
Summary: Move the bail out logic to before constructing the Result and Lane vectors. This is both potentially faster, and avoids calling getNumElements on a potentially scalable vector Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli Reviewed By: fpetrogalli Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81619
2020-06-03[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics (fix)Jay Foad1-1/+4
Try to fix Windows buildbots.
2020-06-03[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsicsJay Foad1-1/+20
Differential Revision: https://reviews.llvm.org/D80702