riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2026-01-19	[mlir][math] Fix the semantics of math.clampf (#175012)	Jack	1	-2/+2
	The `math.clampf` op is semantically incorrect when compared to both the CUDA reference implementation and the SPIRV spec, both of which have a clamp op. - Fix the definition of `math.clampf` to agree with CUDA and SPIRV - Explicitly state when `math.clampf` produces `ub.poison` - Update the ExpandOps pass to reflect the corrected semantics
2025-12-19	[mlir][math] Propagate fast math attrs in AlgebraicSimplification (#166802)	Aleksei Nurmukhametov	1	-12/+11
	Fix missing propagation of fast-math flags in algebraic simplification patterns of the MLIR math dialect.
2025-10-01	[MLIR] Add sincos fusion pass (#161413)	Asher Mancinelli	2	-0/+81
	We see performance improvements from using sincos to reuse calculations in hot loops that compute sin() and cos() of the same operand. Add a pass to identify sin() and cos() calls in the same block with the same operand and fast-math flags, and fuse them into a sincos op. Follow-up to: * #160561 * #160772
2025-09-19	[MLIR] Add new complex.powi op (#158722)	Akash Banerjee	2	-10/+33
	This PR adds a new complex.powi operation to MLIR's complex dialect for computing complex numbers raised to integer powers. Key changes include: - Addition of the new `PowiOp` operation definition in the Complex dialect - Integration with algebraic simplification passes for optimization - Support for conversion to ROCDL library calls - Updates to Flang frontend to generate the new operation This depends on #158642.
2025-09-02	[mlir][math] Add `clampf` and clean math `ExpandOps` API (#151153)	Fabian Mora	2	-63/+78
	This patch adds the `clampf` operation to the math dialect. The semantics op are defined as: ``` clampf(x, min_v, max_v) = max(min(x, min_v), max_v) ``` The reasoning behind adding this operation is that some GPU vendors offer specialized intrinsics for this operation, or subsets of this operation. For example, [__saturatef](https://docs.nvidia.com/cuda/cuda-math-api/cuda_math_api/group__CUDA__MATH__INTRINSIC__SINGLE.html#group__cuda__math__intrinsic__single_1ga2c84f08e0db7117a14509d21c3aec04e) in NVIDIA GPUs, or `__builtin_amdgcn_fmed3f` in AMD GPUs. This patch also removes `test-expand-math` in favor of `math-expand-ops`. Finally, it removes individual expansion population API calls like `populateExpandCoshPattern` in favor of: ```C++ void populateExpansionPatterns(RewritePatternSet &patterns, ArrayRef<StringRef> opMnemonics = {}); ```
2025-07-23	[mlir] Remove unused includes (NFC) (#150266)	Kazu Hirata	2	-2/+0
	These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-07-22	[mlir][NFC] update `mlir/Dialect` create APIs (19/n) (#149926)	Maksim Levental	4	-441/+452
	See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-13	[mlir] Remove unused includes (NFC) (#148535)	Kazu Hirata	2	-2/+0

2025-04-15	[mlir][math] powi with negative exponent should invert at the end (#135735)	Asher Mancinelli	1	-5/+5
	Previously, an FPowI operation would invert the base before performing a sequence of multiplications, but this led to discrepancies between LLVM pow intrinsic folding and that coming from the math dialect. See compiler-rt's version, which does the inverse at the end of the calculation: compiler-rt/lib/builtins/powidf2.c
2025-03-11	[mlir][math] add benefit arg to populate math approximations/expansions ↵	Emilio Cota	1	-51/+54
	(#130782) This is a follow-up to #127291, which added the benefit arg to lowerings to intrinsics and libm. In this change we add the benefit arg to the math approximation and expansion lowerings, which allows users to establish a preferred order among all three math lowerings, namely approximations, intrinsics and libm. Note that we're only updating the new API added in #126103. The legacy one (`mlir::populateMathPolynomialApproximationPatterns`) is left unmodified to encourage users to move out of it.
2025-02-28	[mlir][math] Rsqrt math expand pass expects static shaped operand (#129006)	Kai Sasaki	1	-0/+5
	Similar to the issue reported in https://github.com/llvm/llvm-project/pull/128299#pullrequestreview-2636142506, ExpandMath pattern for rsqrt expects the static shaped operands. Otherwise, it crashes due to the assertion violation. See: https://github.com/llvm/llvm-project/pull/128299
2025-02-26	[mlir][math] expand-math pass assumes the static shaped type (#128299)	Kai Sasaki	1	-0/+5
	In the process of `expand-math` pass, the conversion of ceil op assumes the static shaped type as input as it needs create 0 and 1 constant values whose type is aligned with the op type. Fixes https://github.com/llvm/llvm-project/issues/128275
2025-02-18	[MLIR][Math] Add erfc to math dialect (#126439)	Jan Leyonberg	1	-5/+117
	This patch adds the erfc op to the math dialect. It also does lowering of the math.erfc op to libm calls. There is also a f32 polynomial approximation for the function based on https://stackoverflow.com/questions/35966695/vectorizable-implementation-of-complementary-error-function-erfcf This is in turn based on M. M. Shepherd and J. G. Laframboise, "Chebyshev Approximation of (1+2x)exp(x^2)erfc x in 0 <= x < INF", Mathematics of Computation, Vol. 36, No. 153, January 1981, pp. 249-253. The code has a ULP error less than 3, which was tested, and MLIR test values were verified against the C implementation.
2025-02-14	[MLIR] Lower `math.powf(x, 3.0)` to `x * x * x`. (#127256)	Benoit Jacob	1	-4/+9
	`math.powf(x, y)` never really supported negative values of `x`, but that was unclear (happened to work for some values of `y`) until https://github.com/llvm/llvm-project/pull/126338 was merged yesterday and lowered it to the usual `exp(y * log(x))` outside of a few special exponent values, such as y == 2.0` lowering to `x * x`. It turns out that code in the wild has been relying on `math.powf(x, y)` with negative `x` for some integral values of `y` for which a lowering to muls was intended: https://github.com/iree-org/iree/issues/19996 This PR adds such a lowering for `y == 3.0`. It "fixes" such cases, and it is a more efficient lowering anyway. There needs to be a wider project to stop altogether using `powf` with negative `x`, use `math.fpowi` for that. Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2025-02-13	[mlir][math] `powf(a, b)` drop support when a < 0 (#126338)	Hyunsung Lee	1	-30/+62
	Related: #124402 - change inefficient implementation of `powf(a, b)` to handle `a < 0` case - thus drop `a < 0` case support However, some special cases are being used such as: - `a < 0` and `b = 0, b = 0.5, b = 1 or b = 2` - convert those special cases into simpler ops.
2025-02-10	[MLIR][Math] Add fine-grained populate-patterns functions for math function ↵	Benoit Jacob	1	-19/+116
	rewrites. (#126103) The existing `mlir::populateMathPolynomialApproximationPatterns` is coarse-grained and inflexible: - It populates 2 distinct classes of patterns: (1) polynomial approximations, (2) expansions of operands to f32. - It does not offer knobs to select which math functions to apply the rewrites to. This PR adds finer-grained populate-patterns functions, which take a predicate lambda allowing the caller to control which math functions to apply rewrites to. Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
2025-02-06	Revert "[mlir][math]Update `convertPowfOp` `ExpandPatterns.cpp`" (#126063)	Han-Chung Wang	1	-7/+18
	Reverts llvm/llvm-project#124402 It breaks an integration test in downstream project (i.e., IREE), which produces NANs. Talked to the author @ita9naiwa, and we agree to reland the PR after we find the issue.
2025-01-28	[mlir][math]Update `convertPowfOp` `ExpandPatterns.cpp` (#124402)	Hyunsung Lee	1	-18/+7
	The current implementation of `convertPowfOp` requires a calculation of `a * a` but, max\<fp16\> ~= 65,504, and if `a` is about 16, it will overflow so get INF in fp8 or fp16 easily. Remove support when `a < 0`. Overhead of handling negative value of `a` is large and easy to overflow; - related issue in iree: https://github.com/iree-org/iree/issues/15936
2025-01-24	[mlir] [math] Fix the precision issue of expand math (#120865)	donald chen	1	-29/+0
	The convertFloorOp pattern incurs precision loss when floating-point numbers exceed the representable range of int64. This pattern should be removed. Fixes https://github.com/llvm/llvm-project/issues/119836
2024-12-20	[mlir] Enable decoupling two kinds of greedy behavior. (#104649)	Jacques Pienaar	1	-2/+1
	The greedy rewriter is used in many different flows and it has a lot of convenience (work list management, debugging actions, tracing, etc). But it combines two kinds of greedy behavior 1) how ops are matched, 2) folding wherever it can. These are independent forms of greedy and leads to inefficiency. E.g., cases where one need to create different phases in lowering and is required to applying patterns in specific order split across different passes. Using the driver one ends up needlessly retrying folding/having multiple rounds of folding attempts, where one final run would have sufficed. Of course folks can locally avoid this behavior by just building their own, but this is also a common requested feature that folks keep on working around locally in suboptimal ways. For downstream users, there should be no behavioral change. Updating from the deprecated should just be a find and replace (e.g., `find ./ -type f -exec sed -i 's\|applyPatternsAndFoldGreedily\|applyPatternsGreedily\|g' {} \;` variety) as the API arguments hasn't changed between the two.
2024-12-09	[mlir][math] Fix `math.powf` expansion case for `pow(x, 0)` (#119015)	Christopher Bate	1	-0/+8
	Lowering `math.powf` to `llvm.intr.powf` will result in `pow(x, 0) = 1`, even for `x=0`. When using the Math dialect expansion patterns, `pow(0, 0)` will result in `-nan`, however, This change adds two additional instructions to the lowering to ensure the `pow(x, 0)` case lowers to to `1` regardless of the value of `x`. Resolves https://github.com/llvm/llvm-project/issues/118945.
2024-11-05	[mlir][Math] Fix 0-rank support for PolynomialApproximation (#114826)	Kunwar Grover	1	-34/+31
	This patch disambiguates 0-rank vectors and scalars in PolynomialApproximation. This fixes a bug in PolynomialApproximation where 0-rank vectors would be treated as scalars and arguments would not be broadcasted properly.
2024-10-05	[mlir][NFC] Mark type converter in `populate...` functions as `const` (#111250)	Matthias Springer	1	-2/+2
	This commit marks the type converter in `populate...` functions as `const`. This is useful for debugging. Patterns already take a `const` type converter. However, some `populate...` functions do not only add new patterns, but also add additional type conversion rules. That makes it difficult to find the place where a type conversion was added in the code base. With this change, all `populate...` functions that only populate pattern now have a `const` type converter. Programmers can then conclude from the function signature that these functions do not register any new type conversion rules. Also some minor cleanups around the 1:N dialect conversion infrastructure, which did not always pass the type converter as a `const` object internally.
2024-09-27	[mlir] Refactor LegalizeToF32 to specify extra supported float types and ↵	Daniel Hernandez-Juarez	3	-119/+165
	target type as arguments (#108815) Instead of hardcoding all fp smaller than 32 bits are unsupported we provide a way to pass supported floating point types as well as the target type. fp64 and fp32 are implicitly supported. CC: @krzysz00 @manupak
2024-08-21	[MLIR][MathDialect] fix fp32 promotion crash when encounters scf.if (#104451)	Ivy Zhang	1	-4/+5
	1. Expand legal op list in `legalizeToF32` 2. add legalization support for `math::rsqrtOp` in `mathToLibm`.
2024-08-04	[mlir] Construct SmallVector with ArrayRef (NFC) (#101896)	Kazu Hirata	1	-1/+1

2024-07-31	[mlir][math] Fix polynomial `math.asin` approximation (#101247)	Rob Suderman	1	-3/+34
	The polynomial approximation for asin is only good between [-9/16, 9/16]. Values beyond that range must be remapped to achieve good numeric results. This is done by the equation below: `arcsin(x) = PI/2 - arcsin(sqrt(1.0 - x*x))`
2024-06-15	[MLIR][Arith][Resubmit] add fastMathAttr on arith::extf and arith::truncf ↵	Ivy Zhang	1	-3/+8
	(#95346) Add an `fastMathAttr` on `arith::extf` and `arith::truncf`. If these two ops are inserted by some promotion passes (like legalize-to-f32 / emulate-unsupported-floats), they will be labeled as `FastMathFlags::contract`, denoting that they can be then `eliminated by canonicalizer`. The `elimination` can help improve performance, while may introduce some numerical differences.
2024-06-13	Revert "[MLIR][Arith] add fastMathAttr on arith::extf and arith::truncf" ↵	Ivy Zhang	1	-8/+3
	(#95344) Reverts llvm/llvm-project#93443
2024-06-13	[MLIR][Arith] add fastMathAttr on arith::extf and arith::truncf (#93443)	Ivy Zhang	1	-3/+8
	Add an `fastMathAttr` on `arith::extf` and `arith::truncf`. If these two ops are inserted by some promotion passes (like legalize-to-f32 / emulate-unsupported-floats), they will be labeled as `FastMathFlags::contract`, denoting that they can be then `eliminated by canonicalizer`. The `elimination` can help improve performance, while may introduce some numerical differences.
2024-05-13	[mlir][math] lower rsqrt to sqrt + fdiv (#91344)	Corentin Ferry	1	-0/+21
	This commit creates an expansion pattern to lower math.rsqrt(x) into fdiv(1, sqrt(x)).
2024-05-07	[mlir][math] Add Polynomial Approximation for acos, asin op (#90962)	Prashant Kumar	1	-6/+154
	Adds the Polynomial Approximation for math.acos and math.asin op. Also, it adds integration tests. The Approximation has been borrowed from https://stackoverflow.com/a/42683455 I added this script: https://gist.github.com/pashu123/cd3e682b21a64ac306f650fb842a422b to test 50 values between -1 and 1. The results are https://gist.github.com/pashu123/8acb233bd045bacabfa8c992d4040465. It's well within the bounds.
2024-05-07	[mlir][math] Add expand patterns for acosh, asinh, atanh (#90718)	jinchen	1	-12/+75

2024-04-03	[mlir][math] Convert math.fpowi to math.powf in case of non constant (#87472)	Prashant Kumar	1	-5/+15
	Convert math.fpowi to math.powf by converting dtype of power operand to floating point.
2024-04-01	[mlir][math] Expand powfI operation for constant power operand. (#87081)	Prashant Kumar	1	-5/+82
	-- Convert `math.fpowi` to a series of `arith.mulf` operations. -- If the power is negative, we divide the result by 1.
2024-03-17	[mlir][math] Reland 58ef9bec071383744fb703ff08df9806f25e4095 (#85436)	srcarroll	1	-18/+26
	The previous implementation decomposes tanh(x) into `(exp(2x) - 1)/(exp(2x)+1), x < 0` `(1 - exp(-2x))/(1 + exp(-2x)), x >= 0` This is fine as it avoids overflow with the exponential, but the whole decomposition is computed for both cases unconditionally, then the result is chosen based off the sign of the input. This results in doing two expensive exp computations. The proposed change avoids doing the whole computation twice by exploiting the reflection symmetry `tanh(-x) = -tanh(x)`. We can "normalize" the input to be positive by setting `y = sign(x) * x`, where the sign of `x` is computed as `sign(x) = (float)(x > 0) * (-2) + 1`. Then compute `z = tanh(y) `with the decomposition above for `x >=0` and "denormalize" the result `z * sign(x)` to retain the sign. The reason it is done this way is that it is very amenable to vectorization. This method trades the duplicate decomposition computations (which takes 5 instructions including an extra expensive exp and div) for 4 cheap instructions to compute the signs value `arith.cmpf `(which is a pre-existing instruction in the previous impl) `arith.sitofp` `arith.mulf` `arith.addf` and 1 more instruction to get the right sign in the result 5. `arith.mulf`. Moreover, numerically, this implementation will yield the exact same results as the previous implementation. As part of the relanding, a casting issue from the original commit has been fixed, i.e. casting bool to float with `uitofp`. Additionally a correctness test with `mlir-cpu-runner` has been added.
2024-03-15	[mlir][math] Propagate scalability in polynomial approximation (#84949)	Benjamin Maxwell	1	-23/+34
	This simply updates the rewrites to propagate the scalable flags (which as they do not alter the vector shape, is pretty simple). The added tests are simply scalable versions of the existing vector tests.
2024-03-15	Revert "[mlir][math] Implement alternative decomposition for tanh (#8… ↵	srcarroll	1	-23/+17
	(#85429) …5025)" This reverts commit 58ef9bec071383744fb703ff08df9806f25e4095. There is a bool to float casting issue that needs to be sorted out to make sure this is target independent
2024-03-14	[mlir][math] Implement alternative decomposition for tanh (#85025)	srcarroll	1	-17/+23
	The previous implementation decomposes `tanh(x)` into `(exp(2x) - 1)/(exp(2x)+1), x < 0` `(1 - exp(-2x))/(1 + exp(-2x)), x >= 0` This is fine as it avoids overflow with the exponential, but the whole decomposition is computed for both cases unconditionally, then the result is chosen based off the sign of the input. This results in doing two expensive `exp` computations. The proposed change avoids doing the whole computation twice by exploiting the reflection symmetry `tanh(-x) = -tanh(x)`. We can "normalize" the input to be positive by setting `y = sign(x) * x`, where the sign of `x` is computed as `sign(x) = (float)(x > 0) * (-2) + 1`. Then compute `z = tanh(y)` with the decomposition above for `x >=0` and "denormalize" the result `z * sign(x)` to retain the sign. The reason it is done this way is that it is very amenable to vectorization. This method trades the duplicate decomposition computations (which takes 5 instructions including an extra expensive `exp` and `div`) for 4 cheap instructions to compute the signs value 1. `arith.cmpf` (which is a pre-existing instruction in the previous impl) 2. `arith.sitofp` 3. `arith.mulf` 4. `arith.addf` and 1 more instruction to get the right sign in the result 5. `arith.mulf`. Moreover, numerically, this implementation will yield the exact same results as the previous implementation.
2024-02-23	[MLIR] Expose approximation patterns for tanh/erf. (#82750)	Johannes Reifferscheid	1	-0/+10
	These patterns can already be used via populateMathPolynomialApproximationPatterns, but that includes a number of other patterns that may not be needed. There are already similar functions for expansion. For now only adding tanh and erf since I have a concrete use case for these two.
2024-02-13	[mlir][vector] ND vectors linearization pass (#81159)	Ivan Butygin	1	-13/+7
	Common backends (LLVM, SPIR-V) only supports 1D vectors, LLVM conversion handles ND vectors (N >= 2) as `array<array<... vector>>` and SPIR-V conversion doesn't handle them at all at the moment. Sometimes it's preferable to treat multidim vectors as linearized 1D. Add pass to do this. Only constants and simple elementwise ops are supported for now. @krzysz00 I've extracted yours result type conversion code from LegalizeToF32 and moved it to common place. Also, add ConversionPattern class operating on traits.
2024-01-22	Apply clang-tidy fixes for readability-identifier-naming in ↵	Mehdi Amini	1	-23/+23
	PolynomialApproximation.cpp (NFC)
2024-01-18	[mlir][Math] Add pass to legalize math functions to f32-or-higher (#78361)	Krzysztof Drewniak	2	-0/+119
	Since most of the operations in the `math` dialect don't have low-precision implementations, add the -math-legalize-to-f32 pass that goes through and brackets low-precision math funcitons (like `math.sin %0 : f16`) with `arith.extf` and `arith.truncf`. This preserves the original semantics of the math operation but allows lowering to proceed. Versions of this lowering are already implicitly present in some passes, like ConvertGPUToROCDL. However, because those are implicit rewrites, they hide the floating-point extension and truncation, preventing anyone from writing passes that operate on those implitic extf/truncf pairs. Exposing this legalization explicitly is needed to allow lowening 8-bit floats on AMD GPUs, as the implementation of extf and truncf on that platform requires the complex logic found in ArithToAMDGPU, which runs before the GPU to ROCDL lowering.
2023-12-15	[mlir][math] Added `math.sinh` with expansions to `math.exp` (#75517)	Rob Suderman	1	-0/+40
	Includes end-to-end tests for the cpu running, folders using `libm` and lowerings to the corresponding `libm` operations.
2023-08-25	[mlir][math] Modify math.powf to handle negative bases.	Balaji V. Iyer	1	-3/+18
	Powf expansion currently returns NaN when the base is negative. This is because taking natural log of a negative number gives NaN. This patch will square the base and half the exponent, thereby getting around the negative base problem. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D158797
2023-08-24	[MLIR][Math] Add support for f64 in the expansion of math.roundeven	Alexander Shaposhnikov	1	-21/+30
	Add support for f64 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 This is based on the offline discussion and essentially recommits https://reviews.llvm.org/D158234. Test plan: ninja check-mlir check-all
2023-08-18	Revert "[MLIR][Math] Add support for f16 in the expansion of math.roundeven"	Alexander Shaposhnikov	1	-29/+21
	This reverts commit 40bf36319e383b7b5f2ffbee9abc18d93e1e07b4. The build bot ppc64le-mlir-rhel-test got broken by these changes, see https://lab.llvm.org/buildbot#builders/88/builds/61048 .
2023-08-18	[MLIR][Math] Add support for f16 in the expansion of math.roundeven	Alexander Shaposhnikov	1	-21/+29
	Add support for f16 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 This version addresses the build issues on Windows reported on https://reviews.llvm.org/D157204 Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D158234
2023-08-17	Revert "[MLIR][Math] Add support for f16 in the expansion of math.roundeven"	Alexander Shaposhnikov	1	-29/+21
	This reverts commit b96f6cf62902ca96ed5aa62d4e158292280284e1 since it has broken some Windows build bots (see https://reviews.llvm.org/D157204). Will recommit a fixed version later.
2023-08-17	[MLIR][Math] Add support for f16 in the expansion of math.roundeven	Alexander Shaposhnikov	1	-21/+29
	Add support for f16 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D157204