path: root/mlir
Age · Commit message · Author · Files · Lines
12 hours[mlir][spirv] Add OpExtension "SPV_INTEL_tensor_float32_conversion" (#151337)YixingZhang0075-50/+127
This PR provides support for the capability `TensorFloat32RoundingINTEL` and the instruction `OpRoundFToTF32INTEL`, as specified by the `SPV_INTEL_tensor_float32_conversion` extension. This extension introduces a rounding instruction that converts standard 32-bit floating-point values to the TensorFloat32 (TF32) format. Reference Specification: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_tensor_float32_conversion.asciidoc
12 hours[MLIR][SCF] Propagate loop annotation during while op lowering (#151746)Thomas Raoux2-17/+63
This expands on https://github.com/llvm/llvm-project/pull/102562 by also propagating loop annotation attributes during scf.while lowering.
13 hours[mlir][EmitC] Allow Fields to have initial values (#151437)Jaden Angella5-50/+119
This will ensure that:
- The `field` of a class can have an initial value
- The `field` op is emitted correctly
- The `getfield` op is emitted correctly
13 hours[mlir][spirv] Enforce Fixed-Size Vectors for SPIR-V (#151738)YixingZhang00711-59/+59
This PR enforces that SPIR-V instructions only allow fixed-size vectors (no scalable vectors) by updating `SPIRV_VectorOf` to use `FixedVectorOfLengthAndType` instead of `VectorOfLengthAndType`. The affected MLIR tests are also updated accordingly.
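As a hypothetical sketch (operand names and types are illustrative, not taken from the patch), the tightened constraint distinguishes fixed from scalable vectors:

```mlir
// Accepted: a fixed-size vector operand.
%sum = spirv.FAdd %a, %b : vector<4xf32>

// Rejected after this change: scalable vector types such as
// vector<[4]xf32> no longer satisfy SPIRV_VectorOf.
```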
14 hours[mlir][spirv] Fix verification and serialization of replicated constant composites of multi-dimensional array (#151168)Mohammadreza Ameri Mahabadian3-22/+48
This fixes a bug in verification and serialization of replicated constant composite ops where the splat value can potentially be a multi-dimensional array. Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>
14 hours[OpenACC][CIR] Implement 'private' clause lowering. (#151360)Erich Keane2-0/+56
The private clause is the first with 'recipes', so a lot of infrastructure is included here, including some MLIR dialect changes that allow simply adding a privatization. We'll likely get similar changes for firstprivate and reduction. We also have quite a bit of infrastructure in clause lowering to make sure most cases we could think of are covered. At the moment, ONLY private is implemented, so all it requires is an 'init' segment (that doesn't call any copy operations) and potentially a 'destroy' segment. However, actually calling 'init' functions on each of the elements is not yet properly implemented and will be addressed in a follow-up patch. This patch implements all of the above and adds tests in a way that will be useful for firstprivate as well.
15 hours[mlir][vector] vector.splat and vector.broadcast folding/canonicalizing parity (#150284)James Newling6-105/+272
This PR ensures parity in folding/canonicalizing of vector.broadcast (from a scalar) and vector.splat. This means that by using vector.broadcast instead of vector.splat (which is currently deprecated), there is no loss in optimizations performed. All tests which were previously checking folding/canonicalizing of vector.splat are now run for vector.broadcast. The vector.splat canonicalization tests are now in a separate file, ready for removal when, in the future, we remove vector.splat completely. This PR also adds a canonicalizer to vector.splat that always converts it to vector.broadcast, to reduce the 'traffic' through vector.splat. There is a chance that this PR will break downstream users who create or expect vector.splat; changing all such logic to use vector.broadcast instead should fix this.
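As an illustrative sketch (value names are hypothetical, not from the patch), the new canonicalizer turns a splat into a broadcast from its scalar operand:

```mlir
// Before canonicalization:
%v = vector.splat %s : vector<8xf32>

// After: rewritten to the equivalent broadcast from the scalar.
%v = vector.broadcast %s : f32 to vector<8xf32>
```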
18 hours[mlir][tosa] Enable Constant Operand Check by Default in TOSA Validation Pass (#150598)Luke Hutton3-3/+3
Previous behaviour was to only run the check for constant operands when the "strict-op-spec-alignment" option was enabled. However, this type of check seems more generally useful without this option enabled. For example, tosa-to-linalg doesn't enable the "strict-op-spec-alignment" option when running the tosa-validate pass, but it does expect operands to be constant. By enabling this check by default, lowerings that don't support non-constant operands and don't use the "strict-op-spec-alignment" option will fail early, rather than later in the lowering pipeline. Should a use-case need to disable this check, `--tosa-validate="extension=dynamic"` can be used.
18 hours[MLIR] Migrate pattern application / dialect conversion to the LDBG logging format (#150991)Mehdi Amini2-10/+13
This prefixes the output with the DEBUG_TYPE. Dialect conversion uses a ScopedPrinter; we insert the raw_ldbg_ostream to consistently prefix each new line.
18 hours[mlir][spirv] Enable (de)serialization of TensorARM to/from OpConstantNull (#151485)Mohammadreza Ameri Mahabadian3-7/+69
This patch enables (de)serialization to/from OpConstantNull for null TensorARM. Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>
19 hours[mlir][tosa] Support boolean types for clamp folder (#151653)Longsheng Mou2-27/+41
This PR fixes several bugs in the `ClampIsNoOp` pattern:
- The static shape check is not needed.
- i1 values are now zero-extended so that clamps of boolean types can be folded.
Fixes #130016.
20 hours[MLIR] Introduce an OpWithState class to act as a stream modifier for Operations (NFC) (#151547)Mehdi Amini2-2/+21
On the model of OpWithFlags, this modifier allows streaming an operation using a custom AsmPrinter.
20 hours[mlir][spirv] Fix lookup logic `spirv.target_env` for `gpu.module` (#147262)Jaeho Kim3-4/+78
The `gpu.module` operation can contain `spirv.target_env` attributes within an array attribute named `"targets"`. This change accounts for that case by iterating over the `"targets"` attribute, if present, and looking up `spirv.target_env` there. Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
20 hours[mlir][linalg] Add getCollapsedVecType and update vectorization of linalg.unpack (#151503)Andrzej Warzyński1-11/+52
This patch introduces a new helper, `getCollapsedVecType`, and updates `vectorizeAsTensorUnpackOp` to use it. The motivation stems from improving how `vector.shape_cast` operations are generated when vectorizing `linalg.unpack`. Previously, the vectorizer relied on `tensor::CollapseShapeOp::inferCollapsedType` to compute the collapsed vector type. This approach is suboptimal because:
* `inferCollapsedType` lacks awareness of scalable vector flags.
* Linalg vectorization should not depend on Tensor dialect utilities.
Instead of relocating `inferCollapsedType`, we introduce `getCollapsedVecType`, a lightweight, specialized hook that:
* Assumes no dynamic sizes.
* Handles scalable flags alongside shape dimensions.
This change also reduces temporary variables in `vectorizeAsTensorUnpackOp` and paves the way for a cleaner update in #149293.
21 hours[MLIR][NVVM] Support stmatrix intrinsics (#148377)Gao Yanfeng6-47/+143
Add support for the `@llvm.nvvm.stmatrix` intrinsic series. These correspond to PTX stmatrix operations, as documented in the [PTX ISA reference](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-store-instruction-stmatrix).
36 hours[mlir][vector] shape_cast(constant) -> constant fold for non-splats (#145539)James Newling2-7/+36
The folder `shape_cast(splat constant) -> splat constant` was first introduced [here](https://github.com/llvm/llvm-project/commit/36480657d8ce97836f76bf5fa8c36677b9cdc19a#diff-484cea976e0c96459027c951733bf2d22d34c5a0c0de6f577069870ef4588983R2600) (Nov 2020). In that commit there is a comment to _Only handle splat for now_. Based on that, I assume the intention was to, at a later time, support a general `shape_cast(constant) -> constant` folder. That is what this PR does. One minor downside: with this folder it is possible to end up with 2 large constants instead of 1 large constant and 1 shape_cast:
```mlir
func.func @foo() -> (vector<4xi32>, vector<2x2xi32>) {
  %cst = arith.constant dense<[1, 2, 3, 4]> : vector<4xi32> // 'large' constant 1
  %0 = vector.shape_cast %cst : vector<4xi32> to vector<2x2xi32>
  return %cst, %0 : vector<4xi32>, vector<2x2xi32>
}
```
gets folded with this new folder to
```mlir
func.func @foo() -> (vector<4xi32>, vector<2x2xi32>) {
  %cst = arith.constant dense<[1, 2, 3, 4]> : vector<4xi32> // 'large' constant 1
  %cst_0 = arith.constant dense<[[1, 2], [3, 4]]> : vector<2x2xi32> // 'large' constant 2
  return %cst, %cst_0 : vector<4xi32>, vector<2x2xi32>
}
```
Notes on the above case:
1) This only affects the textual IR; the actual values share the same context storage (I've verified this by checking pointer values in the `DenseIntOrFPElementsAttrStorage` [constructor](https://github.com/llvm/llvm-project/blob/da5c442550a3823fff05c14300c1664d0fbf68c8/mlir/lib/IR/AttributeDetail.h#L59)), so there is no compile-time memory overhead to this folding. At the LLVM IR level the constant is shared, too.
2) This only happens when the pre-folded constant cannot be dead-code eliminated (i.e. when it has 2+ uses), which I don't think is common.
37 hours[MLIR][SPIRV] Add spirv.IsFinite and lower math.{isfinite,isinf,isnan} to spirv (#151552)Xiaolei Feng6-4/+80
This patch adds support for lowering several float classification ops from the Math dialect to the SPIR-V dialect. Highlights:
- Introduced a new `spirv.IsFinite` operation corresponding to the SPIR-V `OpIsFinite` instruction.
- Lowered `math.isfinite`, `math.isinf`, and `math.isnan` to SPIR-V using `CheckedElementwiseOpPattern`.
- Added corresponding tests for op definition and conversion lowering.
This addresses the discussion in: https://github.com/llvm/llvm-project/issues/150778 Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
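As a rough sketch of the lowering (value names are hypothetical; the actual rewrite is done by the Math-to-SPIR-V conversion patterns), each classification op maps onto its SPIR-V counterpart:

```mlir
// Math dialect input:
%r = math.isnan %x : f32

// After conversion to SPIR-V, using the corresponding
// classification instruction:
%r = spirv.IsNan %x : f32
```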
37 hours[mlir] Implement inferResultRanges for vector.step op (#151536)Max1913-1/+29
Implements the `inferResultRanges` method from the `InferIntRangeInterface` interface for `vector.step`. The implementation is similar to that of arith.constant, since the exact result values are statically known. Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
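Since `vector.step` materializes the sequence 0, 1, ..., N-1, the inferred range follows directly from the vector length; an illustrative sketch (names hypothetical):

```mlir
%s = vector.step : vector<4xindex>
// Result elements are [0, 1, 2, 3], so the inferred
// unsigned range is [0, 3].
```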
38 hours[mlir][scf] Add `no_inline` attribute to `scf.execute_region` (#151352)Jungwook Park3-8/+49
Enabling users to explicitly specify which regions should be preserved uncovers additional opportunities to utilize the `scf.execute_region` op.
38 hours[mlir] Implement inferResultRanges for vector.transpose (#151537)Max1913-0/+15
Implements the `inferResultRanges` method from the `InferIntRangeInterface` interface for `vector.transpose`. The result ranges simply match the source ranges. Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
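A transpose only permutes element positions, so the result's inferred range equals the source's; an illustrative sketch (names hypothetical):

```mlir
// If %v's elements are known to lie in [0, 7], the transposed
// result carries the same inferred range [0, 7].
%t = vector.transpose %v, [1, 0] : vector<2x4xi32> to vector<4x2xi32>
```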
39 hours[mlir] Add non-attribute property predicates to adaptor verifiers (#150881)Krzysztof Drewniak2-1/+8
When adding a predicated field to non-attribute properties / implementing PropConstraint, a call to genPropertyVerifiers() wasn't added to the generation sequence for [Op]GenericAdaptor::verify. This commit fixes the issue.
41 hours[mlir] Set implicit operation loc to start of split. (#151499)Jacques Pienaar2-8/+26
41 hours[mlir][async]: Make async.execute operation with RecursiveMemoryEffects trait (#116544)Aviad Cohen2-1/+12
41 hours[mlir][vector] Avoid use of vector.splat in transforms (#150279)James Newling6-53/+76
This is part of vector.splat deprecation Reference: https://discourse.llvm.org/t/rfc-mlir-vector-deprecate-then-remove-vector-splat/87143/5 Instead of creating vector::SplatOp, create vector::BroadcastOp
43 hours[mlir][llvm] adds an attribute for the module level assembly (#151318)gitoleg7-1/+62
Adds support for the module level assembly in the LLVM IR dialect. --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
44 hours[MLIR] Specify new padOp's output type in DropPadUnitDims (#150706)Daniel Garvey2-22/+48
Previously when dropping unit dim from a pad with mixed dynamic/static input/output shapes, the resulting shape would take on the Type of the input, resulting in invalid IR. Also did some minor cleanup to the formatting of the `drop_unit_dim_corresponding_to_dynamic_dim` test to make it match the rest of the file. --------- Signed-off-by: dan <danimal197@gmail.com>
2 days[MLIR][XeGPU] Allow load/store/prefetch to use [memref+offset] instead of tdesc (#150576)Jianhui Li6-26/+338
Add variants of load/store/prefetch that allow an offset. The new xegpu.load variant accepts memref+offset, and the existing tdesc operand will be removed in a future PR. The semantics are a combination of "creating scattered_tdesc + xegpu.load with scattered_tdesc". The current xegpu.load accepts a tdesc operand, which encapsulates "memref+offset". This PR "folds" "memref+offset" directly into xegpu.load, replacing "tdesc". Create_tdesc will be removed, as scatter_tdesc only contains the base address once offsets are taken away, so there is no point in keeping it.
```mlir
// wi level code example
%2 = xegpu.load %src[%offsets], %mask <{chunk_size = 2}> : ui64, vector<1xindex>, vector<1xi1> -> vector<2xf32>
xegpu.store %val, %src[%offsets], %mask : vector<1xf16>, memref<?xf16>, vector<1xindex>, vector<1xi1>
xegpu.prefetch %src[%0] : ui64, vector<1xindex>
```
2 days[mlir][spirv] Add 8-bit float type emulation (#148811)Md Abdullah Shahneous Bari9-7/+185
8-bit floats are not supported in SPIR-V. They are emulated as 8-bit integers during conversion.
2 days[mlir][linalg][nfc] Clean-up leftover code post #149156 (#151334)Andrzej Warzyński1-6/+0
In https://github.com/llvm/llvm-project/pull/149156, I ensured that we no longer generate spurious `tensor.empty` ops when vectorizing `linalg.unpack`. This follow-up removes leftover code that is now redundant but was missed in the original PR and in #150602, which was also meant to clean up leftover code. Note, this removes code to compute "write-vector-sizes"; instead, these are fully inferred from previous Ops.
3 days[mlir][spirv] Fix serialization of multi-dimensional TensorArm constant (#151158)Mohammadreza Ameri Mahabadian2-0/+47
This fixes an issue where multi-dimensional TensorArm dense elements could not be serialized. Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>
3 days[mlir][spirv] Fix UpdateVCEPass pass to deduce the correct version (#151192)Mohammadreza Ameri Mahabadian2-1/+16
UpdateVCEPass currently deduces the required version based on the version requirements of ops. This fix adds a check to update the minimum required version based on capabilities as well. Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>
3 days[mlir][Vector] Allow elementwise/broadcast swap to handle mixed types (#151274)Krzysztof Drewniak2-79/+160
This patch extends the operation that rewrites elementwise operations whose inputs are all broadcast from the same shape to handle mixed-types, such as when the result and input types don't match, or when the inputs have multiple types. PR #150867 failed to check for the possibility of type mismatches when rewriting splat constants. In order to fix that issue, we add support for mixed-type operations more generally.
3 daysRevert "[mlir][spirv] Fix UpdateVCEPass to deduce the correct set of capabilities" (#151358)Igor Wodiany2-25/+16
Reverts llvm/llvm-project#151108 as it breaks sanitizer builds.
3 days[MLIR] Migrate some conversion passes and dialects to LDBG() macro (NFC) (#151349)Mehdi Amini8-68/+50
3 days[MLIR] Use LDBG in MLIR AsmPrinter (#151163)Mehdi Amini1-4/+3
3 days[MLIR] Migrate InlinerInterfaceImpl to the new LDBG() debug form (NFC) (#150853)Mehdi Amini1-17/+15
3 days[MLIR][XeGPU] Refactor xegpu-wg-to-sg tests (#149204)Nishant Patel2-203/+178
This PR refactors the xegpu-wg-to-sg.mlir tests to use larger shapes which more closely resemble workgroup-level programming.
3 days[mlir][spirv] Fix UpdateVCEPass to deduce the correct set of capabilities (#151108)Davide Grohmann2-16/+25
When deducing capabilities, implied capabilities are not considered, which causes generation of incorrect SPIR-V modules. This commit fixes that by pulling in the capability set for all the implied ones. Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>
3 days[mlir][spirv] Add support for structs decorations (#149793)Igor Wodiany10-78/+270
An alternative implementation could use an `ArrayRef` of `NamedAttribute`s or a `NamedAttrList` to store struct decorations, as the deserializer uses `NamedAttribute`s for decorations. However, using a custom struct allows us to store the `spirv::Decoration`s directly rather than their names in a `StringRef`/`StringAttr`.
3 daysReapply "[mlir][llvm] Add intrinsic arg and result attribute support (… (#151099)" (#151324)Tobias Gysi15-209/+262
This reverts commit 2780b8f22058b35a8e70045858b87a1966df8df3 and relands b7bfbc0c4c7b20d6623a5b0b4a7fea8ae08a62da. Adds the following fixes compared to the original PR (https://github.com/llvm/llvm-project/pull/150783):
- A bazel fix
- Use `let methods` instead of `list<InterfaceMethod> methods`
The missing forward declaration has been added in the meantime: https://github.com/llvm/llvm-project/commit/9164d206b33d61c93f5fc4628797485f96d654ca.
3 days[mlir][tosa] Fix invalid data type combinations check (#150066)Luke Hutton3-5/+24
Previously this check assumed that if an operator exists in profile compliance data (TosaProfileComplianceData.h.inc), an entry exists in both the profiles and extensions sections. However, this is not necessarily the case. This commit changes the check such that it doesn't assume the above. In doing so, it allows more operators to be checked for invalid data type combinations, which were otherwise skipped previously.
3 days[MLIR] Missing declaration build fixMichael Kruse1-0/+1
The sanitizer bots are reporting a missing declaration:
```
In file included from /home/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/mlir/lib/Dialect/LLVMIR/Transforms/InlinerInterfaceImpl.cpp:17:
In file included from /home/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/mlir/include/mlir/Dialect/LLVMIR/NVVMDialect.h:26:
/home/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:318:34: error: no type named 'CallBase' in namespace 'llvm'
  318 |   llvm::CallBase *call,
      |         ~~~~~~^
1 error generated.
```
https://lab.llvm.org/buildbot/#/builders/94/builds/9340 https://lab.llvm.org/buildbot/#/builders/24/builds/11029 https://lab.llvm.org/buildbot/#/builders/169/builds/13454 https://lab.llvm.org/buildbot/#/builders/25/builds/10250 PR #151302 removed some indirect header #includes which now have to be included explicitly. I do not know why this particular error only occurs with the sanitizer buildbots. Fix by adding a forward declaration.
3 daysRevert "Reland "[mlir][llvm] Add intrinsic arg and result attribute support (…" (#151316)Mehdi Amini15-263/+210
Reverts llvm/llvm-project#151125. Broke the gcc-7 build:
include/mlir/Target/LLVMIR/ModuleTranslation.h:318:34: error: no type named 'CallBase' in namespace 'llvm'
llvm::CallBase *call,
~~~~~~^
3 days[mlir][docs] Use APIntParameter instead of APInt in AttributesAndTypes.md (#151315)Chaitanya Koparkar1-1/+1
Fixes #151314.
3 daysReland "[mlir][llvm] Add intrinsic arg and result attribute support (… (#151099)" (#151125)Tobias Gysi15-210/+263
This reverts commit 2780b8f22058b35a8e70045858b87a1966df8df3 to reland 59013d44058ef423a117f95092150e16e16fdb09. In addition to the original commit this one includes:
- A bazel fix
- Use `let methods` instead of `list<InterfaceMethod> methods`
The original commit message was: This patch extends the LLVM dialect's intrinsic infra to support argument and result attributes. Initial support is added for the memory intrinsics llvm.intr.memcpy, llvm.intr.memmove, and llvm.intr.memset. Additionally, an ArgAndResultAttrsOpInterface is factored out of CallOpInterface and CallableOpInterface, enabling operations to have argument and result attributes without requiring them to be a call or a callable operation.
3 days[MLIR] Migrate NVVM to the new LDBG debug macro (NFC) (#151162)Mehdi Amini2-32/+29
3 days[MLIR] Avoid #include OMPIRBuilder.h (#151302)Michael Kruse3-1/+4
`#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"` can be replaced with forward-declarations of `OpenMPIRBuilder` and `CanonicalLoopInfo`. This also avoids a dependency on `omp_gen` of the LLVMFrontendOpenMP component, which is included indirectly in `OMPIRBuilder.h`. Since its inclusion in #147069, additional indirect dependencies on headers included by `OMPIRBuilder.h` were introduced as well. These are now included directly. Reported-by: fabrizio-indirli See https://github.com/llvm/llvm-project/pull/147069#issuecomment-3114034973
4 days[MLIR][XeVM] Add XeVM to LLVMIR translation. (#150696)Sang Ik Lee7-0/+181
Add XeVM dialect to LLVMIR translation. Currently no ops are translated; only xevm.DecorationCacheControl is translated to metadata for the SPIR-V decoration !spirv.DecorationCacheControlINTEL. Co-authored-by: Artem Kroviakov artem.kroviakov@intel.com
4 days[flang][acc] Lower do and do concurrent loops specially in acc regions (#149614)Razvan Lupusoru2-0/+87
When OpenACC is enabled and Fortran loops are annotated with `acc loop`, they are lowered to the `acc.loop` operation, and the rest of the contained loops use the normal FIR lowering path. However, the OpenACC specification has special provisions related to contained loops and their induction variables. In order to adhere to this, we convert all valid contained loops to `acc.loop` in order to store this information appropriately. The provisions in the spec that motivated this change (line numbers are from OpenACC 3.4):
- 1353: Loop variables in Fortran do statements within a compute construct are predetermined to be private to the thread that executes the loop.
- 3783: When do concurrent appears without a loop construct in a kernels construct it is treated as if it is annotated with loop auto. If it appears in a parallel construct or an accelerator routine then it is treated as if it is annotated with loop independent.
By valid loops we mean do loops and do concurrent loops which have an induction variable. Loops which are unstructured are not handled.
4 days[mlir][linalg] Fix padding shape computation in PadTilingInterface for convs (#149576)Vivian Zhang5-26/+206
This PR fixes the computation of padded shapes for convolution-style affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the code used the direct sum of loop upper bounds, leading to over-padding. For example, for the following `conv_2d_nhwc_fhwc` op, if only padding the c dimensions to multiples of 16, it also incorrectly pads the convolved dimensions and generates the wrong input shape:
```mlir
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```
The new implementation uses the maximum accessed index as the input to the affine map and then adds 1 after aggregating all the terms to get the final padded size. This fixes https://github.com/llvm/llvm-project/issues/148679.