path: root/mlir
2023-12-08  [mlir][memref] extract_strided_metadata for zero-sized memref (#74835)
            Guray Ozen | 2 files | -1/+21

2023-12-08  [mlir] Add missing MLIR_ENABLE_EXECUTION_ENGINE option to MLIRConfig.cmake.in
            Mehdi Amini | 1 file | -0/+1
This is the kind of option that downstream consumers of preconfigured MLIR packages can check to see whether the execution engine is available.

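A minimal sketch of how a downstream project might consume this option (the target names are illustrative, not taken from the commit):

```cmake
find_package(MLIR REQUIRED CONFIG)

if(MLIR_ENABLE_EXECUTION_ENGINE)
  # Only link against the JIT when the preconfigured package provides it.
  target_link_libraries(my_tool PRIVATE MLIRExecutionEngine)
else()
  target_compile_definitions(my_tool PRIVATE MY_TOOL_NO_JIT)
endif()
```
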
2023-12-08  [mlir][linalg] Expose getPreservedProducerResults method from ElementwiseOpFusion file (#73850)
            Amir Bishara | 2 files | -11/+24
Declares the `getPreservedProducerResults` function, which returns the results of the producer linalg generic operation that are preserved after elementwise fusion.

2023-12-08  Fix argument name of GEPOp builder (#74810)
            xiaoleis-nv | 1 file | -2/+2
This MR fixes the argument name of the GEPOp builder from `basePtrType` to `elementType` to avoid confusion.
Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>

2023-12-07  Apply clang-tidy fixes for performance-unnecessary-value-param in VectorToGPU.cpp (NFC)
            Mehdi Amini | 1 file | -2/+3

2023-12-07  Apply clang-tidy fixes for llvm-qualified-auto in VectorToGPU.cpp (NFC)
            Mehdi Amini | 1 file | -1/+1

2023-12-07  Apply clang-tidy fixes for llvm-qualified-auto in PredicateTree.cpp (NFC)
            Mehdi Amini | 1 file | -1/+1

2023-12-07  Apply clang-tidy fixes for llvm-prefer-isa-or-dyn-cast-in-conditionals in MapMemRefStorageClassPass.cpp (NFC)
            Mehdi Amini | 1 file | -1/+1

2023-12-07  Apply clang-tidy fixes for readability-identifier-naming in GPUToLLVMConversion.cpp (NFC)
            Mehdi Amini | 1 file | -6/+6

2023-12-08  [mlir][Transforms][NFC] GreedyPatternRewriteDriver: Remove redundant worklist management code (#74796)
            Matthias Springer | 1 file | -3/+0
Do not add the previous users of replaced ops to the worklist during `notifyOperationReplaced`. The previous users are modified in place as part of `PatternRewriter::replaceOp`, which calls `PatternRewriter::replaceAllUsesWith`. The latter function updates all users with `updateRootInPlace`, which already puts all previous users of the replaced op on the worklist. No further worklist management is needed in the `notifyOperationReplaced` callback.

2023-12-07  [mlir][sparse] add sparse convolution with 5x5 kernel (#74793)
            Aart Bik | 5 files | -4/+220
Also unifies some of the test setup in the other convolution tests.

2023-12-07  [mlir][sparse] code formatting (NFC) (#74779)
            Aart Bik | 2 files | -15/+13

2023-12-07  [mlir] Extend CombineTransferReadOpTranspose pattern to handle extf ops (#74754)
            harsh-nod | 2 files | -2/+36
This patch modifies the CombineTransferReadOpTranspose pattern to handle extf ops. It also adds a test which shows the transpose getting folded into the transfer_read.

2023-12-07  [mlir][python] fix affine test
            max | 1 file | -3/+1

2023-12-07  [mlir][affine] implement inferType for delinearize (#74644)
            Maksim Levental | 4 files | -12/+25

2023-12-07  [mlir][sparse] optimize memory load to SSA value when generating sparse conv kernel (#74750)
            Peiming Liu | 3 files | -247/+208

2023-12-07  [mlir][python] fix up affine for (#74495)
            Maksim Levental | 3 files | -97/+202

2023-12-07  Reland "[MLIR][Transform] Add attribute in MatchOp to filter by operand type (#67994)"
            Pablo Antonio Martinez | 3 files | -1/+82
The test was failing due to a different transform sequence declaration (a plain transform sequence was used, while it should now be a named transform sequence). The test is now fixed.

2023-12-07  [mlir] Fix missing cmake dependency causing non-deterministic build failure (NFC)
            Mehdi Amini | 1 file | -1/+2
Fixes #74611.

2023-12-07  [mlir][LLVM] Add nsw and nuw flags (#74508)
            Tom Eccles | 9 files | -5/+170
The implementation of these is modeled after the existing fastmath flags for floating-point arithmetic.

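A sketch of what the new flags might look like in the LLVM dialect IR; the exact assembly spelling (e.g. the `overflow<...>` keyword) is an assumption here, not quoted from the patch:

```mlir
// %a and %b are placeholder i64 values.
%0 = llvm.add %a, %b overflow<nsw> : i64        // no signed wrap
%1 = llvm.mul %a, %b overflow<nsw, nuw> : i64   // no signed or unsigned wrap
```
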
2023-12-07  Revert "[MLIR][Transform] Add attribute in MatchOp to filter by operand type (#67994)"
            Mikhail Goncharov | 3 files | -84/+1
This reverts commit c4399130ae403acf4e6325b8b46a51bb6abf222f. The test fails: https://lab.llvm.org/buildbot/#/builders/272/builds/2757

2023-12-07  [mlir][doc] Fix reported Builtin (syntax) issues (#74635)
            Rik Huijzer | 3 files | -15/+53
Fixes https://github.com/llvm/llvm-project/issues/62489. Some notes for each number:
- 1: `bool-literal` should be reasonably clear from context.
- 2: Fixed.
- 3: This is now fixed. `loc(fused[])` is valid, but `loc(fused["foo",])` is not.
- 4: This operation uses `assemblyFormat`, so the syntax is correct (assuming the ODS is correct).
- 5: This operation uses `assemblyFormat`, so the syntax is correct (assuming the ODS is correct).
- 6: Added an example.
- 7: The suggested fix is in line with other `assemblyFormat` examples.
- 8: Added syntax and an example.
- 9: I don't know what this is referring to.
- 10: Added an example.
- 11 and 12: The suggestion seems wrong, as the `ShapedTypeInterface` can be extended by clients, so it is not limited to tensors or vectors.
- 13: Already reasonably clear with the example, I think.
- 14: Already reasonably clear with the example, I think.
- 15: Added an example from the `opaque_locations.mlir` tests.
- 16: The answer to this seems to change over time and depend on the use case. Suggestions by reviewers are welcome.

2023-12-07  [MLIR][Transform] Add attribute in MatchOp to filter by operand type (#67994)
            Pablo Antonio Martinez | 3 files | -1/+84
This patch adds the `filter_operand_types` attribute to transform::MatchOp, allowing ops to be matched depending on their operand types.

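A sketch of how the new attribute might be used with `transform.structured.match`; the exact placement of `filter_operand_types` in the assembly format is an assumption, not taken from the patch:

```mlir
// Match only linalg.generic ops whose operand types pass the filter.
%matched = transform.structured.match ops{["linalg.generic"]}
    filter_operand_types = [f32] in %module
    : (!transform.any_op) -> !transform.any_op
```
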
2023-12-07  [mlir][arith] Overflow semantics in documentation for muli, subi, and addi (#74346)
            Jacob Yu | 1 file | -1/+31
Following discussions from this RFC: https://discourse.llvm.org/t/rfc-integer-overflow-semantics. Adds the overflow semantics to the documentation of the muli, subi, and addi arith operations.

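The documented semantics are two's-complement wraparound. A small illustrative sketch (not taken from the patch itself):

```mlir
// i8 arithmetic wraps modulo 2^8: 127 + 1 yields the bit pattern 0x80,
// i.e. -128 when interpreted as a signed value.
%c127 = arith.constant 127 : i8
%c1   = arith.constant 1 : i8
%sum  = arith.addi %c127, %c1 : i8
```
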
2023-12-07  [mlir][SparseTensor] Fix invalid API usage in patterns (#74690)
            Matthias Springer | 2 files | -11/+20
Rewrite patterns must return `success` if the IR was modified. This commit fixes sparse tensor tests such as `SparseTensor/sparse_fusion.mlir`, `SparseTensor/CPU/sparse_reduce_custom.mlir`, and `SparseTensor/CPU/sparse_semiring_select.mlir` when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.

2023-12-07  [mlir][complex] Allow integer element types in `complex.constant` ops (#74564)
            Matthias Springer | 3 files | -5/+10
The op used to support only float element types. This was inconsistent with `ConstantOp::isBuildableWith`, which allows integer element types. The complex type allows any float/integer element type. Note: The other complex dialect ops do not support non-float element types yet. The main purpose of this change is to fix `Tensor/canonicalize.mlir`, which is currently failing when verifying the IR after each pattern application (#74270).
```
within split at mlir/test/Dialect/Tensor/canonicalize.mlir:231 offset :8:15: error: 'complex.constant' op result #0 must be complex type with floating-point elements, but got 'complex<i32>'
  %complex1 = tensor.extract %c1[] : tensor<complex<i32>>
              ^
within split at mlir/test/Dialect/Tensor/canonicalize.mlir:231 offset :8:15: note: see current operation: %0 = "complex.constant"() <{value = [1 : i32, 2 : i32]}> : () -> complex<i32>
"func.func"() <{function_type = () -> tensor<3xcomplex<i32>>, sym_name = "extract_from_elements_complex_i"}> ({
  %0 = "complex.constant"() <{value = [1 : i32, 2 : i32]}> : () -> complex<i32>
  %1 = "arith.constant"() <{value = dense<(3,2)> : tensor<complex<i32>>}> : () -> tensor<complex<i32>>
  %2 = "arith.constant"() <{value = dense<(1,2)> : tensor<complex<i32>>}> : () -> tensor<complex<i32>>
  %3 = "tensor.extract"(%1) : (tensor<complex<i32>>) -> complex<i32>
  %4 = "tensor.from_elements"(%0, %3, %0) : (complex<i32>, complex<i32>, complex<i32>) -> tensor<3xcomplex<i32>>
  "func.return"(%4) : (tensor<3xcomplex<i32>>) -> ()
}) : () -> ()
```

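A minimal sketch of what now verifies after this change (illustrative, not copied from the patch's tests):

```mlir
// Integer element types are now accepted by the verifier.
%ci = complex.constant [1 : i32, 2 : i32] : complex<i32>
// Float element types keep working as before.
%cf = complex.constant [1.0 : f32, 2.0 : f32] : complex<f32>
```
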
2023-12-07  [mlir] Fix build after 77f5b33c
            Matthias Springer | 2 files | -1/+3

2023-12-06  [mlir][sparse] fix bugs when generating sparse conv_3d kernels (#74561)
            Peiming Liu | 2 files | -15/+66

2023-12-07  [mlir][SparseTensor] Fix invalid IR in `ForallRewriter` pattern (#74547)
            Matthias Springer | 1 file | -0/+4
The `ForallRewriter` pattern used to generate invalid IR:
```
mlir/test/Dialect/SparseTensor/GPU/gpu_combi.mlir:0:0: error: 'scf.for' op expects region #0 to have 0 or 1 blocks
mlir/test/Dialect/SparseTensor/GPU/gpu_combi.mlir:0:0: note: see current operation:
"scf.for"(%8, %2, %9) ({
^bb0(%arg5: index):
  // ...
  "scf.yield"() : () -> ()
^bb1(%10: index):  // no predecessors
  "scf.yield"() : () -> ()
}) : (index, index, index) -> ()
```
This commit fixes tests such as `mlir/test/Dialect/SparseTensor/GPU/gpu_combi.mlir` when verifying the IR after each pattern application (#74270).

2023-12-07  [mlir][SparseTensor] Fix insertion point in `createQuickSort` (#74549)
            Matthias Springer | 1 file | -1/+1
`createQuickSort` used to generate invalid IR:
```
"func.func"() <{function_type = (index, index, memref<?xindex>, memref<?xf32>, memref<?xi32>) -> (), sym_name = "_sparse_qsort_0_1_index_coo_1_f32_i32", sym_visibility = "private"}> ({
^bb0(%arg0: index, %arg1: index, %arg2: memref<?xindex>, %arg3: memref<?xf32>, %arg4: memref<?xi32>):
  %0:2 = "scf.while"(%arg0, %arg1) ({
  ^bb0(%arg5: index, %arg6: index):
    // ...
    "scf.condition"(%3, %arg5, %arg6) : (i1, index, index) -> ()
  }, {
  ^bb0(%arg5: index, %arg6: index):
    // ...
    %7:2 = "scf.if"(%6) ({
      %8 = "arith.cmpi"(%2, %3) <{predicate = 7 : i64}> : (index, index) -> i1
      // ...
      "scf.yield"(%9#0, %9#1) : (index, index) -> ()
      %10 = "arith.constant"() <{value = 0 : index}> : () -> index
    }, {
      "scf.yield"(%arg5, %arg5) : (index, index) -> ()
    }) : (i1) -> (index, index)
    "scf.yield"(%7#0, %7#1) : (index, index) -> ()
  }) : (index, index) -> (index, index)
  "func.return"() : () -> ()
}) : () -> ()

within split at mlir/test/Dialect/SparseTensor/buffer_rewriting.mlir:76 offset :11:1: error: 'scf.yield' op must be the last operation in the parent block
```
This commit fixes tests such as `mlir/test/Dialect/SparseTensor/buffer_rewriting.mlir` when verifying the IR after each pattern application (#74270).

2023-12-07  [mlir][tensor] `tensor.generate`: do not verify dynamic sizes (#74568)
            Matthias Springer | 4 files | -108/+70
Op verifiers should verify only local properties of an op. The dynamic sizes of a `tensor.generate` op should not be verified: dynamic sizes that have a negative constant value should not prevent the op from verifying. Also share some code between the `tensor.empty` and `tensor.generate` "dynamic dim -> static dim" canonicalization patterns. Remove the `invalid-canonicalize.mlir` file and move the test case to `canonicalize.mlir`; canonicalization no longer produces IR that does not verify (it leaves the op as is).

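For reference, a sketch of a `tensor.generate` op with one dynamic size; under this change only local properties are verified, regardless of what `%size` resolves to (the example is assumed, not from the patch):

```mlir
// %size is a runtime index supplying the single dynamic dimension.
%t = tensor.generate %size {
^bb0(%i: index, %j: index):
  %zero = arith.constant 0.0 : f32
  tensor.yield %zero : f32
} : tensor<?x3xf32>
```
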
2023-12-07  [mlir][SCF] Retire SCF-specific `to_memref`/`to_tensor` canonicalization patterns (#74551)
            Matthias Springer | 3 files | -181/+4
The partial bufferization framework has been replaced with One-Shot Bufferize. SCF-specific canonicalization patterns for `to_memref`/`to_tensor` are no longer needed.

2023-12-06  [mlir][spirv] Add folding for SPIR-V Shifting ops (#74192)
            Finn Plummer | 3 files | -0/+286
Add the missing constant propagation folder for LeftShiftLogical and RightShift[Logical|Arithmetic]. Implement additional folding when the shift value is 0. This helps the readability of code lowered into SPIR-V. Part of the work for #70704.

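A sketch of the zero-shift fold (illustrative; the trailing types in the SPIR-V dialect's shift ops are the operand type and the shift-amount type):

```mlir
// Shifting by zero is a no-op, so %res can fold directly to %x.
%c0  = spirv.Constant 0 : i32
%res = spirv.ShiftLeftLogical %x, %c0 : i32, i32
```
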
2023-12-06  [mlir][openacc] Switch * to `star` which is more MLIR friendly (#74662)
            Valentin Clement (バレンタイン クレメン) | 2 files | -3/+3
`*` is not friendly to the MLIR attribute parser and will fail to be parsed. Switch the `*` enum representation to `star`.

2023-12-06  [mlir][sparse] use "current" and "curr" consistently (#74656)
            Aart Bik | 2 files | -139/+138
Removes `at` in favor of `curr`; also makes the method delegates consistent.

2023-12-06  [mlir][ArmSME] Update docs (#74527)
            Andrzej Warzyński | 1 file | -6/+22

2023-12-06  [mlir][linalg][conv] Flatten the channel dimension when vectorizing (#71918)
            Andrzej Warzyński | 5 files | -29/+388
The current vectorization of 1D depthwise convolutions in Linalg is sub-optimal for tensors with a low number of channels, e.g.:
```mlir
linalg.depthwise_conv_1d_nwc_wc
  {dilations = dense<1> : vector<1xi64>, strides = dense<1> : vector<1xi64>}
  ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
  outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```
That's due to the fact that ultimately (i.e. at the LLVM level), vectorization happens along the trailing dimension (i.e. the channel dimension). In this case it leads to vectors with 3 elements (or worse, if there's e.g. only 1 channel). For comparison, a 128-bit-wide vector register can hold 16 x i8. Instead, this patch adds an option to flatten/collapse the channel dimension into the width dimension of the input/filter/output using the `vector.shape_cast` operation:
```mlir
%sc_input  = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
%sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
%b_filter  = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
%sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```
This new vectorization mode is implemented in `depthwiseConv` by inserting `vector.shape_cast` ops before and after `depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected through e.g. a transform dialect attribute:
```mlir
transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```
A forthcoming patch will implement a strategy to automatically switch between the two implementations, depending on the shape of the input tensors.
Co-authored-by: Bradley Smith <bradley.smith@arm.com>

2023-12-06  [mlir][sparse] cleanup ldx/idx/depth/at usage (#74654)
            Aart Bik | 1 file | -70/+69
This adds consistent usage of `at` for everything that refers to the current loop nesting. It also cleans up some redundant legacy code from when we were still using topSort inside the sparsifier code.

2023-12-06  [mlir] Fix TileUsingForOp attr-dict printing/parsing (#73261)
            Felix Schneider | 3 files | -15/+28
`TileUsingForOp` has an optional `interchange` attribute, which was printed in curly braces like this: `{interchange = [...]}`. The way this was parsed meant that no `attr-dict` could be attached to the op. This patch adds printing/parsing of an `attr-dict` to the op and prints/parses the `interchange` attribute separately from the discardable attributes.

2023-12-06  [mlir:python] Fail immediately if importing an initializer module raises ImportError (#74595)
            Peter Hawkins | 1 file | -0/+1

2023-12-06  [mlir][sparse] remove LoopOrd type (#74540)
            Aart Bik | 4 files | -52/+19
Rationale: we no longer deal with topsort during sparsification, so LoopId == LoopOrd for all methods. This first revision removes the types. A follow-up revision will simplify some other remaining constructs that deal with loop order (e.g. `at` and `ldx`).

2023-12-06  [mlir] Fix shift overflow and warning on LLP64 platforms (Windows) (#74002)
            Reid Kleckner | 1 file | -3/+3

2023-12-06  [mlir][ArmSME] More precisely model dataflow in ArmSME to SCF lowerings (#73922)
            Benjamin Maxwell | 5 files | -103/+135
Since #73253, loops over tiles in SSA form (i.e. loops that take `iter_args` and yield a new tile) are supported, so this patch updates the ArmSME lowerings to this form. This is an NFC, as it still lowers to the same intrinsics, but it makes the IR less 'surprising' at a higher level, and it may be recognised by more transforms.
IR before:
```mlir
scf.for %tile_slice_index = %c0 to %num_tile_slices step %c1 {
  arm_sme.move_vector_to_tile_slice %broadcast_to_1d, %tile, %tile_slice_index
    : vector<[4]xi32> into vector<[4]x[4]xi32>
}
// ... later use %tile
```
IR now:
```mlir
%broadcast_to_tile = scf.for %tile_slice_index = %c0 to %num_tile_slices
    step %c1 iter_args(%iter_tile = %init_tile) -> (vector<[4]x[4]xi32>) {
  %tile_update = arm_sme.move_vector_to_tile_slice
    %broadcast_to_1d, %iter_tile, %tile_slice_index
    : vector<[4]xi32> into vector<[4]x[4]xi32>
  scf.yield %tile_update : vector<[4]x[4]xi32>
}
// ... later use %broadcast_to_tile
```

2023-12-06  [mlir][tosa] Add fp16 support to `tosa.resize` (#73019)
            Georgios Pinitas | 2 files | -17/+36

2023-12-06  [mlir][nvvm] Introduce `fence.mbarrier.init` (#74058)
            Guray Ozen | 2 files | -0/+25
This PR introduces the `fence.mbarrier.init` op.

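A sketch of how the new op might appear in IR; the exact spelling follows the usual NVVM dialect naming and is assumed here rather than quoted from the patch:

```mlir
// Fence ordering the preceding mbarrier initialization
// against subsequent operations.
nvvm.fence.mbarrier.init
```
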
2023-12-06  [mlir][flang] add fast math attribute to fcmp (#74315)
            Tom Eccles | 4 files | -4/+26
`llvm.fcmp` does support fast math attributes, therefore so should `arith.cmpf`. The heavy churn in the flang tests is because flang sets `fastmath<contract>` by default on all operations that support the fast math interface. Downstream users of MLIR should not be as affected. This was requested in https://github.com/llvm/llvm-project/issues/74263.

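A sketch of what the attribute might look like on `arith.cmpf`; the printed position of `fastmath<...>` is assumed to match the other arith ops:

```mlir
// %a and %b are placeholder f32 values.
%cmp = arith.cmpf olt, %a, %b fastmath<fast> : f32
```
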
2023-12-06  Revert "[mlir][Vector] Add fold transpose(shape_cast) -> shape_cast (#73951)" (#74579)
            Adam Paszke | 2 files | -58/+1
This reverts commit f42b7615b862bb5f77981f619f92877eb20adf54. The fold pattern is incorrect, because it does not even look at the permutation of non-unit dims and is happy to replace a pattern such as
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
%23 = vector.transpose %22, [1, 0] : vector<256x256xf32> to vector<256x256xf32>
```
with
```
%22 = vector.shape_cast %21 : vector<1x256x256xf32> to vector<256x256xf32>
```
which is obviously incorrect.

2023-12-06  [mlir][gpu] Support dynamic_shared_memory Op with vector dialect (#74475)
            Guray Ozen | 2 files | -0/+22
`gpu.dynamic_shared_memory` currently does not get lowered when it is used with the vector dialect. The reason is that the vector-to-llvm conversion is not included in gpu-to-nvvm. This PR includes it and adds a test.

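A sketch of the kind of mixed gpu/vector usage this enables (the view shape and vector width here are illustrative assumptions, not taken from the added test):

```mlir
%c0 = arith.constant 0 : index
// Dynamic shared memory as a 1-D byte buffer in workgroup address space.
%shmem = gpu.dynamic_shared_memory
    : memref<?xi8, #gpu.address_space<workgroup>>
// Reinterpret a prefix of it as f32 and load a vector from it.
%view = memref.view %shmem[%c0][]
    : memref<?xi8, #gpu.address_space<workgroup>>
    to memref<32xf32, #gpu.address_space<workgroup>>
%v = vector.load %view[%c0]
    : memref<32xf32, #gpu.address_space<workgroup>>, vector<4xf32>
```
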
2023-12-06  [MLIR][LLVM] Fuse Scope into CallsiteLoc Callee (#74546)
            Billy Zhu | 4 files | -17/+24
There's an issue in the translator today where, for a CallsiteLoc, if the callee does not have a DI scope (perhaps due to compile options or optimizations), it may get propagated the DI scope of its callsite's parent function, which will create a non-existent DILocation combining the line and column number from one file with the filename of another. The root problem is that we cannot propagate the parent scope when translating the callee location, as it no longer applies to inlined locations (see the code diff, and hopefully this will make sense). To facilitate this, the importer is also changed so that callee scopes are fused with the callee FileLineCol loc instead of with the Callsite loc itself. This comes with the benefit that we now have a symmetric Callsite loc representation. If we required the callee scope to always be annotated on the Callsite loc, it would be hard for generic inlining passes to maintain that, since they would have to somehow understand the semantics of the fused metadata and pull it out while inlining.

2023-12-06  [mlir][transform] TrackingListener: Improve dead handles detection (#74290)
            Matthias Springer | 3 files | -55/+112
The tracking listener should not report op replacement errors for payload ops that are not mapped to any live handles. The handle liveness analysis did not work properly with transform IR that has named sequences. A handle is live if it has a user after the transform op that is currently being applied. With named sequences, we need to maintain a stack of currently applied transform ops. That stack already exists (`regionStack`); the only thing that was missing is the current transform op for each stack frame. This commit fixes #72931.