path: root/mlir
Age  Commit message  Author  Files  Lines
2024-02-24[mlir][linalg] NFC: Use tablegen macro for pass constructors (#82892)Quinn Dawkins12-158/+89
This uses the tablegen macros for generating pass constructors, exposing pass options for fold-unit-extent-dims and linalg-detensorize. Additionally aligns some of the pass namings to their text counterpart. This includes an API change: createLinalgGeneralizationPass -> createLinalgGeneralizeNamedOpsPass
2024-02-24[mlir] Use `OpBuilder::createBlock` in op builders and patterns (#82770)Matthias Springer20-88/+71
When creating a new block in (conversion) rewrite patterns, `OpBuilder::createBlock` must be used. Otherwise, no `notifyBlockInserted` notification is sent to the listener. Note: The dialect conversion relies on listener notifications to keep track of IR modifications. Creating blocks without the builder API can lead to memory leaks during rollback.
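The notification argument can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyBlock`, `ToyRegion`, `ToyListener`, `ToyBuilder`, and `createBlockWithoutBuilder` are hypothetical stand-ins, not MLIR classes, and the real `OpBuilder` and listener interfaces are richer. It shows why a block created behind the builder's back is invisible to whatever tracks IR modifications.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; none of these are real MLIR types.
struct ToyBlock {};

struct ToyRegion {
  std::vector<std::unique_ptr<ToyBlock>> blocks;
};

// Records every block it is notified about, loosely mimicking a driver that
// tracks IR modifications through listener callbacks.
struct ToyListener {
  std::vector<ToyBlock *> trackedBlocks;
  void notifyBlockInserted(ToyBlock *block) { trackedBlocks.push_back(block); }
};

class ToyBuilder {
public:
  explicit ToyBuilder(ToyListener &listener) : listener(listener) {}

  // Creates the block and notifies the listener.
  ToyBlock *createBlock(ToyRegion &region) {
    region.blocks.push_back(std::make_unique<ToyBlock>());
    ToyBlock *block = region.blocks.back().get();
    assert(block != nullptr);
    listener.notifyBlockInserted(block);
    return block;
  }

private:
  ToyListener &listener;
};

// Bypasses the builder: the block exists, but the listener never hears of it,
// so a tracker built on the listener could not undo or clean up this block.
ToyBlock *createBlockWithoutBuilder(ToyRegion &region) {
  region.blocks.push_back(std::make_unique<ToyBlock>());
  return region.blocks.back().get();
}
```

In this model, one call to `ToyBuilder::createBlock` followed by one call to `createBlockWithoutBuilder` leaves two blocks in the region but only one in `trackedBlocks`.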
2024-02-23[OpenMP][MLIR][OMPIRBuilder] Add a small optional constant alloca raise ↵agozillon1-0/+43
function pass to finalize, utilised in convertTarget (#78818) This patch seeks to add a mechanism to raise constant (not ConstantExpr or runtime/dynamic) sized allocations into the entry block for select functions that have been inserted into a list for processing. This processing occurs during the finalize call, after OutlinedInfo regions have completed. This currently has only been utilised for createOutlinedFunction, which is triggered for TargetOp generation in the OpenMP MLIR dialect lowering to LLVM-IR. This currently is required for Target kernels generated by createOutlinedFunction to avoid subsequent optimization passes doing some unintentional malformed optimizations for AMD kernels (unsure if it occurs for other vendors). If the allocas are generated inside of the kernel and are not in the entry block and are subsequently passed to a function, this can lead to required instructions being erased or manipulated in a way that causes the kernel to run into an HSA access error. This fix is related to a series of problems found in: https://github.com/llvm/llvm-project/issues/74603 This problem primarily presents itself for Flang's HLFIR AssignOp currently, when utilised with a scalar temporary constant on the RHS and a descriptor type on the LHS. It will generate a call to a runtime function, wrap the RHS temporary in a newly allocated descriptor (an llvm struct), and pass both the LHS and RHS descriptors into the runtime function call. This will currently be embedded into the middle of the target region in the user entry block, which means the allocas are also embedded in the middle, which seems to pose issues when later passes are executed. This issue may present itself in other HLFIR operations or unrelated operations that generate allocas as a by-product, but for the moment, this one test case is the only scenario in which I've found this problem.
Perhaps this is not the appropriate fix, I am very open to other suggestions, I've tried a few others (at varying levels of the flang/mlir compiler flow), but this one is the smallest and least intrusive change set. The other two, that come to mind (but I've not fully looked into, the former I tried a little with blocks but it had a few issues I'd need to think through): - Having a proper alloca only block (or region) generated for TargetOps that we could merge into the entry block that's generated by convertTarget's createOutlinedFunction. - Or diverging a little from Clang's current target generation and using the CodeExtractor to generate the user code as an outlined function region invoked from the kernel we make, with our kernel arguments passed into it. Similar to the current parallel generation. I am not sure how well this would intermingle with the existing parallel generation though that's layered in. Both of these methods seem like quite a divergence from the current status quo, which I am not entirely sure is merited for the small test this change aims to fix.
2024-02-23[mlir][sparse] remove very thin header file from sparse runtime support (#82820)Aart Bik6-81/+70
2024-02-23[mlir][sparse] cleanup sparse runtime library (#82807)Aart Bik5-121/+12
remove some obsoleted APIs from the library that have been fully replaced with actual direct IR codegen
2024-02-23[mlir][ArmSME] Follow MLIR constant style in VectorLegalization.cpp (NFC)Benjamin Maxwell1-14/+14
2024-02-23[mlir][Transforms] Fix crash in dialect conversion (#82783)Matthias Springer1-5/+5
This is a follow-up to #82333. It is possible that the target block of a `BlockTypeConversionRewrite` is detached, so the `MLIRContext` cannot be taken from the block.
2024-02-23[mlir][linalg] `LinalgOp`: Disallow mixed tensor/buffer semantics (#80660)Matthias Springer4-81/+29
Related discussion: https://github.com/llvm/llvm-project/pull/73908/files#r1414913030. This change fixes #73547.
2024-02-23[mlir] Fix memory leaks after #81759 (#82762)Matthias Springer2-12/+14
This commit fixes memory leaks that were introduced by #81759. The way ops and blocks are erased changed slightly. The leaks were caused by an incorrect implementation of op builders: blocks must be created with the supplied builder object. Otherwise, they are not properly tracked by the dialect conversion and can leak during rollback.
2024-02-23Users/tsitdikov (#82757)tsitdikov1-0/+2
Fix Test ARM SME library and build rule.
2024-02-23[MLIR] Expose approximation patterns for tanh/erf. (#82750)Johannes Reifferscheid2-0/+13
These patterns can already be used via populateMathPolynomialApproximationPatterns, but that includes a number of other patterns that may not be needed. There are already similar functions for expansion. For now only adding tanh and erf since I have a concrete use case for these two.
2024-02-23[mlir][NFC] Fix format specifier warning on WindowsMarkus Böck1-1/+2
`%ld` specifier is defined to work on values of type `long`. The parameter given to `fprintf` is of type `intptr_t` whose actual underlying integer type is unspecified. On Unix systems it happens to commonly be `long` but on 64-bit Windows it is defined as `long long`. The cross-platform way to print an `intptr_t` is to use `PRIdPTR` which expands to the correct format specifier for `intptr_t`. This avoids any undefined behaviour and compiler warnings.
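A minimal sketch of the portable form follows. The helper name `formatIntPtr` is made up for illustration and is not taken from the patch; `PRIdPTR` itself comes from the standard `<cinttypes>` header.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats an intptr_t without assuming it is `long`. PRIdPTR expands to the
// conversion specifier that matches intptr_t on the current platform, and
// adjacent string literals are concatenated into one format string.
std::string formatIntPtr(std::intptr_t value) {
  char buffer[32]; // enough for a 64-bit value, its sign, and the terminator
  int written = std::snprintf(buffer, sizeof(buffer), "%" PRIdPTR, value);
  assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
  (void)written; // silence unused-variable warnings when asserts are disabled
  return std::string(buffer);
}
```

The same `"%" PRIdPTR` format string works unchanged with `fprintf`.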
2024-02-23[mlir][Transforms][NFC] Decouple `ConversionPatternRewriterImpl` from ↵Matthias Springer1-23/+21
`ConversionPatternRewriter` (#82333) `ConversionPatternRewriterImpl` no longer maintains a reference to the respective `ConversionPatternRewriter`. An `MLIRContext` is sufficient. This commit simplifies the internal state of `ConversionPatternRewriterImpl`.
2024-02-23[mlir][Transforms] Encapsulate dialect conversion options in ↵Matthias Springer3-105/+118
`ConversionConfig` (#82250) This commit adds a new `ConversionConfig` struct that allows users to customize the dialect conversion. This configuration is similar to `GreedyRewriteConfig` for the greedy pattern rewrite driver. A few existing options are moved to this object, simplifying the dialect conversion API.
2024-02-23[mlir][math] Propagate scalability in `convert-math-to-llvm` (#82635)Benjamin Maxwell2-9/+90
This also generally increases the coverage of scalable vector types in the math-to-llvm tests.
2024-02-23[mlir][ArmSME] Add test-lower-to-arm-sme pipeline (#81732)Cullen Rhodes23-103/+141
The ArmSME compilation pipeline has evolved significantly and is now sufficiently complex that it warrants a proper lowering pipeline that encapsulates the various passes and orderings. Currently the pipeline is loosely defined in our integration tests, but these have diverged and are not using the same passes or ordering everywhere. This patch introduces a test-lower-to-arm-sme pipeline mirroring test-lower-to-llvm that provides some sanity when running e2e examples and can be used as a reference for targeting ArmSME in MLIR. All the integration tests are updated to use this pipeline. The intention is to productize the pipeline once it becomes more mature.
2024-02-23[mlir][Transforms] Make `ConversionPatternRewriter` constructor private (#82244)Matthias Springer2-8/+20
`ConversionPatternRewriter` objects should not be constructed outside of dialect conversions. Some IR modifications performed through a `ConversionPatternRewriter` are reflected in the IR in a delayed fashion (e.g., only when the dialect conversion is guaranteed to succeed). Using a `ConversionPatternRewriter` outside of the dialect conversion is incorrect API usage and can leave the IR in an inconsistent state. Migration guide: Use `IRRewriter` instead of `ConversionPatternRewriter`.
2024-02-23[MLIR][LLVM] Fix debug intrinsic import (#82637)Tobias Gysi4-17/+73
This revision handles the case that the translation of a scope fails due to cyclic metadata. This mainly affects the import of debug intrinsics that indirectly take such a scope as metadata argument (e.g. via local variable or label metadata). This commit ensures we drop intrinsics with such a dependency on cyclic metadata.
2024-02-23[mlir][Transforms][NFC] Turn unresolved materializations into `IRRewrite`s ↵Matthias Springer1-193/+176
(#81761) This commit is a refactoring of the dialect conversion. The dialect conversion maintains a list of "IR rewrites" that can be committed (upon success) or rolled back (upon failure). This commit turns the creation of unresolved materializations (`unrealized_conversion_cast`) into `IRRewrite` objects. After this commit, all steps in `applyRewrites` and `discardRewrites` are calls to `IRRewrite::commit` and `IRRewrite::rollback`.
2024-02-23[mlir][Transforms][NFC] Turn op creation into `IRRewrite` (#81759)Matthias Springer1-38/+64
This commit is a refactoring of the dialect conversion. The dialect conversion maintains a list of "IR rewrites" that can be committed (upon success) or rolled back (upon failure). Until now, the dialect conversion kept track of "op creation" in separate internal data structures. This commit turns "op creation" into an `IRRewrite` that can be committed and rolled back just like any other rewrite. This commit simplifies the internal state of the dialect conversion.
2024-02-23[mlir][Transforms][NFC] Turn op/block arg replacements into `IRRewrite`s ↵Matthias Springer1-140/+157
(#81757) This commit is a refactoring of the dialect conversion. The dialect conversion maintains a list of "IR rewrites" that can be committed (upon success) or rolled back (upon failure). Until now, op replacements and block argument replacements were tracked in separate data structures inside the dialect conversion. This commit turns them into `IRRewrite`s, so that they can be committed or rolled back just like any other rewrite. This simplifies the internal state of the dialect conversion. Overview of changes: * Add two new rewrite classes: `ReplaceBlockArgRewrite` and `ReplaceOperationRewrite`. Remove the `OpReplacement` helper class; it is now part of `ReplaceOperationRewrite`. * Simplify `RewriterState`: `numReplacements` and `numArgReplacements` are no longer needed. (They are now tracked by `numRewrites`.) * Add `IRRewrite::cleanup`. Operations should not be erased in `commit` because they may still be referenced in other internal state of the dialect conversion (`mapping`). Detaching operations is fine. * `trackedOps` are now updated during the "commit" phase instead of after applying all rewrites.
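The "list of IR rewrites" idea that runs through this series can be sketched with a toy model. Everything below is hypothetical: `ToyRewrite`, `CreateOpRewrite`, and `ToyConversion` are invented names, the "IR" is just a list of strings, and the real `IRRewrite` classes carry far more state. The sketch only shows the shape of the mechanism: each modification is recorded as an object, a checkpoint is simply the length of the rewrite list, and a rollback undoes the entries in reverse order.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: one recorded modification of the toy IR.
class ToyRewrite {
public:
  virtual ~ToyRewrite() = default;
  virtual void commit() {}     // finalize on success (nothing to do here)
  virtual void rollback() = 0; // undo on failure
};

// Records "an op was appended to the toy IR" so that it can be undone.
class CreateOpRewrite : public ToyRewrite {
public:
  CreateOpRewrite(std::vector<std::string> &targetIr, std::string name)
      : ir(targetIr) {
    ir.push_back(std::move(name));
  }
  // Valid only when rewrites are rolled back newest-first.
  void rollback() override {
    assert(!ir.empty());
    ir.pop_back();
  }

private:
  std::vector<std::string> &ir;
};

class ToyConversion {
public:
  std::vector<std::string> ir; // the toy IR: a flat list of op names

  void createOp(std::string name) {
    rewrites.push_back(std::make_unique<CreateOpRewrite>(ir, std::move(name)));
  }

  // A checkpoint is the number of rewrites recorded so far.
  std::size_t getState() const { return rewrites.size(); }

  // Roll back everything recorded after the checkpoint, newest first.
  void resetState(std::size_t checkpoint) {
    while (rewrites.size() > checkpoint) {
      rewrites.back()->rollback();
      rewrites.pop_back();
    }
  }

  // On success, commit every rewrite in order and forget the list.
  void applyRewrites() {
    for (auto &rewrite : rewrites)
      rewrite->commit();
    rewrites.clear();
  }

private:
  std::vector<std::unique_ptr<ToyRewrite>> rewrites;
};
```

Keeping a single list means a checkpoint needs only one counter, which mirrors the simplification of `RewriterState` described above.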
2024-02-22[mlir] Fix FunctionOpInterface extraSharedClassDeclaration to be fully ↵shkoo1-113/+113
namespace qualified (#82682) `extraSharedClassDeclaration` of `FunctionOpInterface` can be inherited by other `OpInterfaces` into foreign namespaces, thus types must be fully qualified to prevent compiler errors, for example: def MyFunc : OpInterface<"MyFunc", [FunctionOpInterface]> { let cppNamespace = "::MyNamespace"; }
2024-02-22[mlir][Vector] Add missing CHECK rules to vector-transfer-flatten.mlir (#82698)Diego Caballero1-0/+2
This test failed after landing #81964 due to a bad merge. I provided a quick fix and this PR is adding the rest of CHECK rules that were not merged properly.
2024-02-22[Tosa] Add Tosa Sin and Cos operators (#82510)Jerry-Ge4-0/+58
- Add Tosa Sin and Cos operators to the MLIR dialect - Define the new Tosa_FloatTensor type --------- Signed-off-by: Jerry Ge <jerry.ge@arm.com>
2024-02-22[MLIR] Fix LLVM dialect specification to use AnySignlessInteger instead of ↵Mehdi Amini1-23/+23
AnyInteger (#82694) LLVM IR does not support signed integers; the LLVM dialect was underspecified (likely unintentionally) and the AnyInteger constraint was overly lax. The arithmetic dialect is already consistently using AnySignlessInteger.
2024-02-22Fix test/Dialect/Vector/vector-transfer-flatten.mlirDiego Caballero1-0/+4
2024-02-22[mlir][Vector] Fix bug in vector xfer op flattening transformation (#81964)Diego Caballero4-22/+65
It looks like the affine map generated to compute the indices of the collapsed dimensions used the wrong dim size. For indices `[idx0][idx1]` we computed the collapsed index as `idx0*size0 + idx1` instead of `idx0*size1 + idx1`. This led to correctness issues in convolution tests when enabling this transformation internally.
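The difference between the two formulas is easy to check numerically. The sketch below is a standalone illustration, not the affine map from the patch; the function names are invented. For a `size0 x size1` shape in row-major order, stepping `idx0` by one must skip a whole row of `size1` elements.

```cpp
#include <cassert>
#include <cstdint>

// Row-major linearization of [idx0][idx1] within a size0 x size1 shape.
std::int64_t collapseIndex(std::int64_t idx0, std::int64_t idx1,
                           std::int64_t size0, std::int64_t size1) {
  assert(idx0 >= 0 && idx0 < size0 && idx1 >= 0 && idx1 < size1);
  (void)size0; // only needed for the bounds check
  return idx0 * size1 + idx1;
}

// The incorrect form described above, kept only for comparison: it scales
// idx0 by the size of its own dimension instead of the inner dimension.
std::int64_t collapseIndexBuggy(std::int64_t idx0, std::int64_t idx1,
                                std::int64_t size0, std::int64_t size1) {
  (void)size1;
  return idx0 * size0 + idx1;
}
```

For a 2x3 shape, the correct form maps `[1][0]` to 3, while the buggy form maps it to 2, which collides with `[0][2]`.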
2024-02-22[TOSA] TosaToLinalg: fix int64_t min/max lowering of clamp (#82641)Matthias Gehre2-12/+27
tosa.clamp takes `min`/`max` attributes as i64, so ensure that the lowering to linalg works for the whole range. Co-authored-by: Tiago Trevisan Jost <tiago.trevisanjost@amd.com>
2024-02-22[mlir][mesh] add support in spmdization for incomplete sharding annotations ↵Boian Petkantchin2-17/+42
(#82442) Don't require that `mesh.shard` operations come in pairs. If there is only a single `mesh.shard` operation we assume that the producer result and consumer operand have the same sharding.
2024-02-22[mlir][test] Add -march=aarch64 -mattr=+sve to test-scalable-interleaveBenjamin Maxwell1-1/+2
Fix for https://lab.llvm.org/buildbot/#/builders/179/builds/9438
2024-02-22[mlir][test] Add integration tests for vector.interleave (#80969)Benjamin Maxwell2-0/+48
2024-02-22[mlir][Transforms][NFC] Turn block type conversion into `IRRewrite` (#81756)Matthias Springer1-430/+364
This commit is a refactoring of the dialect conversion. The dialect conversion maintains a list of "IR rewrites" that can be committed (upon success) or rolled back (upon failure). Until now, the signature conversion of a block was only a "partial" IR rewrite. Rollbacks were triggered via `BlockTypeConversionRewrite::rollback`, but there was no `BlockTypeConversionRewrite::commit` equivalent. Overview of changes: * Remove `ArgConverter`, an internal helper class that kept track of all block type conversions. There is now a separate `BlockTypeConversionRewrite` for each block type conversion. * No more special handling for block type conversions. They are now normal "IR rewrites", just like "block creation" or "block movement". In particular, trigger "commits" of block type conversion via `BlockTypeConversionRewrite::commit`. * Remove `ArgConverter::notifyOpRemoved`. This function was used to inform the `ArgConverter` that an operation was erased, to prevent a double-free of operations in certain situations. It would be impractical to add a `notifyOpRemoved` API to `IRRewrite`. Instead, erasing ops/blocks should go through a new `SingleEraseRewriter` (that is owned by the `ConversionPatternRewriterImpl`) if there is a chance of double-free. This rewriter ignores `eraseOp`/`eraseBlock` if the op/block was already freed.
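The "ignore a second erase" behaviour can be sketched as follows. This is a toy under stated assumptions: `ToyOp` and `ToySingleEraseRewriter` are invented for illustration, and "freeing" is modelled by setting a flag rather than by deallocating, so the example stays well-defined when the same op is erased twice.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical op: `freed` stands in for actual deallocation.
struct ToyOp {
  int id = 0;
  bool freed = false;
};

// Remembers which ops it has already erased and silently ignores repeats.
class ToySingleEraseRewriter {
public:
  int numFreed = 0;

  void eraseOp(ToyOp *op) {
    assert(op != nullptr);
    // insert() reports whether the op was newly added; if it was already in
    // the set, this is a repeated erase and there is nothing left to do.
    if (!erased.insert(op).second)
      return;
    op->freed = true;
    ++numFreed;
  }

private:
  std::unordered_set<ToyOp *> erased;
};
```

Calling `eraseOp` twice on the same op frees it once; the second call is a no-op.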
2024-02-22[mlir][Transforms] Dialect conversion: Improve signature conversion API (#81997)Matthias Springer2-10/+12
This commit improves the block signature conversion API of the dialect conversion. There is the following comment in `ArgConverter::applySignatureConversion`:
```
// If no arguments are being changed or added, there is nothing to do.
```
However, the implementation actually used to replace a block with a new block even if the block argument types do not change (i.e., there is "nothing to do"). This is fixed in this commit. The documentation of the public `ConversionPatternRewriter` API is updated accordingly. This commit also removes a check that used to *sometimes* skip a block signature conversion if the block was already converted. This is not consistent with the public `ConversionPatternRewriter` API; blocks should always be converted, regardless of whether they were already converted or not. Block signature conversion also used to be silently skipped when the specified block was detached. Instead of silently skipping, an assertion is triggered. Attempting to convert a detached block (which is likely an erased block) is invalid API usage.
2024-02-21[mlir][Vector] Replace `vector.shuffle` with `vector.interleave` in vector ↵Diego Caballero2-41/+68
narrow type emulation (#82550) This PR replaces the generation of `vector.shuffle` with `vector.interleave` in the i4 conversions in vector narrow type emulation. The multi dimensional semantics of `vector.interleave` allow us to enable these conversion emulations also for multi dimensional vectors.
2024-02-21Apply clang-tidy fixes for readability-container-size-empty in ↵Mehdi Amini1-3/+3
SerializeNVVMTarget.cpp (NFC)
2024-02-21Apply clang-tidy fixes for modernize-use-override in SerializeNVVMTarget.cpp ↵Mehdi Amini1-1/+1
(NFC)
2024-02-21Apply clang-tidy fixes for llvm-qualified-auto in OperationSupportTest.cpp (NFC)Mehdi Amini1-2/+2
2024-02-21Apply clang-tidy fixes for llvm-qualified-auto in ↵Mehdi Amini1-1/+1
InterfaceAttachmentTest.cpp (NFC)
2024-02-21Apply clang-tidy fixes for readability-identifier-naming in ↵Mehdi Amini1-3/+3
SerializationTest.cpp (NFC)
2024-02-21[mlir][GPU] Remove the SerializeToCubin pass (#82486)Fabian Mora4-247/+0
The `SerializeToCubin` pass was deprecated in September 2023 in favor of GPU compilation attributes; see the [GPU compilation](https://mlir.llvm.org/docs/Dialects/GPU/#gpu-compilation) section in the `gpu` dialect MLIR docs. This patch removes `SerializeToCubin` from the repo.
2024-02-21[mlir] Use arith max or min ops instead of cmp + select (#82178)mlevesquedion13-218/+113
I believe the semantics should be the same, but this saves 1 op and simplifies the code. For example, the following two instructions:
```
%2 = cmp sgt %0, %1
%3 = select %2, %0, %1
```
are equivalent to:
```
%2 = maxsi %0 %1
```
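For signed integers, the equivalence can be checked with a scalar analogue. The sketch below uses plain C++ rather than MLIR ops, with invented function names; it covers only the signed integer case from the example and makes no claim about floating-point variants.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Scalar analogue of `cmp sgt` followed by `select`.
std::int64_t maxViaCmpSelect(std::int64_t lhs, std::int64_t rhs) {
  bool isGreater = lhs > rhs;   // cmp sgt %0, %1
  return isGreater ? lhs : rhs; // select %2, %0, %1
}

// Scalar analogue of a single signed max op.
std::int64_t maxViaSingleOp(std::int64_t lhs, std::int64_t rhs) {
  return std::max(lhs, rhs);
}

// Asserts that both forms agree for one pair of inputs.
void checkEquivalent(std::int64_t lhs, std::int64_t rhs) {
  assert(maxViaCmpSelect(lhs, rhs) == maxViaSingleOp(lhs, rhs));
}
```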
2024-02-21[OpenMP] Remove `register_requires` global constructor (#80460)Joseph Huber2-28/+1
Summary: Currently, OpenMP handles the `omp requires` clause by emitting a global constructor into the runtime for every translation unit that requires it. However, this is not a great solution because it prevents us from having a defined order in which the runtime is accessed and used. This patch changes the approach to no longer use global constructors, but to instead group the flag with the other offloading entries that we already handle. This has the effect of still registering each flag per requires TU, but now we have a single constructor that handles everything. This function removes support for the old `__tgt_register_requires` and replaces it with a warning message. We just had a recent release, and the OpenMP policy for the past four releases since we switched to LLVM is that we do not provide strict backwards compatibility between major LLVM releases now that the library is versioned. This means that a user will need to recompile if they have an old binary that relied on `register_requires` having the old behavior. It is important that we actively deprecate this, as otherwise it would not solve the problem of having no defined init and shutdown order for `libomptarget`. The problem of `libomptarget` not having a defined init and shutdown order cascades into a lot of other issues so I have a strong incentive to be rid of it. It is worth noting that the current `__tgt_offload_entry` only has space for a 32-bit integer here. I am planning to overhaul these at some point as well.
2024-02-21[mlir][Vector] Add vector bitwidth target to xfer op flattening (#81966)Diego Caballero4-13/+137
This PR adds an optional bitwidth parameter to the vector xfer op flattening transformation so that the flattening doesn't happen if the trailing dimension of the read/written vector is larger than this bitwidth (i.e., we are already able to fill at least one vector register with that size).
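The decision described here amounts to a small predicate. The sketch below is an assumption-laden illustration, not the code from the PR: the function name and parameters are invented, and it treats a trailing dimension that exactly fills the target bitwidth as already large enough, following the "at least one vector register" reading of the description.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if flattening still looks worthwhile, i.e. the trailing
// dimension alone does not yet fill `targetVectorBitwidth` bits.
// Assumption: "fills the register exactly" counts as large enough.
bool shouldFlatten(const std::vector<std::int64_t> &vectorShape,
                   std::int64_t elementBitwidth,
                   std::int64_t targetVectorBitwidth) {
  assert(!vectorShape.empty() && elementBitwidth > 0);
  std::int64_t trailingDimBitwidth = vectorShape.back() * elementBitwidth;
  return trailingDimBitwidth < targetVectorBitwidth;
}
```

With 32-bit elements and a 128-bit target, a `4x2` vector (64 trailing bits) would still be flattened, while `4x4` and `2x8` vectors would be left alone.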
2024-02-21[mlir][Transforms] Fix use-after-free in #82474 (#82504)Matthias Springer1-4/+7
When a `ModifyOperationRewrite` is committed, the operation may already have been erased, so `OperationName` must be cached in the rewrite object. Note: This will no longer be needed with #81757, which adds a "cleanup" method to `IRRewrite`.
2024-02-21[mlir][Transforms][NFC] Simplify `ArgConverter` state (#81462)Matthias Springer1-57/+22
* When converting a block signature, `ArgConverter` creates a new block with the new signature and moves all operations from the old block to the new block. The new block is temporarily inserted into a region that is stored in `regionMapping`. The old block is not yet deleted, so that the conversion can be rolled back. `regionMapping` is not needed. Instead of moving the old block to a temporary region, it can just be unlinked. Block erasures are handled in the same way in the dialect conversion. * `regionToConverter` is a mapping from regions to type converter. That field is never accessed within `ArgConverter`. It should be stored in `ConversionPatternRewriterImpl` instead. * `convertedBlocks` is not needed. Old blocks are already stored in `ConvertedBlockInfo`.
2024-02-21[mlir][Transforms] Support rolling back properties in dialect conversion ↵Matthias Springer3-2/+59
(#82474) The dialect conversion rolls back in-place op modifications upon failure. Rolling back modifications of attributes is already supported, but there was no support for properties until now.
2024-02-21[mlir][Transforms][NFC] Turn in-place op modification into `IRRewrite` (#81245)Matthias Springer2-76/+74
This commit simplifies the internal state of the dialect conversion. A separate field for the previous state of in-place op modifications is no longer needed.
2024-02-21[mlir] Apply ClangTidy performance fix.Adrian Kuegel1-2/+2
Use const reference for loop variable.
2024-02-21[mlir] fix memory leakAlex Zinenko1-0/+1
Fix a leak of the root operation not being deleted in the recently introduced transform_interpreter.c.
2024-02-21[MLIR][Python] Use ir.Value directly instead of _SubClassValueT (#82341)Sergei Lebedev6-24/+10
_SubClassValueT is only useful when it has more than one usage in a signature. This was not true for the signatures produced by tblgen. For example, in `def call(result, callee, operands_, *, loc=None, ip=None) -> _SubClassValueT: ...`, a type checker does not have enough information to infer a type argument for _SubClassValueT, and thus effectively treats it as Any.