Diffstat (limited to 'mlir/docs')
-rw-r--r-- | mlir/docs/Bindings/Python.md | 77 |
-rw-r--r-- | mlir/docs/Canonicalization.md | 2 |
-rw-r--r-- | mlir/docs/Dialects/Shard.md | 6 |
-rw-r--r-- | mlir/docs/Dialects/Vector.md | 4 |
-rw-r--r-- | mlir/docs/Rationale/RationaleLinalgDialect.md | 66 |
5 files changed, 147 insertions, 8 deletions
diff --git a/mlir/docs/Bindings/Python.md b/mlir/docs/Bindings/Python.md index 98ac635..6f778b0 100644 --- a/mlir/docs/Bindings/Python.md +++ b/mlir/docs/Bindings/Python.md @@ -1188,7 +1188,6 @@ which can be `import`ed from the main dialect file, i.e. `python/mlir/dialects/<dialect-namespace>/passes.py` if it is undesirable to make the passes available along with the dialect. - ### Other functionality Dialect functionality other than IR objects or passes, such as helper functions, @@ -1201,6 +1200,82 @@ utilities to connect to the rest of Python API. The bindings can be located in a separate module or in the same module as attributes and types, and loaded along with the dialect. + +## Extending MLIR in Python + +The MLIR Python bindings provide support for defining custom components in Python, +mainly including dialects, passes, and rewrite patterns. +The following sections outline how each of these can be implemented. + +### Dialects + +Dialects can be defined through the IRDL dialect bindings in Python. +The IRDL bindings offer a `load_dialects` function that +converts an MLIR module containing `irdl.dialect` ops into MLIR dialects. +For further details, see the documentation of [the IRDL dialect](../Dialects/IRDL.md). + +### Passes + +Passes can be defined as Python callables via the `PassManager.add` API. +In such case, the callable is wrapped as an `mlir::Pass` internally and +executed as part of the pass pipeline when `PassManager.run` is invoked. +In the callable, the `op` parameter represents the current operation being transformed, +while the `pass_` parameter provides access to the current `Pass` object, +allowing actions such as `signalPassFailure()`. +The lifetime of the callable is extended at least until the `PassManager` is destroyed. +The following example code demonstrates how to define Python passes. 
+ +```python +def demo_pass(op, pass_): + # do something with the given op + pass + +pm = PassManager('any') +pm.add(demo_pass) +pm.add('some-cpp-defined-passes') +... +pm.run(some_op) +``` + +### Rewrite Patterns + +Rewrite patterns can be registered via the `add` method +of `mlir.rewrite.RewritePatternSet` in Python. +This method takes the operation type to be rewritten +and a Python callable that defines the *match and rewrite* logic. +Note that the Python callable should be defined so that +the rewrite is applied if and only if the match succeeds, +which corresponds to the return value being castable to `False`. + +The `RewritePatternSet` can be converted into +a `FrozenRewritePatternSet` using the `freeze` method, +which can be applied to an operation through +the greedy pattern driver using `apply_patterns_and_fold_greedily`. +The following example demonstrates the typical usage: + +```python +def to_muli(op, rewriter): + with rewriter.ip: + new_op = arith.muli(op.lhs, op.rhs, loc=op.location) + rewriter.replace_op(op, new_op) + +patterns = RewritePatternSet() +patterns.add(arith.AddIOp, to_muli) # Rewrite arith.addi into arith.muli +patterns.add(...) +frozen = patterns.freeze() + +module = ... +apply_patterns_and_fold_greedily(module, frozen) +``` + +The PDL dialect bindings also enable defining and generating rewrite patterns in Python. +The `mlir.rewrite.PDLModule` class accepts a module containing `pdl.pattern` ops, +which can be transformed into a `FrozenRewritePatternSet` using the `freeze` method. +This frozen set can then be applied to an operation +using the greedy rewrite pattern driver via `apply_patterns_and_fold_greedily`. +For further information, see [the PDL dialect documentation](/docs/Dialects/PDLOps/). + + ## Free-threading (No-GIL) support Free-threading or no-GIL support refers to CPython interpreter (>=3.13) with Global Interpreter Lock made optional. 
For details on the topic, please check [PEP-703](https://peps.python.org/pep-0703/) and this [Python free-threading guide](https://py-free-threading.github.io/). diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md index 686e500..2622c08 100644 --- a/mlir/docs/Canonicalization.md +++ b/mlir/docs/Canonicalization.md @@ -55,7 +55,7 @@ Some important things to think about w.r.t. canonicalization patterns: * It is always good to eliminate operations entirely when possible, e.g. by folding known identities (like "x + 0 = x"). -* Pattens with expensive running time (i.e. have O(n) complexity) or +* Patterns with expensive running time (i.e. have O(n) complexity) or complicated cost models don't belong to canonicalization: since the algorithm is executed iteratively until fixed-point we want patterns that execute quickly (in particular their matching phase). diff --git a/mlir/docs/Dialects/Shard.md b/mlir/docs/Dialects/Shard.md index eb6ff61..573e888 100644 --- a/mlir/docs/Dialects/Shard.md +++ b/mlir/docs/Dialects/Shard.md @@ -27,9 +27,9 @@ the tensor is sharded - not specified manually. ### Device Groups -Each collective operation runs within a group of devices. You define groups -using the `grid` and `grid_axes` attributes, which describe how to slice the -full device grid into smaller groups. +Collective operations run within groups of devices, which are defined +using the `grid` and `grid_axes` attributes. These describe +how the full device grid is sliced into smaller groups. Devices that have the same coordinates *outside* the listed `grid_axes` belong to the same group. 
diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md index 6c8949d..839dc75 100644 --- a/mlir/docs/Dialects/Vector.md +++ b/mlir/docs/Dialects/Vector.md @@ -125,7 +125,7 @@ Some existing Arith and Vector Dialect on `n-D` `vector` types comprise: // Produces a vector<3x7x8xf32> %b = arith.mulf %0, %1 : vector<3x7x8xf32> // Produces a vector<3x7x8xf32> -%c = vector.splat %1 : vector<3x7x8xf32> +%c = vector.broadcast %1 : f32 to vector<3x7x8xf32> %d = vector.extract %0[1]: vector<7x8xf32> from vector<3x7x8xf32> %e = vector.extract %0[1, 5]: vector<8xf32> from vector<3x7x8xf32> @@ -176,8 +176,6 @@ infrastructure can apply iteratively. ### Virtual Vector to Hardware Vector Lowering For now, `VV -> HWV` are specified in C++ (see for instance the -[SplatOpLowering for n-D vectors](https://github.com/tensorflow/mlir/commit/0a0c4867c6a6fcb0a2f17ef26a791c1d551fe33d) -or the [VectorOuterProductOp lowering](https://github.com/tensorflow/mlir/commit/957b1ca9680b4aacabb3a480fbc4ebd2506334b8)). Simple diff --git a/mlir/docs/Rationale/RationaleLinalgDialect.md b/mlir/docs/Rationale/RationaleLinalgDialect.md index 8975b0a..fbe2217 100644 --- a/mlir/docs/Rationale/RationaleLinalgDialect.md +++ b/mlir/docs/Rationale/RationaleLinalgDialect.md @@ -506,6 +506,72 @@ potential by introducing lower-level IR ops and *smaller* Linalg ops. This gradually reduces the potential, all the way to Loops + VectorOps and LLVMIR. +### Interchangeability of Forms<a name="forms"></a> + +#### The Linalg Forms + +The core Linalg operation set has four forms: +* **Generic:** Represented by `linalg.generic` and can encode all perfectly-nested +loop operations. +* **Category:** For example, `linalg.contract` and `linalg.elementwise`, that encode +higher-level semantics of a `linalg.generic` while still representing multiple _named_ +operations via attributes and syntax. In the future, other category operations are +planned (e.g.: `linalg.convolution` and `linalg.pooling`). 
+* **Named:** For example, `linalg.matmul`, `linalg.add`, etc. All _named_ forms +can be converted to either a single _category_ or _generic_ form, i.e. they are _perfectly nested_. +* **Composite:** For example `linalg.softmax` and the `winograd` variations. These +operations are not perfectly nested, and are converted to a list of other operations +(of various dialects). + +The forms correlate in the following manner: +``` ++ generic + \__ + category + \__ + named ++ composite +``` + +The `category` and `named` forms are derived from `linalg.generic` and are *equivalent*. +It should always be possible to convert a `named` operation into a `category` and that +into a `generic` and back to `named`. However, it may not be possible to convert a +`generic` into a `named` if there is no such `named` form. + +`Composite` operations cannot be converted to the other three classes and form a +subset of their own. But they can use other Linalg forms when expanding. There can be +a pattern-matching transform to detect a graph of operations and convert it into a +`composite` operation. + +The various forms in the Linalg dialect are meant to facilitate +pattern matching (single operations or DAGs) and to be able to consider +different forms as *canonical* for different transforms. + +Linalg's various forms also carry information, and that +information should be preserved as much as possible during the progressive +lowering. A `matmul` operation is a special case of a `contract` operation, +which in turn is a special case of a `generic` operation. Transformations on +Linalg operations (in any form) should avoid breaking down into +loops + arithmetic if they can still be represented as a Linalg operation, +preferably in their original form. + +#### Canonical Forms<a name="canonical_forms"></a> + +With multiple (often exchangeable) forms, and with transformation simplicity +in mind, compilers should aim for reducing matching and replacing complexity +as much as possible.
When matching a single operation with a complex pattern, +having all the information in a `generic` Op is useful to iteratively match +different patterns in turn. However, when assembling a DAG of operations to +form a pattern, it's much simpler to match against named operations (like +`max` + `div` + `reduce` + `broadcast`) than their generic counterparts. + +This is where the interchangeability of forms comes in handy. Linalg has the +ability to specialize and generalize in order to convert the IR to a form that +is easier for a particular type of transform. With forms being semantically +equivalent, one can convert back-and-forth throughout the various transforms +to match the needs of each transform. For that particular transform, such +form can be considered _canonical_ and therefore "expected" for the pattern +to _match_. This reduces complexity of pattern matchers and simplifies compiler +pipelines. + ### Composable and Declarative Transformations<a name="declarative_transformations"></a> Complex and impactful transformations need not be hard to manipulate, write or maintain. Mixing XLA-style high-level op semantics knowledge with generic |