Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
When executed in the context of canonicalization, the folders are
invoked in a fixed-point iterative process. However in the context of an
API like `createOrFold()` or in DialectConversion for example, we expect
a "one-shot" call to fold to be as "folded" as possible. However, even
when folders themselves are indempotent, folders on a given operation
interact with each other. For example:
```
// X = 0 + Y
%X = arith.addi %c_0, %Y : i32
```
should fold to %Y, but the process actually involves first the folder
provided by the IsCommutative trait to move the constant to the right.
However this happens after attempting to fold the operation and the
operation folder isn't attempt again after applying the trait folder.
This commit makes sure we iterate until fixed point on folder
applications.
Fixes #159844
|
|
arguments (#160755)
In #153973 I added the correctly handling of block arguments,
unfortunately this was gated on operation that also have results. This
wasn't intentional and this excluded operations like function from being
correctly processed.
|
|
## Problem
`RemoveDeadValues` can legally drop dead function arguments on private
`func.func` callees. But call-sites to such functions aren't fixed if
the call operation keeps its call arguments in a **segmented operand
group** (i.ie, uses `AttrSizedOperandSegments`), unless the call op
implements `getArgOperandsMutable` and the RDV pass actually uses it.
## Fix
When RDV decides to drop callee function args, it should, for each
call-site that implements `CallOpInterface`, **shrink the call's
argument segment** via `getArgOperandsMutable()` using the same dead-arg
indices. This keeps both the flat operand list and the
`operand_segment_sizes` attribute in sync (that's what
`MutableOperandRange` does when bound to the segment).
## Note
This change is a no-op for:
* call ops without segment operands (they still get their flat operands
erased via the generic path)
* call ops whose calle args weren't dropped (public, external,
non-`func-func`, unresolved symbol, etc)
* `llvm.call`/`llvm.invoke` (RDV doesn't drop `llvm.func` args
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
This commit adds functionality to bubble down memory-space casts
operations, allowing consumer operations to use the original
memory-space rather than first casting to a different memory space.
Changes:
- Introduce `MemorySpaceCastOpInterface` to handle memory-space cast
operations
- Create a `MemorySpaceCastConsumerOpInterface` pass that identifies and
bubbles down eligible casts
- Add implementation for memref and vector operations to handle
memory-space cast propagation
- Add `bubbleDownCasts` method to relevant operations to support the
fusion
In particular, in the current implementation only memory-space casts
into the default memory-space can be bubbled-down.
Example:
```mlir
func.func @op_with_cast_sequence(%arg0: memref<4x4xf32, 1>, %arg1: index, %arg2: f32) -> memref<16xf32> {
%memspacecast = memref.memory_space_cast %arg0 : memref<4x4xf32, 1> to memref<4x4xf32>
%c0 = arith.constant 0 : index
%c4 = arith.constant 4 : index
%expanded = memref.expand_shape %memspacecast [[0], [1, 2]] output_shape [4, 2, 2] : memref<4x4xf32> into memref<4x2x2xf32>
%collapsed = memref.collapse_shape %expanded [[0, 1, 2]] : memref<4x2x2xf32> into memref<16xf32>
%loaded = memref.load %collapsed[%c0] : memref<16xf32>
%added = arith.addf %loaded, %arg2 : f32
memref.store %added, %collapsed[%c0] : memref<16xf32>
%atomic_result = memref.atomic_rmw addf %arg2, %collapsed[%c4] : (f32, memref<16xf32>) -> f32
return %collapsed : memref<16xf32>
}
// mlir-opt --bubble-down-memory-space-casts
func.func @op_with_cast_sequence(%arg0: memref<4x4xf32, 1>, %arg1: index, %arg2: f32) -> memref<16xf32> {
%c4 = arith.constant 4 : index
%c0 = arith.constant 0 : index
%expand_shape = memref.expand_shape %arg0 [[0], [1, 2]] output_shape [4, 2, 2] : memref<4x4xf32, 1> into memref<4x2x2xf32, 1>
%collapse_shape = memref.collapse_shape %expand_shape [[0, 1, 2]] : memref<4x2x2xf32, 1> into memref<16xf32, 1>
%memspacecast = memref.memory_space_cast %collapse_shape : memref<16xf32, 1> to memref<16xf32>
%0 = memref.load %collapse_shape[%c0] : memref<16xf32, 1>
%1 = arith.addf %0, %arg2 : f32
memref.store %1, %collapse_shape[%c0] : memref<16xf32, 1>
%2 = memref.atomic_rmw addf %arg2, %collapse_shape[%c4] : (f32, memref<16xf32, 1>) -> f32
return %memspacecast : memref<16xf32>
}
```
---------
Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
implementation (#158075)
Move the logic for building "out-of-thin-air" source materializations
during op replacements from `replaceOp` to
`findOrBuildReplacementValue`. That function already builds source
materializations and can handle the case where an op result is dropped.
This commit is in preparation of turning `replaceOp` into a non-virtual
function. (It is sufficient for `replaceAllUsesWith` and `eraseOp` to be
virtual.)
|
|
|
|
GreedyPatternRewriteDriver.cpp (NFC)
|
|
(NFC) (#159752)
Pattern benefit allows users to give priority to a pattern.
|
|
forward-declaration (#158291)
This is a follow-up to
https://github.com/llvm/llvm-project/pull/158067/files#r2343711946.
|
|
The `reconcile-unrealized-casts` pass used to crash when the input
contains circular chains of `unrealized_conversion_cast` ops.
Furthermore, the `reconcileUnrealizedCasts` helper functions used to
erase ops that were not passed via the `castOps` operand. Such ops are
now preserved. That's why some integration tests had to be changed.
Also avoid copying the set of all unresolved materializations in
`convertOperations`.
This commit is in preparation of turning `RewriterBase::replaceOp` into
a non-virtual function.
This is a re-upload of #158067, which was reverted due to CI failures.
Note for LLVM integration: If you are seeing tests that are failing with
`error: LLVM Translation failed for operation:
builtin.unrealized_conversion_cast`, you may have to add the
`-reconcile-unrealized-casts` pass to your pass pipeline. (Or switch to
the `-convert-to-llvm` pass instead of combining the various
`-convert-*-to-llvm` passes.)
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
Reverts llvm/llvm-project#158067
Buildbot is broken.
|
|
The `reconcile-unrealized-casts` pass used to crash when the input
contains circular chains of `unrealized_conversion_cast` ops.
Furthermore, the `reconcileUnrealizedCasts` helper functions used to
erase ops that were not passed via the `castOps` operand. Such ops are
now preserved. That's why some integration tests had to be changed.
Also avoid copying the set of all unresolved materializations in
`convertOperations`.
This commit is in preparation of turning `RewriterBase::replaceOp` into
a non-virtual function.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
conversion (#158072)
The current implementation is overly conservative and disable all
possible caching as soon as a context-aware conversion is present.
However the context-aware conversion only affects subsequent converters,
we can cache the previous ones.
This isn't NFC because if fixed a bug where we use to unconditionally
cache when using the `convertType(Type t, ...` API, while now all APIs
are aware of context-aware conversions.
|
|
|
|
`ConversionPatternRewriter::replaceAllUsesWith` (#155244)
This commit generalizes `replaceUsesOfBlockArgument` to
`replaceAllUsesWith`. In rollback mode, the same restrictions keep
applying: a value cannot be replaced multiple times and a call to
`replaceAllUsesWith` will replace all current and future uses of the
`from` value.
`replaceAllUsesWith` is now fully supported and its behavior is
consistent with the remaining dialect conversion API. Before this
commit, `replaceAllUsesWith` was immediately reflected in the IR when
running in rollback mode. After this commit, `replaceAllUsesWith`
changes are materialized in a delayed fashion, at the end of the dialect
conversion. This is consistent with the `replaceUsesOfBlockArgument` and
`replaceOp` APIs.
`replaceAllUsesExcept` etc. are still not supported and will be
deactivated on the `ConversionPatternRewriter` (when running in rollback
mode) in a follow-up commit.
Note for LLVM integration: Replace `replaceUsesOfBlockArgument` with
`replaceAllUsesWith`. If you are seeing failures, you may have patterns
that use `replaceAllUsesWith` incorrectly (e.g., being called multiple
times on the same value) or bypass the rewriter API entirely. E.g., such
failures were mitigated in Flang by switching to the walk-patterns
driver (#156171).
You can temporarily reactivate the old behavior by calling
`RewriterBase::replaceAllUsesWith`. However, note that that behavior is
faulty in a dialect conversion. E.g., the base
`RewriterBase::replaceAllUsesWith` implementation does not see uses of
the `from` value that have not materialized yet and will, therefore, not
replace them.
|
|
last block is not the exit. (#156123)
'processFuncOp' queries the number of returned values of a function
using the terminator of the last block's getNumOperands(). It presumes
the last block is the exit. It is not always the case.
This patch fixes the bug by querying from FunctionInterfaceOp directly.
|
|
This patch fixes:
mlir/lib/Transforms/Utils/DialectConversion.cpp:2761:9: error:
unused variable 'impl' [-Werror,-Wunused-variable]
|
|
Many internal functions take a `ConversionPatternRewriter &` or
`ConversionPatternRewriterImpl &` as a parameter. There's only a single
instance of these classes, so it's better to store the reference in a
field. This commit is in preparation of another PR that will require
access to `ConversionPatternRewriter` in additional helper functions.
Note: Public API does not change.
|
|
|
|
LoopInvariantCodeMotionUtils.cpp (NFC)
|
|
This commit adds support for context-aware type conversions: type
conversion rules that can return different types depending on the IR.
There is no change for existing (context-unaware) type conversion rules:
```c++
// Example: Conversion any integer type to f32.
converter.addConversion([](IntegerType t) {
return Float32Type::get(t.getContext());
}
```
There is now an additional overload to register context-aware type
conversion rules:
```c++
// Example: Type conversion rule for integers, depending on the context:
// Get the defining op of `v`, read its "increment" attribute and return an
// integer with a bitwidth that is increased by "increment".
converter.addConversion([](Value v) -> std::optional<Type> {
auto intType = dyn_cast<IntegerType>(v.getType());
if (!intType)
return std::nullopt;
Operation *op = v.getDefiningOp();
if (!op)
return std::nullopt;
auto incrementAttr = op->getAttrOfType<IntegerAttr>("increment");
if (!incrementAttr)
return std::nullopt;
return IntegerType::get(v.getContext(),
intType.getWidth() + incrementAttr.getInt());
});
```
For performance reasons, the type converter caches the result of type
conversions. This is no longer possible when there context-aware type
conversions because each conversion could compute a different type
depending on the context. There is no performance degradation when there
are only context-unaware type conversions.
Note: This commit just adds context-aware type conversions to the
dialect conversion framework. There are many existing patterns that
still call `converter.convertType(someValue.getType())`. These should be
gradually updated in subsequent commits to call
`converter.convertType(someValue)`.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
|
|
```
warning: implicit capture of 'this' with a capture default of '=' is deprecated [-Wdeprecated-this-capture]
```
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
|
|
Improve the documentation of `replaceUsesOfBlockArgument` to clarify its
semantics is rollback mode. Add an assertion to make sure that the same
block argument is not replaced multiple times. That's an API violation
and messes with the internal state of the conversion driver.
This commit is in preparation of adding full support for
`RewriterBase::replaceAllUsesWith`.
|
|
|
|
|
|
|
|
|
|
|
|
I have seen misuse of the `hasEffect` API in downstream projects: users
sometimes think that `hasEffect == false` indicates that the operation
does not have a certain memory effect. That's not necessarily the case.
When the op does not implement the `MemoryEffectsOpInterface`, it is
unknown whether it has the specified effect. "false" can also mean
"maybe".
This commit clarifies the semantics in the documentation. Also adds
`hasUnknownEffects` and `mightHaveEffect` convenience functions. Also
simplifies a few call sites.
|
|
(#153961)
Also improve a bit the LDBG() implementation
|
|
|
|
This patch is forcing all values to be initialized by the
LivenessAnalysis, even in dead blocks. The dataflow framework will skip
visiting values when its already knows that a block is dynamically
unreachable, so this requires specific handling.
Downstream code could consider that the absence of liveness is the same
a "dead".
However as the code is mutated, new value can be introduced, and a
transformation like "RemoveDeadValue" must conservatively consider that
the absence of liveness information meant that we weren't sure if a
value was dead (it could be a newly introduced value.
Fixes #153906
|
|
(#154038)
This is similar to the fix to the greedy driver in #153957 ; except that
instead of removing unreachable code, we just ignore it.
Operations like:
```
%add = arith.addi %add, %add : i64
```
are legal in unreachable code.
Unfortunately many patterns would be unsafe to apply on such IR and can
lead to crashes or infinite loops.
|
|
(#119532)
Add a debugging flag to the dialect conversion to dump the
materialization kind. This flag is useful to find out whether a missing
materialization rule is for source or target materializations.
Also add missing test coverage for the `buildMaterializations` flag.
|
|
(#154037)
This is in preparation of a follow-up change to stop traversing
unreachable blocks.
This is not NFC because of a subtlety of the early_inc. On a test case
like:
```
scf.if %cond {
"test.move_after_parent_op"() ({
"test.any_attr_of_i32_str"() {attr = 0 : i32} : () -> ()
}) : () -> ()
}
```
We recursively traverse the nested regions, and process an op when the
region is done (post-order).
We need to pre-increment the iterator before processing an operation in
case it gets deleted. However
we can do this before or after processing the nested region. This
implementation does the latter.
|
|
rewriter (#153957)
Operations like:
%add = arith.addi %add, %add : i64
are legal in unreachable code. Unfortunately many patterns would be
unsafe to apply on such IR and can lead to crashes or infinite loops. To
avoid this we can remove unreachable blocks before attempting to apply
patterns.
We may have to do this also whenever the CFG is changed by a pattern, it
is left up for future work right now.
Fixes #153732
|
|
Until now, `builtin.unrealized_conversion_cast` ops could not be inlined
by the Inliner pass.
|
|
(#153605)
Prior to this PR, the default behaviour of a conversion pattern which
receives operands of a 1:N is to abort the compilation. This has
historically been useful when the 1:N type conversion got merged into
the dialect conversion as it allowed us to easily find patterns that
should be capable of handling 1:N type conversions but didn't.
However, this behaviour has the disadvantage of being non-composable:
While the pattern in question cannot handle the 1:N type conversion,
another pattern part of the set might, but doesn't get the chance as
compilation is aborted.
This PR fixes this behaviour by failing to match and instead of
aborting, giving other patterns the chance to legalize an op. The
implementation uses a reusable function called `dispatchTo1To1` to allow
derived conversion patterns to also implement the behaviour.
|
|
Fix build after #151865.
|
|
This commit improves the `allowPatternRollback` flag handling in the
dialect conversion driver. Previously, this flag was used to merely
detect cases that are incompatible with the new One-Shot Dialect
Conversion driver. This commit implements the driver itself: when the
flag is set to "false", all IR changes are materialized immediately,
bypassing the `IRRewrite` and `ConversionValueMapping` infrastructure.
A few selected test cases now run with both the old and the new driver.
RFC:
https://discourse.llvm.org/t/rfc-a-new-one-shot-dialect-conversion-driver/79083
|
|
Previously this only happened post checking if the op is legal, but was
done unconditionally post (and before other legalization patterns). Add
option to not attempt folding and one to do so as last resort.
Did consider but did not add a always attempt to fold option (which
would have folded whether or not legal), but removed TODO about it.
|
|
legalized (#152297)
Print a more detailed error message when new/modified IR could not be
legalized with `allowPatternRollback = false`. This is useful to
understand why a pattern is incompatible with the new One-Shot Dialect
Conversion driver.
---------
Co-authored-by: Jeremy Kun <jkun@google.com>
|
|
When a conversion pattern is initialized without a type converter, the
driver implementation currently looks up the most recently mapped value.
This is undesirable because the most recently mapped value could be a
materialization. I.e., the type of the value being looked up could
depend on which other patterns have run before. Such an implementation
makes the type conversion infrastructure fragile and unpredictable.
The current implementation also contradicts the documentation in the
markdown file. According to that documentation, the values provided by
the adaptor should match the types of the operands of the match
operation when running without a type converter. This mechanism is not
desirable, either, for two reasons:
1. Some patterns have started to rely on receiving the most recently
mapped value. Changing the behavior to the documented behavior will
cause regressions. (And there would be no easy way to fix those without
forcing the use of a type converter or extending the `getRemappedValue`
API.)
2. It is more useful to receive the most recently mapped value. A value
of the original operand type can be retrieved by using the operand of
the matched operation. The adaptor is not needed at all in that case.
To implement the new behavior, materializations are now annotated with a
marker attribute. The marker is needed because not all
`unrealized_conversion_cast` ops are materializations that act as "pure
type conversions". E.g., when erasing an operation, its results are
mapped to newly-created "out-of-thin-air values", which are
materializations (with no input) that should be treated like regular
replacement values during a lookup. This marker-based lookup strategy is
also compatible with the One-Shot Dialect Conversion implementation
strategy, which does not utilize the mapping infrastructure anymore and
queries all necessary information by examining the IR.
|
|
Add a helper function to `ConversionPatternRewriter` that returns the
dialect conversion configuration. This flag is useful when migrating
conversion patterns to the new One-Shot Conversion Driver: patterns can
check if they are running in rollback mode or not. They can then work
around API changes and makes sure that the pattern keeps working with
both the old and new driver.
Also remove the `config` field from `OperationLegalizer`. That field was
never needed.
|
|
Do not erase location info when moving an op within the same block.
Since #75415 , the FoldUtils.cpp erases the location information when
moving an operation. This was being done even when an operation was
moved to the front of a block it was already in.
In TFLite, this location information is used to provide meaningful names
for tensors, which aids in debugging and mapping compiled tensors back
to their original layers. The aggressive erasure of location info caused
many tensors in TFLite models to receive generic names (e.g.,
tfl.pseudo_qconst), making the models harder to inspect.
This change modifies the logic to preserve the location of an operation
when it is moved within the same block. The location is now only erased
when the operation is moved from a different block entirely. This
ensures that most tensor names are preserved, improving the debugging
experience for TFLite models.
|
|
format (#150991)
This prefix the output with the DEBUG_TYPE.
Dialect conversion is using a ScopedPrinter, we insert the
raw_ldbg_ostream to consistently prefix each new line.
|
|
With LDBG(), the code isn't guarded in release mode, even if the optimizer
will remove it because there is a `if (false)` statement. We need the
function declaration to be there at minima.
|
|
|
|
Operation folders can do two things:
1. Modify IR (in-place op modification). Failing to legalize an in-place
folded operation does not trigger an immediate rollback. This happens
only if the driver decides to try a different lowering path, requiring
it to roll back a bunch of modifications, including the application of
the folder.
2. Create new IR (constant op materialization of a folded attribute).
Failing to legalize a newly created constant op triggers an immediate
rollback.
In-place op modifications should be guarded by
`startOpModification`/`finalizeOpModification` because they are no
different from other in-place op modifications. (They just happen
outside of a pattern, but that does not mean that we should not track
those changes; we are tracking everything else.) This commit adds those
two function calls.
This commit also moves the `rewriter.replaceOp(op, replacementValues);`
function call before the loop nest that legalizes the newly created
constant ops (and therefore `replacementValues`). Conceptually, the
folded op must be replaced before attempting to legalize the constants
because the constant ops may themselves be replaced as part of their own
legalization process. The previous implementation happened to work in
the current conversion driver, but is incompatible with the One-Shot
Dialect Conversion driver, which expects to see the most recent IR at
all time.
From an end-user perspective, this commit should be NFC. A common
folder-rollback pattern that is exercised by multiple tests cases: A
`memref.dim` is folded to `arith.constant`, but `arith.constant` is not
marked as legal as per the conversion target, triggering a rollback.
Note: Folding is generally unsafe in a dialect conversion (see #92683),
but that's a different issue. (In a One-Shot Dialect Conversion, it will
no longer be unsafe.)
|