author | Andrzej Warzyński <andrzej.warzynski@arm.com> | 2024-11-12 19:10:24 +0000 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-11-12 19:10:24 +0000 |
commit | 7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d | |
tree | 1e2ad0438d15c9062ae585f90200257475e05cbc /llvm/lib/Object/ELFObjectFile.cpp | |
parent | e458434ebe87f890db0d4a03bbc3de30f3d052b9 | |
[mlir][tensor] Update `GeneralizeOuterUnitDimsPackOpPattern` (#115312)
Avoid generating spurious tensor.extract_slice, follow-on for #114315.
This is best demonstrated with an example. Here's the input for
`GeneralizeOuterUnitDimsPackOpPattern`:
```mlir
%pack = tensor.pack %input
padding_value(%pad : f32)
inner_dims_pos = [1, 0]
inner_tiles = [2, %tile_dim_1]
into %output : tensor<5x1xf32> -> tensor<1x1x2x?xf32>
```
Output _before_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
// NOTE: skipped in the output _after_
%extracted_slice = tensor.extract_slice
%padded[0, 0] [%arg3, 2] [1, 1] :
tensor<?x2xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
ins(%extracted_slice : tensor<?x2xf32>)
outs(%empty : tensor<2x?xf32>)
permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
tensor<2x?xf32> into tensor<1x1x2x?xf32>
```
Output _after_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
ins(%padded : tensor<?x2xf32>)
outs(%empty : tensor<2x?xf32>) permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
tensor<2x?xf32> into tensor<1x1x2x?xf32>
```
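To see why the dropped `tensor.extract_slice` is redundant, here is a small shape-level model in plain Python. It is only a sketch, not the pattern's implementation: the helper names are hypothetical, the value `8` for `%tile_dim_1` is made up, and it assumes the high padding `%0` equals the tile size minus the source size (the IR above does not show how `%0` is computed).

```python
def pad_high(rows, high_rows, high_cols, pad_value):
    """Pad a 2-D list-of-lists on the high side only (low padding is zero)."""
    width = len(rows[0]) + high_cols
    padded = [list(r) + [pad_value] * high_cols for r in rows]
    padded += [[pad_value] * width for _ in range(high_rows)]
    return padded


def extract_slice(rows, sizes):
    """Zero-offset, unit-stride slice, modelling extract_slice [0, 0] [sizes] [1, 1]."""
    return [r[:sizes[1]] for r in rows[:sizes[0]]]


def transpose(rows):
    """Model linalg.transpose with permutation = [1, 0]."""
    return [list(col) for col in zip(*rows)]


tile_dim_1 = 8  # hypothetical runtime value standing in for %tile_dim_1
source = [[float(i)] for i in range(5)]  # models tensor<5x1xf32>

# Assumption: %0 == tile_dim_1 - 5, so the padded tensor is exactly one tile.
padded = pad_high(source, tile_dim_1 - 5, 1, pad_value=0.0)

sliced = extract_slice(padded, [tile_dim_1, 2])  # the op emitted "before"
slice_is_noop = sliced == padded

tile_before = transpose(sliced)  # "before": transpose the slice
tile_after = transpose(padded)   # "after": transpose the padded tensor directly
```

Under these assumptions the slice covers the whole padded tensor, so both routes produce the same `2 x tile_dim_1` tile that is then inserted into the destination.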
This PR also adds a check that only the last N dimensions are tiled
(for some value of N). Based on the PR discussion, this restriction
seems reasonable, especially as no in-tree tests require otherwise.
For now, it also simplifies the computation of permutations for
linalg.transpose. The restriction can be relaxed in the future if
needed.
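The trailing-dimensions restriction can be illustrated with a short Python predicate. This is a hedged sketch of the idea only: the function name and its arguments are hypothetical, and the actual check in the C++ pattern may be written differently.

```python
def tiles_only_trailing_dims(inner_dims_pos, source_rank):
    """Return True if the tiled positions are exactly the last N source dims.

    `inner_dims_pos` mirrors the attribute of the same name on tensor.pack;
    N is taken to be the number of tiled dimensions. The order of the
    positions does not matter here, only which dimensions are tiled.
    """
    num_tiled = len(inner_dims_pos)
    first_trailing = source_rank - num_tiled
    no_duplicates = len(set(inner_dims_pos)) == num_tiled
    return no_duplicates and all(
        first_trailing <= pos < source_rank for pos in inner_dims_pos
    )


# The example from this commit message: rank-2 source, both dims tiled.
example_ok = tiles_only_trailing_dims([1, 0], source_rank=2)

# Tiling only the leading dim of a rank-2 source would be rejected.
leading_only = tiles_only_trailing_dims([0], source_rank=2)
```

For the example above, `inner_dims_pos = [1, 0]` on a rank-2 source tiles both dimensions, so it satisfies the restriction with N = 2.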