riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.


author	Andrzej Warzyński <andrzej.warzynski@arm.com>	2024-11-12 19:10:24 +0000
committer	GitHub <noreply@github.com>	2024-11-12 19:10:24 +0000
commit	7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d (patch)
tree	1e2ad0438d15c9062ae585f90200257475e05cbc /llvm/lib/Object/ELFObjectFile.cpp
parent	e458434ebe87f890db0d4a03bbc3de30f3d052b9 (diff)

[mlir][tensor] Update `GeneralizeOuterUnitDimsPackOpPattern` (#115312)

Avoid generating a spurious `tensor.extract_slice`; follow-on for #114315.

This is best demonstrated with an example. Here's the input for
`GeneralizeOuterUnitDimsPackOpPattern`:

```mlir
%pack = tensor.pack %input
  padding_value(%pad : f32)
  inner_dims_pos = [1, 0]
  inner_tiles = [2, %tile_dim_1]
  into %output : tensor<5x1xf32> -> tensor<1x1x2x?xf32>
```

Output _before_:

```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
  ^bb0(%arg4: index, %arg5: index):
    tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
// NOTE: skipped in the output _after_
%extracted_slice = tensor.extract_slice
  %padded[0, 0] [%arg3, 2] [1, 1] :
  tensor<?x2xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
  ins(%extracted_slice : tensor<?x2xf32>)
  outs(%empty : tensor<2x?xf32>)
  permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
  into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
  tensor<2x?xf32> into tensor<1x1x2x?xf32>
```

Output _after_:

```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
  ^bb0(%arg4: index, %arg5: index):
    tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
  ins(%padded : tensor<?x2xf32>)
  outs(%empty : tensor<2x?xf32>)
  permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
  into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
  tensor<2x?xf32> into tensor<1x1x2x?xf32>
```

This PR also adds a check to verify that only the last N trailing
dimensions are tiled (for some value of N). Based on the PR discussion,
this restriction seems reasonable, especially as there are no in-tree
tests requiring otherwise. For now, it also simplifies the computation of
permutations for `linalg.transpose`. This restriction can be relaxed in
the future if needed.
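
For illustration, the "only the last N trailing dimensions are tiled" condition could be sketched roughly as follows. This is a standalone approximation, not the code added by this commit: the helper name `tilesOnlyTrailingDims` and its parameters are hypothetical, and it assumes the entries of `inner_dims_pos` are distinct.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the upstream implementation): returns true when
// every entry of innerDimsPos refers to one of the last N source dimensions,
// where N = innerDimsPos.size(). Assumes the entries are distinct, so passing
// the check means exactly the trailing N dimensions are tiled. The order
// within the trailing block is not constrained.
bool tilesOnlyTrailingDims(int64_t srcRank,
                           const std::vector<int64_t> &innerDimsPos) {
  const int64_t numTiles = static_cast<int64_t>(innerDimsPos.size());
  if (numTiles > srcRank)
    return false;
  // Index of the first dimension that is allowed to be tiled.
  const int64_t firstTiledDim = srcRank - numTiles;
  return std::all_of(innerDimsPos.begin(), innerDimsPos.end(),
                     [&](int64_t pos) {
                       return pos >= firstTiledDim && pos < srcRank;
                     });
}
```

Under this sketch, the example above (`inner_dims_pos = [1, 0]` on a rank-2 source) is accepted because both source dimensions are tiled, whereas tiling only dimension 0 of a rank-3 source would be rejected.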


