riscv-gnu-toolchain/llvm.git


author	Andrzej Warzyński <andrzej.warzynski@arm.com>	2024-09-24 14:03:30 +0100
committer	GitHub <noreply@github.com>	2024-09-24 14:03:30 +0100
commit	b47d1787b51f55d69ef1b4f88e72cd54af451649 (patch)
tree	40529c877f5c38c91aed3304106ea258a6b10f1f /llvm/lib/Object/MachOObjectFile.cpp
parent	12033e550b186f3b3e4d2ca3ce9cfc3d3a3fa6e1 (diff)
download	llvm-b47d1787b51f55d69ef1b4f88e72cd54af451649.zip llvm-b47d1787b51f55d69ef1b4f88e72cd54af451649.tar.gz llvm-b47d1787b51f55d69ef1b4f88e72cd54af451649.tar.bz2

[mlir][vector] Refine vectorisation of tensor.extract (#109580)

This PR fixes a bug in `isLoopInvariantIdx`. It makes sure that the following case is vectorised as `vector.gather` (as opposed to attempting a contiguous load):

```mlir
func.func @index_from_output_column_vector_gather_load(%src: tensor<8x128xf32>) -> tensor<8x1xf32> {
  %c0 = arith.constant 0 : index
  %0 = tensor.empty() : tensor<8x1xf32>
  %res = linalg.generic {
    indexing_maps = [#map],
    iterator_types = ["parallel", "parallel"]
  } outs(%0 : tensor<8x1xf32>) {
  ^bb0(%arg1: f32):
    %1 = linalg.index 0 : index
    %extracted = tensor.extract %src[%1, %c0] : tensor<8x128xf32>
    linalg.yield %extracted : f32
  } -> tensor<8x1xf32>
  return %res : tensor<8x1xf32>
}
```

Specifically, when looking for loop-invariant indices in `tensor.extract` Ops, any `linalg.index` Op that's used in address calculation should only access loop dims that are == 1. In the example above, the following does not meet that criterion:

```mlir
%1 = linalg.index 0 : index
```

Note that this PR also effectively addresses the issue fixed in #107922, i.e. exercised by:
* `@vectorize_nd_tensor_extract_load_1d_column_vector_using_gather_load`

`getNonUnitLoopDim` introduced in #107922 is still valid though. In fact, it is required to identify that the following case is a contiguous load:

```mlir
func.func @index_from_output_column_vector_contiguous_load(%src: tensor<8x128xf32>) -> tensor<8x1xf32> {
  %c0 = arith.constant 0 : index
  %0 = tensor.empty() : tensor<8x1xf32>
  %res = linalg.generic {
    indexing_maps = [#map],
    iterator_types = ["parallel", "parallel"]
  } outs(%0 : tensor<8x1xf32>) {
  ^bb0(%arg1: f32):
    %1 = linalg.index 0 : index
    %extracted = tensor.extract %src[%c0, %1] : tensor<8x128xf32>
    linalg.yield %extracted : f32
  } -> tensor<8x1xf32>
  return %res : tensor<8x1xf32>
}
```

Some logic is still missing to lower the above to `vector.transfer_read`, so it is conservatively lowered to `vector.gather` instead (see TODO in `getTensorExtractMemoryAccessPattern`).

There are a few additional changes:
* `getNonUnitLoopDim` is simplified and renamed as `getTrailingNonUnitLoopDimIdx`, and additional comments are added (note that the functionality didn't change);
* extra comments are added in a few places, and variable names in comments are updated to use Markdown (which is the preferred approach in MLIR).

This is a follow-on for:
* https://github.com/llvm/llvm-project/pull/107922
* https://github.com/llvm/llvm-project/pull/102321
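The rule described above (a `linalg.index` used in address calculation is loop-invariant only when the loop dim it reads equals 1) can be illustrated with a small standalone sketch. This is not the MLIR implementation: `LoopRanges`, `isIndexOpLoopInvariant` and `trailingNonUnitLoopDimIdx` are hypothetical names, plain `std::vector` stands in for the Linalg Op's static loop ranges, and the ranges `{8, 1}` assume that `#map` (not shown in the message) is the identity map over the `tensor<8x1xf32>` output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the static loop ranges of a linalg.generic,
// e.g. {8, 1} for the tensor<8x1xf32> output in the examples (assuming an
// identity indexing map).
using LoopRanges = std::vector<int64_t>;

// A `linalg.index <dim>` value can only be treated as loop-invariant when
// the loop dimension it reads has trip count 1 (it then always yields 0).
bool isIndexOpLoopInvariant(const LoopRanges &ranges, std::size_t dim) {
  assert(dim < ranges.size() && "linalg.index dim out of range");
  return ranges[dim] == 1;
}

// Position of the last loop dimension whose trip count is not 1.
// Assumption of this sketch: at least one non-unit dim exists.
std::size_t trailingNonUnitLoopDimIdx(const LoopRanges &ranges) {
  for (std::size_t i = ranges.size(); i > 0; --i)
    if (ranges[i - 1] != 1)
      return i - 1;
  assert(false && "expected a non-unit loop dim");
  return 0;
}
```

Under these assumptions, `linalg.index 0` reads the loop dim of size 8, so it is not loop-invariant in either example. What separates the two cases is where the value is used: as the leading index of `tensor.extract` in the gather example, and as the trailing index in the contiguous example, where loop dim 0 is also the trailing non-unit loop dim.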

Diffstat (limited to 'llvm/lib/Object/MachOObjectFile.cpp')

0 files changed, 0 insertions, 0 deletions

