author | Matthias Gehre <matthias.gehre@amd.com> | 2025-01-21 10:33:56 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-21 10:33:56 +0100 |
commit | 67b9d3ffc2104e9c718510d83e93b3d26cb0872d (patch) | |
tree | 946924c1dca6587954fa5e4d34c9d9ae0d153033 /clang/lib/Basic/Cuda.cpp | |
parent | 2a8c12b29f8dc777a62868512bed1a2dae1ef8b2 (diff) | |
[mlir] computeSliceParameters: Fix offset when m(0) != 0 (#122492)
For affine maps where `m(0) != 0`,
like `affine_map<(d0) -> (d0 + 3)>` in
```
%generic = linalg.generic
{indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
affine_map<(d0) -> (d0)>],
iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
^bb0(%in : f32, %out: f32):
linalg.yield %in : f32
} -> tensor<6xf32>
```
tiling currently computes the wrong slice offsets. When tiling the above
example with a tile size of 3, it would compute
```
scf.for %i = ...
%slice = tensor.extract_slice %arg0[%i + 3] [6] [1]
linalg.generic
{indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
affine_map<(d0) -> (d0)>],
iterator_types = ["parallel"]} ins(%slice: tensor<6xf32>)
```
and thus apply the `+3` twice (once in the extract slice and a second
time in the linalg.generic).
This PR fixes the offset computation so that tiling yields
`tensor.extract_slice %arg0[%i] [6] [1]` instead.
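The double application can be checked with a small index model. The sketch
below is plain Python, not MLIR and not the code touched by this PR; the names
`index_map`, `expected_reads` and `tiled_reads` are hypothetical and only
mirror the example above (input of 9 elements, output of 6, tile size 3). It
lists which elements of `%arg0` each tile ends up reading for a given
`tensor.extract_slice` offset.

```python
TILE_SIZE = 3    # tile size used in the example above
INPUT_LEN = 9    # tensor<9xf32>
OUTPUT_LEN = 6   # tensor<6xf32>


def index_map(d0):
    """Model of affine_map<(d0) -> (d0 + 3)>."""
    return d0 + 3


def expected_reads(iv):
    """Input indices the untiled op reads for iterations iv .. iv + TILE_SIZE - 1."""
    return [index_map(iv + d) for d in range(TILE_SIZE)]


def tiled_reads(iv, slice_offset):
    """Input indices one tile reads: the tiled linalg.generic still applies
    index_map to its local index d, on top of the extract_slice offset."""
    return [slice_offset + index_map(d) for d in range(TILE_SIZE)]


if __name__ == "__main__":
    for iv in range(0, OUTPUT_LEN, TILE_SIZE):
        want = expected_reads(iv)
        old = tiled_reads(iv, slice_offset=iv + 3)  # offset before this PR
        new = tiled_reads(iv, slice_offset=iv)      # offset after this PR
        print(f"iv={iv} expected={want} before={old} after={new}")
```

In this model the old offset shifts every read by an extra 3 (and, for the
second tile, past the end of the 9-element input), while the offset `%i`
reproduces the untiled accesses.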