path: root/clang/lib/Basic/Cuda.cpp
author Matthias Gehre <matthias.gehre@amd.com> 2025-01-21 10:33:56 +0100
committer GitHub <noreply@github.com> 2025-01-21 10:33:56 +0100
commit 67b9d3ffc2104e9c718510d83e93b3d26cb0872d
tree 946924c1dca6587954fa5e4d34c9d9ae0d153033 /clang/lib/Basic/Cuda.cpp
parent 2a8c12b29f8dc777a62868512bed1a2dae1ef8b2
[mlir] computeSliceParameters: Fix offset when m(0) != 0 (#122492)
For affine maps where `m(0) != 0`, like `affine_map<(d0) -> (d0 + 3)>` in

```
%generic = linalg.generic
    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                      affine_map<(d0) -> (d0)>],
     iterator_types = ["parallel"]}
    ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
  ^bb0(%in : f32, %out: f32):
    linalg.yield %in : f32
} -> tensor<6xf32>
```

tiling currently computes the wrong slice offsets. When tiling the above example with a size of 3, it would compute

```
scf.for %i = ...
  %slice = tensor.extract_slice %arg0[%i + 3] [6] [1]
  linalg.generic
      {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                        affine_map<(d0) -> (d0)>],
       iterator_types = ["parallel"]}
      ins(%slice: tensor<6xf32>)
```

and thus apply the `+3` twice (once in the `tensor.extract_slice` and a second time in the `linalg.generic`).

This PR fixes the offset computation so that tiling yields `tensor.extract_slice %arg0[%i] [6] [1]` instead.
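
The double application of the `+3` can be illustrated with a small plain-Python model of the example. This is only a sketch and not the MLIR implementation: `reference`, `tiled`, and the `slice_offset` callback are hypothetical names, the tensors are modeled as lists, and out-of-bounds reads are reported as `None` as an assumption made for illustration.

```python
def reference(arg0):
    # Untiled semantics of the example: out[d0] = arg0[d0 + 3] for d0 in [0, 6).
    return [arg0[d0 + 3] for d0 in range(6)]


def tiled(arg0, tile_size, slice_offset):
    # Model of the tiled loop. slice_offset(i) plays the role of the
    # tensor.extract_slice offset; the tiled op keeps the map d0 -> d0 + 3.
    out = []
    for i in range(0, 6, tile_size):
        start = slice_offset(i)
        # Slice of size tile_size + 3 (6 for tile size 3), as in the example.
        # Python slicing clamps at the end of the list instead of failing.
        window = arg0[start:start + tile_size + 3]
        for d0 in range(tile_size):
            idx = d0 + 3  # the op's own indexing map still adds 3
            out.append(window[idx] if idx < len(window) else None)
    return out


arg0 = list(range(9))

# Offset %i + 3: the +3 is applied in the slice and again in the op.
buggy = tiled(arg0, 3, lambda i: i + 3)

# Offset %i: the +3 is applied only once, by the op's indexing map.
fixed = tiled(arg0, 3, lambda i: i)
```

In this model, `fixed` matches `reference(arg0)`, while `buggy` reads elements shifted by a further 3 and runs past the end of the input in the second tile.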
Diffstat (limited to 'clang/lib/Basic/Cuda.cpp')
0 files changed, 0 insertions, 0 deletions