author: Vivian Zhang <zhyuhang88@gmail.com> 2025-07-29 09:58:30 -0700
committer: GitHub <noreply@github.com> 2025-07-29 09:58:30 -0700
commit: dc6d7f0637e7c80e39e8b7f0e8b61515b4961b0f
tree: 862ca7d4aeff1b0c77d50734b4b2352e2e01264c /llvm/lib/CodeGen/TargetLoweringBase.cpp
parent: 8a1b252a994dee0c30238f2e6c07516ec523cb70
[mlir][linalg] Fix padding shape computation in PadTilingInterface for convs (#149576)
This PR fixes the computation of padded shapes for convolution-style affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the code used the direct sum of the loop upper bounds, leading to over-padding.

For example, when padding only the c dimension of a `conv_2d_nhwc_fhwc` op to a multiple of 16, the transform also incorrectly pads the convolved dimensions and generates the wrong input shape:

```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```

The new implementation uses the maximum accessed index of each loop (its upper bound minus 1) as the input to the affine map, and adds 1 after aggregating all the terms to get the final padded size.

Fixes https://github.com/llvm/llvm-project/issues/148679.
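The difference between the two computations can be illustrated with a small Python sketch. This is not the C++ code from `PadTilingInterface`; it is a simplified model that assumes each operand dimension is a sum of `coeff * loop_dim` terms with non-negative coefficients. The names `padded_size_old`, `padded_size_new`, and `LOOP_BOUNDS`, and the loop ordering used here, are hypothetical and chosen only for illustration.

```python
# Simplified model of the padded-size computation for one operand dimension.
# An affine expression such as d1 + d4 is a list of (coeff, loop_dim) terms.
# Assumption: all coefficients are non-negative.

def padded_size_old(terms, loop_bounds):
    """Previous behaviour: evaluate the map on the loop upper bounds directly."""
    return sum(coeff * loop_bounds[dim] for coeff, dim in terms)

def padded_size_new(terms, loop_bounds):
    """Fixed behaviour: evaluate the map on the maximum accessed index of each
    loop (upper bound - 1), then add 1 once after summing all terms."""
    max_index = sum(coeff * (loop_bounds[dim] - 1) for coeff, dim in terms)
    return max_index + 1

# Loop bounds of the conv_2d_nhwc_fhwc example after padding c from 4 to 16.
# Hypothetical loop order: (n, oh, ow, f, kh, kw, c).
LOOP_BOUNDS = [1, 14, 14, 16, 3, 3, 16]

# Input operand indexing: (n, oh + kh, ow + kw, c), unit strides and dilations.
INPUT_MAP = [
    [(1, 0)],          # n
    [(1, 1), (1, 4)],  # oh + kh
    [(1, 2), (1, 5)],  # ow + kw
    [(1, 6)],          # c
]

old_shape = [padded_size_old(t, LOOP_BOUNDS) for t in INPUT_MAP]
new_shape = [padded_size_new(t, LOOP_BOUNDS) for t in INPUT_MAP]
print(old_shape)  # [1, 17, 17, 16]
print(new_shape)  # [1, 16, 16, 16]
```

With the old rule the input is over-padded to 1x17x17x16, matching the incorrect IR in the description. Under the new rule the convolved dimensions keep their original size of 16 and only the c dimension grows from 4 to 16.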