aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/Frontend/CompilerInstance.cpp
diff options
context:
space:
mode:
authorAndrzej WarzyƄski <andrzej.warzynski@arm.com>2025-04-24 18:05:41 +0100
committerGitHub <noreply@github.com>2025-04-24 18:05:41 +0100
commit2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc (patch)
tree6e183f9c0e53f2f3d6bca16096c01ed3ea277bd0 /clang/lib/Frontend/CompilerInstance.cpp
parente78b763568e47e685926614195c3075afa35668c (diff)
downloadllvm-2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc.zip
llvm-2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc.tar.gz
llvm-2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc.tar.bz2
[mlir][vector] Fix emulation of "narrow" type `vector.store` (#133231)
Below are two examples of "narrow" `vector.stores`. The first example does not require partial stores and hence no RMW stores. This is currently emulated correctly. ```mlir func.func @example_1(%arg0: vector<4xi2>) { %0 = memref.alloc() : memref<13xi2> %c4 = arith.constant 4 : index vector.store %arg0, %0[%c4] : memref<13xi2>, vector<4xi2> return } ``` The second example requires a partial (and hence RMW) store due to the offset pointing outside the emulated type boundary (`%c3`). ```mlir func.func @example_2(%arg0: vector<4xi2>) { %0 = memref.alloc() : memref<13xi2> %c3 = arith.constant 3 : index vector.store %arg0, %0[%c3] : memref<13xi2>, vector<4xi2> return } ``` This is currently incorrectly emulated as a single "full" store (note that the offset is incorrect) instead of partial stores: ```mlir func.func @example_2(%arg0: vector<4xi2>) { %alloc = memref.alloc() : memref<4xi8> %0 = vector.bitcast %arg0 : vector<4xi2> to vector<1xi8> %c0 = arith.constant 0 : index vector.store %0, %alloc[%c0] : memref<4xi8>, vector<1xi8> return } ``` The incorrect emulation stems from this simplified (i.e. incomplete) calculation of the front padding: ```cpp std::optional<int64_t> foldedNumFrontPadElems = isDivisibleInSize ? 0 : getConstantIntValue(linearizedInfo.intraDataOffset); ``` Since `isDivisibleInSize` is `true` (i8 / i2 = 4): * front padding is set to `0` and, as a result, * the input offset (`%c3`) is ignored, and * we incorrectly assume that partial stores won't be needed. Note that in both examples we are storing `vector<4xi2>` into `memref<13xi2>` (note _different_ trailing dims) and hence partial stores might in fact be required. The condition above is updated to: ```cpp std::optional<int64_t> foldedNumFrontPadElems = (isDivisibleInSize && trailingDimsMatch) ? 0 : getConstantIntValue(linearizedInfo.intraDataOffset); ``` This change ensures that the input offset is properly taken into account, which fixes the issue. It doesn't affect `@example1`. Additional comments are added to clarify the current logic.
Diffstat (limited to 'clang/lib/Frontend/CompilerInstance.cpp')
0 files changed, 0 insertions, 0 deletions