author | Charitha Saumya <136391709+charithaintc@users.noreply.github.com> | 2025-07-14 15:41:56 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-14 15:41:56 -0700 |
commit | 244ebef1ddbc6a77f17c36562d4a4292654940a6 (patch) | |
tree | 69055b6d371fa9a9d7597f4843831e052f370aa7 /clang/lib/Frontend/CompilerInvocation.cpp | |
parent | 5277021c3c75a19a3db5e097e4b4e73eeb1f8ffa (diff) | |
Reapply [mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results. (#148313)
Reapply attempt for https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in:
https://lab.llvm.org/buildbot/#/builders/116/builds/15477
-----
The crash is caused by a mismatch between the distributed type returned by
`getDistributedType` and the intended distributed type for the forOp results.
Solution diff:
https://github.com/llvm/llvm-project/commit/20c2cf67662c3b3fdecf95a0e280809f98d8db50
Example:
```
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
  %c128 = arith.constant 128 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
    %ini = "some_def"() : () -> (vector<1xf32>)
    %0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
      %1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
      scf.yield %1 : vector<1xf32>
    }
    gpu.yield %0 : vector<1xf32>
  }
  return %2 : vector<1xf32>
}
```
In this case the distributed type for the forOp result is `vector<1xf32>`
(the result is not distributed; it is broadcast to all lanes instead).
However, `getDistributedType` returns a null type here.
Therefore, if the distributed type can be recovered from the warpOp, we
should always do that first before using `getDistributedType`.
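The lookup order described above can be sketched with a small stand-alone C++ model. This is not the MLIR code from the patch: `Shape`, `toyGetDistributedType`, and `resolveDistributedType` are hypothetical names, and the toy distribution rule (divide the first dimension that is a multiple of the warp size) is an assumption made only to reproduce the "null type" outcome for `vector<1xf32>` with a warp size of 32.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Toy stand-in for a vector type: only the shape matters here.
using Shape = std::vector<int64_t>;

// Hypothetical simplification of `getDistributedType`: divide the first
// dimension that is a multiple of the warp size. std::nullopt plays the
// role of the null type returned when no dimension can be distributed.
std::optional<Shape> toyGetDistributedType(const Shape &shape,
                                           int64_t warpSize) {
  assert(warpSize > 0 && "warp size must be positive");
  Shape result = shape;
  for (int64_t &dim : result) {
    if (dim % warpSize == 0) {
      dim /= warpSize;
      return result;
    }
  }
  return std::nullopt;
}

// Prefer the type already known from the warp op result (present when the
// loop result is yielded by the warp op); compute one only as a fallback.
std::optional<Shape>
resolveDistributedType(const std::optional<Shape> &typeFromWarpOp,
                       const Shape &sequentialShape, int64_t warpSize) {
  if (typeFromWarpOp)
    return typeFromWarpOp;
  return toyGetDistributedType(sequentialShape, warpSize);
}
```

In this model, calling `resolveDistributedType(Shape{1}, Shape{1}, 32)` keeps the broadcast shape recorded on the warp op, whereas calling `toyGetDistributedType(Shape{1}, 32)` directly yields no type, which mirrors the mismatch behind the crash.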