path: root/clang/unittests/Format/FormatTestJava.cpp
author MingYan <99472920+NexMing@users.noreply.github.com> 2025-03-31 16:13:46 +0800
committer GitHub <noreply@github.com> 2025-03-31 09:13:46 +0100
commit 5c65a321778b99f745d193629975fb6ced34fe07
tree cd0880fd36544c160c70d041e258bbd8b439cc76 /clang/unittests/Format/FormatTestJava.cpp
parent 842b57b77520abf202999946d3bb01b5dcabb179
[RISCV] Vectorize phi for loop carried @llvm.vp.reduce.* (#131974)
LLVM vector predication reduction intrinsics return a scalar result, but on RISC-V, vector reduction instructions write the result into the first element of a vector register. So when a reduction in a loop uses a scalar phi, we end up with unnecessary scalar moves:

```asm
loop:
    vmv.s.x v8, zero
    vredsum.vs v8, v10, v8
    vmv.x.s a0, v8
```

This mainly affects vector predication reductions. This patch tries to vectorize any scalar phis that feed into a vector predication reduction in RISCVCodeGenPrepare, converting:

```llvm
vector.body:
  %red.phi = phi i32 [ ..., %entry ], [ %red, %vector.body ]
  %red = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %red.phi, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
```

to

```llvm
vector.body:
  %red.phi = phi <vscale x 2 x i32> [ ..., %entry ], [ %acc.vec, %vector.body ]
  %phi.scalar = extractelement <vscale x 2 x i32> %red.phi, i64 0
  %acc = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %phi.scalar, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
  %acc.vec = insertelement <vscale x 2 x i32> poison, i32 %acc, i64 0
```

This eliminates the scalar -> vector -> scalar crossing during instruction selection.

---------

Co-authored-by: yanming <ming.yan@terapines.com>
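To illustrate why the rewrite preserves the result, here is a minimal, self-contained C++ sketch that models both loop shapes on plain integers. It does not use the LLVM API and is not the code in RISCVCodeGenPrepare. The names `vpReduceAdd`, `reduceScalarPhi`, and `reduceVectorPhi` are hypothetical, the two-lane `std::vector` merely stands in for a scalable vector register, and poison lanes are modelled as zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a vp.reduce.add call with an all-true mask:
// returns start plus the sum of the first evl lanes.
int32_t vpReduceAdd(int32_t start, const std::vector<int32_t> &lanes,
                    std::size_t evl) {
  int32_t acc = start;
  for (std::size_t i = 0; i < evl && i < lanes.size(); ++i)
    acc += lanes[i];
  return acc;
}

// Before: the loop-carried accumulator is a scalar phi.
int32_t reduceScalarPhi(const std::vector<std::vector<int32_t>> &chunks) {
  int32_t redPhi = 0; // incoming value from the entry block (assumed 0 here)
  for (const auto &chunk : chunks)
    redPhi = vpReduceAdd(redPhi, chunk, chunk.size());
  return redPhi;
}

// After: the loop-carried accumulator is a vector phi and only lane 0
// carries a meaningful value.
int32_t reduceVectorPhi(const std::vector<std::vector<int32_t>> &chunks) {
  std::vector<int32_t> redPhi(2, 0); // lane 0 holds the start value
  for (const auto &chunk : chunks) {
    int32_t phiScalar = redPhi[0]; // models extractelement ..., i64 0
    int32_t acc = vpReduceAdd(phiScalar, chunk, chunk.size());
    std::vector<int32_t> accVec(2, 0); // poison lanes modelled as 0
    accVec[0] = acc;                   // models insertelement poison, acc, 0
    redPhi = accVec;
  }
  return redPhi[0];
}
```

Both functions return the same sum for any input, because the extract and insert on lane 0 cancel out across iterations. On RISC-V that extract/insert pair can stay in a vector register, which is the point of keeping the phi as a vector.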
Diffstat (limited to 'clang/unittests/Format/FormatTestJava.cpp')
0 files changed, 0 insertions, 0 deletions