rocket-tools/riscv-gnu-toolchain/llvm.git

author	Stanley Winata <stanley@nod-labs.com>	2023-08-24 15:30:00 -0700
committer	Anush Elangovan <anush@nod-labs.com>	2023-08-24 17:35:34 -0700
commit	1896096002b75b50d46ee0043c20e90c7e27604a (patch)
tree	7de9679ac5b780f1fc9953185b27a57207e7060f /llvm/lib/CodeGen/MachineTraceMetrics.cpp
parent	e75f240a0432d827c28a5d77fad26a099ceb7a72 (diff)

[mlir][ROCM] Add Wave/Warp shuffle lowering and op for ROCM.

Reductions are heavily used in many DL workloads, especially those with softmax/Attention layers. Wave/Warp shuffles are known to be a fast and efficient way to perform these reductions. In this patch we introduce AMD shuffle intrinsic ops to ROCDL, along with their corresponding lowering from gpu.shuffle. This should speed up many DL workloads on the ROCM backend.

Currently, we support the xor and idx modes, which are the more common ones. In the future, we plan to add support for Down and Up, as well as to use ds_swizzle to further improve performance when width and offsets are constant.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D158684
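To illustrate why an xor shuffle is enough to build a reduction, here is a minimal CPU-side sketch in Python of a butterfly sum across the lanes of a wave. It is not the ROCDL lowering from this patch; `shuffle_xor` and `butterfly_sum` are hypothetical helper names, a power-of-two `width` is assumed, and returning a lane's own value for an out-of-range partner is a simplifying assumption of this sketch.

```python
def shuffle_xor(values, offset, width):
    """Simulate an xor shuffle: lane i reads the value held by lane i ^ offset.

    Lanes are grouped into segments of `width` lanes. If the partner lane
    falls outside the lane's own segment, this sketch returns the lane's own
    value (an assumption made here for simplicity).
    """
    out = []
    for lane, own in enumerate(values):
        src = lane ^ offset
        seg_start = (lane // width) * width
        in_range = seg_start <= src < seg_start + width and src < len(values)
        out.append(values[src] if in_range else own)
    return out


def butterfly_sum(values, width):
    """Sum within each `width`-lane segment using log2(width) xor shuffles.

    After the loop, every lane of a segment holds that segment's total.
    Assumes `width` is a power of two.
    """
    vals = list(values)
    offset = width // 2
    while offset >= 1:
        partner = shuffle_xor(vals, offset, width)
        vals = [a + b for a, b in zip(vals, partner)]
        offset //= 2
    return vals


# A 64-lane wave where lane i holds the value i.
wave = list(range(64))
full = butterfly_sum(wave, 64)   # one segment of 64 lanes
halves = butterfly_sum(wave, 32) # two independent segments of 32 lanes
```

Each step halves the xor offset, so a 64-lane wave needs six shuffles, and no shared memory round trip is modelled at any point.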

Diffstat (limited to 'llvm/lib/CodeGen/MachineTraceMetrics.cpp')

0 files changed, 0 insertions, 0 deletions

