author: Stanley Winata <stanley@nod-labs.com> 2023-08-24 15:30:00 -0700
committer: Anush Elangovan <anush@nod-labs.com> 2023-08-24 17:35:34 -0700
commit: 1896096002b75b50d46ee0043c20e90c7e27604a (patch)
tree: 7de9679ac5b780f1fc9953185b27a57207e7060f /llvm/lib/CodeGen/MachineTraceMetrics.cpp
parent: e75f240a0432d827c28a5d77fad26a099ceb7a72 (diff)
[mlir][ROCM] Add Wave/Warp shuffle lowering and op for ROCM.
Reductions are heavily used in many DL workloads, especially those with softmax/attention layers. Wave/warp shuffles are known to be a fast, efficient way to perform these reductions. This patch introduces AMD shuffle intrinsic ops to ROCDL, along with their corresponding lowering from gpu.shuffle. This should speed up many DL workloads on the ROCM backend. Currently, xor and idx are supported, since they are the more common modes. In the future, we plan to add support for down and up, and to use ds_swizzle to further improve performance when the width and offsets are constant.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D158684
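To illustrate why xor shuffles suit reductions, here is a minimal host-side sketch that simulates a wave as a Python list, one entry per lane. It is not the ROCDL lowering from this patch. The helper names (`shuffle_xor`, `shuffle_idx`, `wave_reduce_sum`) are hypothetical, and the sketch assumes `width` is a power of two and that lanes exchange values only within their own width-sized segment.

```python
def shuffle_xor(values, offset, width):
    """Simulated xor shuffle: lane i reads the value held by lane (i ^ offset).

    Assumption: a lane keeps its own value if the source lane falls outside
    its width-sized segment or outside the wave.
    """
    n = len(values)
    out = []
    for lane in range(n):
        src = lane ^ offset
        in_segment = (src // width) == (lane // width)
        out.append(values[src] if in_segment and src < n else values[lane])
    return out


def shuffle_idx(values, index, width):
    """Simulated idx shuffle: every lane reads lane `index` of its own segment."""
    n = len(values)
    out = []
    for lane in range(n):
        base = (lane // width) * width  # first lane of this segment
        src = base + (index % width)
        out.append(values[src] if src < n else values[lane])
    return out


def wave_reduce_sum(values, width):
    """Butterfly reduction: log2(width) xor-shuffle steps.

    After the loop, every lane holds the sum of its segment.
    """
    vals = list(values)
    offset = width // 2
    while offset >= 1:
        partner = shuffle_xor(vals, offset, width)
        vals = [a + b for a, b in zip(vals, partner)]
        offset //= 2
    return vals


if __name__ == "__main__":
    lanes = list(range(8))
    print(wave_reduce_sum(lanes, 8))
    print(wave_reduce_sum(lanes, 4))
    print(shuffle_idx(lanes, 1, 4))
```

The butterfly pattern needs only log2(width) exchange steps and leaves the result in every lane, which is why the xor mode is a common building block for softmax-style reductions. The idx mode covers the broadcast case, where all lanes read one chosen lane.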
Diffstat (limited to 'llvm/lib/CodeGen/MachineTraceMetrics.cpp')
0 files changed, 0 insertions, 0 deletions