author | Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> | 2023-03-01 16:29:03 -0800 |
---|---|---|
committer | Jeffrey Byrnes <Jeffrey.Byrnes@amd.com> | 2023-03-03 13:18:25 -0800 |
commit | b89236a96f2f2f3e9b88d198585a8eda7fb2c443 | |
tree | bca48672c2c1615c41079c389f6fbfbc8e5f1bed /lldb/source/Plugins/Process/scripted/ScriptedProcess.cpp | |
parent | 7442f8635b4d7363b07152e5304c9a0c660eead4 | |
[AMDGPU] Vectorize misaligned global loads & stores

Based on experiments on gfx906, gfx908, gfx90a, and gfx1030, wider global loads and stores outperform multiple narrower ones regardless of alignment. This is especially true when combining 8-bit loads and stores, where the speedup was usually 2x across all alignments.

Differential Revision: https://reviews.llvm.org/D145170
Change-Id: I6ee6c76e6ace7fc373cc1b2aac3818fc1425a0c1
Diffstat (limited to 'lldb/source/Plugins/Process/scripted/ScriptedProcess.cpp')
0 files changed, 0 insertions, 0 deletions