rocket-tools/riscv-gnu-toolchain/llvm.git

author	Zhuoran Yin <zhuoryin@amd.com>	2025-04-15 16:36:25 -0400
committer	GitHub <noreply@github.com>	2025-04-15 16:36:25 -0400
commit	2b983a24583dd4e131d727717872a56712b5dd52 (patch)
tree	6c218367976ec3c4e121fa13c5958dcb9765ac40 /clang/lib/Frontend/TestModuleFileExtension.cpp
parent	85eb44e304e0a0a7da78448ceee60fdfec235edb (diff)

[MLIR][AMDGPU] Adding dynamic size check to avoid subword buffer load (#135014)

Motivation: the amdgpu buffer load instruction returns all zeros when loading sub-word values. For example, if the buffer size is exactly one word and we invoke `llvm.amdgcn.raw.ptr.buffer.load.v2i32` starting from byte 2 of the word, we do not receive the actual value of the buffer but all zeros for the first word, because the load crosses the buffer boundary within the first word.

This PR fixes the problem by adding a bounds check around the buffer load instruction. The check compares offset + vector size against the buffer size to see whether the upper bound of the address exceeds the buffer size. If it does, the masked transfer read is optimized to `vector.load` + `arith.select`; otherwise, it falls back to the default lowering of the masked vector load.
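The comparison described above can be illustrated with a minimal sketch. This is not the code from the PR: the function name, the parameter names, and the choice of bytes as the unit are all assumptions made for illustration, and the sketch covers only the offset + vector size comparison, not the lowering that follows from it.

```python
# Illustrative sketch only; names and byte units are hypothetical,
# not taken from the MLIR AMDGPU implementation.

def upper_bound_exceeds_buffer(offset_bytes: int,
                               vector_bytes: int,
                               buffer_bytes: int) -> bool:
    """Return True if a load of `vector_bytes` starting at `offset_bytes`
    would reach past the end of a buffer of `buffer_bytes`."""
    return offset_bytes + vector_bytes > buffer_bytes


# The example from the commit message, assuming a 4-byte word:
# a one-word buffer, a v2i32 load (2 * 4 bytes), starting at byte 2.
crosses = upper_bound_exceeds_buffer(offset_bytes=2,
                                     vector_bytes=8,
                                     buffer_bytes=4)

# A load that fits exactly inside the buffer does not cross the boundary.
fits = not upper_bound_exceeds_buffer(offset_bytes=0,
                                      vector_bytes=4,
                                      buffer_bytes=4)
```

In the PR the check is dynamic, meaning the comparison is emitted as IR and evaluated at run time; the sketch evaluates it eagerly only to show the arithmetic.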

Diffstat (limited to 'clang/lib/Frontend/TestModuleFileExtension.cpp')

0 files changed, 0 insertions, 0 deletions

