author Zhuoran Yin <zhuoryin@amd.com> 2025-04-17 08:50:31 -0400
committer GitHub <noreply@github.com> 2025-04-17 08:50:31 -0400
commit 47f4f39265b31e2249536b74d33d63508cdfb457
tree 6e97eb043ef32ee63505d24932cc7be64e011867 /clang/lib/CodeGen/CodeGenModule.cpp
parent 5a993558c5fa52c605d984e0effdc1cd3b452476
[MLIR][AMDGPU] Fixing word alignment check for bufferload fastpath (#135982)
The condition `delta_bytes % (32 ceilDiv elementBitwidth) != 0` in https://github.com/llvm/llvm-project/pull/135014 is incorrect. For example, suppose the last load is issued to load only a single trailing fp16 element. Then `delta_bytes = 2` and `(32 ceilDiv 16) = 2`, so `2 % 2 == 0` and the access is judged to be word aligned. It is sent to the fast path but returns all zeros for the fp16, because it crosses the word boundary. The check should simply be `delta_bytes % 4 != 0`, since a word is 4 bytes. This PR fixes the bug by changing the modulus to 4.
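The arithmetic in the fp16 example can be checked with a small standalone sketch. The helper names below (`ceilDiv`, `oldIsMisaligned`, `newIsMisaligned`) are hypothetical and exist only for illustration; the actual lowering emits the comparison as IR rather than calling functions like these. The sketch assumes, as the message implies, that a nonzero remainder means "not word aligned".

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the names used in the patch.
constexpr std::int64_t ceilDiv(std::int64_t a, std::int64_t b) {
  return (a + b - 1) / b;
}

// Old condition: delta_bytes % (32 ceilDiv elementBitwidth) != 0.
// Returns true when the access is treated as NOT word aligned.
constexpr bool oldIsMisaligned(std::int64_t deltaBytes,
                               std::int64_t elementBitwidth) {
  return deltaBytes % ceilDiv(32, elementBitwidth) != 0;
}

// Fixed condition: delta_bytes % 4 != 0, since a word is 4 bytes.
constexpr bool newIsMisaligned(std::int64_t deltaBytes) {
  return deltaBytes % 4 != 0;
}

// fp16 case from the commit message: delta_bytes = 2, elementBitwidth = 16.
static_assert(ceilDiv(32, 16) == 2, "32 ceilDiv 16 is 2");
static_assert(!oldIsMisaligned(2, 16), "old check wrongly reports word aligned");
static_assert(newIsMisaligned(2), "fixed check reports not word aligned");
```

With the old modulus the 2-byte delta passes as aligned and would take the fast path; with a modulus of 4 it is rejected, while a delta of 4 bytes is still accepted.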
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')
0 files changed, 0 insertions, 0 deletions