path: root/llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
author Harrison Hao <57025411+harrisonGPU@users.noreply.github.com> 2025-06-18 09:00:07 +0800
committer GitHub <noreply@github.com> 2025-06-18 09:00:07 +0800
commit 0defde8e06338cbe968d55d1d9e8581d55f3ae2b
tree 0479c4c2510f02650a4afd9c1d796b55e4ca155a /llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
parent 86a09f36154fbd264f61ea6462c8cf48b1ff2eb0
[AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (#141758)
Currently, D16 folding is supported only for `image.sample` instructions with a single user: an `fptrunc` to half. However, D16 folding can also be supported for `image.sample` instructions with multiple users, as long as each user follows the pattern of `extractelement` followed by `fptrunc` to half. For example:

```
%sample = call <4 x float> @llvm.amdgcn.image.sample
%e0 = extractelement <4 x float> %sample, i32 0
%h0 = fptrunc float %e0 to half
%e1 = extractelement <4 x float> %sample, i32 1
%h1 = fptrunc float %e1 to half
%e2 = extractelement <4 x float> %sample, i32 2
%h2 = fptrunc float %e2 to half
```

This change enables D16 folding for such cases and avoids generating `v_cvt_f16_f32_e32` instructions.
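
The user-pattern condition described above can be illustrated with a small, self-contained sketch. This is not the code from the patch: `ToyInst` and `allUsersFoldToD16` are hypothetical names, the toy structure stands in for LLVM's real instruction classes, and the requirement that each `extractelement` has exactly one user is an assumption made for the illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical toy model of an IR instruction; NOT llvm::Instruction.
struct ToyInst {
    std::string opcode;                 // e.g. "extractelement", "fptrunc"
    std::string resultType;             // e.g. "float", "half"
    std::vector<const ToyInst*> users;  // instructions that use this result
};

// Returns true when every user of `sample` is an extractelement that is
// itself used only by an fptrunc to half (single-use is an assumption here).
bool allUsersFoldToD16(const ToyInst& sample) {
    if (sample.users.empty())
        return false;  // nothing to fold into
    for (const ToyInst* ext : sample.users) {
        if (ext->opcode != "extractelement" || ext->users.size() != 1)
            return false;
        const ToyInst* trunc = ext->users.front();
        if (trunc->opcode != "fptrunc" || trunc->resultType != "half")
            return false;
    }
    return true;
}
```

In this model, a sample whose users are all `extractelement` then `fptrunc ... to half` chains passes the check, while a single user of any other kind (for example a direct `fadd` on the extracted float) makes it fail, so the sample would keep its 32-bit result.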
Diffstat (limited to 'llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp')
0 files changed, 0 insertions, 0 deletions