riscv-gnu-toolchain/llvm.git

author	Harrison Hao <57025411+harrisonGPU@users.noreply.github.com>	2025-06-18 09:00:07 +0800
committer	GitHub <noreply@github.com>	2025-06-18 09:00:07 +0800
commit	0defde8e06338cbe968d55d1d9e8581d55f3ae2b (patch)
tree	0479c4c2510f02650a4afd9c1d796b55e4ca155a /llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
parent	86a09f36154fbd264f61ea6462c8cf48b1ff2eb0 (diff)

[AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (#141758)

Currently, we only support D16 folding for `image.sample` instructions with a single user: an `fptrunc` to half. However, we can also support D16 folding for `image.sample` instructions with multiple users, as long as each user follows the pattern of `extractelement` followed by `fptrunc` to half.

For example:
```
%sample = call <4 x float> @llvm.amdgcn.image.sample
%e0 = extractelement <4 x float> %sample, i32 0
%h0 = fptrunc float %e0 to half
%e1 = extractelement <4 x float> %sample, i32 1
%h1 = fptrunc float %e1 to half
%e2 = extractelement <4 x float> %sample, i32 2
%h2 = fptrunc float %e2 to half
```

This change enables D16 folding for such cases and avoids generating `v_cvt_f16_f32_e32` instructions.
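The condition described above can be sketched as a small standalone model. This is not the code from the patch and does not use LLVM's classes. `Node`, `Graph`, `buildSample`, and `canFoldToD16` are hypothetical names invented for illustration, and the sketch assumes each `extractelement` has exactly one user, which the commit message does not state explicitly.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM class).
enum class Op { ImageSample, ExtractElement, FPTruncToHalf, Other };

struct Node {
    Op op;
    std::vector<const Node*> users;  // instructions that consume this value
};

// Arena that owns the nodes; std::deque keeps element addresses stable
// across push_back, so raw pointers into it stay valid.
struct Graph {
    std::deque<Node> nodes;
    Node* add(Op op) {
        nodes.push_back(Node{op, {}});
        return &nodes.back();
    }
};

// True when the value has exactly one user and that user is fptrunc-to-half.
bool onlyUserIsFPTruncToHalf(const Node& n) {
    return n.users.size() == 1 && n.users[0]->op == Op::FPTruncToHalf;
}

// Hypothetical legality check modeled on the commit message:
//  - old case: the sample's single user is an fptrunc to half;
//  - new case: every user is an extractelement feeding an fptrunc to half.
bool canFoldToD16(const Node& sample) {
    if (sample.op != Op::ImageSample || sample.users.empty())
        return false;
    if (onlyUserIsFPTruncToHalf(sample))
        return true;
    for (const Node* user : sample.users) {
        if (user->op != Op::ExtractElement || !onlyUserIsFPTruncToHalf(*user))
            return false;
    }
    return true;
}

// Builds a sample with `lanes` extractelement users, each consumed by a
// single instruction whose opcode is `laneUser`.
const Node* buildSample(Graph& g, int lanes, Op laneUser) {
    assert(lanes >= 0);
    Node* sample = g.add(Op::ImageSample);
    for (int i = 0; i < lanes; ++i) {
        Node* extract = g.add(Op::ExtractElement);
        extract->users.push_back(g.add(laneUser));
        sample->users.push_back(extract);
    }
    return sample;
}
```

With three lanes that each end in an `fptrunc` to half, as in the example from the commit message, `canFoldToD16` returns true in this model. If any lane is consumed by something else, it returns false and the sample would stay at full precision.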

Diffstat (limited to 'llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp')

0 files changed, 0 insertions, 0 deletions

