author | Harrison Hao <57025411+harrisonGPU@users.noreply.github.com> | 2025-06-18 09:00:07 +0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-06-18 09:00:07 +0800 |
commit | 0defde8e06338cbe968d55d1d9e8581d55f3ae2b (patch) | |
tree | 0479c4c2510f02650a4afd9c1d796b55e4ca155a /llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp | |
parent | 86a09f36154fbd264f61ea6462c8cf48b1ff2eb0 (diff) | |
[AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (#141758)
Currently, D16 folding is only supported for `image.sample` instructions with a
single user: an `fptrunc` to half.
However, D16 folding can also be supported for `image.sample` instructions with
multiple users, as long as each user follows the pattern of an `extractelement`
followed by an `fptrunc` to half.
For example:
```
%sample = call <4 x float> @llvm.amdgcn.image.sample
%e0 = extractelement <4 x float> %sample, i32 0
%h0 = fptrunc float %e0 to half
%e1 = extractelement <4 x float> %sample, i32 1
%h1 = fptrunc float %e1 to half
%e2 = extractelement <4 x float> %sample, i32 2
%h2 = fptrunc float %e2 to half
```
This change enables D16 folding for such cases and avoids generating
`v_cvt_f16_f32_e32` instructions.
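
The eligibility condition described above can be sketched as a small
standalone model. This is only an illustration under stated assumptions, not
the code in this patch: `Inst`, `add_user`, `can_fold_to_d16`, and
`build_example` are hypothetical names, the model does not use the LLVM API,
and it covers only the multi-user case (every user is an `extractelement`, and
every user of that `extractelement` is an `fptrunc` to half).

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    """Hypothetical stand-in for an IR instruction (not an LLVM class)."""
    opcode: str
    result_type: str
    users: list = field(default_factory=list)


def add_user(producer, opcode, result_type):
    """Create an instruction that uses `producer` and record the use."""
    user = Inst(opcode, result_type)
    producer.users.append(user)
    return user


def can_fold_to_d16(sample):
    """Return True if every user chain is extractelement -> fptrunc to half.

    Assumption for this sketch: a sample with no users, or an extractelement
    with no users, is treated as not foldable.
    """
    if not sample.users:
        return False
    for extract in sample.users:
        if extract.opcode != "extractelement" or not extract.users:
            return False
        for trunc in extract.users:
            if trunc.opcode != "fptrunc" or trunc.result_type != "half":
                return False
    return True


def build_example(num_lanes=3):
    """Build the shape of the IR example above: one sample, several lanes."""
    sample = Inst("image.sample", "<4 x float>")
    for _ in range(num_lanes):
        extract = add_user(sample, "extractelement", "float")
        add_user(extract, "fptrunc", "half")
    return sample
```

In this model, `can_fold_to_d16(build_example())` holds for the three-lane
example, and attaching any other kind of user to the sample or to one of its
`extractelement` results makes the check fail, since the float value would
still be needed.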