author | Alex MacLean <amaclean@nvidia.com> | 2024-12-06 13:30:09 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-12-06 13:30:09 -0800 |
commit | 4b24ab4be9351ef822fd8fd546237eabd8c3ba57 (patch) | |
tree | 0a3cbe49488b2b86763061bc8c46958147d03d33 /lldb/test/Shell/ScriptInterpreter/Python | |
parent | 7ff89294b63f8f15c650fe314cff5c576978c489 (diff) | |
Reland "[NVPTX] Add folding for cvt.rn.bf16x2.f32" (#116417)
Reland https://github.com/llvm/llvm-project/pull/116109.
Fixes issue where operands were flipped.
Per the PTX spec, a mov instruction packs the first operand as low, and
the second operand as high:
> ```
> // pack two 16-bit elements into .b32
> d = a.x | (a.y << 16)
> ```
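As a minimal sketch of the packing rule quoted above, the following Python models a two-element `mov` pack into `.b32`. The function name `pack_mov_b32` and the masking of inputs to 16 bits are illustrative assumptions, not part of the PTX spec or the patch.

```python
def pack_mov_b32(first: int, second: int) -> int:
    """Model of `mov.b32 d, {first, second}` (hypothetical helper).

    The first operand lands in the low 16 bits and the second operand
    in the high 16 bits, i.e. d = a.x | (a.y << 16).
    """
    lo = first & 0xFFFF   # assumption: inputs are truncated to 16 bits
    hi = second & 0xFFFF
    return lo | (hi << 16)


packed = pack_mov_b32(0x1234, 0xABCD)
print(hex(packed))
```

Here `0x1234` is passed first, so it ends up in the low half of the result.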
On the other hand, cvt.rn.f16x2.f32 instructions take the high operand first, then the low
operand:
> For .f16x2 and .bf16x2 instruction type, two inputs a and b of .f32
type are converted into .f16 or .bf16 type and the converted values are
packed in the destination register d, such that the value converted from
input a is stored in the upper half of d and the value converted from
input b is stored in the lower half of d
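The operand-order mismatch can be illustrated with a small Python model. This is a sketch under stated assumptions: `f32_to_bf16_rn`, `cvt_rn_bf16x2_f32`, and `pack_mov_b32` are hypothetical helper names, the rounding is a plain round-to-nearest-even on the float32 bit pattern, and the NaN result `0x7FFF` is an assumed canonical encoding. It is not the NVPTX backend code.

```python
import math
import struct


def f32_to_bf16_rn(x: float) -> int:
    """Round a float32 value to bf16 bits (round-to-nearest-even)."""
    if math.isnan(x):
        return 0x7FFF  # assumption: canonical bf16 NaN
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias that breaks ties toward an even mantissa.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def cvt_rn_bf16x2_f32(a: float, b: float) -> int:
    """Model of `cvt.rn.bf16x2.f32 d, a, b`: a goes high, b goes low."""
    return (f32_to_bf16_rn(a) << 16) | f32_to_bf16_rn(b)


def pack_mov_b32(first: int, second: int) -> int:
    """Model of `mov.b32 d, {first, second}`: first is low, second is high."""
    return (first & 0xFFFF) | ((second & 0xFFFF) << 16)


lo_src, hi_src = 2.0, 1.0

# Unfolded form: convert each element, then pack as {low, high}.
unfolded = pack_mov_b32(f32_to_bf16_rn(lo_src), f32_to_bf16_rn(hi_src))

# Folded form: the packed convert must receive (high, low).
folded = cvt_rn_bf16x2_f32(hi_src, lo_src)

# Passing the operands in mov order reproduces the flipped result.
flipped = cvt_rn_bf16x2_f32(lo_src, hi_src)

print(hex(unfolded), hex(folded), hex(flipped))
```

In this model the folded form matches the unfolded one only when the operands are swapped relative to the `mov` order, which is the kind of flip the reland describes fixing.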