riscv-gnu-toolchain/llvm.git

author	peterbell10 <peterbell10@openai.com>	2025-01-16 14:53:24 +0000
committer	GitHub <noreply@github.com>	2025-01-16 14:53:24 +0000
commit	5e5fd0e6fc50cc1198750308c11433a5b3acfd0f (patch)
tree	916b8b41a46a02df791ad5dc1bc8109b8274b719 /lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h
parent	9033e0c2d22c9f247eccea50ae8c975eb3468ac1 (diff)
download	llvm-5e5fd0e6fc50cc1198750308c11433a5b3acfd0f.zip llvm-5e5fd0e6fc50cc1198750308c11433a5b3acfd0f.tar.gz llvm-5e5fd0e6fc50cc1198750308c11433a5b3acfd0f.tar.bz2

[NVPTX] Select bfloat16 add/mul/sub as fma on SM80 (#121065)

SM80 has fma for bfloat16 but not add/mul/sub. Currently these ops incur a promotion to f32, but we can avoid this by writing them in terms of the fma:
```
FADD(a, b) -> FMA(a, 1.0, b)
FMUL(a, b) -> FMA(a, b, -0.0)
FSUB(a, b) -> FMA(b, -1.0, a)
```
Unfortunately there is no `fma.ftz` so when ftz is enabled, we still fall back to promotion.
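The three rewrites can be sanity-checked outside the compiler. The following is a minimal sketch, not code from the patch: it uses Python floats (binary64) rather than bfloat16, and the helper names (`fma_unfused`, `fadd`, `fmul`, `fsub`, `same`, `identities_hold`) are hypothetical. The multiply-add here is unfused, which is assumed to be an adequate stand-in for these particular forms because multiplying by 1.0 or -1.0 is exact and adding a zero does not change a nonzero product. It also illustrates why the FMUL addend is `-0.0`: adding `+0.0` to a product of `-0.0` would flip the sign of the zero.

```python
import math


def fma_unfused(a, b, c):
    # Stand-in for a hardware fma: multiply, then add (two roundings).
    # For the forms below the intermediate product is exact or the addend
    # is a zero, so fusing would not change the result.
    return a * b + c


def fadd(a, b):
    return fma_unfused(a, 1.0, b)    # FADD(a, b) -> FMA(a, 1.0, b)


def fmul(a, b):
    return fma_unfused(a, b, -0.0)   # FMUL(a, b) -> FMA(a, b, -0.0)


def fsub(a, b):
    return fma_unfused(b, -1.0, a)   # FSUB(a, b) -> FMA(b, -1.0, a)


def same(x, y):
    # Equality that distinguishes +0.0 from -0.0 and treats NaN as equal to NaN.
    if math.isnan(x) or math.isnan(y):
        return math.isnan(x) and math.isnan(y)
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)


def identities_hold(values):
    # Compare each rewrite against the plain operator for every pair of inputs.
    for a in values:
        for b in values:
            if not same(fadd(a, b), a + b):
                return False
            if not same(fmul(a, b), a * b):
                return False
            if not same(fsub(a, b), a - b):
                return False
    return True


# Sample inputs, including signed zeros and infinities (illustrative only).
samples = [0.0, -0.0, 1.5, -2.25, 1e-3, 3.0e38, math.inf, -math.inf]

if __name__ == "__main__":
    print("identities hold:", identities_hold(samples))
    # A +0.0 addend would lose the sign of a negative-zero product.
    print("sign with +0.0 addend:", math.copysign(1.0, fma_unfused(-0.0, 1.5, 0.0)))
    print("sign with -0.0 addend:", math.copysign(1.0, fmul(-0.0, 1.5)))
```

This only checks the algebra of the rewrites, including signed-zero and NaN behaviour. It says nothing about bfloat16 rounding or about flush-to-zero mode, where, per the commit message, promotion is still used.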

Diffstat (limited to 'lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h')

0 files changed, 0 insertions, 0 deletions

