riscv-gnu-toolchain/llvm.git

author	Princeton Ferro <pferro@nvidia.com>	2025-01-16 15:21:32 -0500
committer	GitHub <noreply@github.com>	2025-01-16 12:21:32 -0800
commit	3ba339b5e70231985b2e3f966dd80aa65cfeee1b (patch)
tree	15195fe398d78d15c8105c97235550b45b8016f6 /lldb/source/Plugins/ScriptInterpreter/Python
parent	99d40fe8f028efa32d31754be774a0d3a0d20fc7 (diff)

[NVPTX] Improve support for {ex2,lg2}.approx (#120519)

- Add support for `@llvm.exp2()`:
  - LLVM: `float` -> PTX: `ex2.approx{.ftz}.f32`
  - LLVM: `half` -> PTX: `ex2.approx.f16`
  - LLVM: `<2 x half>` -> PTX: `ex2.approx.f16x2`
  - LLVM: `bfloat` -> PTX: `ex2.approx.ftz.bf16`
  - LLVM: `<2 x bfloat>` -> PTX: `ex2.approx.ftz.bf16x2`
  - Any operations with non-native vector widths are expanded. On targets not supporting f16/bf16, values are promoted to f32.
- Add *CONDITIONAL* support for `@llvm.log2()` [^1]:
  - LLVM: `float` -> PTX: `lg2.approx{.ftz}.f32`
  - Support for f16/bf16 is emulated by promoting values to f32.

[^1]: CUDA implements `exp2()` with `ex2.approx` but `log2()` is implemented differently, so this is off by default. To enable, use the flag `-nvptx-approx-log2f32`.
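
The selection rules in the message above can be restated as a small lookup, which may help when checking which PTX instruction to expect for a given IR type. This is only an illustrative sketch in Python, not code from the patch: the names `exp2_ptx`, `log2_ptx`, `ftz`, `native_half`, and `approx_log2` are hypothetical, f16 and bf16 support are folded into a single flag for brevity, and cases the message does not spell out (how non-native vector widths are expanded, vector `log2`) are reported as `None` rather than guessed.

```python
# Illustrative model of the type -> PTX mapping listed in the commit message.
# All function and parameter names are hypothetical; this is not LLVM code.

# Native half-precision forms of ex2.approx named in the commit message.
NATIVE_EX2 = {
    ("half", 1): "ex2.approx.f16",
    ("half", 2): "ex2.approx.f16x2",
    ("bfloat", 1): "ex2.approx.ftz.bf16",
    ("bfloat", 2): "ex2.approx.ftz.bf16x2",
}


def f32_op(base, ftz):
    """Return the f32 form of `base`, with the optional .ftz modifier."""
    return f"{base}.approx.ftz.f32" if ftz else f"{base}.approx.f32"


def exp2_ptx(elem, lanes=1, ftz=False, native_half=True):
    """PTX instruction expected for @llvm.exp2, or None if not modelled."""
    if elem == "float" and lanes == 1:
        return f32_op("ex2", ftz)
    if elem in ("half", "bfloat"):
        if not native_half:
            # Assumption: only the scalar promotion to f32 is modelled here.
            return f32_op("ex2", ftz) if lanes == 1 else None
        # Non-native vector widths are expanded; the details are not modelled.
        return NATIVE_EX2.get((elem, lanes))
    return None


def log2_ptx(elem, lanes=1, ftz=False, approx_log2=False):
    """PTX instruction expected for @llvm.log2, or None if not modelled.

    `approx_log2` stands in for the -nvptx-approx-log2f32 flag, which is
    off by default according to the commit message.
    """
    if not approx_log2 or lanes != 1:
        return None
    if elem in ("float", "half", "bfloat"):
        # f16/bf16 are emulated by promoting to f32.
        return f32_op("lg2", ftz)
    return None
```

For example, `exp2_ptx("half", lanes=2)` yields `ex2.approx.f16x2`, while `log2_ptx("float")` yields `None` until `approx_log2=True` is passed, mirroring the off-by-default behaviour described in the footnote.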

Diffstat (limited to 'lldb/source/Plugins/ScriptInterpreter/Python')

0 files changed, 0 insertions, 0 deletions

