author | Princeton Ferro <pferro@nvidia.com> | 2025-01-16 15:21:32 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-16 12:21:32 -0800 |
commit | 3ba339b5e70231985b2e3f966dd80aa65cfeee1b (patch) | |
tree | 15195fe398d78d15c8105c97235550b45b8016f6 /lldb/source/Plugins/ScriptInterpreter/Python/SWIGPythonBridge.h | |
parent | 99d40fe8f028efa32d31754be774a0d3a0d20fc7 (diff) | |
[NVPTX] Improve support for {ex2,lg2}.approx (#120519)
- Add support for `@llvm.exp2()`:
  - LLVM: `float` -> PTX: `ex2.approx{.ftz}.f32`
  - LLVM: `half` -> PTX: `ex2.approx.f16`
  - LLVM: `<2 x half>` -> PTX: `ex2.approx.f16x2`
  - LLVM: `bfloat` -> PTX: `ex2.approx.ftz.bf16`
  - LLVM: `<2 x bfloat>` -> PTX: `ex2.approx.ftz.bf16x2`
  - Any operations with non-native vector widths are expanded. On
    targets not supporting f16/bf16, values are promoted to f32.
- Add *CONDITIONAL* support for `@llvm.log2()` [^1]:
  - LLVM: `float` -> PTX: `lg2.approx{.ftz}.f32`
  - Support for f16/bf16 is emulated by promoting values to f32.

[^1]: CUDA implements `exp2()` with `ex2.approx` but `log2()` is
implemented differently, so this is off by default. To enable, use the
flag `-nvptx-approx-log2f32`.
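
The type-to-instruction mapping in this message can be restated as a small lookup, which may help when reading the generated PTX. This is only an illustrative Python sketch, not the NVPTX backend's code: the helper names (`select_exp2`, `select_log2`, `f32_op`) and the parameters `ftz`, `has_native_16bit`, and `approx_log2f32` are hypothetical. It assumes one switch covers both f16 and bf16 support, that the optional `.ftz` also applies after promotion to f32, and that `None` stands for "not lowered to `lg2.approx`" when the flag is off; the message does not spell out any of these.

```python
from typing import Optional

# Transcribed from the commit message: LLVM type -> PTX instruction
# for @llvm.exp2() when the target supports the 16-bit type natively.
EXP2_NATIVE_16BIT = {
    "half": "ex2.approx.f16",
    "<2 x half>": "ex2.approx.f16x2",
    "bfloat": "ex2.approx.ftz.bf16",
    "<2 x bfloat>": "ex2.approx.ftz.bf16x2",
}


def f32_op(base: str, ftz: bool) -> str:
    """Build the f32 form, e.g. 'ex2.approx{.ftz}.f32'."""
    return f"{base}.approx{'.ftz' if ftz else ''}.f32"


def select_exp2(llvm_type: str, ftz: bool = False,
                has_native_16bit: bool = True) -> str:
    """Return the PTX instruction the message lists for this LLVM type."""
    if llvm_type == "float":
        return f32_op("ex2", ftz)
    if llvm_type in EXP2_NATIVE_16BIT:
        if has_native_16bit:
            return EXP2_NATIVE_16BIT[llvm_type]
        # Assumption: without f16/bf16 support, each element is promoted
        # to f32 and the f32 instruction is used.
        return f32_op("ex2", ftz)
    raise ValueError(f"type not covered by this sketch: {llvm_type}")


def select_log2(llvm_type: str, ftz: bool = False,
                approx_log2f32: bool = False) -> Optional[str]:
    """Model the conditional log2 lowering (off by default)."""
    if not approx_log2f32:
        # Assumption: None means "not lowered to lg2.approx".
        return None
    if llvm_type in ("float", "half", "bfloat"):
        # f16/bf16 are emulated by promoting to f32.
        return f32_op("lg2", ftz)
    raise ValueError(f"type not covered by this sketch: {llvm_type}")
```

For example, `select_exp2("<2 x half>")` yields `ex2.approx.f16x2`, while `select_log2("float")` yields `None` unless `approx_log2f32=True` is passed, mirroring the `-nvptx-approx-log2f32` flag.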