path: root/lldb/source/Plugins/ScriptInterpreter/Python
author: Princeton Ferro <pferro@nvidia.com> 2025-01-16 15:21:32 -0500
committer: GitHub <noreply@github.com> 2025-01-16 12:21:32 -0800
commit: 3ba339b5e70231985b2e3f966dd80aa65cfeee1b
tree: 15195fe398d78d15c8105c97235550b45b8016f6 /lldb/source/Plugins/ScriptInterpreter/Python
parent: 99d40fe8f028efa32d31754be774a0d3a0d20fc7
[NVPTX] Improve support for {ex2,lg2}.approx (#120519)
- Add support for `@llvm.exp2()`:
  - LLVM: `float` -> PTX: `ex2.approx{.ftz}.f32`
  - LLVM: `half` -> PTX: `ex2.approx.f16`
  - LLVM: `<2 x half>` -> PTX: `ex2.approx.f16x2`
  - LLVM: `bfloat` -> PTX: `ex2.approx.ftz.bf16`
  - LLVM: `<2 x bfloat>` -> PTX: `ex2.approx.ftz.bf16x2`
  - Any operations with non-native vector widths are expanded. On targets not supporting f16/bf16, values are promoted to f32.
- Add *CONDITIONAL* support for `@llvm.log2()` [^1]:
  - LLVM: `float` -> PTX: `lg2.approx{.ftz}.f32`
  - Support for f16/bf16 is emulated by promoting values to f32.

[^1]: CUDA implements `exp2()` with `ex2.approx` but `log2()` is implemented differently, so this is off by default. To enable, use the flag `-nvptx-approx-log2f32`.
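
As a rough illustration of the `float` cases listed above, the following LLVM IR is a minimal sketch and is not taken from the patch's own tests. The function names, the file name `example.ll`, and the `llc` invocation in the comments are assumptions made for the example; only the `-nvptx-approx-log2f32` flag and the PTX mnemonics come from the commit message.

```llvm
; Hypothetical input file: example.ll (name chosen for illustration).
; Assumed invocation:
;   llc -mtriple=nvptx64 -nvptx-approx-log2f32 example.ll -o -
; Omit -nvptx-approx-log2f32 to keep the default (non-approximate) log2 handling.

declare float @llvm.exp2.f32(float)
declare float @llvm.log2.f32(float)

; Per the commit message, this should select ex2.approx.f32,
; or ex2.approx.ftz.f32 when flush-to-zero is in effect.
define float @exp2_f32(float %x) {
  %r = call float @llvm.exp2.f32(float %x)
  ret float %r
}

; Per the commit message, this selects lg2.approx{.ftz}.f32 only when
; -nvptx-approx-log2f32 is passed; the lowering is off by default.
define float @log2_f32(float %x) {
  %r = call float @llvm.log2.f32(float %x)
  ret float %r
}
```

Keeping the two intrinsics in separate functions makes it easy to check the emitted PTX with and without the flag and see that only the `log2` lowering changes.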
Diffstat (limited to 'lldb/source/Plugins/ScriptInterpreter/Python')
0 files changed, 0 insertions, 0 deletions