rocket-tools/riscv-gnu-toolchain/llvm.git

author	Srinivasa Ravi <srinivasar@nvidia.com>	2025-04-16 10:03:21 +0530
committer	GitHub <noreply@github.com>	2025-04-16 10:03:21 +0530
commit	3264a50fe2b61e79572d1623d0cceb2fe88da533 (patch)
tree	c91207ff92648f581b889322ad1d3316e0644c40 /clang/lib/AST/ByteCode/Program.cpp
parent	04b87e15e40f8857e29ade8321b8b67691545a50 (diff)

[clang][NVPTX] Add builtins and intrinsics for conversions of new FP types (#134345)

This change:
- Adds NVVM intrinsics and clang builtins for the cvt instruction variants of types (FP6) `.e2m3x2`, `.e3m2x2`, and (FP8) `.ue8m0x2` introduced in PTX 8.6 for `sm_100a`, `sm_101a`, and `sm_120a`.
- Adds clang builtins for the cvt instruction variant of type tf32.

Tests are added in `NVPTX/convert-sm100a.ll` and `clang/test/CodeGen/builtins-nvptx.c` and verified through ptxas 12.8.0.

PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
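
For readers unfamiliar with the element formats named above, the following is a minimal host-side sketch of how a single `e2m3`, `e3m2`, or `ue8m0` value could be decoded. It is not part of this commit and does not use the new builtins. It assumes the OCP Microscaling element layouts (sign/exponent/mantissa with biases 1 and 3 for the FP6 types and no Inf/NaN encodings; an unsigned 8-bit exponent with bias 127 and `0xFF` as NaN for `ue8m0`). The function names are hypothetical, and the `x2` packing of two values per register is not modelled.

```python
def decode_minifloat(bits, exp_bits, man_bits, bias):
    """Decode a sign/exponent/mantissa value that has no Inf/NaN encodings."""
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    frac = man / (1 << man_bits)
    if exp == 0:
        # Subnormal: no implicit leading 1, exponent fixed at (1 - bias).
        return sign * frac * 2.0 ** (1 - bias)
    # Normal: implicit leading 1.
    return sign * (1.0 + frac) * 2.0 ** (exp - bias)


def decode_e2m3(bits):
    # Assumed layout: 1 sign bit, 2 exponent bits, 3 mantissa bits, bias 1.
    return decode_minifloat(bits & 0x3F, exp_bits=2, man_bits=3, bias=1)


def decode_e3m2(bits):
    # Assumed layout: 1 sign bit, 3 exponent bits, 2 mantissa bits, bias 3.
    return decode_minifloat(bits & 0x3F, exp_bits=3, man_bits=2, bias=3)


def decode_ue8m0(bits):
    # Assumed layout: unsigned 8-bit exponent, no mantissa, bias 127.
    bits &= 0xFF
    if bits == 0xFF:
        return float("nan")  # the all-ones pattern is assumed to encode NaN
    return 2.0 ** (bits - 127)


if __name__ == "__main__":
    print(decode_e2m3(0b011111))  # largest positive e2m3 value
    print(decode_e3m2(0b011111))  # largest positive e3m2 value
    print(decode_ue8m0(127))      # exponent equal to the bias
```

Under these assumptions the FP6 types trade range for precision: `e2m3` has finer steps over a smaller range, `e3m2` the reverse, while `ue8m0` represents only powers of two and is typically used as a scale factor.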

Diffstat (limited to 'clang/lib/AST/ByteCode/Program.cpp')

0 files changed, 0 insertions, 0 deletions

