author | Srinivasa Ravi <srinivasar@nvidia.com> | 2025-05-14 14:39:59 +0530 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-14 14:39:59 +0530 |
commit | 155e188d94c95b9f389912db2fb180ac8dd75a28 (patch) | |
tree | 09a314e24dfefd871a5f830b3563720884a9fd90 /llvm/lib/CodeGen/MachinePipeliner.cpp | |
parent | 4e63e0457cc1f768c628e71a0786fdb8a6ec271e (diff) | |
[NVPTX] Add intrinsics and clang builtins for conversions of f4x2 type (#139244)
This change adds intrinsics and clang builtins for the `cvt` instruction
variants of the FP4 type `.e2m1x2`, introduced in PTX 8.6 for `sm_100a`,
`sm_101a`, and `sm_120a`.
Tests are added in `NVPTX/convert-sm100a.ll` and
`clang/test/CodeGen/builtins-nvptx.c` and verified through ptxas 12.8.0.
PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
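For readers unfamiliar with the `.e2m1x2` type named above, the following is a minimal sketch of how a packed pair of FP4 values might be decoded on the host. It is not taken from this commit: the function names are hypothetical, the value table follows the usual E2M1 layout (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit), and the nibble order (first value in the upper 4 bits) is an assumption that should be checked against the PTX reference linked above.

```python
from typing import Tuple

# Magnitudes representable by E2M1, indexed by the 3 low bits (exponent, mantissa).
# Exponent 0 is subnormal (0, 0.5); the format has no Inf or NaN encodings.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(nibble: int) -> float:
    """Decode one 4-bit E2M1 value (hypothetical helper, for illustration only)."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[nibble & 0x7]


def decode_e2m1x2(byte: int) -> Tuple[float, float]:
    """Split a packed byte into two E2M1 values.

    Assumption: the first element lives in the upper nibble and the
    second in the lower nibble.
    """
    return decode_e2m1((byte >> 4) & 0xF), decode_e2m1(byte & 0xF)
```

With these assumptions, `decode_e2m1x2(0x2F)` yields `(1.0, -6.0)`: the upper nibble `0x2` is +1.0 and the lower nibble `0xF` is the most negative value, -6.0.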
Diffstat (limited to 'llvm/lib/CodeGen/MachinePipeliner.cpp')
0 files changed, 0 insertions, 0 deletions