rocket-tools/riscv-gnu-toolchain/llvm.git

author	Srinivasa Ravi <srinivasar@nvidia.com>	2025-05-14 14:39:59 +0530
committer	GitHub <noreply@github.com>	2025-05-14 14:39:59 +0530
commit	155e188d94c95b9f389912db2fb180ac8dd75a28 (patch)
tree	09a314e24dfefd871a5f830b3563720884a9fd90 /llvm/lib/CodeGen/MachinePipeliner.cpp
parent	4e63e0457cc1f768c628e71a0786fdb8a6ec271e (diff)
download	llvm-155e188d94c95b9f389912db2fb180ac8dd75a28.zip llvm-155e188d94c95b9f389912db2fb180ac8dd75a28.tar.gz llvm-155e188d94c95b9f389912db2fb180ac8dd75a28.tar.bz2

[NVPTX] Add intrinsics and clang builtins for conversions of f4x2 type (#139244)

This change adds intrinsics and clang builtins for the `cvt` instruction variants of the FP4 type `.e2m1x2`, introduced in PTX 8.6 for `sm_100a`, `sm_101a`, and `sm_120a`. Tests are added in `NVPTX/convert-sm100a.ll` and `clang/test/CodeGen/builtins-nvptx.c` and verified with ptxas 12.8.0. PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
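
For readers unfamiliar with the `.e2m1x2` type, the following is a minimal host-side sketch of the number format only. It does not use the new intrinsics or builtins. It assumes the usual E2M1 layout (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit, no infinities or NaNs), and it assumes the first value lands in the upper nibble of the packed byte. The function names are hypothetical, and the rounding shown (nearest, ties to even, saturating at ±6) is an approximation of what an `rn.satfinite` conversion would do, not a statement of the hardware behaviour.

```python
import math

# Magnitudes selected by the low 3 bits of an E2M1 value
# (assumed layout: 2 exponent bits with bias 1, then 1 mantissa bit).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(nibble):
    """Decode one 4-bit E2M1 value (sign in bit 3) to a Python float."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[nibble & 0x7]


def encode_e2m1(x):
    """Encode a float as E2M1: round to nearest, ties to even, clamp to +/-6."""
    if math.isnan(x):
        # NaN handling is deliberately not modelled in this sketch.
        raise ValueError("NaN input is not modelled")
    sign = 0x8 if math.copysign(1.0, x) < 0 else 0x0
    mag = min(abs(x), 6.0)  # saturate finite and infinite inputs alike
    # Nearest magnitude; on a tie prefer the code whose mantissa bit is 0.
    code = min(range(8), key=lambda c: (abs(E2M1_MAGNITUDES[c] - mag), c & 1))
    return sign | code


def pack_e2m1x2(hi, lo):
    """Pack two floats into one byte (assumption: first value in upper nibble)."""
    return (encode_e2m1(hi) << 4) | encode_e2m1(lo)


def unpack_e2m1x2(byte):
    """Unpack one byte into two floats, upper nibble first."""
    return decode_e2m1(byte >> 4), decode_e2m1(byte & 0xF)
```

For example, `unpack_e2m1x2(pack_e2m1x2(1.5, -3.0))` round-trips exactly because both inputs are representable, while an input such as `100.0` is clamped to the largest magnitude, `6.0`.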

Diffstat (limited to 'llvm/lib/CodeGen/MachinePipeliner.cpp')

0 files changed, 0 insertions, 0 deletions