riscv-gnu-toolchain/llvm.git

author	Alex MacLean <amaclean@nvidia.com>	2024-04-23 08:56:39 -0700
committer	GitHub <noreply@github.com>	2024-04-23 08:56:39 -0700
commit	df608051234c256f1dc2c89f30afd034706c2c2e (patch)
tree	11e5882cfcf1a7881dda4b31664c24403157df6d /llvm/lib/CodeGen/PatchableFunction.cpp
parent	f426be195a08874686d01783bbc490295bf4afb2 (diff)

[NVPTX] Improve support for rsqrt.approx (#89417)

Complete support for rsqrt.approx by adding rsqrt.approx.f64 ([PTX ISA 9.7.3.17. Floating Point Instructions: rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)). Additionally, add support for folding `sqrt` into `rsqrt`, with an optional flag to disable the fold.

Diffstat (limited to 'llvm/lib/CodeGen/PatchableFunction.cpp')

0 files changed, 0 insertions, 0 deletions

