path: root/llvm/lib/CodeGen/PatchableFunction.cpp
author Alex MacLean <amaclean@nvidia.com> 2024-04-23 08:56:39 -0700
committer GitHub <noreply@github.com> 2024-04-23 08:56:39 -0700
commit df608051234c256f1dc2c89f30afd034706c2c2e
tree 11e5882cfcf1a7881dda4b31664c24403157df6d /llvm/lib/CodeGen/PatchableFunction.cpp
parent f426be195a08874686d01783bbc490295bf4afb2
[NVPTX] Improve support for rsqrt.approx (#89417)
Complete support for rsqrt.approx by adding rsqrt.approx.f64 ([PTX ISA 9.7.3.17. Floating Point Instructions: rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)). Additionally, add support for folding `sqrt` into `rsqrt`, with an optional flag to disable the fold.
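For readers unfamiliar with the fold, the following host-side C++ sketch illustrates the arithmetic identity it relies on: a reciprocal of a square root can be expressed as a single reciprocal-square-root operation. This is only an illustration under stated assumptions, not the code from this commit. The function names are hypothetical, and `std::pow(x, -0.5)` merely stands in for a fused operation such as `rsqrt.approx.f64`, which is approximate and will not match this host computation bit for bit.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the unfused form, a square root followed by a divide.
double reciprocal_sqrt_unfused(double x) {
    assert(x > 0.0);  // keep the illustration to positive, finite inputs
    return 1.0 / std::sqrt(x);
}

// Hypothetical stand-in for a single fused reciprocal-square-root operation.
// Assumption: modelled on the host as x^(-1/2); the real PTX instruction is
// an approximation and is not reproduced here.
double reciprocal_sqrt_fused(double x) {
    assert(x > 0.0);
    return std::pow(x, -0.5);
}

// Relative difference between two results, used to compare both forms.
double relative_difference(double a, double b) {
    return std::fabs(a - b) / std::fabs(a);
}
```

Calling both helpers on the same positive input and passing the results to `relative_difference` should give a value near machine precision on the host. On a GPU, the approximate instruction trades some of that accuracy for speed, which is presumably why the commit makes the fold optional.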
Diffstat (limited to 'llvm/lib/CodeGen/PatchableFunction.cpp')
0 files changed, 0 insertions, 0 deletions