author | Alex MacLean <amaclean@nvidia.com> | 2024-04-23 08:56:39 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-04-23 08:56:39 -0700 |
commit | df608051234c256f1dc2c89f30afd034706c2c2e | |
tree | 11e5882cfcf1a7881dda4b31664c24403157df6d /llvm/lib/CodeGen/MachineFunctionSplitter.cpp | |
parent | f426be195a08874686d01783bbc490295bf4afb2 | |
[NVPTX] Improve support for rsqrt.approx (#89417)
Complete support for `rsqrt.approx` by adding `rsqrt.approx.f64` (see [PTX ISA 9.7.3.17. Floating Point Instructions: rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)).
Additionally, add support for folding `sqrt` into `rsqrt`, with an
optional flag to disable the fold.
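
As a rough illustration of what folding `sqrt` into `rsqrt` means arithmetically, the host-side C++ sketch below compares the two-step form `1.0 / sqrt(x)` with a single reciprocal-square-root operation. It is not the NVPTX backend code from this commit: `recip_of_sqrt`, `rsqrt_emulated`, and `agree_within` are hypothetical names chosen for the example, and `std::pow(x, -0.5)` merely stands in for a hardware reciprocal-square-root instruction.

```cpp
#include <cassert>
#include <cmath>

// Unfolded form: a square root followed by a reciprocal (two operations).
double recip_of_sqrt(double x) { return 1.0 / std::sqrt(x); }

// Hypothetical stand-in for a single reciprocal-square-root operation.
// On the host it is emulated with std::pow; this is an assumption made
// for illustration, not how the backend lowers anything.
double rsqrt_emulated(double x) { return std::pow(x, -0.5); }

// Relative-error comparison: the two forms are mathematically equal but
// may round differently, so they are compared with a tolerance.
bool agree_within(double a, double b, double rel_tol) {
  return std::fabs(a - b) <= rel_tol * std::fabs(a);
}
```

Because the two forms are only guaranteed to agree up to rounding, a fold of this kind is naturally tied to approximate math, which is presumably why the commit makes it possible to disable.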
Diffstat (limited to 'llvm/lib/CodeGen/MachineFunctionSplitter.cpp')
0 files changed, 0 insertions, 0 deletions