riscv-gnu-toolchain/llvm.git

author	Matt Arsenault <Matthew.Arsenault@amd.com>	2023-06-30 22:27:31 -0400
committer	Matt Arsenault <Matthew.Arsenault@amd.com>	2023-07-05 16:53:01 -0400
commit	9c82dc6a6ba1f3d75b5547680e0a8532684879c9 (patch)
tree	b080783f868816b99e44c8d404b744819283b39f /llvm/lib/CodeGen/PrologEpilogInserter.cpp
parent	59c311c5d4a04a6a4f8c4abf140a63af1079e34c (diff)

AMDGPU: Always use v_rcp_f16 and v_rsq_f16

These inherited the fast-math checks from f32, but the manual suggests they are accurate enough for unconditional use. Correctly rounded means an error of at most 0.5 ulp, whereas the manual says "0.51ulp". I've been a bit nervous about changing this because the OpenCL conformance test does not cover half. A brute-force comparison over all values produces results identical to a reference host implementation.
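
The brute-force check described above is feasible because half precision has only 65536 bit patterns. The following is a minimal host-side sketch of that kind of comparison for the reciprocal only; it is not the harness used for this commit. The names `reference_rcp_bits` and `count_mismatches`, and the `candidate` callable standing in for whatever returns the device result, are hypothetical. The reference divides in double precision and rounds once to half, which is assumed to give the correctly rounded result for a single division; all NaN encodings are treated as equal, and denormal flushing is not modeled.

```python
import math
import struct


def half_from_bits(bits):
    """Decode a 16-bit pattern as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def bits_from_float(value):
    """Round a Python float to half precision and return its bit pattern."""
    try:
        packed = struct.pack("<e", value)
    except OverflowError:
        # struct refuses finite values that round past the largest half;
        # IEEE round-to-nearest maps those to a signed infinity.
        packed = struct.pack("<e", math.copysign(math.inf, value))
    return struct.unpack("<H", packed)[0]


def reference_rcp_bits(bits):
    """Host reference for 1/x on a half input, computed via double."""
    x = half_from_bits(bits)
    if x == 0.0:
        # Python raises on division by zero; IEEE gives a signed infinity.
        return bits_from_float(math.copysign(math.inf, x))
    return bits_from_float(1.0 / x)


def is_nan_bits(bits):
    """True if the half bit pattern encodes a NaN (any payload)."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def count_mismatches(candidate):
    """Compare candidate(bits) -> bits against the reference for every input."""
    mismatches = 0
    for bits in range(0x10000):
        expected = reference_rcp_bits(bits)
        actual = candidate(bits)
        if is_nan_bits(expected) and is_nan_bits(actual):
            continue  # NaN payloads are not compared
        if expected != actual:
            mismatches += 1
    return mismatches
```

To use it, pass a function that maps an input bit pattern to the bit pattern produced by the implementation under test; a return value of 0 means every input matched the reference. The same approach would not automatically give a correctly rounded reference for the reciprocal square root, since that composes two operations.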

Diffstat (limited to 'llvm/lib/CodeGen/PrologEpilogInserter.cpp')

0 files changed, 0 insertions, 0 deletions
