author: Matt Arsenault <Matthew.Arsenault@amd.com> 2023-07-16 08:32:08 -0400
committer: Matt Arsenault <Matthew.Arsenault@amd.com> 2023-07-21 18:55:42 -0400
commit: 8406c3568aa5cd8256a2b359174eae82c747b162
tree: ca9b605228e354ba5831843f6f1adb064ddc61b6 /clang/lib/CodeGen/CodeGenModule.cpp
parent: 6699c37028148c722de2a401c13eef2e92833a03
AMDGPU: Implement new 2ulp fdiv lowering

Extends the new frexp scaled reciprocal to the general case. The reciprocal case is just the same thing when frexp of 1 is constant folded. Could probably clean up the code to rely on that constant folding.

Improves results for the IEEE path for the default OpenCL division. We used to only emit the fdiv.fast intrinsic with a 2.5 ulp accuracy threshold with DAZ, which uses explicit range checks. This gives us a better fast option with the default IEEE behavior.

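
The frexp-scaled idea described above can be modelled on the host as a rough sketch. This is not the code from the commit: the function name `frexp_scaled_div` is hypothetical, `1.0f / mant_b` stands in for a hardware approximate-reciprocal instruction, and zero, infinity, and NaN inputs are not given any special treatment. It only illustrates the arithmetic: split both operands into mantissa and exponent, take the reciprocal of the divisor's mantissa, multiply, and rescale by the exponent difference.

```cpp
#include <cmath>

// Hypothetical host-side model of a frexp-scaled division a / b.
// Assumption: 1.0f / mant_b stands in for an approximate hardware reciprocal.
// Special values (0, inf, NaN) are not handled in this sketch.
float frexp_scaled_div(float a, float b) {
    int exp_a = 0;
    int exp_b = 0;

    // std::frexp returns a mantissa with magnitude in [0.5, 1) and stores
    // the power-of-two exponent, so a == mant_a * 2^exp_a.
    float mant_a = std::frexp(a, &exp_a);
    float mant_b = std::frexp(b, &exp_b);

    // The reciprocal only ever sees a mantissa in [0.5, 1), so its input
    // and its result stay well inside the normal range.
    float rcp = 1.0f / mant_b;

    // Combine the mantissas, then restore the scale with ldexp.
    float scaled = mant_a * rcp;
    return std::ldexp(scaled, exp_a - exp_b);
}
```

With `a` fixed at 1, `std::frexp(1.0f, ...)` yields the constants 0.5 and 1, which matches the message's remark that the reciprocal case is the same computation once the frexp of 1 is constant folded. For example, `frexp_scaled_div(1.0f, 4.0f)` evaluates to `0.25f`.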
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')
0 files changed, 0 insertions, 0 deletions