path: root/lldb/unittests/ScriptInterpreter/Python
author: Changpeng Fang <changpeng.fang@gmail.com> 2020-01-23 16:57:43 -0800
committer: Changpeng Fang <changpeng.fang@gmail.com> 2020-01-23 16:57:43 -0800
commit: 2531535984ad989ce88aeee23cb92a827da6686e
tree: 70cc36b82c2a6c75b86a1ea1106c164397333bf9 /lldb/unittests/ScriptInterpreter/Python
parent: 7ad17e008b0abec9b791f17de2f75f9112510d9d
AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare
Summary:
RCP has limited accuracy. If the !fpmath metadata on an FDIV requires high accuracy, rcp may not meet the requirement. However, !fpmath information is lost during DAG lowering, so we may generate either an inaccurate rcp-based computation or slow code for fdiv.

This patch implements fdiv optimizations in AMDGPUCodeGenPrepare, where the !fpmath metadata is still available.

FastUnsafeRcpLegal:
We determine whether it is legal to use rcp based on unsafe-fp-math, fast-math flags, denormals, and the fpmath accuracy request.

RCP optimizations:
1/x -> rcp(x) when fast unsafe rcp is legal, or when fpmath >= 2.5ULP with denormals flushed.
a/b -> a*rcp(b) when fast unsafe rcp is legal.

Use fdiv.fast:
a/b -> fdiv.fast(a, b) when the RCP optimization is not performed and fpmath >= 2.5ULP with denormals flushed.
1/x -> fdiv.fast(1, x) when the RCP optimization is not performed and fpmath >= 2.5ULP with denormals.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D71293
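The selection rules in the summary can be read as a small decision table. The Python sketch below is only an illustration of that reading, not the code in AMDGPUCodeGenPrepare. The function name, its parameters, and the returned strings are hypothetical. FastUnsafeRcpLegal is taken as an input because the message does not spell out how it is computed, and "with denormals" in the last rule is assumed to mean that denormals are not flushed.

```python
def choose_fdiv_lowering(numerator_is_one: bool,
                         fast_unsafe_rcp_legal: bool,
                         fpmath_ulp: float,
                         denormals_flushed: bool) -> str:
    """Pick a lowering for an fdiv following the rules in the commit summary.

    Hypothetical helper for illustration only. fpmath_ulp is the accuracy
    allowed by !fpmath; pass 0.0 when no relaxed accuracy was requested
    (an assumption of this sketch).
    """
    relaxed = fpmath_ulp >= 2.5  # the "fpmath >= 2.5ULP" condition

    if numerator_is_one:
        # 1/x -> rcp(x): fast unsafe rcp is legal, or relaxed accuracy
        # with denormals flushed.
        if fast_unsafe_rcp_legal or (relaxed and denormals_flushed):
            return "rcp(x)"
        # 1/x -> fdiv.fast(1, x): rcp not used, relaxed accuracy, denormals
        # kept (assumed meaning of "with denormals").
        if relaxed:
            return "fdiv.fast(1, x)"
        return "fdiv"

    # a/b -> a * rcp(b): only when fast unsafe rcp is legal.
    if fast_unsafe_rcp_legal:
        return "a * rcp(b)"
    # a/b -> fdiv.fast(a, b): rcp not used, relaxed accuracy, denormals flushed.
    if relaxed and denormals_flushed:
        return "fdiv.fast(a, b)"
    # Otherwise the fdiv is left alone for the normal lowering.
    return "fdiv"


if __name__ == "__main__":
    print(choose_fdiv_lowering(True, False, 2.5, True))
    print(choose_fdiv_lowering(False, False, 2.5, True))
    print(choose_fdiv_lowering(False, False, 0.0, True))
```

Under this reading the checks are ordered so that the rcp forms are always tried before fdiv.fast, which matches the "when RCP optimization is not performed" wording in the summary.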
Diffstat (limited to 'lldb/unittests/ScriptInterpreter/Python')
0 files changed, 0 insertions, 0 deletions