diff options
author | Tue Ly <lntue@google.com> | 2023-05-08 14:03:52 -0400 |
---|---|---|
committer | Tue Ly <lntue@google.com> | 2023-05-23 10:35:15 -0400 |
commit | a68bbf42fa680c7159165be8915d73e012c50f96 (patch) | |
tree | 42c8e1035cc3d89122ca63df87644bf1bec9035a /llvm/lib/IR/Module.cpp | |
parent | cc6a6c48d4bb498717950ae482167102d25bf821 (diff) | |
download | llvm-a68bbf42fa680c7159165be8915d73e012c50f96.zip llvm-a68bbf42fa680c7159165be8915d73e012c50f96.tar.gz llvm-a68bbf42fa680c7159165be8915d73e012c50f96.tar.bz2 |
[libc][math] Implement double precision log function correctly rounded to all rounding modes.
Implement double precision log function correctly rounded to all
rounding modes.
See https://reviews.llvm.org/D150014 for a more detail description of the algorithm.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.93%.
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.465 + 0.596 clc/call; Median-Min = 0.602 clc/call; Max = 18.389 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 54.961 + 2.606 clc/call; Median-Min = 2.180 clc/call; Max = 59.583 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 12.608 + 0.276 clc/call; Median-Min = 0.359 clc/call; Max = 13.147 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.952 + 0.468 clc/call; Median-Min = 0.602 clc/call; Max = 21.881 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 18.569 + 0.552 clc/call; Median-Min = 0.601 clc/call; Max = 19.259 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.431 + 0.699 clc/call; Median-Min = 0.073 clc/call; Max = 51.269 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 64.865 + 3.235 clc/call; Median-Min = 3.475 clc/call; Max = 71.788 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 42.151 + 2.090 clc/call; Median-Min = 2.270 clc/call; Max = 44.773 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 35.266 + 0.479 clc/call; Median-Min = 0.373 clc/call; Max = 36.798 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.518 + 0.484 clc/call; Median-Min = 0.500 clc/call; Max = 49.896 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
598.306
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
632.925
-- LIBC latency -- with FMA
455.632
-- LIBC latency -- without FMA
488.564
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D150131
Diffstat (limited to 'llvm/lib/IR/Module.cpp')
0 files changed, 0 insertions, 0 deletions