diff options
author | Kito Cheng <kito.cheng@gmail.com> | 2019-03-22 16:43:20 -0400 |
---|---|---|
committer | Alex Bennée <alex.bennee@linaro.org> | 2019-03-25 10:35:32 +0000 |
commit | 896f51fbfa132a6d05a1195f45d6bd9e9df07893 (patch) | |
tree | 680a004dae56a7d9d777cb54830f9927dfac2abb | |
parent | 7ca96e1a9cadc32af7df73dcf4438b08667d07a6 (diff) | |
download | qemu-896f51fbfa132a6d05a1195f45d6bd9e9df07893.zip qemu-896f51fbfa132a6d05a1195f45d6bd9e9df07893.tar.gz qemu-896f51fbfa132a6d05a1195f45d6bd9e9df07893.tar.bz2 |
hardfloat: fix float32/64 fused multiply-add
Before falling back to softfloat FMA, we do not restore the original
values of inputs A and C. Fix it.
This bug was caught by running gcc's testsuite on RISC-V qemu.
Note that this change gives a small perf increase for fp-bench:
Host: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Command: perf stat -r 3 taskset -c 0 ./fp-bench -o mulAdd -p $prec
- $prec = single:
- before:
101.71 MFlops
102.18 MFlops
100.96 MFlops
- after:
103.63 MFlops
103.05 MFlops
102.96 MFlops
- $prec = double:
- before:
173.10 MFlops
173.93 MFlops
172.11 MFlops
- after:
178.49 MFlops
178.88 MFlops
178.66 MFlops
Signed-off-by: Kito Cheng <kito.cheng@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20190322204320.17777-1-cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
-rw-r--r-- | fpu/softfloat.c | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 4610738..2ba36ec 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1596,6 +1596,9 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) } ur.h = up.h + uc.h; } else { + union_float32 ua_orig = ua; + union_float32 uc_orig = uc; + if (flags & float_muladd_negate_product) { ua.h = -ua.h; } @@ -1608,6 +1611,8 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) if (unlikely(f32_is_inf(ur))) { s->float_exception_flags |= float_flag_overflow; } else if (unlikely(fabsf(ur.h) <= FLT_MIN)) { + ua = ua_orig; + uc = uc_orig; goto soft; } } @@ -1662,6 +1667,9 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s) } ur.h = up.h + uc.h; } else { + union_float64 ua_orig = ua; + union_float64 uc_orig = uc; + if (flags & float_muladd_negate_product) { ua.h = -ua.h; } @@ -1674,6 +1682,8 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s) if (unlikely(f64_is_inf(ur))) { s->float_exception_flags |= float_flag_overflow; } else if (unlikely(fabs(ur.h) <= FLT_MIN)) { + ua = ua_orig; + uc = uc_orig; goto soft; } } |