aboutsummaryrefslogtreecommitdiff
path: root/sysdeps/powerpc/fpu
AgeCommit message (Collapse)AuthorFilesLines
2024-07-25powerpc: Update ulps for fpujeevitha1-6/+6
Adjust the ULPs for the log2p1 implementation.
2024-06-20powerpc: Update ulpsFlorian Weimer1-1/+13
Results based on POWER8 and POWER9 machines running powerpc64-linux-gnu, with and without --disable-multi-arch.
2024-06-18powerpc: Update ulpsAdhemerval Zanella1-0/+60
For the exp10m1, exp2m1, and log10p1 implementations.
2024-06-17Implement C23 logp1Joseph Myers1-0/+24
C23 adds various <math.h> function families originally defined in TS 18661-4. Add the logp1 functions (aliases for log1p functions - the name is intended to be more consistent with the new log2p1 and log10p1, where clearly it would have been very confusing to name those functions log21p and log101p). As aliases rather than new functions, the content of this patch is somewhat different from those actually adding new functions. Tests are shared with log1p, so this patch *does* mechanically update all affected libm-test-ulps files to expect the same errors for both functions. The vector versions of log1p on aarch64 and x86_64 are *not* updated to have logp1 aliases (and thus there are no corresponding header, tests, abilist or ulps changes for vector functions either). It would be reasonable for such vector aliases and corresponding changes to other files to be made separately. For now, the log1p tests instead avoid testing logp1 in the vector case (a Makefile change is needed to avoid problems with grep, used in generating the .c files for vector function tests, matching more than one ALL_RM_TEST line in a file testing multiple functions with the same inputs, when it assumes that the .inc file only has a single such line). Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-05-20powerpc: Update ulpsAdhemerval Zanella1-0/+24
For the log2p1 implementation.
2024-05-09powerpc: Fix __fesetround_inline_nocheck on POWER9+ (BZ 31682)Adhemerval Zanella2-14/+8
The e68b1151f7460d5fa88c3a567c13f66052da79a7 commit changed the __fesetround_inline_nocheck implementation to use mffscrni (through __fe_mffscrn) instead of mtfsfi. For generic powerpc ceil/floor/trunc, the function is supposed to disable the floating-point inexact exception enable bit, however mffscrni does not change any exception enable bits. This patch fixes by reverting the optimization for the __fesetround_inline_nocheck. Checked on powerpc-linux-gnu. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
2024-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert49-49/+49
2023-12-19powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988)Adhemerval Zanella2-1/+13
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending of the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make the both functions follow the C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for an value different than 0, instead of bail out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-27fegetenv_and_set_rn now uses the builtins provided by GCC.Manjunath Matti1-0/+9
On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the prologue get/save/set rounding mode operations for POWER9 and later by using 'mffscrn' where possible, this was introduced by commit f1c56cdff09f650ad721fae026eb6a3651631f3d. GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn which now returns the FPSCR fields in a double. This feature is available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined. GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds this feature. Changes are done to use __builtin_set_fpscr_rn instead of mffscrn or mffscrni in __fe_mffscrn(rn). Suggested-by: Carl Love <cel@us.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-06-22sysdeps/powerpc/fpu/tst-setcontext-fpscr.c: Fix warn unused resultFrederic Berat1-1/+3
The fread routine return value needs to be checked when fortification is enabled, hence use xfread helper. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-02-23powerpc:Regenerate ulps for hypotMahesh Bodapati1-0/+4
For new inputs added in commit 3efbf11fdf15ed991d2c41743921c524a867e145, regenerate the ulps of hypot from 0(default) to 1
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers49-49/+49
2022-05-23math: Add math-use-builtins-fabs (BZ#29027)Adhemerval Zanella1-0/+8
Both float, double, and _Float128 are assumed to be supported (float and double already only uses builtins). Only long double is parametrized due GCC bug 29253 which prevents its usage on powerpc. It allows to remove i686, ia64, x86_64, powerpc, and sparc arch specific implementation. On ia64 it also fixes the sNAN handling: math/test-float64x-fabs math/test-ldouble-fabs Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, powerpc64-linux-gnu, sparc64-linux-gnu, and ia64-linux-gnu.
2022-04-07powerpc: Remove fcopysign{f} implementationAdhemerval Zanella2-60/+0
The builtin and generic implementation from generic files are suffice. Checked on powerpc64-linux-gnu and powerpc-linux-gnu.
2022-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert51-51/+51
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-12-13math: Remove powerpc e_hypotAdhemerval Zanella2-165/+0
The generic implementation is shows only slight worse performance: POWER10 reciprocal-throughput latency master 8.28478 13.7253 new hypot 7.21945 13.1933 POWER9 reciprocal-throughput latency master 13.4024 14.0967 new hypot 14.8479 15.8061 POWER8 reciprocal-throughput latency master 15.5767 16.8885 new hypot 16.5371 18.4057 One way to improve might to make gcc generate xsmaxdp/xsmindp for fmax/fmin (it onl does for -ffast-math, clang does for default options). Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu (power9).
2021-11-03[powerpc] Tighten contraints for asm constant parametersPaul A. Clarke1-4/+4
There are a few places where only known numeric values are acceptable for `asm` parameters, yet the constraint "i" is used. "i" can include "symbolic constants whose values will be known only at assembly time or later." Use "n" instead of "i" where known numeric values are required. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2021-10-06powerpc: update libm test ulpsAdhemerval Zanella1-1/+1
Update after commit 6bbf7298323bf31bc43494b2201465a449778e10 (Fixed inaccuracy of j0f (BZ #28185)).
2021-09-22Add narrowing fma functionsJoseph Myers1-0/+16
This patch adds the narrowing fused multiply-add functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal, f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x, f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128, f64xfmaf128 for configurations with _Float64x and _Float128; __f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case (for calls to ffmal and dfmal when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, especially that for sqrt, so the description of those generally applies to this patch as well. As with sqrt, I reused the same test inputs in auto-libm-test-in as for non-narrowing fma rather than adding extra or separate inputs for narrowing fma. The tests in libm-test-narrow-fma.inc also follow those for non-narrowing fma. The non-narrowing fma has a known bug (bug 6801) that it does not set errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather than fixing this or having narrowing fma check for errors when non-narrowing does not (complicating the cases when narrowing fma can otherwise be an alias for a non-narrowing function), this patch does not attempt to check for errors from narrowing fma and set errno; the CHECK_NARROW_FMA macro is still present, but as a placeholder that does nothing, and this missing errno setting is considered to be covered by the existing bug rather than needing a separate open bug. missing-errno annotations are duly added to many of the auto-libm-test-in test inputs for fma. This completes adding all the new functions from TS 18661-1 to glibc, so will be followed by corresponding stdc-predef.h changes to define __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support for TS 18661-1 will be at a similar level to that for C standard floating-point facilities up to C11 (pragmas not implemented, but library functions done). (There are still further changes to be done to implement changes to the types of fromfp functions from N2548.) Tested as followed: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
2021-09-22Adjust new narrowing div/mul tests for IBM long double, update powerpc ULPsJoseph Myers1-0/+3
Testing for powerpc shows some of the new narrowing div/mul tests need XFAILing for IBM long double and some ULPs updates are needed for those tests.
2021-09-10Add narrowing square root functionsJoseph Myers1-0/+3
This patch adds the narrowing square root functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64, f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x, f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128, f64xsqrtf128 for configurations with _Float64x and _Float128; __f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case (for calls to fsqrtl and dsqrtl when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, so the description of those generally applies to this patch as well. However, the not-actually-narrowing cases (where the two types involved in the function have the same floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather than needing a separately built not-actually-narrowing function such as was needed for add / sub / mul / div. Thus, there is no __nldbl_dsqrtl name for ldbl-opt because no such name was needed (whereas the other functions needed such a name since the only other name for that entry point was e.g. f32xaddf64, not reserved by TS 18661-1); the headers are made to arrange for sqrt to be called in that case instead. The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because they were observed to be needed in GCC 7 testing of riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/ files added didn't need such DIAG_* in any configuration I tested with build-many-glibcs.py, but if they do turn out to be needed in more files with some other configuration / GCC version, they can always be added there. I reused the same test inputs in auto-libm-test-in as for non-narrowing sqrt rather than adding extra or separate inputs for narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow those for non-narrowing sqrt. Tested as followed: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
2021-09-03Remove "Contributed by" linesSiddhesh Poyarekar8-9/+0
We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-04-09powerpc: Update libm test ulpsTulio Magno Quites Machado Filho1-10/+10
Update after commit 43576de04afc6a0896a3ecc094e1581069a0652a.
2021-04-02Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]Paul Zimmermann1-31/+31
For j0f/j1f/y0f/y1f, the largest error for all binary32 inputs is reduced to at most 9 ulps for all rounding modes. The new code is enabled only when there is a cancellation at the very end of the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not give any visible slowdown on average. Two different algorithms are used: * around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of degree 3 are used, computed using the Sollya tool (https://www.sollya.org/) * for large inputs, an asymptotic formula from [1] is used [1] Fast and Accurate Bessel Function Computation, John Harrison, Proceedings of Arith 19, 2009. Inputs yielding the new largest errors are added to auto-libm-test-in, and ulps are regenerated for various targets (thanks Adhemerval Zanella). Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-03-16powerpc: Add optimized ilogb* for POWER9Raphael Moreira Zinsly1-1/+25
The instructions xsxexpdp and xsxexpqp introduced on POWER9 extract the exponent from a double-precision and quad-precision floating-point respectively, thus they can be used to improve ilogb, ilogbf and ilogbf128.
2021-03-16powerpc: Update libm-test-ulpsMatheus Castanho1-1/+1
Generated with 'make regen-ulps' on POWER8. Tested on powerpc, powerpc64, and powerpc64le
2021-03-03powerpc: Regenerate ulpsFlorian Weimer1-10/+10
This time on a POWER8 machine.
2021-03-02powerpc: Update libm-test-ulpsMatheus Castanho1-13/+15
Generated with 'make regen-ulps' Tested on powerpc, powerpc64, and powerpc64le
2021-01-02Update copyright dates with scripts/update-copyrightsPaul Eggert53-53/+53
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
2020-12-22powerpc: Regenerate ulpsFlorian Weimer1-12/+13
For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710, as seen on a POWER8 system.
2020-09-10Update powerpc libm-test-ulpsMatheus Castanho1-3/+3
Before this patch, the following tests were failing: ppc and ppc64: FAIL: math/test-ldouble-j0 ppc64le: FAIL: math/test-float128-j0 FAIL: math/test-float64x-j0 FAIL: math/test-ibm128-j0 FAIL: math/test-ldouble-j0
2020-06-22powerpc: Use sqrt{f} builtinAdhemerval Zanella3-80/+42
The powerpc sqrt implementation is also simplified: - the static constants are open coded within the implementation. - for !USE_SQRT_BUILTIN the function is implemented directly on __ieee754_sqrt (it avoid an superflous extra jump). Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
2020-06-22math: Decompose math-use-builtins.hAdhemerval Zanella2-77/+9
Each symbol definitions are moved on a separated file and it cover all symbol type definitions (float, double, long double, and float128). It allows to set support for architectures without the boiler place of copying default values. Checked with a build on the affected ABIs.
2020-06-05powerpc64le: use common fmaf128 implementationPaul E. Murphy1-1/+7
This defines the macro such that it should behave best on all supported powerpc targets. Likewise, this allows us to remove the ppc64le specific s_fmaf128.c. I have verified powerpc64le multiarch and powerpc64le power9 no-multiarch builds continue to generate optimize fmaf128.
2020-06-04powerpc: Fix powerpc64le due a7a3435c9aAdhemerval Zanella1-0/+2
The build uses an undefined macro evaluation for fmaf128 build. For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0. Checked with a build for: powerpc64le-linux-gnu-power9-disable-multi-arch powerpc64le-linux-gnu-power9 powerpc64le-linux-gnu powerpc64-linux-gnu-power8 powerpc64-linux-gnu powerpc-linux-gnu-power4 powerpc-linux-gnu
2020-06-03powerpc/fpu: use generic fma functionsVineet Gupta3-54/+69
Tested with build-many-glibcs for powerpc-linux-gnu This is a non functional change and powerpc libm before/after was byte invariant as compared below: | cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline | for i in `find . -name libm-2.31.9000.so`; do | echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ; | echo $?; | done | ./aarch64-linux-gnu/lib64/libm-2.31.9000.so | 0 | ./arm-linux-gnueabi/lib/libm-2.31.9000.so | 0 | ./x86_64-linux-gnu/lib64/libm-2.31.9000.so | 0 | ./arm-linux-gnueabihf/lib/libm-2.31.9000.so | 0 | ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so | 0 | ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so | 0 | ./powerpc-linux-gnu/lib/libm-2.31.9000.so | 0 | ./microblaze-linux-gnu/lib/libm-2.31.9000.so | 0 | ./nios2-linux-gnu/lib/libm-2.31.9000.so | 0 | ./hppa-linux-gnu/lib/libm-2.31.9000.so | 0 | ./s390x-linux-gnu/lib64/libm-2.31.9000.so | 0 Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2020-04-30powerpc64le: Enable support for IEEE long doubleGabriel F. T. Gomes1-0/+4
On platforms where long double may have two different formats, i.e.: the same format as double (64-bits) or something else (128-bits), building with -mlong-double-128 is the default and function calls in the user program match the name of the function in Glibc. When building with -mlong-double-64, Glibc installed headers redirect such calls to the appropriate function. Likewise, the internals of glibc are now built against IEEE long double. However, the only (minimally) notable usage of long double is difftime. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-07powerpc: Update ULPs and xfail more ibm128 outputsTulio Magno Quites Machado Filho1-16/+18
There are 2 new input values that require to be marked as xfail-rounding:ibm128-libgcc as they're known to fail because of libgcc issues with different rounding modes. Otherwise, the other tests just need an increase in ULP.
2020-03-30math: Remove fenvinline.hAdhemerval Zanella2-4/+5
Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this patch removes the fenvinline.h on all architectures. Currently only powerpc implements some optimizations. This kind of optimization is better implemented by the compiler (which handles the architecture ISA transparently). Also, for the specific optimized powerpc implementation the code is becoming convoluted and these micro-optimization are hardly wildly used, even more being a possible hotspot in realword cases (non-default rounding are used only on specific cases and exception handling are done most likely only on errors path). Only x86 implements similar optimization (on fenv.h) also indicates that these should no be on libc. The math/test-fenv already covers all math/test-fenvinline tests, so it is safe to remove it. The powerpc fegetround optimization is moved to internal fenv_libc.h. The BZ#94193 [1] the corresponding GCC bug for adding replacements for these on powerpc. Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193
2020-03-19math: Remove inline math testsAdhemerval Zanella1-1135/+0
With mathinline removal there is no need to keep building and testing inline math tests. The gen-libm-tests.py support to generate ULP_I_* is removed and all libm-test-ulps files are updated to longer have the i{float,double,ldouble} entries. The support for no-test-inline is also removed from both gen-auto-libm-tests and the auto-libm-test-out-* were regenerated. Checked on x86_64-linux-gnu and i686-linux-gnu.
2020-01-03Add libm_alias_finite for _finite symbolsWilco Dijkstra4-4/+12
This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers55-55/+55
2019-10-02[powerpc] No need to enter "Ignore Exceptions Mode"Paul A. Clarke1-4/+15
Since at least POWER8, there is no performance advantage to entering "Ignore Exceptions Mode", and doing so conditionally requires - the conditional logic, and - a system call. Make it a no-op for uses within glibc.
2019-09-27[powerpc] Rename fesetenv_mode to fesetenv_controlPaul A. Clarke5-6/+6
fesetenv_mode is used variously to write the FPSCR exception enable bits and rounding mode bits. These are referred to as the control bits in the POWER ISA. Change the name to be reflective of its current and expected use, and match up well with fegetenv_control.
2019-09-27[powerpc] libc_feholdsetround_noex_ppc_ctx: optimize FPSCR writePaul A. Clarke1-1/+1
libc_feholdsetround_noex_ppc_ctx currently performs: 1. Read FPSCR, save to context. 2. Create new FPSCR value: clear enables and set new rounding mode. 3. Write new value to FPSCR. Since other bits just pass through, there is no need to write them. Instead, write just the changed values (enables and rounding mode), which can be a bit more efficient.
2019-09-27[powerpc] Rename fegetenv_status to fegetenv_controlPaul A. Clarke7-9/+9
fegetenv_status is used variously to retrieve the FPSCR exception enable bits, rounding mode bits, or both. These are referred to as the control bits in the POWER ISA. FPSCR status bits are also returned by the 'mffs' and 'mffsl' instructions, but they are uniformly ignored by all uses of fegetenv_status. Change the name to be reflective of its current and expected use. Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
2019-09-27[powerpc] __fesetround_inline optimizationsPaul A. Clarke1-3/+15
On POWER9, use more efficient means to update the 2-bit rounding mode via the 'mffscrn' instruction (instead of two 'mtfsb0/1' instructions or one 'mtfsfi' instruction that modifies 4 bits). Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
2019-09-27[powerpc] libc_feupdateenv_test: optimize FPSCR accessPaul A. Clarke2-2/+18
ROUND_TO_ODD and a couple of other places use libc_feupdateenv_test to restore the rounding mode and exception enables, preserve exception flags, and test whether given exception(s) were generated. If the exception flags haven't changed, then it is sufficient and a bit more efficient to just restore the rounding mode and enables, rather than writing the full Floating-Point Status and Control Register (FPSCR). Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
2019-09-27[powerpc] fenv_private.h clean upPaul A. Clarke8-117/+38
fenv_private.h includes unused functions, magic macro constants, and some replicated common code fragments. Remove unused functions, replace magic constants with constants from fenv_libc.h, and refactor replicated code. Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-By: Paul E Murphy <murphyp@linux.ibm.com>
2019-09-19[powerpc] SET_RESTORE_ROUND optimizations and bug fixPaul A. Clarke2-25/+36
SET_RESTORE_ROUND brackets a block of code, temporarily setting and restoring the rounding mode and letting everything else, including exceptions generated within the block, pass through. On powerpc, the current code clears the exception enables, which will hide exceptions generated within the block. This issue was introduced by me in commit e905212627350d54b58426214b5a54ddc852b0c9. Fix this by not clearing exception enable bits in the prologue. Also, since we are no longer changing the enable bits in either the prologue or the epilogue, there is no need to test for entering/exiting non-stop mode. Also, optimize the prologue get/save/set rounding mode operations for POWER9 and later by using 'mffscrn' when possible. Suggested-by: Paul E. Murphy <murphyp@linux.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com> Fixes: e905212627350d54b58426214b5a54ddc852b0c9 2019-09-19 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_and_set_rn): New. (__fe_mffscrn): New. * sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc_ctx): Do not clear enable bits, remove obsolete code, use fegetenv_and_set_rn. (libc_feresetround_ppc): Remove obsolete code, use fegetenv_and_set_rn.