author Joe Ramsay <Joe.Ramsay@arm.com> 2024-09-23 15:32:14 +0100
committer Wilco Dijkstra <wilco.dijkstra@arm.com> 2024-09-23 15:44:07 +0100
commit 5bc100bd4b7e00db3009ae93d25d303341545d23
tree 1aa1f7486b762b861a9292457a95f6cf2db23d6f /string/explicit_bzero.c
parent a15b1394b5eba98ffe28a02a392b587e4fe13c0d
AArch64: Improve codegen in users of AdvSIMD log1pf helper

log1pf is quite register-intensive - use fewer registers for the polynomial, and make various changes to shorten dependency chains in parent routines. There is now no spilling with GCC 14. Accuracy moves around a little - comments adjusted accordingly but does not require regen-ulps.

Use the helper in log1pf as well, instead of having separate implementations. The more accurate polynomial means special-casing can be simplified, and the shorter dependency chain avoids the usual dance around v0, which is otherwise difficult.

There is a small duplication of vectors containing 1.0f (or 0x3f800000) - GCC is not currently able to efficiently handle values which fit in FMOV but not MOVI, and are reinterpreted to integer. There may be potential for more optimisation if this is fixed.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

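
As a small illustration of the remark about 1.0f and 0x3f800000: the two are the same 32 bits, viewed once as a float and once as an integer. The sketch below is not taken from the glibc patch. The names float_bits, bits_to_float and check_one are hypothetical, and it uses portable C rather than AdvSIMD intrinsics so that it builds on any host. It assumes float is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): return the bit pattern of X.
   memcpy is used so the reinterpretation is well defined in ISO C.  */
static inline uint32_t
float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Hypothetical helper: the inverse reinterpretation, integer to float.  */
static inline float
bits_to_float (uint32_t u)
{
  float x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Check that 1.0f and 0x3f800000 are the same 32 bits, assuming float
   is IEEE 754 binary32 (sign 0, biased exponent 127, mantissa 0).  */
static inline void
check_one (void)
{
  assert (float_bits (1.0f) == UINT32_C (0x3f800000));
  assert (bits_to_float (UINT32_C (0x3f800000)) == 1.0f);
}
```

Calling check_one () from a test program should pass silently under that assumption. The commit message's point is about how the compiler materialises this constant in a vector register, which this scalar sketch does not attempt to reproduce.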
Diffstat (limited to 'string/explicit_bzero.c')
0 files changed, 0 insertions, 0 deletions