rocket-tools/riscv-gnu-toolchain/glibc.git

author	Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-06-16 10:17:37 -0300
committer	Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-07-11 13:01:31 -0300
commit	c055c54e960579619304c7fb998e6bc12e82c5bd (patch)
tree	c4db98a12d980896de92f1645478f883093721da /malloc/tst-malloc-alternate-path.c
parent	3d3572f59059e2b19b8541ea648a6172136ec42e (diff)

x86_64: Optimize modf/modff for x86_64-v2

SSE4.1 provides a direct instruction for trunc, which improves
modf/modff performance with a smaller text size. On Ryzen 9 (zen3)
with gcc 14.2.1:

x86_64-v2

reciprocal-throughput        master     patch   difference
workload-0_1                 7.9610    7.7914        2.13%
workload-1_maxint            9.4323    7.8021       17.28%
workload-maxint_maxfloat     8.7379    7.8049       10.68%
workload-integral            7.9492    7.7991        1.89%

latency                      master     patch   difference
workload-0_1                 7.9511   10.8910      -36.97%
workload-1_maxint           15.8278   10.9048       31.10%
workload-maxint_maxfloat    11.3495   10.9139        3.84%
workload-integral           11.5938   10.9071        5.92%

x86_64-v3

reciprocal-throughput        master     patch   difference
workload-0_1                 8.7522    7.9781        8.84%
workload-1_maxint            9.6690    7.9872       17.39%
workload-maxint_maxfloat     8.7634    7.9857        8.87%
workload-integral            8.7397    7.9893        8.59%

latency                      master     patch   difference
workload-0_1                 8.7447    9.5589       -9.31%
workload-1_maxint           13.7480    9.5690       30.40%
workload-maxint_maxfloat    10.0092    9.5680        4.41%
workload-integral            9.7518    9.5743        1.82%

For x86_64-v1 the optimization is done through a new ifunc selector.
The avx variant is provided to follow other SSE4_1 optimizations (like
trunc) and avoid the ifunc for x86_64-v3.

Checked on x86_64-linux-gnu.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Diffstat (limited to 'malloc/tst-malloc-alternate-path.c')

0 files changed, 0 insertions, 0 deletions

