author: Noah Goldstein <goldstein.w.n@gmail.com> 2021-05-04 19:02:40 -0400
committer: Noah Goldstein <goldstein.w.n@gmail.com> 2021-05-08 16:26:30 -0400
commit: 104c7b1967c3e78435c6f7eab5e225a7eddf9c6e
tree: 0b79a2d6c085b43bcf667f74f4bc974cbf970bd6 /stdio-common
parent: 6ea916adfa0ab9af6e7dc6adcf6f977dfe017835
x86: Add EVEX optimized memchr family not safe for RTM
No bug.

This commit adds a new implementation for EVEX memchr that is not safe
for RTM because it uses vzeroupper. The benefit is that by using
ymm0-ymm15 it can use vpcmpeq and vpternlogd in the 4x loop which is
faster than the RTM safe version which cannot use vpcmpeq because
there is no EVEX encoding for the instruction. All parts of the
implementation aside from the 4x loop are the same for the two
versions and the optimization is only relevant for large sizes.

Tigerlake:
size , algn , Pos  , Cur T , New T , Win    , Dif
512  , 6    , 192  , 9.2   , 9.04  , no-RTM , 0.16
512  , 7    , 224  , 9.19  , 8.98  , no-RTM , 0.21
2048 , 0    , 256  , 10.74 , 10.54 , no-RTM , 0.2
2048 , 0    , 512  , 14.81 , 14.87 , RTM    , 0.06
2048 , 0    , 1024 , 22.97 , 22.57 , no-RTM , 0.4
2048 , 0    , 2048 , 37.49 , 34.51 , no-RTM , 2.98   <--

Icelake:
size , algn , Pos  , Cur T , New T , Win    , Dif
512  , 6    , 192  , 7.6   , 7.3   , no-RTM , 0.3
512  , 7    , 224  , 7.63  , 7.27  , no-RTM , 0.36
2048 , 0    , 256  , 8.48  , 8.38  , no-RTM , 0.1
2048 , 0    , 512  , 11.57 , 11.42 , no-RTM , 0.15
2048 , 0    , 1024 , 17.92 , 17.38 , no-RTM , 0.54
2048 , 0    , 2048 , 30.37 , 27.34 , no-RTM , 3.03   <--

test-memchr, test-wmemchr, and test-rawmemchr are all passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
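
For readers unfamiliar with the "4x loop" the message refers to, the following is a rough scalar model in C of that loop shape: compare four vector-sized blocks, merge the four results, and branch once per iteration. It is only a sketch of the control flow, not the glibc implementation, which is x86 assembly built on the vector instructions named above. The names memchr_4x_sketch and vec_has_byte, and the use of a 32-byte block to stand in for one ymm register, are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: one "vector" is 32 bytes, the width of a ymm register. */
#define VEC_SIZE 32

/* Scalar stand-in for a vector byte compare (hypothetical helper):
   returns nonzero if any of the VEC_SIZE bytes at p equals c. */
static int vec_has_byte(const unsigned char *p, unsigned char c)
{
    int hit = 0;
    for (size_t i = 0; i < VEC_SIZE; i++)
        hit |= (p[i] == c);
    return hit;
}

/* Hypothetical memchr-like search with a 4x-unrolled main loop. */
static void *memchr_4x_sketch(const void *s, int ch, size_t n)
{
    const unsigned char *p = s;
    unsigned char c = (unsigned char)ch;

    /* Main loop: test four blocks, merge the results, branch once. */
    while (n >= 4 * VEC_SIZE) {
        int any = vec_has_byte(p, c)
                | vec_has_byte(p + VEC_SIZE, c)
                | vec_has_byte(p + 2 * VEC_SIZE, c)
                | vec_has_byte(p + 3 * VEC_SIZE, c);
        if (any)
            break;              /* the first match lies in these 4 blocks */
        p += 4 * VEC_SIZE;
        n -= 4 * VEC_SIZE;
    }

    /* Handles the tail, and pinpoints the byte after a main-loop hit. */
    for (size_t i = 0; i < n; i++)
        if (p[i] == c)
            return (void *)(p + i);
    return NULL;
}
```

Called as memchr_4x_sketch(buf, 'b', len), the function should return the same pointer as memchr(buf, 'b', len). The single merged branch per four blocks is why this kind of unrolling mostly pays off for large sizes, which matches the 2048-byte rows marked "<--" in the tables.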
Diffstat (limited to 'stdio-common')
0 files changed, 0 insertions, 0 deletions