author Danila Kutenin <danilak@google.com> 2022-06-27 16:12:13 +0000
committer Szabolcs Nagy <szabolcs.nagy@arm.com> 2022-07-06 09:26:20 +0100
commit 3c9980698988ef64072f1fac339b180f52792faf
tree 3c32dabb3fcbfa564647fcedd9be5c7674a30fc2 /iconvdata/ibm1133.h
parent bd0b58837c7df091046e7531642f379a52e1e157
aarch64: Optimize string functions with shrn instruction
We found that the string functions were using AND+ADDP to compute the nibble/syndrome mask, but the same mask can be obtained more simply with `SHRN dst.8b, src.8h, 4` (shift each 16-bit element right by 4 and narrow it to 1 byte), which has the same latency as ADDP on all SIMD ARMv8 targets. There may be similar opportunities in memcmp, but that is left for another patch. We see 10-20% savings for small to mid-size cases (<=128), which are the primary cases for general workloads.
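
To make the nibble mask concrete, here is a minimal scalar sketch in portable C of what `SHRN dst.8b, src.8h, 4` does to a 16-byte comparison result. It is not the glibc assembly: the helper names `shrn4_model` and `first_match_index` are hypothetical, the comparison loop only stands in for a vector compare such as CMEQ, and little-endian lane layout is assumed. Each input byte (0x00 or 0xff) ends up as one 4-bit nibble of a 64-bit mask, so the index of the first match is the number of trailing zero bits divided by 4.

```c
#include <stdint.h>

/* Scalar model of `SHRN dst.8b, src.8h, #4` (hypothetical helper).
   The 16 input bytes are viewed as eight little-endian 16-bit lanes;
   each lane is shifted right by 4 and its low 8 bits are kept.  The
   eight narrowed bytes are packed into one 64-bit value.  */
static uint64_t
shrn4_model (const uint8_t cmp[16])
{
  uint64_t mask = 0;
  for (int lane = 0; lane < 8; lane++)
    {
      uint16_t h = (uint16_t) (cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
      uint8_t narrowed = (uint8_t) (h >> 4);
      mask |= (uint64_t) narrowed << (8 * lane);
    }
  return mask;
}

/* Return the index of the first byte equal to C in a 16-byte chunk,
   or -1 if there is none (hypothetical helper).  */
static int
first_match_index (const uint8_t chunk[16], uint8_t c)
{
  uint8_t cmp[16];

  /* Stand-in for a vector compare: 0xff where equal, 0x00 elsewhere.  */
  for (int i = 0; i < 16; i++)
    cmp[i] = chunk[i] == c ? 0xff : 0x00;

  uint64_t mask = shrn4_model (cmp);
  if (mask == 0)
    return -1;

  /* Count trailing zero bits; every input byte owns 4 mask bits.  */
  int bit = 0;
  while (((mask >> bit) & 1) == 0)
    bit++;
  return bit / 4;
}
```

Because byte i of the comparison result maps to bits 4*i through 4*i+3 of the mask, a single shift-and-narrow is enough to produce a mask that fits in a general-purpose register, which is the role the AND+ADDP pair played before.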
Diffstat (limited to 'iconvdata/ibm1133.h')
0 files changed, 0 insertions, 0 deletions