path: root/gcc/tree-loop-distribution.c
author: yaozhongxiao <yaozhongxiao@linux.alibaba.com> 2021-02-03 15:49:30 +0000
committer: Jonathan Wakely <jwakely@redhat.com> 2021-02-03 15:49:30 +0000
commit: 598876574184e745defee4b36dc2408068b7a22e
tree: ec9e5db823dea825e16f10418c11a0bd8c697d43 /gcc/tree-loop-distribution.c
parent: 3de9bd16c91c5fc050961db6887880b303b3a630
libstdc++: Improve "find_first/last_set" for NEON

The find_first_set and find_last_set methods are not optimal for NEON.
They can be improved by synthesizing them with a horizontal add (vaddv),
which reduces the generated assembly code.
In the following case, vaddvq_s16 generates 2 instructions, whereas the
chained vpaddq_s16 calls generate 4 instructions:

 # vaddvq_s16
 vaddvq_s16(__asint);
 // addv h0, v1.8h
 // smov w1, v0.h[0]

 # vpaddq_s16
 vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero), __zero)[0]
 // addp v1.8h,v1.8h,v2.8h
 // addp v1.8h,v1.8h,v2.8h
 // addp v1.8h,v1.8h,v2.8h
 // smov w1, v1.h[0]

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd_neon.h: Replace repeated vpadd
	calls with a single vaddv for aarch64.
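
The equivalence that the commit message relies on can be checked with a small portable model. The sketch below is not the code from simd_neon.h: it emulates the lane semantics of vpaddq_s16 and vaddvq_s16 with plain arrays so that it runs on any host. The names V8, add16, model_vpaddq_s16, model_vaddvq_s16, reduce_pairwise and reduce_horizontal are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for an int16x8_t NEON register.
using V8 = std::array<std::int16_t, 8>;

// 16-bit add with wrap-around, matching the modular arithmetic of the lanes.
static std::int16_t add16(std::int16_t a, std::int16_t b) {
    const auto sum = static_cast<std::uint16_t>(
        static_cast<std::uint16_t>(a) + static_cast<std::uint16_t>(b));
    return static_cast<std::int16_t>(sum);
}

// Scalar model of vpaddq_s16: adjacent pairs of `a` fill the low four
// lanes, adjacent pairs of `b` fill the high four lanes.
static V8 model_vpaddq_s16(const V8& a, const V8& b) {
    V8 r{};
    for (int i = 0; i < 4; ++i) {
        r[i] = add16(a[2 * i], a[2 * i + 1]);
        r[i + 4] = add16(b[2 * i], b[2 * i + 1]);
    }
    return r;
}

// Scalar model of vaddvq_s16: sum of all eight lanes.
static std::int16_t model_vaddvq_s16(const V8& a) {
    std::int16_t s = 0;
    for (std::int16_t x : a) s = add16(s, x);
    return s;
}

// Old shape: three pairwise adds against zero, then read lane 0.
std::int16_t reduce_pairwise(const V8& v) {
    const V8 zero{};
    return model_vpaddq_s16(
        model_vpaddq_s16(model_vpaddq_s16(v, zero), zero), zero)[0];
}

// New shape: one horizontal add.
std::int16_t reduce_horizontal(const V8& v) {
    return model_vaddvq_s16(v);
}
```

Each pairwise add halves the number of lanes that still carry data, so three of them are needed to fold eight lanes into lane 0, while the horizontal add performs the same fold in one step. On an aarch64 target the two model functions would correspond to the real intrinsics from arm_neon.h.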
Diffstat (limited to 'gcc/tree-loop-distribution.c')
0 files changed, 0 insertions, 0 deletions