author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2021-09-29 11:21:45 +0100
committer: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2021-09-29 11:21:45 +0100
commit: a459ee44c0a74b0df0485ed7a56683816c02aae9
tree: 28120613b6c3f40a1f797cab6d68499117140a49 /gcc/config.gcc
parent: 8f95e3c04d659d541ca4937b3df2f1175a1c5f05
aarch64: Improve size heuristic for cpymem expansion
Similar to my previous patch for setmem this one does the same for the
cpymem expansion. We count the number of ops emitted and compare it
against the alternative of just calling the library function when
optimising for size.

For the code:

    void
    cpy_127 (char *out, char *in)
    {
      __builtin_memcpy (out, in, 127);
    }

    void
    cpy_128 (char *out, char *in)
    {
      __builtin_memcpy (out, in, 128);
    }

we now emit a call to memcpy (with an extra MOV-immediate instruction
for the size) instead of:

    cpy_127(char*, char*):
            ldp     q0, q1, [x1]
            stp     q0, q1, [x0]
            ldp     q0, q1, [x1, 32]
            stp     q0, q1, [x0, 32]
            ldp     q0, q1, [x1, 64]
            stp     q0, q1, [x0, 64]
            ldr     q0, [x1, 96]
            str     q0, [x0, 96]
            ldr     q0, [x1, 111]
            str     q0, [x0, 111]
            ret
    cpy_128(char*, char*):
            ldp     q0, q1, [x1]
            stp     q0, q1, [x0]
            ldp     q0, q1, [x1, 32]
            stp     q0, q1, [x0, 32]
            ldp     q0, q1, [x1, 64]
            stp     q0, q1, [x0, 64]
            ldp     q0, q1, [x1, 96]
            stp     q0, q1, [x0, 96]
            ret

which is a clear code size win. Speed optimisation heuristics remain
unchanged.

2021-09-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* config/aarch64/aarch64.c (aarch64_expand_cpymem): Count number of
	emitted operations and adjust heuristic for code size.

2021-09-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* gcc.target/aarch64/cpymem-size.c: New test.
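
The counting idea described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in aarch64_expand_cpymem: the function names, the chunking rule (32-byte LDP/STP pairs, then one smaller power-of-two copy plus at most one overlapping copy for the tail) and the libcall cost of two instructions (a MOV-immediate for the size plus the branch to memcpy) are all inferred from the assembly listed in the message and are hypothetical.

```c
#include <stdbool.h>

/* ASSUMPTION: cost of the library-call alternative, in instructions.
   Modelled on the message's example: one MOV-immediate for the size
   plus one branch to memcpy.  Not taken from the GCC sources.  */
#define ASSUMED_LIBCALL_INSNS 2u

/* Hypothetical model: count the load/store instructions an inline
   expansion of an n-byte copy would need.  */
static unsigned
inline_copy_insns (unsigned n)
{
  unsigned insns = 0;

  /* Bulk: each 32-byte chunk is one LDP plus one STP of two Q registers.  */
  while (n >= 32)
    {
      insns += 2;
      n -= 32;
    }

  if (n > 0)
    {
      /* Largest power-of-two chunk (at most 16 bytes) that fits.  */
      unsigned chunk = 16;
      while (chunk > n)
        chunk /= 2;
      insns += 2;   /* one load + one store */
      n -= chunk;

      /* Whatever is left is smaller than 'chunk', so a single copy that
         overlaps bytes already written can cover it.  */
      if (n > 0)
        insns += 2;
    }

  return insns;
}

/* When optimising for size, prefer the library call if the inline
   sequence would be longer than the assumed call sequence.  */
static bool
prefer_libcall_for_size (unsigned n)
{
  return inline_copy_insns (n) > ASSUMED_LIBCALL_INSNS;
}
```

Under these assumptions the model gives 10 load/store instructions for the 127-byte copy and 8 for the 128-byte copy, matching the listings above (excluding the final ret), so both lose to the two-instruction call sequence when optimising for size.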