author | Wilco Dijkstra <wilco.dijkstra@arm.com> | 2020-11-19 15:57:52 +0000 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2020-11-19 16:05:33 +0000 |
commit | 1d77928fc49b4f2487fd78db26bbebd00f881414 (patch) | |
tree | 1121ac93d078a72f50bbb46975f8a6815fef1eb5 /gcc/tree-vect-loop.c | |
parent | 2729378d0905a04e476a8bdcaaf0288f417810ec (diff) | |
AArch64: Improve inline memcpy expansion
Improve the inline memcpy expansion. Use integer loads/stores instead of SIMD
for copies of 24 bytes or fewer. Set the maximum copy size to expand inline to
256 bytes by default; with -Os or without Neon, expand copies of up to 128
bytes. When using LDP/STP of Q-registers, also use Q-register accesses for the
unaligned tail, saving 2 instructions (e.g. all sizes up to 48 bytes emit
exactly 4 instructions). Clean up code and comments.
The code size gain vs. the GCC 10 expansion is 0.05% on SPECINT2017.
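The expansion decisions and the overlapping tail described above can be modelled in plain C. This is only a sketch under stated assumptions, not the code in aarch64_expand_cpymem: the names max_inline_copy, use_integer_regs and copy_33_to_48 are hypothetical, and the ldp/ldr/stp/str sequence is one plausible reading of "exactly 4 instructions" for a 33..48 byte copy. Each fixed-size memcpy stands in for a single load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: largest copy expanded inline, using the limits
   stated in the commit message (128 for -Os or no Neon, else 256).  */
static size_t max_inline_copy(int optimize_size, int have_simd)
{
    return (optimize_size || !have_simd) ? 128 : 256;
}

/* Hypothetical helper: copies of at most 24 bytes use integer registers.  */
static int use_integer_regs(size_t n)
{
    return n <= 24;
}

/* Copy n bytes, 32 < n <= 48, as one 32-byte block plus one 16-byte block
   that ends exactly at byte n.  The 16-byte block overlaps the first one
   whenever n < 48, so no per-size tail sequence is needed.  Both loads
   happen before either store.  */
static void copy_33_to_48(unsigned char *dst, const unsigned char *src,
                          size_t n)
{
    unsigned char head[32], tail[16];

    assert(n > 32 && n <= 48);
    memcpy(head, src, 32);            /* models LDP of two Q-registers */
    memcpy(tail, src + n - 16, 16);   /* models an unaligned Q-register load */
    memcpy(dst, head, 32);            /* models STP of two Q-registers */
    memcpy(dst + n - 16, tail, 16);   /* models an unaligned Q-register store */
}
```

Because the tail access is anchored at the end of the buffer rather than at the next aligned offset, the same four operations cover every size in the range; bytes in the overlap are simply written twice with the same value.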
2020-11-03 Wilco Dijkstra <wdijkstr@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_expand_cpymem): Cleanup code and
comments, tweak expansion decisions and improve tail expansion.