author | Wilco Dijkstra <wilco.dijkstra@arm.com> | 2024-02-21 23:33:58 +0000 |
---|---|---|
committer | Wilco Dijkstra <wilco.dijkstra@arm.com> | 2024-03-07 21:25:23 +0000 |
commit | 19b23bf3c32df3cbb96b3d898a1d7142f7bea4a0 (patch) | |
tree | 71284506e4ea695cad5ca367ef17b202e1612dee /gcc/c | |
parent | 0552560f6d2eaa1ae6df5c80660b489de1d5c772 (diff) | |
AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]
The new RTL introduced for LDP/STP results in regressions due to the use of UNSPEC.
Given the new LDP fusion pass is good at finding LDP opportunities, change the
memcpy, memmove and memset expansions to emit single vector loads/stores.
This fixes the regression and enables more RTL optimization on the standard
memory accesses. Handling of the unaligned tail of memcpy/memmove is improved
with -mgeneral-regs-only. SPEC2017 performance improves slightly. Code size
is slightly worse due to missed LDP opportunities, as discussed in the PR.
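For illustration only, the following is a minimal sketch of the kind of small, fixed-size calls that these expansions apply to. It is not the pr113618.c test added by this commit; the function names (copy32, copy23, zero32, self_check) and the sizes are assumptions chosen for the example. Compiling it for AArch64 with something like -O2 -S and reading the assembly is one way to see whether the inline expansion uses single loads/stores that the LDP fusion pass may later pair; the exact instructions depend on the compiler version and tuning.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example functions, not taken from the GCC testsuite. */

/* Fixed 32-byte copy: small enough that the compiler may expand it
   inline instead of calling memcpy.  */
void copy32(char *dst, const char *src)
{
    memcpy(dst, src, 32);
}

/* 23 bytes is not a multiple of 16, so an inline expansion has to
   deal with an unaligned tail after the first 16-byte block.  */
void copy23(char *dst, const char *src)
{
    memcpy(dst, src, 23);
}

/* Fixed 32-byte clear, a candidate for the memset expansion.  */
void zero32(char *dst)
{
    memset(dst, 0, 32);
}

/* Checks behaviour only; it says nothing about which instructions
   were emitted.  Returns 1 if every check passes.  */
int self_check(void)
{
    char src[32], dst[32];

    for (int i = 0; i < 32; i++) {
        src[i] = (char)(i + 1);   /* nonzero pattern 1..32 */
        dst[i] = 0;
    }

    copy32(dst, src);
    assert(memcmp(dst, src, 32) == 0);

    zero32(dst);
    for (int i = 0; i < 32; i++)
        assert(dst[i] == 0);

    copy23(dst, src);
    assert(memcmp(dst, src, 23) == 0);
    assert(dst[23] == 0);         /* byte past the copy is untouched */

    return 1;
}
```

Adding -mgeneral-regs-only to the same compile is a simple way to compare how the 23-byte tail is handled when vector registers are unavailable.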
gcc/ChangeLog:
PR target/113618
* config/aarch64/aarch64.cc (aarch64_copy_one_block): Remove.
(aarch64_expand_cpymem): Emit single load/store only.
(aarch64_set_one_block): Emit single stores only.
gcc/testsuite/ChangeLog:
PR target/113618
* gcc.target/aarch64/pr113618.c: New test.