author    Xi Ruoyao <xry111@xry111.site>    2023-04-12 11:45:48 +0000
committer Xi Ruoyao <xry111@xry111.site>    2023-04-19 18:13:50 +0800
commit    6d7e0bcfa49e4ddc84dabe520bba8a023bc52692
tree      434e067faa3ff644a6bf44fb9fbdd9f902fc5299
parent    81c6501445fcddad653363f815cd04ca6fdb488e
LoongArch: Improve cpymemsi expansion [PR109465]
We'd been generating really bad block move sequences, which kernel developers recently complained about when they tried __builtin_memcpy.  To improve it:

1. Take advantage of -mno-strict-align.  When it is set, set the mode
   size to UNITS_PER_WORD regardless of the alignment.
2. Halve the mode size when (block size) % (mode size) != 0, instead of
   falling back to ld.bu/st.b at once.
3. Limit the length of the block move sequence by the number of
   instructions, not the size of the block.  When -mstrict-align is set
   and the block is not aligned, the old size limit for the
   straight-line implementation (64 bytes) was definitely too large (we
   don't have 64 registers anyway).

A standalone sketch of the mode-halving strategy follows the ChangeLog below.

Change since v1: add a comment about the calculation of num_reg.

gcc/ChangeLog:

	PR target/109465
	* config/loongarch/loongarch-protos.h
	(loongarch_expand_block_move): Add a parameter as alignment RTX.
	* config/loongarch/loongarch.h:
	(LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER): Remove.
	(LARCH_MAX_MOVE_BYTES_STRAIGHT): Remove.
	(LARCH_MAX_MOVE_OPS_PER_LOOP_ITER): Define.
	(LARCH_MAX_MOVE_OPS_STRAIGHT): Define.
	(MOVE_RATIO): Use LARCH_MAX_MOVE_OPS_PER_LOOP_ITER instead of
	LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER.
	* config/loongarch/loongarch.cc (loongarch_expand_block_move):
	Take the alignment from the parameter, but set it to
	UNITS_PER_WORD if !TARGET_STRICT_ALIGN.  Limit the length of
	the straight-line implementation with
	LARCH_MAX_MOVE_OPS_STRAIGHT instead of
	LARCH_MAX_MOVE_BYTES_STRAIGHT.
	(loongarch_block_move_straight): When there are left-over bytes,
	halve the mode size instead of falling back to byte mode at
	once.
	(loongarch_block_move_loop): Limit the length of the loop body
	with LARCH_MAX_MOVE_OPS_PER_LOOP_ITER instead of
	LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER.
	* config/loongarch/loongarch.md (cpymemsi): Pass the alignment
	to loongarch_expand_block_move.

gcc/testsuite/ChangeLog:

	PR target/109465
	* gcc.target/loongarch/pr109465-1.c: New test.
	* gcc.target/loongarch/pr109465-2.c: New test.
	* gcc.target/loongarch/pr109465-3.c: New test.
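As a rough illustration of points 1 and 2 above, here is a minimal,
self-contained C sketch.  This is not GCC source: plan_block_move, its
parameters and the fixed UNITS_PER_WORD value are all hypothetical, and
it only prints the access widths the straight-line expansion would pick
under the stated assumptions.

    #include <stdio.h>

    #define UNITS_PER_WORD 8  /* LoongArch64 word size, fixed here
			         for illustration.  */

    /* Print the access widths a straight-line move of LENGTH bytes
       would use, and return the number of moves (a proxy for the
       temporary registers needed).  */
    static int
    plan_block_move (long length, long align, int strict_align)
    {
      /* Point 1: with -mno-strict-align the hardware tolerates
	 unaligned accesses, so start from the full word size
	 regardless of the alignment.  */
      long delta = strict_align ? align : UNITS_PER_WORD;
      int num_reg = 0;

      while (length > 0)
	{
	  /* Point 2: halve the access size until it fits the
	     remaining bytes, instead of dropping to 1-byte
	     ld.bu/st.b at once.  */
	  while (delta > length)
	    delta /= 2;

	  long n = length / delta;
	  printf ("%ld move(s) of %ld byte(s)\n", n, delta);
	  num_reg += n;
	  length %= delta;
	}
      return num_reg;
    }

    int
    main (void)
    {
      /* 15 unaligned bytes under -mno-strict-align: one 8-, one 4-,
	 one 2- and one 1-byte move (4 moves) instead of fifteen
	 byte moves.  */
      printf ("num_reg = %d\n", plan_block_move (15, 1, 0));
      return 0;
    }

Counting moves rather than bytes is also the intuition behind point 3:
each move in the straight-line sequence ties up a temporary register,
so a limit expressed in operations (LARCH_MAX_MOVE_OPS_STRAIGHT) tracks
register pressure in a way a 64-byte size limit could not.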