author | Alexandre Oliva <oliva@adacore.com> | 2021-05-03 22:48:47 -0300 |
---|---|---|
committer | Alexandre Oliva <oliva@gnu.org> | 2021-05-03 22:48:47 -0300 |
commit | da9e6e63d1ae22e530ec7baf59f6ed028bf05776 (patch) | |
tree | 41e492d87df336bef4a7c9bb310627ba3fcb62aa /gcc/expr.h | |
parent | e690396da796cc4e1a0592336b37fec4e97262da (diff) | |
introduce try store by multiple pieces
The ldist pass turns even very short loops into memset calls. E.g.,
the TFmode emulation calls end with a loop of up to 3 iterations, to
zero out trailing words, and the loop distribution pass turns them
into calls of the memset builtin.
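As an illustration only (this is not the actual TFmode emulation code, and the function name is hypothetical), a loop of the following shape is the kind that loop distribution recognizes as a memset:

```c
#include <stddef.h>

/* Hypothetical example: zero out N trailing words.  Loop distribution
   can replace the loop with the equivalent of
   memset (words, 0, n * sizeof (*words)), even when N is at most 3.  */
void
clear_trailing_words (unsigned long *words, size_t n)
{
  for (size_t i = 0; i < n; i++)
    words[i] = 0;
}
```

Note that the resulting length is not a compile-time constant, but it is bounded and is a multiple of the word size.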
Though short constant-length clearing memsets are usually dealt with
efficiently, for non-constant-length ones, the options are setmemM, or
a function call.
RISC-V doesn't have any setmemM pattern, so the loops above end up
"optimized" into memset calls, incurring not only the overhead of an
explicit call, but also discarding the information the compiler has
about the alignment of the destination, and that the length is a
multiple of the word alignment.
This patch handles variable lengths with multiple conditional
power-of-2-constant-sized stores-by-pieces, so as to reduce the
overhead of length compares.
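The idea can be sketched in plain C as follows. This is a minimal sketch under stated assumptions, not the code added by the patch, which emits RTL: the helper name is hypothetical, LEN is assumed to be at most MAX_LEN and a multiple of 1 << CTZ_LEN, and each memset below stands in for a constant-sized store_by_pieces expansion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: store VALC into LEN bytes at TO,
   assuming LEN <= MAX_LEN and LEN is a multiple of (1 << CTZ_LEN).  */
static void
store_by_multiple_pieces_sketch (unsigned char *to, size_t len,
                                 unsigned ctz_len, size_t max_len,
                                 int valc)
{
  /* Largest power of two not exceeding MAX_LEN (1 if MAX_LEN is 0).  */
  size_t blk = 1;
  while (blk <= max_len / 2)
    blk <<= 1;

  /* One length compare per power of two, largest first.  Blocks
     smaller than (1 << CTZ_LEN) are never needed, so they are
     skipped entirely.  */
  for (; blk >= ((size_t) 1 << ctz_len); blk >>= 1)
    if (len >= blk)
      {
        memset (to, valc, blk);  /* Constant size in expanded code.  */
        to += blk;
        len -= blk;
      }
}
```

With MAX_LEN of 24 and CTZ_LEN of 3, for instance, only two compares are needed (for the 16-byte and 8-byte blocks), rather than one per byte or per word.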
It also changes the last copy-prop pass into ccp, so that pointer
alignment and length's nonzero bits are detected and made available
for the expander, even for ldist-introduced SSA_NAMEs.
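To illustrate why the length's nonzero bits matter, here is a small sketch, assuming a mask in which a set bit means "this bit of the length may be nonzero". The helper name is hypothetical and this is not how GCC represents or queries the information internally; it only shows how a count of trailing zeros follows from such a mask.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of low-order bits of a length that are
   known to be zero, given a mask of its possibly-nonzero bits.  */
static unsigned
ctz_of_nonzero_bits (size_t mask)
{
  unsigned n = 0;

  /* A zero mask means the value itself is known to be zero.  */
  if (mask == 0)
    return sizeof (mask) * CHAR_BIT;

  while ((mask & 1) == 0)
    {
      mask >>= 1;
      n++;
    }
  return n;
}
```

For a length of the form n * 8 with n between 0 and 3, the possibly-nonzero bits are 0x18, so the count is 3 and no store smaller than 8 bytes is ever needed.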
for gcc/ChangeLog
* builtins.c (try_store_by_multiple_pieces): New.
(expand_builtin_memset_args): Use it. If target_char_cast
fails, proceed as for non-constant val. Pass len's ctz to...
* expr.c (clear_storage_hints): ... this. Try store by
multiple pieces after setmem.
(clear_storage): Adjust.
* expr.h (clear_storage_hints): Likewise.
(try_store_by_multiple_pieces): Declare.
* passes.def: Replace the last copy_prop with ccp.
Diffstat (limited to 'gcc/expr.h')
-rw-r--r-- | gcc/expr.h | 13 |
1 files changed, 12 insertions, 1 deletions
@@ -201,7 +201,8 @@ extern rtx clear_storage_hints (rtx, rtx, enum block_op_methods,
                                 unsigned int, HOST_WIDE_INT,
                                 unsigned HOST_WIDE_INT,
                                 unsigned HOST_WIDE_INT,
-                                unsigned HOST_WIDE_INT);
+                                unsigned HOST_WIDE_INT,
+                                unsigned);
 /* The same, but always output an library call. */
 extern rtx set_storage_via_libcall (rtx, rtx, rtx, bool = false);
 
@@ -232,6 +233,16 @@ extern int can_store_by_pieces (unsigned HOST_WIDE_INT,
 extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT, by_pieces_constfn,
                             void *, unsigned int, bool, memop_ret);
 
+/* If can_store_by_pieces passes for worst-case values near MAX_LEN, call
+   store_by_pieces within conditionals so as to handle variable LEN efficiently,
+   storing VAL, if non-NULL_RTX, or valc instead. */
+extern bool try_store_by_multiple_pieces (rtx to, rtx len,
+                                          unsigned int ctz_len,
+                                          unsigned HOST_WIDE_INT min_len,
+                                          unsigned HOST_WIDE_INT max_len,
+                                          rtx val, char valc,
+                                          unsigned int align);
+
 /* Emit insns to set X from Y. */
 extern rtx_insn *emit_move_insn (rtx, rtx);
 extern rtx_insn *gen_move_insn (rtx, rtx);