author | Robin Dapp <rdapp@ventanamicro.com> | 2025-07-10 09:41:48 +0200 |
---|---|---|
committer | Robin Dapp <rdapp@ventanamicro.com> | 2025-07-10 15:56:20 +0200 |
commit | dcba959fb30dc250eeb6fdd05aa878e5f1fc8c2d (patch) | |
tree | 9151fa4694203cdd3d7d33152b901dd4781a786d /libgcc/libgcc2.c | |
parent | e6f2daff77ee1f709105cb9f8e3e92f04c179431 (diff) | |
RISC-V: Make zero-stride load broadcast a tunable.
This patch makes the zero-stride load broadcast idiom dependent on a
uarch-tunable "use_zero_stride_load". Right now we have quite a few
paths that reach a strided load and some of them are not exactly
straightforward.
While broadcast is relatively rare on rv64 targets, it is more common on
rv32 targets that want to vectorize 64-bit elements.
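For readers unfamiliar with the idiom, the following is a minimal scalar
model of why a strided load with a stride of zero behaves like a broadcast:
every element is read from the same address. It is a sketch of the semantics
only, not code from GCC; the names strided_load_model and
broadcast_via_zero_stride are hypothetical and exist purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Scalar model of a strided vector load of 64-bit elements:
// element i is read from base + i * stride_bytes.
// This models the semantics only; it is not GCC code.
std::vector<std::int64_t>
strided_load_model (const unsigned char *base, std::ptrdiff_t stride_bytes,
                    std::size_t vl)
{
  std::vector<std::int64_t> result (vl);
  for (std::size_t i = 0; i < vl; ++i)
    {
      const unsigned char *addr
        = base + static_cast<std::ptrdiff_t> (i) * stride_bytes;
      std::memcpy (&result[i], addr, sizeof (std::int64_t));
    }
  return result;
}

// With a stride of zero every element comes from the same address,
// so the load degenerates into a broadcast of one in-memory scalar.
std::vector<std::int64_t>
broadcast_via_zero_stride (const std::int64_t *scalar, std::size_t vl)
{
  return strided_load_model (reinterpret_cast<const unsigned char *> (scalar),
                             0, vl);
}
```

Because the scalar is read from memory rather than from a scalar register,
the idiom can also supply a 64-bit element where a single scalar register is
only 32 bits wide, which is presumably why it shows up more often on rv32.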
While the patch is more involved than I would have liked, it could have
touched even more places. The whole broadcast-like insn path feels a
bit hackish due to the several optimizations we employ. Some of the
complications stem from the fact that we lump together real broadcasts,
vector single-element sets, and strided broadcasts. The strided-load
alternatives currently require a memory_constraint to work properly
which causes more complications when trying to disable just these.
In short, the whole pred_broadcast handling in combination with the
sew64_scalar_helper could use work in the future. I was about to start
with it in this patch but soon realized that it would only distract from
the original intent. What can help in the future is to split strided and
non-strided broadcasts entirely, as well as the single-element sets.
It is still unclear whether we need to pay special attention to misaligned
strided loads (PR120782).
I regtested on rv32 and rv64 with strided_load_broadcast_p forced to
true and false. With either setting I didn't observe any new execution
failures, but obviously there are new scan failures with strided broadcast
turned off.
PR target/118734
gcc/ChangeLog:
* config/riscv/constraints.md (Wdm): Use tunable for Wdm
constraint.
* config/riscv/riscv-protos.h (emit_avltype_insn): Declare.
(can_be_broadcasted_p): Rename to...
(can_be_broadcast_p): ...this.
* config/riscv/predicates.md: Use renamed function.
(strided_load_broadcast_p): Declare.
* config/riscv/riscv-selftests.cc (run_broadcast_selftests):
Only run broadcast selftest if strided broadcasts are OK.
* config/riscv/riscv-v.cc (emit_avltype_insn): New function.
(sew64_scalar_helper): Only emit a pred_broadcast if the new
tunable says so.
(can_be_broadcasted_p): Rename to...
(can_be_broadcast_p): ...this and use new tunable.
* config/riscv/riscv.cc (struct riscv_tune_param): Add strided
broadcast tunable.
(strided_load_broadcast_p): Implement.
* config/riscv/vector.md: Use strided_load_broadcast_p () and
work around 64-bit broadcast on rv32 targets.
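
As a rough illustration of how a per-uarch tunable can gate the idiom, here
is a simplified, self-contained model. It is not the code from riscv.cc:
tune_param_model, broadcast_strategy, and choose_memory_broadcast are
hypothetical names, and the fallback of a scalar load followed by a register
splat is an assumption made for the example. Only the field name
use_zero_stride_load and the predicate name strided_load_broadcast_p are
taken from the commit message.

```cpp
// Hypothetical, simplified stand-in for a per-uarch tuning record.
// The real struct riscv_tune_param has many more fields.
struct tune_param_model
{
  bool use_zero_stride_load;
};

// Two ways a scalar that lives in memory could be broadcast.
// The second alternative is an assumption for this sketch.
enum class broadcast_strategy
{
  zero_stride_load,
  scalar_load_then_splat
};

// Model of the predicate: simply report what the tuning record says.
bool
strided_load_broadcast_p_model (const tune_param_model &tune)
{
  return tune.use_zero_stride_load;
}

// Pick a strategy based on the tunable only.
broadcast_strategy
choose_memory_broadcast (const tune_param_model &tune)
{
  return strided_load_broadcast_p_model (tune)
           ? broadcast_strategy::zero_stride_load
           : broadcast_strategy::scalar_load_then_splat;
}
```

Routing every decision through one predicate keeps the uarch-specific choice
in a single place, so the constraint, the predicates, and the expanders can
all consult the same answer.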