author     Roger Sayle <roger@nextmovesoftware.com>   2023-10-30 16:17:42 +0000
committer  Roger Sayle <roger@nextmovesoftware.com>   2023-10-30 16:17:42 +0000
commit     31cc9824d1cd5e0f7fa145d0831a923479333cd6 (patch)
tree       f6aa115934bf827a620e69a88e1df8600fd6ab0a /libcpp/line-map.cc
parent     d24c3c533454449f4bfe3db5a4c7d0eaff08e3c7 (diff)
ARC: Improved ARC rtx_costs/insn_cost for SHIFTs and ROTATEs.
This patch overhauls the ARC backend's insn_cost target hook, and makes
some related improvements to rtx_costs, BRANCH_COST, etc. The primary
goal is to allow the backend to indicate that shifts and rotates are
slow (discouraged) when the CPU doesn't have a barrel shifter. I should
also acknowledge Richard Sandiford for inspiring the use of set_cost
in this rewrite of arc_insn_cost; this implementation borrows heavily
from the target hooks for AArch64 and ARM.
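As a rough illustration of the costing idea (my numbers, not the patch's): with a barrel shifter a constant shift is a single instruction, but without one its cost should grow with the shift count, so that the middle-end is steered away from long shifts. A minimal standalone sketch:

#include <stdio.h>

/* Same scaling as COSTS_N_INSNS in GCC's rtl.h: one insn == 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Illustrative cost of a shift/rotate by constant COUNT.  Without a
   barrel shifter the ARC must emit COUNT single-bit shifts (or a
   COUNT-iteration zero-overhead loop), so price it linearly.  */
static int
shift_cost (int count, int have_barrel_shifter)
{
  return COSTS_N_INSNS (have_barrel_shifter ? 1 : count);
}

int main (void)
{
  printf ("shift by 27, barrel shifter:    %d\n", shift_cost (27, 1));
  printf ("shift by 27, no barrel shifter: %d\n", shift_cost (27, 0));
  printf ("shift by 3,  no barrel shifter: %d\n", shift_cost (3, 0));
  return 0;
}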
The motivating example is derived from PR rtl-optimization/110717.
struct S { int a : 5; };
unsigned int foo (struct S *p) {
  return p->a;
}
With a barrel shifter, GCC -O2 generates the reasonable:
foo:    ldb_s   r0,[r0]
        asl_s   r0,r0,27
        j_s.d   [blink]
        asr_s   r0,r0,27
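The asl/asr pair is the usual open-coded sign extension of a 5-bit field (27 = 32 - 5). A standalone C model of the same computation (names are mine, not from the patch; converting an out-of-range unsigned value to int is implementation-defined in ISO C, but GCC defines it as wraparound):

#include <stdio.h>

/* Sign-extend the low 5 bits of X: shift the field up to the sign
   bit, then arithmetic-shift it back down.  */
static int
sext5 (unsigned int x)
{
  return (int) (x << 27) >> 27;
}

int main (void)
{
  printf ("%d\n", sext5 (0x1f)); /* field 11111 -> -1 */
  printf ("%d\n", sext5 (0x0f)); /* field 01111 -> 15 */
  return 0;
}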
What's interesting is that during combine, the middle-end actually
has two shifts by three bits, and a sign-extension from QI to SI.
Trying 8, 9 -> 11:
    8: r158:SI=r157:QI#0<<0x3
      REG_DEAD r157:QI
    9: r159:SI=sign_extend(r158:SI#0)
      REG_DEAD r158:SI
   11: r155:SI=r159:SI>>0x3
      REG_DEAD r159:SI
Whilst it's reasonable to simplify this to two shifts by 27 bits when
the CPU has a barrel shifter, it's actually a significant pessimization
when these shifts are implemented by loops. This combination can be
prevented if the backend provides accurate-ish estimates for insn_cost.
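Concretely, combine only keeps a substitution when the summed insn_cost of the replacement insns does not exceed that of the originals (see combine_validate_cost in combine.cc). A standalone model of that comparison for this example, reusing the illustrative linear shift costs sketched earlier:

#include <stdio.h>

#define COSTS_N_INSNS(N) ((N) * 4)

static int
shift_cost (int count, int barrel)
{
  return COSTS_N_INSNS (barrel ? 1 : count);
}

int main (void)
{
  for (int barrel = 0; barrel <= 1; barrel++)
    {
      /* Original insns 8, 9 and 11: shift by 3, sign-extend, shift by 3.  */
      int oldc = shift_cost (3, barrel) + COSTS_N_INSNS (1)
                 + shift_cost (3, barrel);
      /* Proposed replacement: two shifts by 27.  */
      int newc = 2 * shift_cost (27, barrel);
      printf ("barrel=%d: old=%d new=%d -> %s\n", barrel, oldc, newc,
              newc <= oldc ? "combine" : "reject");
    }
  return 0;
}

Without a barrel shifter this rejects the rewrite (216 > 28); with one it accepts it (8 <= 12), matching the behavior described above.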
Previously, without a barrel shifter, GCC -O2 -mcpu=em generates:
foo:    ldb_s   r0,[r0]
        mov     lp_count,27
        lp      2f
        add     r0,r0,r0
        nop
2:      # end single insn loop
        mov     lp_count,27
        lp      2f
        asr     r0,r0
        nop
2:      # end single insn loop
        j_s     [blink]
which contains two zero-overhead loops and requires ~113 cycles to execute (the two 27-iteration loops dominate the count).
With this patch to rtx_cost/insn_cost, GCC -O2 -mcpu=em generates:
foo:    ldb_s   r0,[r0]
        mov_s   r2,0    ;3
        add3    r0,r2,r0
        sexb_s  r0,r0
        asr_s   r0,r0
        asr_s   r0,r0
        j_s.d   [blink]
        asr_s   r0,r0
which requires only ~6 cycles, as the shorter shifts by 3 and the sign
extension are retained instead of being combined into two shifts by 27.
2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/arc/arc.cc (arc_rtx_costs): Improve cost estimates.
        Provide reasonable values for SHIFTs and ROTATEs by constant
        bit counts depending upon TARGET_BARREL_SHIFTER.
        (arc_insn_cost): Use insn attributes if the instruction is
        recognized.  Avoid calling get_attr_length for type "multi",
        i.e. define_insn_and_split patterns without an explicit type.
        Fall back to set_rtx_cost for single_set and to pattern_cost
        otherwise.
        * config/arc/arc.h (COSTS_N_BYTES): Define helper macro.
        (BRANCH_COST): Improve/correct definition.
        (LOGICAL_OP_NON_SHORT_CIRCUIT): Preserve previous behavior.
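For reference, here is a hedged sketch of the control flow the arc_insn_cost entry above describes. This is my reconstruction from the ChangeLog, not the committed function: it only compiles inside the GCC source tree, the helpers used (recog_memoized, get_attr_type, get_attr_length, single_set, set_rtx_cost, pattern_cost) are real GCC internals, and the COSTS_N_BYTES scaling shown is an assumption mirroring the i386 backend.

/* Assumed definition; the i386 backend uses the same scaling.  */
#define COSTS_N_BYTES(N) ((N) * 2)

static int
sketch_insn_cost (rtx_insn *insn, bool speed)
{
  if (recog_memoized (insn) >= 0)
    {
      /* Recognized insn: attributes are available.  Don't query the
         length of type "multi" insns (define_insn_and_split patterns
         without an explicit type), where it is not meaningful.  */
      if (!speed && get_attr_type (insn) != TYPE_MULTI)
        return COSTS_N_BYTES (get_attr_length (insn));
    }
  /* Fall back to rtx costs: set_rtx_cost for a single SET,
     pattern_cost for anything else.  */
  if (rtx set = single_set (insn))
    return set_rtx_cost (set, speed);
  return pattern_cost (PATTERN (insn), speed);
}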