author | Richard Sandiford <richard.sandiford@arm.com> | 2025-03-06 11:06:25 +0000 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2025-03-08 03:57:53 +0800 |
commit | b191e8bdecf881d11c1544c441e38f4c18392a15 | |
tree | 6b900b69eaf8a616d1810511c37275ddff609334 /gcc/target.def | |
parent | cf65235e03d2eb1667624943eae8f7fc355bceaf | |
ira: Add new hooks for callee-save vs spills [PR117477]
Following on from the discussion in:
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html
this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and
replaces it with two hooks: one that controls the cost of using an
extra callee-saved register and one that controls the cost of allocating
a frame for the first spill.
(The patch does not attempt to address the shrink-wrapping part of
the thread above.)
On AArch64, this is enough to fix PR117477, as verified by the new tests.
The patch does not change the SPEC2017 scores significantly. (I saw a
slight improvement in fotonik3d and roms, but I'm not convinced that
the improvements are real.)
The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c,
which is a scan-dump correctness test that relies on not using
caller saves. The decision to use caller saves looks appropriate,
and saves an instruction, so I've just added -fno-caller-saves
to the test options.
The x86 parts were written by Honza. ix86_callee_save_cost is updated
by H.J. to replace gcc_checking_assert with returning 1 if mem_cost <= 2.
gcc/
PR rtl-optimization/117477
* config/aarch64/aarch64.cc (aarch64_count_saves): New function.
(aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost)
(aarch64_frame_allocation_cost): Likewise.
(TARGET_CALLEE_SAVE_COST): Define.
(TARGET_FRAME_ALLOCATION_COST): Likewise.
* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
Replace with...
(ix86_callee_save_cost): ...this new hook.
(TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST): Define.
* target.h (spill_cost_type, frame_cost_type): New enums.
* target.def (callee_save_cost, frame_allocation_cost): New hooks.
(ira_callee_saved_register_cost_scale): Delete.
* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks.
* doc/tm.texi: Regenerate.
* hard-reg-set.h (hard_reg_set_popcount): New function.
* ira-color.cc (allocated_memory_p): New variable.
(allocated_callee_save_regs): Likewise.
(record_allocation): New function.
(assign_hard_reg): Use targetm.frame_allocation_cost to model
the cost of the first spill or first caller save. Use
targetm.callee_save_cost to model the cost of using new callee-saved
registers. Apply the exit rather than entry frequency to the cost
of restoring a register or deallocating the frame. Update the
new variables above.
(improve_allocation): Use record_allocation.
(color): Initialize allocated_callee_save_regs.
(ira_color): Initialize allocated_memory_p.
* targhooks.h (default_callee_save_cost): Declare.
(default_frame_allocation_cost): Likewise.
* targhooks.cc (default_callee_save_cost): New function.
(default_frame_allocation_cost): Likewise.
gcc/testsuite/
PR rtl-optimization/117477
* gcc.target/aarch64/callee_save_1.c: New test.
* gcc.target/aarch64/callee_save_2.c: Likewise.
* gcc.target/aarch64/callee_save_3.c: Likewise.
* gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves.
Co-authored-by: Jan Hubicka <hubicka@ucw.cz>
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
Diffstat (limited to 'gcc/target.def')
-rw-r--r-- | gcc/target.def | 87 |
1 files changed, 75 insertions, 12 deletions
diff --git a/gcc/target.def b/gcc/target.def
index c348b15..6c7cdc8 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -3776,6 +3776,81 @@ are the same as to this target hook.",
  default_memory_move_cost)
 
 DEFHOOK
+(callee_save_cost,
+ "Return the one-off cost of saving or restoring callee-saved registers\n\
+(also known as call-preserved registers or non-volatile registers).\n\
+The parameters are as follows:\n\
+\n\
+@itemize\n\
+@item\n\
+@var{cost_type} is @samp{spill_cost_type::SAVE} for saving a register\n\
+and @samp{spill_cost_type::RESTORE} for restoring a register.\n\
+\n\
+@item\n\
+@var{hard_regno} and @var{mode} represent the whole register that\n\
+the register allocator is considering using; of these,\n\
+@var{nregs} registers are fully or partially callee-saved.\n\
+\n\
+@item\n\
+@var{mem_cost} is the normal cost for storing (for saves)\n\
+or loading (for restores) the @var{nregs} registers.\n\
+\n\
+@item\n\
+@var{allocated_callee_regs} is the set of callee-saved registers\n\
+that are already in use.\n\
+\n\
+@item\n\
+@var{existing_spills_p} is true if the register allocator has\n\
+already decided to spill registers to memory.\n\
+@end itemize\n\
+\n\
+If @var{existing_spills_p} is false, the cost of a save should account\n\
+for frame allocations in a way that is consistent with\n\
+@code{TARGET_FRAME_ALLOCATION_COST}'s handling of allocations for spills.\n\
+Similarly, the cost of a restore should then account for frame deallocations\n\
+in a way that is consistent with @code{TARGET_FRAME_ALLOCATION_COST}'s\n\
+handling of deallocations.\n\
+\n\
+Note that this hook should not attempt to apply a frequency scale\n\
+to the cost: it is the caller's responsibility to do that where\n\
+appropriate.\n\
+\n\
+The default implementation returns @var{mem_cost}, plus the allocation\n\
+or deallocation cost returned by @code{TARGET_FRAME_ALLOCATION_COST},\n\
+where appropriate.",
+ int, (spill_cost_type cost_type, unsigned int hard_regno,
+       machine_mode mode, unsigned int nregs, int mem_cost,
+       const HARD_REG_SET &allocated_callee_regs, bool existing_spills_p),
+ default_callee_save_cost)
+
+DEFHOOK
+(frame_allocation_cost,
+ "Return the cost of allocating or deallocating a frame for the sake of\n\
+a spill; @var{cost_type} chooses between allocation and deallocation.\n\
+The term ``spill'' here includes both forcing a pseudo register to memory\n\
+and using caller-saved registers for pseudo registers that are live across\n\
+a call.\n\
+\n\
+This hook is only called if the register allocator has not so far\n\
+decided to spill.  The allocator may have decided to use callee-saved\n\
+registers; if so, @var{allocated_callee_regs} is the set of callee-saved\n\
+registers that the allocator has used.  There might also be other reasons\n\
+why a stack frame is already needed; for example, @samp{get_frame_size ()}\n\
+might be nonzero, or the target might already require a frame for\n\
+target-specific reasons.\n\
+\n\
+When the register allocator uses this hook to cost spills, it also uses\n\
+@code{TARGET_CALLEE_SAVE_COST} to cost new callee-saved registers, passing\n\
+@samp{false} as the @var{existing_spills_p} argument.  The intention is to\n\
+allow the target to apply an apples-for-apples comparison between the\n\
+cost of using callee-saved registers and using spills in cases where the\n\
+allocator has not yet committed to using both strategies.\n\
+\n\
+The default implementation returns 0.",
+ int, (frame_cost_type cost_type, const HARD_REG_SET &allocated_callee_regs),
+ default_frame_allocation_cost)
+
+DEFHOOK
 (use_by_pieces_infrastructure_p,
  "GCC will attempt several strategies when asked to copy between\n\
 two areas of memory, or to set, clear or store to memory, for example\n\
@@ -5714,18 +5789,6 @@ DEFHOOK
  reg_class_t, (int, reg_class_t, reg_class_t),
  default_ira_change_pseudo_allocno_class)
 
-/* Scale of callee-saved register cost in epilogue and prologue used by
-   IRA.  */
-DEFHOOK
-(ira_callee_saved_register_cost_scale,
- "A target hook which returns the callee-saved register @var{hard_regno}\n\
-cost scale in epilogue and prologue used by IRA.\n\
-\n\
-The default version of this target hook returns 1 if optimizing for\n\
-size, otherwise returns the entry block frequency.",
- int, (int hard_regno),
- default_ira_callee_saved_register_cost_scale)
-
 /* Return true if we use LRA instead of reload.  */
 DEFHOOK
 (lra_p,