author Jeff Law <jlaw@ventanamicro.com> 2025-03-09 14:25:37 -0600
committer Jeff Law <jlaw@ventanamicro.com> 2025-03-09 14:25:37 -0600
commit 7d3aec2a832ef47be547d9426187562e4548bae6
tree 89d4194a2f8fa8b14b5e085086fdb27fd626d6f5 /gcc
parent 4ed07a11ee2845c2085a3cd5cff043209a452441
[rtl-optimization/117467] Mark FP destinations as dead
The next step in improving ext-dce is to clean up a minor wart in the set/clobber handling code.

In that code the safe thing to do is to not process a destination at all. That will leave bits set in the live bitmaps for objects that may no longer be live. Of course with extraneous bits set we use more memory and do more work managing the bitmaps, but it's safe from a code correctness standpoint.

One case that is slipping through that we need to fix is scalar fp destinations. Essentially the code never tried to handle those and as a result would leave those entities live and bubble them up through the CFG.

In the testcase at hand this takes us from ~10k live objects at entry to ~4k live objects at entry. Time spent in ext-dce goes from 2.14s to .64s.

Bootstrapped and regression tested on x86_64.

	PR rtl-optimization/117467
gcc/
	* ext-dce.cc (ext_dce_process_sets): Handle FP destinations better.
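The effect described above can be illustrated with a toy model. This is a minimal sketch, not GCC code: `ToyMode`, `ToyInsn`, `handled_before`, `handled_after`, and `live_at_entry` are hypothetical names, liveness is tracked per register rather than per bit group, and the 64-bit host width is an assumption. It only shows why skipping a destination is safe but leaves it live, and why accepting scalar FP destinations shrinks the live set at entry.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-ins for GCC's mode classes; this is NOT the real machine_mode API.
enum class ModeClass { ScalarInt, ScalarFloat, Vector };

struct ToyMode {
  ModeClass cls;
  unsigned bits;
};

// One toy instruction: writes register `dest` (in `mode`) and reads `uses`.
struct ToyInsn {
  int dest;
  ToyMode mode;
  std::vector<int> uses;
};

constexpr unsigned kHostBitsPerWideInt = 64;  // assumed host width

// Models the old check: only scalar integer destinations are processed.
bool handled_before(const ToyMode &m) {
  return m.cls == ModeClass::ScalarInt && m.bits <= kHostBitsPerWideInt;
}

// Models the new check: any scalar destination (integer or FP) is processed.
bool handled_after(const ToyMode &m) {
  return m.cls != ModeClass::Vector && m.bits <= kHostBitsPerWideInt;
}

// Backward liveness scan over a straight-line block. A destination that is
// not handled is simply left in the live set, which is safe but imprecise.
std::size_t live_at_entry(const std::vector<ToyInsn> &insns,
                          bool (*handled)(const ToyMode &)) {
  std::set<int> live;
  for (auto it = insns.rbegin(); it != insns.rend(); ++it) {
    if (handled(it->mode))
      live.erase(it->dest);                        // the set kills the destination
    live.insert(it->uses.begin(), it->uses.end()); // sources become live
  }
  return live.size();
}

// Two FP instructions followed by two integer instructions.
std::vector<ToyInsn> example_block() {
  return {
      {1, {ModeClass::ScalarFloat, 64}, {}},   // r1 = FP constant
      {2, {ModeClass::ScalarFloat, 64}, {1}},  // r2 = f(r1)
      {3, {ModeClass::ScalarInt, 32}, {}},     // r3 = integer constant
      {4, {ModeClass::ScalarInt, 32}, {3}},    // r4 = g(r3)
  };
}

// With the old predicate r1 is never killed and stays live at entry;
// with the new predicate nothing is live at entry.
void check_example() {
  assert(live_at_entry(example_block(), handled_before) == 1);
  assert(live_at_entry(example_block(), handled_after) == 0);
}
```

Calling `check_example()` from a test driver exercises both predicates. In the toy block, the only register that stays live under the old predicate is the FP one, which mirrors the "bubble them up through the CFG" behaviour the message describes.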
Diffstat (limited to 'gcc')
-rw-r--r-- gcc/ext-dce.cc 8
1 files changed, 4 insertions, 4 deletions
diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index c53dd5b..35ddda0 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -206,8 +206,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
/* We don't support vector destinations or destinations
wider than DImode. */
- scalar_int_mode outer_mode;
- if (!is_a <scalar_int_mode> (GET_MODE (x), &outer_mode)
+ scalar_mode outer_mode;
+ if (!is_a <scalar_mode> (GET_MODE (x), &outer_mode)
|| GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
{
/* Skip the subrtxs of this destination. There is
@@ -239,7 +239,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
/* The inner mode might be larger, just punt for
that case. Remember, we can not just continue to process
the inner RTXs due to the STRICT_LOW_PART. */
- if (!is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
+ if (!is_a <scalar_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
|| GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
{
/* Skip the subrtxs of the STRICT_LOW_PART. We can't
@@ -293,7 +293,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
subreg and restart within the SET processing rather than
the top of the loop which just complicates the flow even
more. */
- if (!is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
+ if (!is_a <scalar_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
|| GET_MODE_BITSIZE (outer_mode) > HOST_BITS_PER_WIDE_INT)
{
skipped_dest = true;