resource.cc (mark_target_live_regs): Don't look past target insn, PR115182

The PR115182 regression is that a delay-slot for a conditional branch, is no longer filled with an insn that has been "sunk" because of r15-518-g99b1daae18c095, for cris-elf w. -O2 -march=v10. There are still sufficient "nearby" dependency-less insns that the delay-slot shouldn't be empty. In particular there's one candidate in the loop, right after an off-ramp branch, off the loop: a move from $r9 to $r3. beq .L2 nop move.d $r9,$r3 But, the resource.cc data-flow-analysis incorrectly says it collides with registers "live" at that .L2 off-ramp. The off-ramp insns (inlined from simple_rand) look like this (left-to-right direction): .L2: move.d $r12,[_seed.0] move.d $r13,[_seed.0+4] ret movem [$sp+],$r8 So, a store of a long long to _seed, a return instruction and a restoring multi-register-load of r0..r8 (all callee-saved registers) in the delay-slot of the return insn. The return-value is kept in $r10,$r11 so in total $r10..$r13 live plus the stack-pointer and return-address registers. But, mark_target_live_regs says that $r0..$r8 are also live because it *includes the registers live for the return instruction*! While they "come alive" after the movem, they certainly aren't live at the "off-ramp" .L2 label. The problem is in mark_target_live_regs: it consults a hash-table indexed by insn uid, where it tracks the currently live registers with a "generation" count to handle when it moves around insn, filling delay-slots. As a fall-back, it starts with registers live at the start of each basic block, calculated by the comparatively modern df machinery (except that it can fail finding out which basic block an insn belongs to, at which times it includes all registers film at 11), and tracks the semantics of insns up to each insn. You'd think that's all that should be done, but then for some reason it *also* looks at insns *after the target insn* up to a few branches, and includes that in the set of live registers! This is the code in mark_target_live_regs that starts with the call to find_dead_or_set_registers. I couldn't make sense of it, so I looked at its history, and I think I found the cause; it's a thinko or possibly two thinkos. The original implementation, gcc-git-described as r0-97-g9c7e297806a27f, later moved from reorg.c to resource.c in r0-20470-gca545bb569b756. I believe the "extra" lookup was intended to counter flaws in the reorg.c/resource.c register liveness analysis; to inspect insns along the execution paths to exclude registers that, when looking at subsequent insns, weren't live. That guess is backed by a sentence in the updated (i.e. deleted) part of the function head comment for mark_target_live_regs: "Next, scan forward from TARGET looking for things set or clobbered before they are used. These are not live." To me that sounds like flawed register-liveness data. An epilogue expanded as RTX (i.e. not just assembly code emitted as text) is introduced in basepoints/gcc-0-1334-gbdac5f5848fb, so before that time, nobody would notice that saved registers were included as live registers in delay-slots in "next-to-last" basic blocks. Then in r0-24783-g96e9c98d59cc40, the intersection ("and") was changed to a union ("or"), i.e. it added to the set of live registers instead of thinning it out. In the gcc-patches archives, I see the patch submission doesn't offer a C test-case and only has RTX snippets (apparently for SPARC). The message does admit that the change goes "against what the comments in the code say": https://gcc.gnu.org/pipermail/gcc-patches/1999-November/021836.html It looks like this was related to a bug with register liveness info messed up when moving a "delay-slotted" insn from one slot to another. But, I can't help but thinking it's just papering over a register liveness bug elsewhere. I think, with a reliable "DF_LR_IN", the whole thing *after* tracking from start-of-bb up to the target insn should be removed; thus. This patch also removes the now-unused find_dead_or_set_registers function. At r15-518, it fixes the issue for CRIS and improves coremark scores at -O2 -march=v10 a tiny bit (about 0.05%). PR rtl-optimization/115182 * resource.cc (mark_target_live_regs): Don't look for unconditional branches after the target to improve on the register liveness. (find_dead_or_set_registers): Remove unused function.
author: Hans-Peter Nilsson <hp@axis.com> 2024-05-28 23:15:57 +0200
committer: Hans-Peter Nilsson <hp@bitrange.com> 2024-05-28 23:15:57 +0200
commit: 84b4ed45ea81ed5c4fb656a17846b26071c23e7d (patch)
tree: 02ff85c86ca98c9c44ebeac305b0eb699432fe60 /gcc
parent: 91d79053f2b416cb9e97d9c0c3fb5b73075289e6 (diff)
download: gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.zip
gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.tar.gz
gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.tar.bz2
1 files changed, 0 insertions, 235 deletions
diff --git a/gcc/resource.cc b/gcc/resource.cc
index bb196fb..06fcfd3 100644
--- a/gcc/resource.cc
+++ b/gcc/resource.cc
@@ -77,9 +77,6 @@ static HARD_REG_SET pending_dead_regs;
 static void update_live_status (rtx, const_rtx, void *);
 static int find_basic_block (rtx_insn *, int);
 static rtx_insn *next_insn_no_annul (rtx_insn *);
-static rtx_insn *find_dead_or_set_registers (rtx_insn *, struct resources*,
-					     rtx *, int, struct resources,
-					     struct resources);
 
 /* Utility function called from mark_target_live_regs via note_stores.
    It deadens any CLOBBERed registers and livens any SET registers.  */
@@ -407,196 +404,6 @@ mark_referenced_resources (rtx x, struct resources *res,
       }
 }
 
-/* A subroutine of mark_target_live_regs.  Search forward from TARGET
-   looking for registers that are set before they are used.  These are dead.
-   Stop after passing a few conditional jumps, and/or a small
-   number of unconditional branches.  */
-
-static rtx_insn *
-find_dead_or_set_registers (rtx_insn *target, struct resources *res,
-			    rtx *jump_target, int jump_count,
-			    struct resources set, struct resources needed)
-{
-  HARD_REG_SET scratch;
-  rtx_insn *insn;
-  rtx_insn *next_insn;
-  rtx_insn *jump_insn = 0;
-  int i;
-
-  for (insn = target; insn; insn = next_insn)
-    {
-      rtx_insn *this_insn = insn;
-
-      next_insn = NEXT_INSN (insn);
-
-      /* If this instruction can throw an exception, then we don't
-	 know where we might end up next.  That means that we have to
-	 assume that whatever we have already marked as live really is
-	 live.  */
-      if (can_throw_internal (insn))
-	break;
-
-      switch (GET_CODE (insn))
-	{
-	case CODE_LABEL:
-	  /* After a label, any pending dead registers that weren't yet
-	     used can be made dead.  */
-	  pending_dead_regs &= ~needed.regs;
-	  res->regs &= ~pending_dead_regs;
-	  CLEAR_HARD_REG_SET (pending_dead_regs);
-
-	  continue;
-
-	case BARRIER:
-	case NOTE:
-	case DEBUG_INSN:
-	  continue;
-
-	case INSN:
-	  if (GET_CODE (PATTERN (insn)) == USE)
-	    {
-	      /* If INSN is a USE made by update_block, we care about the
-		 underlying insn.  Any registers set by the underlying insn
-		 are live since the insn is being done somewhere else.  */
-	      if (INSN_P (XEXP (PATTERN (insn), 0)))
-		mark_set_resources (XEXP (PATTERN (insn), 0), res, 0,
-				    MARK_SRC_DEST_CALL);
-
-	      /* All other USE insns are to be ignored.  */
-	      continue;
-	    }
-	  else if (GET_CODE (PATTERN (insn)) == CLOBBER)
-	    continue;
-	  else if (rtx_sequence *seq =
-		     dyn_cast <rtx_sequence *> (PATTERN (insn)))
-	    {
-	      /* An unconditional jump can be used to fill the delay slot
-		 of a call, so search for a JUMP_INSN in any position.  */
-	      for (i = 0; i < seq->len (); i++)
-		{
-		  this_insn = seq->insn (i);
-		  if (JUMP_P (this_insn))
-		    break;
-		}
-	    }
-
-	default:
-	  break;
-	}
-
-      if (rtx_jump_insn *this_jump_insn =
-	    dyn_cast <rtx_jump_insn *> (this_insn))
-	{
-	  if (jump_count++ < 10)
-	    {
-	      if (any_uncondjump_p (this_jump_insn)
-		  || ANY_RETURN_P (PATTERN (this_jump_insn)))
-		{
-		  rtx lab_or_return = this_jump_insn->jump_label ();
-		  if (ANY_RETURN_P (lab_or_return))
-		    next_insn = NULL;
-		  else
-		    next_insn = as_a <rtx_insn *> (lab_or_return);
-		  if (jump_insn == 0)
-		    {
-		      jump_insn = insn;
-		      if (jump_target)
-			*jump_target = JUMP_LABEL (this_jump_insn);
-		    }
-		}
-	      else if (any_condjump_p (this_jump_insn))
-		{
-		  struct resources target_set, target_res;
-		  struct resources fallthrough_res;
-
-		  /* We can handle conditional branches here by following
-		     both paths, and then IOR the results of the two paths
-		     together, which will give us registers that are dead
-		     on both paths.  Since this is expensive, we give it
-		     a much higher cost than unconditional branches.  The
-		     cost was chosen so that we will follow at most 1
-		     conditional branch.  */
-
-		  jump_count += 4;
-		  if (jump_count >= 10)
-		    break;
-
-		  mark_referenced_resources (insn, &needed, true);
-
-		  /* For an annulled branch, mark_set_resources ignores slots
-		     filled by instructions from the target.  This is correct
-		     if the branch is not taken.  Since we are following both
-		     paths from the branch, we must also compute correct info
-		     if the branch is taken.  We do this by inverting all of
-		     the INSN_FROM_TARGET_P bits, calling mark_set_resources,
-		     and then inverting the INSN_FROM_TARGET_P bits again.  */
-
-		  if (GET_CODE (PATTERN (insn)) == SEQUENCE
-		      && INSN_ANNULLED_BRANCH_P (this_jump_insn))
-		    {
-		      rtx_sequence *seq = as_a <rtx_sequence *> (PATTERN (insn));
-		      for (i = 1; i < seq->len (); i++)
-			INSN_FROM_TARGET_P (seq->element (i))
-			  = ! INSN_FROM_TARGET_P (seq->element (i));
-
-		      target_set = set;
-		      mark_set_resources (insn, &target_set, 0,
-					  MARK_SRC_DEST_CALL);
-
-		      for (i = 1; i < seq->len (); i++)
-			INSN_FROM_TARGET_P (seq->element (i))
-			  = ! INSN_FROM_TARGET_P (seq->element (i));
-
-		      mark_set_resources (insn, &set, 0, MARK_SRC_DEST_CALL);
-		    }
-		  else
-		    {
-		      mark_set_resources (insn, &set, 0, MARK_SRC_DEST_CALL);
-		      target_set = set;
-		    }
-
-		  target_res = *res;
-		  scratch = target_set.regs & ~needed.regs;
-		  target_res.regs &= ~scratch;
-
-		  fallthrough_res = *res;
-		  scratch = set.regs & ~needed.regs;
-		  fallthrough_res.regs &= ~scratch;
-
-		  if (!ANY_RETURN_P (this_jump_insn->jump_label ()))
-		    find_dead_or_set_registers
-			  (this_jump_insn->jump_target (),
-			   &target_res, 0, jump_count, target_set, needed);
-		  find_dead_or_set_registers (next_insn,
-					      &fallthrough_res, 0, jump_count,
-					      set, needed);
-		  fallthrough_res.regs |= target_res.regs;
-		  res->regs &= fallthrough_res.regs;
-		  break;
-		}
-	      else
-		break;
-	    }
-	  else
-	    {
-	      /* Don't try this optimization if we expired our jump count
-		 above, since that would mean there may be an infinite loop
-		 in the function being compiled.  */
-	      jump_insn = 0;
-	      break;
-	    }
-	}
-
-      mark_referenced_resources (insn, &needed, true);
-      mark_set_resources (insn, &set, 0, MARK_SRC_DEST_CALL);
-
-      scratch = set.regs & ~needed.regs;
-      res->regs &= ~scratch;
-    }
-
-  return jump_insn;
-}
-
 /* Given X, a part of an insn, and a pointer to a `struct resource',
    RES, indicate which resources are modified by the insn. If
    MARK_TYPE is MARK_SRC_DEST_CALL, also mark resources potentially
@@ -854,9 +661,6 @@ return_insn_p (const_rtx insn)
    If we cannot find the start of a basic block (should be a very rare
    case, if it can happen at all), mark everything as potentially live.
 
-   Next, scan forward from TARGET looking for things set or clobbered
-   before they are used.  These are not live.
-
    Because we can be called many times on the same target, save our results
    in a hash table indexed by INSN_UID.  This is only done if the function
    init_resource_info () was invoked before we are called.  */
@@ -868,9 +672,6 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
   unsigned int i;
   struct target_info *tinfo = NULL;
   rtx_insn *insn;
-  rtx jump_target;
-  HARD_REG_SET scratch;
-  struct resources set, needed;
 
   /* Handle end of function.  */
   if (target_maybe_return == 0 || ANY_RETURN_P (target_maybe_return))
@@ -1093,42 +894,6 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
        in use.  This should happen only extremely rarely.  */
     SET_HARD_REG_SET (res->regs);
 
-  CLEAR_RESOURCE (&set);
-  CLEAR_RESOURCE (&needed);
-
-  rtx_insn *jump_insn = find_dead_or_set_registers (target, res, &jump_target,
-						    0, set, needed);
-
-  /* If we hit an unconditional branch, we have another way of finding out
-     what is live: we can see what is live at the branch target and include
-     anything used but not set before the branch.  We add the live
-     resources found using the test below to those found until now.  */
-
-  if (jump_insn)
-    {
-      struct resources new_resources;
-      rtx_insn *stop_insn = next_active_insn (jump_insn);
-
-      if (!ANY_RETURN_P (jump_target))
-	jump_target = next_active_insn (as_a<rtx_insn *> (jump_target));
-      mark_target_live_regs (insns, jump_target, &new_resources);
-      CLEAR_RESOURCE (&set);
-      CLEAR_RESOURCE (&needed);
-
-      /* Include JUMP_INSN in the needed registers.  */
-      for (insn = target; insn != stop_insn; insn = next_active_insn (insn))
-	{
-	  mark_referenced_resources (insn, &needed, true);
-
-	  scratch = needed.regs & ~set.regs;
-	  new_resources.regs |= scratch;
-
-	  mark_set_resources (insn, &set, 0, MARK_SRC_DEST_CALL);
-	}
-
-      res->regs |= new_resources.regs;
-    }
-
   if (tinfo != NULL)
     tinfo->live_regs = res->regs;
 }
author	Hans-Peter Nilsson <hp@axis.com>	2024-05-28 23:15:57 +0200
committer	Hans-Peter Nilsson <hp@bitrange.com>	2024-05-28 23:15:57 +0200
commit	84b4ed45ea81ed5c4fb656a17846b26071c23e7d (patch)
tree	02ff85c86ca98c9c44ebeac305b0eb699432fe60 /gcc
parent	91d79053f2b416cb9e97d9c0c3fb5b73075289e6 (diff)
download	gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.zip gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.tar.gz gcc-84b4ed45ea81ed5c4fb656a17846b26071c23e7d.tar.bz2