author     Aldy Hernandez <aldyh@redhat.com>    2021-09-21 10:27:53 +0200
committer  Aldy Hernandez <aldyh@redhat.com>    2021-09-27 17:39:51 +0200
commit     0288527f47cec6698b31ccb3210816415506009e (patch)
tree       f0e5b4465af7816e61befbf1b6fe5b3138c4ab4e /gcc/tree-ssa-threadedge.c
parent     dd11aab6463880c35d942c4a4fd346fdaeeb8e72 (diff)
Replace VRP threader with a hybrid forward threader.
This patch implements the new hybrid forward threader and replaces the
embedded VRP threader with it.
With all the pieces that have gone in, the implementation of the hybrid
threader is straightforward: convert the current state into
SSA imports that the solver will understand, and let the path solver
precompute ranges and relations for the path. After this setup is done,
we can use the range_query API to solve gimple statements in the threader.
The forward threader is now engine-agnostic, so there are no changes to
the threader per se.
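
As a hypothetical illustration (foo, bar and baz are made-up names, not
taken from the patch or its testcases), this is the kind of conditional
the path solver can now resolve: along the path through the first
branch, x is known to be greater than 10, so the second test is a
foregone conclusion and that path can be threaded straight to its true
arm.

  extern int bar (void);
  extern int baz (void);

  int
  foo (int x)
  {
    int ret = 0;
    if (x > 10)	/* On this path the solver knows x is in [11, +INF]...  */
      ret = bar ();
    if (x > 5)	/* ...so here the condition is known to be true and the
		   path through the first branch can be threaded.  */
      ret += baz ();
    return ret;
  }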
I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP,
because they will also be used in the evrp removal of the DOM/threader,
which is my next task.
Most of the patch is actually test changes.  I have gone through every
single one and verified that we're correct.  Most were trivial dump
file name changes, but others required going through the IL and
certifying that the different IL was expected.
For example, in pr59597.c we have one fewer thread because the
ASSERT_EXPR was getting in the way and making it seem like things were
not crossing loops.  The hybrid threader sees the correct representation
of the IL and avoids threading this one case.
The final numbers are a 12.16% improvement in jump threads immediately
after VRP, and a 0.82% improvement in overall jump threads. The
performance drop is 0.6% (plus the 1.43% hit from moving the embedded
threader into its own pass). As I've said, I'd prefer to keep the
threader in its own pass, but if this is an issue, we can address this
with a shared ranger when VRP is replaced with an evrp instance
(upcoming).
Note that these numbers are slightly different from what I originally
posted.  A few correctness tweaks, plus restricting loop threads, made
the difference.  That being said, I was aiming for par; a 12% gain is
just gravy ;-).  When we merge the threaders, we should see even better
numbers, and we'll have the benefit of an entire release stress-testing
the solver.
As I mentioned in my introductory note, paths ending in a MEM_REF
conditional are not handled.  In practice this made no difference, as
such paths are rare.  However, as a follow-up, I will distill a test and
add a suitable PR to keep us honest.
There is a one-line change to libgomp/team.c silencing a new "used
uninitialized" warning.  As my previous work with the threaders has
shown, warnings flare up after each improvement to jump threading. I
expect this to be no different. I've promised Jakub to investigate
fully, so I will analyze and add the appropriate PR for the warning
experts.
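
The team.c hunk itself is outside the diffstat shown here.  As a generic
sketch with hypothetical names (this is not the actual libgomp code),
the pattern is simply to give the variable an initializer so the extra
paths exposed by better threading no longer contain a read of an
uninitialized value:

  struct start_data_t
  {
    void *fn;
    void *arg;
  };

  extern void launch (void *fn, void *arg);

  void
  spawn_team (int nested, void *fn, void *arg)
  {
    /* The "= { 0 }" initializer is the style of fix used: without it,
       the path where NESTED is false reads FN and ARG uninitialized,
       which is what a used-uninitialized warning flags.  */
    struct start_data_t start_data = { 0 };

    if (nested)
      {
	start_data.fn = fn;
	start_data.arg = arg;
      }
    launch (start_data.fn, start_data.arg);
  }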
Oh yeah, the new pass dump is called vrp-threader[12] to match each
VRP[12] pass.  However, there's no reason for it either to be named
vrp-threader or to live in tree-vrp.c.
Tested on x86-64 Linux.
OK?
p.s. "Did I say 5 weeks? My bad, I meant 5 months."
gcc/ChangeLog:
* passes.def (pass_vrp_threader): New.
* tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader.
* tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New.
(hybrid_jt_simplifier::hybrid_jt_simplifier): New.
(hybrid_jt_simplifier::simplify): New.
(hybrid_jt_simplifier::compute_ranges_from_state): New.
* tree-ssa-threadedge.h (class hybrid_jt_state): New.
(class hybrid_jt_simplifier): New.
* tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump
threader.
(class hybrid_threader): New.
(hybrid_threader::hybrid_threader): New.
(hybrid_threader::~hybrid_threader): New.
(hybrid_threader::before_dom_children): New.
(hybrid_threader::after_dom_children): New.
(execute_vrp_threader): New.
(class pass_vrp_threader): New.
(make_pass_vrp_threader): New.
libgomp/ChangeLog:
* team.c: Initialize start_data.
* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
* testsuite/libgomp.graphite/force-parallel-8.c: Adjust.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr55107.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
* gcc.dg/tree-ssa/pr21559.c: Adjust.
* gcc.dg/tree-ssa/pr59597.c: Adjust.
* gcc.dg/tree-ssa/pr61839_1.c: Adjust.
* gcc.dg/tree-ssa/pr61839_3.c: Adjust.
* gcc.dg/tree-ssa/pr71437.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust.
* gcc.dg/tree-ssa/ssa-thread-14.c: Adjust.
* gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust.
* gcc.dg/tree-ssa/vrp106.c: Adjust.
* gcc.dg/tree-ssa/vrp55.c: Adjust.
Diffstat (limited to 'gcc/tree-ssa-threadedge.c')
-rw-r--r--  gcc/tree-ssa-threadedge.c  71
1 files changed, 71 insertions, 0 deletions
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index ae77e5e..29ed60a 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "vr-values.h"
 #include "gimple-ssa-evrp-analyze.h"
 #include "gimple-range.h"
+#include "gimple-range-path.h"
 
 /* To avoid code explosion due to jump threading, we limit the
    number of statements we are going to copy.  This variable
@@ -1397,3 +1398,73 @@ jt_state::register_equivs_stmt (gimple *stmt, basic_block bb,
     register_equiv (gimple_get_lhs (stmt), cached_lhs,
		     /*update_range=*/false);
 }
+
+// Hybrid threader implementation.
+
+
+void
+hybrid_jt_state::register_equivs_stmt (gimple *, basic_block, jt_simplifier *)
+{
+  // Ranger has no need to simplify anything to improve equivalences.
+}
+
+hybrid_jt_simplifier::hybrid_jt_simplifier (gimple_ranger *r,
+					    path_range_query *q)
+{
+  m_ranger = r;
+  m_query = q;
+}
+
+tree
+hybrid_jt_simplifier::simplify (gimple *stmt, gimple *, basic_block,
+				jt_state *state)
+{
+  int_range_max r;
+
+  compute_ranges_from_state (stmt, state);
+
+  if (gimple_code (stmt) == GIMPLE_COND
+      || gimple_code (stmt) == GIMPLE_ASSIGN)
+    {
+      tree ret;
+      if (m_query->range_of_stmt (r, stmt) && r.singleton_p (&ret))
+	return ret;
+    }
+  else if (gimple_code (stmt) == GIMPLE_SWITCH)
+    {
+      gswitch *switch_stmt = dyn_cast <gswitch *> (stmt);
+      tree index = gimple_switch_index (switch_stmt);
+      if (m_query->range_of_expr (r, index, stmt))
+	return find_case_label_range (switch_stmt, &r);
+    }
+  return NULL;
+}
+
+// Use STATE to generate the list of imports needed for the solver,
+// and calculate the ranges along the path.
+
+void
+hybrid_jt_simplifier::compute_ranges_from_state (gimple *stmt, jt_state *state)
+{
+  auto_bitmap imports;
+  gori_compute &gori = m_ranger->gori ();
+
+  state->get_path (m_path);
+
+  // Start with the imports to the final conditional.
+  bitmap_copy (imports, gori.imports (m_path[0]));
+
+  // Add any other interesting operands we may have missed.
+  if (gimple_bb (stmt) != m_path[0])
+    {
+      for (unsigned i = 0; i < gimple_num_ops (stmt); ++i)
+	{
+	  tree op = gimple_op (stmt, i);
+	  if (op
+	      && TREE_CODE (op) == SSA_NAME
+	      && irange::supports_type_p (TREE_TYPE (op)))
+	    bitmap_set_bit (imports, SSA_NAME_VERSION (op));
+	}
+    }
+  m_query->precompute_ranges (m_path, imports);
+}
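
For context, here is a rough sketch of how a pass might wire the new
classes together.  Only hybrid_jt_state and hybrid_jt_simplifier come
from the hunk above; the header list, the path_range_query constructor
arguments and the driver shape are assumptions based on the ChangeLog
(the tree-vrp.c hunk is not part of this diffstat), so treat it as an
outline rather than the actual execute_vrp_threader code.

  // Sketch only, not the code added to tree-vrp.c by this commit.
  // Assumes the usual GCC system headers plus:
  //   #include "gimple-range.h"
  //   #include "gimple-range-path.h"
  //   #include "tree-ssa-threadedge.h"

  void
  sketch_vrp_threader (function *fun)
  {
    gimple_ranger ranger;		   // Global ranges and GORI imports.
    path_range_query query (ranger);	   // Path solver; ctor arguments assumed.
    hybrid_jt_state state;		   // No equivalence bookkeeping needed.
    hybrid_jt_simplifier simplifier (&ranger, &query);

    // The real pass does a dominator walk, handing SIMPLIFIER and STATE
    // to the forward threader so that each block's outgoing conditional
    // is evaluated over the candidate paths via simplify () above.
    basic_block bb;
    FOR_EACH_BB_FN (bb, fun)
      {
	// jump_threader::thread_outgoing_edges (bb) would run here;
	// exact driver calls omitted, see tree-vrp.c in this commit.
      }
  }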