author: Kewen Lin <linkw@gcc.gnu.org> 2020-11-03 02:51:47 +0000
committer: Kewen Lin <linkw@linux.ibm.com> 2020-11-02 20:55:48 -0600
commit: f5e18dd9c7dacc9671044fc669bd5c1b26b6bdba
tree: cb8840d85a0cdd84f2b5648a7ab20dfc1efb02a4 /gcc/timevar.def
parent: bd6ecbe48ada79bb14cbb30ef8318495b5237790
pass: Run cleanup passes before SLP [PR96789]

As discussed in PR96789, some scalar stmts can be eliminated by
passes that run after SLP, but we still model their costs when
trying to SLP, which can affect the vectorizer's decision.  One
typical case is the one in PR96789 on Power.

As Richard suggested there, this patch introduces a pass called
pre_slp_scalar_cleanup which runs some secondary cleanup passes,
for now FRE and DSE.  It also introduces a new group of TODO flags
called pending TODO flags.  Unlike normal TODO flags, pending TODO
flags are passed down the pipeline until one of their consumers
can perform the requested action.  Consumers should then clear the
flags for the actions that they have taken.

Some compilation time statistics on all SPEC2017 INT benchmarks
were collected on one Power9 machine for the option sets below:

  A1: -Ofast -funroll-loops
  A2: -O1
  A3: -O1 -funroll-loops
  A4: -O2
  A5: -O2 -funroll-loops

The corresponding increase in compilation time is trivial:

  A1       A2       A3        A4        A5
  0.08%    0.00%    -0.38%    -0.10%    -0.05%

Bootstrapped/regtested on powerpc64le-linux-gnu P8.

gcc/ChangeLog:

	PR tree-optimization/96789
	* function.h (struct function): New member unsigned pending_TODOs.
	* passes.c (class pass_pre_slp_scalar_cleanup): New class.
	(make_pass_pre_slp_scalar_cleanup): New function.
	(pass_data_pre_slp_scalar_cleanup): New pass data.
	* passes.def: (pass_pre_slp_scalar_cleanup): New pass, add
	pass_fre and pass_dse as its children.
	* timevar.def (TV_SCALAR_CLEANUP): New timevar.
	* tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New
	pending TODO flag.
	(make_pass_pre_slp_scalar_cleanup): New declare.
	* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1):
	Once any outermost loop gets unrolled, flag cfun pending_TODOs
	PENDING_TODO_force_next_scalar_cleanup on.

gcc/testsuite/ChangeLog:

	PR tree-optimization/96789
	* gcc.dg/tree-ssa/ssa-dse-28.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dse-29.c: Likewise.
	* gcc.dg/vect/bb-slp-41.c: Likewise.
	* gcc.dg/tree-ssa/pr96789.c: New test.
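
The pending TODO flag handshake described above can be sketched roughly as follows. This is only an illustration of the produce/consume/clear pattern, not the code from the patch: `function_sketch`, `note_outermost_loop_unrolled`, and `consume_scalar_cleanup_request` are hypothetical names, and the bit value chosen for `PENDING_TODO_force_next_scalar_cleanup` is an assumption (the real definition is in tree-pass.h, which this diff does not show).

```cpp
// Assumed bit value, for illustration only.
constexpr unsigned PENDING_TODO_force_next_scalar_cleanup = 1u << 0;

// Hypothetical stand-in for GCC's struct function, reduced to the new member.
struct function_sketch {
  unsigned pending_TODOs = 0;
};

// Producer side: once an outermost loop has been completely unrolled,
// request a scalar cleanup from some later pass.
void note_outermost_loop_unrolled(function_sketch &fn) {
  fn.pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup;
}

// Consumer side: returns true when the cleanup sub-passes should run.
// The consumer clears the flag for the action it is about to take, so
// the request is not handed to any later consumer.
bool consume_scalar_cleanup_request(function_sketch &fn) {
  if (!(fn.pending_TODOs & PENDING_TODO_force_next_scalar_cleanup))
    return false;
  fn.pending_TODOs &= ~PENDING_TODO_force_next_scalar_cleanup;
  return true;
}
```

In this sketch the flag stays set across any number of intermediate passes and only disappears when a consumer acts on it, which is the difference from normal TODO flags that the message points out.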
Diffstat (limited to 'gcc/timevar.def')
-rw-r--r-- gcc/timevar.def 1
1 file changed, 1 insertion, 0 deletions
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 08c21c0..a303179 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -194,6 +194,7 @@ DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH , "tree loop unswitching")
 DEFTIMEVAR (TV_LOOP_SPLIT , "loop splitting")
 DEFTIMEVAR (TV_LOOP_JAM , "unroll and jam")
 DEFTIMEVAR (TV_COMPLETE_UNROLL , "complete unrolling")
+DEFTIMEVAR (TV_SCALAR_CLEANUP , "scalar cleanup")
 DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
 DEFTIMEVAR (TV_TREE_VECTORIZATION , "tree vectorization")
 DEFTIMEVAR (TV_TREE_SLP_VECTORIZATION, "tree slp vectorization")