From d8ea47033a726c9b3455ead98b6ddce2403ec6d9 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Sat, 21 Nov 2020 18:26:21 +0100 Subject: Jump threader refactor. This is an overall refactor of the jump threader, both for the low level bits in tree-ssa-threadupdate.* and the high level bits in tree-ssa-threadedge.*. There should be no functional changes. Some of the benefits of the refactor are: a) Eliminates some icky global state (for example the x_vr_values hack). b) Provides some semblance of an API for the threader. c) Makes it clearer to see what parts are from the high level threader, and what parts belong in the low level path registry and BB threading mechanism. d) Avoids passing a ton of variables around. e) Provides for easier sharing with the backward threader. f) Merged simplify stmt code in VRP and DOM as they were nearly identical. This has been bootstrapped and regression tested on x86-64 Linux. Jeff had also been testing this path as part of his Fedora throughout the off-season. gcc/ChangeLog: * tree-ssa-dom.c (class dom_jump_threader_simplifier): New. (class dom_opt_dom_walker): Initialize some class variables. (pass_dominator::execute): Pass evrp_range_analyzer and dom_jump_threader_simplifier to dom_opt_dom_walker. Adjust for some functions moving into classes. (simplify_stmt_for_jump_threading): Adjust and move to... (jump_threader_simplifier::simplify): ...here. (dom_opt_dom_walker::before_dom_children): Adjust for m_evrp_range_analyzer. (dom_opt_dom_walker::after_dom_children): Remove x_vr_values hack. (test_for_singularity): Place in dom_opt_dom_walker class. (dom_opt_dom_walker::optimize_stmt): The argument evrp_range_analyzer is now a class field. * tree-ssa-threadbackward.c (class thread_jumps): Add m_registry. (thread_jumps::thread_through_all_blocks): New. (thread_jumps::convert_and_register_current_path): Use m_registry. (pass_thread_jumps::execute): Adjust for thread_through_all_blocks being in the threader class. 
(pass_early_thread_jumps::execute): Same. * tree-ssa-threadedge.c (threadedge_initialize_values): Move... (jump_threader::jump_threader): ...here. (threadedge_finalize_values): Move... (jump_threader::~jump_threader): ...here. (jump_threader::remove_jump_threads_including): New. (jump_threader::thread_through_all_blocks): New. (record_temporary_equivalences_from_phis): Move... (jump_threader::record_temporary_equivalences_from_phis): ...here. (record_temporary_equivalences_from_stmts_at_dest): Move... (jump_threader::record_temporary_equivalences_from_stmts_at_dest): ...here. (simplify_control_stmt_condition_1): Move to jump_threader class. (simplify_control_stmt_condition): Move... (jump_threader::simplify_control_stmt_condition): ...here. (thread_around_empty_blocks): Move... (jump_threader::thread_around_empty_blocks): ...here. (thread_through_normal_block): Move... (jump_threader::thread_through_normal_block): ...here. (thread_across_edge): Move... (jump_threader::thread_across_edge): ...here. (thread_outgoing_edges): Move... (jump_threader::thread_outgoing_edges): ...here. * tree-ssa-threadedge.h: Move externally facing functions... (class jump_threader): ...here... (class jump_threader_simplifier): ...and here. * tree-ssa-threadupdate.c (struct redirection_data): Remove comment. (jump_thread_path_allocator::jump_thread_path_allocator): New. (jump_thread_path_allocator::~jump_thread_path_allocator): New. (jump_thread_path_allocator::allocate_thread_edge): New. (jump_thread_path_allocator::allocate_thread_path): New. (jump_thread_path_registry::jump_thread_path_registry): New. (jump_thread_path_registry::~jump_thread_path_registry): New. (jump_thread_path_registry::allocate_thread_edge): New. (jump_thread_path_registry::allocate_thread_path): New. (dump_jump_thread_path): Make extern. (debug (const vec &path)): New. (struct removed_edges): Move to tree-ssa-threadupdate.h. (struct thread_stats_d): Remove. (remove_ctrl_stmt_and_useless_edges): Make static. 
(lookup_redirection_data): Move... (jump_thread_path_registry::lookup_redirection_data): ...here. (ssa_redirect_edges): Make static. (thread_block_1): Move... (jump_thread_path_registry::thread_block_1): ...here. (thread_block): Move... (jump_thread_path_registry::thread_block): ...here. (thread_through_loop_header): Move... (jump_thread_path_registry::thread_through_loop_header): ...here. (mark_threaded_blocks): Move... (jump_thread_path_registry::mark_threaded_blocks): ...here. (debug_path): Move... (jump_thread_path_registry::debug_path): ...here. (debug_all_paths): Move... (jump_thread_path_registry::dump): ...here. (rewire_first_differing_edge): Move... (jump_thread_path_registry::rewire_first_differing_edge): ...here. (adjust_paths_after_duplication): Move... (jump_thread_path_registry::adjust_paths_after_duplication): ...here. (duplicate_thread_path): Move... (jump_thread_path_registry::duplicate_thread_path): ...here. (remove_jump_threads_including): Move... (jump_thread_path_registry::remove_jump_threads_including): ...here. (thread_through_all_blocks): Move to... (jump_thread_path_registry::thread_through_all_blocks): ...here. (delete_jump_thread_path): Remove. (register_jump_thread): Move... (jump_thread_path_registry::register_jump_thread): ...here. * tree-ssa-threadupdate.h: Move externally facing functions... (class jump_thread_path_allocator): ...here... (class jump_thread_path_registry): ...and here. (thread_through_all_blocks): Remove. (struct removed_edges): New. (register_jump_thread): Remove. (remove_jump_threads_including): Remove. (delete_jump_thread_path): Remove. (remove_ctrl_stmt_and_useless_edges): Remove. (free_dom_edge_info): New prototype. * tree-vrp.c: Remove x_vr_values hack. (class vrp_jump_threader_simplifier): New. (vrp_jump_threader_simplifier::simplify): New. (vrp_jump_threader::vrp_jump_threader): Adjust method signature. Remove m_dummy_cond. Instantiate m_simplifier and m_threader. 
(vrp_jump_threader::thread_through_all_blocks): New. (vrp_jump_threader::simplify_stmt): Remove. (vrp_jump_threader::after_dom_children): Do not set m_dummy_cond. Remove x_vr_values hack. (execute_vrp): Adjust for thread_through_all_blocks being in a class. --- gcc/tree-ssa-threadbackward.c | 31 +++++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 515a94b..428cf07 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -39,9 +39,11 @@ along with GCC; see the file COPYING3. If not see class thread_jumps { - public: +public: void find_jump_threads_backwards (basic_block bb, bool speed_p); - private: + bool thread_through_all_blocks (); + +private: edge profitable_jump_thread_path (basic_block bbi, tree name, tree arg, bool *creates_irreducible_loop); void convert_and_register_current_path (edge taken_edge); @@ -65,8 +67,16 @@ class thread_jumps /* Indicate that we could increase code size to improve the code path. */ bool m_speed_p; + + jump_thread_path_registry m_registry; }; +bool +thread_jumps::thread_through_all_blocks () +{ + return m_registry.thread_through_all_blocks (true); +} + /* Simple helper to get the last statement from BB, which is assumed to be a control statement. Return NULL if the last statement is not a control statement. */ @@ -459,7 +469,7 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, void thread_jumps::convert_and_register_current_path (edge taken_edge) { - vec *jump_thread_path = new vec (); + vec *path = m_registry.allocate_thread_path (); /* Record the edges between the blocks in PATH. 
*/ for (unsigned int j = 0; j + 1 < m_path.length (); j++) @@ -469,16 +479,17 @@ thread_jumps::convert_and_register_current_path (edge taken_edge) edge e = find_edge (bb1, bb2); gcc_assert (e); - jump_thread_edge *x = new jump_thread_edge (e, EDGE_FSM_THREAD); - jump_thread_path->safe_push (x); + jump_thread_edge *x + = m_registry.allocate_thread_edge (e, EDGE_FSM_THREAD); + path->safe_push (x); } /* Add the edge taken when the control variable has value ARG. */ jump_thread_edge *x - = new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK); - jump_thread_path->safe_push (x); + = m_registry.allocate_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK); + path->safe_push (x); - register_jump_thread (jump_thread_path); + m_registry.register_jump_thread (path); --m_max_threaded_paths; } @@ -827,7 +838,7 @@ pass_thread_jumps::execute (function *fun) if (EDGE_COUNT (bb->succs) > 1) threader.find_jump_threads_backwards (bb, true); } - bool changed = thread_through_all_blocks (true); + bool changed = threader.thread_through_all_blocks (); loop_optimizer_finalize (); return changed ? TODO_cleanup_cfg : 0; @@ -888,7 +899,7 @@ pass_early_thread_jumps::execute (function *fun) if (EDGE_COUNT (bb->succs) > 1) threader.find_jump_threads_backwards (bb, false); } - thread_through_all_blocks (true); + threader.thread_through_all_blocks (); loop_optimizer_finalize (); return 0; -- cgit v1.1 From 69e5544210e3c0e27df3e6590b646c13dcce24e3 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Wed, 28 Apr 2021 11:15:11 +0200 Subject: Refactor backward threader registry and profitability code into classes. This refactors the registry and the profitability code from the backwards threader into two separate classes. It cleans up the code, and makes it easier for alternate implementations to share code. gcc/ChangeLog: * tree-ssa-threadbackward.c (class thread_jumps): Split out code from here... (class back_threader_registry): ...to here... (class back_threader_profitability): ...and here... 
(thread_jumps::thread_through_all_blocks): Remove argument. (back_threader_registry::back_threader_registry): New. (back_threader_registry::~back_threader_registry): New. (back_threader_registry::thread_through_all_blocks): New. (thread_jumps::profitable_jump_thread_path): Move from here... (back_threader_profitability::profitable_path_p): ...to here. (thread_jumps::find_taken_edge): New. (thread_jumps::convert_and_register_current_path): Move... (back_threader_registry::register_path): ...to here. (thread_jumps::register_jump_thread_path_if_profitable): Move... (thread_jumps::maybe_register_path): ...to here. (thread_jumps::handle_phi): Call find_taken_edge and maybe_register_path. (thread_jumps::handle_assignment): Same. (thread_jumps::fsm_find_control_statement_thread_paths): Remove tree argument to handle_phi and handle_assignment. (thread_jumps::find_jump_threads_backwards): Set m_name. Remove set of m_speed_p and m_max_threaded_paths. (pass_thread_jumps::execute): Remove second argument from find_jump_threads_backwards. (pass_early_thread_jumps::execute): Same. --- gcc/tree-ssa-threadbackward.c | 367 ++++++++++++++++++++++++------------------ 1 file changed, 213 insertions(+), 154 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 428cf07..7dd8594 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -37,44 +37,79 @@ along with GCC; see the file COPYING3. If not see #include "tree-inline.h" #include "tree-vectorizer.h" +// Path registry for the backwards threader. After all paths have been +// registered with register_path(), thread_through_all_blocks() is called +// to modify the CFG. 
+ +class back_threader_registry +{ +public: + back_threader_registry (int max_allowable_paths); + ~back_threader_registry (); + bool register_path (const vec<basic_block> &, edge taken); + bool thread_through_all_blocks (); + +private: + vec<vec<basic_block>> m_all_paths; + jump_thread_path_registry m_lowlevel_registry; + const int m_max_allowable_paths; + int m_threaded_paths; +}; + +// Class to abstract the profitability code for the backwards threader. + +class back_threader_profitability +{ +public: + back_threader_profitability (bool speed_p) + : m_speed_p (speed_p) + { } + bool profitable_path_p (const vec<basic_block> &, tree name, edge taken, + bool *irreducible_loop = NULL); + +private: + const bool m_speed_p; +}; + class thread_jumps { public: - void find_jump_threads_backwards (basic_block bb, bool speed_p); + thread_jumps (bool speed_p = true) + : m_profit (speed_p), m_registry (param_max_fsm_thread_paths) + { } + void find_jump_threads_backwards (basic_block bb); bool thread_through_all_blocks (); private: - edge profitable_jump_thread_path (basic_block bbi, tree name, tree arg, - bool *creates_irreducible_loop); - void convert_and_register_current_path (edge taken_edge); - void register_jump_thread_path_if_profitable (tree name, tree arg, - basic_block def_bb); - void handle_assignment (gimple *stmt, tree name, basic_block def_bb); - void handle_phi (gphi *phi, tree name, basic_block def_bb); + void maybe_register_path (const vec<basic_block> &m_path, + tree name, + edge taken_edge); + edge find_taken_edge (const vec<basic_block> &path, tree arg); + void handle_assignment (gimple *stmt, basic_block def_bb); + void handle_phi (gphi *phi, basic_block def_bb); void fsm_find_control_statement_thread_paths (tree name); bool check_subpath_and_update_thread_path (basic_block last_bb, basic_block new_bb, int *next_path_length); - /* Maximum number of BBs we are allowed to thread. */ - int m_max_threaded_paths; /* Hash to keep track of seen bbs. */ hash_set<basic_block> m_visited_bbs; /* Current path we're analyzing. 
*/ auto_vec<basic_block> m_path; /* Tracks if we have recursed through a loop PHI node. */ bool m_seen_loop_phi; - /* Indicate that we could increase code size to improve the - code path. */ - bool m_speed_p; - jump_thread_path_registry m_registry; + tree m_name; + back_threader_profitability m_profit; + back_threader_registry m_registry; }; +// Perform the actual jump threading for all the queued paths. + bool thread_jumps::thread_through_all_blocks () { - return m_registry.thread_through_all_blocks (true); + return m_registry.thread_through_all_blocks (); } /* Simple helper to get the last statement from BB, which is assumed @@ -133,62 +168,65 @@ fsm_find_thread_path (basic_block start_bb, basic_block end_bb, return false; } -/* Examine jump threading path PATH to which we want to add BBI. +back_threader_registry::back_threader_registry (int max_allowable_paths) + : m_max_allowable_paths (max_allowable_paths) +{ + m_all_paths.create (5); + m_threaded_paths = 0; +} - If the resulting path is profitable to thread, then return the - final taken edge from the path, NULL otherwise. +back_threader_registry::~back_threader_registry () +{ + m_all_paths.release (); +} + +bool +back_threader_registry::thread_through_all_blocks () +{ + return m_lowlevel_registry.thread_through_all_blocks (true); +} + +/* Examine jump threading path PATH and return TRUE if it is profitable to + thread it, otherwise return FALSE. NAME is the SSA_NAME of the variable we found to have a constant - value on PATH. ARG is the constant value of NAME on that path. + value on PATH. If unknown, NAME is NULL. - BBI will be appended to PATH when we have a profitable jump - threading path. Callers are responsible for removing BBI from PATH - in that case. */ + If the taken edge out of the path is known ahead of time it is passed in + TAKEN_EDGE, otherwise it is NULL. 
-edge -thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, - tree arg, - bool *creates_irreducible_loop) + CREATES_IRREDUCIBLE_LOOP, if non-null is set to TRUE if threading this path + would create an irreducible loop. */ + +bool +back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path, + tree name, + edge taken_edge, + bool *creates_irreducible_loop) { - /* Note BBI is not in the path yet, hence the +1 in the test below - to make sure BBI is accounted for in the path length test. */ + gcc_checking_assert (!m_path.is_empty ()); - /* We can get a length of 0 here when the statement that - makes a conditional generate a compile-time constant - result is in the same block as the conditional. + /* We can have an empty path here (excluding the DEF block) when the + statement that makes a conditional generate a compile-time + constant result is in the same block as the conditional. That's not really a jump threading opportunity, but instead is simple cprop & simplification. We could handle it here if we wanted by wiring up all the incoming edges. If we run this early in IPA, that might be worth doing. For now we just reject that case. */ - if (m_path.is_empty ()) - return NULL; + if (m_path.length () <= 1) + return false; - if (m_path.length () + 1 - > (unsigned) param_max_fsm_thread_length) + if (m_path.length () > (unsigned) param_max_fsm_thread_length) { if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "FSM jump-thread path not considered: " + fprintf (dump_file, " FAIL: FSM jump-thread path not considered: " "the number of basic blocks on the path " "exceeds PARAM_MAX_FSM_THREAD_LENGTH.\n"); - return NULL; + return false; } - if (m_max_threaded_paths <= 0) - { - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "FSM jump-thread path not considered: " - "the number of previously recorded FSM paths to " - "thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n"); - return NULL; - } - - /* Add BBI to the path. 
- From this point onward, if we decide we the path is not profitable - to thread, we must remove BBI from the path. */ - m_path.safe_push (bbi); - int n_insns = 0; gimple_stmt_iterator gsi; loop_p loop = m_path[0]->loop_father; @@ -256,6 +294,8 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, SSA_NAMEs, then we do not have enough information to consider them associated. */ if (dst != name + && name + && TREE_CODE (name) == SSA_NAME && (SSA_NAME_VAR (dst) != SSA_NAME_VAR (name) || !SSA_NAME_VAR (dst)) && !virtual_operand_p (dst)) @@ -276,10 +316,7 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, gimple *stmt = gsi_stmt (gsi); if (gimple_call_internal_p (stmt, IFN_UNIQUE) || gimple_call_builtin_p (stmt, BUILT_IN_CONSTANT_P)) - { - m_path.pop (); - return NULL; - } + return false; /* Do not count empty statements and labels. */ if (gimple_code (stmt) != GIMPLE_NOP && !(gimple_code (stmt) == GIMPLE_ASSIGN @@ -330,75 +367,52 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, " Overall: %i insns\n", stmt_insns, n_insns); - /* We have found a constant value for ARG. For GIMPLE_SWITCH - and GIMPLE_GOTO, we use it as-is. However, for a GIMPLE_COND - we need to substitute, fold and simplify so we can determine - the edge taken out of the last block. */ - if (gimple_code (stmt) == GIMPLE_COND) + if (creates_irreducible_loop) { - enum tree_code cond_code = gimple_cond_code (stmt); - - /* We know the underyling format of the condition. */ - arg = fold_binary (cond_code, boolean_type_node, - arg, gimple_cond_rhs (stmt)); - } - - /* If this path threaded through the loop latch back into the - same loop and the destination does not dominate the loop - latch, then this thread would create an irreducible loop. - - We have to know the outgoing edge to figure this out. */ - edge taken_edge = find_taken_edge (m_path[0], arg); - - /* There are cases where we may not be able to extract the - taken edge. 
For example, a computed goto to an absolute - address. Handle those cases gracefully. */ - if (taken_edge == NULL) - { - m_path.pop (); - return NULL; + /* If this path threaded through the loop latch back into the + same loop and the destination does not dominate the loop + latch, then this thread would create an irreducible loop. */ + *creates_irreducible_loop = false; + if (taken_edge + && threaded_through_latch + && loop == taken_edge->dest->loop_father + && (determine_bb_domination_status (loop, taken_edge->dest) + == DOMST_NONDOMINATING)) + *creates_irreducible_loop = true; } - *creates_irreducible_loop = false; - if (threaded_through_latch - && loop == taken_edge->dest->loop_father - && (determine_bb_domination_status (loop, taken_edge->dest) - == DOMST_NONDOMINATING)) - *creates_irreducible_loop = true; - if (path_crosses_loops) { if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "FSM jump-thread path not considered: " + fprintf (dump_file, " FAIL: FSM jump-thread path not considered: " "the path crosses loops.\n"); - m_path.pop (); - return NULL; + return false; } /* Threading is profitable if the path duplicated is hot but also in a case we separate cold path from hot path and permit optimization of the hot path later. Be on the agressive side here. In some testcases, as in PR 78407 this leads to noticeable improvements. 
*/ - if (m_speed_p && (optimize_edge_for_speed_p (taken_edge) || contains_hot_bb)) + if (m_speed_p + && ((taken_edge && optimize_edge_for_speed_p (taken_edge)) + || contains_hot_bb)) { if (n_insns >= param_max_fsm_thread_path_insns) { if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "FSM jump-thread path not considered: " + fprintf (dump_file, " FAIL: FSM jump-thread path not considered: " "the number of instructions on the path " "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n"); - m_path.pop (); - return NULL; + return false; } } - else if (n_insns > 1) + else if (!m_speed_p && n_insns > 1) { if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "FSM jump-thread path not considered: " + fprintf (dump_file, " FAIL: FSM jump-thread path not considered: " "duplication of %i insns is needed and optimizing for size.\n", n_insns); - m_path.pop (); - return NULL; + return false; } /* We avoid creating irreducible inner loops unless we thread through @@ -410,7 +424,9 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, the path -- in that case there's little the traditional loop optimizer would have done anyway, so an irreducible loop is not so bad. */ - if (!threaded_multiway_branch && *creates_irreducible_loop + if (!threaded_multiway_branch + && creates_irreducible_loop + && *creates_irreducible_loop && (n_insns * (unsigned) param_fsm_scale_path_stmts > (m_path.length () * (unsigned) param_fsm_scale_path_blocks))) @@ -418,13 +434,11 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, - "FSM would create irreducible loop without threading " + " FAIL: FSM would create irreducible loop without threading " "multiway branch.\n"); - m_path.pop (); - return NULL; + return false; } - /* If this path does not thread through the loop latch, then we are using the FSM threader to find old style jump threads. 
This is good, except the FSM threader does not re-use an existing @@ -438,10 +452,9 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, - "FSM did not thread around loop and would copy too " + " FAIL: FSM did not thread around loop and would copy too " "many statements.\n"); - m_path.pop (); - return NULL; + return false; } /* When there is a multi-way branch on the path, then threading can @@ -452,24 +465,69 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name, { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, - "FSM Thread through multiway branch without threading " + " FAIL: FSM Thread through multiway branch without threading " "a multiway branch.\n"); - m_path.pop (); - return NULL; + return false; } - return taken_edge; + return true; +} + +/* Return the taken edge out of a path, assuming that the underlying assignment + or PHI SSA resolves to ARG. */ + +edge +thread_jumps::find_taken_edge (const vec<basic_block> &path, tree arg) +{ + if (TREE_CODE_CLASS (TREE_CODE (arg)) != tcc_constant) + return NULL; + + gcc_checking_assert (!path.is_empty ()); + gimple *stmt = get_gimple_control_stmt (m_path[0]); + + /* We have found a constant value for ARG. For GIMPLE_SWITCH + and GIMPLE_GOTO, we use it as-is. However, for a GIMPLE_COND + we need to substitute, fold and simplify so we can determine + the edge taken out of the last block. */ + if (gimple_code (stmt) == GIMPLE_COND) + { + enum tree_code cond_code = gimple_cond_code (stmt); + + /* We know the underlying format of the condition. */ + arg = fold_binary (cond_code, boolean_type_node, + arg, gimple_cond_rhs (stmt)); + } + + /* If this path threaded through the loop latch back into the + same loop and the destination does not dominate the loop + latch, then this thread would create an irreducible loop. + + We have to know the outgoing edge to figure this out. 
*/ + return ::find_taken_edge (m_path[0], arg); } /* The current path PATH is a vector of blocks forming a jump threading path in reverse order. TAKEN_EDGE is the edge taken from path[0]. Convert the current path into the form used by register_jump_thread and - register it. */ + register it. -void -thread_jumps::convert_and_register_current_path (edge taken_edge) + Return TRUE if successful or FALSE otherwise. */ + +bool +back_threader_registry::register_path (const vec &m_path, + edge taken_edge) { - vec *path = m_registry.allocate_thread_path (); + if (m_threaded_paths > m_max_allowable_paths) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " FAIL: FSM jump-thread path not considered: " + "the number of previously recorded FSM paths to " + "thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n"); + return false; + } + + vec *jump_thread_path + = m_lowlevel_registry.allocate_thread_path (); /* Record the edges between the blocks in PATH. */ for (unsigned int j = 0; j + 1 < m_path.length (); j++) @@ -480,17 +538,19 @@ thread_jumps::convert_and_register_current_path (edge taken_edge) edge e = find_edge (bb1, bb2); gcc_assert (e); jump_thread_edge *x - = m_registry.allocate_thread_edge (e, EDGE_FSM_THREAD); - path->safe_push (x); + = m_lowlevel_registry.allocate_thread_edge (e, EDGE_FSM_THREAD); + jump_thread_path->safe_push (x); } /* Add the edge taken when the control variable has value ARG. 
*/ jump_thread_edge *x - = m_registry.allocate_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK); - path->safe_push (x); + = m_lowlevel_registry.allocate_thread_edge (taken_edge, + EDGE_NO_COPY_SRC_BLOCK); + jump_thread_path->safe_push (x); - m_registry.register_jump_thread (path); - --m_max_threaded_paths; + m_lowlevel_registry.register_jump_thread (jump_thread_path); + ++m_threaded_paths; + return true; } /* While following a chain of SSA_NAME definitions, we jumped from a @@ -558,19 +618,17 @@ thread_jumps::check_subpath_and_update_thread_path (basic_block last_bb, DEF_BB is the basic block that ultimately defines the constant. */ void -thread_jumps::register_jump_thread_path_if_profitable (tree name, tree arg, - basic_block def_bb) +thread_jumps::maybe_register_path (const vec &m_path, + tree name, + edge taken_edge) { - if (TREE_CODE_CLASS (TREE_CODE (arg)) != tcc_constant) - return; - bool irreducible = false; - edge taken_edge = profitable_jump_thread_path (def_bb, name, arg, - &irreducible); - if (taken_edge) + bool profitable = m_profit.profitable_path_p (m_path, name, taken_edge, + &irreducible); + if (profitable) { - convert_and_register_current_path (taken_edge); - m_path.pop (); + if (!m_registry.register_path (m_path, taken_edge)) + return; if (irreducible) vect_free_loop_info_assumptions (m_path[0]->loop_father); @@ -585,7 +643,7 @@ thread_jumps::register_jump_thread_path_if_profitable (tree name, tree arg, NAME having a constant value. */ void -thread_jumps::handle_phi (gphi *phi, tree name, basic_block def_bb) +thread_jumps::handle_phi (gphi *phi, basic_block def_bb) { /* Iterate over the arguments of PHI. 
*/ for (unsigned int i = 0; i < gimple_phi_num_args (phi); i++) @@ -608,7 +666,11 @@ thread_jumps::handle_phi (gphi *phi, tree name, basic_block def_bb) continue; } - register_jump_thread_path_if_profitable (name, arg, bbi); + m_path.safe_push (bbi); + edge taken_edge = find_taken_edge (m_path, arg); + if (taken_edge) + maybe_register_path (m_path, m_name, taken_edge); + m_path.pop (); } } @@ -650,25 +712,23 @@ handle_assignment_p (gimple *stmt) NAME having a constant value. */ void -thread_jumps::handle_assignment (gimple *stmt, tree name, basic_block def_bb) +thread_jumps::handle_assignment (gimple *stmt, basic_block def_bb) { tree arg = gimple_assign_rhs1 (stmt); if (TREE_CODE (arg) == SSA_NAME) fsm_find_control_statement_thread_paths (arg); - else { - /* register_jump_thread_path_if_profitable will push the current - block onto the path. But the path will always have the current - block at this point. So we can just pop it. */ - m_path.pop (); - - register_jump_thread_path_if_profitable (name, arg, def_bb); - - /* And put the current block back onto the path so that the - state of the stack is unchanged when we leave. */ - m_path.safe_push (def_bb); + if (CHECKING_P) + { + gcc_assert (!m_path.is_empty ()); + basic_block top = m_path[m_path.length () - 1]; + gcc_assert (top == def_bb); + } + edge taken_edge = find_taken_edge (m_path, arg); + if (taken_edge) + maybe_register_path (m_path, m_name, taken_edge); } } @@ -738,9 +798,9 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name) gcc_assert (m_path.last () == def_bb); if (gimple_code (def_stmt) == GIMPLE_PHI) - handle_phi (as_a (def_stmt), name, def_bb); + handle_phi (as_a (def_stmt), def_bb); else if (gimple_code (def_stmt) == GIMPLE_ASSIGN) - handle_assignment (def_stmt, name, def_bb); + handle_assignment (def_stmt, def_bb); /* Remove all the nodes that we added from NEXT_PATH. 
*/ if (next_path_length) @@ -756,8 +816,8 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name) code path. */ void -thread_jumps::find_jump_threads_backwards (basic_block bb, bool speed_p) -{ +thread_jumps::find_jump_threads_backwards (basic_block bb) +{ gimple *stmt = get_gimple_control_stmt (bb); if (!stmt) return; @@ -785,8 +845,7 @@ thread_jumps::find_jump_threads_backwards (basic_block bb, bool speed_p) m_path.safe_push (bb); m_visited_bbs.empty (); m_seen_loop_phi = false; - m_speed_p = speed_p; - m_max_threaded_paths = param_max_fsm_thread_paths; + m_name = name; fsm_find_control_statement_thread_paths (name); } @@ -836,7 +895,7 @@ pass_thread_jumps::execute (function *fun) FOR_EACH_BB_FN (bb, fun) { if (EDGE_COUNT (bb->succs) > 1) - threader.find_jump_threads_backwards (bb, true); + threader.find_jump_threads_backwards (bb); } bool changed = threader.thread_through_all_blocks (); @@ -892,12 +951,12 @@ pass_early_thread_jumps::execute (function *fun) loop_optimizer_init (AVOID_CFG_MODIFICATIONS); /* Try to thread each block with more than one successor. */ - thread_jumps threader; + thread_jumps threader (/*speed_p=*/false); basic_block bb; FOR_EACH_BB_FN (bb, fun) { if (EDGE_COUNT (bb->succs) > 1) - threader.find_jump_threads_backwards (bb, false); + threader.find_jump_threads_backwards (bb); } threader.thread_through_all_blocks (); -- cgit v1.1 From 2e96b5f14e4025691b57d2301d71aa6092ed44bc Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Tue, 15 Jun 2021 12:32:51 +0200 Subject: Backwards jump threader rewrite with ranger. This is a rewrite of the backwards threader with a ranger based solver. The code is divided into two parts: the path solver in gimple-range-path.*, and the path discovery bits in tree-ssa-threadbackward.c. The legacy code is still available with --param=threader-mode=legacy, but will be removed shortly after. gcc/ChangeLog: * Makefile.in (tree-ssa-loop-im.o-warn): New. * flag-types.h (enum threader_mode): New. 
* params.opt: Add entry for --param=threader-mode. * tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New. (class back_threader): New. (back_threader::back_threader): New. (back_threader::~back_threader): New. (back_threader::maybe_register_path): New. (back_threader::find_taken_edge): New. (back_threader::find_taken_edge_switch): New. (back_threader::find_taken_edge_cond): New. (back_threader::resolve_def): New. (back_threader::resolve_phi): New. (back_threader::find_paths_to_names): New. (back_threader::find_paths): New. (dump_path): New. (debug): New. (thread_jumps::find_jump_threads_backwards): Call ranger threader. (thread_jumps::find_jump_threads_backwards_with_ranger): New. (pass_thread_jumps::execute): Abstract out code... (try_thread_blocks): ...here. * tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges): Abstract out threading candidate code to... (single_succ_to_potentially_threadable_block): ...here. * tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block): New. * tree-ssa-threadupdate.c (register_jump_thread): Return boolean. * tree-ssa-threadupdate.h (class jump_thread_path_registry): Return bool from register_jump_thread. libgomp/ChangeLog: * testsuite/libgomp.graphite/force-parallel-4.c: Adjust for threader. * testsuite/libgomp.graphite/force-parallel-8.c: Same. gcc/testsuite/ChangeLog: * g++.dg/debug/dwarf2/deallocator.C: Adjust for threader. * gcc.c-torture/compile/pr83510.c: Same. * dg.dg/analyzer/pr94851-2.c: Same. * gcc.dg/loop-unswitch-2.c: Same. * gcc.dg/old-style-asm-1.c: Same. * gcc.dg/pr68317.c: Same. * gcc.dg/pr97567-2.c: Same. * gcc.dg/predict-9.c: Same. * gcc.dg/shrink-wrap-loop.c: Same. * gcc.dg/sibcall-1.c: Same. * gcc.dg/tree-ssa/builtin-sprintf-3.c: Same. * gcc.dg/tree-ssa/pr21001.c: Same. * gcc.dg/tree-ssa/pr21294.c: Same. * gcc.dg/tree-ssa/pr21417.c: Same. * gcc.dg/tree-ssa/pr21458-2.c: Same. * gcc.dg/tree-ssa/pr21563.c: Same. * gcc.dg/tree-ssa/pr49039.c: Same. * gcc.dg/tree-ssa/pr61839_1.c: Same. 
* gcc.dg/tree-ssa/pr61839_3.c: Same. * gcc.dg/tree-ssa/pr77445-2.c: Same. * gcc.dg/tree-ssa/split-path-4.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. * gcc.dg/tree-ssa/ssa-fre-48.c: Same. * gcc.dg/tree-ssa/ssa-thread-11.c: Same. * gcc.dg/tree-ssa/ssa-thread-12.c: Same. * gcc.dg/tree-ssa/ssa-thread-14.c: Same. * gcc.dg/tree-ssa/vrp02.c: Same. * gcc.dg/tree-ssa/vrp03.c: Same. * gcc.dg/tree-ssa/vrp05.c: Same. * gcc.dg/tree-ssa/vrp06.c: Same. * gcc.dg/tree-ssa/vrp07.c: Same. * gcc.dg/tree-ssa/vrp09.c: Same. * gcc.dg/tree-ssa/vrp19.c: Same. * gcc.dg/tree-ssa/vrp20.c: Same. * gcc.dg/tree-ssa/vrp33.c: Same. * gcc.dg/uninit-pred-9_b.c: Same. * gcc.dg/uninit-pr61112.c: Same. * gcc.dg/vect/bb-slp-16.c: Same. * gcc.target/i386/avx2-vect-aggressive.c: Same. * gcc.dg/tree-ssa/ranger-threader-1.c: New test. * gcc.dg/tree-ssa/ranger-threader-2.c: New test. * gcc.dg/tree-ssa/ranger-threader-3.c: New test. * gcc.dg/tree-ssa/ranger-threader-4.c: New test. * gcc.dg/tree-ssa/ranger-threader-5.c: New test. --- gcc/tree-ssa-threadbackward.c | 476 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 468 insertions(+), 8 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 7dd8594..2c0e975 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -36,6 +36,12 @@ along with GCC; see the file COPYING3. If not see #include "tree-phinodes.h" #include "tree-inline.h" #include "tree-vectorizer.h" +#include "value-range.h" +#include "gimple-range.h" +#include "tree-ssa-threadedge.h" +#include "gimple-range-path.h" +#include "ssa.h" +#include "tree-cfgcleanup.h" // Path registry for the backwards threader. 
After all paths have been // registered with register_path(), thread_through_all_blocks() is called @@ -71,13 +77,415 @@ private: const bool m_speed_p; }; +// Ranger based backwards threader. + +class back_threader +{ + // Temporary until we remove old code. + friend bool path_is_unreachable_p (const vec &); + +public: + back_threader (back_threader_profitability &, back_threader_registry &); + ~back_threader (); + void find_paths (basic_block bb, tree name); + +private: + void maybe_register_path (edge taken_edge); + bool find_paths_to_names (basic_block bb, bitmap imports); + bool resolve_def (tree name, bitmap interesting, vec worklist); + bool resolve_phi (gphi *phi, bitmap imports); + edge find_taken_edge (const vec &path); + edge find_taken_edge_cond (const vec &path, gcond *); + edge find_taken_edge_switch (const vec &path, gswitch *); + + back_threader_registry &m_registry; + back_threader_profitability &m_profit; + gimple_ranger m_ranger; + path_range_query m_solver; + + // Current path being analyzed. + auto_vec m_path; + // Hash to mark visited BBs while analyzing a path. + hash_set m_visited_bbs; + // The set of SSA names, any of which could potentially change the + // value of the final conditional in a path. + bitmap m_imports; + // The last statement in the path. + gimple *m_last_stmt; + // This is a bit of a wart. It's used to pass the LHS SSA name to + // the profitability engine. + tree m_name; + // Marker to differentiate unreachable edges. + static const edge UNREACHABLE_EDGE; +}; + +// Used to differentiate unreachable edges, so we may stop the search +// in a the given direction. 
+const edge back_threader::UNREACHABLE_EDGE = (edge) -1; + +back_threader::back_threader (back_threader_profitability &profit, + back_threader_registry ®istry) + : m_registry (registry), + m_profit (profit), + m_solver (m_ranger) +{ + m_last_stmt = NULL; + m_imports = BITMAP_ALLOC (NULL); +} + +back_threader::~back_threader () +{ + m_path.release (); + BITMAP_FREE (m_imports); +} + +// Register the current path for jump threading if it's profitable to +// do so. TAKEN_EDGE is the known edge out of the path. + +void +back_threader::maybe_register_path (edge taken_edge) +{ + bool irreducible = false; + bool profitable + = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible); + + if (profitable) + { + m_registry.register_path (m_path, taken_edge); + + if (irreducible) + vect_free_loop_info_assumptions (m_path[0]->loop_father); + } +} + +// Return the known taken edge out of a path. If the path can be +// determined to be unreachable, return UNREACHABLE_EDGE. If no +// outgoing edge can be calculated, return NULL. + +edge +back_threader::find_taken_edge (const vec &path) +{ + gcc_checking_assert (path.length () > 1); + switch (gimple_code (m_last_stmt)) + { + case GIMPLE_COND: + return find_taken_edge_cond (path, as_a (m_last_stmt)); + + case GIMPLE_SWITCH: + return find_taken_edge_switch (path, as_a (m_last_stmt)); + + default: + return NULL; + } +} + +// Same as find_taken_edge, but for paths ending in a switch. + +edge +back_threader::find_taken_edge_switch (const vec &path, + gswitch *sw) +{ + tree name = gimple_switch_index (sw); + int_range_max r; + + m_solver.precompute_ranges (path, m_imports); + m_solver.range_of_expr (r, name, sw); + + if (r.undefined_p ()) + return UNREACHABLE_EDGE; + + if (r.varying_p ()) + return NULL; + + tree val; + if (r.singleton_p (&val)) + return ::find_taken_edge (gimple_bb (sw), val); + + return NULL; +} + +// Same as find_taken_edge, but for paths ending in a GIMPLE_COND. 
+ +edge +back_threader::find_taken_edge_cond (const vec &path, + gcond *cond) +{ + m_solver.precompute_ranges (path, m_imports); + + // Check if either operand is unreachable since this knowledge could + // help the caller cut down the search space. + int_range_max r; + m_solver.range_of_expr (r, gimple_cond_lhs (cond)); + if (r.undefined_p ()) + return UNREACHABLE_EDGE; + m_solver.range_of_expr (r, gimple_cond_rhs (cond)); + if (r.undefined_p ()) + return UNREACHABLE_EDGE; + + m_solver.range_of_stmt (r, cond); + + int_range<2> true_range (boolean_true_node, boolean_true_node); + int_range<2> false_range (boolean_false_node, boolean_false_node); + + if (r == true_range || r == false_range) + { + edge e_true, e_false; + basic_block bb = gimple_bb (cond); + extract_true_false_edges_from_block (bb, &e_true, &e_false); + return r == true_range ? e_true : e_false; + } + return NULL; +} + +// Populate a vector of trees from a bitmap. + +static inline void +populate_worklist (vec worklist, bitmap bits) +{ + bitmap_iterator bi; + unsigned i; + + EXECUTE_IF_SET_IN_BITMAP (bits, 0, i, bi) + { + tree name = ssa_name (i); + worklist.quick_push (name); + } +} + +// If taking any of the incoming edges to a PHI causes the final +// conditional of the current path to be constant, register the +// path(s), and return TRUE. + +bool +back_threader::resolve_phi (gphi *phi, bitmap interesting) +{ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi))) + return true; + + bool done = false; + for (size_t i = 0; i < gimple_phi_num_args (phi); ++i) + { + edge e = gimple_phi_arg_edge (phi, i); + + // This is like path_crosses_loops in profitable_path_p but more + // restrictive, since profitable_path_p allows threading the + // first block because it would be redirected anyhow. + // + // If we loosened the restriction and used profitable_path_p() + // here instead, we would peel off the first iterations of loops + // in places like tree-ssa/pr14341.c. 
+ bool profitable_p = m_path[0]->loop_father == e->src->loop_father; + if (!profitable_p) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " FAIL: path through PHI in bb%d (incoming bb:%d) crosses loop\n", + e->dest->index, e->src->index); + continue; + } + + tree arg = gimple_phi_arg_def (phi, i); + if (TREE_CODE (arg) == SSA_NAME) + { + unsigned v = SSA_NAME_VERSION (arg); + + // Avoid loops as in: x_5 = PHI . + if (bitmap_bit_p (interesting, v)) + continue; + + bitmap_set_bit (interesting, v); + bitmap_set_bit (m_imports, v); + done |= find_paths_to_names (e->src, interesting); + bitmap_clear_bit (interesting, v); + } + else if (TREE_CODE (arg) == INTEGER_CST) + { + m_path.safe_push (e->src); + edge taken_edge = find_taken_edge (m_path); + if (taken_edge && taken_edge != UNREACHABLE_EDGE) + { + maybe_register_path (taken_edge); + done = true; + } + m_path.pop (); + } + } + return done; +} + +// If the definition of NAME causes the final conditional of the +// current path to be constant, register the path, and return TRUE. + +bool +back_threader::resolve_def (tree name, bitmap interesting, vec worklist) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (name); + + // Handle PHIs. + if (is_a (def_stmt) + && resolve_phi (as_a (def_stmt), interesting)) + return true; + + // Defer copies of SSAs by adding the source to the worklist. + if (gimple_assign_single_p (def_stmt) + && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME) + { + tree rhs = gimple_assign_rhs1 (def_stmt); + bitmap_set_bit (m_imports, SSA_NAME_VERSION (rhs)); + bitmap_set_bit (interesting, SSA_NAME_VERSION (rhs)); + worklist.safe_push (rhs); + } + return false; +} + +// Find jump threading paths to any of the SSA names in the +// INTERESTING bitmap, and register any such paths. +// +// Return TRUE if no further processing past this block is necessary. +// This is because we've either registered a path, or because there is +// nothing of interesting beyond this block. 
+// +// BB is the current path being processed. + +bool +back_threader::find_paths_to_names (basic_block bb, bitmap interesting) +{ + if (m_visited_bbs.add (bb)) + return true; + + m_path.safe_push (bb); + + if (m_path.length () > 1 + && !m_profit.profitable_path_p (m_path, m_name, NULL)) + { + m_path.pop (); + m_visited_bbs.remove (bb); + return false; + } + + auto_bitmap processed; + unsigned i; + bool done = false; + + // We use a worklist instead of iterating through the bitmap, + // because we may add new items in-flight. + auto_vec worklist (bitmap_count_bits (interesting)); + populate_worklist (worklist, interesting); + while (!worklist.is_empty ()) + { + tree name = worklist.pop (); + unsigned i = SSA_NAME_VERSION (name); + basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name)); + + // Process any names defined in this block. + if (def_bb == bb) + { + bitmap_set_bit (processed, i); + + if (resolve_def (name, interesting, worklist)) + { + done = true; + goto leave_bb; + } + } + // Examine blocks that define or export an interesting SSA, + // since they may compute a range which resolve this path. + if ((def_bb == bb + || bitmap_bit_p (m_ranger.gori ().exports (bb), i)) + && m_path.length () > 1) + { + edge taken_edge = find_taken_edge (m_path); + if (taken_edge) + { + if (taken_edge != UNREACHABLE_EDGE) + maybe_register_path (taken_edge); + + done = true; + goto leave_bb; + } + } + } + + // If there are interesting names not yet processed, keep looking. 
+ bitmap_and_compl_into (interesting, processed); + if (!bitmap_empty_p (interesting)) + { + edge_iterator iter; + edge e; + FOR_EACH_EDGE (e, iter, bb->preds) + if ((e->flags & EDGE_ABNORMAL) == 0) + done |= find_paths_to_names (e->src, interesting); + } + + leave_bb: + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (processed, 0, i, bi) + bitmap_set_bit (interesting, i); + + m_path.pop (); + m_visited_bbs.remove (bb); + return done; +} + +// Search backwards from BB looking for paths where the final +// conditional out of BB can be determined. NAME is the LHS of the +// final conditional. Register such paths for jump threading. + +void +back_threader::find_paths (basic_block bb, tree name) +{ + gimple *stmt = last_stmt (bb); + if (!stmt + || (gimple_code (stmt) != GIMPLE_COND + && gimple_code (stmt) != GIMPLE_SWITCH)) + return; + + if (EDGE_COUNT (bb->succs) > 1 + || single_succ_to_potentially_threadable_block (bb)) + { + m_last_stmt = stmt; + m_visited_bbs.empty (); + m_path.truncate (0); + m_name = name; + bitmap_clear (m_imports); + + auto_bitmap interesting; + bitmap_copy (m_imports, m_ranger.gori ().imports (bb)); + bitmap_copy (interesting, m_imports); + find_paths_to_names (bb, interesting); + } +} + +// Dump a sequence of BBs through the CFG. 
+ +DEBUG_FUNCTION void +dump_path (FILE *dump_file, const vec &path) +{ + for (size_t i = 0; i < path.length (); ++i) + { + fprintf (dump_file, "BB%d", path[i]->index); + if (i + 1 < path.length ()) + fprintf (dump_file, " <- "); + } + fprintf (dump_file, "\n"); +} + +DEBUG_FUNCTION void +debug (const vec &path) +{ + dump_path (stderr, path); +} + class thread_jumps { public: thread_jumps (bool speed_p = true) - : m_profit (speed_p), m_registry (param_max_fsm_thread_paths) + : m_profit (speed_p), + m_registry (param_max_fsm_thread_paths), + m_back_threader (m_profit, m_registry) { } void find_jump_threads_backwards (basic_block bb); + void find_jump_threads_backwards_with_ranger (basic_block bb); bool thread_through_all_blocks (); private: @@ -102,6 +510,7 @@ private: tree m_name; back_threader_profitability m_profit; back_threader_registry m_registry; + back_threader m_back_threader; }; // Perform the actual jump threading for the all queued paths. @@ -548,8 +957,8 @@ back_threader_registry::register_path (const vec &m_path, EDGE_NO_COPY_SRC_BLOCK); jump_thread_path->safe_push (x); - m_lowlevel_registry.register_jump_thread (jump_thread_path); - ++m_threaded_paths; + if (m_lowlevel_registry.register_jump_thread (jump_thread_path)) + ++m_threaded_paths; return true; } @@ -818,6 +1227,12 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name) void thread_jumps::find_jump_threads_backwards (basic_block bb) { + if (param_threader_mode & THREADER_MODE_RANGER) + { + find_jump_threads_backwards_with_ranger (bb); + return; + } + gimple *stmt = get_gimple_control_stmt (bb); if (!stmt) return; @@ -850,6 +1265,28 @@ thread_jumps::find_jump_threads_backwards (basic_block bb) fsm_find_control_statement_thread_paths (name); } +// Like find_jump_threads_backwards(), but using ranger. 
+ +void +thread_jumps::find_jump_threads_backwards_with_ranger (basic_block bb) +{ + gimple *stmt = get_gimple_control_stmt (bb); + if (!stmt) + return; + + enum gimple_code code = gimple_code (stmt); + tree name = NULL; + if (code == GIMPLE_SWITCH) + name = gimple_switch_index (as_a (stmt)); + else if (code == GIMPLE_GOTO) + name = gimple_goto_dest (stmt); + else if (code == GIMPLE_COND) + name = gimple_cond_lhs (stmt); + + m_name = name; + m_back_threader.find_paths (bb, name); +} + namespace { const pass_data pass_data_thread_jumps = @@ -883,12 +1320,12 @@ pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED) return flag_expensive_optimizations; } +// Try to thread blocks in FUN. Return TRUE if any jump thread paths were +// registered. -unsigned int -pass_thread_jumps::execute (function *fun) +static bool +try_thread_blocks (function *fun) { - loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); - /* Try to thread each block with more than one successor. */ thread_jumps threader; basic_block bb; @@ -897,7 +1334,30 @@ pass_thread_jumps::execute (function *fun) if (EDGE_COUNT (bb->succs) > 1) threader.find_jump_threads_backwards (bb); } - bool changed = threader.thread_through_all_blocks (); + return threader.thread_through_all_blocks (); +} + +unsigned int +pass_thread_jumps::execute (function *fun) +{ + loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); + + // Iterative mode is a testing construct and is not meant for public + // consumption. It is OFF by default. + bool iterative = param_threader_iterative; + + bool changed = false; + while (try_thread_blocks (fun)) + { + changed = true; + + if (!iterative) + break; + + if ((param_threader_mode & THREADER_MODE_RANGER) == 0) + break; + cleanup_tree_cfg (TODO_update_ssa); + } loop_optimizer_finalize (); return changed ? 
TODO_cleanup_cfg : 0; -- cgit v1.1 From cac2353f8b6424980a78fe224b20b2a70e98de51 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Mon, 2 Aug 2021 15:12:30 +0200 Subject: Remove --param=threader-iterative. This was meant to be an internal construct, but I see folks are using it and submitting PRs against it. Let's just remove this to avoid further confusion. Tested on x86-64 Linux. gcc/ChangeLog: PR tree-optimization/101724 * params.opt: Remove --param=threader-iterative. * tree-ssa-threadbackward.c (pass_thread_jumps::execute): Remove iterative mode. --- gcc/tree-ssa-threadbackward.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 2c0e975..91ce443 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -1342,24 +1342,10 @@ pass_thread_jumps::execute (function *fun) { loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); - // Iterative mode is a testing construct and is not meant for public - // consumption. It is OFF by default. - bool iterative = param_threader_iterative; - - bool changed = false; - while (try_thread_blocks (fun)) - { - changed = true; - - if (!iterative) - break; - - if ((param_threader_mode & THREADER_MODE_RANGER) == 0) - break; - cleanup_tree_cfg (TODO_update_ssa); - } + bool changed = try_thread_blocks (fun); loop_optimizer_finalize (); + return changed ? TODO_cleanup_cfg : 0; } -- cgit v1.1 From a3d3e8c362c2d850543eb2e2631128e1efc368f0 Mon Sep 17 00:00:00 2001 From: Martin Sebor Date: Thu, 5 Aug 2021 19:50:35 -0600 Subject: Adjust by-value function vec arguments to by-reference. gcc/c/ChangeLog: * c-parser.c (c_parser_declaration_or_fndef): Adjust by-value function vec arguments to by-reference. (c_finish_omp_declare_simd): Same. (c_parser_compound_statement_nostart): Same. (c_parser_for_statement): Same. (c_parser_objc_methodprotolist): Same. 
(c_parser_oacc_routine): Same. (c_parser_omp_for_loop): Same. (c_parser_omp_declare_simd): Same. gcc/ChangeLog: * dominance.c (prune_bbs_to_update_dominators): Adjust by-value vec arguments to by-reference. (iterate_fix_dominators): Same. * dominance.h (iterate_fix_dominators): Same. * ipa-prop.h: Call auto_vec::to_vec_legacy. * tree-data-ref.c (dump_data_dependence_relation): Adjust by-value vec arguments to by-reference. (debug_data_dependence_relation): Same. (dump_data_dependence_relations): Same. * tree-data-ref.h (debug_data_dependence_relation): Same. (dump_data_dependence_relations): Same. * tree-predcom.c (dump_chains): Same. (initialize_root_vars_lm): Same. (determine_unroll_factor): Same. (replace_phis_by_defined_names): Same. (insert_init_seqs): Same. (pcom_worker::tree_predictive_commoning_loop): Call auto_vec::to_vec_legacy. * tree-ssa-pre.c (insert_into_preds_of_block): Adjust by-value vec arguments to by-reference. * tree-ssa-threadbackward.c (populate_worklist): Same. (back_threader::resolve_def): Same. * tree-vect-data-refs.c (vect_check_nonzero_value): Same. (vect_enhance_data_refs_alignment): Same. (vect_check_lower_bound): Same. (vect_prune_runtime_alias_test_list): Same. (vect_permute_store_chain): Same. * tree-vect-slp-patterns.c (vect_normalize_conj_loc): Same. * tree-vect-stmts.c (vect_create_vectorized_demotion_stmts): Same. * tree-vectorizer.h (vect_permute_store_chain): Same. * vec.c (test_init): New function. (vec_c_tests): Call new function. * vec.h (vec): Declare ctors, dtor, and assignment. (auto_vec::vec_to_legacy): New function. (vec::copy): Adjust initialization. 
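The rationale for the by-value-to-by-reference change in this commit is easiest to see in isolation: when a growable vector is passed by value, a callee such as `resolve_def` that pushes deferred SSA names onto the worklist grows a copy the caller never sees. Below is a minimal sketch of that lost-update bug, using `std::vector`/`std::string` as stand-ins for GCC's `vec` and SSA names (all names here are illustrative, not GCC API):

```cpp
#include <string>
#include <vector>

// Buggy variant: the worklist is copied at the call, so the push
// lands in the copy and is lost when the callee returns.
static void resolve_def_by_value (std::vector<std::string> worklist)
{
  worklist.push_back ("rhs_copy");
}

// Fixed variant, mirroring the patch: take the worklist by reference
// so deferred names are visible to the caller's processing loop.
static void resolve_def_by_ref (std::vector<std::string> &worklist)
{
  worklist.push_back ("rhs_copy");
}
```

Note that GCC's own `vec` is worse than this sketch suggests: a by-value copy is shallow, so the copy initially aliases the caller's buffer and a reallocation in the callee can leave the two views inconsistent, which is why the commit also touches `vec`'s constructors and assignment.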
--- gcc/tree-ssa-threadbackward.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 91ce443..e237eb4 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -92,7 +92,7 @@ public: private: void maybe_register_path (edge taken_edge); bool find_paths_to_names (basic_block bb, bitmap imports); - bool resolve_def (tree name, bitmap interesting, vec worklist); + bool resolve_def (tree name, bitmap interesting, vec &worklist); bool resolve_phi (gphi *phi, bitmap imports); edge find_taken_edge (const vec &path); edge find_taken_edge_cond (const vec &path, gcond *); @@ -240,7 +240,7 @@ back_threader::find_taken_edge_cond (const vec &path, // Populate a vector of trees from a bitmap. static inline void -populate_worklist (vec worklist, bitmap bits) +populate_worklist (vec &worklist, bitmap bits) { bitmap_iterator bi; unsigned i; @@ -317,7 +317,7 @@ back_threader::resolve_phi (gphi *phi, bitmap interesting) // current path to be constant, register the path, and return TRUE. bool -back_threader::resolve_def (tree name, bitmap interesting, vec worklist) +back_threader::resolve_def (tree name, bitmap interesting, vec &worklist) { gimple *def_stmt = SSA_NAME_DEF_STMT (name); -- cgit v1.1 From 34cd97ff94bdb43e8c9de150f1d89527fc42138e Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 5 Aug 2021 08:01:47 +0200 Subject: Remove legacy back threader. At this point I don't see any use for the legacy mode, which I had originally left in place during the transition. This patch removes the legacy back threader, and cleans up the code a bit. There are no functional changes to the non-legacy code. Tested on x86-64 Linux. gcc/ChangeLog: * doc/invoke.texi: Remove docs for threader-mode param. * flag-types.h (enum threader_mode): Remove. * params.opt: Remove threader-mode param. 
* tree-ssa-threadbackward.c (class back_threader): Remove path_is_unreachable_p. Make find_paths private. Add maybe_thread and thread_through_all_blocks. Remove reference marker for m_registry. Remove reference marker for m_profit. (back_threader::back_threader): Adjust for registry and profit not being references. (dump_path): Move down. (debug): Move down. (class thread_jumps): Remove. (class back_threader_registry): Remove m_all_paths. Remove destructor. (thread_jumps::thread_through_all_blocks): Move to back_threader class. (fsm_find_thread_path): Remove (back_threader::maybe_thread): New. (back_threader::thread_through_all_blocks): Move from thread_jumps. (back_threader_registry::back_threader_registry): Remove m_all_paths. (back_threader_registry::~back_threader_registry): Remove. (thread_jumps::find_taken_edge): Remove. (thread_jumps::check_subpath_and_update_thread_path): Remove. (thread_jumps::maybe_register_path): Remove. (thread_jumps::handle_phi): Remove. (handle_assignment_p): Remove. (thread_jumps::handle_assignment): Remove. (thread_jumps::fsm_find_control_statement_thread_paths): Remove. (thread_jumps::find_jump_threads_backwards): Remove. (thread_jumps::find_jump_threads_backwards_with_ranger): Remove. (try_thread_blocks): Rename find_jump_threads_backwards to maybe_thread. (pass_early_thread_jumps::execute): Same. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Remove call into the legacy code and adjust for ranger threader. 
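After this commit the driver logic collapses to: call `maybe_thread_block` on each block with more than one successor, then a single `thread_through_all_blocks`, which defers to the registry. The registry's cap-and-count behavior (bounded by `--param=max-fsm-thread-paths`) can be sketched with a toy class; this uses std containers and hypothetical names rather than GCC's `jump_thread_path_registry`:

```cpp
#include <vector>

// Toy stand-in for back_threader_registry: accept paths (vectors of
// basic-block indices) until a cap is hit, and report whether any
// path was actually queued for threading.
class toy_registry
{
public:
  explicit toy_registry (int max_allowable_paths)
    : m_max_allowable_paths (max_allowable_paths) {}

  // Returns true if the path was accepted; callers only bump their
  // threaded-path counters on a true return, as in the patch that
  // made register_jump_thread return bool.
  bool register_path (const std::vector<int> &path)
  {
    if (m_threaded_paths >= m_max_allowable_paths)
      return false;		// Cap reached; refuse further paths.
    m_paths.push_back (path);
    ++m_threaded_paths;
    return true;
  }

  // Stand-in for thread_through_all_blocks: report whether anything
  // was registered (the real code rewrites the CFG here).
  bool thread_through_all_blocks () const
  {
    return !m_paths.empty ();
  }

private:
  std::vector<std::vector<int>> m_paths;
  const int m_max_allowable_paths;
  int m_threaded_paths = 0;
};
```

The cap exists because each registered path duplicates blocks when the CFG is rewritten, so an unbounded registry could blow up code size.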
--- gcc/tree-ssa-threadbackward.c | 543 +++++------------------------------------- 1 file changed, 60 insertions(+), 483 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index e237eb4..3aad1493 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -51,12 +51,9 @@ class back_threader_registry { public: back_threader_registry (int max_allowable_paths); - ~back_threader_registry (); bool register_path (const vec &, edge taken); bool thread_through_all_blocks (); - private: - vec> m_all_paths; jump_thread_path_registry m_lowlevel_registry; const int m_max_allowable_paths; int m_threaded_paths; @@ -72,24 +69,19 @@ public: { } bool profitable_path_p (const vec &, tree name, edge taken, bool *irreducible_loop = NULL); - private: const bool m_speed_p; }; -// Ranger based backwards threader. - class back_threader { - // Temporary until we remove old code. - friend bool path_is_unreachable_p (const vec &); - public: - back_threader (back_threader_profitability &, back_threader_registry &); + back_threader (bool speed_p); ~back_threader (); - void find_paths (basic_block bb, tree name); - + void maybe_thread_block (basic_block bb); + bool thread_through_all_blocks (); private: + void find_paths (basic_block bb, tree name); void maybe_register_path (edge taken_edge); bool find_paths_to_names (basic_block bb, bitmap imports); bool resolve_def (tree name, bitmap interesting, vec &worklist); @@ -98,8 +90,8 @@ private: edge find_taken_edge_cond (const vec &path, gcond *); edge find_taken_edge_switch (const vec &path, gswitch *); - back_threader_registry &m_registry; - back_threader_profitability &m_profit; + back_threader_registry m_registry; + back_threader_profitability m_profit; gimple_ranger m_ranger; path_range_query m_solver; @@ -123,10 +115,9 @@ private: // in a the given direction. 
const edge back_threader::UNREACHABLE_EDGE = (edge) -1; -back_threader::back_threader (back_threader_profitability &profit, - back_threader_registry ®istry) - : m_registry (registry), - m_profit (profit), +back_threader::back_threader (bool speed_p) + : m_registry (param_max_fsm_thread_paths), + m_profit (speed_p), m_solver (m_ranger) { m_last_stmt = NULL; @@ -456,74 +447,9 @@ back_threader::find_paths (basic_block bb, tree name) } } -// Dump a sequence of BBs through the CFG. - -DEBUG_FUNCTION void -dump_path (FILE *dump_file, const vec &path) -{ - for (size_t i = 0; i < path.length (); ++i) - { - fprintf (dump_file, "BB%d", path[i]->index); - if (i + 1 < path.length ()) - fprintf (dump_file, " <- "); - } - fprintf (dump_file, "\n"); -} - -DEBUG_FUNCTION void -debug (const vec &path) -{ - dump_path (stderr, path); -} - -class thread_jumps -{ -public: - thread_jumps (bool speed_p = true) - : m_profit (speed_p), - m_registry (param_max_fsm_thread_paths), - m_back_threader (m_profit, m_registry) - { } - void find_jump_threads_backwards (basic_block bb); - void find_jump_threads_backwards_with_ranger (basic_block bb); - bool thread_through_all_blocks (); - -private: - void maybe_register_path (const vec &m_path, - tree name, - edge taken_edge); - edge find_taken_edge (const vec &path, tree arg); - void handle_assignment (gimple *stmt, basic_block def_bb); - void handle_phi (gphi *phi, basic_block def_bb); - void fsm_find_control_statement_thread_paths (tree name); - bool check_subpath_and_update_thread_path (basic_block last_bb, - basic_block new_bb, - int *next_path_length); - - /* Hash to keep track of seen bbs. */ - hash_set m_visited_bbs; - /* Current path we're analyzing. */ - auto_vec m_path; - /* Tracks if we have recursed through a loop PHI node. 
*/ - bool m_seen_loop_phi; - - tree m_name; - back_threader_profitability m_profit; - back_threader_registry m_registry; - back_threader m_back_threader; -}; - -// Perform the actual jump threading for the all queued paths. - -bool -thread_jumps::thread_through_all_blocks () -{ - return m_registry.thread_through_all_blocks (); -} - -/* Simple helper to get the last statement from BB, which is assumed - to be a control statement. Return NULL if the last statement is - not a control statement. */ +// Simple helper to get the last statement from BB, which is assumed +// to be a control statement. Return NULL if the last statement is +// not a control statement. static gimple * get_gimple_control_stmt (basic_block bb) @@ -540,55 +466,65 @@ get_gimple_control_stmt (basic_block bb) return NULL; } -/* Return true if the CFG contains at least one path from START_BB to - END_BB. When a path is found, record in PATH the blocks from - END_BB to START_BB. LOCAL_VISITED_BBS is used to make sure we - don't fall into an infinite loop. Bound the recursion to basic - blocks belonging to LOOP. */ +// Search backwards from BB looking for paths where the final +// conditional maybe threaded to a successor block. Record such paths +// for jump threading. 
-static bool -fsm_find_thread_path (basic_block start_bb, basic_block end_bb, - vec &path, - hash_set &local_visited_bbs, - loop_p loop) +void +back_threader::maybe_thread_block (basic_block bb) { - if (loop != start_bb->loop_father) - return false; + gimple *stmt = get_gimple_control_stmt (bb); + if (!stmt) + return; - if (start_bb == end_bb) - { - path.safe_push (start_bb); - return true; - } + enum gimple_code code = gimple_code (stmt); + tree name; + if (code == GIMPLE_SWITCH) + name = gimple_switch_index (as_a (stmt)); + else if (code == GIMPLE_COND) + name = gimple_cond_lhs (stmt); + else if (code == GIMPLE_GOTO) + name = gimple_goto_dest (stmt); + else + name = NULL; + + find_paths (bb, name); +} + +// Perform the actual jump threading for the all queued paths. + +bool +back_threader::thread_through_all_blocks () +{ + return m_registry.thread_through_all_blocks (); +} + +// Dump a sequence of BBs through the CFG. - if (!local_visited_bbs.add (start_bb)) +DEBUG_FUNCTION void +dump_path (FILE *dump_file, const vec &path) +{ + for (size_t i = 0; i < path.length (); ++i) { - edge e; - edge_iterator ei; - FOR_EACH_EDGE (e, ei, start_bb->succs) - if (fsm_find_thread_path (e->dest, end_bb, path, local_visited_bbs, - loop)) - { - path.safe_push (start_bb); - return true; - } + fprintf (dump_file, "BB%d", path[i]->index); + if (i + 1 < path.length ()) + fprintf (dump_file, " <- "); } + fprintf (dump_file, "\n"); +} - return false; +DEBUG_FUNCTION void +debug (const vec &path) +{ + dump_path (stderr, path); } back_threader_registry::back_threader_registry (int max_allowable_paths) : m_max_allowable_paths (max_allowable_paths) { - m_all_paths.create (5); m_threaded_paths = 0; } -back_threader_registry::~back_threader_registry () -{ - m_all_paths.release (); -} - bool back_threader_registry::thread_through_all_blocks () { @@ -881,39 +817,6 @@ back_threader_profitability::profitable_path_p (const vec &m_path, return true; } -/* Return the taken edge out of a path, 
assuming that the underlying assignment - or PHI SSA resolves to ARG. */ - -edge -thread_jumps::find_taken_edge (const vec &path, tree arg) -{ - if (TREE_CODE_CLASS (TREE_CODE (arg)) != tcc_constant) - return NULL; - - gcc_checking_assert (!path.is_empty ()); - gimple *stmt = get_gimple_control_stmt (m_path[0]); - - /* We have found a constant value for ARG. For GIMPLE_SWITCH - and GIMPLE_GOTO, we use it as-is. However, for a GIMPLE_COND - we need to substitute, fold and simplify so we can determine - the edge taken out of the last block. */ - if (gimple_code (stmt) == GIMPLE_COND) - { - enum tree_code cond_code = gimple_cond_code (stmt); - - /* We know the underyling format of the condition. */ - arg = fold_binary (cond_code, boolean_type_node, - arg, gimple_cond_rhs (stmt)); - } - - /* If this path threaded through the loop latch back into the - same loop and the destination does not dominate the loop - latch, then this thread would create an irreducible loop. - - We have to know the outgoing edge to figure this out. */ - return ::find_taken_edge (m_path[0], arg); -} - /* The current path PATH is a vector of blocks forming a jump threading path in reverse order. TAKEN_EDGE is the edge taken from path[0]. @@ -962,331 +865,6 @@ back_threader_registry::register_path (const vec &m_path, return true; } -/* While following a chain of SSA_NAME definitions, we jumped from a - definition in LAST_BB to a definition in NEW_BB (walking - backwards). - - Verify there is a single path between the blocks and none of the - blocks in the path is already in VISITED_BBS. If so, then update - VISISTED_BBS, add the new blocks to PATH and return TRUE. - Otherwise return FALSE. - - Store the length of the subpath in NEXT_PATH_LENGTH. 
*/ - -bool -thread_jumps::check_subpath_and_update_thread_path (basic_block last_bb, - basic_block new_bb, - int *next_path_length) -{ - edge e; - int e_count = 0; - edge_iterator ei; - auto_vec next_path; - - FOR_EACH_EDGE (e, ei, last_bb->preds) - { - hash_set local_visited_bbs; - - if (fsm_find_thread_path (new_bb, e->src, next_path, - local_visited_bbs, e->src->loop_father)) - ++e_count; - - /* If there is more than one path, stop. */ - if (e_count > 1) - return false; - } - - /* Stop if we have not found a path: this could occur when the recursion - is stopped by one of the bounds. */ - if (e_count == 0) - return false; - - /* Make sure we haven't already visited any of the nodes in - NEXT_PATH. Don't add them here to avoid pollution. */ - for (unsigned int i = 0; i + 1 < next_path.length (); i++) - { - if (m_visited_bbs.contains (next_path[i])) - return false; - } - - /* Now add the nodes to VISISTED_BBS. */ - for (unsigned int i = 0; i + 1 < next_path.length (); i++) - m_visited_bbs.add (next_path[i]); - - /* Append all the nodes from NEXT_PATH to PATH. */ - m_path.safe_splice (next_path); - *next_path_length = next_path.length (); - - return true; -} - -/* If this is a profitable jump thread path, register it. - - NAME is an SSA NAME with a possible constant value of ARG on PATH. - - DEF_BB is the basic block that ultimately defines the constant. */ - -void -thread_jumps::maybe_register_path (const vec &m_path, - tree name, - edge taken_edge) -{ - bool irreducible = false; - bool profitable = m_profit.profitable_path_p (m_path, name, taken_edge, - &irreducible); - if (profitable) - { - if (!m_registry.register_path (m_path, taken_edge)) - return; - - if (irreducible) - vect_free_loop_info_assumptions (m_path[0]->loop_father); - } -} - -/* Given PHI which defines NAME in block DEF_BB, recurse through the - PHI's arguments searching for paths where NAME will ultimately have - a constant value. 
- - PATH contains the series of blocks to traverse that will result in - NAME having a constant value. */ - -void -thread_jumps::handle_phi (gphi *phi, basic_block def_bb) -{ - /* Iterate over the arguments of PHI. */ - for (unsigned int i = 0; i < gimple_phi_num_args (phi); i++) - { - tree arg = gimple_phi_arg_def (phi, i); - basic_block bbi = gimple_phi_arg_edge (phi, i)->src; - - /* Skip edges pointing outside the current loop. */ - if (!arg || def_bb->loop_father != bbi->loop_father) - continue; - - if (TREE_CODE (arg) == SSA_NAME) - { - m_path.safe_push (bbi); - /* Recursively follow SSA_NAMEs looking for a constant - definition. */ - fsm_find_control_statement_thread_paths (arg); - - m_path.pop (); - continue; - } - - m_path.safe_push (bbi); - edge taken_edge = find_taken_edge (m_path, arg); - if (taken_edge) - maybe_register_path (m_path, m_name, taken_edge); - m_path.pop (); - } -} - -/* Return TRUE if STMT is a gimple assignment we want to either directly - handle or recurse through. Return FALSE otherwise. - - Note that adding more cases here requires adding cases to handle_assignment - below. */ - -static bool -handle_assignment_p (gimple *stmt) -{ - if (is_gimple_assign (stmt)) - { - enum tree_code def_code = gimple_assign_rhs_code (stmt); - - /* If the RHS is an SSA_NAME, then we will recurse through it. - Go ahead and filter out cases where the SSA_NAME is a default - definition. There's little to be gained by trying to handle that. */ - if (def_code == SSA_NAME - && !SSA_NAME_IS_DEFAULT_DEF (gimple_assign_rhs1 (stmt))) - return true; - - /* If the RHS is a constant, then it's a terminal that we'll want - to handle as well. */ - if (TREE_CODE_CLASS (def_code) == tcc_constant) - return true; - } - - /* Anything not explicitly allowed is not handled. */ - return false; -} - -/* Given STMT which defines NAME in block DEF_BB, recurse through the - PHI's arguments searching for paths where NAME will ultimately have - a constant value. 
- - PATH contains the series of blocks to traverse that will result in - NAME having a constant value. */ - -void -thread_jumps::handle_assignment (gimple *stmt, basic_block def_bb) -{ - tree arg = gimple_assign_rhs1 (stmt); - - if (TREE_CODE (arg) == SSA_NAME) - fsm_find_control_statement_thread_paths (arg); - else - { - if (CHECKING_P) - { - gcc_assert (!m_path.is_empty ()); - basic_block top = m_path[m_path.length () - 1]; - gcc_assert (top == def_bb); - } - edge taken_edge = find_taken_edge (m_path, arg); - if (taken_edge) - maybe_register_path (m_path, m_name, taken_edge); - } -} - -/* We trace the value of the SSA_NAME NAME back through any phi nodes - looking for places where it gets a constant value and save the - path. */ - -void -thread_jumps::fsm_find_control_statement_thread_paths (tree name) -{ - /* If NAME appears in an abnormal PHI, then don't try to trace its - value back through PHI nodes. */ - if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name)) - return; - - gimple *def_stmt = SSA_NAME_DEF_STMT (name); - basic_block def_bb = gimple_bb (def_stmt); - - if (def_bb == NULL) - return; - - /* We allow the SSA chain to contains PHIs and simple copies and constant - initializations. */ - if (gimple_code (def_stmt) != GIMPLE_PHI - && gimple_code (def_stmt) != GIMPLE_ASSIGN) - return; - - if (gimple_code (def_stmt) == GIMPLE_PHI - && (gimple_phi_num_args (def_stmt) - >= (unsigned) param_fsm_maximum_phi_arguments)) - return; - - if (is_gimple_assign (def_stmt) - && ! handle_assignment_p (def_stmt)) - return; - - /* Avoid infinite recursion. */ - if (m_visited_bbs.add (def_bb)) - return; - - int next_path_length = 0; - basic_block last_bb_in_path = m_path.last (); - - if (loop_containing_stmt (def_stmt)->header == gimple_bb (def_stmt)) - { - /* Do not walk through more than one loop PHI node. 
*/ - if (m_seen_loop_phi) - return; - m_seen_loop_phi = true; - } - - /* Following the chain of SSA_NAME definitions, we jumped from a definition in - LAST_BB_IN_PATH to a definition in DEF_BB. When these basic blocks are - different, append to PATH the blocks from LAST_BB_IN_PATH to DEF_BB. */ - if (def_bb != last_bb_in_path) - { - /* When DEF_BB == LAST_BB_IN_PATH, then the first block in the path - will already be in VISITED_BBS. When they are not equal, then we - must ensure that first block is accounted for to ensure we do not - create bogus jump threading paths. */ - m_visited_bbs.add (m_path[0]); - if (!check_subpath_and_update_thread_path (last_bb_in_path, def_bb, - &next_path_length)) - return; - } - - gcc_assert (m_path.last () == def_bb); - - if (gimple_code (def_stmt) == GIMPLE_PHI) - handle_phi (as_a (def_stmt), def_bb); - else if (gimple_code (def_stmt) == GIMPLE_ASSIGN) - handle_assignment (def_stmt, def_bb); - - /* Remove all the nodes that we added from NEXT_PATH. */ - if (next_path_length) - m_path.truncate (m_path.length () - next_path_length); -} - -/* Search backwards from BB looking for paths where NAME (an SSA_NAME) - is a constant. Record such paths for jump threading. - - It is assumed that BB ends with a control statement and that by - finding a path where NAME is a constant, we can thread the path. - SPEED_P indicates that we could increase code size to improve the - code path. 
*/ - -void -thread_jumps::find_jump_threads_backwards (basic_block bb) -{ - if (param_threader_mode & THREADER_MODE_RANGER) - { - find_jump_threads_backwards_with_ranger (bb); - return; - } - - gimple *stmt = get_gimple_control_stmt (bb); - if (!stmt) - return; - - enum gimple_code code = gimple_code (stmt); - tree name = NULL; - if (code == GIMPLE_SWITCH) - name = gimple_switch_index (as_a (stmt)); - else if (code == GIMPLE_GOTO) - name = gimple_goto_dest (stmt); - else if (code == GIMPLE_COND) - { - if (TREE_CODE (gimple_cond_lhs (stmt)) == SSA_NAME - && TREE_CODE_CLASS (TREE_CODE (gimple_cond_rhs (stmt))) == tcc_constant - && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))) - || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))))) - name = gimple_cond_lhs (stmt); - } - - if (!name || TREE_CODE (name) != SSA_NAME) - return; - - /* Initialize pass local data that's different for each BB. */ - m_path.truncate (0); - m_path.safe_push (bb); - m_visited_bbs.empty (); - m_seen_loop_phi = false; - m_name = name; - - fsm_find_control_statement_thread_paths (name); -} - -// Like find_jump_threads_backwards(), but using ranger. - -void -thread_jumps::find_jump_threads_backwards_with_ranger (basic_block bb) -{ - gimple *stmt = get_gimple_control_stmt (bb); - if (!stmt) - return; - - enum gimple_code code = gimple_code (stmt); - tree name = NULL; - if (code == GIMPLE_SWITCH) - name = gimple_switch_index (as_a (stmt)); - else if (code == GIMPLE_GOTO) - name = gimple_goto_dest (stmt); - else if (code == GIMPLE_COND) - name = gimple_cond_lhs (stmt); - - m_name = name; - m_back_threader.find_paths (bb, name); -} - namespace { const pass_data pass_data_thread_jumps = @@ -1327,12 +905,12 @@ static bool try_thread_blocks (function *fun) { /* Try to thread each block with more than one successor. 
*/ - thread_jumps threader; + back_threader threader (/*speed_p=*/true); basic_block bb; FOR_EACH_BB_FN (bb, fun) { if (EDGE_COUNT (bb->succs) > 1) - threader.find_jump_threads_backwards (bb); + threader.maybe_thread_block (bb); } return threader.thread_through_all_blocks (); } @@ -1390,19 +968,18 @@ pass_early_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED) return true; } - unsigned int pass_early_thread_jumps::execute (function *fun) { loop_optimizer_init (AVOID_CFG_MODIFICATIONS); /* Try to thread each block with more than one successor. */ - thread_jumps threader (/*speed_p=*/false); + back_threader threader (/*speed_p=*/false); basic_block bb; FOR_EACH_BB_FN (bb, fun) { if (EDGE_COUNT (bb->succs) > 1) - threader.find_jump_threads_backwards (bb); + threader.maybe_thread_block (bb); } threader.thread_through_all_blocks (); -- cgit v1.1 From 779275c0835b58325f806568836c8b5081d1f52f Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Fri, 3 Sep 2021 10:57:33 +0200 Subject: Improve backwards threader debugging dumps. This patch adds debugging helpers to the backwards threader. I have also noticed that profitable_path_p() can bail early on paths that cross loops and leave the dump of blocks incomplete. Fixed as well. Unfortunately the new methods cannot be marked const, because we call the solver's dump which is not const. I believe this was because the ranger dump calls m_cache.block_range(). This could probably use a cleanup at a later time. Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader::dump): New. (back_threader::debug): New. (back_threader_profitability::profitable_path_p): Dump blocks even if we are bailing early.
--- gcc/tree-ssa-threadbackward.c | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 3aad1493..b9a0d9a 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-range-path.h" #include "ssa.h" #include "tree-cfgcleanup.h" +#include "tree-pretty-print.h" // Path registry for the backwards threader. After all paths have been // registered with register_path(), thread_through_all_blocks() is called @@ -89,6 +90,8 @@ private: edge find_taken_edge (const vec &path); edge find_taken_edge_cond (const vec &path, gcond *); edge find_taken_edge_switch (const vec &path, gswitch *); + virtual void debug (); + virtual void dump (FILE *out); back_threader_registry m_registry; back_threader_profitability m_profit; @@ -519,6 +522,30 @@ debug (const vec &path) dump_path (stderr, path); } +void +back_threader::dump (FILE *out) +{ + m_solver.dump (out); + fprintf (out, "\nCandidates for pre-computation:\n"); + fprintf (out, "===================================\n"); + + bitmap_iterator bi; + unsigned i; + + EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi) + { + tree name = ssa_name (i); + print_generic_expr (out, name, TDF_NONE); + fprintf (out, "\n"); + } +} + +void +back_threader::debug () +{ + dump (stderr); +} + back_threader_registry::back_threader_registry (int max_allowable_paths) : m_max_allowable_paths (max_allowable_paths) { @@ -607,6 +634,14 @@ back_threader_profitability::profitable_path_p (const vec &m_path, if (bb->loop_father != loop) { path_crosses_loops = true; + + // Dump rest of blocks. 
+ if (dump_file && (dump_flags & TDF_DETAILS)) + for (j++; j < m_path.length (); j++) + { + bb = m_path[j]; + fprintf (dump_file, " bb:%i", bb->index); + } break; } -- cgit v1.1 From 01005550377ff17716e6ad57c62df726877ab79f Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Fri, 3 Sep 2021 12:07:49 +0200 Subject: Do not assume loop header threading in backward threader. The registry's thread_through_all_blocks() has a may_peel_loop_headers argument. When refactoring the backward threader code, I removed this argument for the local passthru method because it was always TRUE. This may not necessarily be true in the future, if the backward threader is called from another context. This patch removes the default definition, in favor of an argument that is exactly the same as the identically named function in tree-ssa-threadupdate.c. I think this also makes it less confusing when looking at both methods across the source base. Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader::thread_through_all_blocks): Add may_peel_loop_headers. (back_threader_registry::thread_through_all_blocks): Same. (try_thread_blocks): Pass may_peel_loop_headers argument. (pass_early_thread_jumps::execute): Same. 
--- gcc/tree-ssa-threadbackward.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index b9a0d9a..2fa22f8 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -53,7 +53,7 @@ class back_threader_registry public: back_threader_registry (int max_allowable_paths); bool register_path (const vec &, edge taken); - bool thread_through_all_blocks (); + bool thread_through_all_blocks (bool may_peel_loop_headers); private: jump_thread_path_registry m_lowlevel_registry; const int m_max_allowable_paths; @@ -80,7 +80,7 @@ public: back_threader (bool speed_p); ~back_threader (); void maybe_thread_block (basic_block bb); - bool thread_through_all_blocks (); + bool thread_through_all_blocks (bool may_peel_loop_headers); private: void find_paths (basic_block bb, tree name); void maybe_register_path (edge taken_edge); @@ -497,9 +497,9 @@ back_threader::maybe_thread_block (basic_block bb) // Perform the actual jump threading for the all queued paths. bool -back_threader::thread_through_all_blocks () +back_threader::thread_through_all_blocks (bool may_peel_loop_headers) { - return m_registry.thread_through_all_blocks (); + return m_registry.thread_through_all_blocks (may_peel_loop_headers); } // Dump a sequence of BBs through the CFG. 
@@ -553,9 +553,9 @@ back_threader_registry::back_threader_registry (int max_allowable_paths) } bool -back_threader_registry::thread_through_all_blocks () +back_threader_registry::thread_through_all_blocks (bool may_peel_loop_headers) { - return m_lowlevel_registry.thread_through_all_blocks (true); + return m_lowlevel_registry.thread_through_all_blocks (may_peel_loop_headers); } /* Examine jump threading path PATH and return TRUE if it is profitable to @@ -947,7 +947,7 @@ try_thread_blocks (function *fun) if (EDGE_COUNT (bb->succs) > 1) threader.maybe_thread_block (bb); } - return threader.thread_through_all_blocks (); + return threader.thread_through_all_blocks (/*peel_loop_headers=*/true); } unsigned int @@ -1016,7 +1016,7 @@ pass_early_thread_jumps::execute (function *fun) if (EDGE_COUNT (bb->succs) > 1) threader.maybe_thread_block (bb); } - threader.thread_through_all_blocks (); + threader.thread_through_all_blocks (/*peel_loop_headers=*/true); loop_optimizer_finalize (); return 0; -- cgit v1.1 From cbeeadff4c041c09a6335105d596019b6d583880 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Sun, 5 Sep 2021 09:41:50 +0200 Subject: Clean up registering of paths in backwards threader. All callers to maybe_register_path() call find_taken_edge() beforehand and pass the edge as an argument. There's no reason to repeat this at each call site. This is a clean-up in preparation for some other enhancements to the backwards threader. Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader::maybe_register_path): Remove argument and call find_taken_edge. (back_threader::resolve_phi): Do not calculate taken edge before calling maybe_register_path. (back_threader::find_paths_to_names): Same. 
--- gcc/tree-ssa-threadbackward.c | 44 ++++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 21 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 2fa22f8..6827d00 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -83,7 +83,7 @@ public: bool thread_through_all_blocks (bool may_peel_loop_headers); private: void find_paths (basic_block bb, tree name); - void maybe_register_path (edge taken_edge); + edge maybe_register_path (); bool find_paths_to_names (basic_block bb, bitmap imports); bool resolve_def (tree name, bitmap interesting, vec &worklist); bool resolve_phi (gphi *phi, bitmap imports); @@ -134,22 +134,31 @@ back_threader::~back_threader () } // Register the current path for jump threading if it's profitable to -// do so. TAKEN_EDGE is the known edge out of the path. +// do so. +// +// Return the known taken edge out of the path, even if the path was +// not registered, or NULL if the taken edge could not be determined. -void -back_threader::maybe_register_path (edge taken_edge) +edge +back_threader::maybe_register_path () { - bool irreducible = false; - bool profitable - = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible); + edge taken_edge = find_taken_edge (m_path); - if (profitable) + if (taken_edge && taken_edge != UNREACHABLE_EDGE) { - m_registry.register_path (m_path, taken_edge); + bool irreducible = false; + bool profitable + = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible); + + if (profitable) + { + m_registry.register_path (m_path, taken_edge); - if (irreducible) - vect_free_loop_info_assumptions (m_path[0]->loop_father); + if (irreducible) + vect_free_loop_info_assumptions (m_path[0]->loop_father); + } } + return taken_edge; } // Return the known taken edge out of a path. 
If the path can be @@ -295,12 +304,9 @@ back_threader::resolve_phi (gphi *phi, bitmap interesting) else if (TREE_CODE (arg) == INTEGER_CST) { m_path.safe_push (e->src); - edge taken_edge = find_taken_edge (m_path); + edge taken_edge = maybe_register_path (); if (taken_edge && taken_edge != UNREACHABLE_EDGE) - { - maybe_register_path (taken_edge); - done = true; - } + done = true; m_path.pop (); } } @@ -388,12 +394,8 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting) || bitmap_bit_p (m_ranger.gori ().exports (bb), i)) && m_path.length () > 1) { - edge taken_edge = find_taken_edge (m_path); - if (taken_edge) + if (maybe_register_path ()) { - if (taken_edge != UNREACHABLE_EDGE) - maybe_register_path (taken_edge); - done = true; goto leave_bb; } -- cgit v1.1 From 90ef15352701c6880faee83a46031d7837ab9d27 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Sun, 5 Sep 2021 12:44:41 +0200 Subject: Add an unreachable_path_p method to path_range_query. Keeping track of unreachable calculations while traversing a path is useful to determine edge reachability, among other things. We've been doing this ad-hoc in the backwards threader, so this provides a cleaner way of accessing the information. This patch also makes it easier to compare different threading implementations, in some upcoming work. For example, it's currently difficult to gauge how good we're doing compared to the forward threader, because it can thread paths that are obviously unreachable. This provides a way of discarding those paths. Note that I've opted to keep unreachable_path_p() out-of-line, because I have local changes that will enhance this method. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_of_expr): Set m_undefined_path when appropriate. (path_range_query::internal_range_of_expr): Copy from range_of_expr. (path_range_query::unreachable_path_p): New. (path_range_query::precompute_ranges): Set m_undefined_path.
* gimple-range-path.h (path_range_query::unreachable_path_p): New. (path_range_query::internal_range_of_expr): New. * tree-ssa-threadbackward.c (back_threader::find_taken_edge_cond): Use unreachable_path_p. --- gcc/tree-ssa-threadbackward.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 6827d00..449232c 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -213,20 +213,14 @@ edge back_threader::find_taken_edge_cond (const vec &path, gcond *cond) { - m_solver.precompute_ranges (path, m_imports); - - // Check if either operand is unreachable since this knowledge could - // help the caller cut down the search space. int_range_max r; - m_solver.range_of_expr (r, gimple_cond_lhs (cond)); - if (r.undefined_p ()) - return UNREACHABLE_EDGE; - m_solver.range_of_expr (r, gimple_cond_rhs (cond)); - if (r.undefined_p ()) - return UNREACHABLE_EDGE; + m_solver.precompute_ranges (path, m_imports); m_solver.range_of_stmt (r, cond); + if (m_solver.unreachable_path_p ()) + return UNREACHABLE_EDGE; + int_range<2> true_range (boolean_true_node, boolean_true_node); int_range<2> false_range (boolean_false_node, boolean_false_node); -- cgit v1.1 From 01b5038718056b024b370b74a874fbd92c5bbab3 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 9 Sep 2021 20:30:28 +0200 Subject: Disable threading through latches until after loop optimizations. The motivation for this patch was enabling the use of global ranges in the path solver, but this destroyed certain properties of loops, which made subsequent loop optimizations fail. Consequently, this patch's main goal is to disable jump threading involving the latch until after loop optimizations have run. As can be seen in the test adjustments, we mostly shift the threading from the early threaders (ethread, thread[12]) to the late threaders (thread[34]).
I have nuked some of the early notes in the testcases that came as part of the jump threader rewrite. They're mostly noise now. Note that we could probably relax some other restrictions in profitable_path_p when loop optimizations have completed, but it would require more testing, and I'm hesitant to touch more things than needed at this point. I have added a reminder to the function to keep this in mind. Finally, perhaps as a follow-up, we should apply the same restrictions to the forward threader. At some point I'd like to combine the cost models. Tested on x86-64 Linux. p.s. There is a thorough discussion involving the limitations of jump threading involving loops here: https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html gcc/ChangeLog: * tree-pass.h (PROP_loop_opts_done): New. * gimple-range-path.cc (path_range_query::internal_range_of_expr): Intersect with global range. * tree-ssa-loop.c (tree_ssa_loop_done): Set PROP_loop_opts_done. * tree-ssa-threadbackward.c (back_threader_profitability::profitable_path_p): Disable threading through latches until after loop optimizations have run. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Adjust for disabling of threading through latches. * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. Co-authored-by: Michael Matz --- gcc/tree-ssa-threadbackward.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 449232c..e729923 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "ssa.h" #include "tree-cfgcleanup.h" #include "tree-pretty-print.h" +#include "cfghooks.h" // Path registry for the backwards threader. 
After all paths have been // registered with register_path(), thread_through_all_blocks() is called @@ -564,7 +565,10 @@ back_threader_registry::thread_through_all_blocks (bool may_peel_loop_headers) TAKEN_EDGE, otherwise it is NULL. CREATES_IRREDUCIBLE_LOOP, if non-null is set to TRUE if threading this path - would create an irreducible loop. */ + would create an irreducible loop. + + ?? It seems we should be able to loosen some of the restrictions in + this function after loop optimizations have run. */ bool back_threader_profitability::profitable_path_p (const vec &m_path, @@ -725,7 +729,11 @@ back_threader_profitability::profitable_path_p (const vec &m_path, the last entry in the array when determining if we thread through the loop latch. */ if (loop->latch == bb) - threaded_through_latch = true; + { + threaded_through_latch = true; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " (latch)"); + } } gimple *stmt = get_gimple_control_stmt (m_path[0]); @@ -845,6 +853,22 @@ back_threader_profitability::profitable_path_p (const vec &m_path, "a multiway branch.\n"); return false; } + + /* Threading through an empty latch would cause code to be added to + the latch. This could alter the loop form sufficiently to cause + loop optimizations to fail. Disable these threads until after + loop optimizations have run. */ + if ((threaded_through_latch + || (taken_edge && taken_edge->dest == loop->latch)) + && !(cfun->curr_properties & PROP_loop_opts_done) + && empty_block_p (loop->latch)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + " FAIL: FSM Thread through latch before loop opts would create non-empty latch\n"); + return false; + + } return true; } -- cgit v1.1 From 5485bbebb3679245dd4bc7c149bbc940f8b2e632 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Sat, 11 Sep 2021 09:37:39 +0200 Subject: Refactor jump_thread_path_registry. 
In an attempt to refactor thread_through_all_blocks(), I've realized that there is a mess of code dealing with coexisting forward and backward thread types. However, this is an impossible scenario, as the registry contains either forward/old-style threads, or backward threads (EDGE_FSM_THREADs), never both. The fact that both types of threads cannot coexist simplifies the code considerably. For that matter, it splits things up nicely because there are some common bits that can go into a base class, and some differing code that can go into derived classes. Dividing things in this way makes it very obvious which parts belong in the old-style copier and which parts belong to the generic copier. Doing all this provided some nice cleanups, as well as fixing a latent bug in adjust_paths_after_duplication. The diff is somewhat hard to read, so perhaps looking at the final output would be easier. A general overview of what this patch achieves can be seen by just looking at this simplified class layout: // Abstract class for the jump thread registry. class jt_path_registry { public: jt_path_registry (); virtual ~jt_path_registry (); bool register_jump_thread (vec *); bool thread_through_all_blocks (bool peel_loop_headers); jump_thread_edge *allocate_thread_edge (edge e, jump_thread_edge_type t); vec *allocate_thread_path (); protected: vec *> m_paths; unsigned long m_num_threaded_edges; private: virtual bool update_cfg (bool peel_loop_headers) = 0; }; // Forward threader path registry using a custom BB copier.
class fwd_jt_path_registry : public jt_path_registry { public: fwd_jt_path_registry (); ~fwd_jt_path_registry (); void remove_jump_threads_including (edge); private: bool update_cfg (bool peel_loop_headers) override; void mark_threaded_blocks (bitmap threaded_blocks); bool thread_block_1 (basic_block, bool noloop_only, bool joiners); bool thread_block (basic_block, bool noloop_only); bool thread_through_loop_header (class loop *loop, bool may_peel_loop_headers); class redirection_data *lookup_redirection_data (edge e, enum insert_option); hash_table *m_removed_edges; hash_table *m_redirection_data; }; // Backward threader path registry using a generic BB copier. class back_jt_path_registry : public jt_path_registry { private: bool update_cfg (bool peel_loop_headers) override; void adjust_paths_after_duplication (unsigned curr_path_num); bool duplicate_thread_path (edge entry, edge exit, basic_block *region, unsigned n_region, unsigned current_path_no); bool rewire_first_differing_edge (unsigned path_num, unsigned edge_num); }; That is, the forward and backward bits have been completely split, while deriving from a base class for the common functionality. Most everything is mechanical, but there are a few gotchas: a) back_jt_path_registry::update_cfg(), which contains the backward threading specific bits, is rather simple, since most of the code in the original thread_through_all_blocks() only applied to the forward threader: removed edges, mark_threaded_blocks, thread_through_loop_header, the copy tables (*). (*) The back threader has its own copy tables in duplicate_thread_path. b) In some cases, adjust_paths_after_duplication() was commoning out so many blocks that it was removing the initial EDGE_FSM_THREAD marker. I've fixed this. 
c) AFAICT, when run from the forward threader, thread_through_all_blocks() attempts to remove threads starting with an edge already seen, but it would never see anything because the loop doing the checking only has a visited_starting_edges.contains(), and no corresponding visited_starting_edges.add(). The add() method in thread_through_all_blocks belongs to the backward threading bits, and as I've explained, both types cannot coexist. I've removed the checks in the forward bits since they don't appear to do anything. If this was an oversight, and we want to avoid threading already seen edges in the forward threader, I can move this functionality to the base class. Ultimately I would like to move all the registry code to tree-ssa-threadregistry.*. I've avoided this in this patch to aid in review. My apologies for this longass explanation, but I want to make sure we're covering all of our bases. Tested on x86-64 Linux by a very tedious process of moving chunks around, running "make check-gcc RUNTESTFLAGS=tree-ssa.exp", and repeating ad-nauseum. And of course, by running a full bootstrap and tests. OK? p.s. In a follow-up patch I will rename the confusing EDGE_FSM_THREAD type. gcc/ChangeLog: * tree-ssa-threadbackward.c (class back_threader_registry): Use back_jt_path_registry. * tree-ssa-threadedge.c (jump_threader::jump_threader): Use fwd_jt_path_registry. * tree-ssa-threadedge.h (class jump_threader): Same.. * tree-ssa-threadupdate.c (jump_thread_path_registry::jump_thread_path_registry): Rename... (jt_path_registry::jt_path_registry): ...to this. (jump_thread_path_registry::~jump_thread_path_registry): Rename... (jt_path_registry::~jt_path_registry): ...this. (fwd_jt_path_registry::fwd_jt_path_registry): New. (fwd_jt_path_registry::~fwd_jt_path_registry): New. (jump_thread_path_registry::allocate_thread_edge): Rename... (jt_path_registry::allocate_thread_edge): ...to this. (jump_thread_path_registry::allocate_thread_path): Rename... 
(jt_path_registry::allocate_thread_path): ...to this. (jump_thread_path_registry::lookup_redirection_data): Rename... (fwd_jt_path_registry::lookup_redirection_data): ...to this. (jump_thread_path_registry::thread_block_1): Rename... (fwd_jt_path_registry::thread_block_1): ...to this. (jump_thread_path_registry::thread_block): Rename... (fwd_jt_path_registry::thread_block): ...to this. (jt_path_registry::thread_through_loop_header): Rename... (fwd_jt_path_registry::thread_through_loop_header): ...to this. (jump_thread_path_registry::mark_threaded_blocks): Rename... (fwd_jt_path_registry::mark_threaded_blocks): ...to this. (jump_thread_path_registry::debug_path): Rename... (jt_path_registry::debug_path): ...to this. (jump_thread_path_registry::dump): Rename... (jt_path_registry::debug): ...to this. (jump_thread_path_registry::rewire_first_differing_edge): Rename... (back_jt_path_registry::rewire_first_differing_edge): ...to this. (jump_thread_path_registry::adjust_paths_after_duplication): Rename... (back_jt_path_registry::adjust_paths_after_duplication): ...to this. (jump_thread_path_registry::duplicate_thread_path): Rename... (back_jt_path_registry::duplicate_thread_path): ...to this. Also, drop ill-formed candidates. (jump_thread_path_registry::remove_jump_threads_including): Rename... (fwd_jt_path_registry::remove_jump_threads_including): ...to this. (jt_path_registry::thread_through_all_blocks): New. (back_jt_path_registry::update_cfg): New. (fwd_jt_path_registry::update_cfg): New. (jump_thread_path_registry::register_jump_thread): Rename... (jt_path_registry::register_jump_thread): ...to this. * tree-ssa-threadupdate.h (class jump_thread_path_registry): Abstract to... (class jt_path_registry): ...here. (class fwd_jt_path_registry): New. (class back_jt_path_registry): New. 
--- gcc/tree-ssa-threadbackward.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'gcc/tree-ssa-threadbackward.c') diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index e729923..7ff5cec 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -56,7 +56,7 @@ public: bool register_path (const vec &, edge taken); bool thread_through_all_blocks (bool may_peel_loop_headers); private: - jump_thread_path_registry m_lowlevel_registry; + back_jt_path_registry m_lowlevel_registry; const int m_max_allowable_paths; int m_threaded_paths; }; -- cgit v1.1 From c7a669af0aeb639eb78f1614cbecb72a98d81ce8 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Sat, 11 Sep 2021 17:33:25 +0200 Subject: Remove references to FSM threads. Now that the jump thread back registry has been split into the generic copier and the custom (old) copier, it becomes trivial to remove the FSM bits from the jump threaders. First, there's no need for an EDGE_FSM_THREAD type. The only reason we were looking at the threading type was to determine what type of copier to use, and now that the copier has been split, there's no need to even look. However, there is one check in register_jump_thread where we verify that only the generic copier can thread through back-edges. I've removed that check in favor of a flag passed to the constructor. I've also removed all the FSM references from the code and tests. Interestingly, some tests weren't even testing the right thing. They were testing for "FSM" which would catch jump thread paths as well as the backward threader *failing* on registering a path. *big eye roll* The only remaining code that was actually checking for EDGE_FSM_THREAD was adjust_paths_after_duplication, and the checks could be written without looking at the edge type at all. For the record, the code there is horrible: it's convoluted, hard to read, and doesn't have any tests. I'd smack myself if I could go back in time. 
All that remains are the FSM references in the --param's themselves.  I
think we should s/fsm/threader/, since I envision a day when we can
share the cost basis code between the threaders.  However, I don't know
what the proper procedure is for renaming existing compiler options.

By the way, param_fsm_maximum_phi_arguments is no longer relevant after
the rewrite.  We can nuke that one right away.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c
	(back_threader_profitability::profitable_path_p): Remove FSM
	references.
	(back_threader_registry::register_path): Same.
	* tree-ssa-threadedge.c
	(jump_threader::simplify_control_stmt_condition): Same.
	* tree-ssa-threadupdate.c (jt_path_registry::jt_path_registry):
	Add backedge_threads argument.
	(fwd_jt_path_registry::fwd_jt_path_registry): Pass
	backedge_threads argument.
	(back_jt_path_registry::back_jt_path_registry): Same.
	(dump_jump_thread_path): Adjust for FSM removal.
	(back_jt_path_registry::rewire_first_differing_edge): Same.
	(back_jt_path_registry::adjust_paths_after_duplication): Same.
	(back_jt_path_registry::update_cfg): Same.
	(jt_path_registry::register_jump_thread): Same.
	* tree-ssa-threadupdate.h (enum jump_thread_edge_type): Remove
	EDGE_FSM_THREAD.
	(class back_jt_path_registry): Add backedge_threads to
	constructor.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr21417.c: Adjust for FSM removal.
	* gcc.dg/tree-ssa/pr66752-3.c: Same.
	* gcc.dg/tree-ssa/pr68198.c: Same.
	* gcc.dg/tree-ssa/pr69196-1.c: Same.
	* gcc.dg/tree-ssa/pr70232.c: Same.
	* gcc.dg/tree-ssa/pr77445.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-13.c: Same.
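The constructor-flag replacement for the back-edge check mentioned above might look roughly like this. Purely illustrative: the real registries live in tree-ssa-threadupdate.*, and the path_has_backedge parameter and bool return here are stand-ins for the actual path inspection and registration logic.

```cpp
#include <cassert>

// Hypothetical sketch of the removed check: instead of testing the
// edge type (EDGE_FSM_THREAD) inside register_jump_thread, each
// registry is constructed with a backedge_threads flag saying whether
// its copier can handle paths that thread through back-edges.
class jt_path_registry
{
public:
  explicit jt_path_registry (bool backedge_threads)
    : m_backedge_threads (backedge_threads) {}

  bool register_jump_thread (bool path_has_backedge)
  {
    // Only the generic (backward) copier may thread through back-edges.
    if (path_has_backedge && !m_backedge_threads)
      return false;
    return true;
  }

private:
  const bool m_backedge_threads;
};

// The forward threader's custom copier cannot handle back-edges...
struct fwd_jt_path_registry : jt_path_registry
{
  fwd_jt_path_registry () : jt_path_registry (false) {}
};

// ...while the backward threader's generic copier can.
struct back_jt_path_registry : jt_path_registry
{
  back_jt_path_registry () : jt_path_registry (true) {}
};
```

With the capability encoded at construction time, registration no longer needs to inspect a per-edge type at all, which is what makes EDGE_FSM_THREAD removable.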
---
 gcc/tree-ssa-threadbackward.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

(limited to 'gcc/tree-ssa-threadbackward.c')

diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 7ff5cec..805b7ac 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -593,7 +593,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
   if (m_path.length () > (unsigned) param_max_fsm_thread_length)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
		 "the number of basic blocks on the path "
		 "exceeds PARAM_MAX_FSM_THREAD_LENGTH.\n");
       return false;
@@ -768,7 +768,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
   if (path_crosses_loops)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
		 "the path crosses loops.\n");
       return false;
     }
@@ -784,7 +784,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
   if (n_insns >= param_max_fsm_thread_path_insns)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
		 "the number of instructions on the path "
		 "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
       return false;
@@ -793,7 +793,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
   else if (!m_speed_p && n_insns > 1)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
		 "duplication of %i insns is needed and optimizing for size.\n",
		 n_insns);
       return false;
@@ -818,25 +818,22 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
-		 "  FAIL: FSM would create irreducible loop without threading "
+		 "  FAIL: Would create irreducible loop without threading "
		 "multiway branch.\n");
       return false;
     }

-  /* If this path does not thread through the loop latch, then we are
-     using the FSM threader to find old style jump threads.  This
-     is good, except the FSM threader does not re-use an existing
-     threading path to reduce code duplication.
-
-     So for that case, drastically reduce the number of statements
-     we are allowed to copy.  */
+  /* The generic copier used by the backthreader does not re-use an
+     existing threading path to reduce code duplication.  So for that
+     case, drastically reduce the number of statements we are allowed
+     to copy.  */
   if (!(threaded_through_latch && threaded_multiway_branch)
       && (n_insns * param_fsm_scale_path_stmts
	  >= param_max_jump_thread_duplication_stmts))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
-		 "  FAIL: FSM did not thread around loop and would copy too "
+		 "  FAIL: Did not thread around loop and would copy too "
		 "many statements.\n");
       return false;
     }
@@ -849,7 +846,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
-		 "  FAIL: FSM Thread through multiway branch without threading "
+		 "  FAIL: Thread through multiway branch without threading "
		 "a multiway branch.\n");
       return false;
     }
@@ -865,7 +862,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
-		 "  FAIL: FSM Thread through latch before loop opts would create non-empty latch\n");
+		 "  FAIL: Thread through latch before loop opts would create non-empty latch\n");
       return false;
     }

@@ -887,8 +884,8 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
   if (m_threaded_paths > m_max_allowable_paths)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
-		 "the number of previously recorded FSM paths to "
+	fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
+		 "the number of previously recorded paths to "
		 "thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n");
       return false;
     }
@@ -896,7 +893,8 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
   vec<jump_thread_edge *> *jump_thread_path
     = m_lowlevel_registry.allocate_thread_path ();

-  /* Record the edges between the blocks in PATH.  */
+  // The generic copier ignores the edge type.  We can build the
+  // thread edges with any type.
   for (unsigned int j = 0; j + 1 < m_path.length (); j++)
     {
       basic_block bb1 = m_path[m_path.length () - j - 1];
@@ -905,11 +903,10 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
       edge e = find_edge (bb1, bb2);
       gcc_assert (e);
       jump_thread_edge *x
-	= m_lowlevel_registry.allocate_thread_edge (e, EDGE_FSM_THREAD);
+	= m_lowlevel_registry.allocate_thread_edge (e, EDGE_COPY_SRC_BLOCK);
       jump_thread_path->safe_push (x);
     }

-  /* Add the edge taken when the control variable has value ARG.  */
   jump_thread_edge *x
     = m_lowlevel_registry.allocate_thread_edge (taken_edge,
						 EDGE_NO_COPY_SRC_BLOCK);
--
cgit v1.1
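The register_path loop in the final hunk can be modeled as follows. A hypothetical sketch under the commit's stated assumption that the generic copier ignores the edge type: every intra-path edge is registered as EDGE_COPY_SRC_BLOCK and the final taken edge as EDGE_NO_COPY_SRC_BLOCK. The int edge ids and build_thread_path helper are stand-ins for GCC's edge pointers and registry calls, not real GCC API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum jump_thread_edge_type
{
  EDGE_COPY_SRC_BLOCK,
  EDGE_NO_COPY_SRC_BLOCK
};

// (edge id, type) pairs stand in for GCC's jump_thread_edge objects.
typedef std::pair<int, jump_thread_edge_type> jump_thread_edge;

// Model of the loop above: the edges between the blocks on the path
// all get EDGE_COPY_SRC_BLOCK (any copying type would do, since the
// generic copier ignores it), and the taken edge terminates the path
// as EDGE_NO_COPY_SRC_BLOCK.
static std::vector<jump_thread_edge>
build_thread_path (const std::vector<int> &path_edges, int taken_edge)
{
  std::vector<jump_thread_edge> path;
  for (int e : path_edges)
    path.push_back (std::make_pair (e, EDGE_COPY_SRC_BLOCK));
  path.push_back (std::make_pair (taken_edge, EDGE_NO_COPY_SRC_BLOCK));
  return path;
}
```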