From d2423144eb36a68fd0da9224857ce807714874a7 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek
Date: Thu, 2 Feb 2023 10:54:54 +0100
Subject: Replace IFN_TRAP with BUILT_IN_UNREACHABLE_TRAP [PR107300]

For PR106099 I've added IFN_TRAP as an alternative to __builtin_trap
meant for __builtin_unreachable purposes (e.g. with -funreachable-traps
or some sanitizers) which doesn't need vops because
__builtin_unreachable doesn't need them either.
This works in various cases, but unfortunately IPA likes to decide on
the redirection to unreachable just by tweaking the cgraph edge
to point to a different FUNCTION_DECL.  As internal functions don't
have a decl, this causes problems like in the following testcase.

The following patch fixes it by removing IFN_TRAP again and replacing
it with user inaccessible BUILT_IN_UNREACHABLE_TRAP, so that e.g.
builtin_decl_unreachable can return it directly and we don't need to
tweak it later in wherever we actually replace the call stmt.

2023-02-02  Jakub Jelinek

    PR ipa/107300
    * builtins.def (BUILT_IN_UNREACHABLE_TRAP): New builtin.
    * internal-fn.def (TRAP): Remove.
    * internal-fn.cc (expand_TRAP): Remove.
    * tree.cc (build_common_builtin_nodes): Define
    BUILT_IN_UNREACHABLE_TRAP if not yet defined.
    (builtin_decl_unreachable): Use BUILT_IN_UNREACHABLE_TRAP
    instead of BUILT_IN_TRAP.
    * gimple.cc (gimple_build_builtin_unreachable): Remove
    emitting internal function for BUILT_IN_TRAP.
    * asan.cc (maybe_instrument_call): Handle BUILT_IN_UNREACHABLE_TRAP.
    * cgraph.cc (cgraph_edge::verify_corresponds_to_fndecl): Handle
    BUILT_IN_UNREACHABLE_TRAP instead of BUILT_IN_TRAP.
    * ipa-devirt.cc (possible_polymorphic_call_target_p): Handle
    BUILT_IN_UNREACHABLE_TRAP.
    * builtins.cc (expand_builtin, is_inexpensive_builtin): Likewise.
    * tree-cfg.cc (verify_gimple_call,
    pass_warn_function_return::execute): Likewise.
    * attribs.cc (decl_attributes): Don't report exclusions on
    BUILT_IN_UNREACHABLE_TRAP either.

    * gcc.dg/pr107300.c: New test.
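
The redirection problem described in the message above can be illustrated with a small model. This is only a sketch, not GCC code: `FunctionDecl`, `CallEdge`, `BuiltinCode` and `redirected_to_unreachable` are hypothetical stand-ins chosen for illustration. It assumes, as the message states, that an edge is redirected simply by pointing it at a different decl, so a target without a decl (an internal function) cannot be expressed, while a builtin with its own decl can.

```cpp
#include <string>

// Hypothetical, simplified stand-ins for GCC's builtin codes and decls.
enum class BuiltinCode { None, Unreachable, UnreachableTrap, Trap };

struct FunctionDecl {
  std::string name;
  BuiltinCode code = BuiltinCode::None;
};

// Hypothetical call graph edge: the callee is identified only by its decl,
// so anything an edge points at must have one.
struct CallEdge {
  const FunctionDecl *callee;

  // IPA-style redirection in this model: just repoint the edge.
  void redirect(const FunctionDecl &target) { callee = &target; }
};

// Same shape as the condition in the patched check: an edge redirected to
// the unreachable builtin or to its trapping variant is accepted, while a
// plain trap builtin is no longer treated as such a redirection.
bool redirected_to_unreachable(const CallEdge &e) {
  return e.callee->code == BuiltinCode::Unreachable
         || e.callee->code == BuiltinCode::UnreachableTrap;
}
```

In this model a dedicated decl for the trapping variant lets the edge be repointed directly, which is the property the commit message relies on.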
---
 gcc/cgraph.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'gcc/cgraph.cc')

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 06bc980..f0d06bf 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -3248,11 +3248,11 @@ cgraph_edge::verify_corresponds_to_fndecl (tree decl)
 
   node = node->ultimate_alias_target ();
 
   /* Optimizers can redirect unreachable calls or calls triggering undefined
-     behavior to __builtin_unreachable or __builtin_trap.  */
+     behavior to __builtin_unreachable or __builtin_unreachable trap.  */
   if (fndecl_built_in_p (callee->decl, BUILT_IN_NORMAL)
       && (DECL_FUNCTION_CODE (callee->decl) == BUILT_IN_UNREACHABLE
-          || DECL_FUNCTION_CODE (callee->decl) == BUILT_IN_TRAP))
+          || DECL_FUNCTION_CODE (callee->decl) == BUILT_IN_UNREACHABLE_TRAP))
     return false;
 
   if (callee->former_clone_of != node->decl
-- 
cgit v1.1

From cad2412cc84518195fceb2db31e82e6df7e5a2c2 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek
Date: Tue, 7 Feb 2023 10:33:54 +0100
Subject: cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433]

The following testcase ICEs, because we determine only in late pure const
pass that bar is const (the content of the function loses a store to a
global var during dse3 and read from it during cddce2) and
local-pure-const2 makes it const.  The cgraph ordering is that post IPA
(in late IPA simd clones are created) bar is processed first, then foo
as its caller, then foo.simdclone* and finally bar.simdclone*.
Conceptually I think that is the right ordering which allows for static
simd clones to be removed.

The reason for the ICE is that because bar was marked const, the call to
it lost vops before vectorization, and when we in foo.simdclone* try to
vectorize the call to bar, we replace it with bar.simdclone* which hasn't
been marked const and so needs vops, which we don't add.

Now, because the simd clones are created from the same IL, just in a loop
with different argument/return value passing, I think generally if the
base function is determined to be const or pure, the simd clones should be
too, unless e.g. the vectorization causes different optimization decisions,
but then still the global memory reads if any shouldn't affect what the
function does and global memory stores shouldn't be reachable at runtime.

So, the following patch changes set_{const,pure}_flag to mark also simd
clones.

2023-02-07  Jakub Jelinek

    PR tree-optimization/106433
    * cgraph.cc (set_const_flag_1): Recurse on simd clones too.
    (cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.

    * gcc.c-torture/compile/pr106433.c: New test.
---
 gcc/cgraph.cc | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'gcc/cgraph.cc')

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index f0d06bf..f352212 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, bool set_const, bool looping,
       if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE)
         set_const_flag_1 (alias, set_const, looping, changed);
     }
+  for (struct cgraph_node *n = node->simd_clones; n != NULL;
+       n = n->simdclone->next_clone)
+    set_const_flag_1 (n, set_const, looping, changed);
   for (cgraph_edge *e = node->callers; e; e = e->next_caller)
     if (e->caller->thunk
         && (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE))
@@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, bool looping)
 {
   struct set_pure_flag_info info = {pure, looping, false};
   call_for_symbol_thunks_and_aliases (set_pure_flag_1, &info, !pure, true);
+  for (struct cgraph_node *n = simd_clones; n != NULL;
+       n = n->simdclone->next_clone)
+    set_pure_flag_1 (n, &info);
   return info.changed;
 }
 
-- 
cgit v1.1
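
The propagation added by the second patch can be sketched outside of GCC as a walk over a singly linked clone list. This is a minimal model under stated assumptions, not the real implementation: `Node` and `set_const_flag` are hypothetical names, the `next_clone` field stands in for `simdclone->next_clone`, and aliases, thunks and availability checks are left out entirely.

```cpp
// Hypothetical, simplified stand-in for cgraph_node.
struct Node {
  bool is_const = false;
  bool looping = false;
  Node *simd_clones = nullptr;  // head of this node's simd clone list
  Node *next_clone = nullptr;   // models simdclone->next_clone
};

// Sets the const flag on NODE and on every simd clone hanging off it.
// Returns true if any node's flags actually changed.
bool set_const_flag(Node *node, bool set_const, bool looping) {
  bool changed = false;
  if (node->is_const != set_const || node->looping != looping) {
    node->is_const = set_const;
    node->looping = looping;
    changed = true;
  }
  // Same loop shape as the one the patch adds: visit each clone in the list.
  for (Node *n = node->simd_clones; n != nullptr; n = n->next_clone)
    changed |= set_const_flag(n, set_const, looping);
  return changed;
}
```

With a base node that has two clones, a single call marks all three, so in this model a caller that drops vops for the base function sees clones carrying the same flag.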