path: root/gcc
Age | Commit message | Author | Files | Lines
2022-06-28 | openmp: Add warning when functions containing metadirectives with 'construct={target}' called directly | Kwok Cheung Yeung | 8 | -2/+77

    void f(void)
    {
    #pragma omp metadirective \
      when (construct={target}: A) \
      default (B)
      ...
    }
    ...
    {
    #pragma omp target
      f(); // Target call

      f(); // Local call
    }

With the OpenMP 5.0/5.1 specifications, we would expect A to be selected in the metadirective when the target call is made, but B when f is called directly outside of a target context. However, since GCC does not have separate copies of f for local and target calls, and the construct selector is static, it must be resolved one way or the other at compile-time (currently in favour of selecting A), which may be unexpected behaviour.

This patch attempts to detect the above situation, and will emit a warning if found.

2022-01-28 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * gimplify.cc (gimplify_omp_metadirective): Mark offloadable functions containing metadirectives with 'construct={target}' in the selector. * omp-general.cc (omp_has_target_constructor_p): New. * omp-general.h (omp_has_target_constructor_p): New prototype. * omp-low.cc (lower_omp_1): Emit warning if marked functions called outside of a target context. gcc/testsuite/ * c-c++-common/gomp/metadirective-4.c (main): Add expected warning. * gfortran.dg/gomp/metadirective-4.f90 (test): Likewise. libgomp/ * testsuite/libgomp.c-c++-common/metadirective-2.c (main): Add expected warning. * testsuite/libgomp.fortran/metadirective-2.f90 (test): Likewise.
2022-06-28 | openmp: Add support for 'target_device' context selector set | Kwok Cheung Yeung | 14 | -16/+260
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type. * omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin. * omp-general.cc (omp_context_selector_matches): Handle 'target_device' selector set. (omp_dynamic_cond): Generate expression tree for 'target_device' selector set. (omp_context_compute_score): Handle selectors in 'target_device' set. gcc/c/ * c-parser.cc (omp_target_device_selectors): New. (c_parser_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression. (c_parser_omp_context_selector_specification): Handle 'target_device' selector set. gcc/cp/ * parser.cc (omp_target_device_selectors): New. (cp_parser_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression. (cp_parser_omp_context_selector_specification): Handle 'target_device' selector set. gcc/fortran/ * openmp.cc (omp_target_device_selectors): New. (gfc_match_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression. (gfc_match_omp_context_selector_specification): Handle 'target_device' selector set. * types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type. gcc/testsuite/ * c-c++-common/gomp/metadirective-7.c: New. * gfortran.dg/gomp/metadirective-7.f90: New. libgomp/ * Makefile.am (libgomp_la_SOURCES): Add selector.c. * Makefile.am: Regenerate. * config/gcn/selector.c: New. * config/linux/selector.c: New. * config/linux/x86/selector.c: New. * config/nvptx/selector.c: New. * libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New. * libgomp.h (struct gomp_device_descr): Add evaluate_device_func field. * libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device. * libgomp_g.h (GOMP_evaluate_current_device): New. (GOMP_evaluate_target_device): New. * oacc-host.c (host_evaluate_device): New. 
(host_openacc_exec): Initialize evaluate_device_func field to host_evaluate_device. * plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New. * plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and compute_minor fields. (nvptx_open_device): Read compute capability information from device. (CHECK_ISA): New macro. (GOMP_OFFLOAD_evaluate_device): New. * selector.c: New. * target.c (GOMP_evaluate_target_device): New. (gomp_load_plugin_for_device): Load evaluate_device plugin function. * testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase. * testsuite/libgomp.fortran/metadirective-5.f90: New testcase.
2022-06-28 | openmp: Metadirective fixes | Kwok Cheung Yeung | 6 | -19/+34
Fix regressions introduced by block/statement skipping. If user condition selector is constant, do not return it as a dynamic selector. 2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/c/ * c-parser.cc (c_parser_skip_to_end_of_block_or_statement): Track bracket depth separately from nesting depth. gcc/cp/ * parser.cc (cp_parser_skip_to_end_of_statement): Revert. (cp_parser_skip_to_end_of_block_or_statement): Track bracket depth separately from nesting depth. gcc/ * omp-general.cc (omp_dynamic_cond): Do not return user condition if constant.
2022-06-28 | openmp: Add testcases for metadirectives | Kwok Cheung Yeung | 14 | -0/+494
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/gomp/metadirective-1.c: New. * c-c++-common/gomp/metadirective-2.c: New. * c-c++-common/gomp/metadirective-3.c: New. * c-c++-common/gomp/metadirective-4.c: New. * c-c++-common/gomp/metadirective-5.c: New. * c-c++-common/gomp/metadirective-6.c: New. * gcc.dg/gomp/metadirective-1.c: New. * gfortran.dg/gomp/metadirective-1.f90: New. * gfortran.dg/gomp/metadirective-2.f90: New. * gfortran.dg/gomp/metadirective-3.f90: New. * gfortran.dg/gomp/metadirective-4.f90: New. * gfortran.dg/gomp/metadirective-5.f90: New. * gfortran.dg/gomp/metadirective-6.f90: New. libgomp/ * testsuite/libgomp.c-c++-common/metadirective-1.c: New. * testsuite/libgomp.c-c++-common/metadirective-2.c: New. * testsuite/libgomp.c-c++-common/metadirective-3.c: New. * testsuite/libgomp.c-c++-common/metadirective-4.c: New. * testsuite/libgomp.fortran/metadirective-1.f90: New. * testsuite/libgomp.fortran/metadirective-2.f90: New. * testsuite/libgomp.fortran/metadirective-3.f90: New. * testsuite/libgomp.fortran/metadirective-4.f90: New.
2022-06-28 | openmp, fortran: Add Fortran support for parsing metadirectives | Kwok Cheung Yeung | 18 | -318/+809
This adds support for parsing OpenMP metadirectives in the Fortran front end. 2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * omp-general.cc (omp_check_context_selector): Revert string length check. (omp_context_name_list_prop): Likewise. gcc/fortran/ * decl.cc (gfc_match_end): Handle COMP_OMP_METADIRECTIVE and COMP_OMP_BEGIN_METADIRECTIVE. * dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_METADIRECTIVE. (show_code_node): Handle EXEC_OMP_METADIRECTIVE. * gfortran.h (enum gfc_statement): Add ST_OMP_METADIRECTIVE, ST_OMP_BEGIN_METADIRECTIVE and ST_OMP_END_METADIRECTIVE. (struct gfc_omp_metadirective_clause): New structure. (gfc_get_omp_metadirective_clause): New macro. (struct gfc_st_label): Add omp_region field. (enum gfc_exec_op): Add EXEC_OMP_METADIRECTIVE. (struct gfc_code): Add omp_metadirective_clauses field. (gfc_free_omp_metadirective_clauses): New prototype. (match_omp_directive): New prototype. * io.cc (format_asterisk): Initialize omp_region field. * match.h (gfc_match_omp_begin_metadirective): New prototype. (gfc_match_omp_metadirective): New prototype. * openmp.cc (gfc_match_omp_eos): Match ')' in context selectors. (gfc_free_omp_metadirective_clauses): New. (gfc_match_omp_clauses): Remove context_selector argument. Rely on gfc_match_omp_eos to match end of clauses. (match_omp): Remove extra argument to gfc_match_omp_clauses. (gfc_match_omp_context_selector): Remove extra argument to gfc_match_omp_clauses. Set gfc_matching_omp_context_selector before call to gfc_match_omp_clauses and reset after. (gfc_match_omp_context_selector_specification): Modify to take a gfc_omp_set_selector** argument. (gfc_match_omp_declare_variant): Pass set_selectors to gfc_match_omp_context_selector_specification. (match_omp_metadirective): New. (gfc_match_omp_begin_metadirective): New. (gfc_match_omp_metadirective): New. (resolve_omp_metadirective): New. (gfc_resolve_omp_directive): Handle EXEC_OMP_METADIRECTIVE. 
* parse.cc (gfc_matching_omp_context_selector): New variable. (gfc_in_metadirective_body): New variable. (gfc_omp_region_count): New variable. (decode_omp_directive): Match 'begin metadirective', 'end metadirective' and 'metadirective'. (match_omp_directive): New. (case_omp_structured_block): New. (case_omp_do): New. (gfc_ascii_statement): Handle metadirective statements. (gfc_omp_end_stmt): New. (parse_omp_do): Delegate to gfc_omp_end_stmt. (parse_omp_structured_block): Delegate to gfc_omp_end_stmt. Handle ST_OMP_END_METADIRECTIVE. (parse_omp_metadirective_body): New. (parse_executable): Delegate to case_omp_structured_block and case_omp_do. Return after one statement if compiling regular metadirective. Handle metadirective statements. (gfc_parse_file): Reset gfc_omp_region_count, gfc_in_metadirective_body and gfc_matching_omp_context_selector. * parse.h (enum gfc_compile_state): Add COMP_OMP_METADIRECTIVE and COMP_OMP_BEGIN_METADIRECTIVE. (gfc_omp_end_stmt): New prototype. (gfc_matching_omp_context_selector): New declaration. (gfc_in_metadirective_body): New declaration. (gfc_omp_region_count): New declaration. * resolve.cc (gfc_resolve_code): Handle EXEC_OMP_METADIRECTIVE. * st.cc (gfc_free_statement): Handle EXEC_OMP_METADIRECTIVE. * symbol.cc (compare_st_labels): Take omp_region into account. (gfc_get_st_labels): Incorporate omp_region into label. * trans-decl.cc (gfc_get_label_decl): Add omp_region into translated label name. * trans-openmp.cc (gfc_trans_omp_directive): Handle EXEC_OMP_METADIRECTIVE. (gfc_trans_omp_set_selector): Hoist code from... (gfc_trans_omp_declare_variant): ...here. (gfc_trans_omp_metadirective): New. * trans-stmt.h (gfc_trans_omp_metadirective): New prototype. * trans.cc (trans_code): Handle EXEC_OMP_METADIRECTIVE.
2022-06-28 | openmp: Add C++ support for parsing metadirectives | Kwok Cheung Yeung | 2 | -8/+430
This adds support for parsing OpenMP metadirectives in the C++ front end. 2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/cp/ * parser.cc (cp_parser_skip_to_end_of_statement): Handle parentheses. (cp_parser_skip_to_end_of_block_or_statement): Likewise. (cp_parser_omp_context_selector): Add extra argument. Allow non-constant expressions. (cp_parser_omp_context_selector_specification): Add extra argument and propagate to cp_parser_omp_context_selector. (analyze_metadirective_body): New. (cp_parser_omp_metadirective): New. (cp_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE. (cp_parser_pragma): Handle PRAGMA_OMP_METADIRECTIVE.
2022-06-28 | openmp: Add support for streaming metadirectives and resolving them after LTO | Kwok Cheung Yeung | 7 | -0/+220
This patch adds support for streaming metadirective Gimple statements during LTO, and adds a metadirective expansion pass that runs after LTO. This is required for metadirectives with selectors that can only be resolved from within the accel compiler. 2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * Makefile.in (OBJS): Add omp-expand-metadirective.o. * gimple-streamer-in.cc (input_gimple_stmt): Add case for GIMPLE_OMP_METADIRECTIVE. Handle metadirective labels. * gimple-streamer-out.cc (output_gimple_stmt): Likewise. * omp-expand-metadirective.cc: New. * passes.def: Add pass_omp_expand_metadirective. * tree-pass.h (make_pass_omp_expand_metadirective): New prototype.
2022-06-28 | openmp: Add support for resolving metadirectives during parsing and Gimplification | Kwok Cheung Yeung | 6 | -10/+367

This adds support for resolving metadirectives according to the OpenMP 5.1 specification. The variants are sorted by score, then gathered into a list of dynamic replacement candidates. The metadirective is then expanded into a sequence of 'if..else' statements to test the dynamic selector and execute the variant if the selector is satisfied.

If any of the selectors in the list are unresolvable, GCC will give up on resolving the metadirective and try again later.

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * gimplify.cc (expand_omp_metadirective): New. * omp-general.cc: Include tree-pretty-print.h. (DELAY_METADIRECTIVES_AFTER_LTO): New macro. (omp_context_selector_matches): Delay resolution of selectors. Allow non-constant expressions. (omp_dynamic_cond): New. (omp_dynamic_selector_p): New. (sort_variant): New. (omp_get_dynamic_candidates): New. (omp_resolve_metadirective): New. (omp_resolve_metadirective): New. * omp-general.h (struct omp_metadirective_variant): New. (omp_resolve_metadirective): New prototype. gcc/c-family/ * c-omp.cc (c_omp_expand_metadirective_r): New. (c_omp_expand_metadirective): New.
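The resolution strategy described in this commit (sort the variants by score, then test each dynamic selector in turn) can be pictured with a small stand-alone C model. This is only a sketch under stated assumptions: `struct variant`, `resolve`, `is_large` and `demo` are hypothetical names invented for illustration and are not GCC's data structures or functions, and the loop merely mimics what the generated 'if..else' chain would do at run time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one metadirective variant (not a GCC type).  */
struct variant {
    int score;               /* higher score is preferred */
    bool (*selector)(int);   /* dynamic selector; NULL means "always matches" */
    int id;                  /* identifies the directive variant to run */
};

/* qsort comparator: orders variants by descending score.  */
static int by_score_desc(const void *a, const void *b)
{
    const struct variant *va = a;
    const struct variant *vb = b;
    return (vb->score > va->score) - (vb->score < va->score);
}

/* Sorts the candidates by score, then behaves like an if..else chain:
   the first variant whose dynamic selector holds is chosen.
   Returns the chosen id, or -1 if no variant matches.  */
int resolve(struct variant *v, size_t n, int runtime_value)
{
    qsort(v, n, sizeof *v, by_score_desc);
    for (size_t i = 0; i < n; i++)
        if (v[i].selector == NULL || v[i].selector(runtime_value))
            return v[i].id;
    return -1;
}

/* Hypothetical dynamic condition, e.g. user={condition(n > 1000)}.  */
static bool is_large(int n)
{
    return n > 1000;
}

/* Two candidates: a conditional variant (id 1) and a default (id 2).  */
int demo(int n)
{
    struct variant v[] = {
        { 0, NULL, 2 },       /* default variant, lowest score */
        { 1, is_large, 1 },   /* preferred when the condition holds */
    };
    return resolve(v, sizeof v / sizeof v[0], n);
}
```

Calling `demo` with a value above 1000 selects the conditional variant, and any other value falls through to the default, which is the behaviour the expanded 'if..else' sequence is meant to provide.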
2022-06-28 | openmp: Add middle-end support for metadirectives | Kwok Cheung Yeung | 15 | -1/+633
This adds a new Gimple statement type GIMPLE_OMP_METADIRECTIVE, which represents the metadirective in Gimple. In high Gimple, the statement contains the body of the directive variants, whereas in low Gimple, it only contains labels to the bodies. This patch adds support for converting metadirectives from tree to Gimple form, and handling of the Gimple form (Gimple lowering, OpenMP lowering and expansion, inlining, SSA handling etc). Metadirectives should be resolved before they reach the back-end, otherwise the compiler will crash as GCC does not know how to convert metadirective Gimple statements to RTX. 2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * gimple-low.cc (lower_omp_metadirective): New. (lower_stmt): Handle GIMPLE_OMP_METADIRECTIVE. * gimple-pretty-print.cc (dump_gimple_omp_metadirective): New. (pp_gimple_stmt_1): Handle GIMPLE_OMP_METADIRECTIVE. * gimple-walk.cc (walk_gimple_op): Handle GIMPLE_OMP_METADIRECTIVE. (walk_gimple_stmt): Likewise. * gimple.cc (gimple_alloc_omp_metadirective): New. (gimple_build_omp_metadirective): New. (gimple_build_omp_metadirective_variant): New. * gimple.def (GIMPLE_OMP_METADIRECTIVE): New. (GIMPLE_OMP_METADIRECTIVE_VARIANT): New. * gimple.h (gomp_metadirective_variant): New. (gomp_metadirective): New. (is_a_helper <gomp_metadirective *>::test): New. (is_a_helper <gomp_metadirective_variant *>::test): New. (is_a_helper <const gomp_metadirective *>::test): New. (is_a_helper <const gomp_metadirective_variant *>::test): New. (gimple_alloc_omp_metadirective): New prototype. (gimple_build_omp_metadirective): New prototype. (gimple_build_omp_metadirective_variant): New prototype. (gimple_has_substatements): Add GIMPLE_OMP_METADIRECTIVE case. (gimple_has_ops): Add GIMPLE_OMP_METADIRECTIVE. (gimple_omp_metadirective_label): New. (gimple_omp_metadirective_set_label): New. (gimple_omp_metadirective_variants): New. (gimple_omp_metadirective_set_variants): New. (CASE_GIMPLE_OMP): Add GIMPLE_OMP_METADIRECTIVE. 
* gimplify.cc (is_gimple_stmt): Add OMP_METADIRECTIVE. (expand_omp_metadirective): New. (gimplify_omp_metadirective): New. (gimplify_expr): Add case for OMP_METADIRECTIVE. * gsstruct.def (GSS_OMP_METADIRECTIVE): New. (GSS_OMP_METADIRECTIVE_VARIANT): New. * omp-expand.cc (build_omp_regions_1): Handle GIMPLE_OMP_METADIRECTIVE. (omp_make_gimple_edges): Likewise. * omp-low.cc (struct omp_context): Add next_clone field. (new_omp_context): Initialize next_clone field. (clone_omp_context): New. (delete_omp_context): Delete clone contexts. (scan_omp_metadirective): New. (scan_omp_1_stmt): Handle GIMPLE_OMP_METADIRECTIVE. (lower_omp_metadirective): New. (lower_omp_1): Handle GIMPLE_OMP_METADIRECTIVE. * tree-cfg.cc (cleanup_dead_labels): Handle GIMPLE_OMP_METADIRECTIVE. (gimple_redirect_edge_and_branch): Likewise. * tree-inline.cc (remap_gimple_stmt): Handle GIMPLE_OMP_METADIRECTIVE. (estimate_num_insns): Likewise. * tree-pretty-print.cc (dump_generic_node): Handle OMP_METADIRECTIVE. * tree-ssa-operands.cc (parse_ssa_operands): Handle GIMPLE_OMP_METADIRECTIVE.
2022-06-28 | openmp: Add C support for parsing metadirectives | Kwok Cheung Yeung | 13 | -14/+499
This patch implements parsing for the OpenMP metadirective introduced in OpenMP 5.0. Metadirectives are parsed into an OMP_METADIRECTIVE node, with the variant clauses forming a chain accessible via OMP_METADIRECTIVE_CLAUSES. Each clause contains the context selector and tree for the variant. User conditions in the selector are now permitted to be non-constant when used in metadirectives as specified in OpenMP 5.1. 2021-01-25 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * omp-general.cc (omp_context_selector_matches): Add extra argument. (omp_resolve_metadirective): New stub function. * omp-general.h (struct omp_metadirective_variant): New. (omp_context_selector_matches): Add extra argument. (omp_resolve_metadirective): New prototype. * tree.def (OMP_METADIRECTIVE): New. * tree.h (OMP_METADIRECTIVE_CLAUSES): New macro. gcc/c/ * c-parser.cc (c_parser_skip_to_end_of_block_or_statement): Handle parentheses in statement. (c_parser_omp_metadirective): New prototype. (c_parser_omp_context_selector): Add extra argument. Allow non-constant expressions. (c_parser_omp_context_selector_specification): Add extra argument and propagate it to c_parser_omp_context_selector. (analyze_metadirective_body): New. (c_parser_omp_metadirective): New. (c_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE. gcc/c-family/ * c-common.h (enum c_omp_directive_kind): Add C_OMP_DIR_META. (c_omp_expand_metadirective): New prototype. * c-gimplify.cc (genericize_omp_metadirective_stmt): New. (c_genericize_control_stmt): Handle OMP_METADIRECTIVE tree nodes. * c-omp.cc (omp_directives): Classify metadirectives as C_OMP_DIR_META. (c_omp_expand_metadirective): New stub function. * c-pragma.cc (omp_pragmas): Add entry for metadirective. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_METADIRECTIVE.
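For orientation, a metadirective with a non-constant user condition, of the kind this commit lets the C parser accept, might look like the sketch below. The function names `scale` and `scaled_sum` are hypothetical and the choice of variants is an assumption made for illustration; the `default` clause is spelled as in the OpenMP 5.0/5.1 wording used by these commits. A compiler without OpenMP enabled will typically ignore the pragma, in which case the loop simply runs serially.

```c
#include <stddef.h>

/* Scales a[0..n-1] by k.  With metadirective support and OpenMP enabled,
   the 'parallel for' variant is intended to be chosen when n > 1000 at
   run time and the 'default' variant otherwise.  Without OpenMP the
   pragma is ignored and this is a plain serial loop.  */
void scale(double *a, size_t n, double k)
{
    #pragma omp metadirective \
        when (user={condition(n > 1000)}: parallel for) \
        default (simd)
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}

/* Small helper so the result can be checked: scales {1, 2, 3, 4} by k
   and returns the sum of the scaled elements.  */
double scaled_sum(double k)
{
    double a[4] = { 1.0, 2.0, 3.0, 4.0 };
    scale(a, 4, k);
    return a[0] + a[1] + a[2] + a[3];
}
```

Whichever variant is selected, the numerical result is the same; only the way the loop is executed changes.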
2022-06-28 | ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032] | Jakub Jelinek | 2 | -7/+29

noce_try_sign_mask as documented will optimize

    if (c < 0)
      x = t;
    else
      x = 0;

into

    x = (c >> bitsm1) & t;

The optimization is done if either t is unconditional (e.g. for

    x = t;
    if (c >= 0)
      x = 0;

) or if it is cheap. We already check that t doesn't have side-effects, but if t is conditional, we need to punt also if it may trap or fault, as we make it unconditional.

I've briefly skimmed other noce_try* optimizations and didn't find one that would suffer from the same problem.

2022-06-21 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/106032 * ifcvt.cc (noce_try_sign_mask): Punt if !t_unconditional, and t may_trap_or_fault_p, even if it is cheap. * gcc.c-torture/execute/pr106032.c: New test. (cherry picked from commit a0c30fe3b888f20215f3e040d21b62b603804ca9)
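The transformation in this commit message can be checked at source level with a small C sketch. The function names are hypothetical, and the mask form assumes that `>>` on a negative `int` is an arithmetic shift, which is implementation-defined in ISO C but is how GCC behaves. The third function shows the shape of code the fix protects: the load is only safe on the `c < 0` path, so it must not be made unconditional.

```c
#include <assert.h>
#include <limits.h>

/* Branchy form, as written in the source program.  */
int select_branch(int c, int t)
{
    int x;
    if (c < 0)
        x = t;
    else
        x = 0;
    return x;
}

/* Branch-free form: (c >> bitsm1) is all-ones when c is negative and
   zero otherwise.  Assumes an arithmetic right shift on negative values.  */
int select_mask(int c, int t)
{
    const int bitsm1 = (int)(sizeof(int) * CHAR_BIT) - 1;
    return (c >> bitsm1) & t;
}

/* Here t is a conditional load that may fault: *p is only read when
   c < 0, so p is allowed to be invalid for non-negative c.  Rewriting
   this into the mask form would read *p unconditionally.  */
int select_guarded_load(int c, const int *p)
{
    int x;
    if (c < 0)
        x = *p;
    else
        x = 0;
    return x;
}

/* Asserts that both forms agree for one input pair.  */
void check_equivalent(int c, int t)
{
    assert(select_branch(c, t) == select_mask(c, t));
}
```

For plain register operands the two forms are interchangeable; the guarded-load case is where the conversion has to be rejected.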
2022-06-28 | expand: Fix up expand_cond_expr_using_cmove [PR106030] | Jakub Jelinek | 2 | -1/+18
If expand_cond_expr_using_cmove can't find a cmove optab for a particular mode, it tries to promote the mode and perform the cmove in the promoted mode.

The testcase in the patch ICEs on arm because in that case we pass temp which has the promoted mode (SImode) as target to expand_operands where the operands have the non-promoted mode (QImode). Later on the function uses paradoxical subregs:

    if (GET_MODE (op1) != mode)
      op1 = gen_lowpart (mode, op1);

    if (GET_MODE (op2) != mode)
      op2 = gen_lowpart (mode, op2);

to change the operand modes.

The following patch fixes it by passing NULL_RTX as target if it has promoted mode.

2022-06-21 Jakub Jelinek <jakub@redhat.com> PR middle-end/106030 * expr.cc (expand_cond_expr_using_cmove): Pass NULL_RTX instead of temp to expand_operands if mode has been promoted. * gcc.c-torture/compile/pr106030.c: New test. (cherry picked from commit 2df1df945fac85d7b3d084001414a66a2709d8fe)
2022-06-28 | loongarch: exclude LARCH_PROLOGUE_TEMP from SIBCALL_REGS [PR 106096] | Xi Ruoyao | 4 | -2/+111
The epilogue may clobber LARCH_PROLOGUE_TEMP ($r13/$t1), so it cannot be used for sibcalls. gcc/ChangeLog: PR target/106096 * config/loongarch/loongarch.h (REG_CLASS_CONTENTS): Exclude $r13 from SIBCALL_REGS. * config/loongarch/loongarch.cc (loongarch_regno_to_class): Change $r13 to JIRL_REGS. gcc/testsuite/ChangeLog: PR target/106096 * g++.target/loongarch/loongarch.exp: New test support file. * g++.target/loongarch/pr106096.C: New test. (cherry picked from commit 020b7d98589bbc928b5a66b1ed56b42af8791355)
2022-06-28 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-06-27 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-06-26 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-06-25 | Daily bump. | GCC Administrator | 4 | -1/+26
2022-06-24 | openacc: Adjust test expectations to new "kernels" handling | Frederik Harwath | 53 | -686/+887
Adjust tests to changed expectations with the new Graphite-based "kernels" handling. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Adjust. * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90: Adjust. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust. * testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90: Removed. gcc/testsuite/ChangeLog: * c-c++-common/goacc/acc-icf.c: Adjust. * c-c++-common/goacc/cache-3-1.c: Adjust. * c-c++-common/goacc/classify-kernels-unparallelized-graphite.c: Adjust. * c-c++-common/goacc/classify-kernels.c: Adjust. * c-c++-common/goacc/classify-serial.c: Adjust. * c-c++-common/goacc/if-clause-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-1.c: Adjust. * c-c++-common/goacc/kernels-decompose-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-ice-1.c: Adjust. * c-c++-common/goacc/kernels-decompose-ice-2.c: Adjust. * c-c++-common/goacc/kernels-loop-3-acc-loop.c: Adjust. * c-c++-common/goacc/kernels-loop-3.c: Adjust. 
* c-c++-common/goacc/loop-2-kernels.c: Adjust. * c-c++-common/goacc/nested-reductions-2-parallel.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Adjust. * c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c: Adjust. * c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c: Adjust. * c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c: Adjust. * c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Adjust. * c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c: Adjust. * c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust. * c-c++-common/goacc/routine-1.c: Adjust. * c-c++-common/goacc/routine-level-of-parallelism-2.c: Adjust. * c-c++-common/goacc/routine-nohost-1.c: Adjust. * c-c++-common/goacc/uninit-copy-clause.c: Adjust. * gcc.dg/goacc/loop-processing-1.c: Adjust. * gcc.dg/goacc/nested-function-1.c: Adjust. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust. * gfortran.dg/goacc/classify-kernels.f95: Adjust. * gfortran.dg/goacc/classify-parallel.f95: Adjust. * gfortran.dg/goacc/classify-routine.f95: Adjust. * gfortran.dg/goacc/classify-serial.f95: Adjust. * gfortran.dg/goacc/common-block-3.f90: Adjust. * gfortran.dg/goacc/gang-static.f95: Adjust. * gfortran.dg/goacc/kernels-decompose-1.f95: Adjust. * gfortran.dg/goacc/kernels-decompose-2.f95: Adjust. * gfortran.dg/goacc/kernels-loop-2.f95: Adjust. * gfortran.dg/goacc/kernels-loop-data-2.f95: Adjust. * gfortran.dg/goacc/kernels-loop-inner.f95: Adjust. * gfortran.dg/goacc/kernels-loop.f95: Adjust. * gfortran.dg/goacc/kernels-tree.f95: Adjust. * gfortran.dg/goacc/loop-2-kernels.f95: Adjust. * gfortran.dg/goacc/loop-auto-transfer-2.f90: Adjust. 
* gfortran.dg/goacc/loop-auto-transfer-3.f90: Adjust. * gfortran.dg/goacc/loop-auto-transfer-4.f90: Adjust. * gfortran.dg/goacc/nested-function-1.f90: Adjust. * gfortran.dg/goacc/nested-reductions-2-parallel.f90: Adjust. * gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust. * gfortran.dg/goacc/private-predetermined-kernels-1.f95: Adjust. * gfortran.dg/goacc/routine-module-mod-1.f90: Adjust. * gfortran.dg/goacc/uninit-copy-clause.f95: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Removed.
2022-06-24 | tilegx: Fix infinite loop in gen-mul-tables generator | Iain Buclaw | 1 | -2/+2
Since around GCC 10, the condition `j < (INTMAX_MAX / 10)' will get optimized into `j != 922337203685477580', which will result in an infinite loop for certain inputs of `j'. Copy the condition already used by the -DTILEPRO generator code, which doesn't fall into this trap. gcc/ChangeLog: * config/tilepro/gen-mul-tables.cc (tilegx_emit): Adjust loop condition to avoid overflow. (cherry picked from commit c0ad48527c314a1e9354b7c26718b56ed4abc92c)
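The commit message does not show the loop itself, so the following is only a sketch of the general hazard and one way to avoid it, not the generator's actual code. Because signed overflow is undefined, a compiler may reason about a loop counter as if it never wraps, and a termination test that depends on wrapping can then be rewritten in surprising ways. Checking the bound before multiplying keeps every value in range; the function names here are hypothetical.

```c
#include <stdint.h>

/* Returns the largest power of ten representable in intmax_t.
   The multiplication is performed only after checking that it cannot
   overflow, so termination never relies on signed wraparound.  */
intmax_t largest_power_of_ten(void)
{
    intmax_t j = 1;
    while (j <= INTMAX_MAX / 10)
        j *= 10;   /* safe: j * 10 <= INTMAX_MAX holds here */
    return j;
}

/* Counts the powers of ten (1, 10, 100, ...) that fit in intmax_t,
   using the same overflow-free stepping.  */
int count_powers_of_ten(void)
{
    int count = 1;   /* accounts for j == 1 */
    intmax_t j = 1;
    while (j <= INTMAX_MAX / 10) {
        j *= 10;
        count++;
    }
    return count;
}
```

The design choice is to make the guard and the step refer to the same quantity: the loop body is entered only when the next multiplication is known to be representable.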
2022-06-23 | c++: constexpr folding in unevaluated context [PR105931] | Patrick Palka | 2 | -0/+16
Changing the type of N from int to unsigned in decltype82.C (from r13-986-g0ecb6b906f215e) reveals another spot where we perform constexpr evaluation in an unevaluated context for the sake of warnings, this time from the call to shorten_compare in cp_build_binary_op, which calls fold_for_warn. We could (and probably should) suppress the shorten_compare warnings when in an unevaluated context, but there are probably other callers of fold_for_warn that are similarly affected. So this patch takes the approach of directly suppressing fold_for_warn when in an unevaluated context. PR c++/105931 gcc/cp/ChangeLog: * expr.cc (fold_for_warn): Don't fold when in an unevaluated context. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype82a.C: New test. (cherry picked from commit b00b95198e6720eb23a2618870d67800f6180fdd)
2022-06-24 | Daily bump. | GCC Administrator | 4 | -1/+103
2022-06-23 | c++: anon union designated init [PR105925] | Jason Merrill | 2 | -0/+25
This testcase was failing because CONSTRUCTOR_IS_DESIGNATED_INIT wasn't getting set on the introduced CONSTRUCTOR for the anonymous union, and build_aggr_conv uses that flag to decide whether to pay attention to the indexes of the CONSTRUCTOR. So set the flag when we see a designator rather than relying on copying it from another CONSTRUCTOR. PR c++/105925 gcc/cp/ChangeLog: * decl.cc (reshape_init_array_1): Set CONSTRUCTOR_IS_DESIGNATED_INIT here. (reshape_init_class): And here. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/desig26.C: New test.
2022-06-23 | c++: -Waddress and value-dependent expr [PR105885] | Jason Merrill | 2 | -0/+22
We already suppress various warnings for code that would be tautological if written directly, but not when it's the result of template substitution. It seems we need to do this for -Waddress as well. PR c++/105885 gcc/cp/ChangeLog: * pt.cc (tsubst_copy_and_build): Also suppress -Waddress for comparison of dependent operands. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/constexpr-if37.C: New test.
2022-06-23 | ipa-icf: skip variables with body_removed | Martin Liska | 1 | -3/+4
Similarly to cgraph_nodes, it may happen that body_removed is set during merging of symbols. PR ipa/105600 gcc/ChangeLog: * ipa-icf.cc (sem_item_optimizer::filter_removed_items): Skip variables with body_removed. (cherry picked from commit 31ce821a790caec8a2849dd67a9847e78a33d14c)
2022-06-23 | tree-object-size: Don't let error_mark_node escape for ADDR_EXPR [PR105736] | Siddhesh Poyarekar | 2 | -9/+29
The addr_expr computation does not check for error_mark_node before returning the size expression. This used to work in the constant case because the conversion to uhwi would end up causing it to return size_unknown, but that won't work for the dynamic case. Modify the control flow to explicitly return size_unknown if the offset computation returns an error_mark_node. gcc/ChangeLog: PR tree-optimization/105736 * tree-object-size.cc (addr_object_size): Return size_unknown when object offset computation returns an error. gcc/testsuite/ChangeLog: PR tree-optimization/105736 * gcc.dg/builtin-dynamic-object-size-0.c (TV4): New struct. (val3): New variable. (test_pr105736): New test. (main): Call it. Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org> (cherry picked from commit 70454c50b4592fe6876ecca13268264e395e058f)
2022-06-22 | c++: dependence of baselink [PR105964] | Jason Merrill | 2 | -0/+29
helper<token>::c isn't dependent just because we haven't deduced its return type yet. type_dependent_expression_p already knows how to deal with that for bare FUNCTION_DECL, but needs to learn to look through a BASELINK. PR c++/105964 gcc/cp/ChangeLog: * pt.cc (type_dependent_expression_p): Look through BASELINK. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/nontype-auto21.C: New test.
2022-06-22 | c++: class scope function lookup [PR105908] | Jason Merrill | 2 | -0/+21
In r12-1273 for PR91706, I removed the code in get_class_binding that stripped BASELINK. This testcase demonstrates that we still need to strip it in outer_binding before putting the overload set in IDENTIFIER_BINDING, for compatibility with bindings added directly for declarations. PR c++/105908 gcc/cp/ChangeLog: * name-lookup.cc (outer_binding): Strip BASELINK. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/trailing16.C: New test.
2022-06-22 | aarch64: Revert bogus fix for PR105254 | Richard Sandiford | 2 | -8/+17
In f2ebf2d98efe0ac2314b58cf474f44cb8ebd5244 I'd forced the chosen unroll factor to be a factor of the VF, in order to work around an exact_div ICE in PR105254. This was completely bogus -- clearly I didn't look in enough detail at why we ended up with an unrolled VF that wasn't a multiple of the UF. Kewen has since fixed the bug properly for PR105940, so this patch reverts my earlier attempt. Sorry for the stupidity. gcc/ PR tree-optimization/105254 PR tree-optimization/105940 Revert: * config/aarch64/aarch64.cc (aarch64_vector_costs::determine_suggested_unroll_factor): Take a loop_vec_info as argument. Restrict the unroll factor to values that divide the VF. (aarch64_vector_costs::finish_cost): Update call accordingly. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_14.c: New test. (cherry picked from commit 2636660b6f35423e0cfbf53bfad5c5fed6ae6471)
2022-06-22 | vect: Move suggested_unroll_factor applying [PR105940] | Kewen Lin | 1 | -3/+3
As PR105940 shows, when the rs6000 port tries to set m_suggested_unroll_factor to 4 or so, there will be an ICE on:

    exact_div (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
               loop_vinfo->suggested_unroll_factor);

In function vect_analyze_loop_2, the current place where suggested_unroll_factor is applied can't guarantee that it is applied in all cases. As the case shows, the vectorizer could retry with SLP forced off, and the vf is then reset to saved_vectorization_factor, which has not had suggested_unroll_factor applied. This means it can end up with a vf that neglects suggested_unroll_factor. I think this is contrary to the design; we should move the application of suggested_unroll_factor to after start_over.

PR tree-optimization/105940 gcc/ChangeLog: * tree-vect-loop.cc (vect_analyze_loop_2): Move the place of applying suggested_unroll_factor after start_over. (cherry picked from commit f907cf4c07cf51863dadbe90894e2ae3382bada5)
2022-06-23 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-06-22 | Daily bump. | GCC Administrator | 3 | -1/+18
2022-06-21 | i386: Disallow sibcall for calling ifunc functions with PIC register | H.J. Lu | 2 | -0/+28
Disallow sibcall when calling ifunc functions with PIC register so that PIC register can be restored. gcc/ PR target/105960 * config/i386/i386.cc (ix86_function_ok_for_sibcall): Return false if PIC register is used when calling ifunc functions. gcc/testsuite/ PR target/105960 * gcc.target/i386/pr105960.c: New test. (cherry picked from commit fe9765c0b97e6b4ce2cd226631d329fc05ba2aa5)
2022-06-21graphite: Accept loops without data referencesFrederik Harwath2-13/+5
It seems that the check that rejects loops without data references is only included to avoid handling non-profitable loops. Including those loops in Graphite's analysis enables more consistent diagnostic messages in OpenACC "kernels" code and does not introduce any testsuite regressions. If executing Graphite on loops without data references leads to noticeable compile time slow-downs for non-OpenACC users of Graphite, the check can be re-introduced but restricted to non-OpenACC functions. gcc/ChangeLog: * graphite-scop-detection.cc (scop_detection::harmful_loop_in_region): Remove check for loops without data references.
2022-06-21graphite: Adjust scop loop-nest choiceFrederik Harwath4-6/+27
The find_common_loop function is used in Graphite to obtain a common super-loop of all loops inside a SCoP. The function is applied to the loop of the destination block of the edge that leads into the SESE region and the loop of the source block of the edge that exits the region. The exit block is usually introduced by the canonicalization of the loop structure that Graphite does to support its code generation. If it is empty, it may happen that it belongs to the outer fake loop. This way, build_alias_set may end up analysing data-references with respect to this loop although there may exist a proper super-loop of the SCoP loops. This does not seem to be correct in general and it leads to problems with runtime alias check creation which fails if executed on a loop without niter information. gcc/ChangeLog: * graphite-scop-detection.cc (scop_context_loop): New function. (build_alias_set): Use scop_context_loop instead of find_common_loop. * graphite-isl-ast-to-gimple.cc (graphite_regenerate_ast_isl): Likewise. * graphite.h (scop_context_loop): New declaration.
2022-06-21graphite: Tune parameters for OpenACC useFrederik Harwath6-6/+115
The default values of some parameters that restrict Graphite's resource usage are too low for many OpenACC codes. Furthermore, exceeding the limits does not always lead to user-visible diagnostic messages. This commit increases the parameter values on OpenACC functions. The values were chosen to allow for the analysis of all "kernels" regions in the SPEC ACCEL v1.3 benchmark suite. Warnings about exceeded Graphite-related limits are added to the -fopt-info-missed output. Those warnings are phrased in a uniform way that intentionally refers to the "data-dependence analysis" of "OpenACC loops" instead of "a failure in Graphite" to make them easier to understand for users. gcc/ChangeLog: * graphite-optimize-isl.cc (optimize_isl): Adjust param_max_isl_operations value for OpenACC functions and add special warnings if value gets exceeded. * graphite-scop-detection.cc (build_scops): Likewise for param_graphite_max_arrays_per_scop. gcc/testsuite/ChangeLog: * gcc.dg/goacc/graphite-parameter-1.c: New test. * gcc.dg/goacc/graphite-parameter-2.c: New test.
2022-06-21openacc: Disable pass_pre on outlined functions analyzed by GraphiteFrederik Harwath2-0/+22
The additional dependences introduced by partial redundancy elimination proper and by the code hoisting step of the pass very often cause Graphite to fail on OpenACC functions. On the other hand, the pass can also enable the analysis of OpenACC loops (cf. e.g. the loop-auto-transfer-4.f90 testcase), for instance, because full redundancy elimination removes definitions that would otherwise prevent the creation of runtime alias checks outside of the SCoP. This commit disables the actual partial redundancy elimination step as well as the code hoisting step of pass_pre on OpenACC functions that might be handled by Graphite. gcc/ChangeLog: * tree-ssa-pre.cc (insert): Skip any insertions in OpenACC functions that might be processed by Graphite.
2022-06-21openacc: Handle internal function calls in pass_limFrederik Harwath5-12/+67
The loop invariant motion pass correctly refuses to move statements out of a loop if any other statement in the loop is unanalyzable. The pass does not know how to handle the OpenACC internal function calls which was not necessary until recently when the OpenACC device lowering pass was moved to a later position in the pass pipeline. This commit changes pass_lim to ignore the OpenACC internal function calls which do not contain any memory references. The hoisting enabled by this change can be useful for the data-dependence analysis in Graphite; for instance, in the outlined functions for OpenACC regions, all invariant accesses to the ".omp_data_i" struct should be hoisted out of the OpenACC loop. This is particularly important for variables that were scalars in the original loop and which have been turned into accesses to the struct by the outlining process. Not hoisting those can prevent scalar evolution analysis which is crucial for Graphite. Since any hoisting that introduces intermediate names - and hence, "fake" dependences - inside the analyzed nest can be harmful to data-dependence analysis, a flag to restrict the hoisting in OpenACC functions is added to the pass. The pass instance that executes before Graphite now runs with this flag set to true and the pass instance after Graphite runs unrestricted. A more precise way of selecting the statements for which hoisting should be enabled is left for a future improvement. gcc/ChangeLog: * passes.def: Set restrict_oacc_hoisting to true for the early pass_lim instance. * tree-ssa-loop-im.cc (movement_possibility): Add restrict_oacc_hoisting flag to function; restrict movement if set. (compute_invariantness): Add restrict_oacc_hoisting flag and pass it on. (gather_mem_refs_stmt): Skip IFN_GOACC_LOOP and IFN_UNIQUE calls. (loop_invariant_motion_in_fun): Add restrict_oacc_hoisting flag and pass it on. (pass_lim::execute): Pass on new flags. * tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Adjust declaration. 
* gimple-loop-interchange.cc (pass_linterchange::execute): Adjust call to loop_invariant_motion_in_fun.
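The effect of hoisting invariant struct accesses out of a loop can be illustrated by hand. The sketch below is an assumption-laden illustration, not output of pass_lim: struct omp_data is a hypothetical stand-in for the ".omp_data_i" struct mentioned above, and the two functions only show the before and after shape of the loop.

```c
#include <assert.h>

/* Hypothetical stand-in for the ".omp_data_i" struct that an outlined
   OpenACC region receives; the field names are made up.  */
struct omp_data
{
  int *a;
  int n;
  int scale;
};

/* Before hoisting: every iteration reloads the fields through the
   struct pointer, so the loop bound and the scale factor look like
   memory accesses rather than loop-invariant scalars.  */
void scale_unhoisted(struct omp_data *d)
{
  assert(d != 0);
  for (int i = 0; i < d->n; i++)
    d->a[i] *= d->scale;
}

/* After hoisting: the invariant loads happen once before the loop,
   leaving plain scalars that are easy to analyze inside it.  */
void scale_hoisted(struct omp_data *d)
{
  assert(d != 0);
  int *a = d->a;
  int n = d->n;
  int scale = d->scale;
  for (int i = 0; i < n; i++)
    a[i] *= scale;
}
```

Both functions compute the same result when the array does not overlap the struct; the second form is the one in which the loop bound and the former scalars are visibly invariant.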
2022-06-21openacc: Warn about "independent" "kernels" loops with data-dependencesFrederik Harwath3-0/+60
This commit concerns loops in OpenACC "kernels" regions that have been marked up with an explicit "independent" clause by the user, but for which Graphite found data dependences. A discussion on the private internal OpenACC mailing list suggested that warning the user about the dependences would be a more acceptable solution than reverting the user's decision. This behavior is implemented by the present commit. gcc/ChangeLog: * common.opt: Add flag Wopenacc-false-independent. * omp-offload.cc (oacc_loop_warn_if_false_independent): New function. (oacc_loop_fixed_partitions): Call from here.
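A loop of the kind this warning targets might look like the sketch below. The function name is hypothetical, and the exact conditions under which the warning fires are not stated in the commit message, so this is only an assumed example of an "independent" loop that does carry a dependence.

```c
#include <assert.h>

/* In-place prefix sum.  Each iteration reads the element written by
   the previous iteration, so the loop carries a data dependence even
   though it is marked "independent".  */
void prefix_sum(int *a, int n)
{
  assert(n >= 0);
#pragma acc kernels copy(a[0:n])
  {
#pragma acc loop independent
    for (int i = 1; i < n; i++)
      a[i] += a[i - 1];
  }
}
```

Compiled without OpenACC support the pragmas are ignored and the loop runs sequentially with the expected result; with OpenACC enabled, the "independent" clause asserts something that is not true of this loop, which is the situation the new flag is meant to report.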
2022-06-21openacc: Add runtime alias checking for OpenACC kernelsAndrew Stubbs5-209/+390
This commit adds the code generation for the runtime alias checks for OpenACC loops that have been analyzed by Graphite. The runtime alias check condition gets generated in Graphite. It is evaluated by the code generated for the IFN_GOACC_LOOP internal function calls. If aliasing is detected at runtime, the execution dimensions get adjusted to execute the affected loops sequentially. gcc/ChangeLog: * graphite-isl-ast-to-gimple.cc: Include internal-fn.h. (graphite_oacc_analyze_scop): Implement runtime alias checks. * omp-expand.cc (expand_oacc_for): Add an additional "noalias" parameter to GOACC_LOOP internal calls, and initialise it to integer_one_node. * omp-offload.cc (oacc_xform_loop): Integrate the runtime alias check into the GOACC_LOOP expansion. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-2.c: New test.
2022-06-21openacc: Add data optimization passAndrew Stubbs17-7/+2462
Address PR90591 "Avoid unnecessary data transfer out of OMP construct", for simple (but common) cases. This commit adds a pass that optimizes data mapping clauses. Currently, it can optimize copy/map(tofrom) clauses involving scalars to copyin/map(to) and further to "private". The pass is restricted to "kernels" regions but could be extended to other types of regions. gcc/ChangeLog: * Makefile.in: Add pass. * doc/gimple.texi: TODO. * gimple-walk.cc (walk_gimple_seq_mod): Adjust for backward walking. * gimple-walk.h (struct walk_stmt_info): Add field. * passes.def: Add new pass. * tree-pass.h (make_pass_omp_data_optimize): New declaration. * omp-data-optimize.cc: New file. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Expect optimization messages. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise. gcc/testsuite/ChangeLog: * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise. * c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise. * c-c++-common/goacc/uninit-copy-clause.c: Likewise. * gfortran.dg/goacc/uninit-copy-clause.f95: Likewise. * c-c++-common/goacc/omp_data_optimize-1.c: New test. * g++.dg/goacc/omp_data_optimize-1.C: New test. * gfortran.dg/goacc/omp_data_optimize-1.f90: New test. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
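The kind of scalar mapping this pass looks at can be sketched as follows. The function and variable names are hypothetical, and whether the pass performs exactly the described rewrite on this input is an assumption; the example only shows a scalar whose "copy" clause transfers more than the code needs.

```c
#include <assert.h>

/* Sum of squares with a temporary scalar.  "tmp" is always written
   before it is read inside the region and is never used afterwards,
   so neither its incoming nor its outgoing value matters: a candidate
   for narrowing "copy" to "private".  "total" is initialized before
   the region and read after it, so it needs its mapping in both
   directions.  */
int sum_squares(const int *a, int n)
{
  assert(n >= 0);
  int tmp = 0;
  int total = 0;
#pragma acc kernels copy(tmp, total) copyin(a[0:n])
  {
    for (int i = 0; i < n; i++)
      {
        tmp = a[i] * a[i];
        total += tmp;
      }
  }
  return total;
}
```

Without OpenACC support the pragma is ignored and the function simply returns the sum of squares, which makes the intended semantics easy to check on the host.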
2022-06-21Add function for printing a single OMP_CLAUSEFrederik Harwath3-0/+17
Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump the whole OMP clause chain") changed the dumping behavior for OMP_CLAUSEs. The old behavior is required for a follow-up commit ("openacc: Add data optimization pass") that optimizes single OMP_CLAUSEs. gcc/ChangeLog: * tree-pretty-print.cc (print_omp_clause_to_str): Add new function. * tree-pretty-print.h (print_omp_clause_to_str): Add declaration.
2022-06-21openacc: Remove unused partitioning in "kernels" regionsFrederik Harwath2-5/+53
With the old "kernels" handling, unparallelized regions would get executed with 1x1x1 partitioning even if the user provided explicit num_gangs, num_workers clauses etc. This commit restores this behavior by removing unused partitioning after assigning the parallelism dimensions to loops. gcc/ChangeLog: * omp-offload.cc (oacc_remove_unused_partitioning): New function for removing partitioning that is not used by any loop. (oacc_validate_dims): Call oacc_remove_unused_partitioning and enable warnings about unused partitioning. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Adjust expectations.
2022-06-21openacc: Add further kernels testsFrederik Harwath17-24/+1013
Add some copies of tests to continue covering the old "parloops"-based "kernels" implementation - until it gets removed from GCC - and add further tests for the new Graphite-based implementation. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/goacc/classify-kernels-unparallelized-graphite.c: New test. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: New test. * c-c++-common/goacc/kernels-decompose-1-parloops.c: New test. * c-c++-common/goacc/kernels-reduction-parloops.c: New test. * c-c++-common/goacc/loop-auto-reductions.c: New test. * c-c++-common/goacc/note-parallelism-1-kernels-loop-auto-parloops.c: New test. * c-c++-common/goacc/note-parallelism-kernels-loops-1.c: New test. * c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c: New test. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: New test. * gfortran.dg/goacc/kernels-conversion.f95: New test. * gfortran.dg/goacc/kernels-decompose-1-parloops.f95: New test. * gfortran.dg/goacc/kernels-decompose-parloops-2.f95: New test. * gfortran.dg/goacc/kernels-loop-data-parloops-2.f95: New test. * gfortran.dg/goacc/kernels-loop-parloops-2.f95: New test. * gfortran.dg/goacc/kernels-loop-parloops.f95: New test. * gfortran.dg/goacc/kernels-reductions.f90: New test.
2022-06-21openacc: Add "can_be_parallel" flag info to "graph" dumpsFrederik Harwath2-11/+32
gcc/ChangeLog: * graph.cc (oacc_get_fn_attrib): New declaration. (find_loop_location): New declaration. (draw_cfg_nodes_for_loop): Print value of the can_be_parallel flag at the top of loops in OpenACC functions.
2022-06-21openacc: Use Graphite for dependence analysis in "kernels" regionsFrederik Harwath52-400/+3250
This commit changes the handling of OpenACC "kernels" to use Graphite for dependence analysis. To this end, it first introduces a new internal representation for "kernels" regions which should be analyzed by Graphite in pass_omp_oacc_kernels_decompose. This is now the default for all "kernels" regions, but the old handling is still available through the command line parameter "--param=openacc-kernels=decompose-parloops". The handling of this new region type in the omp lowering and omp offloading passes follows the existing handling for "parallel" regions. This replaces the specialized handling for "kernels" regions that was previously used and which was limited in many ways. Graphite is adjusted to be able to analyze the OpenACC functions that get outlined from the "kernels" regions. It is enabled to handle the internal function calls that contain information about OpenACC constructs. In some places where function calls would be rejected by Graphite, those calls need to be ignored. In other places, information about the loop step, bounds etc. needs to be extracted from the calls. The goal is to enable an analysis of the original loop parameters although the omp lowering and expansion steps have already modified the loop structure. Some parallelization-enabling constructs such as OpenACC "reduction" and "private"/"firstprivate" clauses must be recognized and the data-dependences must be adjusted to reflect the semantics of those constructs. The data-dependence analysis step in Graphite has so far been tied to the code generation step. This commit introduces a separate data-dependence analysis step that avoids the code generation. This is necessary because adjusting the code generation to create a correct OpenACC loop structure would require very considerable effort and the goal of this commit is to implement the dependence analysis only. 
The ability to use Graphite for dependence analysis without its code generation might be of independent interest, but it is so far used for OpenACC purposes only. In general, all changes to Graphite try to avoid affecting other uses of Graphite as much as possible. gcc/ChangeLog: * Makefile.in: Add graphite-oacc.o * cfgloop.cc (alloc_loop): Set can_be_parallel_valid_p to false. * cfgloop.h: Add can_be_parallel_valid_p field. * cfgloopmanip.cc (copy_loop_info): Add assert. * config/nvptx/nvptx.cc (nvptx_goacc_reduction_setup): * doc/invoke.texi: Adjust param openacc-kernels description. * doc/passes.texi: Adjust pass_ipa_oacc_kernels description. * flag-types.h (enum openacc_kernels):Add OPENACC_KERNELS_DECOMPOSE_PARLOOPS. * gimple-pretty-print.cc (dump_gimple_omp_target): Handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE. * gimple.h (enum gf_mask): Add GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE and widen GF_OMP_TARGET_KIND_MASK. (is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE. (is_gimple_omp_offloaded): Likewise. * gimplify.cc (gimplify_omp_for): Enable reduction localization for "kernels" regions. (gimplify_omp_workshare): Likewise. * graphite-dependences.cc (scop_get_reads_and_writes): Handle "kills" and "reduction" PDRs. (apply_schedule_on_deps): Add dump output for intermediate steps of the dependence computation to enable understanding of unexpected dependences. (carries_deps): Likewise. (scop_get_dependences): Handle "kill" operations and add dump output. * graphite-isl-ast-to-gimple.cc (visit_schedule_loop_node): New function. (graphite_oacc_analyze_scop): New function. * graphite-optimize-isl.cc (optimize_isl): Remove "static" and add argument to identify OpenACC use; don't fail on unchanged schedule in this case. * graphite-poly.cc (new_poly_dr): Handle "kills". (print_pdr): Likewise. (new_gimple_poly_bb): Likewise. (free_gimple_poly_bb): Likewise. 
(new_scop): Handle "reduction", "private", and "firstprivate" hash sets. (free_scop): Likewise. (print_isl_space): New function. (debug_isl_space): New function. * graphite-scop-detection.cc (scop_detection::can_represent_loop): Don't fail if niter is 0 in OpenACC functions. (scop_detection::add_scop): Don't reject regions with only one loop in OpenACC functions. (ignored_oacc_internal_call_p): New function. (scan_tree_for_params): Handle VIEW_CONVERT_EXPR. (stmt_has_side_effects): Ignore internal OpenACC function calls. (add_write): Likewise. (add_read): Likewise. (add_kill): New function. (add_kills): New function. (add_oacc_kills): New function. (try_generate_gimple_bb): Kill false dependences for OpenACC "private"/"firstprivate" vars. (gather_bbs::gather_bbs): Determine OpenACC "private"/"firstprivate" vars in region. (gather_bbs::before_dom_children): Add assert. (determine_openacc_reductions): New function. (build_scops): Determine OpenACC "reduction" vars in SCoP. * graphite-sese-to-poly.cc (oacc_ifn_call_extract): New declaration. (oacc_internal_call_p): New function. (build_poly_dr): Ignore internal OpenACC function calls, handle "reduction" refs. (build_poly_sr): Likewise; handle "kill" operations. * graphite.cc (graphite_transform_loops): Accept functions with only a single loop. (oacc_enable_graphite_p): New function. (gate_graphite_transforms): Enable pass on OpenACC functions. * graphite.h (enum poly_dr_type): Add PDR_KILL. (struct poly_dr): Add "is_reduction" field. (new_poly_dr): Add argument to declaration. (pdr_kill_p): New function. (print_isl_space): New declaration. (debug_isl_space): New declaration. (struct scop): Add fields "reductions_vars", "oacc_firstprivate_vars", and "oacc_private_scalars". (optimize_isl): New declaration. (graphite_oacc_analyze_scop): New declaration. 
* internal-fn.cc (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE * internal-fn.h: Add OACC_PRIVATE_SCALAR and OACC_FIRSTPRIVATE * omp-expand.cc (struct omp_region): Adjust comment. (expand_omp_taskloop_for_inner): (expand_omp_for): Add asserts about expected "kernels" region types. (mark_loops_in_oacc_kernels_region): Likewise. (expand_omp_target): Likewise; handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE. (build_omp_regions_1): Handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE. Likewise. (omp_make_gimple_edges): Likewise. * omp-general.cc (oacc_get_kernels_attrib): New function. (oacc_get_fn_dim_size): Allow argument to be NULL. * omp-general.h (oacc_get_kernels_attrib): New declaration. * omp-low.cc (struct omp_context): Add fields "oacc_firstprivate_vars" and "oacc_private_scalars". (was_originally_oacc_kernels): New function. (is_oacc_kernels): (is_oacc_kernels_decomposed_graphite_part): New function. (new_omp_context): Allocate "oacc_first_private_vars" and "oacc_private_scalars" ... (delete_omp_context): ... and free from here. (oacc_record_firstprivate_var_clauses): New function. (oacc_record_private_scalars): New function. (scan_sharing_clauses): Call functions to record "private" scalars and "firstprivate" variables. (check_oacc_kernel_gwv): Add assert. (ctx_in_oacc_kernels_region): Handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE. (scan_omp_for): Likewise. (check_omp_nesting_restrictions): Likewise. (lower_oacc_head_mark): Likewise. (lower_omp_for): Likewise. (lower_omp_target): Create "private" and "firstprivate" marker call statements. (lower_oacc_head_tail): Adjust "private" and "firstprivate" marker calls. (lower_oacc_reductions): Emit "private" and "firstprivate" marker call statements. (make_oacc_firstprivate_vars_marker): New function. (make_oacc_private_scalars_marker): New function. 
* omp-oacc-kernels-decompose.cc (adjust_region_code_walk_stmt_fn): Assign GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE to region using the new "kernels" handling. (make_region_seq): Adjust default region type for new "kernels" handling; no more exceptions, let Graphite handle everything. (make_region_loop_nest): Likewise; add dump output and assert. (adjust_nested_loop_clauses): Stop creating "auto" clauses if loop has "independent", "gang" etc. (transform_kernels_loop_clauses): Likewise. * omp-offload.cc (oacc_extract_loop_call): New function. (oacc_loop_get_cfg_loop): New function. (can_be_parallel_str): New function. (oacc_loop_can_be_parallel_p): New function. (oacc_parallel_kernels_graphite_fun_p): New function. (oacc_parallel_fun_p): New function. (oacc_loop_transform_auto_into_independent): New function, ... (oacc_loop_fixed_partitions): ... called from here to transfer the result of Graphite's analysis to the loop. (execute_oacc_loop_designation): Handle "oacc functions with "parallel_kernels_graphite" attribute. (execute_oacc_device_lower): Handle IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE. * omp-offload.h (oacc_extract_loop_call): Add declaration. * params.opt: Add "param=openacc-kernels" value "decompose-parloops". * sese.cc (scalar_evolution_in_region): "Redirect" SCEV analysis to outer loop for IFN_GOACC_LOOP calls. * sese.h: Add field "kill_scalar_refs". * tree-chrec.cc (chrec_fold_plus_1): Handle VIEW_CONVERT_EXPR like CASE_CONVERT. * tree-data-ref.cc (dump_data_reference): Include DR_BASE_ADDRESS and DR_OFFSET in dump output. (get_references_in_stmt): Don't reject OpenACC internal function calls. (graphite_find_data_references_in_stmt): Remove unused variable. * tree-parloops.cc (pass_parallelize_loops::execute): Disable pass with the new kernels handling, enable if requested explicitly. 
* tree-scalar-evolution.cc (set_scev_analyze_openacc_calls): Set flag to enable the analysis of internal OpenACC function calls (use for Graphite only). (oacc_call_analyzable_p): New function. (oacc_ifn_call_extract): New function. (oacc_simplify): New function. (add_to_evolution): Simplify OpenACC internal function calls if applicable. (follow_ssa_edge_binary): Likewise. (follow_ssa_edge_expr): Likewise. (follow_copies_to_constant): Likewise. (analyze_initial_condition): Likewise. (interpret_loop_phi): Likewise. (interpret_gimple_call): New function. (interpret_rhs_expr): Likewise. (instantiate_scev_name): Likewise. (analyze_scalar_evolution_1): Handle GIMPLE_CALL, handle default definitions. (expression_expensive_p): Consider internal OpenACC calls to be cheap. * tree-scalar-evolution.h (set_scev_analyze_openacc_calls): New declaration. (oacc_call_analyzable_p): New declaration. * tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Mark lhs of internal OpenACC function calls necessary. * tree-ssa-ifcombine.c (recognize_if_then_else): * tree-ssa-loop-niter.cc (oacc_call_analyzable_p): (oacc_ifn_call_extract): New declaration. (interpret_gimple_call): New declaration. (expand_simple_operations): Handle internal OpenACC function calls. * tree-ssa-loop.cc (gate_oacc_kernels): Disable for new "kernels" handling. * graphite-oacc.cc: New file. * graphite-oacc.h: New file. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust. * testsuite/libgomp.oacc-fortran/kernels-independent.f90: Adjust. * testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Adjust. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust. gcc/testsuite/ChangeLog: * c-c++-common/goacc/classify-kernels.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Adjust. * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust. * c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust. 
* c-c++-common/goacc/classify-kernels-unparallelized.c: Removed. * c-c++-common/goacc/kernels-reduction.c: Removed. * gfortran.dg/goacc/loop-auto-transfer-2.f90: New test. * gfortran.dg/goacc/loop-auto-transfer-3.f90: New test. * gfortran.dg/goacc/loop-auto-transfer-4.f90: New test. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2022-06-21graphite: Add runtime alias checkingFrederik Harwath9-20/+344
Graphite rejects a SCoP if it contains a pair of data references for which it cannot determine statically if they may alias. This happens very often, for instance in C code which does not use explicit "restrict". This commit adds the possibility to analyze a SCoP nevertheless and perform an alias check at runtime. Then, if aliasing is detected, the execution will fall back to the unoptimized SCoP. TODO This needs more testing on non-OpenACC code. gcc/ChangeLog: * common.opt: Add fgraphite-runtime-alias-checks. * graphite-isl-ast-to-gimple.cc (generate_alias_cond): New function. (graphite_regenerate_ast_isl): Use from here. * graphite-poly.cc (new_scop): Create unhandled_alias_ddrs vec ... (free_scop): and release here. * graphite-scop-detection.cc (dr_defs_outside_region): New function. (dr_well_analyzed_for_runtime_alias_check_p): New function. (graphite_runtime_alias_check_p): New function. (build_alias_set): Record unhandled alias ddrs for later alias check creation if flag_graphite_runtime_alias_checks is true instead of failing. * graphite.h (struct scop): Add field unhandled_alias_ddrs. * sese.h (has_operands_from_region_p): New function. gcc/testsuite/ChangeLog: * gcc.dg/graphite/alias-1.c: New test.
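The general shape of a runtime alias check with a fallback can be written out by hand. This is a source-level sketch of the idea only, not the code Graphite generates: the function names are hypothetical, and the overlap test on integer-converted addresses assumes a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if the N-element int ranges starting at P and Q
   overlap.  Comparing addresses as integers is an assumption that
   holds on common flat-memory targets.  */
int ranges_overlap(const int *p, const int *q, size_t n)
{
  uintptr_t a = (uintptr_t) p;
  uintptr_t b = (uintptr_t) q;
  uintptr_t bytes = n * sizeof (int);
  return a < b + bytes && b < a + bytes;
}

/* Computes dst[i] = 2 * src[i] + 1.  Returns 1 if the alias check
   passed and the "optimized" version ran, 0 if the fallback ran.  */
int scale_add(int *dst, const int *src, size_t n)
{
  if (!ranges_overlap(dst, src, n))
    {
      /* No aliasing: iterations are independent, so this copy of the
         loop is the one an optimizer would be free to transform.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = 2 * src[i] + 1;
      return 1;
    }

  /* Aliasing detected: run the original loop in its original order.  */
  for (size_t i = 0; i < n; i++)
    dst[i] = 2 * src[i] + 1;
  return 0;
}
```

The two loop bodies are identical here on purpose: the point is the guard, which selects between a version that may be transformed and an untouched fallback.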
2022-06-21Move compute_alias_check_pairs to tree-data-ref.cFrederik Harwath4-87/+103
Move this function from tree-loop-distribution.c to tree-data-ref.c and make it non-static to enable its use from other parts of GCC. gcc/ChangeLog: * tree-loop-distribution.cc (data_ref_segment_size): Remove function. (latch_dominated_by_data_ref): Likewise. (compute_alias_check_pairs): Likewise. * tree-data-ref.cc (data_ref_segment_size): New function, copied from tree-loop-distribution.c (compute_alias_check_pairs): Likewise. (latch_dominated_by_data_ref): Likewise. * tree-data-ref.h (compute_alias_check_pairs): New declaration.
2022-06-21Fix branch prediction dump messageFrederik Harwath2-1/+5
Instead of, for instance, "Loop got predicted 1 to iterate 10 times" the message should be "Loop 1 got predicted to iterate 10 times". gcc/ChangeLog: * predict.cc (pass_profile::execute): Fix dump message.
2022-06-21graphite: Fix minor mistakes in commentsFrederik Harwath3-3/+10
gcc/ChangeLog: * graphite-sese-to-poly.cc (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.cc (gsi_insert_earliest): Fix typo in comment.
2022-06-21graphite: Rename isl_id_for_ssa_nameFrederik Harwath2-10/+17
The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. gcc/ChangeLog: * graphite-sese-to-poly.cc (isl_id_for_ssa_name): Rename to ... (isl_id_for_parameter): ... this new function name. (build_scop_context): Adjust function use.