|
... instead of 'omp_is_reference' vs.
'lang_hooks.decls.omp_privatize_by_reference'.
gcc/
* omp-general.h (omp_is_reference): Rename to...
(omp_privatize_by_reference): ... this. Adjust all users...
* omp-general.c: ... here, ...
* gimplify.c: ... here, ...
* omp-expand.c: ... here, ...
* omp-low.c: ... here.
|
|
'device_num' and 'ancestor' are now parsed on target device constructs for C,
C++, and Fortran (see OpenMP specification 5.0, p. 170). When 'ancestor' is
used, a 'sorry, not supported' diagnostic is output. Moreover, the restrictions for
'ancestor' are implemented (see OpenMP specification 5.0, p. 174f).
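A minimal sketch of the newly parsed 'device_num' modifier, assuming a compiler
that contains this patch. The function name scale_on_device is hypothetical and
not part of the patch. Without -fopenmp the pragma is ignored and the loop runs
serially on the host.

```c
#include <assert.h>

/* Hypothetical helper (not from the patch): doubles each element of A
   in a target region that explicitly names device number 0.  */
void
scale_on_device (int *a, int n)
{
  assert (n >= 0);

  /* 'device_num' is one of the two device-modifiers the parsers now accept.
     The other form, device (ancestor : 1), is parsed as well, but according
     to the commit message it currently ends in 'sorry, not supported'.  */
  #pragma omp target device (device_num : 0) map (tofrom : a[0:n])
  for (int i = 0; i < n; i++)
    a[i] *= 2;
}
```

If no offload device is available, device number 0 should refer to the host,
so the result is the same either way.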
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_device): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_clause_device): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
* semantics.c (finish_omp_clauses): Error handling. Constant device ids must
evaluate to '1' if 'ancestor' is used.
gcc/fortran/ChangeLog:
* gfortran.h: Add variable for 'ancestor' in struct gfc_omp_clauses.
* openmp.c (gfc_match_omp_clauses): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
* trans-openmp.c (gfc_trans_omp_clauses): Set OMP_CLAUSE_DEVICE_ANCESTOR.
gcc/ChangeLog:
* gimplify.c (gimplify_scan_omp_clauses): Error handling. 'ancestor' only
allowed on target constructs and only with particular other clauses.
* omp-expand.c (expand_omp_target): Output of 'sorry, not supported' if
'ancestor' is used.
* omp-low.c (check_omp_nesting_restrictions): Error handling. No nested OpenMP
constructs when 'ancestor' is used.
(scan_omp_1_stmt): No usage of OpenMP runtime routines in a target region when
'ancestor' is used.
* tree-pretty-print.c (dump_omp_clause): Append 'ancestor'.
* tree.h (OMP_CLAUSE_DEVICE_ANCESTOR): Define macro.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-device-1.c: New test.
* c-c++-common/gomp/target-device-2.c: New test.
* c-c++-common/gomp/target-device-ancestor-1.c: New test.
* c-c++-common/gomp/target-device-ancestor-2.c: New test.
* c-c++-common/gomp/target-device-ancestor-3.c: New test.
* c-c++-common/gomp/target-device-ancestor-4.c: New test.
* gfortran.dg/gomp/target-device-1.f90: New test.
* gfortran.dg/gomp/target-device-2.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-1.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-2.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-3.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-4.f90: New test.
|
|
This patch implements the OpenMP 5.1 scope construct, which is similar
to worksharing constructs in many regards, but isn't one of them.
The body of the construct is encountered by all threads, though. It can
be nested in itself or intermixed with taskgroup, and worksharing etc.
constructs can appear inside of it (but it can't be nested in
worksharing etc. constructs). The main purpose of the construct
is to allow reductions (normal and task ones) without the need to
close the parallel and reopen another one.
If it doesn't have task reductions, it can be implemented without
any new library support: with nowait it just does the privatizations
at the start, if any, and the reductions before the end of the body;
without nowait it also emits a normal GOMP_barrier{,_cancel} at the end.
For task reductions, we need to ensure only one thread initializes
the task reduction library data structures and other threads copy from that,
so a new GOMP_scope_start routine is added to the library for that.
It acts as if the start of the scope construct is a nowait worksharing
construct (that is ok, it can't be nested in other worksharing
constructs and all threads need to encounter the start in the same
order) which does the task reduction initialization, but as the body
can have other scope constructs and/or worksharing constructs, that is
the only place where we use this dummy worksharing construct. With task reductions,
the construct must not have nowait and ends with a GOMP_barrier{,_cancel},
followed by task reductions followed by GOMP_workshare_task_reduction_unregister.
Only C/C++ FE support is done.
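As an illustration of the main use case described above (a reduction inside an
already open parallel region), here is a small sketch. The function name
sum_with_scope is hypothetical, and the example assumes a compiler with scope
support; without -fopenmp the pragmas are ignored and the code runs serially.

```c
#include <assert.h>

/* Hypothetical example (not from the patch): sums 0 .. n-1 inside a
   parallel region without closing and reopening the region.  */
long
sum_with_scope (int n)
{
  long sum = 0;

  assert (n >= 0);

  #pragma omp parallel shared (sum)
  {
    /* Every thread encounters the scope.  Each thread gets a private
       copy of SUM, and the copies are combined at the end of the body.  */
    #pragma omp scope reduction (+ : sum)
    {
      /* A worksharing construct may appear inside the scope.  */
      #pragma omp for nowait
      for (int i = 0; i < n; i++)
        sum += i;
    }
    /* No nowait on the scope, so a barrier ends it here.  */
  }
  return sum;
}
```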
2021-08-17 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.def (OMP_SCOPE): New tree code.
* tree.h (OMP_SCOPE_BODY, OMP_SCOPE_CLAUSES): Define.
* tree-nested.c (convert_nonlocal_reference_stmt,
convert_local_reference_stmt, convert_gimple_call): Handle
GIMPLE_OMP_SCOPE.
* tree-pretty-print.c (dump_generic_node): Handle OMP_SCOPE.
* gimple.def (GIMPLE_OMP_SCOPE): New gimple code.
* gimple.c (gimple_build_omp_scope): New function.
(gimple_copy): Handle GIMPLE_OMP_SCOPE.
* gimple.h (gimple_build_omp_scope): Declare.
(gimple_has_substatements): Handle GIMPLE_OMP_SCOPE.
(gimple_omp_scope_clauses, gimple_omp_scope_clauses_ptr,
gimple_omp_scope_set_clauses): New inline functions.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_SCOPE.
* gimple-pretty-print.c (dump_gimple_omp_scope): New function.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_SCOPE.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
(gimplify_scan_omp_clauses): For task reductions, handle OMP_SCOPE
like ORT_WORKSHARE constructs. Adjust diagnostics for %<scope%>
allowing task reductions. Reject inscan reductions on scope.
(omp_find_stores_stmt): Handle GIMPLE_OMP_SCOPE.
(gimplify_omp_workshare, gimplify_expr): Handle OMP_SCOPE.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_SCOPE.
(estimate_num_insns): Likewise.
* omp-low.c (build_outer_var_ref): Look through GIMPLE_OMP_SCOPE
contexts if var isn't privatized there.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_SCOPE.
(scan_omp_1_stmt): Likewise.
(maybe_add_implicit_barrier_cancel): Look through outer
scope constructs.
(lower_omp_scope): New function.
(lower_omp_task_reductions): Handle OMP_SCOPE.
(lower_omp_1): Handle GIMPLE_OMP_SCOPE.
(diagnose_sb_1, diagnose_sb_2): Likewise.
* omp-expand.c (expand_omp_single): Support also GIMPLE_OMP_SCOPE.
(expand_omp): Handle GIMPLE_OMP_SCOPE.
(omp_make_gimple_edges): Likewise.
* omp-builtins.def (BUILT_IN_GOMP_SCOPE_START): New built-in.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_SCOPE.
* c-pragma.c (omp_pragmas): Add scope construct.
* c-omp.c (omp_directives): Uncomment scope directive entry.
gcc/c/
* c-parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
(c_parser_omp_scope): New function.
(c_parser_omp_construct): Handle PRAGMA_OMP_SCOPE.
gcc/cp/
* parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
(cp_parser_omp_scope): New function.
(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_SCOPE.
* pt.c (tsubst_expr): Handle OMP_SCOPE.
gcc/testsuite/
* c-c++-common/gomp/nesting-2.c (foo): Add scope and masked
construct tests.
* c-c++-common/gomp/scan-1.c (f3): Add scope construct test.
* c-c++-common/gomp/cancel-1.c (f2): Add scope and masked
construct tests.
* c-c++-common/gomp/reduction-task-2.c (bar): Add scope construct
test. Adjust diagnostics for the addition of scope.
* c-c++-common/gomp/loop-1.c (f5): Add master, masked and scope
construct tests.
* c-c++-common/gomp/clause-dups-1.c (f1): Add scope construct test.
* gcc.dg/gomp/nesting-1.c (f1, f2, f3): Add scope construct tests.
* c-c++-common/gomp/scope-1.c: New test.
* c-c++-common/gomp/scope-2.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Add scope construct tests.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* gfortran.dg/gomp/reduction4.f90: Adjust expected diagnostics.
* gfortran.dg/gomp/reduction7.f90: Likewise.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add scope.c.
* Makefile.in: Regenerated.
* libgomp_g.h (GOMP_scope_start): Declare.
* libgomp.map: Add GOMP_scope_start@@GOMP_5.1.
* scope.c: New file.
* testsuite/libgomp.c-c++-common/scope-1.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-16.c: New test.
|
|
gcc/ChangeLog:
PR middle-end/101931
* omp-low.c (omp_runtime_api_call): Update for routines
added in the meantime.
|
|
This construct has been introduced as a replacement for the master
construct, but unlike that construct it is slightly more general: it
has an optional clause which allows choosing which thread
will be the one running the region. That can be some thread other
than the master (primary) thread with number 0, or it could be no
threads or multiple threads (then of course one needs to be careful
about data races).
It is way too early to deprecate the master construct, though: we don't
even have OpenMP 5.0 fully implemented. It has been deprecated in 5.1,
will also be deprecated in 5.2, and will be removed in 6.0. But even then it
will likely be a good idea to just -Wdeprecated warn about it and still accept it.
The patch also contains something I should have done much earlier:
for clauses that accept some integral expression where we only care
about the value, it forces that value during gimplification into
either a min invariant (as before), an SSA_NAME or a fresh temporary,
but never e.g. a user VAR_DECL, so that for those clauses we don't
need to worry about adjusting it.
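The filter clause described above can be sketched as follows. The function name
run_masked is hypothetical, and the example assumes a compiler with masked
support; without -fopenmp the pragmas are ignored and the statement runs once.

```c
#include <assert.h>

/* Hypothetical example (not from the patch): counts how many threads
   execute the masked region when FILTER_THREAD is selected.  */
int
run_masked (int filter_thread)
{
  int hits = 0;

  assert (filter_thread >= 0);

  #pragma omp parallel num_threads (4) shared (hits)
  {
    /* Only the thread whose number equals FILTER_THREAD runs the
       statement, so there is at most one writer and no data race.  */
    #pragma omp masked filter (filter_thread)
    hits++;
  }
  return hits;
}
```

With filter (0) the region behaves like the master construct. With any other
value the region is skipped entirely if the team has no thread with that
number, so callers should only rely on the count being at most 1.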
2021-08-12 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.def (OMP_MASKED): New tree code.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FILTER.
* tree.h (OMP_MASKED_BODY, OMP_MASKED_CLAUSES, OMP_MASKED_COMBINED,
OMP_CLAUSE_FILTER_EXPR): Define.
* tree.c (omp_clause_num_ops): Add OMP_CLAUSE_FILTER entry.
(omp_clause_code_name): Likewise.
(walk_tree_1): Handle OMP_CLAUSE_FILTER.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_FILTER.
(convert_nonlocal_reference_stmt, convert_local_reference_stmt,
convert_gimple_call): Handle GIMPLE_OMP_MASTER.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_FILTER.
(dump_generic_node): Handle OMP_MASTER.
* gimple.def (GIMPLE_OMP_MASKED): New gimple code.
* gimple.c (gimple_build_omp_masked): New function.
(gimple_copy): Handle GIMPLE_OMP_MASKED.
* gimple.h (gimple_build_omp_masked): Declare.
(gimple_has_substatements): Handle GIMPLE_OMP_MASKED.
(gimple_omp_masked_clauses, gimple_omp_masked_clauses_ptr,
gimple_omp_masked_set_clauses): New inline functions.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_MASKED.
* gimple-pretty-print.c (dump_gimple_omp_masked): New function.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_MASKED.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FILTER. For clauses
that take one expression rather than decl or constant, force
gimplification of that into a SSA_NAME or temporary unless min
invariant.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_FILTER.
(gimplify_expr): Handle OMP_MASKED.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_MASKED.
(estimate_num_insns): Likewise.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FILTER.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_MASKED. Adjust
diagnostics for existence of masked construct.
(scan_omp_1_stmt, lower_omp_master, lower_omp_1, diagnose_sb_1,
diagnose_sb_2): Handle GIMPLE_OMP_MASKED.
* omp-expand.c (expand_omp_synch, expand_omp, omp_make_gimple_edges):
Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_MASKED.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FILTER.
* c-pragma.c (omp_pragmas_simd): Add masked construct.
* c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_MASKED
enumerator.
(c_finish_omp_masked): Declare.
* c-omp.c (c_finish_omp_masked): New function.
(c_omp_split_clauses): Handle combined masked constructs.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Parse filter clause name.
(c_parser_omp_clause_filter): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(c_parser_omp_masked): New function.
(c_parser_omp_parallel): Handle parallel masked.
(c_parser_omp_construct): Handle PRAGMA_OMP_MASKED.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Parse filter clause name.
(cp_parser_omp_clause_filter): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(cp_parser_omp_masked): New function.
(cp_parser_omp_parallel): Handle parallel masked.
(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_MASKED.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
* pt.c (tsubst_omp_clauses): Likewise.
(tsubst_expr): Handle OMP_MASKED.
gcc/testsuite/
* c-c++-common/gomp/clauses-1.c (bar): Add tests for combined masked
constructs with clauses.
* c-c++-common/gomp/clauses-5.c (foo): Add testcase for filter clause.
* c-c++-common/gomp/clause-dups-1.c (f1): Likewise.
* c-c++-common/gomp/masked-1.c: New test.
* c-c++-common/gomp/masked-2.c: New test.
* c-c++-common/gomp/masked-combined-1.c: New test.
* c-c++-common/gomp/masked-combined-2.c: New test.
* c-c++-common/goacc/uninit-if-clause.c: Remove xfails.
* g++.dg/gomp/block-11.C: New test.
* g++.dg/gomp/tpl-masked-1.C: New test.
* g++.dg/gomp/attrs-1.C (bar): Add tests for masked construct and
combined masked constructs with clauses in attribute syntax.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* gcc.dg/gomp/nesting-1.c (f1, f2): Add tests for masked construct
nesting.
* gfortran.dg/goacc/host_data-tree.f95: Allow also SSA_NAMEs in if
clause.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/masked-1.c: New test.
|
|
gcc/
* config/nvptx/nvptx.c: Cross-reference parts adapted in
'gcc/omp-oacc-neuter-broadcast.cc'.
* omp-low.c: Likewise.
* omp-oacc-neuter-broadcast.cc: Cross-reference parts adapted from
the above files.
|
|
Do not "compile a version of this procedure for the host".
gcc/
* tree-core.h (omp_clause_code): Add 'OMP_CLAUSE_NOHOST'.
* tree.c (omp_clause_num_ops, omp_clause_code_name, walk_tree_1):
Handle it.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* omp-general.c (oacc_verify_routine_clauses): Likewise.
* gimplify.c (gimplify_scan_omp_clauses)
(gimplify_adjust_omp_clauses): Likewise.
* tree-nested.c (convert_nonlocal_omp_clauses)
(convert_local_omp_clauses): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
* omp-offload.c (execute_oacc_device_lower): Update.
gcc/c-family/
* c-pragma.h (pragma_omp_clause): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Handle 'nohost'.
(c_parser_oacc_all_clauses): Handle 'PRAGMA_OACC_CLAUSE_NOHOST'.
(OACC_ROUTINE_CLAUSE_MASK): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
* c-typeck.c (c_finish_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Handle 'nohost'.
(cp_parser_oacc_all_clauses): Handle 'PRAGMA_OACC_CLAUSE_NOHOST'.
(OACC_ROUTINE_CLAUSE_MASK): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
* pt.c (tsubst_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
* semantics.c (finish_omp_clauses): Likewise.
gcc/fortran/
* dump-parse-tree.c (show_attr): Update.
* gfortran.h (symbol_attribute): Add 'oacc_routine_nohost' member.
(gfc_omp_clauses): Add 'nohost' member.
* module.c (ab_attribute): Add 'AB_OACC_ROUTINE_NOHOST'.
(attr_bits, mio_symbol_attribute): Update.
* openmp.c (omp_mask2): Add 'OMP_CLAUSE_NOHOST'.
(gfc_match_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
(OACC_ROUTINE_CLAUSES): Add 'OMP_CLAUSE_NOHOST'.
(gfc_match_oacc_routine): Update.
* trans-decl.c (add_attributes_to_decl): Update.
* trans-openmp.c (gfc_trans_omp_clauses): Likewise.
gcc/testsuite/
* c-c++-common/goacc/classify-routine-nohost.c: New file.
* c-c++-common/goacc/classify-routine.c: Update.
* c-c++-common/goacc/routine-2.c: Likewise.
* c-c++-common/goacc/routine-nohost-1.c: New file.
* c-c++-common/goacc/routine-nohost-2.c: Likewise.
* g++.dg/goacc/template.C: Update.
* gfortran.dg/goacc/classify-routine-nohost.f95: New file.
* gfortran.dg/goacc/classify-routine.f95: Update.
* gfortran.dg/goacc/pure-elemental-procedures-2.f90: Likewise.
* gfortran.dg/goacc/routine-6.f90: Likewise.
* gfortran.dg/goacc/routine-intrinsic-2.f: Likewise.
* gfortran.dg/goacc/routine-module-1.f90: Likewise.
* gfortran.dg/goacc/routine-module-2.f90: Likewise.
* gfortran.dg/goacc/routine-module-3.f90: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-2.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: New
file.
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2_2.c:
Likewise.
* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise.
Co-Authored-By: Joseph Myers <joseph@codesourcery.com>
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
|
|
As the testcase shows, the special treatment of && and || reduction combiners,
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||),
is needed not just for &&/|| on floating point or complex types, but for all
&&/|| reductions: when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values that are multiples of 2 into 0 rather than 1.
This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.
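The difference between the two expansions can be modelled in plain C. This is
only a sketch: low_bit is an assumed stand-in for the truncating conversion to
bool described above, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed model of the problematic conversion: keep only the low bit,
   so even non-zero values become 0.  */
static int
low_bit (int v)
{
  return v & 1;
}

/* Combiner as it behaved before the fix, under the model above.  */
int
combine_and_truncating (int omp_out, int omp_in)
{
  return low_bit (omp_out) && low_bit (omp_in);
}

/* Combiner after the fix: compare against zero first, using bool.  */
int
combine_and_fixed (int omp_out, int omp_in)
{
  bool out_nonzero = omp_out != 0;
  bool in_nonzero = omp_in != 0;
  return out_nonzero && in_nonzero;
}

/* Folds V[0 .. n-1] with COMBINE, starting from the && initializer 1.  */
int
reduce_and (const int *v, int n, int (*combine) (int, int))
{
  int omp_out = 1;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    omp_out = combine (omp_out, v[i]);
  return omp_out;
}
```

For the all-non-zero input {2, 4, 6}, the fixed combiner yields 1 while the
truncating model yields 0.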
2021-07-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.
|
|
gcc/ChangeLog:
* builtins.c (warn_string_no_nul): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(c_strlen): Same.
(maybe_warn_for_bound): Same.
(warn_for_access): Same.
(check_access): Same.
(expand_builtin_strncmp): Same.
(fold_builtin_varargs): Same.
* calls.c (maybe_warn_nonstring_arg): Same.
(maybe_warn_rdwr_sizes): Same.
* cfgexpand.c (expand_call_stmt): Same.
* cgraphunit.c (check_global_declaration): Same.
* fold-const.c (fold_undefer_overflow_warnings): Same.
(fold_truth_not_expr): Same.
(fold_unary_loc): Same.
(fold_checksum_tree): Same.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Same.
(array_bounds_checker::check_mem_ref): Same.
(array_bounds_checker::check_addr_expr): Same.
(array_bounds_checker::check_array_bounds): Same.
* gimple-expr.c (copy_var_decl): Same.
* gimple-fold.c (gimple_fold_builtin_strcpy): Same.
(gimple_fold_builtin_strncat): Same.
(gimple_fold_builtin_stxcpy_chk): Same.
(gimple_fold_builtin_stpcpy): Same.
(gimple_fold_builtin_sprintf): Same.
(fold_stmt_1): Same.
* gimple-ssa-isolate-paths.c (diag_returned_locals): Same.
* gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
* gimple-ssa-sprintf.c (handle_printf_call): Same.
* gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Same.
* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Same.
* gimple-ssa-warn-restrict.h: Adjust declarations.
(maybe_diag_access_bounds): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(check_call): Same.
(check_bounds_or_overlap): Same.
* gimple.c (gimple_build_call_from_tree): Same.
* gimplify.c (gimplify_return_expr): Same.
(gimplify_cond_expr): Same.
(gimplify_modify_expr_complex_part): Same.
(gimplify_modify_expr): Same.
(gimple_push_cleanup): Same.
(gimplify_expr): Same.
* omp-expand.c (expand_omp_for_generic): Same.
(expand_omp_taskloop_for_outer): Same.
* omp-low.c (lower_rec_input_clauses): Same.
(lower_lastprivate_clauses): Same.
(lower_send_clauses): Same.
(lower_omp_target): Same.
* tree-cfg.c (pass_warn_function_return::execute): Same.
* tree-complex.c (create_one_component_var): Same.
* tree-inline.c (remap_gimple_op_r): Same.
(copy_tree_body_r): Same.
(declare_return_variable): Same.
(expand_call_inline): Same.
* tree-nested.c (lookup_field_for_decl): Same.
* tree-sra.c (create_access_replacement): Same.
(generate_subtree_copies): Same.
* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Same.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Same.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Same.
* tree-ssa-loop-im.c (execute_sm): Same.
* tree-ssa-phiopt.c (cond_store_replacement): Same.
* tree-ssa-strlen.c (maybe_warn_overflow): Same.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Same.
(handle_builtin_stxncpy_strncat): Same.
(handle_builtin_strcat): Same.
* tree-ssa-uninit.c (get_no_uninit_warning): Same.
(set_no_uninit_warning): Same.
(uninit_undefined_value_p): Same.
(warn_uninit): Same.
(maybe_warn_operand): Same.
* tree-vrp.c (compare_values_warnv): Same.
* vr-values.c (vr_values::extract_range_for_var_from_comparison_expr): Same.
(test_for_singularity): Same.
* gimple.h (warning_suppressed_p): New function.
(suppress_warning): Same.
(copy_no_warning): Same.
(gimple_set_block): Call gimple_set_location.
(gimple_set_location): Call copy_warning.
|
|
This patch adds support for in_reduction clause on target construct, though
for now only for synchronous targets (without nowait clause).
The encountering thread in that case runs the target task and blocks until
the target region ends, so it is implemented by remapping it before entering
the target, initializing the private copy if not yet initialized for the
current thread and then using the remapped addresses for the mapping
addresses.
For nowait combined with in_reduction the patch contains a hack where the
nowait clause is ignored. To implement it correctly, I think we would need
to create a new private variable for the in_reduction and initialize it before
doing the async target and adjust the map addresses to that private variable
and then pass a function pointer to the library routine with code where the callback
would remap the address to the current thread's private variable and use the in_reduction
combiner to combine the private variable we've created into the thread's copy.
The library would then need to make sure that the routine is called in some thread
participating in the parallel (and not in an unshackled thread).
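A small sketch of the supported synchronous case (no nowait clause). The
function name sum_via_target is hypothetical, and the example assumes a
compiler that contains this patch; without -fopenmp the pragmas are ignored.

```c
#include <assert.h>

/* Hypothetical example (not from the patch): a synchronous target
   region that takes part in the reduction of an enclosing taskgroup.  */
long
sum_via_target (const int *a, int n)
{
  long sum = 0;

  assert (n >= 0);

  #pragma omp taskgroup task_reduction (+ : sum)
  {
    /* No nowait here: the encountering thread runs the target task
       and blocks until the target region ends.  */
    #pragma omp target in_reduction (+ : sum) map (to : a[0:n])
    for (int i = 0; i < n; i++)
      sum += a[i];
  }
  return sum;
}
```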
2021-06-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): Document meaning for OpenMP.
* gimplify.c (gimplify_scan_omp_clauses): For OpenMP map clauses
with OMP_CLAUSE_MAP_IN_REDUCTION flag partially defer gimplification
of non-decl OMP_CLAUSE_DECL. For OMP_CLAUSE_IN_REDUCTION on
OMP_TARGET use outer_ctx instead of ctx for placeholders and
initializer/combiner gimplification.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_MAP_IN_REDUCTION
on target constructs.
(lower_rec_input_clauses): Likewise.
(lower_omp_target): Likewise.
* omp-expand.c (expand_omp_target): Temporarily ignore nowait clause
on target if in_reduction is present.
gcc/c-family/
* c-common.h (enum c_omp_region_type): Add C_ORT_TARGET and
C_ORT_OMP_TARGET.
* c-omp.c (c_omp_split_clauses): For OMP_CLAUSE_IN_REDUCTION on
combined target constructs also add map (always, tofrom:) clause.
gcc/c/
* c-parser.c (omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(c_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to c_finish_omp_clauses.
* c-typeck.c (handle_omp_array_sections): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(c_finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
gcc/cp/
* parser.c (cp_omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(cp_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to finish_omp_clauses.
* semantics.c (handle_omp_array_sections_1): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
* pt.c (tsubst_expr): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for
clauses on target construct.
gcc/testsuite/
* c-c++-common/gomp/target-in-reduction-1.c: New test.
* c-c++-common/gomp/clauses-1.c: Add in_reduction clauses on
target or combined target constructs.
libgomp/
* testsuite/libgomp.c-c++-common/target-in-reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/target-in-reduction-2.c: New test.
* testsuite/libgomp.c++/target-in-reduction-1.C: New test.
* testsuite/libgomp.c++/target-in-reduction-2.C: New test.
|
|
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.
2021-06-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101167
* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
|
|
Move the OpenACC enter and exit data directives from using a single builtin to
having one each. For most purposes it was easy to tell which was which, from
the clauses given, but it's overhead we can easily avoid, and there may be
future uses where that isn't possible.
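For orientation, the two directives in question look like this in user code.
The function name scale_with_data_region is hypothetical; per the ChangeLog
below, the patch gives each directive its own built-in and library entry point.
Without -fopenacc the pragmas are ignored and the loop runs on the host.

```c
#include <assert.h>

/* Hypothetical example (not from the patch): an unstructured data
   region around a compute construct.  */
void
scale_with_data_region (float *x, int n, float f)
{
  assert (n >= 0);

  /* 'enter data' directive: copy X to the device.  */
  #pragma acc enter data copyin (x[0:n])

  #pragma acc parallel loop present (x[0:n])
  for (int i = 0; i < n; i++)
    x[i] *= f;

  /* 'exit data' directive: copy X back and release the device copy.  */
  #pragma acc exit data copyout (x[0:n])
}
```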
gcc/
* omp-builtins.def (BUILT_IN_GOACC_ENTER_EXIT_DATA): Split into...
(BUILT_IN_GOACC_ENTER_DATA, BUILT_IN_GOACC_EXIT_DATA): ... these.
* gimple.h (enum gf_mask): Split
'GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA' into
'GF_OMP_TARGET_KIND_OACC_ENTER_DATA' and
'GF_OMP_TARGET_KIND_OACC_EXIT_DATA'.
(is_gimple_omp_oacc): Update.
* gimple-pretty-print.c (dump_gimple_omp_target): Likewise.
* gimplify.c (gimplify_omp_target_update): Likewise.
* omp-expand.c (expand_omp_target, build_omp_regions_1)
(omp_make_gimple_edges): Likewise.
* omp-low.c (check_omp_nesting_restrictions, lower_omp_target):
Likewise.
gcc/testsuite/
* c-c++-common/goacc-gomp/nesting-fail-1.c: Adjust patterns.
* c-c++-common/goacc/finalize-1.c: Likewise.
* c-c++-common/goacc/mdc-1.c: Likewise.
* c-c++-common/goacc/nesting-fail-1.c: Likewise.
* c-c++-common/goacc/struct-enter-exit-data-1.c: Likewise.
* gfortran.dg/goacc/attach-descriptor.f90: Likewise.
* gfortran.dg/goacc/finalize-1.f: Likewise.
* gfortran.dg/goacc/mapping-tests-3.f90: Likewise.
libgomp/
* libgomp.map (GOACC_2.0.2): New symbol version.
* libgomp_g.h (GOACC_enter_data, GOACC_exit_data): New prototypes.
* oacc-mem.c (GOACC_enter_data, GOACC_exit_data): New functions.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
|
|
OpenMP Nesting of Regions restrictions say:
- If a target update, target data, target enter data, or target exit data
construct is encountered during execution of a target region, the behavior is unspecified.
- If a target construct is encountered during execution of a target region and a device
clause in which the ancestor device-modifier appears is not present on the construct, the
behavior is unspecified.
That wording is about the dynamic (runtime) behavior, not about lexical nesting,
so while it is UB if omp target * is encountered in the target region, we need to make
it compile and link (for lexical nesting of target * inside of target we actually
emit a warning).
To make this work, I had to do multiple changes.
One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target".
Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a.
The entrypoint functions shouldn't be called or passed in the offload regions,
otherwise
libgomp: cuLaunchKernel error: too many resources requested for launch
was reported; fixed by changing those arguments of calls to GOMP_target_ext
to NULL.
And we didn't mark the entrypoints "omp target entrypoint" when the caller
has been "omp declare target".
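The kind of code this is about can be sketched as follows. All names are
hypothetical. The function is marked for offloading and contains a target
construct that is not lexically nested in a target region, so it has to compile
and link even though encountering it on the device would be unspecified.

```c
#include <assert.h>

#pragma omp declare target
int counter;

/* Hypothetical example (not from the patch): a declare target function
   that contains a 'target update' construct.  Calling it from the host
   is fine; reaching the construct inside a target region is not.  */
void
sync_counter (int v)
{
  counter = v;
  #pragma omp target update to (counter)
}
#pragma omp end declare target

/* Host-only wrapper: stores V, pushes it to the device, reads it back.  */
int
publish_and_read (int v)
{
  sync_counter (v);
  assert (counter == v);
  return counter;
}
```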
2021-05-26 Jakub Jelinek <jakub@redhat.com>
PR libgomp/100573
gcc/
* omp-low.c: Include omp-offload.h.
(create_omp_child_function): If current_function_decl has
"omp declare target" attribute and is_gimple_omp_offloaded,
remove that attribute from the copy of attribute list and
add "omp target entrypoint" attribute instead.
(lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.*
variables for offloading if in omp_maybe_offloaded_ctx.
* omp-offload.c (pass_omp_target_link::execute): Nullify second
argument to GOMP_target_data_ext in offloaded code.
libgomp/
* config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): New dummy entrypoints.
* config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): Likewise.
* testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS,
OMPFROM, OMPTO): Define.
(main): Remove #pragma omp target teams around all the tests.
* testsuite/libgomp.c-c++-common/target-41.c: New test.
* testsuite/libgomp.c-c++-common/target-42.c: New test.
|
|
gcc/
PR middle-end/90115
* omp-low.c (oacc_privatization_candidate_p): Reject 'static',
'external' in blocks.
gcc/testsuite/
PR middle-end/90115
* c-c++-common/goacc/privatization-1-compute-loop.c: Update.
* c-c++-common/goacc/privatization-1-compute.c: Likewise.
* c-c++-common/goacc/privatization-1-routine_gang-loop.c:
Likewise.
* c-c++-common/goacc/privatization-1-routine_gang.c: Likewise.
libgomp/
PR middle-end/90115
* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Update.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
|
|
testsuite coverage [PR90115]
gcc/
PR middle-end/90115
* flag-types.h (enum openacc_privatization): New.
* params.opt (-param=openacc-privatization): New.
* doc/invoke.texi (openacc-privatization): Document it.
* omp-general.h (get_openacc_privatization_dump_flags): New
function.
* omp-low.c (oacc_privatization_candidate_p): Add diagnostics.
* omp-offload.c (execute_oacc_device_lower)
<IFN_UNIQUE_OACC_PRIVATE>: Re-work diagnostics.
* target.def (goacc.adjust_private_decl): Add 'location_t'
parameter.
* doc/tm.texi: Regenerate.
* config/gcn/gcn-protos.h (gcn_goacc_adjust_private_decl): Adjust.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_private_decl): Likewise.
* config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl):
Likewise. Preserve it for...
(nvptx_goacc_expand_var_decl): ... use here.
gcc/testsuite/
PR middle-end/90115
* c-c++-common/goacc/privatization-1-compute-loop.c: New file.
* c-c++-common/goacc/privatization-1-compute.c: Likewise.
* c-c++-common/goacc/privatization-1-routine_gang-loop.c:
Likewise.
* c-c++-common/goacc/privatization-1-routine_gang.c: Likewise.
* gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise.
* gfortran.dg/goacc/privatization-1-compute.f90: Likewise.
* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90:
Likewise.
* gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise.
* c-c++-common/goacc-gomp/nesting-1.c: Update.
* c-c++-common/goacc/private-reduction-1.c: Likewise.
* gfortran.dg/goacc/private-3.f95: Likewise.
libgomp/
PR middle-end/90115
* testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: New
file.
* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-variables.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c:
Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/declare-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
|
|
[PR90115]
gcc/
PR middle-end/90115
* omp-low.c (oacc_privatization_candidate_p): New function.
(oacc_privatization_scan_clause_chain)
(oacc_privatization_scan_decl_chain): Use it. Also
'gcc_checking_assert' that we're not seeing duplicates.
|
|
gcc/
PR middle-end/90115
* omp-low.c (lower_omp_for): Don't evaluate OpenMP 'for' clauses.
|
|
[PR90115]
This patch implements a method to track the "private-ness" of
OpenACC variables declared in offload regions in gang-partitioned,
worker-partitioned or vector-partitioned modes. Variables declared
implicitly in scoped blocks and those declared "private" on enclosing
directives (e.g. "acc parallel") are both handled. Variables that are
e.g. gang-private can then be adjusted so they reside in GPU shared
memory.
The reason for doing this is twofold: correct implementation of OpenACC
semantics, and optimisation, since shared memory might be faster than
the main memory on a GPU. Handling of private variables is intimately
tied to the execution model for gangs/workers/vectors implemented by
a particular target: for current targets, we use (or on mainline, will
soon use) a broadcasting/neutering scheme.
That is sufficient for code that e.g. sets a variable in worker-single
mode and expects to use the value in worker-partitioned mode. The
difficulty (semantics-wise) comes when the user wants to do something like
an atomic operation in worker-partitioned mode and expects a worker-single
(gang private) variable to be shared across each partitioned worker.
Forcing use of shared memory for such variables makes that work properly.
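As an illustration of the case just described, here is a minimal sketch
(the function name and the constant N are hypothetical and are not taken
from the patch's test cases). A variable declared inside the compute region
is gang-private, and a worker-partitioned loop updates it atomically, so all
workers of the gang need to see the same copy:

```c
/* Illustrative sketch only: count_in_worker_loop and N are hypothetical
   names.  Without -fopenacc the pragmas are ignored and the code simply
   runs serially on the host.  */
#define N 32

int
count_in_worker_loop (void)
{
  int result = 0;

#pragma acc parallel num_gangs(1) num_workers(N) copyout(result)
  {
    /* Declared inside the compute region: one copy per gang
       (gang-private), expected to be shared by that gang's workers.  */
    int w = 0;

#pragma acc loop worker
    for (int i = 0; i < N; i++)
      {
        /* Each worker increments the same gang-private variable.  */
#pragma acc atomic update
        w++;
      }

    result = w;
  }

  return result;
}
```

If each worker had its own copy of w, the increments would not accumulate;
placing w in memory shared by the gang is what lets the count come out as N.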
In terms of implementation, the parallelism level of a given loop is
not fixed until the oaccdevlow pass in the offload compiler, so the
patch delays fixing the parallelism level of variables declared on or
within such loops until the same point. This is done by adding a new
internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
private variable as an argument, and other arguments set so as to be able
to determine the correct parallelism level to use for the listed
variables. This new internal function fits into the existing scheme for
demarcating OpenACC loops, as described in comments in the patch.
Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
TARGET_GOACC_EXPAND_VAR_DECL. The first can tweak a variable declaration
at oaccdevlow time, and the second at expand time. The first or both
of these target hooks can be used by a given offload target, depending
on its strategy for implementing private variables.
This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in
the AMD GCN backend to the current name and prototype. (An earlier
version of the hook was already present, but dormant.)
gcc/
PR middle-end/90115
* doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL)
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks.
* doc/tm.texi: Regenerate.
* expr.c (expand_expr_real_1): Expand decls using the
expand_var_decl OpenACC hook if defined.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Add oacc_privatization_candidates
field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert
before fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify
private marker's gimple call arguments, and pass it to
lower_oacc_reductions.
(oacc_privatization_scan_clause_chain)
(oacc_privatization_scan_decl_chain, lower_oacc_private_marker):
New functions.
(lower_omp_for, lower_omp_target, lower_omp_1): Use these.
* omp-offload.c (convert.h): Include.
(oacc_loop_xform_head_tail): Treat private-variable markers like
fork/join when transforming head/tail sequences.
(struct var_decl_rewrite_info): Add struct.
(oacc_rewrite_var_decl, is_sync_builtin_call): New functions.
(execute_oacc_device_lower): Support rewriting gang-private
variables using target hook, and fix up addr_expr and var_decl
nodes afterwards.
* target.def (adjust_private_decl, expand_var_decl): New hooks.
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl):
Rename to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl):
Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename
definition using gcn_goacc_adjust_gangprivate_decl...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using
gcn_goacc_adjust_private_decl.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(gang_private_shared_size): New global variable.
(gang_private_shared_align): Likewise.
(gang_private_shared_sym): Likewise.
(gang_private_shared_hmap): Likewise.
(nvptx_option_override): Initialize these.
(nvptx_file_end): Output gang_private_shared_sym.
(nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl):
New functions.
(nvptx_set_current_function): Clear gang_private_shared_hmap.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook.
(TARGET_GOACC_EXPAND_VAR_DECL): Likewise.
libgomp/
PR middle-end/90115
* testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: New
test.
* testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90:
Likewise.
Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
|
|
No overall change in behavior.
gcc/
* gimple.h (is_gimple_omp_oacc): Tighten.
* omp-low.c (check_omp_nesting_restrictions): Adjust.
|
|
gcc/ChangeLog:
* omp-low.c (finish_taskreg_scan): Use the proper detach decl.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/task-detach-12.c: New test.
* testsuite/libgomp.fortran/task-detach-12.f90: New test.
|
|
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early
return, doesn't create any tasks and more importantly, doesn't create
a taskgroup and doesn't register task reductions. But, the code emitted
in the callers assumes task reductions have been registered and performs
the reduction handling and task reduction unregistration. The pointer
to the task reduction private variables is reused, on input it is the alignment
and only on output it is the pointer, so in the case of a taskloop with no iterations
the caller attempts to dereference the alignment value as if it was a pointer
and crashes. We could in the early returns register the task reductions
only to have them looped over and unregistered in the caller, but I think
it is better to tell the caller there is nothing to task reduce and bypass
all that.
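A minimal sketch of the kind of user code affected (the function name
sum_first_n is hypothetical, and this is not the test case added by the
patch): a taskloop with a reduction whose iteration count can be zero.

```c
/* Illustrative sketch only.  With n == 0 the taskloop has no iterations,
   which is the situation described above.  Without -fopenmp the pragmas
   are ignored and the loop runs serially.  */
long
sum_first_n (const int *a, int n)
{
  long sum = 0;

#pragma omp parallel
#pragma omp single
#pragma omp taskloop reduction(+:sum)
  for (int i = 0; i < n; i++)
    sum += a[i];

  return sum;
}
```

Calling sum_first_n with n equal to 0 should simply return 0.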
2021-05-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100471
* omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data
is 0, bypass the reduction loop including
GOMP_taskgroup_reduction_unregister call.
* taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not
GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task
reduction pointer.
* testsuite/libgomp.c/task-reduction-4.c: New test.
|
|
gcc/ada/ChangeLog:
* gcc-interface/utils.c (def_builtin_1): Use startswith
function instead of strncmp.
gcc/analyzer/ChangeLog:
* sm-file.cc (is_file_using_fn_p): Use startswith
function instead of strncmp.
gcc/ChangeLog:
* builtins.c (is_builtin_name): Use startswith
function instead of strncmp.
* collect2.c (main): Likewise.
(has_lto_section): Likewise.
(scan_libraries): Likewise.
* coverage.c (coverage_checksum_string): Likewise.
(coverage_init): Likewise.
* dwarf2out.c (is_cxx): Likewise.
(gen_compile_unit_die): Likewise.
* gcc-ar.c (main): Likewise.
* gcc.c (init_spec): Likewise.
(read_specs): Likewise.
(execute): Likewise.
(check_live_switch): Likewise.
* genattrtab.c (write_attr_case): Likewise.
(IS_ATTR_GROUP): Likewise.
* gencfn-macros.c (main): Likewise.
* gengtype.c (type_for_name): Likewise.
(gen_rtx_next): Likewise.
(get_file_langdir): Likewise.
(write_local): Likewise.
* genmatch.c (get_operator): Likewise.
(get_operand_type): Likewise.
(expr::gen_transform): Likewise.
* genoutput.c (validate_optab_operands): Likewise.
* incpath.c (add_sysroot_to_chain): Likewise.
* langhooks.c (lang_GNU_C): Likewise.
(lang_GNU_CXX): Likewise.
(lang_GNU_Fortran): Likewise.
(lang_GNU_OBJC): Likewise.
* lto-wrapper.c (run_gcc): Likewise.
* omp-general.c (omp_max_simt_vf): Likewise.
* omp-low.c (omp_runtime_api_call): Likewise.
* opts-common.c (parse_options_from_collect_gcc_options): Likewise.
* read-rtl-function.c (function_reader::read_rtx_operand_r): Likewise.
* real.c (real_from_string): Likewise.
* selftest.c (assert_str_startswith): Likewise.
* timevar.c (timer::validate_phases): Likewise.
* tree.c (get_file_function_name): Likewise.
* ubsan.c (ubsan_use_new_style_p): Likewise.
* varasm.c (default_function_rodata_section): Likewise.
(incorporeal_function_p): Likewise.
(default_section_type_flags): Likewise.
* system.h (startswith): Define startswith.
gcc/c-family/ChangeLog:
* c-ada-spec.c (print_destructor): Use startswith
function instead of strncmp.
(dump_ada_declaration): Likewise.
* c-common.c (disable_builtin_function): Likewise.
(def_builtin_1): Likewise.
* c-format.c (check_tokens): Likewise.
(check_plain): Likewise.
(convert_format_name_to_system_name): Likewise.
gcc/c/ChangeLog:
* c-aux-info.c (affix_data_type): Use startswith
function instead of strncmp.
* c-typeck.c (build_function_call_vec): Likewise.
* gimple-parser.c (c_parser_gimple_parse_bb_spec): Likewise.
gcc/cp/ChangeLog:
* decl.c (duplicate_decls): Use startswith
function instead of strncmp.
(cxx_builtin_function): Likewise.
(omp_declare_variant_finalize_one): Likewise.
(grokfndecl): Likewise.
* error.c (dump_decl_name): Likewise.
* mangle.c (find_decomp_unqualified_name): Likewise.
(write_guarded_var_name): Likewise.
(decl_tls_wrapper_p): Likewise.
* parser.c (cp_parser_simple_type_specifier): Likewise.
(cp_parser_tx_qualifier_opt): Likewise.
* pt.c (template_parm_object_p): Likewise.
(dguide_name_p): Likewise.
gcc/d/ChangeLog:
* d-builtins.cc (do_build_builtin_fn): Use startswith
function instead of strncmp.
* dmd/dinterpret.c (evaluateIfBuiltin): Likewise.
* dmd/dmangle.c: Likewise.
* dmd/hdrgen.c: Likewise.
* dmd/identifier.c (Identifier::toHChars2): Likewise.
gcc/fortran/ChangeLog:
* decl.c (variable_decl): Use startswith
function instead of strncmp.
(gfc_match_end): Likewise.
* gfortran.h (gfc_str_startswith): Likewise.
* module.c (load_omp_udrs): Likewise.
(read_module): Likewise.
* options.c (gfc_handle_runtime_check_option): Likewise.
* primary.c (match_arg_list_function): Likewise.
* trans-decl.c (gfc_get_symbol_decl): Likewise.
* trans-expr.c (gfc_conv_procedure_call): Likewise.
* trans-intrinsic.c (gfc_conv_ieee_arithmetic_function): Likewise.
gcc/go/ChangeLog:
* gofrontend/runtime.cc (Runtime::name_to_code): Use startswith
function instead of strncmp.
gcc/objc/ChangeLog:
* objc-act.c (objc_string_ref_type_p): Use startswith
function instead of strncmp.
* objc-encoding.c (encode_type): Likewise.
* objc-next-runtime-abi-02.c (has_load_impl): Likewise.
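The replacement pattern in the entries above can be sketched as follows.
This assumes the helper is equivalent to a strncmp over the length of the
prefix; the actual definition in system.h may differ in detail, and the two
is_gomp_name_* functions are hypothetical examples, not GCC code.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed shape of the helper: true if STR begins with PREFIX.  */
static inline bool
startswith (const char *str, const char *prefix)
{
  return strncmp (str, prefix, strlen (prefix)) == 0;
}

/* Before: the prefix length is repeated by hand and can drift out of
   sync with the string literal.  */
bool
is_gomp_name_old (const char *name)
{
  return strncmp (name, "GOMP_", 5) == 0;
}

/* After: the length is derived from the prefix itself.  */
bool
is_gomp_name_new (const char *name)
{
  return startswith (name, "GOMP_");
}
```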
|
|
2021-05-07 Tobias Burnus <tobias@codesourcery.com>
Tom de Vries <tdevries@suse.de>
gcc/ChangeLog:
* omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if
a truth_value_p reduction variable is nonintegral.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing
complex/floating-point || + && reduction with 'omp target'.
* testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.
|
|
C/C++ permit logical AND and logical OR also with floating-point or complex
arguments by comparing each operand for inequality with zero; the result is
an 'int' with value one or zero. Hence, those are also permitted as reduction
variables, even though it is not the most sensible thing to do.
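A minimal sketch of such a reduction (the function name all_nonzero is
hypothetical and this is not one of the new test cases): a 'double'
reduction variable combined with '&&'.

```c
/* Illustrative sketch only.  The reduction variable is a double; each
   'all && v[i]' yields the int 0 or 1, which is converted back to double
   on assignment.  Without -fopenmp the pragma is ignored.  */
int
all_nonzero (const double *v, int n)
{
  double all = 1.0;

#pragma omp parallel for reduction(&&:all)
  for (int i = 0; i < n; i++)
    all = all && v[i];

  return all != 0.0;
}
```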
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Accept float + complex
for || and && reductions.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_reduction_clause): Accept float + complex
for || and && reductions.
gcc/ChangeLog:
* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
&& and || with floating-point and complex arguments.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
|
|
The test-case included in this patch contains this target region:
...
for (int i0 = 0 ; i0 < N0 ; i0++ )
counter_N0.i += 1;
...
When running with nvptx accelerator, the counter variable is expected to
be N0 after the region, but instead is N0 / 32. The problem is that rather
than getting the result for all warp lanes, we get it for just one lane.
This is caused by the implementation of SIMT being incomplete. It handles
regular reductions, but apparently not user-defined reductions.
For now, handle this by disabling SIMT in this case, specifically by setting
sctx->max_vf to 1.
Tested libgomp on x86_64-linux with nvptx accelerator.
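For context, a sketch of what a user-defined reduction around such a loop
can look like. The struct name, the function name and the exact directive
and clauses are assumptions made for illustration; the source only shows
the loop itself.

```c
/* Illustrative sketch only; not the test case from the patch.  Without
   -fopenmp the pragmas are ignored and the loop runs serially.  */
#define N0 768

struct counter { int i; };

/* User-defined '+' reduction for struct counter: combine the partial
   results by adding the members.  */
#pragma omp declare reduction (+ : struct counter : omp_out.i += omp_in.i)

int
count_iterations (void)
{
  struct counter counter_N0 = { 0 };

#pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
  for (int i0 = 0; i0 < N0; i0++)
    counter_N0.i += 1;

  /* Expected to equal N0 once all partial results are combined.  */
  return counter_N0.i;
}
```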
gcc/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined
reduction.
libgomp/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* testsuite/libgomp.c/target-44.c: New test.
|
|
The OpenMP standard says:
"A teams region can only be strictly nested within the implicit parallel region
or a target region. If a teams construct is nested within a target construct,
that target construct must contain no statements, declarations or directives
outside of the teams construct."
We weren't diagnosing that restriction, because we need to allow e.g.
#pragma omp target
{{{{{{
#pragma omp teams
;
}}}}}}
and as target doesn't need to have teams nested in it, using some special
parser of the target body didn't feel right. And after the parsing,
the question is if e.g. already parsing of the clauses doesn't add some
statements before the teams statement (gimplification certainly will).
As we now have a bug report where we ICE on the invalid code, this just
diagnoses a subset of the invalid programs, in particular those where,
next to the teams construct strictly nested in the target, the target
region contains some other OpenMP construct.
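A sketch contrasting the accepted form with the newly diagnosed one (the
function name teams_only is hypothetical; the invalid variant is kept in a
comment so that the example still compiles):

```c
/* Illustrative sketch only.  Without -fopenmp the pragmas are ignored.  */
int
teams_only (void)
{
  int n = 0;

  /* Accepted: the teams construct is the only thing strictly nested in
     the target region, even with extra braces around it.  */
#pragma omp target map(tofrom: n)
  {
#pragma omp teams num_teams(1)
    n = 1;
  }

  /* Diagnosed subset: another OpenMP construct sits next to the teams
     construct in the same target region, e.g.

       #pragma omp target
       {
         #pragma omp parallel
         ;
         #pragma omp teams
         ;
       }
  */

  return n;
}
```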
2021-02-24 Jakub Jelinek <jakub@redhat.com>
PR fortran/99226
* omp-low.c (struct omp_context): Add teams_nested_p and
nonteams_nested_p members.
(scan_omp_target): Diagnose teams nested inside of target with other
directives strictly nested inside of the same target.
(check_omp_nesting_restrictions): Set ctx->teams_nested_p or
ctx->nonteams_nested_p as needed.
* c-c++-common/gomp/pr99226.c: New test.
* gfortran.dg/gomp/pr99226.f90: New test.
|
|
gcc/fortran/ChangeLog:
PR fortran/98476
* openmp.c (resolve_omp_clauses): Change use_device_ptr
to use_device_addr unless type(c_ptr); check all
list items for is_device_ptr.
gcc/ChangeLog:
PR fortran/98476
* omp-low.c (lower_omp_target): Handle nonpointer is_device_ptr.
libgomp/ChangeLog:
PR fortran/98476
* testsuite/libgomp.fortran/is_device_ptr-1.f90: New test.
gcc/testsuite/ChangeLog:
PR fortran/98476
* gfortran.dg/gomp/map-3.f90: Update expected scan-dump-tree.
* gfortran.dg/gomp/is_device_ptr-2.f90: New test.
* gfortran.dg/gomp/use_device_ptr-1.f90: New test.
|
|
2021-01-16 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* builtin-types.def
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename
to...
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR):
...this. Add extra argument.
* gimplify.c (omp_default_clause): Ensure that event handle is
firstprivate in a task region.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_DETACH.
(gimplify_adjust_omp_clauses): Likewise.
* omp-builtins.def (BUILT_IN_GOMP_TASK): Change function type to
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR.
* omp-expand.c (expand_task_call): Add GOMP_TASK_FLAG_DETACH to flags
if detach clause specified. Add detach argument when generating
call to GOMP_task.
* omp-low.c (scan_sharing_clauses): Setup data environment for detach
clause.
(finish_taskreg_scan): Move field for variable containing the event
handle to the front of the struct.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DETACH. Fix
ordering.
* tree-nested.c (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE_DETACH clause.
(convert_local_omp_clauses): Handle OMP_CLAUSE_DETACH clause.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_DETACH.
* tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_DETACH.
Fix ordering.
(omp_clause_code_name): Add entry for OMP_CLAUSE_DETACH. Fix
ordering.
(walk_tree_1): Handle OMP_CLAUSE_DETACH.
gcc/c-family/
* c-pragma.h (pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DETACH.
Redefine PRAGMA_OACC_CLAUSE_DETACH.
gcc/c/
* c-parser.c (c_parser_omp_clause_detach): New.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH clause.
(OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
* c-typeck.c (c_finish_omp_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH
clause. Prevent use of detach with mergeable and overriding the
data sharing mode of the event handle.
gcc/cp/
* parser.c (cp_parser_omp_clause_detach): New.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH.
(OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_DETACH clause.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_DETACH clause.
Prevent use of detach with mergeable and overriding the data sharing
mode of the event handle.
gcc/fortran/
* dump-parse-tree.c (show_omp_clauses): Handle detach clause.
* frontend-passes.c (gfc_code_walker): Walk detach expression.
* gfortran.h (struct gfc_omp_clauses): Add detach field.
(gfc_c_intptr_kind): New.
* openmp.c (gfc_free_omp_clauses): Free detach clause.
(gfc_match_omp_detach): New.
(enum omp_mask1): Add OMP_CLAUSE_DETACH.
(enum omp_mask2): Remove OMP_CLAUSE_DETACH.
(gfc_match_omp_clauses): Handle OMP_CLAUSE_DETACH for OpenMP.
(OMP_TASK_CLAUSES): Add OMP_CLAUSE_DETACH.
(resolve_omp_clauses): Prevent use of detach with mergeable and
overriding the data sharing mode of the event handle.
* trans-openmp.c (gfc_trans_omp_clauses): Handle detach clause.
* trans-types.c (gfc_c_intptr_kind): New.
(gfc_init_kinds): Initialize gfc_c_intptr_kind.
* types.def
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename
to...
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR):
...this. Add extra argument.
gcc/testsuite/
* c-c++-common/gomp/task-detach-1.c: New.
* g++.dg/gomp/task-detach-1.C: New.
* gcc.dg/gomp/task-detach-1.c: New.
* gfortran.dg/gomp/task-detach-1.f90: New.
include/
* gomp-constants.h (GOMP_TASK_FLAG_DETACH): New.
libgomp/
* fortran.c (omp_fulfill_event_): New.
* libgomp.h (struct gomp_task): Add detach and completion_sem fields.
(struct gomp_team): Add task_detach_queue and task_detach_count
fields.
* libgomp.map (OMP_5.0.1): Add omp_fulfill_event and omp_fulfill_event_.
* libgomp_g.h (GOMP_task): Add extra argument.
* omp.h.in (enum omp_event_handle_t): New.
(omp_fulfill_event): New.
* omp_lib.f90.in (omp_event_handle_kind): New.
(omp_fulfill_event): New.
* omp_lib.h.in (omp_event_handle_kind): New.
(omp_fulfill_event): Declare.
* priority_queue.c (priority_tree_find): New.
(priority_list_find): New.
(priority_queue_find): New.
* priority_queue.h (priority_queue_predicate): New.
(priority_queue_find): New.
* task.c (gomp_init_task): Initialize detach field.
(task_fulfilled_p): New.
(GOMP_task): Add detach argument. Ignore detach argument if
GOMP_TASK_FLAG_DETACH not set in flags. Initialize completion_sem
field. Copy address of completion_sem into detach argument and
into the start of the data record. Wait for detach event if task
not deferred.
(gomp_barrier_handle_tasks): Queue tasks with unfulfilled events.
Remove completed tasks and requeue dependent tasks.
(omp_fulfill_event): New.
* team.c (gomp_new_team): Initialize task_detach_queue and
task_detach_count fields.
(free_team): Free task_detach_queue field.
* testsuite/libgomp.c-c++-common/task-detach-1.c: New testcase.
* testsuite/libgomp.c-c++-common/task-detach-2.c: New testcase.
* testsuite/libgomp.c-c++-common/task-detach-3.c: New testcase.
* testsuite/libgomp.c-c++-common/task-detach-4.c: New testcase.
* testsuite/libgomp.c-c++-common/task-detach-5.c: New testcase.
* testsuite/libgomp.c-c++-common/task-detach-6.c: New testcase.
* testsuite/libgomp.fortran/task-detach-1.f90: New testcase.
* testsuite/libgomp.fortran/task-detach-2.f90: New testcase.
* testsuite/libgomp.fortran/task-detach-3.f90: New testcase.
* testsuite/libgomp.fortran/task-detach-4.f90: New testcase.
* testsuite/libgomp.fortran/task-detach-5.f90: New testcase.
* testsuite/libgomp.fortran/task-detach-6.f90: New testcase.
|
|
|
|
While the data regions (target data and OpenACC counterparts) aren't
standalone directives, unlike most other OpenMP/OpenACC constructs
we allow (apparently as an extension) exceptions and goto out of
the block. During gimplification we place an *end* call into a finally
block so that it is reached even on exceptions, goto out, etc.
During omplower pass we then add paired #pragma omp return for them,
but because the exceptions make the region not SESE, we can end up
with #pragma omp return appearing only conditionally in the CFG etc.,
which the ompexp pass can't handle.
For the ompexp pass, we actually don't care about the end part or about
target data nesting, so we can treat it as standalone directive.
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98183
* omp-low.c (lower_omp_target): Don't add OMP_RETURN for
data regions.
* omp-expand.c (expand_omp_target): Don't try to remove
OMP_RETURN for data regions.
(build_omp_regions_1, omp_make_gimple_edges): Don't expect
OMP_RETURN for data regions.
* gcc.dg/gomp/pr98183.c: New test.
* gcc.dg/goacc/pr98183.c: New test.
|
|
This patch adds support for custom allocators on private/firstprivate
clauses for task (and taskloop) constructs. Private didn't need anything
special, but firstprivate, if it is passed by reference, needs the GOMP_alloc
calls in the copyfn and GOMP_free in the task body.
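At the source level, the clause combination in question looks roughly like
the sketch below (the function name is hypothetical). It omits the
allocator argument, so the default allocator is used, and it only shows the
user-visible syntax; whether the by-reference path through the copy
function is taken depends on how the compiler passes the variable.

```c
/* Illustrative sketch only.  Without -fopenmp the pragmas are ignored.  */
int
task_with_allocated_copy (int seed)
{
  int out = 0;
  int v = seed;

#pragma omp parallel
#pragma omp single
  {
    /* The task gets its own copy of v, initialized from the original
       and allocated as requested by the allocate clause.  */
#pragma omp task firstprivate(v) allocate(v) shared(out)
    {
      v += 1;
      out = v;
    }
#pragma omp taskwait
  }

  return out;
}
```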
2020-11-14 Jakub Jelinek <jakub@redhat.com>
* gimplify.c (gimplify_omp_for): Add OMP_CLAUSE_ALLOCATE_ALLOCATOR
decls as firstprivate on task clauses even when allocate clause
decl is not lastprivate.
* omp-low.c (install_var_field): Don't dereference omp_is_reference
types if mask is 33 rather than 1.
(scan_sharing_clauses): Populate allocate_map even for task
constructs. For now remove it back for variables mentioned in
reduction and in_reduction clauses on task/taskloop constructs
or on VLA task firstprivates. For firstprivate on task construct,
install the var field into field_map with by_ref and 33 instead
of false and 1 if mentioned in allocate clause.
(lower_private_allocate): Set TREE_THIS_NOTRAP on the created
MEM_REF.
(lower_rec_input_clauses): Handle allocate for task firstprivatized
non-VLA variables.
(create_task_copyfn): Likewise.
* testsuite/libgomp.c-c++-common/allocate-1.c (struct S): New type.
(foo): Add tests for non-VLA private and firstprivate clauses on
omp task.
(bar): Likewise. Remove taking of address from private/firstprivate
variables.
* testsuite/libgomp.c++/allocate-1.C (struct S): New type.
(foo): Add p, q, px and s arguments. Add tests for array reductions
and for non-VLA private and firstprivate clauses on omp task.
(bar): Removed.
(main): Adjust foo caller. Don't call bar.
|
|
constructs
Not yet enabled by default: for now, the current mode of OpenACC 'kernels'
constructs handling still remains '-fopenacc-kernels=parloops', but that is to
change later.
gcc/
* omp-oacc-kernels-decompose.cc: New.
* Makefile.in (OBJS): Add it.
* passes.def: Instantiate it.
* tree-pass.h (make_pass_omp_oacc_kernels_decompose): Declare.
* flag-types.h (enum openacc_kernels): Add.
* doc/invoke.texi (-fopenacc-kernels): Document.
* gimple.h (enum gf_mask): Add
'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED',
'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE',
'GF_OMP_TARGET_KIND_OACC_DATA_KERNELS'.
(is_gimple_omp_oacc, is_gimple_omp_offloaded): Handle these.
* gimple-pretty-print.c (dump_gimple_omp_target): Likewise.
* omp-expand.c (expand_omp_target, build_omp_regions_1)
(omp_make_gimple_edges): Likewise.
* omp-low.c (scan_sharing_clauses, scan_omp_for)
(check_omp_nesting_restrictions, lower_oacc_reductions)
(lower_oacc_head_mark, lower_omp_target): Likewise.
* omp-offload.c (execute_oacc_device_lower): Likewise.
gcc/c-family/
* c.opt (fopenacc-kernels): Add.
gcc/fortran/
* lang.opt (fopenacc-kernels): Add.
gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-1.c: New.
* c-c++-common/goacc/kernels-decompose-2.c: New.
* c-c++-common/goacc/kernels-decompose-ice-1.c: New.
* c-c++-common/goacc/kernels-decompose-ice-2.c: New.
* gfortran.dg/goacc/kernels-decompose-1.f95: New.
* gfortran.dg/goacc/kernels-decompose-2.f95: New.
* c-c++-common/goacc/if-clause-2.c: Adjust.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
New.
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
In particular, more precisely highlight what applies generally vs. the special
handling for the current 'parloops'-based OpenACC 'kernels' implementation.
gcc/
* omp-low.c (scan_sharing_clauses, scan_omp_for)
(lower_oacc_reductions, lower_omp_target): More explicit checking
of which OMP constructs we're expecting.
|
|
This adds allocate clause support for array section reductions.
Furthermore, it fixes one bug that would cause inscan reductions with
allocate to be rejected by C, and for now just ignores allocate for
inscan/task reductions, which will need slightly more work.
2020-11-13 Jakub Jelinek <jakub@redhat.com>
gcc/
* omp-low.c (scan_sharing_clauses): For now remove for reduction
clauses with inscan or task modifiers decl from allocate_map.
(lower_private_allocate): Handle TYPE_P (new_var).
(lower_rec_input_clauses): Handle allocate clause for C/C++ array
reductions.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Don't clear
OMP_CLAUSE_REDUCTION_INSCAN unless reduction_seen == -2.
libgomp/
* testsuite/libgomp.c-c++-common/allocate-1.c (foo): Add tests
for array reductions.
(main): Adjust foo callers.
|
|
For now, task/taskloop constructs aren't handled and C/C++ array reductions
and reductions with task or inscan modifiers need further work.
Instead of calling omp_alloc/omp_free (where the former doesn't have an
alignment argument and omp_aligned_alloc is a 5.1-only feature), this calls
GOMP_alloc/GOMP_free, so that the library can fail if it would otherwise
fall back to returning NULL (the exception being zero-length allocations).
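A minimal sketch of an allocate clause on a privatized variable (the
function name scaled_count is hypothetical). No allocator is named, so the
default allocator applies; the GOMP_alloc/GOMP_free calls mentioned above
are generated by the compiler and do not appear in user code.

```c
/* Illustrative sketch only.  Without -fopenmp the pragmas are ignored.  */
int
scaled_count (int n, int scale)
{
  int total = 0;
  int local = scale;

  /* Each thread's firstprivate copy of 'local' is allocated as requested
     by the allocate clause.  */
#pragma omp parallel firstprivate(local) allocate(local)
  {
#pragma omp for reduction(+:total)
    for (int i = 0; i < n; i++)
      total += local;
  }

  return total;
}
```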
2020-11-12 Jakub Jelinek <jakub@redhat.com>
gcc/
* builtin-types.def (BT_FN_PTR_SIZE_SIZE_PTRMODE): New function type.
* omp-builtins.def (BUILT_IN_GOACC_DECLARE): Move earlier.
(BUILT_IN_GOMP_ALLOC, BUILT_IN_GOMP_FREE): New builtins.
* gimplify.c (gimplify_scan_omp_clauses): Force allocator into a
decl if it is not NULL, INTEGER_CST or decl.
(gimplify_adjust_omp_clauses): Clear GOVD_EXPLICIT on explicit clauses
which are being removed. Remove allocate clauses for variables not seen
if they are private, firstprivate or linear too. Call
omp_notice_variable on the allocator otherwise.
(gimplify_omp_for): Handle iterator vars mentioned in allocate clauses
similarly to non-is_gimple_reg iterators.
* omp-low.c (struct omp_context): Add allocate_map field.
(delete_omp_context): Delete it.
(scan_sharing_clauses): Fill it from allocate clauses. Remove it
if mentioned also in shared clause.
(lower_private_allocate): New function.
(lower_rec_input_clauses): Handle allocate clause for privatized
variables, except for task/taskloop, C/C++ array reductions for now
and task/inscan variables.
(lower_send_shared_vars): Don't consider variables in allocate_map
as shared.
* omp-expand.c (expand_omp_for_generic, expand_omp_for_static_nochunk,
expand_omp_for_static_chunk): Use expand_omp_build_assign instead of
gimple_build_assign + gsi_insert_after.
* builtins.c (builtin_fnspec): Handle BUILTIN_GOMP_ALLOC and
BUILTIN_GOMP_FREE.
* tree-ssa-ccp.c (evaluate_stmt): Handle BUILTIN_GOMP_ALLOC.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Handle
BUILTIN_GOMP_ALLOC.
(mark_all_reaching_defs_necessary_1): Handle BUILTIN_GOMP_ALLOC
and BUILTIN_GOMP_FREE.
(propagate_necessity): Likewise.
gcc/fortran/
* f95-lang.c (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
Define.
(gfc_init_builtin_functions): Add alloc_size and warn_unused_result
attributes to __builtin_GOMP_alloc.
* types.def (BT_PTRMODE): New primitive type.
(BT_FN_VOID_PTR_PTRMODE, BT_FN_PTR_SIZE_SIZE_PTRMODE): New function
types.
libgomp/
* libgomp.map (GOMP_alloc, GOMP_free): Export at GOMP_5.0.1.
* omp.h.in (omp_alloc): Add malloc and alloc_size attributes.
* libgomp_g.h (GOMP_alloc, GOMP_free): Declare.
* allocator.c (omp_aligned_alloc): New for now static function,
add alignment argument and handle it.
(omp_alloc): Reimplement using omp_aligned_alloc.
(GOMP_alloc, GOMP_free): New functions.
(omp_free): Add ialias.
* testsuite/libgomp.c-c++-common/allocate-1.c: New test.
* testsuite/libgomp.c++/allocate-1.C: New test.
|
|
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_clauses): Handle new reduction enums.
* gfortran.h (OMP_LIST_REDUCTION_INSCAN, OMP_LIST_REDUCTION_TASK,
OMP_LIST_IN_REDUCTION, OMP_LIST_TASK_REDUCTION): Add enums.
* openmp.c (enum omp_mask1): Add OMP_CLAUSE_IN_REDUCTION
and OMP_CLAUSE_TASK_REDUCTION.
(gfc_match_omp_clause_reduction): Extend reduction handling;
moved from ...
(gfc_match_omp_clauses): ... here. Add calls to it.
(OMP_TASK_CLAUSES, OMP_TARGET_CLAUSES, OMP_TASKLOOP_CLAUSES):
Add OMP_CLAUSE_IN_REDUCTION.
(gfc_match_omp_taskgroup): Add task_reduction matching.
(resolve_omp_clauses): Update for new reduction clause changes;
remove removed nonmonotonic-schedule restrictions.
(gfc_resolve_omp_parallel_blocks): Add new enums to switch.
* trans-openmp.c (gfc_omp_clause_default_ctor,
gfc_trans_omp_reduction_list, gfc_trans_omp_clauses,
gfc_split_omp_clauses): Handle updated reduction clause.
gcc/ChangeLog:
* gimplify.c (gimplify_scan_omp_clauses, gimplify_omp_loop): Use 'do'
instead of 'for' in error messages for Fortran.
* omp-low.c (check_omp_nesting_restrictions): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/schedule-modifiers-2.f90: Remove some dg-error.
* gfortran.dg/gomp/reduction4.f90: New test.
* gfortran.dg/gomp/reduction5.f90: New test.
* gfortran.dg/gomp/workshare-reduction-1.f90: New test.
* gfortran.dg/gomp/workshare-reduction-2.f90: New test.
* gfortran.dg/gomp/workshare-reduction-3.f90: New test.
* gfortran.dg/gomp/workshare-reduction-4.f90: New test.
* gfortran.dg/gomp/workshare-reduction-5.f90: New test.
* gfortran.dg/gomp/workshare-reduction-6.f90: New test.
* gfortran.dg/gomp/workshare-reduction-7.f90: New test.
* gfortran.dg/gomp/workshare-reduction-8.f90: New test.
* gfortran.dg/gomp/workshare-reduction-9.f90: New test.
* gfortran.dg/gomp/workshare-reduction-10.f90: New test.
* gfortran.dg/gomp/workshare-reduction-11.f90: New test.
* gfortran.dg/gomp/workshare-reduction-12.f90: New test.
* gfortran.dg/gomp/workshare-reduction-13.f90: New test.
* gfortran.dg/gomp/workshare-reduction-14.f90: New test.
* gfortran.dg/gomp/workshare-reduction-15.f90: New test.
* gfortran.dg/gomp/workshare-reduction-16.f90: New test.
* gfortran.dg/gomp/workshare-reduction-17.f90: New test.
* gfortran.dg/gomp/workshare-reduction-18.f90: New test.
* gfortran.dg/gomp/workshare-reduction-19.f90: New test.
* gfortran.dg/gomp/workshare-reduction-20.f90: New test.
* gfortran.dg/gomp/workshare-reduction-21.f90: New test.
* gfortran.dg/gomp/workshare-reduction-22.f90: New test.
* gfortran.dg/gomp/workshare-reduction-23.f90: New test.
* gfortran.dg/gomp/workshare-reduction-24.f90: New test.
* gfortran.dg/gomp/workshare-reduction-25.f90: New test.
* gfortran.dg/gomp/workshare-reduction-26.f90: New test.
* gfortran.dg/gomp/workshare-reduction-27.f90: New test.
* gfortran.dg/gomp/workshare-reduction-28.f90: New test.
* gfortran.dg/gomp/workshare-reduction-29.f90: New test.
* gfortran.dg/gomp/workshare-reduction-30.f90: New test.
* gfortran.dg/gomp/workshare-reduction-31.f90: New test.
* gfortran.dg/gomp/workshare-reduction-32.f90: New test.
* gfortran.dg/gomp/workshare-reduction-33.f90: New test.
* gfortran.dg/gomp/workshare-reduction-34.f90: New test.
* gfortran.dg/gomp/workshare-reduction-35.f90: New test.
* gfortran.dg/gomp/workshare-reduction-36.f90: New test.
* gfortran.dg/gomp/workshare-reduction-37.f90: New test.
* gfortran.dg/gomp/workshare-reduction-38.f90: New test.
* gfortran.dg/gomp/workshare-reduction-39.f90: New test.
* gfortran.dg/gomp/workshare-reduction-40.f90: New test.
* gfortran.dg/gomp/workshare-reduction-41.f90: New test.
* gfortran.dg/gomp/workshare-reduction-42.f90: New test.
* gfortran.dg/gomp/workshare-reduction-43.f90: New test.
* gfortran.dg/gomp/workshare-reduction-44.f90: New test.
* gfortran.dg/gomp/workshare-reduction-45.f90: New test.
* gfortran.dg/gomp/workshare-reduction-46.f90: New test.
* gfortran.dg/gomp/workshare-reduction-47.f90: New test.
* gfortran.dg/gomp/workshare-reduction-48.f90: New test.
* gfortran.dg/gomp/workshare-reduction-49.f90: New test.
* gfortran.dg/gomp/workshare-reduction-50.f90: New test.
* gfortran.dg/gomp/workshare-reduction-51.f90: New test.
* gfortran.dg/gomp/workshare-reduction-52.f90: New test.
* gfortran.dg/gomp/workshare-reduction-53.f90: New test.
* gfortran.dg/gomp/workshare-reduction-54.f90: New test.
* gfortran.dg/gomp/workshare-reduction-55.f90: New test.
* gfortran.dg/gomp/workshare-reduction-56.f90: New test.
* gfortran.dg/gomp/workshare-reduction-57.f90: New test.
* gfortran.dg/gomp/workshare-reduction-58.f90: New test.
|
|
This patch implements some parts of the target variable mapping changes
specified in OpenMP 5.0, including base-pointer attachment/detachment
behavior for array section list-items in map clauses, and ordering of
map clauses according to map kind.
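A minimal sketch of source that exercises base-pointer attachment, assuming a made-up struct vec type and function name: the array section is named through a struct member, so the runtime has to map the pointed-to data and make the member pointer in the device copy of the struct refer to it. Without -fopenmp the pragma is ignored and the loop runs on the host.

```c
#include <assert.h>

/* Hypothetical type used only for this illustration.  */
struct vec
{
  int *data;
  int len;
};

int
sum_on_device (struct vec v)
{
  assert (v.len >= 0);
  int total = 0;
  /* v itself is mapped implicitly; the explicit clause maps the section
     v.data[0:v.len], and the base pointer v.data is attached to the
     device copy of that section for the duration of the region.  */
  #pragma omp target map(to: v.data[0:v.len]) map(tofrom: total)
  for (int i = 0; i < v.len; i++)
    total += v.data[i];
  return total;
}
```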
2020-11-10 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c-family/ChangeLog:
* c-common.h (c_omp_adjust_map_clauses): New declaration.
* c-omp.c (struct map_clause): Helper type for c_omp_adjust_map_clauses.
(c_omp_adjust_map_clauses): New function.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_target_data): Add use of
new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
handled map clause kind.
(c_parser_omp_target_enter_data): Likewise.
(c_parser_omp_target_exit_data): Likewise.
(c_parser_omp_target): Likewise.
* c-typeck.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type.
(c_finish_omp_clauses): Adjust bitmap checks to allow struct decl and
same struct field access to co-exist on OpenMP construct.
gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_target_data): Add use of
new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
handled map clause kind.
(cp_parser_omp_target_enter_data): Likewise.
(cp_parser_omp_target_exit_data): Likewise.
(cp_parser_omp_target): Likewise.
* semantics.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type. Fix
interaction between reference case and attach/detach.
(finish_omp_clauses): Adjust bitmap checks to allow struct decl and
same struct field access to co-exist on OpenMP construct.
gcc/ChangeLog:
* gimplify.c (is_or_contains_p): New static helper function.
(omp_target_reorder_clauses): New function.
(gimplify_scan_omp_clauses): Add use of omp_target_reorder_clauses to
reorder clause list according to OpenMP 5.0 rules. Add handling of
GOMP_MAP_ATTACH_DETACH for OpenMP cases.
* omp-low.c (is_omp_target): New static helper function.
(scan_sharing_clauses): Add scan phase handling of GOMP_MAP_ATTACH/DETACH
for OpenMP cases.
(lower_omp_target): Add lowering handling of GOMP_MAP_ATTACH/DETACH for
OpenMP cases.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/clauses-2.c: Remove dg-error cases now valid.
* gfortran.dg/gomp/map-2.f90: Likewise.
* c-c++-common/gomp/map-5.c: New testcase.
libgomp/ChangeLog:
* libgomp.h (enum gomp_map_vars_kind): Adjust enum values to be bit-flag
usable.
* oacc-mem.c (acc_map_data): Adjust gomp_map_vars argument flags to
'GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_ENTER_DATA'.
(goacc_enter_datum): Likewise for call to gomp_map_vars_async.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_internal):
Change checks of GOMP_MAP_VARS_ENTER_DATA to use bit-and (&). Adjust use
of gomp_attach_pointer for OpenMP cases.
(gomp_exit_data): Add handling of GOMP_MAP_DETACH.
(GOMP_target_enter_exit_data): Add handling of GOMP_MAP_ATTACH.
* testsuite/libgomp.c-c++-common/ptr-attach-1.c: New testcase.
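The libgomp.h and target.c entries above switch an enum to bit-flag values and change equality checks to bit-and. A minimal sketch of that pattern follows; the enumerator names and values are hypothetical and chosen only for illustration, they are not copied from libgomp.h.

```c
#include <stdbool.h>

/* Hypothetical names and values: each kind occupies its own bit, so
   several kinds can be combined with |.  */
enum map_vars_kind
{
  MAP_VARS_OPENACC    = 1 << 0,
  MAP_VARS_TARGET     = 1 << 1,
  MAP_VARS_DATA       = 1 << 2,
  MAP_VARS_ENTER_DATA = 1 << 3
};

/* Once callers may pass MAP_VARS_OPENACC | MAP_VARS_ENTER_DATA, testing
   with == would miss the combined value, so the check masks with &.  */
bool
is_enter_data (unsigned kind)
{
  return (kind & MAP_VARS_ENTER_DATA) != 0;
}
```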
|
|
Bug fix for recent commit beddd1762ad2bbe84dd776c54489153f83f21e56 "[OpenACC]
More precise diagnostics for 'gang', 'worker', 'vector' clauses with arguments
on 'loop' only allowed in 'kernels' regions":
> [...], and 'inform' at the location of the enclosing parent
> compute construct/[...].
Now really.
gcc/
* omp-low.c (scan_omp_for) <OpenACC>: Use proper location to
'inform' of enclosing parent compute construct.
gcc/testsuite/
* c-c++-common/goacc/pr92793-1.c: Extend.
* gfortran.dg/goacc/pr92793-1.f90: Likewise.
|
|
OpenACC 'kernels'
gcc/
* omp-low.c (scan_omp_for) <OpenACC>: Move earlier inconsistent
nested 'reduction' clauses checking.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-1-kernels.c: Extend.
* c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise.
* gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise.
|
|
with arguments on 'loop' only allowed in 'kernels' regions
Instead of at the location of the 'loop' directive, 'error_at' the location of
the improper clause, and 'inform' at the location of the enclosing parent
compute construct/routine.
The Fortran testcases come with some XFAILing, to be resolved later.
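For orientation, here is a small sketch of the accepted form, with a made-up function name: clause arguments on 'loop' inside a 'kernels' region. Replacing 'kernels' with 'parallel' gives the situation the diagnostics above are about, where the error is reported at the offending clause and a note points at the enclosing compute construct. Without -fopenacc the pragmas are ignored.

```c
#include <assert.h>

/* Hypothetical example function: scales a[0..n-1] by s.  */
void
scale (int n, float *a, float s)
{
  assert (n >= 0);
  #pragma acc kernels copy(a[0:n])
  {
    /* Arguments to 'gang' and 'vector' are fine here because the
       enclosing compute construct is 'kernels'.  */
    #pragma acc loop gang(num: 8) vector(length: 32)
    for (int i = 0; i < n; i++)
      a[i] *= s;
  }
}
```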
gcc/
* omp-low.c (scan_omp_for) <OpenACC>: More precise diagnostics for
'gang', 'worker', 'vector' clauses with arguments only allowed in
'kernels' regions.
gcc/testsuite/
* c-c++-common/goacc/pr92793-1.c: Extend.
* gfortran.dg/goacc/pr92793-1.f90: Likewise.
|
|
This patch adds parsing of the OpenMP allocate clause, but still ignores
it during OpenMP lowering, where for privatized variables with an
allocate clause we should use the corresponding allocators rather than
allocating them on the stack.
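A minimal sketch of the clause with an explicit allocator, assuming a made-up function name: the privatized variable sq is listed in both a data-sharing clause and the allocate clause. The predefined allocator omp_default_mem_alloc comes from omp.h, which is only included when OpenMP is enabled; otherwise the pragma is ignored.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>   /* Declares omp_default_mem_alloc.  */
#endif

/* Hypothetical example: sums the squares of 0 .. n-1.  */
long
sum_squares (int n)
{
  assert (n >= 0);
  long total = 0;
  long sq = 0;
  /* sq is privatized; the allocate clause names the allocator its
     private copies should be obtained from.  */
  #pragma omp parallel for private(sq) allocate(omp_default_mem_alloc: sq) reduction(+:total)
  for (int i = 0; i < n; i++)
    {
      sq = (long) i * i;
      total += sq;
    }
  return total;
}
```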
2020-10-28 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_ALLOCATE.
* tree.h (OMP_CLAUSE_ALLOCATE_ALLOCATOR,
OMP_CLAUSE_ALLOCATE_COMBINED): Define.
* tree.c (omp_clause_num_ops, omp_clause_code_name): Add allocate
clause.
(walk_tree_1): Handle OMP_CLAUSE_ALLOCATE.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* gimplify.c (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses,
gimplify_omp_for): Likewise.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_ALLOCATE.
* c-omp.c: Include bitmap.h.
(c_omp_split_clauses): Handle OMP_CLAUSE_ALLOCATE.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Handle allocate.
(c_parser_omp_clause_allocate): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_ALLOCATE.
(OMP_FOR_CLAUSE_MASK, OMP_SECTIONS_CLAUSE_MASK,
OMP_PARALLEL_CLAUSE_MASK, OMP_SINGLE_CLAUSE_MASK,
OMP_TASK_CLAUSE_MASK, OMP_TASKGROUP_CLAUSE_MASK,
OMP_DISTRIBUTE_CLAUSE_MASK, OMP_TEAMS_CLAUSE_MASK,
OMP_TARGET_CLAUSE_MASK, OMP_TASKLOOP_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_ALLOCATE.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_ALLOCATE.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Handle allocate.
(cp_parser_omp_clause_allocate): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_ALLOCATE.
(OMP_FOR_CLAUSE_MASK, OMP_SECTIONS_CLAUSE_MASK,
OMP_PARALLEL_CLAUSE_MASK, OMP_SINGLE_CLAUSE_MASK,
OMP_TASK_CLAUSE_MASK, OMP_TASKGROUP_CLAUSE_MASK,
OMP_DISTRIBUTE_CLAUSE_MASK, OMP_TEAMS_CLAUSE_MASK,
OMP_TARGET_CLAUSE_MASK, OMP_TASKLOOP_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_ALLOCATE.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_ALLOCATE.
* pt.c (tsubst_omp_clauses): Likewise.
gcc/testsuite/
* c-c++-common/gomp/allocate-1.c: New test.
* c-c++-common/gomp/allocate-2.c: New test.
* c-c++-common/gomp/clauses-1.c (omp_allocator_handle_t): New typedef.
(foo, bar, baz): Add allocate clauses where allowed.
|
|
This propagates the needed values from the point where the number of
iterations is calculated on composite loops to the places where that
information is needed to use the more efficient square root discovery to
compute the starting iterator values from the logical iteration number.
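The idea of recovering iterator values from a logical iteration number via a square root can be sketched for the simplest triangular nest. This is a simplification under stated assumptions (zero lower bounds, unit steps, inner bound j <= i); the helper names are hypothetical and the code is not what omp-expand.c emits.

```c
#include <assert.h>

/* Integer square root: largest r with r * r <= x.  */
static unsigned long long
isqrt_u64 (unsigned long long x)
{
  unsigned long long r = 0, bit = 1ULL << 62;
  while (bit > x)
    bit >>= 2;
  while (bit != 0)
    {
      if (x >= r + bit)
        {
          x -= r + bit;
          r = (r >> 1) + bit;
        }
      else
        r >>= 1;
      bit >>= 2;
    }
  return r;
}

/* Map logical iteration number k of
     for (i = 0; i < n; i++)
       for (j = 0; j <= i; j++)
   back to (i, j).  Row i starts at logical iteration i * (i + 1) / 2,
   so i is the largest value with i * (i + 1) / 2 <= k.  */
void
triangular_start (unsigned long long k,
                  unsigned long long *i, unsigned long long *j)
{
  assert (k <= (~0ULL - 1) / 8);   /* Keep 8 * k + 1 from overflowing.  */
  unsigned long long row = (isqrt_u64 (8 * k + 1) - 1) / 2;
  *i = row;
  *j = k - row * (row + 1) / 2;
}
```

A thread that is handed the logical range starting at k can call triangular_start once and then iterate normally, instead of walking the outer loop to find its starting row.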
2020-10-13 Jakub Jelinek <jakub@redhat.com>
* omp-low.c (add_taskreg_looptemp_clauses): For triangular loops
with non-constant number of iterations add another 4 _looptemp_
clauses before the (optional) one for lastprivate.
(lower_omp_for_lastprivate): Skip those clauses when looking for
the lastprivate clause.
(lower_omp_for): For triangular loops with non-constant number of
iterations add another 4 _looptemp_ clauses.
* omp-expand.c (expand_omp_for_init_counts): For triangular loops
with non-constant number of iterations set counts[0],
fd->first_inner_iterations, fd->factor and fd->adjn1 from the newly
added _looptemp_ clauses.
(expand_omp_for_init_vars): Initialize the newly added _looptemp_
clauses.
(find_lastprivate_looptemp): New function.
(expand_omp_for_static_nochunk, expand_omp_for_static_chunk,
expand_omp_taskloop_for_outer): Use it instead of manually skipping
_looptemp_ clauses.
|
|
The following testcase FAILs, because we don't mark the child OpenMP function
as cfun->calls_alloca when it does call alloca. When optimizing, during DCE we
reset those flags and recompute them again, but with -O0 DCE is not performed.
Fixed by calling notice_special_calls when moving insns to the child function.
cfun->calls_alloca is normally set during gimplification, and most of the
alloca calls omp-low.c emits go through the gimplifier, but one spot didn't
and built the gcall directly, so that one needs to set calls_alloca too.
2020-10-08 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/97294
* tree-cfg.c (move_block_to_fn): Call notice_special_calls on
call stmts being moved into dest_cfun.
* omp-low.c (lower_rec_input_clauses): Set cfun->calls_alloca when
adding __builtin_alloca_with_align call without gimplification.
* gcc.dg/asan/pr97294.c: New test.
|
|
The following change adds support for non-rectangular simd loops.
While working on that, I've noticed we actually don't vectorize collapsed
simd loops at all, because the code that I thought would be vectorizable
actually is not vectorized. While for constant lower/upper bounds and
constant step of all but the outermost loop we could in theory
vectorize by computing the separate iterators using vectorized division
and modulo for each of them from the single iterator that increments
by 1 from 0 to total iteration count in the loop nest, I think that would
be fairly expensive and the chances of the loop body being vectorizable
would be low, e.g. because array indices are unlikely to be linear and would
need scatters/gathers.
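The division-and-modulo scheme mentioned above (the approach the message argues against vectorizing) can be sketched in scalar form as follows. It assumes a rectangular nest with zero lower bounds and unit steps; the function name is hypothetical.

```c
#include <assert.h>

/* Recover (i, j, k) for the rectangular nest
     for (i = 0; i < ni; i++)
       for (j = 0; j < nj; j++)
         for (k = 0; k < nk; k++)
   from the single collapsed counter n in [0, ni * nj * nk).  */
void
decompose_index (long n, long nj, long nk, long *i, long *j, long *k)
{
  assert (n >= 0 && nj > 0 && nk > 0);
  *k = n % nk;               /* Innermost iterator varies fastest.  */
  *j = (n / nk) % nj;
  *i = n / (nk * nj);        /* Outermost iterator varies slowest.  */
}
```

Every logical iteration needs a division and a modulo per collapsed level, and indices such as t[i][j][k] stop being linear in n, which is why vectorizing this form would likely need gathers and scatters.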
This patch changes the generated code to vectorize only the innermost
loop which has higher chance of being vectorized. Below is the list of
tests and function names in which the patch resulted in vectorizing something
that hasn't been vectorized before (ok, the first line is a new test).
I've also found that the vectorizer will not vectorize loops with non-constant
steps; I plan to do something about those incrementally on the omp-expand.c
side (basically, compute the number of iterations before the loop and use a 0 to
number_of_iterations step 1 IV as the main one).
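The planned rewrite for non-constant steps can be sketched at source level: compute the trip count once, drive the loop with a unit-step counter, and derive the user-visible iterator from it. This is only an illustration under the assumption of a positive step, with a made-up function name; it is not the code omp-expand.c generates.

```c
#include <assert.h>

/* Hypothetical illustration of
     for (i = lo; i < hi; i += step) a[i] += 1;
   where step is only known at run time.  */
void
bump_strided (int *a, long lo, long hi, long step)
{
  assert (step > 0);
  /* Number of iterations, computed once before the loop.  */
  long count = hi > lo ? (hi - lo + step - 1) / step : 0;
  /* The main induction variable runs from 0 to count with step 1;
     the original iterator i is derived from it.  */
  for (long n = 0; n < count; n++)
    {
      long i = lo + n * step;
      a[i] += 1;
    }
}
```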
I have a problem with the composite simd vectorization though.
The point is that each thread (or task etc.) is given only a range of
consecutive iterations, so somewhere earlier it computes the total number of iterations
and splits the work between the workers, and then the intent is to try to vectorize it.
So, each thread is then given a begin ... end-1 range that it would handle.
This means that from the single begin value I need to compute the individual iteration
vars I should start at and then goto into the loop nest to begin iterating there
(and actually compute how many iterations the innermost loop should do each time
so that it stops before end).
Very roughly the IL I emit is something like:
int t[100][100][100];
void
foo (int a, int b, int c, int d, int e, int f, int g, int h, int u, int v, int w, int x)
{
  int i, j, k;
  int cnt;
  if (x)
    {
      i = u; j = v; k = w; goto doit;
    }
  for (i = a; i < b; i += c)
    for (j = d; j < e; j += f)
      {
        k = g;
      doit:
        for (; k < h; k++)
          t[i][j][k] += i + j + k;
      }
}
Unfortunately, some pass then turns the innermost loop to have more than 2 basic blocks
and it isn't vectorized because of that.
Also, I have disabled (for now) SIMTization of collapsed simd loops, because for SIMT
it would be using a single thread anyway and I didn't want to bother with checking
SIMT in all the places I've been changing. If SIMT support is added for some or all
collapsed loops, that omp-low.c change needs to be reverted.
Here is that list of what hasn't been vectorized before and is now:
gcc/testsuite/gcc.dg/vect/vect-simd-17.c doit
gcc/testsuite/gfortran.dg/gomp/openmp-simd-6.f90 bar
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-10.c f28_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-10.c _Z24f28_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f25_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f26_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f27_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f25_t_simd_normaliiiiiii._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f26_t_simd_normaliiiixxi._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f27_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z20f28_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z21f28_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z18f8_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z19f8_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-4.c f8_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-4.c _Z23f8_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f7_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z16f7_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z19f8_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z20f8_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z19f28_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z20f28_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/pr70680-2.c f1._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f2._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f3._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f4._omp_fn.0
libgomp/testsuite/libgomp.c/simd-8.c foo
libgomp/testsuite/libgomp.c/simd-9.c bar
libgomp/testsuite/libgomp.c/simd-9.c foo
2020-09-25 Jakub Jelinek <jakub@redhat.com>
gcc/
* omp-low.c (scan_omp_1_stmt): Don't call scan_omp_simd for
collapse > 1 loops as simt doesn't support collapsed loops yet.
* omp-expand.c (expand_omp_for_init_counts, expand_omp_for_init_vars):
Small tweaks to function comment.
(expand_omp_simd): Rewritten collapse > 1 support to only attempt
to vectorize the innermost loop and emit set of outer loops around it.
For non-composite simd with collapse > 1 without broken loop don't
even try to compute number of iterations first. Add support for
non-rectangular simd loops.
(expand_omp_for): Don't sorry_at on non-rectangular simd loops.
gcc/testsuite/
* gcc.dg/vect/vect-simd-17.c: New test.
libgomp/
* testsuite/libgomp.c/loop-25.c: New test.
|
|
gcc/cp/ChangeLog:
PR fortran/96668
* cp-gimplify.c (cxx_omp_finish_clause): Add bool openacc arg.
* cp-tree.h (cxx_omp_finish_clause): Likewise.
* semantics.c (handle_omp_for_class_iterator): Update call.
gcc/fortran/ChangeLog:
PR fortran/96668
* trans.h (gfc_omp_finish_clause): Add bool openacc arg.
* trans-openmp.c (gfc_omp_finish_clause): Ditto. Use
GOMP_MAP_ALWAYS_POINTER with PSET for pointers.
(gfc_trans_omp_clauses): Like the latter and also if the always
modifier is used.
gcc/ChangeLog:
PR fortran/96668
* gimplify.c (gimplify_omp_for): Add 'bool openacc' argument;
update omp_finish_clause calls.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses,
gimplify_expr, gimplify_omp_loop): Update omp_finish_clause
and/or gimplify_for calls.
* langhooks-def.h (lhd_omp_finish_clause): Add bool openacc arg.
* langhooks.c (lhd_omp_finish_clause): Likewise.
* langhooks.h (lhd_omp_finish_clause): Likewise.
* omp-low.c (scan_sharing_clauses): Keep GOMP_MAP_TO_PSET clause for
'declare target' vars.
include/ChangeLog:
PR fortran/96668
* gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P): Define.
libgomp/ChangeLog:
PR fortran/96668
* libgomp.h (struct target_var_desc): Add has_null_ptr_assoc member.
* target.c (gomp_map_vars_existing): Add always_to_flag flag.
(gomp_map_vars_existing): Update call to it.
(gomp_map_fields_existing): Likewise.
(gomp_map_vars_internal): Update PSET handling such that if a nullptr is
now allocated or if GOMP_MAP_POINTER is used PSET is updated and pointer
remapped.
(GOMP_target_enter_exit_data): Handle GOMP_MAP_ALWAYS_POINTER like
GOMP_MAP_POINTER.
* testsuite/libgomp.fortran/map-alloc-ptr-1.f90: New test.
* testsuite/libgomp.fortran/map-alloc-ptr-2.f90: New test.
|
|
As the new testcase shows, we weren't actually performing reductions on the
host teams construct. And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests also perform initialization and checking around the
calls to the functions with the OpenMP constructs. In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team. Fixed by performing the initialization
and checking outside of host teams and only making the calls to functions with
the tested constructs inside of host teams.
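A minimal sketch of the corrected shape, with a made-up function name: initialization and the final check stay outside the host teams region, only the distributed loop runs inside it, and the reduction combines each team's private err into the original variable. Without -fopenmp the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>

/* Hypothetical checker: returns nonzero if any element of a[0..n-1]
   differs from expected.  */
int
check_all (const int *a, int n, int expected)
{
  assert (n >= 0);
  int err = 0;   /* Initialized once, outside the teams region.  */
  /* Every team executes the region; distribute splits the iterations
     across the teams, and reduction(|:err) merges the per-team flags.  */
  #pragma omp teams reduction(|:err)
  #pragma omp distribute
  for (int i = 0; i < n; i++)
    err |= (a[i] != expected);
  return err;    /* Checked once, after the teams region.  */
}
```

Because the combiner is bitwise or, the result does not depend on how many teams the runtime creates.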
2020-08-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even
for host teams.
* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.
|
|
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.
Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place. I did
not remove everything these patches brought in, for example:
- the mechanism to pass offload-target specific info from the application to
the offloading plugin - but the same mechanism is also used to
communicate number of teams and the thread limit to all offload targets.
- run_func hook in gomp_device_descr stays too, although now it is
not used. If some future offload target would like the ability to
refuse to offload some functions, it can use it. It is easy to
remove as a follow-up if it is considered clutter, though.
- configure options --with-hsa-runtime=PATH, --with-hsa-runtime-include=PATH
and --with-hsa-runtime-lib=PATH remain because GCN uses them too.
- Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
from gomp-constants.h) appears in the source of the amdgcn libgomp
plugin, although I tend to think that code path is not ever used
and this patch certainly removes it from the compiler.
Nevertheless, it seems it has potential value beyond HSAIL and so
I've kept it; it can of course always be easily removed in the
future if GCN folk abandon it too.
- I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
need to stay indefinitely too just so that no future offload
target picks that number.
- I have kept the dg-require-effective-target
offload_device_nonshared_as requirement of tests which have it.
It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.
include/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* gomp-constants.h (GOMP_VERSION_HSA): Remove.
gcc/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* hsa-brig-format.h: Moved to brig/brigfrontend.
* hsa-brig.c: Removed.
* hsa-builtins.def: Likewise.
* hsa-common.c: Likewise.
* hsa-common.h: Likewise.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa-regalloc.c: Likewise.
* ipa-hsa.c: Likewise.
* omp-grid.c: Likewise.
* omp-grid.h: Likewise.
* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
hsa-dump.o, ipa-hsa.o and omp-grid.o.
(GTFILES): Removed hsa-common.c and omp-expand.c.
* builtins.def: Remove processing of hsa-builtins.def.
(DEF_HSA_BUILTIN): Remove.
* common.opt (flag_disable_hsa): Remove.
(-Whsa): Ignore.
* config.in (ENABLE_HSA): Removed.
* configure.ac: Removed handling configuration for hsa offloading.
(ENABLE_HSA): Removed.
* configure: Regenerated.
* doc/install.texi (--enable-offload-targets): Remove hsa from the
example.
(--with-hsa-runtime): Reword to reference any HSA run-time, not
specifically HSA offloading.
* doc/invoke.texi (Option Summary): Remove -Whsa.
(Warning Options): Likewise.
(Optimize Options): Remove hsa-gen-debug-stores.
* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
pass.
* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
(dump_gimple_omp_block): Likewise.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): Removed function.
(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
GF_OMP_TEAMS_GRID_PHONY. Renumbered GF_OMP_FOR_KIND_SIMD and
GF_OMP_TEAMS_HOST.
(gimple_build_omp_grid_body): Removed declaration.
(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
(gimple_omp_for_grid_phony): Removed.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): Likewise.
(gimple_omp_for_set_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Likewise.
(gimple_omp_for_set_grid_group_iter): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
* lto-section-in.c (lto_section_name): Removed hsa.
* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
* lto-wrapper.c (compile_images_for_offload_targets): Remove special
handling of hsa.
* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
(parallel_needs_hsa_kernel_p): Removed.
(grid_launch_attributes_trees): Likewise.
(grid_launch_attributes_trees): Likewise.
(grid_create_kernel_launch_attr_types): Likewise.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_arguments): Remove code passing HSA grid sizes.
(grid_expand_omp_for_loop): Remove.
(grid_arg_decl_map): Likewise.
(grid_remap_kernel_arg_accesses): Likewise.
(grid_expand_target_grid_body): Likewise.
(expand_omp): Remove call to grid_expand_target_grid_body.
(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
* omp-general.c: Do not include hsa-common.h.
(omp_maybe_offloaded): Do not check for HSA offloading.
(omp_context_selector_matches): Likewise.
* omp-low.c: Do not include hsa-common.h and omp-grid.h.
(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Remove handling of the phony variant.
(check_omp_nesting_restrictions): Remove handling of
GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
(lower_omp_for_lastprivate): Remove handling of gridified loops.
(lower_omp_for): Remove phony loop handling.
(lower_omp_taskreg): Remove phony construct handling.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): Removed.
(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
* opts.c (common_handle_option): Do not handle hsa when processing
OPT_foffload_.
* params.opt (hsa-gen-debug-stores): Remove.
* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
* timevar.def: Remove TV_IPA_HSA.
* toplev.c: Do not include hsa-common.h.
(compile_file): Do not call hsa_output_brig.
* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Remove union field dimension.
* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
OMP_CLAUSE__GRIDDIM_ case.
(convert_local_omp_clauses): Likewise.
* tree-pass.h (make_pass_gen_hsail): Remove declaration.
(make_pass_ipa_hsa): Likewise.
* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
case.
* tree.c (omp_clause_num_ops): Remove the element corresponding to
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Likewise.
(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.
gcc/fortran/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* f95-lang.c (gfc_init_builtin_functions): Remove processing of
hsa-builtins.def.
gcc/brig/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
removed gcc/hsa-common.c.
libgomp/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* plugin/Makefrag.am: Remove configuration of HSA plugin.
* aclocal.m4: Regenerated.
* Makefile.in: Regenerated.
* config.h.in: Regenerated.
* configure: Regenerated.
* plugin/configfrag.ac: Likewise.
* plugin/hsa_ext_finalize.h: Removed.
* plugin/plugin-hsa.c: Likewise.
* testsuite/Makefile.in: Regenerated.
* testsuite/lib/libgomp.exp
(offload_target_to_openacc_device_type): Remove hsa case.
(check_effective_target_hsa_offloading_selected_nocache): Removed.
(check_effective_target_hsa_offloading_selected): Likewise.
(libgomp_init): Do not add -Wno-hsa to additional_flags.
* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
* testsuite/libgomp.hsa.c/c.exp: Likewise.
* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
gcc/testsuite/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* lib/target-supports.exp (check_effective_target_offload_hsa):
Removed.
* c-c++-common/gomp/gridify-1.c: Removed test.
* c-c++-common/gomp/gridify-2.c: Likewise.
* c-c++-common/gomp/gridify-3.c: Likewise.
* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
* gfortran.dg/gomp/gridify-1.f90: Likewise.
* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
* g++.dg/gomp/gomp.exp: Likewise.
* gfortran.dg/gomp/gomp.exp: Likewise.
|
|
2020-06-24 Jakub Jelinek <jakub@redhat.com>
* omp-low.c (lower_omp_for): Fix two pastos.
|
|
OpenMP 5.0 adds support for non-rectangular loop collapses, e.g.
triangular and more complex.
This patch deals just with the diagnostics so that such loops aren't rejected
immediately as before. The spec still generally requires the iteration
variable initializer and the bound in the comparison to be invariant vs.
the outermost loop, and just adds some exceptional forms that can violate
that, so we need to avoid folding the expressions until we can detect them.
In order to avoid folding them later on, I chose to use a TREE_VEC in those
expressions to hold the var_outer * expr1 + expr2 triplet. The patch adds
pretty-printing of that, gimplification etc. and just emits sorry_at during
omp expansion for now.
The next step will be to implement the different cases of that one by one.
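For illustration, a triangular loop nest of the kind described above might
look like the following sketch. The function name count_triangular is
hypothetical and not taken from the patch or its tests. Per the message
above, a compiler at this commit would accept such a loop in the front end
but still report sorry_at during omp expansion, so treat this as an
illustration of the source form only.

```c
/* Count the iterations of a triangular (non-rectangular) loop nest.
   The inner loop's lower bound depends on the outer iterator i, i.e. it
   has the shape var_outer * expr1 + expr2 with expr1 = 1 and expr2 = 0.  */
long
count_triangular (int n)
{
  long count = 0;

  /* Without -fopenmp the pragma is ignored and the loops run serially.  */
#pragma omp parallel for collapse(2) reduction(+:count)
  for (int i = 0; i < n; i++)
    for (int j = i; j < n; j++)
      count++;

  return count;
}
```

For n = 8 the nest visits 8 + 7 + ... + 1 = 36 (i, j) pairs, so
count_triangular (8) returns 36 whether or not OpenMP is enabled.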
2020-06-16 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_FOR_NON_RECTANGULAR): Define.
* gimplify.c (gimplify_omp_for): Diagnose schedule, ordered
or dist_schedule clause on non-rectangular loops. Handle
gimplification of non-rectangular lb/b expressions. When changing
iteration variable, adjust also non-rectangular lb/b expressions
referencing that.
* omp-general.h (struct omp_for_data_loop): Add m1, m2 and outer
members.
(struct omp_for_data): Add non_rect member.
* omp-general.c (omp_extract_for_data): Handle non-rectangular
loops. Fill in non_rect, m1, m2 and outer.
* omp-low.c (lower_omp_for): Handle non-rectangular lb/b expressions.
* omp-expand.c (expand_omp_for): Emit sorry_at for unsupported
non-rectangular loop cases and assert for cases that can't be
non-rectangular.
* tree-pretty-print.c (dump_mem_ref): Formatting fix.
(dump_omp_loop_non_rect_expr): New function.
(dump_generic_node): Handle non-rectangular OpenMP loops.
* tree-pretty-print.h (dump_omp_loop_non_rect_expr): Declare.
* gimple-pretty-print.c (dump_gimple_omp_for): Handle non-rectangular
OpenMP loops.
gcc/c-family/
* c-common.h (c_omp_check_loop_iv_exprs): Add an int argument.
* c-omp.c (struct c_omp_check_loop_iv_data): Add maybe_nonrect and
idx members.
(c_omp_is_loop_iterator): New function.
(c_omp_check_loop_iv_r): Use it. Add support for silent scanning
if outer loop iterator is present. Perform duplicate checking through
hash_set in the function rather than expecting caller to do that.
Pass NULL instead of d->ppset to walk_tree_1.
(c_omp_check_nonrect_loop_iv): New function.
(c_omp_check_loop_iv): Use it. Fill in new members, allow
non-rectangular loop forms, diagnose multiple associated loops with
the same iterator. Pass NULL instead of &pset to walk_tree_1.
(c_omp_check_loop_iv_exprs): Likewise.
gcc/c/
* c-parser.c (c_parser_expr_no_commas): Save, clear and restore
c_in_omp_for.
(c_parser_omp_for_loop): Set c_in_omp_for around some calls to avoid
premature c_fully_fold. Defer explicit c_fully_fold calls to after
c_finish_omp_for.
* c-tree.h (c_in_omp_for): Declare.
* c-typeck.c (c_in_omp_for): Define.
(build_modify_expr): Avoid c_fully_fold if c_in_omp_for.
(digest_init): Likewise.
(build_binary_op): Likewise.
gcc/cp/
* semantics.c (handle_omp_for_class_iterator): Adjust
c_omp_check_loop_iv_exprs caller.
(finish_omp_for): Likewise. Don't call fold_build_cleanup_point_expr
before calling c_finish_omp_for and c_omp_check_loop_iv, move it
after those calls.
* pt.c (tsubst_omp_for_iterator): Handle non-rectangular loops.
gcc/testsuite/
* c-c++-common/gomp/loop-6.c: New test.
* gcc.dg/gomp/loop-1.c: Don't expect diagnostics on valid
non-rectangular loops.
* gcc.dg/gomp/loop-2.c: New test.
* g++.dg/gomp/loop-1.C: Don't expect diagnostics on valid
non-rectangular loops.
* g++.dg/gomp/loop-2.C: Likewise.
* g++.dg/gomp/loop-5.C: New test.
* g++.dg/gomp/loop-6.C: New test.
|
|
This extends DECL_GIMPLE_REG_P to all types so we can clear
TREE_ADDRESSABLE even for integers with partial defs, not just
complex and vector variables. To make that transition easier
the patch inverts DECL_GIMPLE_REG_P to DECL_NOT_GIMPLE_REG_P
since that makes the default the current state for all other
types besides complex and vectors.
For the testcase in PR94703 we're able to expand the partial
def'ed local integer to a register then, producing a single
movl rather than going through the stack.
On i?86 this causes an execute FAIL of gcc.dg/torture/pr71522.c because
we now expand a round-trip through a long double automatic var
to a register fld/fst which normalizes the value. To handle that,
during RTL expansion we look for problematic punnings
of decls and avoid pseudos for those. I chose integer or
BLKmode accesses on decls with modes where precision doesn't
match bitsize, which covers the XFmode case.
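As a rough sketch of what a "partial def'ed local integer" can look like
in source code, consider the function below. The name set_lowpart and the
use of memcpy are assumptions chosen for illustration and are not
necessarily identical to the PR94703 testcase. Only half of the local r is
overwritten, and taking its address for the copy is the kind of use that
can keep an integer variable addressable and on the stack.

```c
#include <stdint.h>
#include <string.h>

/* Only the first half of r is overwritten, so r has a partial definition.
   The remaining bytes keep the value from the initializer.  */
uint32_t
set_lowpart (const uint32_t *x)
{
  uint32_t r = 0;
  memcpy (&r, x, sizeof (r) / 2);
  return r;
}
```

Which half of the value is replaced depends on the target's byte order, so
the result for an all-ones input is either 0x0000ffff or 0xffff0000.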
2020-05-07 Richard Biener <rguenther@suse.de>
PR middle-end/94703
* tree-core.h (tree_decl_common::gimple_reg_flag): Rename ...
(tree_decl_common::not_gimple_reg_flag): ... to this.
* tree.h (DECL_GIMPLE_REG_P): Rename ...
(DECL_NOT_GIMPLE_REG_P): ... to this.
* gimple-expr.c (copy_var_decl): Copy DECL_NOT_GIMPLE_REG_P.
(create_tmp_reg): Simplify.
(create_tmp_reg_fn): Likewise.
(is_gimple_reg): Check DECL_NOT_GIMPLE_REG_P for all regs.
* gimplify.c (create_tmp_from_val): Simplify.
(gimplify_bind_expr): Likewise.
(gimplify_compound_literal_expr): Likewise.
(gimplify_function_tree): Likewise.
(prepare_gimple_addressable): Set DECL_NOT_GIMPLE_REG_P.
* asan.c (create_odr_indicator): Do not clear DECL_GIMPLE_REG_P.
(asan_add_global): Copy it.
* cgraphunit.c (cgraph_node::expand_thunk): Force args
to be GIMPLE regs.
* function.c (gimplify_parameters): Copy
DECL_NOT_GIMPLE_REG_P.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::common_initialization): Simplify.
(ipa_param_body_adjustments::reset_debug_stmts): Copy
DECL_NOT_GIMPLE_REG_P.
* omp-low.c (lower_omp_for_scan): Do not set DECL_GIMPLE_REG_P.
* sanopt.c (sanitize_rewrite_addressable_params): Likewise.
* tree-cfg.c (make_blocks_1): Simplify.
(verify_address): Do not verify DECL_GIMPLE_REG_P setting.
* tree-eh.c (lower_eh_constructs_2): Simplify.
* tree-inline.c (declare_return_variable): Adjust and
generalize.
(copy_decl_to_var): Copy DECL_NOT_GIMPLE_REG_P.
(copy_result_decl_to_var): Likewise.
* tree-into-ssa.c (pass_build_ssa::execute): Adjust comment.
* tree-nested.c (create_tmp_var_for): Simplify.
* tree-parloops.c (separate_decls_in_region_name): Copy
DECL_NOT_GIMPLE_REG_P.
* tree-sra.c (create_access_replacement): Adjust and
generalize partial def support.
* tree-ssa-forwprop.c (pass_forwprop::execute): Set
DECL_NOT_GIMPLE_REG_P on decls we introduce partial defs on.
* tree-ssa.c (maybe_optimize_var): Handle clearing of
TREE_ADDRESSABLE and setting/clearing DECL_NOT_GIMPLE_REG_P
independently.
* lto-streamer-out.c (hash_tree): Hash DECL_NOT_GIMPLE_REG_P.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream
DECL_NOT_GIMPLE_REG_P.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* cfgexpand.c (avoid_type_punning_on_regs): New.
(discover_nonconstant_array_refs): Call
avoid_type_punning_on_regs to avoid unsupported mode punning.
lto/
* lto-common.c (compare_tree_sccs_1): Compare
DECL_NOT_GIMPLE_REG_P.
c/
* gimple-parser.c (c_parser_parse_ssa_name): Do not set
DECL_GIMPLE_REG_P.
cp/
* optimize.c (update_cloned_parm): Copy DECL_NOT_GIMPLE_REG_P.
* gcc.dg/tree-ssa/pr94703.c: New testcase.
|