author | Julian Brown <julian@codesourcery.com> | 2019-10-14 13:12:39 -0700 |
---|---|---|
committer | Thomas Schwinge <thomas@codesourcery.com> | 2020-03-03 12:51:25 +0100 |
commit | 833d954448cd353c3e40208ab9916edb5a7c5c5b | |
tree | cb5fc55f1ff6b9c0dc68206abacc311f8505980e | |
parent | de5b5fcdd6934bc58a28cd34a12930cf87bd7551 | |
[og9] Re-do OpenACC private variable resolution
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts. Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
iterate over contexts calling process_oacc_gangprivate_1.
* omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this. Update documentation (briefly).
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.
(cherry picked from openacc-gcc-9-branch commit
bbad7288269195b39603cdfde6c15f9488de83dc)
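For orientation, the kind of source construct whose private variable this commit resolves can be sketched in C. This is a hypothetical analogue of the Fortran tests named in the ChangeLog above, not code from the commit: the names `fill_counts` and `check_counts` are made up for illustration, and whether a given compiler emits an `OACC_PRIVATE` marker for `w` here depends on internals this sketch does not show. Without OpenACC support the pragmas are ignored and the loops run sequentially with the same result.

```c
#include <assert.h>

#define N 32

/* Hypothetical C analogue of the gangprivate-attrib tests: W is declared
   private on a gang-partitioned loop, so each gang works on its own copy.  */
void
fill_counts (int *arr)
{
  int w;

#pragma acc parallel num_gangs(N) num_workers(N) copyout(arr[0:N])
  {
#pragma acc loop gang private(w)
    for (int j = 0; j < N; j++)
      {
        w = 0;
#pragma acc loop seq
        for (int i = 0; i < N; i++)
          w = w + 1;
        arr[j] = w;
      }
  }
}

/* Abort if any element differs from N, i.e. if the private copies of W
   interfered with each other.  */
void
check_counts (const int *arr)
{
  for (int j = 0; j < N; j++)
    assert (arr[j] == N);
}
```

Calling `fill_counts` on an `int[32]` buffer and then `check_counts` on the same buffer exercises the pattern; changing the loop clause to `gang worker` would correspond to the worker-private variant that the second test covers.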
-rw-r--r-- | gcc/ChangeLog.omp | 50 |
-rw-r--r-- | gcc/config/gcn/gcn-protos.h | 2 |
-rw-r--r-- | gcc/config/gcn/gcn-tree.c | 6 |
-rw-r--r-- | gcc/config/gcn/gcn.c | 4 |
-rw-r--r-- | gcc/config/nvptx/nvptx.c | 26 |
-rw-r--r-- | gcc/doc/tm.texi | 5 |
-rw-r--r-- | gcc/doc/tm.texi.in | 2 |
-rw-r--r-- | gcc/internal-fn.c | 2 |
-rw-r--r-- | gcc/internal-fn.h | 3 |
-rw-r--r-- | gcc/omp-low.c | 127 |
-rw-r--r-- | gcc/omp-offload.c | 56 |
-rw-r--r-- | gcc/omp-sese.c | 54 |
-rw-r--r-- | gcc/target.def | 7 |
-rw-r--r-- | libgomp/ChangeLog.omp | 7 |
-rw-r--r-- | libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 | 4 |
-rw-r--r-- | libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 | 8 |
16 files changed, 289 insertions, 74 deletions
diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 99734c8..a845697 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,53 @@
+2019-10-16 Julian Brown <julian@codesourcery.com>
+
+	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
+	to...
+	(gcn_goacc_adjust_private_decl): ...this.
+	* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
+	(gcn_goacc_adjust_gangprivate_decl): Rename to...
+	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
+	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
+	(nvptx_goacc_adjust_private_decl): New function.
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
+	* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* doc/tm.texi: Regenerated.
+	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
+	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
+	* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
+	(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
+	fork.
+	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
+	gimple call arguments as appropriate. Don't set
+	oacc_partitioning_levels in omp_context. Pass private_marker to
+	lower_oacc_reductions.
+	(oacc_record_private_var_clauses): Don't check for NULL ctx.
+	(make_oacc_private_marker): New function.
+	(lower_omp_for): Only call oacc_record_vars_in_bind for
+	OpenACC contexts. Create private marker and pass to
+	lower_oacc_head_tail.
+	(lower_omp_target): Remove unnecessary call to
+	oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
+	Create private marker and pass to lower_oacc_reductions.
+	(process_oacc_gangprivate_1): Remove.
+	(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
+	iterate over contexts calling process_oacc_gangprivate_1.
+	(omp-offload.c (oacc_loop_xform_head_tail): Treat
+	private-variable markers like fork/join when transforming head/tail
+	sequences.
+	(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
+	"oacc gangprivate" attributes to determine partitioning level of
+	variables.
+	* omp-sese.c (find_gangprivate_vars): New function.
+	(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
+	of "oacc gangprivate" attribute to determine which variables are
+	gang-private.
+	(oacc_do_neutering): Use find_gangprivate_vars.
+	* target.def (adjust_gangprivate_decl): Rename to...
+	(adjust_private_decl): ...this. Update documentation (briefly).
+
 2019-10-09 Tobias Burnus <tobias@codesourcery.com>
 
 	* f95-lang.c (LANG_HOOKS_OMP_ARRAY_DATA): Set to gfc_omp_array_data.
diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 1711862..e33c059 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -39,7 +39,7 @@ extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
 extern tree gcn_goacc_create_propagation_record (tree record_type, bool sender,
                                                  const char *name);
-extern void gcn_goacc_adjust_gangprivate_decl (tree var);
+extern void gcn_goacc_adjust_private_decl (tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
 extern bool gcn_hard_regno_rename_ok (unsigned int from_reg,
                                       unsigned int to_reg);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index 04902a3..db8e290 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -44,6 +44,7 @@
 #include "cgraph.h"
 #include "targhooks.h"
 #include "langhooks-def.h"
+#include "diagnostic-core.h"
 
 /* }}} */
 /* {{{ OMP GCN pass.
@@ -697,8 +698,11 @@ gcn_goacc_create_propagation_record (tree record_type, bool sender,
 }
 
 void
-gcn_goacc_adjust_gangprivate_decl (tree var)
+gcn_goacc_adjust_private_decl (tree var, int level)
 {
+  if (level != GOMP_DIM_GANG)
+    return;
+
   tree type = TREE_TYPE (var);
   tree lds_type = build_qualified_type (type,
                     TYPE_QUALS_NO_ADDR_SPACE (type)
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index e0a558b..2835a3d 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -6044,8 +6044,8 @@ print_operand (FILE *file, rtx x, int code)
 #undef TARGET_GOACC_CREATE_PROPAGATION_RECORD
 #define TARGET_GOACC_CREATE_PROPAGATION_RECORD \
   gcn_goacc_create_propagation_record
-#undef TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
-#define TARGET_GOACC_ADJUST_GANGPRIVATE_DECL gcn_goacc_adjust_gangprivate_decl
+#undef TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL gcn_goacc_adjust_private_decl
 #undef TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN gcn_fork_join
 #undef TARGET_GOACC_REDUCTION
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index d6b2881..2a41d56 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -76,6 +76,7 @@
 #include "intl.h"
 #include "tree-hash-traits.h"
 #include "omp-sese.h"
+#include "tree-pretty-print.h"
 
 /* This file should be included last. */
 #include "target-def.h"
@@ -6019,6 +6020,28 @@ nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t)
   return false;
 }
 
+/* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL. Set "oacc gangprivate"
+   attribute for gang-private variable declarations. */
+
+void
+nvptx_goacc_adjust_private_decl (tree decl, int level)
+{
+  if (level != GOMP_DIM_GANG)
+    return;
+
+  if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl)))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        {
+          fprintf (dump_file, "Setting 'oacc gangprivate' attribute for decl:");
+          print_generic_decl (dump_file, decl, TDF_SLIM);
+          fputc ('\n', dump_file);
+        }
+      tree id = get_identifier ("oacc gangprivate");
+      DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl));
+    }
+}
+
 /* Implement TARGET_GOACC_EXPAND_ACCEL_VAR. Place "oacc gangprivate"
    variables in shared memory. */
 
@@ -6201,6 +6224,9 @@ nvptx_set_current_function (tree fndecl)
 #undef TARGET_HAVE_SPECULATION_SAFE_VALUE
 #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
 
+#undef TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL nvptx_goacc_adjust_private_decl
+
 #undef TARGET_GOACC_EXPAND_ACCEL_VAR
 #define TARGET_GOACC_EXPAND_ACCEL_VAR nvptx_goacc_expand_accel_var
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 536a436..e44f805 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6162,8 +6162,9 @@ memories. A return value of NULL indicates that the target does not
 handle this VAR_DECL, and normal RTL expanding is resumed.
 @end deftypefn
 
-@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_GANGPRIVATE_DECL (tree @var{var})
-Tweak variable declaration for a gang-private variable.
+@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, @var{int})
+Tweak variable declaration for a private variable at the specified
+parallelism level.
 @end deftypefn
 
 @deftypevr {Target Hook} bool TARGET_GOACC_WORKER_PARTITIONING
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c0b92f2..74a1b03 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4210,7 +4210,7 @@ address; but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_GOACC_EXPAND_ACCEL_VAR
 
-@hook TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
+@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
 
 @hook TARGET_GOACC_WORKER_PARTITIONING
 
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 04081f3..9b5e518 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2617,6 +2617,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
       else
         gcc_unreachable ();
       break;
+    case IFN_UNIQUE_OACC_PRIVATE:
+      break;
     }
 
   if (pattern)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 7164ee5..a2810ed 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see
 #define IFN_UNIQUE_CODES \
   DEF(UNSPEC), \
     DEF(OACC_FORK), DEF(OACC_JOIN), \
-    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
+    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK), \
+    DEF(OACC_PRIVATE)
 
 enum ifn_unique_kind {
 #define DEF(X) IFN_UNIQUE_##X
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f0d87a6..eddae64 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -147,9 +147,6 @@ struct omp_context
   /* A tree_list of the reduction clauses in outer contexts. */
   tree outer_reduction_clauses;
 
-  /* The number of levels of OpenACC partitioning invoked in this context. */
-  unsigned oacc_partitioning_levels;
-
   /* Addressable variable decls in this context. */
   vec<tree> *oacc_addressable_var_decls;
 };
@@ -6148,8 +6145,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
 
 static void
 lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
-                       gcall *fork, gcall *join, gimple_seq *fork_seq,
-                       gimple_seq *join_seq, omp_context *ctx)
+                       gcall *fork, gcall *private_marker, gcall *join,
+                       gimple_seq *fork_seq, gimple_seq *join_seq,
+                       omp_context *ctx)
 {
   gimple_seq before_fork = NULL;
   gimple_seq after_fork = NULL;
@@ -6351,6 +6349,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 
   /* Now stitch things together. */
   gimple_seq_add_seq (fork_seq, before_fork);
+  if (private_marker)
+    gimple_seq_add_stmt (fork_seq, private_marker);
   if (fork)
     gimple_seq_add_stmt (fork_seq, fork);
   gimple_seq_add_seq (fork_seq, after_fork);
@@ -7048,7 +7048,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
    HEAD and TAIL. */
 
 static void
-lower_oacc_head_tail (location_t loc, tree clauses,
+lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
                       gimple_seq *head, gimple_seq *tail, omp_context *ctx)
 {
   bool inner = false;
@@ -7056,13 +7056,19 @@ lower_oacc_head_tail (location_t loc, tree clauses,
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
+
+  if (private_marker)
+    {
+      gimple_set_location (private_marker, loc);
+      gimple_call_set_lhs (private_marker, ddvar);
+      gimple_call_set_arg (private_marker, 1, ddvar);
+    }
+
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
   gcc_assert (count);
 
-  ctx->oacc_partitioning_levels = count;
-
   for (unsigned done = 1; count; count--, done++)
     {
       gimple_seq fork_seq = NULL;
@@ -7089,7 +7095,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
                               &join_seq);
 
       lower_oacc_reductions (loc, clauses, place, inner,
-                             fork, join, &fork_seq, &join_seq, ctx);
+                             fork, (count == 1) ? private_marker : NULL,
+                             join, &fork_seq, &join_seq, ctx);
 
       /* Append this level to head. */
       gimple_seq_add_seq (head, fork_seq);
@@ -8755,9 +8762,6 @@ oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
 {
   tree c;
 
-  if (!ctx)
-    return;
-
   for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
     if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
       {
@@ -8821,6 +8825,58 @@ mark_oacc_gangprivate (vec<tree> *decls, omp_context *ctx)
     }
 }
 
+/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
+   the addresses of variables that should be made private at the surrounding
+   parallelism level. Such functions appear in the gimple code stream in two
+   forms, e.g. for a partitioned loop:
+
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
+      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
+      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
+
+   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
+   not as part of a HEAD_MARK sequence:
+
+      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
+
+   For such stand-alone appearances, the 3rd argument is always 0, denoting
+   gang partitioning. */
+
+static gcall *
+make_oacc_private_marker (omp_context *ctx)
+{
+  int i;
+  tree decl;
+
+  if (ctx->oacc_addressable_var_decls->length () == 0)
+    return NULL;
+
+  auto_vec<tree, 5> args;
+
+  args.quick_push (build_int_cst (integer_type_node,
+                                  IFN_UNIQUE_OACC_PRIVATE));
+  args.quick_push (integer_zero_node);
+  args.quick_push (integer_minus_one_node);
+
+  FOR_EACH_VEC_ELT (*ctx->oacc_addressable_var_decls, i, decl)
+    {
+      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+        {
+          tree inner_decl = maybe_lookup_decl (decl, thisctx);
+          if (inner_decl)
+            {
+              decl = inner_decl;
+              break;
+            }
+        }
+      tree addr = build_fold_addr_expr (decl);
+      args.safe_push (addr);
+    }
+
+  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
+}
+
 /* Lower code for an OMP loop directive. */
 
 static void
@@ -8857,6 +8913,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gbind *inner_bind
         = as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
       tree vars = gimple_bind_vars (inner_bind);
+      if (is_gimple_omp_oacc (ctx->stmt))
+        oacc_record_vars_in_bind (ctx, vars);
       gimple_bind_append_vars (new_stmt, vars);
       /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
          keep them on the inner_bind and it's block. */
@@ -8953,6 +9011,12 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
+  gcall *private_marker = NULL;
+  if (is_gimple_omp_oacc (ctx->stmt)
+      && !gimple_seq_empty_p (omp_for_body)
+      && !gimple_seq_empty_p (omp_for_body))
+    private_marker = make_oacc_private_marker (ctx);
+
   /* Lower the header expressions. At this point, we can assume that
      the header is of the form:
 
@@ -8989,7 +9053,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (is_gimple_omp_oacc (ctx->stmt)
       && !ctx_in_oacc_kernels_region (ctx))
     lower_oacc_head_tail (gimple_location (stmt),
-                          gimple_omp_for_clauses (stmt),
+                          gimple_omp_for_clauses (stmt), private_marker,
                           &oacc_head, &oacc_tail, ctx);
 
   /* Add OpenACC partitioning and reduction markers just before the loop. */
@@ -9872,8 +9936,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   clauses = gimple_omp_target_clauses (stmt);
 
-  oacc_record_private_var_clauses (ctx, clauses);
-
   gimple_seq dep_ilist = NULL;
   gimple_seq dep_olist = NULL;
   if (omp_find_clause (clauses, OMP_CLAUSE_DEPEND))
@@ -10242,8 +10304,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   if (offloaded)
     {
-      mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
-
       /* Declare all the variables created by mapping and the variables
          declared in the scope of the target body. */
       record_vars_into (ctx->block_vars, child_fn);
@@ -11195,8 +11255,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
              them as a dummy GANG loop. */
           tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
 
+          gcall *private_marker = make_oacc_private_marker (ctx);
+
+          if (private_marker)
+            gimple_call_set_arg (private_marker, 2, level);
+
           lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
-                                 false, NULL, NULL, &fork_seq, &join_seq, ctx);
+                                 false, NULL, private_marker, NULL, &fork_seq,
+                                 &join_seq, ctx);
         }
 
       gimple_seq_add_seq (&new_body, fork_seq);
@@ -11307,26 +11373,6 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p, omp_context *ctx)
                        gimple_build_omp_return (false));
 }
 
-/* Find gang-private variables in a context. */
-
-static int
-process_oacc_gangprivate_1 (splay_tree_node node, void * /* data */)
-{
-  omp_context *ctx = (omp_context *) node->value;
-  unsigned level_total = 0;
-  omp_context *thisctx;
-
-  for (thisctx = ctx; thisctx; thisctx = thisctx->outer)
-    level_total += thisctx->oacc_partitioning_levels;
-
-  /* If the current context and parent contexts are distributed over a
-     total of one parallelism level, we have gang partitioning. */
-  if (level_total == 1)
-    mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
-
-  return 0;
-}
-
 /* Helper to lookup dynamic array through nested omp contexts. Returns
    TREE_LIST of dimensions, and the CTX where it was found in *CTX_P. */
 
@@ -11666,7 +11712,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
                                ctx);
       break;
     case GIMPLE_BIND:
-      oacc_record_vars_in_bind (ctx, gimple_bind_vars (as_a <gbind *> (stmt)));
+      if (ctx && is_gimple_omp_oacc (ctx->stmt))
+        oacc_record_vars_in_bind (ctx,
+                                  gimple_bind_vars (as_a <gbind *> (stmt)));
       lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
       maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
       break;
@@ -11917,7 +11965,6 @@ execute_lower_omp (void)
 
   if (all_contexts)
     {
-      splay_tree_foreach (all_contexts, process_oacc_gangprivate_1, NULL);
       splay_tree_delete (all_contexts);
       all_contexts = NULL;
     }
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index a6f64aa..e489ad3 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -1110,7 +1110,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
             = ((enum ifn_unique_kind)
                TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
 
-          if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
+          if (k == IFN_UNIQUE_OACC_FORK
+              || k == IFN_UNIQUE_OACC_JOIN
+              || k == IFN_UNIQUE_OACC_PRIVATE)
             *gimple_call_arg_ptr (stmt, 2) = replacement;
           else if (k == kind && stmt != from)
             break;
@@ -1828,6 +1830,8 @@ execute_oacc_device_lower ()
   for (unsigned i = 0; i < GOMP_DIM_MAX; i++)
     dims[i] = oacc_get_fn_dim_size (current_function_decl, i);
 
+  hash_set<tree> adjusted_vars;
+
   /* Now lower internal loop functions to target-specific code
      sequences. */
   basic_block bb;
@@ -1904,6 +1908,43 @@ execute_oacc_device_lower ()
               case IFN_UNIQUE_OACC_TAIL_MARK:
                 remove = true;
                 break;
+
+              case IFN_UNIQUE_OACC_PRIVATE:
+                {
+                  HOST_WIDE_INT level
+                    = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+                  if (level == -1)
+                    break;
+                  for (unsigned i = 3;
+                       i < gimple_call_num_args (call);
+                       i++)
+                    {
+                      tree arg = gimple_call_arg (call, i);
+                      gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+                      tree decl = TREE_OPERAND (arg, 0);
+                      if (dump_file && (dump_flags & TDF_DETAILS))
+                        {
+                          static char const *const axes[] =
+                          /* Must be kept in sync with GOMP_DIM
+                             enumeration. */
+                            { "gang", "worker", "vector" };
+                          fprintf (dump_file, "Decl UID %u has %s "
+                                   "partitioning:", DECL_UID (decl),
+                                   axes[level]);
+                          print_generic_decl (dump_file, decl, TDF_SLIM);
+                          fputc ('\n', dump_file);
+                        }
+                      if (targetm.goacc.adjust_private_decl)
+                        {
+                          tree oldtype = TREE_TYPE (decl);
+                          targetm.goacc.adjust_private_decl (decl, level);
+                          if (TREE_TYPE (decl) != oldtype)
+                            adjusted_vars.add (decl);
+                        }
+                    }
+                  remove = true;
+                }
+                break;
               }
             break;
           }
@@ -1952,21 +1993,10 @@ execute_oacc_device_lower ()
      uses (2). At least on AMD GCN, there are atomic operations that work
      directly in the LDS address space. */
 
-  if (targetm.goacc.adjust_gangprivate_decl)
+  if (targetm.goacc.adjust_private_decl)
     {
       tree var;
       unsigned i;
-      hash_set<tree> adjusted_vars;
-
-      FOR_EACH_LOCAL_DECL (cfun, i, var)
-        {
-          if (!VAR_P (var)
-              || !lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (var)))
-            continue;
-
-          targetm.goacc.adjust_gangprivate_decl (var);
-          adjusted_vars.add (var);
-        }
 
       FOR_ALL_BB_FN (bb, cfun)
         for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
diff --git a/gcc/omp-sese.c b/gcc/omp-sese.c
index d726701..13d803f 100644
--- a/gcc/omp-sese.c
+++ b/gcc/omp-sese.c
@@ -713,19 +713,61 @@ find_partitioned_var_uses (parallel_g *par, unsigned outer_mask,
     }
 }
 
+/* Gang-private variables (typically placed in a GPU's shared memory) do not
+   need to be processed by the worker-propagation mechanism. Populate the
+   GANGPRIVATE_VARS set with any such variables found in the current
+   function. */
+
+static void
+find_gangprivate_vars (hash_set<tree> *gangprivate_vars)
+{
+  basic_block block;
+
+  FOR_EACH_BB_FN (block, cfun)
+    {
+      for (gimple_stmt_iterator gsi = gsi_start_bb (block);
+           !gsi_end_p (gsi);
+           gsi_next (&gsi))
+        {
+          gimple *stmt = gsi_stmt (gsi);
+
+          if (gimple_call_internal_p (stmt, IFN_UNIQUE))
+            {
+              enum ifn_unique_kind k = ((enum ifn_unique_kind)
+                TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
+              if (k == IFN_UNIQUE_OACC_PRIVATE)
+                {
+                  HOST_WIDE_INT level
+                    = TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
+                  if (level != GOMP_DIM_GANG)
+                    continue;
+                  for (unsigned i = 3; i < gimple_call_num_args (stmt); i++)
+                    {
+                      tree arg = gimple_call_arg (stmt, i);
+                      gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+                      tree decl = TREE_OPERAND (arg, 0);
+                      gangprivate_vars->add (decl);
+                    }
+                }
+            }
+        }
+    }
+}
+
 static void
 find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask,
                               hash_set<tree> *partitioned_var_uses,
+                              hash_set<tree> *gangprivate_vars,
                               vec<propagation_set *> *prop_set)
 {
   unsigned mask = outer_mask | par->mask;
 
   if (par->inner)
     find_local_vars_to_propagate (par->inner, mask, partitioned_var_uses,
-                                  prop_set);
+                                  gangprivate_vars, prop_set);
   if (par->next)
     find_local_vars_to_propagate (par->next, outer_mask, partitioned_var_uses,
-                                  prop_set);
+                                  gangprivate_vars, prop_set);
 
   if (!(mask & GOMP_DIM_MASK (GOMP_DIM_WORKER)))
     {
@@ -747,8 +789,7 @@ find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask,
                   || is_global_var (var)
                   || AGGREGATE_TYPE_P (TREE_TYPE (var))
                   || !partitioned_var_uses->contains (var)
-                  || lookup_attribute ("oacc gangprivate",
-                                       DECL_ATTRIBUTES (var)))
+                  || gangprivate_vars->contains (var))
                 continue;
 
               if (stmt_may_clobber_ref_p (stmt, var))
@@ -1353,9 +1394,12 @@ oacc_do_neutering (void)
                                    &prop_set);
 
   hash_set<tree> partitioned_var_uses;
+  hash_set<tree> gangprivate_vars;
 
+  find_gangprivate_vars (&gangprivate_vars);
   find_partitioned_var_uses (par, mask, &partitioned_var_uses);
-  find_local_vars_to_propagate (par, mask, &partitioned_var_uses, &prop_set);
+  find_local_vars_to_propagate (par, mask, &partitioned_var_uses,
+                                &gangprivate_vars, &prop_set);
 
   FOR_ALL_BB_FN (bb, cfun)
     {
diff --git a/gcc/target.def b/gcc/target.def
index c9c3f65..d490138 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1730,9 +1730,10 @@ rtx, (tree var),
 NULL)
 
 DEFHOOK
-(adjust_gangprivate_decl,
-"Tweak variable declaration for a gang-private variable.",
-void, (tree var),
+(adjust_private_decl,
+"Tweak variable declaration for a private variable at the specified\n\
+parallelism level.",
+void, (tree var, int),
 NULL)
 
 DEFHOOK
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index bf880ac..b1748ac 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,12 @@
 2019-10-16 Julian Brown <julian@codesourcery.com>
 
+	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
+	oaccdevlow dump and update scanned output.
+	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
+	Add missing atomic to force worker partitioning for test variable.
+
+2019-10-16 Julian Brown <julian@codesourcery.com>
+
 	* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Support AMD GCN.
 
 2019-10-09 Tobias Burnus <tobias@codesourcery.com>
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
index 9158b6f..dafc70c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
@@ -1,8 +1,8 @@
 ! Test for "oacc gangprivate" attribute on gang-private variables
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-omplower-details" }
-! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl: integer\\(kind=4\\) w;" 1 "omplower" } } */
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning: integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
 
 program main
   integer :: w, arr(0:31)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
index d147229..90e06be 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
@@ -1,8 +1,8 @@
-! Test for lack of "oacc gangprivate" attribute on worker-private variables
+! Test for worker-private variables
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-omplower-details" }
-! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl" 0 "omplower" } } */
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning: integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
 
 program main
   integer :: w, arr(0:31)
@@ -13,7 +13,9 @@ program main
       w = 0
       !$acc loop seq
       do i = 0, 31
+        !$acc atomic update
         w = w + 1
+        !$acc end atomic
       end do
       arr(j) = w
     end do
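The marker layout described in the comment on `make_oacc_private_marker`, and the level filter applied by `find_gangprivate_vars`, can be modelled outside the compiler. The sketch below is a toy under stated assumptions: every `toy_` name is hypothetical, variable names stand in for the `&var` address arguments, and only the level values (0 for gang, 1 for worker, 2 for vector, -1 for a level not yet filled in) are taken from the patch text.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the partitioning levels; the numbering follows the
   "gang", "worker", "vector" order used by the dump code in the patch,
   with -1 as the placeholder a marker carries until its level is known.  */
enum toy_dim
{
  TOY_DIM_UNSET = -1,
  TOY_DIM_GANG = 0,
  TOY_DIM_WORKER = 1,
  TOY_DIM_VECTOR = 2
};

/* Toy model of one OACC_PRIVATE marker: the level is the marker's third
   argument, and VARS stands in for the trailing address arguments.  */
struct toy_private_marker
{
  int level;
  size_t nvars;
  const char *const *vars;
};

/* Store in OUT (capacity CAP) the variables listed by markers whose level
   is gang, skipping every other level.  Returns the number stored.  */
size_t
toy_collect_gang_private (const struct toy_private_marker *markers,
                          size_t nmarkers, const char **out, size_t cap)
{
  assert (markers != NULL || nmarkers == 0);

  size_t n = 0;
  for (size_t m = 0; m < nmarkers; m++)
    {
      if (markers[m].level != TOY_DIM_GANG)
        continue;
      for (size_t v = 0; v < markers[m].nvars && n < cap; v++)
        out[n++] = markers[m].vars[v];
    }
  return n;
}
```

Given one gang-level marker listing `w`, one worker-level marker and one marker still at -1, only `w` is collected, which mirrors how the patch exempts gang-private variables from worker propagation while leaving the others alone.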