aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorChung-Lin Tang <cltang@baylibre.com>2025-05-02 00:27:53 +0000
committerSandra Loosemore <sloosemore@baylibre.com>2025-05-15 20:25:51 +0000
commit5fd60a678116773e99d5fd2d64a118f837e5d6f0 (patch)
tree2821bef94bfb75dceeec8369d9227e3ba0fe1b9a
parent3dfe9733eabd74f21999dde36166bdedc5d06b1c (diff)
downloadgcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.zip
gcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.tar.gz
gcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.tar.bz2
OpenACC 2.7: Implement reductions for arrays and records
This patch is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675222.html This patch implements reductions for arrays, array sections, and struct/record types as according to the OpenACC 2.7 specification. 2025-02-23 Chung-Lin Tang <cltang@baylibre.com> gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_variable_list): Adjust parsing of opening square bracket. (c_parser_omp_clause_reduction): Adjustments for OpenACC-specific cases. * c-typeck.cc (c_oacc_reduction_defined_type_p): New function. (c_oacc_reduction_code_name): Likewise. (c_finish_omp_clauses): Handle OpenACC cases using new functions. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_var_list_no_open): Adjust parsing of opening square bracket. (cp_parser_omp_clause_reduction): Adjustments for OpenACC-specific cases. * semantics.cc (cp_oacc_reduction_defined_type_p): New function. (cp_oacc_reduction_code_name): Likewise. (finish_omp_reduction_clause): Handle OpenACC cases using new functions. gcc/fortran/ChangeLog: * openmp.cc (oacc_reduction_defined_type_p): New function. (resolve_omp_clauses): Adjust OpenACC array reduction error case. Adjust OMP_LIST_REDUCTION case. Use oacc_reduction_defined_type_p for OpenACC. * trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr): Add 'stmtblock_t *block', and 'bool openacc' parameters. Add array and array section handling for openacc case. Adjust part of function to be !openacc only. (gfc_trans_omp_reduction_list): Add 'stmtblock_t *block', and 'bool openacc' parameters, pass to calls to gfc_trans_omp_array_reduction_or_udr. (gfc_trans_omp_array_section): Adjust setting of OMP_CLAUSE_SIZE to only OMP_CLAUSE_MAP clauses. Adjust calculations of decls and bias to use temporary variables instead of tree expression inside clauses. (gfc_trans_omp_clauses): Add 'block' and 'openacc' arguments to calls to gfc_trans_omp_reduction_list. (gfc_trans_omp_do): Pass 'op == EXEC_OACC_LOOP' as 'bool openacc' parameter in call to gfc_trans_omp_clauses. gcc/ChangeLog: * config/gcn/gcn-tree.cc (#include "omp-offload.h"): Add include. (#include "memmodel.h"): Add include. (gcn_array_reduction_buffers): New vec<tree> for holding DECLs for reduction buffer pointer variables. (gcn_lockfull_update): Add pointer type fold_converts. (gcn_reduction_update): Additions for handling ARRAY_TYPE, pointer to ARRAY_TYPE, and RECORD_TYPE reductions. (gcn_goacc_get_worker_red_decl): Adjust parameters to handle non-constant offset case. (gcn_goacc_get_worker_array_reduction_buffer): New function. (gcn_create_if_else_seq): New function. (gcn_create_do_while_loop_seq): New function. (gcn_goacc_reduction_setup): Adjustments to handle arrays and records. (gcn_goacc_reduction_init): Likewise. (gcn_goacc_reduction_fini): Likewise. (gcn_goacc_reduction_teardown): Likewise. * config/nvptx/nvptx.cc (nvptx_gen_shuffle): Properly generate V2SI shuffle using vec_extract op. (nvptx_expand_shared_addr): Adjustments to handle non-constant size. (nvptx_get_shared_red_addr): Adjust type/alignment calculations to use TYPE_SIZE/ALIGN_UNIT instead of machine mode based. (nvptx_get_shared_red_addr): New function with array_max_idx parameter. (nvptx_reduction_update): Additions for handling ARRAY_TYPE, pointer to ARRA_TYPE, and RECORD_TYPE reductions. (nvptx_goacc_reduction_setup): Likewise. (nvptx_goacc_reduction_init): Likewise. (nvptx_goacc_reduction_fini): Likewise. (nvptx_goacc_reduction_teardown): Likewise. * gimplify.cc (gimplify_scan_omp_clauses): Gimplify inside COMPONENT_REF and convert codes for OMP_CLAUSE_REDUCTION cases. Add DECL_P check for do_add/do_add_decl goto case. (gimplify_adjust_omp_clauses): Avoid GOMP_MAP_POINTER OMP_CLAUSE_SIZE handling for OpenACC kernels. Call omp_add_variable for ARRAY_REF index. Peel away array MEM_REF for decl lookup. * omp-low.cc (struct omp_context): Add 'hash_map<tree, tree> *block_vars_map' field. (omp_copy_decl_2): Create/lookup using ctx->block_vars_map first. Add new copy into ctx->block_vars_map. (install_var_field): Add 'bool field_may_exist = false' parameter. Adjust lookup assertions. (delete_omp_context): Add delete of ctx->block_vars_map. (scan_sharing_clauses): Adjust calls to install_var_field. Adjust ARRAY_REF pointer type building to use decl type, rather than generic ptr_type_node. For ARRAY_REFs on offloaded constructs, also add base expression as field lookup key. (omp_reduction_init_op): Add ARRAY_TYPE and RECORD_TYPE init op construction. (oacc_array_reduction_bias): New function. (lower_oacc_reductions): Add array reduction handling code. Arrays use a different mode of IFN parameters, using additional 'array_addr' and 'array_max_idx' arguments. The LHS var is a simple integer for dependency ordering. (lower_omp_target): Adjust 'offload' condition for GOMP_MAP_POINTER case. Generate BUILT_IN_ALLOCA_WITH_ALIGN to create private copy for reductions of non-constant size types. * omp-oacc-neuter-broadcast.cc (worker_single_copy): Add 'hash_set<tree> *array_reduction_base_vars' parameter. Avoid propagation for SSA_NAMEs used for array reduction accesses. (neuter_worker_single): Add 'hash_set<tree> *array_reduction_base_vars' parameter. Adjust recursive calls to self and worker_single_copy. (oacc_do_neutering): Add 'hash_set<tree> *array_reduction_base_vars' parameter. Adjust call to neuter_worker_single. (execute_omp_oacc_neuter_broadcast): Add local 'hash_set<tree> array_reduction_base_vars' declaration. Collect MEM_REF base-pointer SSA_NAMEs of arrays into array_reduction_base_vars. Add '&array_reduction_base_vars' argument to call of oacc_do_neutering. * omp-offload.cc (#include "cfghooks.h"): Add include. (oacc_build_array_copy): New function. (oacc_build_array_copy_loop): New function. (oacc_build_indexed_ssa_loop): New function. (default_goacc_reduction): Adjustments to handle arrays. * omp-offload.h (oacc_build_array_copy): New declaration. (oacc_build_array_copy_loop): New declaration. (oacc_build_indexed_ssa_loop): New declaration. * tree-loop-distribution.cc (generate_memset_builtin): Under OpenACC, when last stmt of pre-header block is a UNIQUE(OACC_FORK) internal-fn, split a new basic block to serve as place of insertion, otherwise may fail later checking because UNIQUE(OACC_FORK) counts as control flow stmt. (generate_memcpy_builtin): Likewise. gcc/testsuite/ChangeLog: * c-c++-common/goacc/readonly-2.c: Adjust test. * c-c++-common/goacc/reduction-9.c: Adjust test. * c-c++-common/goacc/reduction-11.c: New test. * c-c++-common/goacc/reduction-12.c: New test. * c-c++-common/goacc/reduction-13.c: New test. * c-c++-common/goacc/reduction-14.c: New test. * c-c++-common/goacc/reduction-15.c: New test. * c-c++-common/goacc/reduction-16.c: New test. * g++.dg/goacc/reductions-1.C: Adjust test. * gfortran.dg/goacc/array-reduction.f90: Adjust test. * gfortran.dg/goacc/enter-exit-data-2.f90: Adjust test. * gfortran.dg/goacc/finalize-1.f: Adjust test. * gfortran.dg/goacc/kernels-decompose-1.f95: Adjust test. * gfortran.dg/goacc/pr70828.f90: Adjust test. * gfortran.dg/goacc/reduction.f95: Adjust test. * gfortran.dg/gomp/target-enter-exit-data.f90: Adjust test. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/reduction.h (check_reduction_array_xx): New macro. (operator_apply): Likewise. (check_reduction_array_op): Likewise. (check_reduction_arraysec_op): Likewise. (function_apply): Likewise. (check_reduction_array_macro): Likewise. (check_reduction_arraysec_macro): Likewise. (check_reduction_xxx_xx_all): Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c: New test. * testsuite/libgomp.oacc-fortran/reduction-10.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-11.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-12.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-13.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-14.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-15.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-16.f90: New test.
-rw-r--r--gcc/c/c-parser.cc23
-rw-r--r--gcc/c/c-typeck.cc81
-rw-r--r--gcc/config/gcn/gcn-tree.cc623
-rw-r--r--gcc/config/nvptx/nvptx.cc407
-rw-r--r--gcc/cp/parser.cc13
-rw-r--r--gcc/cp/semantics.cc77
-rw-r--r--gcc/fortran/openmp.cc199
-rw-r--r--gcc/fortran/trans-openmp.cc171
-rw-r--r--gcc/gimplify.cc44
-rw-r--r--gcc/omp-low.cc446
-rw-r--r--gcc/omp-oacc-neuter-broadcast.cc124
-rw-r--r--gcc/omp-offload.cc173
-rw-r--r--gcc/omp-offload.h10
-rw-r--r--gcc/testsuite/c-c++-common/goacc/readonly-2.c4
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-11.c81
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-12.c60
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-13.c60
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-14.c46
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-15.c51
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-16.c30
-rw-r--r--gcc/testsuite/c-c++-common/goacc/reduction-9.c18
-rw-r--r--gcc/testsuite/g++.dg/goacc/reductions-1.C108
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/array-reduction.f9047
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f908
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/finalize-1.f4
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f9510
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/pr70828.f902
-rw-r--r--gcc/testsuite/gfortran.dg/goacc/reduction.f95176
-rw-r--r--gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f908
-rw-r--r--gcc/tree-loop-distribution.cc20
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c69
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c91
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c90
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c91
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c89
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c121
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h52
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90598
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90424
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90424
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90134
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f9068
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f9098
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f9099
44 files changed, 5130 insertions, 442 deletions
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 6d607e9..fbaa917 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -16945,8 +16945,7 @@ c_parser_omp_variable_list (c_parser *parser,
case OMP_CLAUSE_HAS_DEVICE_ADDR:
array_section_p = false;
dims.truncate (0);
- while ((ort != C_ORT_ACC || kind != OMP_CLAUSE_REDUCTION)
- && c_parser_next_token_is (parser, CPP_OPEN_SQUARE))
+ while (c_parser_next_token_is (parser, CPP_OPEN_SQUARE))
{
location_t loc = UNKNOWN_LOCATION;
tree low_bound = NULL_TREE, length = NULL_TREE;
@@ -18867,13 +18866,21 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind,
code = MAX_EXPR;
break;
}
+ if (ort == C_ORT_ACC)
+ goto name_error;
reduc_id = c_parser_peek_token (parser)->value;
break;
}
default:
- c_parser_error (parser,
- "expected %<+%>, %<*%>, %<-%>, %<&%>, "
- "%<^%>, %<|%>, %<&&%>, %<||%> or identifier");
+ name_error:
+ if (ort == C_ORT_OMP)
+ c_parser_error (parser,
+ "expected %<+%>, %<*%>, %<-%>, %<&%>, "
+ "%<^%>, %<|%>, %<&&%>, %<||%> or identifier");
+ else
+ c_parser_error (parser,
+ "expected %<+%>, %<*%>, %<-%>, %<&%>, "
+ "%<^%>, %<|%>, %<&&%>, %<||%>, %<min%> or %<max%>");
c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, 0);
return list;
}
@@ -18887,6 +18894,11 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind,
for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
{
+ OMP_CLAUSE_REDUCTION_CODE (c) = code;
+ /* OpenACC does not require anything below. */
+ if (ort == C_ORT_ACC)
+ continue;
+
tree d = OMP_CLAUSE_DECL (c), type;
if (TREE_CODE (d) != OMP_ARRAY_SECTION)
type = TREE_TYPE (d);
@@ -18910,7 +18922,6 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind,
}
while (TREE_CODE (type) == ARRAY_TYPE)
type = TREE_TYPE (type);
- OMP_CLAUSE_REDUCTION_CODE (c) = code;
if (task)
OMP_CLAUSE_REDUCTION_TASK (c) = 1;
else if (inscan)
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index a1731ff..68640ff 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -16248,6 +16248,68 @@ c_oacc_check_attachments (tree c)
return false;
}
+static bool
+c_oacc_reduction_defined_type_p (enum tree_code reduction_code, tree t)
+{
+ if (TREE_CODE (t) == INTEGER_TYPE)
+ return true;
+
+ if (FLOAT_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
+ switch (reduction_code)
+ {
+ case PLUS_EXPR:
+ case MULT_EXPR:
+ case MINUS_EXPR:
+ case TRUTH_ANDIF_EXPR:
+ case TRUTH_ORIF_EXPR:
+ return true;
+ case MIN_EXPR:
+ case MAX_EXPR:
+ return TREE_CODE (t) != COMPLEX_TYPE;
+ case BIT_AND_EXPR:
+ case BIT_XOR_EXPR:
+ case BIT_IOR_EXPR:
+ return false;
+ default:
+ gcc_unreachable ();
+ }
+
+ if (TREE_CODE (t) == ARRAY_TYPE)
+ return c_oacc_reduction_defined_type_p (reduction_code, TREE_TYPE (t));
+
+ if (TREE_CODE (t) == RECORD_TYPE)
+ {
+ for (tree fld = TYPE_FIELDS (t); fld; fld = TREE_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL
+ && !c_oacc_reduction_defined_type_p (reduction_code,
+ TREE_TYPE (fld)))
+ return false;
+ return true;
+ }
+
+ return false;
+}
+
+static const char *
+c_oacc_reduction_code_name (enum tree_code reduction_code)
+{
+ switch (reduction_code)
+ {
+ case PLUS_EXPR: return "+";
+ case MULT_EXPR: return "*";
+ case MINUS_EXPR: return "-";
+ case TRUTH_ANDIF_EXPR: return "&&";
+ case TRUTH_ORIF_EXPR: return "||";
+ case MIN_EXPR: return "min";
+ case MAX_EXPR: return "max";
+ case BIT_AND_EXPR: return "&";
+ case BIT_XOR_EXPR: return "^";
+ case BIT_IOR_EXPR: return "|";
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* For all elements of CLAUSES, validate them against their constraints.
Remove any elements from the list that are invalid. */
@@ -16447,9 +16509,22 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
break;
}
}
- if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) == NULL_TREE
- && (FLOAT_TYPE_P (type)
- || TREE_CODE (type) == COMPLEX_TYPE))
+ if (ort == C_ORT_ACC)
+ {
+ enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c);
+ if (!c_oacc_reduction_defined_type_p (r_code, TREE_TYPE (t)))
+ {
+ const char *r_name = c_oacc_reduction_code_name (r_code);
+ error_at (OMP_CLAUSE_LOCATION (c),
+ "%qE has invalid type for %<reduction(%s)%>",
+ t, r_name);
+ remove = true;
+ break;
+ }
+ }
+ else if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) == NULL_TREE
+ && (FLOAT_TYPE_P (type)
+ || TREE_CODE (type) == COMPLEX_TYPE))
{
enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c);
const char *r_name = NULL;
diff --git a/gcc/config/gcn/gcn-tree.cc b/gcc/config/gcn/gcn-tree.cc
index 4ae29bd..87c4267 100644
--- a/gcc/config/gcn/gcn-tree.cc
+++ b/gcc/config/gcn/gcn-tree.cc
@@ -35,6 +35,7 @@
#include "varasm.h"
#include "omp-low.h"
#include "omp-general.h"
+#include "omp-offload.h"
#include "internal-fn.h"
#include "tree-vrp.h"
#include "tree-ssanames.h"
@@ -44,6 +45,7 @@
#include "cgraph.h"
#include "targhooks.h"
#include "langhooks-def.h"
+#include "memmodel.h"
/* }}} */
/* {{{ OpenACC reductions. */
@@ -78,6 +80,9 @@ gcn_global_lock_addr ()
return build_fold_addr_expr (v);
}
+/* Pointer variables for array reduction buffers used. */
+static vec<tree> gcn_array_reduction_buffers;
+
/* Helper function for gcn_reduction_update.
Insert code to locklessly update *PTR with *PTR OP VAR just before
@@ -259,7 +264,8 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
/* Build and insert the reduction calculation. */
gimple_seq red_seq = NULL;
tree acc_in = make_ssa_name (var_type);
- tree ref_in = build_simple_mem_ref (ptr);
+ tree ref_in
+ = build_simple_mem_ref (fold_convert (build_pointer_type (var_type), ptr));
TREE_THIS_VOLATILE (ref_in) = 1;
gimplify_assign (acc_in, ref_in, &red_seq);
@@ -267,7 +273,8 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
tree update_expr = fold_build2 (op, var_type, ref_in, var);
gimplify_assign (acc_out, update_expr, &red_seq);
- tree ref_out = build_simple_mem_ref (ptr);
+ tree ref_out
+ = build_simple_mem_ref (fold_convert (build_pointer_type (var_type), ptr));
TREE_THIS_VOLATILE (ref_out) = 1;
gimplify_assign (ref_out, acc_out, &red_seq);
@@ -291,11 +298,144 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
static tree
gcn_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
- tree ptr, tree var, tree_code op)
+ tree ptr, tree var, tree_code op,
+ tree array_max_idx = NULL_TREE)
{
tree type = TREE_TYPE (var);
tree size = TYPE_SIZE (type);
+ if (!VAR_P (ptr))
+ {
+ tree t = make_ssa_name (TREE_TYPE (ptr));
+ gimple_seq seq = NULL;
+ gimplify_assign (t, ptr, &seq);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+ ptr = t;
+ }
+
+ if (TREE_CODE (type) == ARRAY_TYPE
+ || TREE_CODE (type) == POINTER_TYPE)
+ {
+ tree array_type;
+ if (TREE_CODE (type) == ARRAY_TYPE)
+ {
+ array_type = TREE_TYPE (var);
+ }
+ else if (TREE_CODE (type) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
+ {
+ array_type = TREE_TYPE (TREE_TYPE (var));
+ }
+ else if (TREE_CODE (type) == POINTER_TYPE)
+ {
+ array_type = TREE_TYPE (var);
+ }
+ else
+ gcc_unreachable ();
+
+ tree array_elem_type = TREE_TYPE (array_type);
+
+ gimple *g;
+ gimple_seq seq = NULL;
+ tree max_index = array_max_idx;
+ gcc_assert (array_max_idx);
+
+ tree init_index = make_ssa_name (TREE_TYPE (max_index));
+ tree loop_index = make_ssa_name (TREE_TYPE (max_index));
+ tree update_index = make_ssa_name (TREE_TYPE (max_index));
+
+ g = gimple_build_assign (init_index,
+ build_int_cst (TREE_TYPE (init_index), 0));
+ gimple_seq_add_stmt (&seq, g);
+ gimple *init_end = gimple_seq_last (seq);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ basic_block init_bb = gsi_bb (*gsi);
+ edge init_edge = split_block (init_bb, init_end);
+ basic_block loop_bb = init_edge->dest;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ seq = NULL;
+ g = gimple_build_assign (update_index, PLUS_EXPR, loop_index,
+ build_int_cst (TREE_TYPE (loop_index), 1));
+ gimple_seq_add_stmt (&seq, g);
+
+ g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL);
+ gimple_seq_add_stmt (&seq, g);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ edge post_edge = split_block (loop_bb, g);
+ basic_block post_bb = post_edge->dest;
+ loop_bb = post_edge->src;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ /* Place where we insert reduction code below. */
+ gimple_stmt_iterator reduction_code_gsi = gsi_start_bb (loop_bb);
+
+ post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU;
+ post_edge->probability = profile_probability::even ();
+ edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE);
+ loop_edge->probability = profile_probability::even ();
+ set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb);
+ set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
+ class loop *new_loop = alloc_loop ();
+ new_loop->header = loop_bb;
+ new_loop->latch = loop_bb;
+ add_loop (new_loop, loop_bb->loop_father);
+
+ gphi *phi = create_phi_node (loop_index, loop_bb);
+ add_phi_arg (phi, init_index, init_edge, loc);
+ add_phi_arg (phi, update_index, loop_edge, loc);
+
+ tree var_ptr = fold_convert (build_pointer_type (array_elem_type),
+ var);
+ tree idx = fold_build2 (MULT_EXPR, sizetype,
+ fold_convert (sizetype, loop_index),
+ TYPE_SIZE_UNIT (array_elem_type));
+ var_ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (var_ptr), var_ptr, idx);
+ tree var_aref = build_simple_mem_ref (var_ptr);
+ ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, idx);
+
+ gcn_reduction_update (loc, &reduction_code_gsi,
+ ptr, var_aref, op);
+
+ return build_simple_mem_ref (ptr);
+ }
+ else if (TREE_CODE (type) == RECORD_TYPE)
+ {
+ for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL)
+ {
+ tree var_fld_ref = build3 (COMPONENT_REF, TREE_TYPE (fld),
+ var, fld, NULL);
+ tree ptr_ref = build_simple_mem_ref (ptr);
+ tree ptr_fld_type
+ = build_qualified_type (TREE_TYPE (fld),
+ TYPE_QUALS (TREE_TYPE (ptr_ref)));
+ tree ptr_fld_ref = build3 (COMPONENT_REF, ptr_fld_type,
+ ptr_ref, fld, NULL);
+
+ if (TREE_CODE (TREE_TYPE (fld)) == ARRAY_TYPE)
+ {
+ tree array_elem_ptr_type
+ = build_pointer_type (TREE_TYPE (TREE_TYPE (fld)));
+ gcn_reduction_update
+ (loc, gsi,
+ fold_convert (array_elem_ptr_type,
+ build_fold_addr_expr (ptr_fld_ref)),
+ build_fold_addr_expr (var_fld_ref), op,
+ TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (fld))));
+ }
+ else
+ gcn_reduction_update (loc, gsi,
+ build_fold_addr_expr (ptr_fld_ref),
+ var_fld_ref, op);
+ }
+ return build_simple_mem_ref (ptr);
+ }
+
if (size == TYPE_SIZE (unsigned_type_node)
|| size == TYPE_SIZE (long_long_unsigned_type_node))
return gcn_lockless_update (loc, gsi, ptr, var, op);
@@ -306,7 +446,7 @@ gcn_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
/* Return a temporary variable decl to use for an OpenACC worker reduction. */
static tree
-gcn_goacc_get_worker_red_decl (tree type, unsigned offset)
+gcn_goacc_get_worker_red_decl (tree type, tree offset_expr)
{
machine_function *machfun = cfun->machine;
@@ -317,15 +457,165 @@ gcn_goacc_get_worker_red_decl (tree type, unsigned offset)
= build_qualified_type (type,
(TYPE_QUALS (type)
| ENCODE_QUAL_ADDR_SPACE (ADDR_SPACE_LDS)));
-
- gcc_assert (offset
- < (machfun->reduction_limit - machfun->reduction_base));
- tree ptr_type = build_pointer_type (var_type);
- tree addr = build_int_cst (ptr_type, machfun->reduction_base + offset);
+ tree addr;
+ if (TREE_CONSTANT (offset_expr))
+ {
+ unsigned offset = TREE_INT_CST_LOW (offset_expr);
+ gcc_assert (offset
+ < (machfun->reduction_limit - machfun->reduction_base));
+ tree ptr_type = build_pointer_type (var_type);
+ addr = build_int_cst (ptr_type, machfun->reduction_base + offset);
+ }
+ else
+ {
+ tree ptr_type = build_pointer_type (var_type);
+ tree red_base = build_int_cst (ptr_type, machfun->reduction_base);
+ addr = build2 (POINTER_PLUS_EXPR, ptr_type,
+ red_base, fold_convert (size_type_node, offset_expr));
+ }
return build_simple_mem_ref (addr);
}
+static tree
+gcn_goacc_get_worker_array_reduction_buffer (tree array_type,
+ tree array_max_idx,
+ gimple_seq *seq)
+{
+ gcc_assert (!gcn_array_reduction_buffers.is_empty ());
+ tree red_buf_ptr = gcn_array_reduction_buffers.last ();
+
+ tree ptr = make_ssa_name (ptr_type_node);
+ gimplify_assign (ptr, red_buf_ptr, seq);
+
+ tree whole_block_ptr;
+ if (TREE_CODE (array_type) == ARRAY_TYPE)
+ whole_block_ptr = fold_convert (build_pointer_type (array_type), ptr);
+ else
+ whole_block_ptr = array_type;
+
+ tree arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG);
+ tree gang_id = make_ssa_name (integer_type_node);
+ gimple *gang_id_call = gimple_build_call_internal (IFN_GOACC_DIM_POS, 1, arg);
+ gimple_call_set_lhs (gang_id_call, gang_id);
+ gimple_seq_add_stmt (seq, gang_id_call);
+
+ tree len = fold_build2 (PLUS_EXPR, size_type_node, array_max_idx,
+ size_int (1));
+ tree elem_size = TYPE_SIZE_UNIT (TREE_TYPE (array_type));
+ tree array_size_expr = build2 (MULT_EXPR, size_type_node, len, elem_size);
+ tree type_size = make_ssa_name (size_type_node);
+ gimplify_assign (type_size, array_size_expr, seq);
+
+ tree idx = make_ssa_name (size_type_node);
+ gimplify_assign (idx, build2 (MULT_EXPR, size_type_node, type_size,
+ fold_convert (size_type_node, gang_id)), seq);
+
+ tree addr = fold_convert (ptr_type_node, whole_block_ptr);;
+ addr = build2 (POINTER_PLUS_EXPR, ptr_type_node, addr, idx);
+ addr = fold_convert (build_pointer_type (array_type), addr);
+
+ tree addr_reg = make_ssa_name (build_pointer_type (array_type));
+ gimplify_assign (addr_reg, addr, seq);
+
+ return build_simple_mem_ref (addr_reg);
+}
+
+static void
+gcn_create_if_else_seq (gimple_stmt_iterator *gsi_p, gimple *split_stmt,
+ gimple_seq *then_seq, gimple_seq *else_seq)
+{
+ basic_block init_bb = gsi_bb (*gsi_p);
+
+ edge fallthru_edge = split_block (init_bb, split_stmt);
+ basic_block then_bb = fallthru_edge->dest;
+
+ /* Reset the iterator. */
+ *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p));
+
+ gimple *then_seq_end = gimple_seq_last (*then_seq);
+ gsi_insert_seq_before (gsi_p, *then_seq, GSI_SAME_STMT);
+
+ basic_block last_bb = then_bb;
+ gimple *last_seq_end = then_seq_end;
+
+ basic_block else_bb = NULL;
+ edge then_else_fallthru_edge = NULL;
+ if (else_seq)
+ {
+ then_else_fallthru_edge = split_block (then_bb, then_seq_end);
+ else_bb = then_else_fallthru_edge->dest;
+
+ /* Reset the iterator. */
+ *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p));
+
+ gimple *else_seq_end = gimple_seq_last (*else_seq);
+ gsi_insert_seq_before (gsi_p, *else_seq, GSI_SAME_STMT);
+
+ last_bb = else_bb;
+ last_seq_end = else_seq_end;
+ }
+
+ edge post_edge = split_block (last_bb, last_seq_end);
+ basic_block post_bb = post_edge->dest;
+
+ /* Reset the iterator. */
+ *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p));
+
+ edge if_true_edge = make_edge (init_bb, (else_seq ? else_bb : post_bb),
+ EDGE_TRUE_VALUE);
+ if_true_edge->probability = profile_probability::even ();
+ fallthru_edge->flags = EDGE_FALSE_VALUE;
+ fallthru_edge->probability = profile_probability::even ();
+
+ post_edge->flags = EDGE_FALLTHRU;
+ post_edge->probability = profile_probability::always ();
+
+ set_immediate_dominator (CDI_DOMINATORS, then_bb, init_bb);
+ set_immediate_dominator (CDI_DOMINATORS, post_bb, init_bb);
+
+ if (else_seq)
+ {
+ redirect_edge_and_branch (then_else_fallthru_edge, post_bb);
+ set_immediate_dominator (CDI_DOMINATORS, else_bb, init_bb);
+ }
+}
+
+static void
+gcn_create_do_while_loop_seq (gimple_stmt_iterator *gsi_p,
+ gimple_seq *body_seq, int edge_flags)
+{
+ gimple *g = NULL;
+ basic_block init_bb = gsi_bb (*gsi_p);
+ edge init_edge = split_block (init_bb, g);
+ basic_block loop_bb = init_edge->dest;
+ init_bb = init_edge->src;
+
+ /* Reset the iterator. */
+ *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p));
+
+ gimple_stmt_iterator loop_gsi = gsi_start_bb (loop_bb);
+
+ gimple *body_seq_end = gimple_seq_last (*body_seq);
+ gsi_insert_seq_before (&loop_gsi, *body_seq, GSI_SAME_STMT);
+
+ edge post_edge = split_block (loop_bb, body_seq_end);
+ basic_block post_bb = post_edge->dest;
+
+ /* Reset the iterator. */
+ *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p));
+
+ make_edge (loop_bb, loop_bb, edge_flags);
+ post_edge->flags = EDGE_FALSE_VALUE;
+ set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb);
+ set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
+
+ loop *loop = alloc_loop ();
+ loop->header = loop_bb;
+ loop->latch = loop_bb;
+ add_loop (loop, loop_bb->loop_father);
+}
+
/* Expand IFN_GOACC_REDUCTION_SETUP. */
static void
@@ -335,35 +625,155 @@ gcn_goacc_reduction_setup (gcall *call)
tree lhs = gimple_call_lhs (call);
tree var = gimple_call_arg (call, 2);
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
gimple_seq seq = NULL;
push_gimplify_context (true);
+ /* Copy the receiver object. */
+ tree ref_to_res = gimple_call_arg (call, 1);
+
if (level != GOMP_DIM_GANG)
{
- /* Copy the receiver object. */
- tree ref_to_res = gimple_call_arg (call, 1);
-
if (!integer_zerop (ref_to_res))
- var = build_simple_mem_ref (ref_to_res);
+ {
+ if (!array_p)
+ var = build_simple_mem_ref (ref_to_res);
+ }
+ }
+
+ if (array_p && !integer_zerop (ref_to_res))
+ {
+ gimple_seq condseq = NULL;
+
+ /* Create global variable to store pointer to array reduction buffer. */
+ tree reduction_buffer_ptr_type
+ = build_qualified_type (ptr_type_node, TYPE_QUAL_VOLATILE);
+ tree reduction_buffer_ptr
+ = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ create_tmp_var_name ("gcn_array_reduction_buf"),
+ reduction_buffer_ptr_type);
+ TREE_STATIC (reduction_buffer_ptr) = 1;
+ TREE_PUBLIC (reduction_buffer_ptr) = 0;
+ DECL_INITIAL (reduction_buffer_ptr) = 0;
+ DECL_EXTERNAL (reduction_buffer_ptr) = 0;
+
+ varpool_node::add (reduction_buffer_ptr);
+
+ tree reduction_buffer_ptr_addr = make_ssa_name (ptr_type_node);
+ gimplify_assign (reduction_buffer_ptr_addr,
+ build_fold_addr_expr (reduction_buffer_ptr), &condseq);
+
+ tree gang_dim_arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG);
+ tree gang_pos = make_ssa_name (integer_type_node);
+ gimple *gang_pos_call = gimple_build_call_internal (IFN_GOACC_DIM_POS,
+ 1, gang_dim_arg);
+ gimple_call_set_lhs (gang_pos_call, gang_pos);
+ gimple_seq_add_stmt (&condseq, gang_pos_call);
+ gimple *cond = gimple_build_cond (NE_EXPR, gang_pos, integer_zero_node,
+ NULL, NULL);
+ gimple_seq_add_stmt (&condseq, cond);
+ gimple *cond_end = gimple_seq_last (condseq);
+ gsi_insert_seq_before (&gsi, condseq, GSI_SAME_STMT);
+
+ gimple_seq malloc_seq = NULL;
+ tree gang_num = make_ssa_name (integer_type_node);
+ gimple *gang_num_call = gimple_build_call_internal (IFN_GOACC_DIM_SIZE,
+ 1, gang_dim_arg);
+ gimple_call_set_lhs (gang_num_call, gang_num);
+ gimple_seq_add_stmt (&malloc_seq, gang_num_call);
+
+ tree len = fold_build2 (PLUS_EXPR, size_type_node, array_max_idx,
+ size_int (1));
+ tree elem_size = TYPE_SIZE_UNIT (TREE_TYPE (array_type));
+ tree malloc_size_expr = build2 (MULT_EXPR, size_type_node, len,
+ elem_size);
+ malloc_size_expr = build2 (MULT_EXPR, size_type_node, malloc_size_expr,
+ fold_convert (size_type_node, gang_num));
+ tree malloc_size = make_ssa_name (size_type_node);
+ gimplify_assign (malloc_size, malloc_size_expr, &malloc_seq);
+
+ tree ptr = make_ssa_name (ptr_type_node);
+ tree malloc_decl = builtin_decl_explicit (BUILT_IN_MALLOC);
+ gcall *stmt = gimple_build_call (malloc_decl, 1, malloc_size);
+ gimple_call_set_lhs (stmt, ptr);
+ gimple_seq_add_stmt (&malloc_seq, stmt);
+
+ tree atomic_store_decl = builtin_decl_explicit (BUILT_IN_ATOMIC_STORE_8);
+ gcall *atomic_store
+ = gimple_build_call (atomic_store_decl, 3, reduction_buffer_ptr_addr,
+ ptr, build_int_cst (integer_type_node,
+ MEMMODEL_RELEASE));
+ gimple_seq_add_stmt (&malloc_seq, atomic_store);
+
+ gimple_seq wait_seq = NULL;
+ gimple *nop = gimple_build_nop ();
+ gimple_seq_add_stmt (&wait_seq, nop);
+
+ gcn_create_if_else_seq (&gsi, cond_end, &malloc_seq, &wait_seq);
+
+ /* Create cmp-swap loop for other gangs to wait for
+ gcn_array_reduction_buf.* to be properly set by gang zero. */
+ gimple_stmt_iterator ngsi = gsi_for_stmt (nop);
+
+ gimple_seq atomic_load_seq = NULL;
+ tree loadval = make_ssa_name (size_type_node);
+ tree atomic_load_decl = builtin_decl_explicit (BUILT_IN_ATOMIC_LOAD_8);
+ gcall *atomic_load
+ = gimple_build_call (atomic_load_decl, 2, reduction_buffer_ptr_addr,
+ build_int_cst (integer_type_node,
+ MEMMODEL_ACQUIRE));
+ gimple_call_set_lhs (atomic_load, loadval);
+ gimple_seq_add_stmt (&atomic_load_seq, atomic_load);
+ cond = gimple_build_cond (EQ_EXPR, loadval, size_zero_node,
+ NULL_TREE, NULL_TREE);
+ gimple_seq_add_stmt (&atomic_load_seq, cond);
+
+ gcn_create_do_while_loop_seq (&ngsi, &atomic_load_seq, EDGE_TRUE_VALUE);
+ gcn_array_reduction_buffers.safe_push (reduction_buffer_ptr);
}
if (level == GOMP_DIM_WORKER)
{
- tree var_type = TREE_TYPE (var);
- /* Store incoming value to worker reduction buffer. */
tree offset = gimple_call_arg (call, 5);
- tree decl
- = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset));
-
- gimplify_assign (decl, var, &seq);
+ if (array_p)
+ {
+ tree decl = gcn_goacc_get_worker_array_reduction_buffer
+ (array_type, array_max_idx, &seq);
+ tree ptr = make_ssa_name (TREE_TYPE (array_addr));
+ gimplify_assign (ptr, build_fold_addr_expr (decl), &seq);
+
+ /* Store incoming value to worker reduction buffer. */
+ oacc_build_array_copy (ptr, array_addr, array_max_idx, &seq);
+ }
+ else
+ {
+ tree var_type = TREE_TYPE (var);
+ /* Store incoming value to worker reduction buffer. */
+ tree decl = gcn_goacc_get_worker_red_decl (var_type, offset);
+ gimplify_assign (decl, var, &seq);
+ }
}
if (lhs)
- gimplify_assign (lhs, var, &seq);
+ gimplify_assign (lhs, unshare_expr (var), &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* Expand IFN_GOACC_REDUCTION_INIT. */
@@ -377,12 +787,55 @@ gcn_goacc_reduction_init (gcall *call)
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
enum tree_code rcode
= (enum tree_code) TREE_INT_CST_LOW (gimple_call_arg (call, 4));
- tree init = omp_reduction_init_op (gimple_location (call), rcode,
- TREE_TYPE (var));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
+ tree init = NULL_TREE;
gimple_seq seq = NULL;
push_gimplify_context (true);
+ if (array_p)
+ {
+ tree loop_index;
+ gimple_stmt_iterator loop_body_gsi;
+ oacc_build_indexed_ssa_loop (gimple_location (call), array_max_idx, &gsi,
+ &loop_index, &loop_body_gsi);
+
+ tree init_type = TREE_TYPE (array_type);
+ init = omp_reduction_init_op (gimple_location (call), rcode,
+ init_type);
+ gimple_seq seq = NULL;
+
+ tree ptr = fold_convert (ptr_type_node, array_addr);
+ tree offset = build2 (MULT_EXPR, sizetype,
+ fold_convert (sizetype, loop_index),
+ TYPE_SIZE_UNIT (init_type));
+
+ tree addr = build2 (POINTER_PLUS_EXPR, build_pointer_type (init_type),
+ ptr, offset);
+ tree ref = build_simple_mem_ref (addr);
+
+ push_gimplify_context (true);
+ gimplify_assign (ref, init, &seq);
+ pop_gimplify_context (NULL);
+ gsi_insert_seq_before (&loop_body_gsi, seq, GSI_SAME_STMT);
+ init = var;
+ }
+ else
+ init = omp_reduction_init_op (gimple_location (call), rcode,
+ TREE_TYPE (var));
+
if (level == GOMP_DIM_GANG)
{
/* If there's no receiver object, propagate the incoming VAR. */
@@ -395,7 +848,9 @@ gcn_goacc_reduction_init (gcall *call)
gimplify_assign (lhs, init, &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* Expand IFN_GOACC_REDUCTION_FINI. */
@@ -410,8 +865,13 @@ gcn_goacc_reduction_fini (gcall *call)
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
enum tree_code op
= (enum tree_code) TREE_INT_CST_LOW (gimple_call_arg (call, 4));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
gimple_seq seq = NULL;
- tree r = NULL_TREE;;
+ tree r = NULL_TREE;
push_gimplify_context (true);
@@ -419,11 +879,19 @@ gcn_goacc_reduction_fini (gcall *call)
if (level == GOMP_DIM_WORKER)
{
- tree var_type = TREE_TYPE (var);
tree offset = gimple_call_arg (call, 5);
- tree decl
- = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset));
-
+ tree decl;
+ if (array_p)
+ {
+ tree array_type = TREE_TYPE (TREE_TYPE (array_addr));
+ decl = gcn_goacc_get_worker_array_reduction_buffer
+ (array_type, array_max_idx, &seq);
+ }
+ else
+ {
+ tree var_type = TREE_TYPE (var);
+ decl = gcn_goacc_get_worker_red_decl (var_type, offset);
+ }
accum = build_fold_addr_expr (decl);
}
else if (integer_zerop (ref_to_res))
@@ -436,14 +904,22 @@ gcn_goacc_reduction_fini (gcall *call)
/* UPDATE the accumulator. */
gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
seq = NULL;
- r = gcn_reduction_update (gimple_location (call), &gsi, accum, var, op);
+ if (array_p)
+ {
+ gcn_reduction_update (gimple_location (call), &gsi, accum,
+ array_addr, op, array_max_idx);
+ r = var;
+ }
+ else
+ r = gcn_reduction_update (gimple_location (call), &gsi, accum, var, op);
}
if (lhs)
gimplify_assign (lhs, r, &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* Expand IFN_GOACC_REDUCTION_TEARDOWN. */
@@ -455,28 +931,90 @@ gcn_goacc_reduction_teardown (gcall *call)
tree lhs = gimple_call_lhs (call);
tree var = gimple_call_arg (call, 2);
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_accum = NULL_TREE;
+
gimple_seq seq = NULL;
push_gimplify_context (true);
if (level == GOMP_DIM_WORKER)
{
- tree var_type = TREE_TYPE (var);
-
- /* Read the worker reduction buffer. */
tree offset = gimple_call_arg (call, 5);
- tree decl
- = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset));
- var = decl;
+ if (array_p)
+ {
+ tree array_type = TREE_TYPE (TREE_TYPE (array_addr));
+ array_accum = gcn_goacc_get_worker_array_reduction_buffer
+ (array_type, array_max_idx, &seq);
+ }
+ else
+ {
+ tree var_type = TREE_TYPE (var);
+
+ /* Read the worker reduction buffer. */
+ tree decl = gcn_goacc_get_worker_red_decl (var_type, offset);
+ var = decl;
+ }
}
+ /* Write to the receiver object. */
+ tree ref_to_res = gimple_call_arg (call, 1);
if (level != GOMP_DIM_GANG)
{
- /* Write to the receiver object. */
- tree ref_to_res = gimple_call_arg (call, 1);
-
if (!integer_zerop (ref_to_res))
- gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq);
+ {
+ if (array_p)
+ {
+ tree ptr
+ = make_ssa_name (build_pointer_type (TREE_TYPE (array_addr)));
+ gimplify_assign (ptr, build_fold_addr_expr (array_accum), &seq);
+ oacc_build_array_copy (ref_to_res, ptr, array_max_idx, &seq);
+ }
+ else
+ gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq);
+ }
+ else if (array_p)
+ {
+ tree ptr
+ = make_ssa_name (build_pointer_type (TREE_TYPE (array_accum)));
+ gimplify_assign (ptr, build_fold_addr_expr (array_accum), &seq);
+ oacc_build_array_copy (array_addr, ptr, array_max_idx, &seq);
+ }
+ }
+
+ if (array_p && !integer_zerop (ref_to_res))
+ {
+ gimple_seq condseq = NULL;
+ tree gang_dim_arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG);
+ tree gang_pos = make_ssa_name (integer_type_node);
+ gimple *gang_pos_call = gimple_build_call_internal (IFN_GOACC_DIM_POS,
+ 1, gang_dim_arg);
+ gimple_call_set_lhs (gang_pos_call, gang_pos);
+ gimple_seq_add_stmt (&condseq, gang_pos_call);
+ gimple *cond = gimple_build_cond (NE_EXPR, gang_pos, integer_zero_node,
+ NULL, NULL);
+ gimple_seq_add_stmt (&condseq, cond);
+ gimple *cond_end = gimple_seq_last (condseq);
+ gsi_insert_seq_before (&gsi, condseq, GSI_SAME_STMT);
+
+ gimple_seq free_seq = NULL;
+ gcc_assert (!gcn_array_reduction_buffers.is_empty ());
+ tree red_buf_ptr = gcn_array_reduction_buffers.last ();
+
+ tree ptr = make_ssa_name (ptr_type_node);
+ gimplify_assign (ptr, red_buf_ptr, &free_seq);
+
+ gcn_array_reduction_buffers.pop ();
+
+ tree free_decl = builtin_decl_explicit (BUILT_IN_FREE);
+ gcall *stmt = gimple_build_call (free_decl, 1, ptr);
+ gimple_seq_add_stmt (&free_seq, stmt);
+
+ gcn_create_if_else_seq (&gsi, cond_end, &free_seq, NULL);
}
if (lhs)
@@ -484,7 +1022,8 @@ gcn_goacc_reduction_teardown (gcall *call)
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* Implement TARGET_GOACC_REDUCTION.
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index e6d5254..ba40a84 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -2142,19 +2142,15 @@ nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, nvptx_shuffle_kind kind)
break;
case E_V2SImode:
{
- rtx src0 = gen_rtx_SUBREG (SImode, src, 0);
- rtx src1 = gen_rtx_SUBREG (SImode, src, 4);
- rtx dst0 = gen_rtx_SUBREG (SImode, dst, 0);
- rtx dst1 = gen_rtx_SUBREG (SImode, dst, 4);
rtx tmp0 = gen_reg_rtx (SImode);
rtx tmp1 = gen_reg_rtx (SImode);
start_sequence ();
- emit_insn (gen_movsi (tmp0, src0));
- emit_insn (gen_movsi (tmp1, src1));
+ emit_insn (gen_vec_extractv2sisi (tmp0, src, GEN_INT (0)));
+ emit_insn (gen_vec_extractv2sisi (tmp1, src, GEN_INT (1)));
emit_insn (nvptx_gen_shuffle (tmp0, tmp0, idx, kind));
emit_insn (nvptx_gen_shuffle (tmp1, tmp1, idx, kind));
- emit_insn (gen_movsi (dst0, tmp0));
- emit_insn (gen_movsi (dst1, tmp1));
+ emit_insn (gen_vec_setv2si (dst, tmp0, GEN_INT (0)));
+ emit_insn (gen_vec_setv2si (dst, tmp1, GEN_INT (1)));
res = get_insns ();
end_sequence ();
}
@@ -6512,15 +6508,24 @@ nvptx_expand_shared_addr (tree exp, rtx target,
unsigned align = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 2));
unsigned offset = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 0));
- unsigned size = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 1));
+ unsigned size = 0;
+
+ tree size_expr = CALL_EXPR_ARG (exp, 1);
rtx addr = worker_red_sym;
+ if (TREE_CONSTANT (size_expr))
+ size = TREE_INT_CST_LOW (size_expr);
+
if (vector)
{
offload_attrs oa;
populate_offload_attrs (&oa);
+ /* Default size for unknown size expression. */
+ if (size == 0)
+ size = 256;
+
unsigned int psize = ROUND_UP (size + offset, align);
unsigned int pnum = nvptx_mach_max_workers ();
vector_red_partition = MAX (vector_red_partition, psize);
@@ -6536,7 +6541,8 @@ nvptx_expand_shared_addr (tree exp, rtx target,
else
{
worker_red_align = MAX (worker_red_align, align);
- worker_red_size = MAX (worker_red_size, size + offset);
+ if (size)
+ worker_red_size = MAX (worker_red_size, size + offset);
if (offset)
{
@@ -7141,16 +7147,32 @@ nvptx_get_shared_red_addr (tree type, tree offset, bool vector)
enum nvptx_builtins addr_dim = NVPTX_BUILTIN_WORKER_ADDR;
if (vector)
addr_dim = NVPTX_BUILTIN_VECTOR_ADDR;
- machine_mode mode = TYPE_MODE (type);
tree fndecl = nvptx_builtin_decl (addr_dim, true);
- tree size = build_int_cst (unsigned_type_node, GET_MODE_SIZE (mode));
- tree align = build_int_cst (unsigned_type_node,
- GET_MODE_ALIGNMENT (mode) / BITS_PER_UNIT);
+ tree size = TYPE_SIZE_UNIT (type);
+ tree align = build_int_cst (unsigned_type_node, TYPE_ALIGN_UNIT (type));
tree call = build_call_expr (fndecl, 3, offset, size, align);
return fold_convert (build_pointer_type (type), call);
}
+static tree
+nvptx_get_shared_red_addr (tree array_elem_type, tree array_max_idx,
+ tree offset, bool vector)
+{
+ tree fndecl = nvptx_builtin_decl ((vector
+ ? NVPTX_BUILTIN_VECTOR_ADDR
+ : NVPTX_BUILTIN_WORKER_ADDR),
+ true);
+ tree align = build_int_cst (unsigned_type_node,
+ TYPE_ALIGN_UNIT (array_elem_type));
+ tree array_length = fold_build2 (PLUS_EXPR, sizetype, array_max_idx,
+ build_int_cst (sizetype, 1));
+ tree size = fold_build2 (MULT_EXPR, sizetype,
+ TYPE_SIZE_UNIT (array_elem_type), array_length);
+ tree call = build_call_expr (fndecl, 3, offset, size, align);
+ return fold_convert (build_pointer_type (array_elem_type), call);
+}
+
/* Emit a SHFL.DOWN using index SHFL of VAR into DEST_VAR. This function
will cast the variable if necessary. */
@@ -7456,11 +7478,144 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
static tree
nvptx_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
- tree ptr, tree var, tree_code op, int level)
+ tree ptr, tree var, tree_code op, int level,
+ tree array_max_idx = NULL_TREE)
{
tree type = TREE_TYPE (var);
tree size = TYPE_SIZE (type);
+ if (!VAR_P (ptr))
+ {
+ tree t = make_ssa_name (TREE_TYPE (ptr));
+ gimple_seq seq = NULL;
+ gimplify_assign (t, ptr, &seq);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+ ptr = t;
+ }
+
+ if (TREE_CODE (type) == ARRAY_TYPE
+ || TREE_CODE (type) == POINTER_TYPE)
+ {
+ tree array_type;
+ if (TREE_CODE (type) == ARRAY_TYPE)
+ {
+ array_type = TREE_TYPE (var);
+ }
+ else if (TREE_CODE (type) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
+ {
+ array_type = TREE_TYPE (TREE_TYPE (var));
+ }
+ else if (TREE_CODE (type) == POINTER_TYPE)
+ {
+ array_type = TREE_TYPE (var);
+ }
+ else
+ gcc_unreachable ();
+
+ tree array_elem_type = TREE_TYPE (array_type);
+
+ gimple *g;
+ gimple_seq seq = NULL;
+ tree max_index = array_max_idx;
+ gcc_assert (array_max_idx);
+
+ tree init_index = make_ssa_name (TREE_TYPE (max_index));
+ tree loop_index = make_ssa_name (TREE_TYPE (max_index));
+ tree update_index = make_ssa_name (TREE_TYPE (max_index));
+
+ g = gimple_build_assign (init_index,
+ build_int_cst (TREE_TYPE (init_index), 0));
+ gimple_seq_add_stmt (&seq, g);
+ gimple *init_end = gimple_seq_last (seq);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ basic_block init_bb = gsi_bb (*gsi);
+ edge init_edge = split_block (init_bb, init_end);
+ basic_block loop_bb = init_edge->dest;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ seq = NULL;
+ g = gimple_build_assign (update_index, PLUS_EXPR, loop_index,
+ build_int_cst (TREE_TYPE (loop_index), 1));
+ gimple_seq_add_stmt (&seq, g);
+
+ g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL);
+ gimple_seq_add_stmt (&seq, g);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ edge post_edge = split_block (loop_bb, g);
+ basic_block post_bb = post_edge->dest;
+ loop_bb = post_edge->src;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ /* Place where we insert reduction code below. */
+ gimple_stmt_iterator reduction_code_gsi = gsi_start_bb (loop_bb);
+
+ post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU;
+ post_edge->probability = profile_probability::even ();
+ edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE);
+ loop_edge->probability = profile_probability::even ();
+ set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb);
+ set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
+ class loop *new_loop = alloc_loop ();
+ new_loop->header = loop_bb;
+ new_loop->latch = loop_bb;
+ add_loop (new_loop, loop_bb->loop_father);
+
+ gphi *phi = create_phi_node (loop_index, loop_bb);
+ add_phi_arg (phi, init_index, init_edge, loc);
+ add_phi_arg (phi, update_index, loop_edge, loc);
+
+ tree var_ptr = fold_convert (build_pointer_type (array_elem_type),
+ var);
+ tree idx = fold_build2 (MULT_EXPR, sizetype,
+ fold_convert (sizetype, loop_index),
+ TYPE_SIZE_UNIT (array_elem_type));
+ var_ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (var_ptr), var_ptr, idx);
+ tree var_aref = build_simple_mem_ref (var_ptr);
+ ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr),ptr, idx);
+
+ nvptx_reduction_update (loc, &reduction_code_gsi,
+ ptr, var_aref, op, level);
+
+ return build_simple_mem_ref (ptr);
+ }
+ else if (TREE_CODE (type) == RECORD_TYPE)
+ {
+ for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL)
+ {
+ tree var_fld_ref = build3 (COMPONENT_REF, TREE_TYPE (fld),
+ var, fld, NULL);
+ tree ptr_ref = build_simple_mem_ref (ptr);
+ tree ptr_fld_type
+ = build_qualified_type (TREE_TYPE (fld),
+ TYPE_QUALS (TREE_TYPE (ptr_ref)));
+ tree ptr_fld_ref = build3 (COMPONENT_REF, ptr_fld_type,
+ ptr_ref, fld, NULL);
+
+ if (TREE_CODE (TREE_TYPE (fld)) == ARRAY_TYPE)
+ {
+ tree array_elem_ptr_type
+ = build_pointer_type (TREE_TYPE (TREE_TYPE (fld)));
+ nvptx_reduction_update
+ (loc, gsi,
+ fold_convert (array_elem_ptr_type,
+ build_fold_addr_expr (ptr_fld_ref)),
+ build_fold_addr_expr (var_fld_ref), op, level,
+ TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (fld))));
+ }
+ else
+ nvptx_reduction_update (loc, gsi,
+ build_fold_addr_expr (ptr_fld_ref),
+ var_fld_ref, op, level);
+ }
+ return build_simple_mem_ref (ptr);
+ }
+
if (size == TYPE_SIZE (unsigned_type_node)
|| size == TYPE_SIZE (long_long_unsigned_type_node))
return nvptx_lockless_update (loc, gsi, ptr, var, op);
@@ -7602,6 +7757,19 @@ nvptx_goacc_reduction_setup (gcall *call, offload_attrs *oa)
tree lhs = gimple_call_lhs (call);
tree var = gimple_call_arg (call, 2);
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
gimple_seq seq = NULL;
push_gimplify_context (true);
@@ -7611,34 +7779,52 @@ nvptx_goacc_reduction_setup (gcall *call, offload_attrs *oa)
/* Copy the receiver object. */
tree ref_to_res = gimple_call_arg (call, 1);
- if (!integer_zerop (ref_to_res))
+ if (!integer_zerop (ref_to_res) && !array_p)
{
- ref_to_res = nvptx_adjust_reduction_type (ref_to_res, TREE_TYPE (var),
- &seq);
+ ref_to_res = nvptx_adjust_reduction_type (ref_to_res,
+ TREE_TYPE (var), &seq);
var = build_simple_mem_ref (ref_to_res);
}
}
if (level == GOMP_DIM_WORKER
- || (level == GOMP_DIM_VECTOR && oa->vector_length > PTX_WARP_SIZE))
+ || (level == GOMP_DIM_VECTOR
+ && (oa->vector_length > PTX_WARP_SIZE
+ || array_p
+ || TREE_CODE (TREE_TYPE (var)) == RECORD_TYPE)))
{
/* Store incoming value to worker reduction buffer. */
tree offset = gimple_call_arg (call, 5);
- tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
- level == GOMP_DIM_VECTOR);
- tree ptr = make_ssa_name (TREE_TYPE (call));
-
- gimplify_assign (ptr, call, &seq);
- tree ref = build_simple_mem_ref (ptr);
- TREE_THIS_VOLATILE (ref) = 1;
- gimplify_assign (ref, var, &seq);
+ tree call, ptr;
+ if (array_p)
+ {
+ tree array_elem_type = TREE_TYPE (array_type);
+ call = nvptx_get_shared_red_addr (array_elem_type, array_max_idx,
+ offset, level == GOMP_DIM_VECTOR);
+ ptr = make_ssa_name (TREE_TYPE (call));
+ gimplify_assign (ptr, call, &seq);
+ oacc_build_array_copy (fold_convert (TREE_TYPE (array_addr), ptr),
+ array_addr, array_max_idx, &seq);
+ }
+ else
+ {
+ call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
+ level == GOMP_DIM_VECTOR);
+ ptr = make_ssa_name (TREE_TYPE (call));
+ gimplify_assign (ptr, call, &seq);
+ tree ref = build_simple_mem_ref (ptr);
+ TREE_THIS_VOLATILE (ref) = 1;
+ gimplify_assign (ref, var, &seq);
+ }
}
if (lhs)
- gimplify_assign (lhs, var, &seq);
+ gimplify_assign (lhs, unshare_expr (var), &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* NVPTX implementation of GOACC_REDUCTION_INIT. */
@@ -7652,13 +7838,55 @@ nvptx_goacc_reduction_init (gcall *call, offload_attrs *oa)
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
enum tree_code rcode
= (enum tree_code)TREE_INT_CST_LOW (gimple_call_arg (call, 4));
- tree init = omp_reduction_init_op (gimple_location (call), rcode,
- TREE_TYPE (var));
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
+ tree init = NULL_TREE;
gimple_seq seq = NULL;
push_gimplify_context (true);
+ if (array_p)
+ {
+ tree loop_index;
+ gimple_stmt_iterator loop_body_gsi;
+ oacc_build_indexed_ssa_loop (gimple_location (call), array_max_idx, &gsi,
+ &loop_index, &loop_body_gsi);
+
+ tree init_type = TREE_TYPE (array_type);
+ init = omp_reduction_init_op (gimple_location (call), rcode,
+ init_type);
+ gimple_seq seq = NULL;
+
+ tree ptr = fold_convert (ptr_type_node, array_addr);
+ tree offset = build2 (MULT_EXPR, sizetype,
+ fold_convert (sizetype, loop_index),
+ TYPE_SIZE_UNIT (init_type));
+ tree addr = build2 (POINTER_PLUS_EXPR, build_pointer_type (init_type),
+ ptr, offset);
+ tree ref = build_simple_mem_ref (addr);
+
+ push_gimplify_context (true);
+ gimplify_assign (ref, init, &seq);
+ pop_gimplify_context (NULL);
+ gsi_insert_seq_before (&loop_body_gsi, seq, GSI_SAME_STMT);
+ init = var;
+ }
+ else
+ init = omp_reduction_init_op (gimple_location (call), rcode,
+ TREE_TYPE (var));
- if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE)
+ if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE
+ && !array_p
+ && TREE_CODE (TREE_TYPE (var)) != RECORD_TYPE)
{
/* Initialize vector-non-zeroes to INIT_VAL (OP). */
tree tid = make_ssa_name (integer_type_node);
@@ -7723,7 +7951,9 @@ nvptx_goacc_reduction_init (gcall *call, offload_attrs *oa)
}
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* NVPTX implementation of GOACC_REDUCTION_FINI. */
@@ -7738,25 +7968,49 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa)
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
enum tree_code op
= (enum tree_code)TREE_INT_CST_LOW (gimple_call_arg (call, 4));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
gimple_seq seq = NULL;
tree r = NULL_TREE;
push_gimplify_context (true);
- if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE)
+ if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE
+ && !array_p
+ && TREE_CODE (TREE_TYPE (var)) != RECORD_TYPE)
r = nvptx_vector_reduction (gimple_location (call), &gsi, var, op);
else
{
tree accum = NULL_TREE;
+ tree ptr = NULL_TREE;
if (level == GOMP_DIM_WORKER || level == GOMP_DIM_VECTOR)
{
/* Get reduction buffer address. */
tree offset = gimple_call_arg (call, 5);
- tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
- level == GOMP_DIM_VECTOR);
- tree ptr = make_ssa_name (TREE_TYPE (call));
-
+ tree call;
+ if (array_p)
+ {
+ tree array_elem_type = TREE_TYPE (array_type);
+ call = nvptx_get_shared_red_addr (array_elem_type, array_max_idx,
+ offset,
+ level == GOMP_DIM_VECTOR);
+ }
+ else
+ call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
+ level == GOMP_DIM_VECTOR);
+ ptr = make_ssa_name (TREE_TYPE (call));
gimplify_assign (ptr, call, &seq);
accum = ptr;
}
@@ -7774,8 +8028,16 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa)
/* UPDATE the accumulator. */
gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
seq = NULL;
- r = nvptx_reduction_update (gimple_location (call), &gsi,
- accum, var, op, level);
+ if (array_p)
+ {
+ nvptx_reduction_update (gimple_location (call), &gsi,
+ accum, array_addr, op, level,
+ array_max_idx);
+ r = var;
+ }
+ else
+ r = nvptx_reduction_update (gimple_location (call), &gsi,
+ accum, var, op, level);
}
}
@@ -7783,7 +8045,8 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa)
gimplify_assign (lhs, r, &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* NVPTX implementation of GOACC_REDUCTION_TEARDOWN. */
@@ -7795,21 +8058,47 @@ nvptx_goacc_reduction_teardown (gcall *call, offload_attrs *oa)
tree lhs = gimple_call_lhs (call);
tree var = gimple_call_arg (call, 2);
int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
+
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+ bool array_p = !integer_zerop (array_addr);
+
+ tree array_type = NULL_TREE;
+ if (array_p)
+ array_type
+ = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE
+ ? TREE_TYPE (TREE_TYPE (array_addr))
+ : TREE_TYPE (array_addr));
+
+ tree ptr = NULL_TREE;
gimple_seq seq = NULL;
push_gimplify_context (true);
if (level == GOMP_DIM_WORKER
- || (level == GOMP_DIM_VECTOR && oa->vector_length > PTX_WARP_SIZE))
+ || (level == GOMP_DIM_VECTOR && (oa->vector_length > PTX_WARP_SIZE
+ || array_p)))
{
/* Read the worker reduction buffer. */
tree offset = gimple_call_arg (call, 5);
- tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
- level == GOMP_DIM_VECTOR);
- tree ptr = make_ssa_name (TREE_TYPE (call));
-
- gimplify_assign (ptr, call, &seq);
- var = build_simple_mem_ref (ptr);
- TREE_THIS_VOLATILE (var) = 1;
+ if (array_p)
+ {
+ tree array_elem_type = TREE_TYPE (array_type);
+ tree call
+ = nvptx_get_shared_red_addr (array_elem_type, array_max_idx,
+ offset, level == GOMP_DIM_VECTOR);
+ ptr = make_ssa_name (TREE_TYPE (call));
+ gimplify_assign (ptr, call, &seq);
+ }
+ else
+ {
+ tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset,
+ level == GOMP_DIM_VECTOR);
+ ptr = make_ssa_name (TREE_TYPE (call));
+ gimplify_assign (ptr, call, &seq);
+ var = build_simple_mem_ref (ptr);
+ TREE_THIS_VOLATILE (var) = 1;
+ }
}
if (level != GOMP_DIM_GANG)
@@ -7819,18 +8108,28 @@ nvptx_goacc_reduction_teardown (gcall *call, offload_attrs *oa)
if (!integer_zerop (ref_to_res))
{
- ref_to_res = nvptx_adjust_reduction_type (ref_to_res, TREE_TYPE (var),
- &seq);
- gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq);
+ if (array_p)
+ oacc_build_array_copy (ref_to_res, ptr, array_max_idx, &seq);
+ else
+ {
+ ref_to_res = nvptx_adjust_reduction_type (ref_to_res,
+ TREE_TYPE (var), &seq);
+ gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq);
+ }
+ }
+ else if (array_p)
+ {
+ oacc_build_array_copy (array_addr, ptr, array_max_idx, &seq);
}
}
if (lhs)
- gimplify_assign (lhs, var, &seq);
+ gimplify_assign (lhs, unshare_expr (var), &seq);
pop_gimplify_context (NULL);
- gsi_replace_with_seq (&gsi, seq, true);
+ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+ gsi_remove (&gsi, true);
}
/* NVPTX reduction expander. */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 1c104eb..232881d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -39799,8 +39799,7 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
case OMP_CLAUSE_HAS_DEVICE_ADDR:
array_section_p = false;
dims.truncate (0);
- while ((ort != C_ORT_ACC || kind != OMP_CLAUSE_REDUCTION)
- && cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE))
+ while (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE))
{
location_t loc = UNKNOWN_LOCATION;
tree low_bound = NULL_TREE, length = NULL_TREE;
@@ -41512,6 +41511,12 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind,
code = TRUTH_ANDIF_EXPR;
else if (id == ovl_op_identifier (false, TRUTH_ORIF_EXPR))
code = TRUTH_ORIF_EXPR;
+ if (code == ERROR_MARK && ort == C_ORT_ACC)
+ {
+ cp_parser_error (parser, "expected %<+%>, %<*%>, %<-%>, %<&%>, "
+ "%<^%>, %<|%>, %<&&%>, %<||%>, %<min%> or %<max%>");
+ goto resync_fail;
+ }
id = omp_reduction_id (code, id, NULL_TREE);
tree scope = parser->scope;
if (scope)
@@ -41538,6 +41543,10 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind,
for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
{
OMP_CLAUSE_REDUCTION_CODE (c) = code;
+ /* OpenACC does not require anything below. */
+ if (ort == C_ORT_ACC)
+ continue;
+
if (task)
OMP_CLAUSE_REDUCTION_TASK (c) = 1;
else if (inscan)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index deb82c9..196eb9f 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -7187,6 +7187,69 @@ cp_check_omp_declare_reduction (tree udr)
return true;
}
+
+static bool
+cp_oacc_reduction_defined_type_p (enum tree_code reduction_code, tree t)
+{
+ if (TREE_CODE (t) == INTEGER_TYPE)
+ return true;
+
+ if (FLOAT_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
+ switch (reduction_code)
+ {
+ case PLUS_EXPR:
+ case MULT_EXPR:
+ case MINUS_EXPR:
+ case TRUTH_ANDIF_EXPR:
+ case TRUTH_ORIF_EXPR:
+ return true;
+ case MIN_EXPR:
+ case MAX_EXPR:
+ return TREE_CODE (t) != COMPLEX_TYPE;
+ case BIT_AND_EXPR:
+ case BIT_XOR_EXPR:
+ case BIT_IOR_EXPR:
+ return false;
+ default:
+ gcc_unreachable ();
+ }
+
+ if (TREE_CODE (t) == ARRAY_TYPE)
+ return cp_oacc_reduction_defined_type_p (reduction_code, TREE_TYPE (t));
+
+ if (TREE_CODE (t) == RECORD_TYPE)
+ {
+ for (tree fld = TYPE_FIELDS (t); fld; fld = TREE_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL
+ && !cp_oacc_reduction_defined_type_p (reduction_code,
+ TREE_TYPE (fld)))
+ return false;
+ return true;
+ }
+
+ return false;
+}
+
+static const char *
+cp_oacc_reduction_code_name (enum tree_code reduction_code)
+{
+ switch (reduction_code)
+ {
+ case PLUS_EXPR: return "+";
+ case MULT_EXPR: return "*";
+ case MINUS_EXPR: return "-";
+ case TRUTH_ANDIF_EXPR: return "&&";
+ case TRUTH_ORIF_EXPR: return "||";
+ case MIN_EXPR: return "min";
+ case MAX_EXPR: return "max";
+ case BIT_AND_EXPR: return "&";
+ case BIT_XOR_EXPR: return "^";
+ case BIT_IOR_EXPR: return "|";
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* Helper function of finish_omp_clauses. Clone STMT as if we were making
an inline call. But, remap
the OMP_DECL1 VAR_DECL (omp_out resp. omp_orig) to PLACEHOLDER
@@ -7333,6 +7396,20 @@ finish_omp_reduction_clause (tree c, enum c_omp_region_type ort,
return false;
}
+ if (ort == C_ORT_ACC)
+ {
+ enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c);
+ if (!cp_oacc_reduction_defined_type_p (r_code, TREE_TYPE (t)))
+ {
+ const char *r_name = cp_oacc_reduction_code_name (r_code);
+ error_at (OMP_CLAUSE_LOCATION (c),
+ "%qE has invalid type for %<reduction(%s)%>",
+ t, r_name);
+ return true;
+ }
+ return false;
+ }
+
tree id = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
type = TYPE_MAIN_VARIANT (type);
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index ddf4e50..563ba57 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -8482,6 +8482,95 @@ oacc_is_loop (gfc_code *code)
|| code->op == EXEC_OACC_LOOP;
}
+static bool
+oacc_reduction_defined_type_p (enum gfc_omp_reduction_op rop, gfc_typespec *ts,
+ const char **rop_name = NULL)
+{
+ gcc_assert (rop != OMP_REDUCTION_USER && rop != OMP_REDUCTION_NONE);
+
+ if (rop_name)
+ switch (rop)
+ {
+ case OMP_REDUCTION_MAX:
+ *rop_name = "max";
+ break;
+ case OMP_REDUCTION_MIN:
+ *rop_name = "min";
+ break;
+ case OMP_REDUCTION_IAND:
+ *rop_name = "iand";
+ break;
+ case OMP_REDUCTION_IOR:
+ *rop_name = "ior";
+ break;
+ case OMP_REDUCTION_IEOR:
+ *rop_name = "ieor";
+ break;
+ default:
+ *rop_name = gfc_op2string ((gfc_intrinsic_op) rop);
+ break;
+ }
+
+ if (ts->type == BT_INTEGER)
+ switch (rop)
+ {
+ case OMP_REDUCTION_AND:
+ case OMP_REDUCTION_OR:
+ case OMP_REDUCTION_EQV:
+ case OMP_REDUCTION_NEQV:
+ return false;
+ default:
+ return true;
+ }
+
+ if (ts->type == BT_LOGICAL)
+ switch (rop)
+ {
+ case OMP_REDUCTION_AND:
+ case OMP_REDUCTION_OR:
+ case OMP_REDUCTION_EQV:
+ case OMP_REDUCTION_NEQV:
+ return true;
+ default:
+ return false;
+ }
+
+ if (ts->type == BT_REAL || ts->type == BT_COMPLEX)
+ switch (rop)
+ {
+ case OMP_REDUCTION_PLUS:
+ case OMP_REDUCTION_TIMES:
+ case OMP_REDUCTION_MINUS:
+ return true;
+
+ case OMP_REDUCTION_AND:
+ case OMP_REDUCTION_OR:
+ case OMP_REDUCTION_EQV:
+ case OMP_REDUCTION_NEQV:
+ return false;
+
+ case OMP_REDUCTION_MAX:
+ case OMP_REDUCTION_MIN:
+ return ts->type != BT_COMPLEX;
+ case OMP_REDUCTION_IAND:
+ case OMP_REDUCTION_IOR:
+ case OMP_REDUCTION_IEOR:
+ return false;
+ default:
+ gcc_unreachable ();
+ }
+
+ if (ts->type == BT_DERIVED)
+ {
+ for (gfc_component *p = ts->u.derived->components; p; p = p->next)
+ if (!oacc_reduction_defined_type_p (rop, &p->ts))
+ return false;
+ return true;
+ }
+
+ return false;
+}
+
static void
resolve_scalar_int_expr (gfc_expr *expr, const char *clause)
{
@@ -9917,8 +10006,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
else
n->sym->mark = 1;
- /* OpenACC does not support reductions on arrays. */
- if (n->sym->as)
+ /* OpenACC current only supports array reductions on explicit-shape
+ arrays. */
+ if ((n->sym->as && n->sym->as->type != AS_EXPLICIT)
+ || n->sym->attr.codimension)
gfc_error ("Array %qs is not permitted in reduction at %L",
n->sym->name, &n->where);
}
@@ -10001,6 +10092,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
}
break;
+ case OMP_LIST_REDUCTION:
+ if (!openacc)
+ goto default_case;
+ gcc_fallthrough ();
case OMP_LIST_AFFINITY:
case OMP_LIST_DEPEND:
case OMP_LIST_MAP:
@@ -10009,6 +10104,38 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_LIST_CACHE:
for (; n != NULL; n = n->next)
{
+ if (openacc && list == OMP_LIST_REDUCTION)
+ {
+ if (n->sym->attr.threadprivate)
+ gfc_error ("THREADPRIVATE object %qs in %s clause at %L",
+ n->sym->name, name, &n->where);
+ if (n->sym->attr.cray_pointee)
+ gfc_error ("Cray pointee %qs in %s clause at %L",
+ n->sym->name, name, &n->where);
+ if (n->sym->attr.associate_var)
+ gfc_error ("Associate name %qs in %s clause at %L",
+ n->sym->attr.select_type_temporary
+ ? n->sym->assoc->target->symtree->n.sym->name
+ : n->sym->name, name, &n->where);
+ if (n->sym->attr.proc_pointer)
+ gfc_error ("Procedure pointer %qs in %s clause at %L",
+ n->sym->name, name, &n->where);
+ if (n->sym->attr.pointer)
+ gfc_error ("POINTER object %qs in %s clause at %L",
+ n->sym->name, name, &n->where);
+ if (n->sym->attr.cray_pointer)
+ gfc_error ("Cray pointer %qs in %s clause at %L",
+ n->sym->name, name, &n->where);
+
+ const char *rop_name;
+ if (!oacc_reduction_defined_type_p (n->u.reduction_op,
+ &n->sym->ts, &rop_name))
+ {
+ gfc_error ("Reduction operator %s is not valid for %qs at %L",
+ rop_name, n->sym->name, &n->where);
+ break;
+ }
+ }
if ((list == OMP_LIST_DEPEND || list == OMP_LIST_AFFINITY)
&& n->u2.ns && !n->u2.ns->resolved)
{
@@ -10250,6 +10377,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
break;
}
default:
+ default_case:
for (; n != NULL; n = n->next)
{
if (n->sym == NULL)
@@ -10402,39 +10530,46 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_LIST_IN_REDUCTION:
case OMP_LIST_TASK_REDUCTION:
case OMP_LIST_REDUCTION_INSCAN:
- switch (n->u.reduction_op)
+ if (openacc)
{
- case OMP_REDUCTION_PLUS:
- case OMP_REDUCTION_TIMES:
- case OMP_REDUCTION_MINUS:
- if (!gfc_numeric_ts (&n->sym->ts))
+ if (!oacc_reduction_defined_type_p (n->u.reduction_op,
+ &n->sym->ts))
bad = true;
- break;
- case OMP_REDUCTION_AND:
- case OMP_REDUCTION_OR:
- case OMP_REDUCTION_EQV:
- case OMP_REDUCTION_NEQV:
- if (n->sym->ts.type != BT_LOGICAL)
- bad = true;
- break;
- case OMP_REDUCTION_MAX:
- case OMP_REDUCTION_MIN:
- if (n->sym->ts.type != BT_INTEGER
- && n->sym->ts.type != BT_REAL)
- bad = true;
- break;
- case OMP_REDUCTION_IAND:
- case OMP_REDUCTION_IOR:
- case OMP_REDUCTION_IEOR:
- if (n->sym->ts.type != BT_INTEGER)
- bad = true;
- break;
- case OMP_REDUCTION_USER:
- bad = true;
- break;
- default:
- break;
}
+ else
+ switch (n->u.reduction_op)
+ {
+ case OMP_REDUCTION_PLUS:
+ case OMP_REDUCTION_TIMES:
+ case OMP_REDUCTION_MINUS:
+ if (!gfc_numeric_ts (&n->sym->ts))
+ bad = true;
+ break;
+ case OMP_REDUCTION_AND:
+ case OMP_REDUCTION_OR:
+ case OMP_REDUCTION_EQV:
+ case OMP_REDUCTION_NEQV:
+ if (n->sym->ts.type != BT_LOGICAL)
+ bad = true;
+ break;
+ case OMP_REDUCTION_MAX:
+ case OMP_REDUCTION_MIN:
+ if (n->sym->ts.type != BT_INTEGER
+ && n->sym->ts.type != BT_REAL)
+ bad = true;
+ break;
+ case OMP_REDUCTION_IAND:
+ case OMP_REDUCTION_IOR:
+ case OMP_REDUCTION_IEOR:
+ if (n->sym->ts.type != BT_INTEGER)
+ bad = true;
+ break;
+ case OMP_REDUCTION_USER:
+ bad = true;
+ break;
+ default:
+ break;
+ }
if (!bad)
n->u2.udr = NULL;
else
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 521f7cc..cec01475 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2840,7 +2840,9 @@ omp_udr_find_orig (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
}
static void
-gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where)
+gfc_trans_omp_array_reduction_or_udr (stmtblock_t *block, tree c,
+ gfc_omp_namelist *n, locus where,
+ bool openacc)
{
gfc_symbol *sym = n->sym;
gfc_symtree *root1 = NULL, *root2 = NULL, *root3 = NULL, *root4 = NULL;
@@ -2868,6 +2870,98 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where)
type = TREE_TYPE (type);
}
+ if (openacc)
+ {
+ if ((n->expr == NULL && n->sym->as != NULL)
+ || (n->expr
+ && n->expr->ref->type == REF_ARRAY
+ && n->expr->ref->u.ar.type == AR_FULL))
+ {
+ tree t = build_fold_addr_expr (decl);
+ t = build2 (MEM_REF, type, t,
+ build_int_cst (build_pointer_type (integer_type_node), 0));
+ OMP_CLAUSE_DECL (c) = t;
+ return;
+ }
+
+ if (n->expr
+ && n->expr->expr_type == EXPR_VARIABLE
+ && n->expr->ref->type == REF_ARRAY
+ && !n->expr->ref->next)
+ {
+ bool t = gfc_resolve_expr (n->expr);
+ gcc_assert (t);
+
+ gfc_se se;
+ bool is_element = n->expr->ref->u.ar.type == AR_ELEMENT;
+ tree ptr;
+ gfc_init_se (&se, NULL);
+ if (is_element)
+ {
+ gfc_conv_expr_reference (&se, n->expr);
+ gfc_add_block_to_block (block, &se.pre);
+ ptr = se.expr;
+
+ tree elem_type;
+ tree type
+ = build_range_type (TREE_TYPE (TREE_TYPE (ptr)),
+ integer_zero_node,
+ integer_zero_node);
+ elem_type = build_array_type (TREE_TYPE (type), type);
+ gcc_assert (TREE_CODE (ptr) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (ptr, 0)) == ARRAY_REF);
+ tree aref = TREE_OPERAND (ptr, 0);
+ tree array = TREE_OPERAND (aref, 0);
+ tree offset = TREE_OPERAND (aref, 1);
+ tree t = build2 (POINTER_PLUS_EXPR,
+ build_pointer_type (elem_type),
+ build_fold_addr_expr (array),
+ fold_convert (size_type_node, offset));
+ t = build2 (MEM_REF, elem_type, t, null_pointer_node);
+ OMP_CLAUSE_DECL (c) = t;
+ return;
+ }
+ else
+ {
+ gfc_conv_expr_descriptor (&se, n->expr);
+ gfc_add_block_to_block (block, &se.pre);
+
+ ptr = gfc_conv_array_data (se.expr);
+ tree type = TREE_TYPE (TREE_TYPE (TREE_TYPE (ptr)));
+ tree idx
+ = gfc_rank_cst[GFC_TYPE_ARRAY_RANK (TREE_TYPE (se.expr)) - 1];
+ tree sz
+ = fold_build2 (MINUS_EXPR, gfc_array_index_type,
+ gfc_conv_descriptor_ubound_get (se.expr, idx),
+ gfc_conv_descriptor_lbound_get (se.expr, idx));
+ sz = fold_build2 (PLUS_EXPR, gfc_array_index_type,
+ sz, gfc_index_one_node);
+ tree domain
+ = build_index_type (fold_build2 (MINUS_EXPR,
+ gfc_array_index_type,
+ sz, gfc_index_one_node));
+ tree t, array_type = build_array_type (type, domain, false);
+ tree offset = create_tmp_var (sizetype);
+ t = build2 (MINUS_EXPR, sizetype,
+ fold_convert (sizetype, ptr),
+ fold_convert (sizetype,
+ build_fold_addr_expr (decl)));
+ t = build2 (MODIFY_EXPR, sizetype, offset, t);
+ gfc_add_expr_to_block (block, t);
+
+ t = build2 (POINTER_PLUS_EXPR, build_pointer_type (array_type),
+ build_fold_addr_expr (decl), offset);
+ t = build2 (MEM_REF, array_type, t, null_pointer_node);
+ OMP_CLAUSE_DECL (c) = t;
+ return;
+ }
+ gcc_assert (se.post.head == NULL_TREE);
+ gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)));
+ OMP_CLAUSE_DECL (c) = build_fold_indirect_ref (ptr);
+ return;
+ }
+ }
+
/* Create a fake symbol for init value. */
memset (&init_val_sym, 0, sizeof (init_val_sym));
init_val_sym.ns = sym->ns;
@@ -3095,21 +3189,24 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where)
poplevel (0, 0);
OMP_CLAUSE_REDUCTION_INIT (c) = stmt;
- /* Create the merge statement list. */
- pushlevel ();
- if (e4)
- stmt = gfc_trans_assignment (e3, e4, false, true);
- else
- stmt = gfc_trans_call (n->u2.udr->combiner, false,
- NULL_TREE, NULL_TREE, false);
- if (TREE_CODE (stmt) != BIND_EXPR)
- stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
- else
- poplevel (0, 0);
- OMP_CLAUSE_REDUCTION_MERGE (c) = stmt;
+ if (!openacc)
+ {
+ /* Create the merge statement list. */
+ pushlevel ();
+ if (e4)
+ stmt = gfc_trans_assignment (e3, e4, false, true);
+ else
+ stmt = gfc_trans_call (n->u2.udr->combiner, false,
+ NULL_TREE, NULL_TREE, false);
+ if (TREE_CODE (stmt) != BIND_EXPR)
+ stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
+ else
+ poplevel (0, 0);
+ OMP_CLAUSE_REDUCTION_MERGE (c) = stmt;
- /* And stick the placeholder VAR_DECL into the clause as well. */
- OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) = outer_decl;
+ /* And stick the placeholder VAR_DECL into the clause as well. */
+ OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) = outer_decl;
+ }
gfc_current_locus = old_loc;
@@ -3139,8 +3236,10 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where)
}
static tree
-gfc_trans_omp_reduction_list (int kind, gfc_omp_namelist *namelist, tree list,
- locus where, bool mark_addressable)
+gfc_trans_omp_reduction_list (stmtblock_t *block, int kind,
+ gfc_omp_namelist *namelist, tree list,
+ locus where, bool mark_addressable,
+ bool openacc)
{
omp_clause_code clause = OMP_CLAUSE_REDUCTION;
switch (kind)
@@ -3220,7 +3319,8 @@ gfc_trans_omp_reduction_list (int kind, gfc_omp_namelist *namelist, tree list,
if (namelist->sym->attr.dimension
|| namelist->u.reduction_op == OMP_REDUCTION_USER
|| namelist->sym->attr.allocatable)
- gfc_trans_omp_array_reduction_or_udr (node, namelist, where);
+ gfc_trans_omp_array_reduction_or_udr (block, node, namelist,
+ where, openacc);
list = gfc_trans_add_clause (node, list);
}
}
@@ -3290,13 +3390,15 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd,
{
tree type = TREE_TYPE (se.expr);
gfc_add_block_to_block (block, &se.pre);
- OMP_CLAUSE_SIZE (node) = gfc_full_array_size (block, se.expr,
- GFC_TYPE_ARRAY_RANK (type));
if (!elemsz)
elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));
elemsz = fold_convert (gfc_array_index_type, elemsz);
- OMP_CLAUSE_SIZE (node) = fold_build2 (MULT_EXPR, gfc_array_index_type,
- OMP_CLAUSE_SIZE (node), elemsz);
+ tree size = gfc_full_array_size (block, se.expr,
+ GFC_TYPE_ARRAY_RANK (type));
+ size = fold_build2 (MULT_EXPR, gfc_array_index_type, size, elemsz);
+ if (OMP_CLAUSE_CODE (node) == OMP_CLAUSE_MAP)
+ OMP_CLAUSE_SIZE (node) = size;
+
if (n->expr->ts.type == BT_DERIVED
&& n->expr->ts.u.derived->attr.alloc_comp)
{
@@ -3387,6 +3489,11 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd,
fold_convert (ptrdiff_type_node, ptr2));
offset = build2 (TRUNC_DIV_EXPR, ptrdiff_type_node,
offset, fold_convert (ptrdiff_type_node, elemsz));
+
+ tree offset_tmp = create_tmp_var (ptrdiff_type_node);
+ gfc_add_expr_to_block (block, build2 (MODIFY_EXPR, ptrdiff_type_node,
+ offset_tmp, offset));
+ offset = offset_tmp;
offset = build4_loc (input_location, ARRAY_REF,
TREE_TYPE (TREE_TYPE (decl)),
decl, offset, NULL_TREE, NULL_TREE);
@@ -3405,9 +3512,14 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd,
OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
OMP_CLAUSE_DECL (node3) = decl;
}
- ptr2 = fold_convert (ptrdiff_type_node, ptr2);
- OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
- ptr, ptr2);
+
+ tree ptr_tmp = create_tmp_var (ptrdiff_type_node);
+ ptr = fold_build2 (MINUS_EXPR, ptrdiff_type_node, ptr,
+ fold_convert (ptrdiff_type_node, ptr2));
+ gfc_add_expr_to_block (block, build2 (MODIFY_EXPR, ptrdiff_type_node,
+ ptr_tmp, ptr));
+ OMP_CLAUSE_SIZE (node3) = ptr_tmp;
+
if (n->u.map.readonly)
OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
}
@@ -4017,8 +4129,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
case OMP_LIST_TASK_REDUCTION:
/* An OpenACC async clause indicates the need to set reduction
arguments addressable, to allow asynchronous copy-out. */
- omp_clauses = gfc_trans_omp_reduction_list (list, n, omp_clauses,
- where, clauses->async);
+ omp_clauses = gfc_trans_omp_reduction_list (block, list, n, omp_clauses,
+ where, clauses->async,
+ openacc);
break;
case OMP_LIST_PRIVATE:
clause_code = OMP_CLAUSE_PRIVATE;
@@ -7605,7 +7718,9 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
on the simd construct and DO's clauses are translated elsewhere. */
do_clauses->sched_simd = false;
- omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc);
+ omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc,
+ (op == EXEC_OACC_LOOP
+ ? TOC_OPENACC : TOC_OPENMP));
for (i = 0; i < collapse; i++)
{
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index b48d114..da44cdf 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -13749,6 +13749,20 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
omp_firstprivatize_variable (ctx, v);
omp_notice_variable (ctx, v, true);
}
+ if (TREE_CODE (TREE_OPERAND (decl, 0)) == COMPONENT_REF
+ || CONVERT_EXPR_CODE_P (TREE_CODE (TREE_OPERAND (decl, 0))))
+ {
+ gimplify_ctxp->into_ssa = false;
+ if (gimplify_expr (&TREE_OPERAND (decl, 0), pre_p,
+ NULL, is_gimple_val, fb_rvalue, false)
+ == GS_ERROR)
+ {
+ gimplify_ctxp->into_ssa = saved_into_ssa;
+ remove = true;
+ break;
+ }
+ gimplify_ctxp->into_ssa = saved_into_ssa;
+ }
decl = TREE_OPERAND (decl, 0);
if (TREE_CODE (decl) == POINTER_PLUS_EXPR)
{
@@ -14326,7 +14340,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
remove = true;
break;
}
- if (DECL_NAME (decl) == NULL_TREE && (flags & GOVD_SHARED) == 0)
+ if (DECL_P (decl) && DECL_NAME (decl) == NULL_TREE
+ && (flags & GOVD_SHARED) == 0)
{
tree t = omp_member_access_dummy_var (decl);
if (t)
@@ -15877,6 +15892,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
else if ((OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
|| (OMP_CLAUSE_MAP_KIND (c)
== GOMP_MAP_FIRSTPRIVATE_REFERENCE)
+ || (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER
+ && ctx->region_type != ORT_ACC_KERNELS)
|| OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH)
&& TREE_CODE (OMP_CLAUSE_SIZE (c)) != INTEGER_CST)
{
@@ -16065,6 +16082,15 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
OMP_CLAUSE_DECL (c));
remove = true;
}
+
+ if (TREE_CODE (*pd) == ARRAY_REF
+ && DECL_P (TREE_OPERAND (*pd, 1))
+ && (ctx->region_type & ORT_TARGET) != 0
+ && (ctx->region_type & ORT_ACC) != 0
+ && ctx->region_type != ORT_ACC_KERNELS)
+ omp_add_variable (ctx, TREE_OPERAND (*pd, 1),
+ GOVD_FIRSTPRIVATE | GOVD_SEEN);
+
gimplify_omp_ctxp = ctx;
break;
}
@@ -16263,10 +16289,20 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
/* OpenACC reductions need a present_or_copy data clause.
Add one if necessary. Emit error when the reduction is
private. */
- if (DECL_P (decl) &&
- (ctx->region_type == ORT_ACC_PARALLEL
- || ctx->region_type == ORT_ACC_SERIAL))
+ if (ctx->region_type == ORT_ACC_PARALLEL
+ || ctx->region_type == ORT_ACC_SERIAL)
{
+ if (TREE_CODE (decl) == MEM_REF
+ && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
+ {
+ tree addr = TREE_OPERAND (decl, 0);
+ if (TREE_CODE (addr) == POINTER_PLUS_EXPR)
+ addr = TREE_OPERAND (addr, 0);
+ if (TREE_CODE (addr) == ADDR_EXPR
+ && DECL_P (TREE_OPERAND (addr, 0)))
+ decl = TREE_OPERAND (addr, 0);
+ }
+
n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
if (n->value & (GOVD_PRIVATE | GOVD_FIRSTPRIVATE))
{
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index ee2b1f3..038374e 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -106,6 +106,11 @@ struct omp_context
construct. In the case of a parallel, this is in the child function. */
tree block_vars;
+ /* A hash map to track variables added through omp_copy_decl_*, to ensure
+ repeated calls of install_var_local on sam DECL do not get duplicated
+ local versions. */
+ hash_map<tree, tree> *block_vars_map;
+
/* Label to which GOMP_cancel{,llation_point} and explicit and implicit
barriers should jump to during omplower pass. */
tree cancel_label;
@@ -596,12 +601,26 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
static tree
omp_copy_decl_2 (tree var, tree name, tree type, omp_context *ctx)
{
+ if (ctx)
+ {
+ if (!ctx->block_vars_map)
+ ctx->block_vars_map = new hash_map<tree, tree> ();
+ else
+ {
+ tree *tp = ctx->block_vars_map->get (var);
+ if (tp)
+ return *tp;
+ }
+ }
+
tree copy = copy_var_decl (var, name, type);
DECL_CONTEXT (copy) = current_function_decl;
if (ctx)
{
+ ctx->block_vars_map->put (var, copy);
+
DECL_CHAIN (copy) = ctx->block_vars;
ctx->block_vars = copy;
}
@@ -786,7 +805,7 @@ build_sender_ref (tree var, omp_context *ctx)
static void
install_var_field (tree var, bool by_ref, int mask, omp_context *ctx,
- tree key_expr = NULL_TREE)
+ tree key_expr = NULL_TREE, bool field_may_exist = false)
{
tree field, type, sfield = NULL_TREE;
splay_tree_key key = (splay_tree_key) var;
@@ -808,9 +827,9 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx,
}
}
gcc_assert ((mask & 1) == 0
- || !splay_tree_lookup (ctx->field_map, key));
+ || !splay_tree_lookup (ctx->field_map, key) || field_may_exist);
gcc_assert ((mask & 2) == 0 || !ctx->sfield_map
- || !splay_tree_lookup (ctx->sfield_map, key));
+ || !splay_tree_lookup (ctx->sfield_map, key) || field_may_exist);
gcc_assert ((mask & 3) == 3
|| !is_gimple_omp_oacc (ctx->stmt));
@@ -1206,6 +1225,7 @@ delete_omp_context (splay_tree_value value)
delete ctx->task_reduction_map;
}
+ delete ctx->block_vars_map;
delete ctx->lastprivate_conditional_map;
delete ctx->allocate_map;
@@ -2001,9 +2021,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
&& OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER
&& !OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c)
&& TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
- install_var_field (decl, true, 7, ctx, c);
+ install_var_field (decl, true, 7, ctx, c, true);
else
- install_var_field (decl, true, 3, ctx, c);
+ install_var_field (decl, true, 3, ctx, c, true);
if (is_gimple_omp_offloaded (ctx->stmt)
&& !(is_gimple_omp_oacc (ctx->stmt)
&& OMP_CLAUSE_MAP_IN_REDUCTION (c)))
@@ -2033,13 +2053,26 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
}
gcc_assert (!splay_tree_lookup (ctx->field_map,
(splay_tree_key) decl));
+ tree ptr_type = ptr_type_node;
+ if (TREE_CODE (decl) == ARRAY_REF)
+ {
+ tree array_type = TREE_TYPE (TREE_OPERAND (decl, 0));
+ ptr_type = build_pointer_type (array_type);
+ }
tree field
= build_decl (OMP_CLAUSE_LOCATION (c),
- FIELD_DECL, NULL_TREE, ptr_type_node);
- SET_DECL_ALIGN (field, TYPE_ALIGN (ptr_type_node));
+ FIELD_DECL, NULL_TREE, ptr_type);
+ SET_DECL_ALIGN (field, TYPE_ALIGN (ptr_type));
insert_field_into_struct (ctx->record_type, field);
- splay_tree_insert (ctx->field_map, (splay_tree_key) c,
+ splay_tree_insert (ctx->field_map, (splay_tree_key) /*decl,xxx*/ c,
(splay_tree_value) field);
+
+ if (TREE_CODE (decl) == ARRAY_REF
+ && is_gimple_omp_offloaded (ctx->stmt)
+ && !splay_tree_lookup (ctx->field_map,
+ (splay_tree_key) base))
+ splay_tree_insert (ctx->field_map, (splay_tree_key) base,
+ (splay_tree_value) field);
}
}
break;
@@ -4907,6 +4940,33 @@ maybe_lookup_field_in_outer_ctx (tree decl, omp_context *ctx)
tree
omp_reduction_init_op (location_t loc, enum tree_code op, tree type)
{
+ if (TREE_CODE (type) == ARRAY_TYPE)
+ {
+ tree max = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
+ if (TREE_CONSTANT (max))
+ {
+ vec<constructor_elt, va_gc> *v = NULL;
+ HOST_WIDE_INT max_val = tree_to_shwi (max);
+ tree t = omp_reduction_init_op (loc, op, TREE_TYPE (type));
+ for (HOST_WIDE_INT i = 0; i <= max_val; i++)
+ CONSTRUCTOR_APPEND_ELT (v, size_int (i), t);
+ return build_constructor (type, v);
+ }
+ else
+ gcc_unreachable ();
+ }
+ else if (TREE_CODE (type) == RECORD_TYPE)
+ {
+ vec<constructor_elt, va_gc> *v = NULL;
+ for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL)
+ {
+ tree t = omp_reduction_init_op (loc, op, TREE_TYPE (fld));
+ CONSTRUCTOR_APPEND_ELT (v, fld, t);
+ }
+ return build_constructor (type, v);
+ }
+
switch (op)
{
case PLUS_EXPR:
@@ -7856,6 +7916,76 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
gimple_seq_add_seq (stmt_list, post_stmt_list);
}
+/* Give an array reduction clause, and the surrounding map clause that mapped
+ the array (section), calculate the actual bias for the reduction inside
+ the OpenACC region, generally just: reduction_bias - map_bias, but
+ encapsulate the hairy details. */
+
+static tree
+oacc_array_reduction_bias (location_t loc, tree reduction_clause,
+ omp_context *ctx, tree map_clause,
+ omp_context *outer)
+{
+ tree bias = TREE_OPERAND (OMP_CLAUSE_DECL (reduction_clause), 1);
+ tree orig_var = TREE_OPERAND (OMP_CLAUSE_DECL (reduction_clause), 0);
+ if (TREE_CODE (orig_var) == POINTER_PLUS_EXPR)
+ {
+ tree b = TREE_OPERAND (orig_var, 1);
+ b = maybe_lookup_decl (b, ctx);
+ if (b == NULL)
+ {
+ b = TREE_OPERAND (orig_var, 1);
+ b = maybe_lookup_decl_in_outer_ctx (b, ctx);
+ }
+ if (integer_zerop (bias))
+ bias = b;
+ else
+ {
+ bias = fold_convert_loc (loc, TREE_TYPE (b), bias);
+ bias = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (b), b, bias);
+ }
+ orig_var = TREE_OPERAND (orig_var, 0);
+ }
+
+ if (TREE_CODE (orig_var) == INDIRECT_REF
+ || TREE_CODE (orig_var) == ADDR_EXPR)
+ orig_var = TREE_OPERAND (orig_var, 0);
+
+ tree map_decl = OMP_CLAUSE_DECL (map_clause);
+ tree next = OMP_CLAUSE_CHAIN (map_clause);
+
+ tree orig_bias = integer_zero_node;
+ if (TREE_CODE (map_decl) == ARRAY_REF)
+ {
+ if (next && OMP_CLAUSE_CODE (next) == OMP_CLAUSE_MAP
+ && OMP_CLAUSE_DECL (next) == orig_var
+ && (OMP_CLAUSE_MAP_KIND (next) == GOMP_MAP_FIRSTPRIVATE_POINTER
+ || OMP_CLAUSE_MAP_KIND (next) == GOMP_MAP_POINTER))
+ {
+ orig_bias = OMP_CLAUSE_SIZE (next);
+ if (DECL_P (orig_bias))
+ orig_bias = lookup_decl (orig_bias, outer);
+ orig_bias = fold_convert_loc (loc, pointer_sized_int_node,
+ orig_bias);
+ }
+ else
+ {
+ tree idx = TREE_OPERAND (map_decl, 1);
+ idx = lookup_decl (idx, outer);
+ idx = fold_convert_loc (loc, pointer_sized_int_node, idx);
+ orig_bias = fold_build2_loc (loc, MULT_EXPR,
+ pointer_sized_int_node, idx,
+ TYPE_SIZE_UNIT (TREE_TYPE (map_decl)));
+ }
+ }
+
+ bias = fold_convert_loc (loc, pointer_sized_int_node, bias);
+ tree adjusted_bias = fold_build2_loc (loc, MINUS_EXPR,
+ pointer_sized_int_node,
+ bias, orig_bias);
+ return adjusted_bias;
+}
+
/* Lower the OpenACC reductions of CLAUSES for compute axis LEVEL
(which might be a placeholder). INNER is true if this is an inner
axis of a multi-axis loop. FORK and JOIN are (optional) fork and
@@ -7895,9 +8025,107 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
tree orig = OMP_CLAUSE_DECL (c);
tree orig_clause;
+ tree array_type = NULL_TREE;
+ tree array_addr = NULL_TREE, array_max_idx = NULL_TREE;
+ tree array_bias = NULL_TREE;
tree var;
- tree ref_to_res = NULL_TREE;
+ if (TREE_CODE (orig) == MEM_REF)
+ {
+ array_type = TREE_TYPE (orig);
+
+ tree bias = TREE_OPERAND (OMP_CLAUSE_DECL (c), 1);
+ tree orig_var = TREE_OPERAND (OMP_CLAUSE_DECL (c), 0);
+
+ if (TREE_CODE (orig_var) == POINTER_PLUS_EXPR)
+ {
+ tree b = TREE_OPERAND (orig_var, 1);
+ if (is_omp_target (ctx->stmt))
+ b = NULL_TREE;
+ else
+ b = maybe_lookup_decl (b, ctx);
+ if (b == NULL)
+ {
+ b = TREE_OPERAND (orig_var, 1);
+ b = maybe_lookup_decl_in_outer_ctx (b, ctx);
+ }
+ if (integer_zerop (bias))
+ bias = b;
+ else
+ {
+ bias = fold_convert_loc (loc,
+ TREE_TYPE (b), bias);
+ bias = fold_build2_loc (loc, PLUS_EXPR,
+ TREE_TYPE (b), b, bias);
+ }
+ orig_var = TREE_OPERAND (orig_var, 0);
+ }
+ if (TREE_CODE (orig_var) == INDIRECT_REF
+ || TREE_CODE (orig_var) == ADDR_EXPR)
+ orig_var = TREE_OPERAND (orig_var, 0);
+
+ gcc_assert (DECL_P (orig_var));
+
+ tree local_orig_var = lookup_decl (orig_var, ctx);
+ tree priv_addr = local_orig_var;
+ if (TREE_CODE (TREE_TYPE (priv_addr)) == ARRAY_TYPE)
+ priv_addr = build_fold_addr_expr (priv_addr);
+
+ tree priv_addr_type = build_pointer_type (array_type);
+
+ /* Peel away MEM_REF to get at base array VAR_DECL. */
+ tree addr = TREE_OPERAND (orig, 0);
+ if (TREE_CODE (addr) == POINTER_PLUS_EXPR)
+ addr = TREE_OPERAND (addr, 0);
+ if (TREE_CODE (addr) == ADDR_EXPR)
+ addr = TREE_OPERAND (addr, 0);
+ else if (INDIRECT_REF_P (addr))
+ addr = TREE_OPERAND (addr, 0);
+ orig = addr;
+
+ if (omp_privatize_by_reference (orig))
+ {
+ gcc_assert (DECL_HAS_VALUE_EXPR_P (priv_addr)
+ && (TREE_CODE (DECL_VALUE_EXPR (priv_addr))
+ == MEM_REF));
+ priv_addr = TREE_OPERAND (DECL_VALUE_EXPR (priv_addr), 0);
+ }
+
+ tree tmp = fold_build2 (POINTER_PLUS_EXPR, ptr_type_node,
+ fold_convert (ptr_type_node, priv_addr),
+ fold_convert (sizetype, bias));
+ priv_addr = fold_convert (priv_addr_type, tmp);
+
+ tree addr_var = create_tmp_var (priv_addr_type,
+ ".array_reduction_addr");
+
+ gimple_seq s = NULL;
+ gimplify_assign (addr_var, priv_addr, &s);
+ gimple_seq_add_seq (&before_fork, s);
+
+ var = create_tmp_var (integer_type_node,
+ ".array_reduction_data_dep");
+ gimple_seq_add_stmt (&before_fork,
+ gimple_build_assign (var, integer_zero_node));
+
+ array_addr = addr_var;
+ array_bias = bias;
+ array_max_idx
+ = TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (OMP_CLAUSE_DECL (c))));
+ tree t = maybe_lookup_decl (array_max_idx, ctx);
+ if (t)
+ array_max_idx = t;
+ }
+ else
+ {
+ var = OMP_CLAUSE_REDUCTION_PRIVATE_DECL (c);
+ if (!var)
+ var = maybe_lookup_decl (orig, ctx);
+ if (!var)
+ var = orig;
+ }
+
tree incoming, outgoing;
+ tree ref_to_res = NULL_TREE;
bool is_private = false;
bool is_fpp = false;
@@ -7910,12 +8138,6 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
rcode = BIT_IOR_EXPR;
tree op = build_int_cst (unsigned_type_node, rcode);
- var = OMP_CLAUSE_REDUCTION_PRIVATE_DECL (c);
- if (!var)
- var = maybe_lookup_decl (orig, ctx);
- if (!var)
- var = orig;
-
incoming = outgoing = var;
if (!inner)
@@ -7967,7 +8189,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
else if (OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_MAP
&& (OMP_CLAUSE_MAP_KIND (cls)
== GOMP_MAP_FIRSTPRIVATE_POINTER)
- && orig == OMP_CLAUSE_DECL (cls))
+ && orig == OMP_CLAUSE_DECL (cls)
+ && !array_addr)
{
is_fpp = true;
goto do_lookup;
@@ -7988,21 +8211,65 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
orig_clause = cls;
break;
}
+ if (gimple_code (outer->stmt) == GIMPLE_OMP_TARGET
+ && !orig_clause
+ && !is_private
+ && maybe_lookup_field (orig, outer))
+ orig_clause = orig;
if ((orig_clause != NULL_TREE || is_fpp) && !is_private)
{
tree type = TREE_TYPE (var);
- if (is_fpp)
+ if (is_fpp && !array_addr)
{
tree x = create_tmp_var (type);
gimplify_assign (x, lookup_decl (orig, outer), fork_seq);
ref_to_res = x;
}
- else
+ else if (orig_clause)
{
ref_to_res = build_receiver_ref (orig_clause, false, outer);
if (omp_privatize_by_reference (orig))
ref_to_res = build_simple_mem_ref (ref_to_res);
+
+ bool ptr_ptr_array = false;
+ if (TREE_CODE (TREE_TYPE (orig)) == ARRAY_TYPE
+ && TREE_CODE (TREE_TYPE (ref_to_res)) == POINTER_TYPE
+ && (TREE_CODE (TREE_TYPE (TREE_TYPE (ref_to_res)))
+ == POINTER_TYPE))
+ {
+ ref_to_res = build_simple_mem_ref (ref_to_res);
+ ptr_ptr_array = true;
+ }
+
+ if (array_bias)
+ {
+ tree map_bias = integer_zero_node;
+ if (ptr_ptr_array)
+ map_bias = array_bias;
+ else
+ {
+ tree m = gimple_omp_target_clauses (outer->stmt);
+ for (; m; m = OMP_CLAUSE_CHAIN (m))
+ if (OMP_CLAUSE_CODE (m) == OMP_CLAUSE_MAP)
+ {
+ tree md = OMP_CLAUSE_DECL (m);
+ if (orig == md
+ || (TREE_CODE (md) == ARRAY_REF
+ && TREE_OPERAND (md, 0) == orig))
+ {
+ map_bias
+ = oacc_array_reduction_bias (loc, c, ctx,
+ m, outer);
+ break;
+ }
+ }
+ }
+ tree t = fold_convert (ptr_type_node, ref_to_res);
+ t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t,
+ fold_convert (sizetype, map_bias));
+ ref_to_res = fold_convert (TREE_TYPE (ref_to_res), t);
+ }
}
if (POINTER_TYPE_P (type))
@@ -8016,13 +8283,91 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
/* Try to look at enclosing contexts for reduction var,
use original if no mapping found. */
tree t = NULL_TREE;
- omp_context *c = ctx->outer;
- while (c && !t)
+ omp_context *cp = ctx->outer;
+ while (cp)
+ {
+ t = maybe_lookup_decl (orig, cp);
+ if (t)
+ break;
+
+ cp = cp->outer;
+ }
+
+ if (array_addr)
{
- t = maybe_lookup_decl (orig, c);
- c = c->outer;
+ if (t)
+ {
+ if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+ {
+ if (!is_private)
+ {
+ gcc_assert (DECL_SIZE (t)
+ && (TREE_CODE (DECL_SIZE (t))
+ != INTEGER_CST)
+ && DECL_HAS_VALUE_EXPR_P (t));
+ t = DECL_VALUE_EXPR (t);
+ }
+
+ t = fold_convert (ptr_type_node,
+ build_fold_addr_expr (t));
+ if (array_bias)
+ t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t,
+ fold_convert (sizetype, array_bias));
+ ref_to_res
+ = fold_convert (build_pointer_type
+ (TREE_TYPE (orig)), t);
+ }
+ else if (TREE_CODE (TREE_TYPE (t)) == POINTER_TYPE)
+ {
+ if (array_bias)
+ t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t,
+ fold_convert (sizetype, array_bias));
+ ref_to_res
+ = fold_convert (build_pointer_type
+ (array_type), t);
+ }
+ else
+ gcc_unreachable ();
+ }
+ else
+ {
+ gcc_assert (!cp && (gimple_code (ctx->stmt)
+ == GIMPLE_OMP_TARGET));
+
+ tree mem_ref = NULL_TREE;
+ tree mem_ref_clause = NULL_TREE;
+ tree m = gimple_omp_target_clauses (ctx->stmt);
+ tree orig_val = (DECL_HAS_VALUE_EXPR_P (orig)
+ ? DECL_VALUE_EXPR (orig) : orig);
+ for (; m; m = OMP_CLAUSE_CHAIN (m))
+ if (OMP_CLAUSE_CODE (m) == OMP_CLAUSE_MAP)
+ {
+ tree md = OMP_CLAUSE_DECL (m);
+ if (orig_val == md
+ || (TREE_CODE (md) == MEM_REF
+ && INDIRECT_REF_P (orig_val)
+ && (TREE_OPERAND (md, 0)
+ == TREE_OPERAND (orig_val, 0))))
+ {
+ mem_ref = md;
+ mem_ref_clause = m;
+ break;
+ }
+ }
+ gcc_assert (mem_ref);
+ mem_ref = build_receiver_ref (mem_ref_clause, false, ctx);
+
+ if (array_bias)
+ mem_ref = build2 (POINTER_PLUS_EXPR, ptr_type_node,
+ mem_ref, fold_convert (sizetype,
+ array_bias));
+ ref_to_res
+ = fold_convert (build_pointer_type (TREE_TYPE (orig)),
+ mem_ref);
+ }
}
- incoming = outgoing = (t ? t : orig);
+ else
+ incoming = outgoing = (t ? t : orig);
}
has_outer_reduction:;
@@ -8031,6 +8376,9 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
if (!ref_to_res)
ref_to_res = integer_zero_node;
+ if (!array_addr)
+ array_addr = array_max_idx = integer_zero_node;
+
if (omp_privatize_by_reference (outgoing))
{
outgoing = build_simple_mem_ref (outgoing);
@@ -8063,25 +8411,29 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
tree setup_call
= build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
- TREE_TYPE (var), 6, setup_code,
+ TREE_TYPE (var), 8, setup_code,
unshare_expr (ref_to_res),
unshare_expr (incoming),
- level, op, off);
+ level, op, off,
+ array_addr, array_max_idx);
tree init_call
= build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
- TREE_TYPE (var), 6, init_code,
+ TREE_TYPE (var), 8, init_code,
unshare_expr (ref_to_res),
- unshare_expr (var), level, op, off);
+ unshare_expr (var), level, op, off,
+ array_addr, array_max_idx);
tree fini_call
= build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
- TREE_TYPE (var), 6, fini_code,
+ TREE_TYPE (var), 8, fini_code,
unshare_expr (ref_to_res),
- unshare_expr (var), level, op, off);
+ unshare_expr (var), level, op, off,
+ array_addr, array_max_idx);
tree teardown_call
= build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION,
- TREE_TYPE (var), 6, teardown_code,
+ TREE_TYPE (var), 8, teardown_code,
ref_to_res, unshare_expr (var),
- level, op, off);
+ level, op, off,
+ array_addr, array_max_idx);
gimplify_assign (unshare_expr (var), setup_call, &before_fork);
gimplify_assign (unshare_expr (var), init_call, &after_fork);
@@ -14283,11 +14635,10 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
&& OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER
&& !OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c)
- && TREE_CODE (TREE_TYPE (ovar)) == ARRAY_TYPE)
+ && TREE_CODE (TREE_TYPE (ovar)) == ARRAY_TYPE
+ && offloaded)
{
- gcc_assert (offloaded);
- tree avar
- = create_tmp_var (TREE_TYPE (TREE_TYPE (x)));
+ tree avar = create_tmp_var (TREE_TYPE (TREE_TYPE (x)));
mark_addressable (avar);
gimplify_assign (avar, build_fold_addr_expr (var), &ilist);
talign = DECL_ALIGN_UNIT (avar);
@@ -15279,8 +15630,29 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
gimple_build_assign (t, x));
x = build_fold_addr_expr_loc (clause_loc, t);
}
- gimple_seq_add_stmt (&new_body,
- gimple_build_assign (new_var, x));
+ if (offloaded && is_gimple_omp_oacc (ctx->stmt)
+ && OMP_CLAUSE_MAP_IN_REDUCTION (prev)
+ && TREE_CODE (type) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE
+ && !TREE_CONSTANT (TYPE_SIZE_UNIT (TREE_TYPE (type))))
+ {
+ tree array_type = TREE_TYPE (type);
+ tree atmp
+ = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN);
+ tree al = size_int (TYPE_ALIGN (array_type));
+ tree sz = TYPE_SIZE_UNIT (array_type);
+ tree call = build_call_expr_loc (OMP_CLAUSE_LOCATION (c),
+ atmp, 2, sz, al);
+ gimplify_assign (x, call, &new_body);
+
+ /* In some cases, we need to preserve the pointer to array
+ type, as it will be passed into OpenACC reduction
+ internal-fns, and we require the type for proper copy
+ generation. */
+ TREE_TYPE (x) = TREE_TYPE (new_var);
+ }
+ gimple *g = gimple_build_assign (new_var, x);
+ gimple_seq_add_stmt (&new_body, g);
prev = NULL_TREE;
}
else if (OMP_CLAUSE_CHAIN (c)
diff --git a/gcc/omp-oacc-neuter-broadcast.cc b/gcc/omp-oacc-neuter-broadcast.cc
index 52addad..1336aa9 100644
--- a/gcc/omp-oacc-neuter-broadcast.cc
+++ b/gcc/omp-oacc-neuter-broadcast.cc
@@ -148,6 +148,34 @@ local_var_based_p (tree decl)
}
}
+static bool
+local_assignment_p (gimple *stmt, hash_set<tree> *array_reduction_base_vars)
+{
+ if (is_gimple_assign (stmt)
+ || gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+ {
+ tree lhs = (is_gimple_assign (stmt)
+ ? gimple_assign_lhs (stmt) : gimple_call_lhs (stmt));
+ if (TREE_CODE (lhs) == SSA_NAME
+ && array_reduction_base_vars->contains (lhs))
+ {
+ use_operand_p use_p;
+ ssa_op_iter iter;
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
+ {
+ tree use = USE_FROM_PTR (use_p);
+ if (TREE_CODE (use) != SSA_NAME
+ || SSA_NAME_IS_DEFAULT_DEF (use))
+ continue;
+ if (!array_reduction_base_vars->contains (use))
+ return false;
+ }
+ return true;
+ }
+ }
+ return false;
+}
+
/* Map of basic blocks to gimple stmts. */
typedef hash_map<basic_block, gimple *> bb_stmt_map_t;
@@ -991,7 +1019,8 @@ worker_single_copy (basic_block from, basic_block to,
hash_set<tree> *worker_partitioned_uses,
tree record_type, record_field_map_t *record_field_map,
unsigned HOST_WIDE_INT placement,
- bool isolate_broadcasts, bool has_gang_private_write)
+ bool isolate_broadcasts, bool has_gang_private_write,
+ hash_set<tree> *array_reduction_base_vars)
{
/* If we only have virtual defs, we'll have no record type, but we still want
to emit single_copy_start and (particularly) single_copy_end to act as
@@ -1015,6 +1044,31 @@ worker_single_copy (basic_block from, basic_block to,
edge e = split_block (to, gsi_stmt (gsi));
basic_block barrier_block = e->dest;
+ gimple_seq local_asgns = NULL;
+
+ /* For accesses of variables used in array reductions, instead of
+ propagating the value for the main thread to all other worker threads
+ (which doesn't make sense as a reduction private var), move the defs
+ of such SSA_NAMEs to before the copy block and leave them alone (each
+ thread should access their own local copy). */
+ for (gimple_stmt_iterator i = gsi_after_labels (from); !gsi_end_p (i);)
+ {
+ gimple *stmt = gsi_stmt (i);
+ if (local_assignment_p (stmt, array_reduction_base_vars))
+ {
+ gsi_remove (&i, false);
+ gimple_seq_add_stmt (&local_asgns, stmt);
+ }
+ else
+ gsi_next (&i);
+ }
+
+ if (dump_file)
+ {
+ fprintf (dump_file, "Local assignments to be moved:\n");
+ print_gimple_seq (dump_file, local_asgns, 0, TDF_NONE);
+ }
+
gimple_stmt_iterator start = gsi_after_labels (from);
tree decl = builtin_decl_explicit (BUILT_IN_GOACC_SINGLE_COPY_START);
@@ -1029,6 +1083,9 @@ worker_single_copy (basic_block from, basic_block to,
gsi_insert_before (&start, call, GSI_NEW_STMT);
update_stmt (call);
+ if (local_asgns)
+ gsi_insert_seq_before (&start, local_asgns, GSI_SAME_STMT);
+
/* The shared-memory range for this block overflowed. Add a barrier before
the GOACC_single_copy_start call. */
if (isolate_broadcasts)
@@ -1128,6 +1185,12 @@ worker_single_copy (basic_block from, basic_block to,
if (gimple_nop_p (def_stmt))
continue;
+ /* For accesses of variables used in array reductions, skip creating
+ the barrier phi. Each thread runs same def_stmt to access
+ local variable, there is no main/worker divide here. */
+ if (local_assignment_p (def_stmt, array_reduction_base_vars))
+ continue;
+
/* The barrier phi takes one result from the actual work of the
block we're neutering, and the other result is constant zero of
the same type. */
@@ -1248,7 +1311,8 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask,
hash_set<tree> *partitioned_var_uses,
record_field_map_t *record_field_map,
blk_offset_map_t *blk_offset_map,
- bitmap writes_gang_private)
+ bitmap writes_gang_private,
+ hash_set<tree> *array_reduction_base_vars)
{
unsigned mask = outer_mask | par->mask;
@@ -1398,7 +1462,8 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask,
&worker_partitioned_uses, record_type,
record_field_map,
offset, !range_allocated,
- has_gang_private_write);
+ has_gang_private_write,
+ array_reduction_base_vars);
}
else
worker_single_simple (block, block, &def_escapes_block);
@@ -1436,11 +1501,13 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask,
if (par->inner)
neuter_worker_single (par->inner, mask, worker_single, vector_single,
prop_set, partitioned_var_uses, record_field_map,
- blk_offset_map, writes_gang_private);
+ blk_offset_map, writes_gang_private,
+ array_reduction_base_vars);
if (par->next)
neuter_worker_single (par->next, outer_mask, worker_single, vector_single,
prop_set, partitioned_var_uses, record_field_map,
- blk_offset_map, writes_gang_private);
+ blk_offset_map, writes_gang_private,
+ array_reduction_base_vars);
}
static void
@@ -1587,7 +1654,8 @@ merge_ranges (splay_tree accum, splay_tree sp)
static void
oacc_do_neutering (unsigned HOST_WIDE_INT bounds_lo,
- unsigned HOST_WIDE_INT bounds_hi)
+ unsigned HOST_WIDE_INT bounds_hi,
+ hash_set<tree> *array_reduction_base_vars)
{
bb_stmt_map_t bb_stmt_map;
auto_bitmap worker_single, vector_single;
@@ -1792,7 +1860,8 @@ oacc_do_neutering (unsigned HOST_WIDE_INT bounds_lo,
neuter_worker_single (par, mask, worker_single, vector_single, &prop_set,
&partitioned_var_uses, &record_field_map,
- &blk_offset_map, writes_gang_private);
+ &blk_offset_map, writes_gang_private,
+ array_reduction_base_vars);
record_field_map.empty ();
@@ -1831,6 +1900,9 @@ execute_omp_oacc_neuter_broadcast ()
private_size[i] = 0;
}
+ /* Set of base variables referencing arrays used in array reductions. */
+ hash_set<tree> array_reduction_base_vars;
+
/* Calculate shared memory size required for reduction variables and
gang-private memory for this offloaded function. */
basic_block bb;
@@ -1869,6 +1941,42 @@ execute_omp_oacc_neuter_broadcast ()
+ tree_to_uhwi (TYPE_SIZE_UNIT (var_type)));
reduction_size[level]
= MAX (reduction_size[level], limit);
+
+ tree array_addr = gimple_call_arg (call, 6);
+ if (!integer_zerop (array_addr)
+ && TREE_CODE (array_addr) == SSA_NAME)
+ {
+ /* For array reductions, trace all dependent SSA names
+ and add to array_reduction_base_vars, which makes
+ them replicated private for each thread, instead
+ of being copied around. */
+ auto_vec<tree> ssa_name_queue;
+ ssa_name_queue.safe_push (array_addr);
+
+ tree array_max_idx = gimple_call_arg (call, 7);
+ if (TREE_CODE (array_max_idx) == SSA_NAME)
+ ssa_name_queue.safe_push (array_max_idx);
+
+ while (ssa_name_queue.length ())
+ {
+ tree t = ssa_name_queue.pop ();
+
+ array_reduction_base_vars.add (t);
+ gimple *g = SSA_NAME_DEF_STMT (t);
+
+ use_operand_p use_p;
+ ssa_op_iter iter;
+ FOR_EACH_SSA_USE_OPERAND (use_p, g, iter,
+ SSA_OP_USE)
+ {
+ tree use = USE_FROM_PTR (use_p);
+ if (TREE_CODE (use) != SSA_NAME
+ || SSA_NAME_IS_DEFAULT_DEF (use))
+ continue;
+ ssa_name_queue.safe_push (use);
+ }
+ }
+ }
}
}
break;
@@ -1917,7 +2025,7 @@ execute_omp_oacc_neuter_broadcast ()
/* Perform worker partitioning unless we know 'num_workers(1)'. */
if (dims[GOMP_DIM_WORKER] != 1)
- oacc_do_neutering (bounds_lo, bounds_hi);
+ oacc_do_neutering (bounds_lo, bounds_hi, &array_reduction_base_vars);
return 0;
}
diff --git a/gcc/omp-offload.cc b/gcc/omp-offload.cc
index b5f54b2..3218f69 100644
--- a/gcc/omp-offload.cc
+++ b/gcc/omp-offload.cc
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see
#include "stringpool.h"
#include "attribs.h"
#include "cfgloop.h"
+#include "cfghooks.h"
#include "context.h"
#include "convert.h"
#include "opts.h"
@@ -2151,6 +2152,129 @@ default_goacc_fork_join (gcall *ARG_UNUSED (call),
return targetm.have_oacc_join ();
}
+void
+oacc_build_array_copy (tree dst, tree src, tree max_idx, gimple_seq *seq)
+{
+ push_gimplify_context (true);
+
+ tree len = fold_build2 (PLUS_EXPR, size_type_node, max_idx, size_int (1));
+ tree ptr_to_array = (TREE_TYPE (dst) == ptr_type_node ? src : dst);
+ tree elem_type;
+ if (TREE_CODE (TREE_TYPE (ptr_to_array)) == POINTER_TYPE
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (ptr_to_array))) == ARRAY_TYPE)
+ elem_type = TREE_TYPE (TREE_TYPE (TREE_TYPE (ptr_to_array)));
+ else
+ elem_type = TREE_TYPE (TREE_TYPE (ptr_to_array));
+ tree elem_size = TYPE_SIZE_UNIT (elem_type);
+ tree size = fold_build2 (MULT_EXPR, size_type_node, len, elem_size);
+
+ tree memcpy_decl = builtin_decl_implicit (BUILT_IN_MEMCPY);
+ tree call = build_call_expr (memcpy_decl, 3, dst, src, size);
+ gimplify_and_add (call, seq);
+ pop_gimplify_context (NULL);
+}
+
+void
+oacc_build_array_copy_loop (location_t loc, tree dst, tree src, tree max_idx,
+ gimple_stmt_iterator *gsi)
+{
+ push_gimplify_context (true);
+
+ tree loop_index;
+ gimple_stmt_iterator loop_body_gsi;
+ oacc_build_indexed_ssa_loop (loc, max_idx, gsi,
+ &loop_index, &loop_body_gsi);
+ gimple_seq copy_seq = NULL;
+
+ tree dst_array_type = TREE_TYPE (TREE_TYPE (dst));
+ tree dst_elem_type = build_qualified_type (TREE_TYPE (dst_array_type),
+ TYPE_QUALS (dst_array_type));
+ tree dst_elem_ptr_type = build_pointer_type (dst_elem_type);
+ tree dst_ptr = fold_convert (dst_elem_ptr_type, dst);
+
+ tree src_array_type = TREE_TYPE (TREE_TYPE (src));
+ tree src_elem_type = build_qualified_type (TREE_TYPE (src_array_type),
+ TYPE_QUALS (src_array_type));
+ tree src_elem_ptr_type = build_pointer_type (src_elem_type);
+ tree src_ptr = fold_convert (src_elem_ptr_type, src);
+
+ tree offset = build2 (MULT_EXPR, sizetype,
+ loop_index, TYPE_SIZE_UNIT (dst_elem_type));
+
+ dst_ptr = build2 (POINTER_PLUS_EXPR, dst_elem_ptr_type, dst_ptr, offset);
+ src_ptr = build2 (POINTER_PLUS_EXPR, src_elem_ptr_type, src_ptr, offset);
+
+ tree dst_mem_ref = build_simple_mem_ref (dst_ptr);
+ tree src_mem_ref = build_simple_mem_ref (src_ptr);
+
+ gimplify_assign (dst_mem_ref, src_mem_ref, &copy_seq);
+
+ gsi_insert_seq_before (&loop_body_gsi, copy_seq, GSI_SAME_STMT);
+ pop_gimplify_context (NULL);
+}
+
+void
+oacc_build_indexed_ssa_loop (location_t loc, tree max_index,
+ gimple_stmt_iterator *gsi, tree *out_loop_index,
+ gimple_stmt_iterator *out_loop_body_code_gsi)
+{
+ gimple *g;
+ gimple_seq seq = NULL;
+
+ tree init_index = make_ssa_name (TREE_TYPE (max_index));
+ tree loop_index = make_ssa_name (TREE_TYPE (max_index));
+ tree update_index = make_ssa_name (TREE_TYPE (max_index));
+
+ g = gimple_build_assign (init_index,
+ build_int_cst (TREE_TYPE (init_index), 0));
+ gimple_seq_add_stmt (&seq, g);
+
+ gimple *init_end = gimple_seq_last (seq);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ basic_block init_bb = gsi_bb (*gsi);
+ edge init_edge = split_block (init_bb, init_end);
+ basic_block loop_bb = init_edge->dest;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ seq = NULL;
+ g = gimple_build_assign (update_index, PLUS_EXPR, loop_index,
+ build_int_cst (TREE_TYPE (loop_index), 1));
+ gimple_seq_add_stmt (&seq, g);
+
+ g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL);
+ gimple_seq_add_stmt (&seq, g);
+ gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+
+ edge post_edge = split_block (loop_bb, g);
+ basic_block post_bb = post_edge->dest;
+ loop_bb = post_edge->src;
+ /* Reset the iterator. */
+ *gsi = gsi_for_stmt (gsi_stmt (*gsi));
+
+ /* Return place where we insert loop body code. */
+ gimple_stmt_iterator loop_body_code_gsi = gsi_start_bb (loop_bb);
+
+ post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU;
+ post_edge->probability = profile_probability::even ();
+ edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE);
+ loop_edge->probability = profile_probability::even ();
+ set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb);
+ set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
+ class loop *new_loop = alloc_loop ();
+ new_loop->header = loop_bb;
+ new_loop->latch = loop_bb;
+ add_loop (new_loop, loop_bb->loop_father);
+
+ gphi *phi = create_phi_node (loop_index, loop_bb);
+ add_phi_arg (phi, init_index, init_edge, loc);
+ add_phi_arg (phi, update_index, loop_edge, loc);
+
+ *out_loop_index = loop_index;
+ *out_loop_body_code_gsi = loop_body_code_gsi;
+}
+
/* Default goacc.reduction early expander.
LHS-opt = IFN_REDUCTION (KIND, RES_PTR, VAR, LEVEL, OP, OFFSET)
@@ -2176,29 +2300,44 @@ default_goacc_reduction (gcall *call)
if there is one. */
tree ref_to_res = gimple_call_arg (call, 1);
+ tree array_addr = gimple_call_arg (call, 6);
+ tree array_max_idx = gimple_call_arg (call, 7);
+
if (!integer_zerop (ref_to_res))
{
- /* Dummy reduction vars that have GOMP_MAP_FIRSTPRIVATE_POINTER data
- mappings gets retyped to (void *). Adjust the type of ref_to_res
- as appropriate. */
- if (TREE_TYPE (TREE_TYPE (ref_to_res)) != TREE_TYPE (var))
+ if (!integer_zerop (array_addr))
{
- tree ptype = build_pointer_type (TREE_TYPE (var));
- tree t = make_ssa_name (ptype);
- tree expr = fold_build1 (NOP_EXPR, ptype, ref_to_res);
- gimple_seq_add_stmt (&seq, gimple_build_assign (t, expr));
- ref_to_res = t;
+ tree dst, src;
+ if (code == IFN_GOACC_REDUCTION_SETUP)
+ dst = array_addr, src = ref_to_res;
+ else
+ src = array_addr, dst = ref_to_res;
+ oacc_build_array_copy (dst, src, array_max_idx, &seq);
}
- tree dst = build_simple_mem_ref (ref_to_res);
- tree src = var;
-
- if (code == IFN_GOACC_REDUCTION_SETUP)
+ else
{
- src = dst;
- dst = lhs;
- lhs = NULL;
+ /* Dummy reduction vars that have GOMP_MAP_FIRSTPRIVATE_POINTER data
+ mappings gets retyped to (void *). Adjust the type of ref_to_res
+ as appropriate. */
+ if (TREE_TYPE (TREE_TYPE (ref_to_res)) != TREE_TYPE (var))
+ {
+ tree ptype = build_pointer_type (TREE_TYPE (var));
+ tree t = make_ssa_name (ptype);
+ tree expr = fold_build1 (NOP_EXPR, ptype, ref_to_res);
+ gimple_seq_add_stmt (&seq, gimple_build_assign (t, expr));
+ ref_to_res = t;
+ }
+ tree dst = build_simple_mem_ref (ref_to_res);
+ tree src = var;
+
+ if (code == IFN_GOACC_REDUCTION_SETUP)
+ {
+ src = dst;
+ dst = lhs;
+ lhs = NULL;
+ }
+ gimple_seq_add_stmt (&seq, gimple_build_assign (dst, src));
}
- gimple_seq_add_stmt (&seq, gimple_build_assign (dst, src));
}
}
diff --git a/gcc/omp-offload.h b/gcc/omp-offload.h
index 9911083..c8b697a 100644
--- a/gcc/omp-offload.h
+++ b/gcc/omp-offload.h
@@ -26,6 +26,16 @@ extern int oacc_get_default_dim (int dim);
extern int oacc_get_min_dim (int dim);
extern int oacc_fn_attrib_level (tree attr);
+/* Used by both NVPTX/GCN OpenACC code. */
+extern void oacc_build_indexed_ssa_loop (location_t loc, tree max_index,
+ gimple_stmt_iterator *gsi,
+ tree *out_loop_index,
+ gimple_stmt_iterator *out_loop_body_code_gsi);
+extern void oacc_build_array_copy (tree dst, tree src, tree max_idx, gimple_seq *seq);
+extern void oacc_build_array_copy_loop (location_t loc, tree dst, tree src,
+ tree max_idx,
+ gimple_stmt_iterator *gsi);
+
extern GTY(()) vec<tree, va_gc> *offload_funcs;
extern GTY(()) vec<tree, va_gc> *offload_vars;
extern GTY(()) vec<tree, va_gc> *offload_ind_funcs;
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-2.c b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
index 3f52a9f..def81c2 100644
--- a/gcc/testsuite/c-c++-common/goacc/readonly-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
@@ -12,5 +12,5 @@ int main (void)
r = a[8];
}
}
-/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
-/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-11.c b/gcc/testsuite/c-c++-common/goacc/reduction-11.c
new file mode 100644
index 0000000..29eb4b5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-11.c
@@ -0,0 +1,81 @@
+/* { dg-additional-options "-fdump-tree-gimple" } */
+/* Integer array reductions. */
+
+#define n 1000
+
+int
+main(void)
+{
+ int i, j;
+ int result[n], array[n];
+ int lresult[n];
+
+ /* '+' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (+:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] += array[i];
+
+ /* '*' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (*:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] *= array[i];
+
+ /* 'max' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (max:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] > array[i] ? result[j] : array[i];
+
+ /* 'min' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (min:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] < array[i] ? result[j] : array[i];
+
+ /* '&' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (&:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] &= array[i];
+
+ /* '|' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (|:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] |= array[i];
+
+ /* '^' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (^:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] ^= array[i];
+
+ /* '&&' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (&&:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] && (result[j] > array[i]);
+
+ /* '||' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (||:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] || (result[j] > array[i]);
+
+ return 0;
+}
+
+/* Check that default copy maps are generated for loop reductions. */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 9 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-12.c b/gcc/testsuite/c-c++-common/goacc/reduction-12.c
new file mode 100644
index 0000000..e9dcc1c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-12.c
@@ -0,0 +1,60 @@
+/* { dg-additional-options "-fdump-tree-gimple" } */
+/* float array reductions. */
+
+#define n 1000
+
+int
+main(void)
+{
+ int i, j;
+ float result[n], array[n];
+ int lresult[n];
+
+ /* '+' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (+:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] += array[i];
+
+ /* '*' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (*:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] *= array[i];
+
+ /* 'max' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (max:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] > array[i] ? result[j] : array[i];
+
+ /* 'min' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (min:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] < array[i] ? result[j] : array[i];
+
+ /* '&&' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (&&:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] && (result[j] > array[i]);
+
+ /* '||' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (||:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] || (result[j] > array[i]);
+
+ return 0;
+}
+
+/* Check that default copy maps are generated for loop reductions. */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 6 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-13.c b/gcc/testsuite/c-c++-common/goacc/reduction-13.c
new file mode 100644
index 0000000..8800b44
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-13.c
@@ -0,0 +1,60 @@
+/* { dg-additional-options "-fdump-tree-gimple" } */
+/* double array reductions. */
+
+#define n 1000
+
+int
+main(void)
+{
+ int i, j;
+ double result[n], array[n];
+ int lresult[n];
+
+ /* '+' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (+:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] += array[i];
+
+ /* '*' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (*:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] *= array[i];
+
+ /* 'max' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (max:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] > array[i] ? result[j] : array[i];
+
+ /* 'min' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (min:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] = result[j] < array[i] ? result[j] : array[i];
+
+ /* '&&' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (&&:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] && (result[j] > array[i]);
+
+ /* '||' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (||:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] || (result[j] > array[i]);
+
+ return 0;
+}
+
+/* Check that default copy maps are generated for loop reductions. */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 6 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-14.c b/gcc/testsuite/c-c++-common/goacc/reduction-14.c
new file mode 100644
index 0000000..48117a3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-14.c
@@ -0,0 +1,46 @@
+/* { dg-additional-options "-fdump-tree-gimple" } */
+/* complex array reductions. */
+
+#define n 1000
+
+int
+main(void)
+{
+ int i, j;
+ __complex__ double result[n], array[n];
+ int lresult[n];
+
+ /* '+' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (+:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] += array[i];
+
+ /* '*' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (*:result)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ result[j] *= array[i];
+
+ /* '&&' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (&&:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] && (__real__(result[j]) > __real__(array[i]));
+
+ /* '||' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (||:lresult)
+ for (i = 0; i < n; i++)
+ for (j = 0; j < n; j++)
+ lresult[j] = lresult[j] || (__real__(result[j]) > __real__(array[i]));
+
+ return 0;
+}
+
+/* Check that default copy maps are generated for loop reductions. */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 4 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-15.c b/gcc/testsuite/c-c++-common/goacc/reduction-15.c
new file mode 100644
index 0000000..f01d988
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-15.c
@@ -0,0 +1,51 @@
+/* { dg-additional-options "-fdump-tree-gimple" } */
+/* struct reductions. */
+
+typedef struct { int x, y; } int_pair;
+typedef struct { float m, n; } flt_pair;
+typedef struct
+{
+ int i;
+ double d;
+ float f;
+ int a[4];
+ int_pair ip;
+ flt_pair fp;
+} rectype;
+
+#define n 1000
+
+int
+main(void)
+{
+ int i;
+ rectype result, array[n];
+
+ /* '+' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (+:result)
+ for (i = 0; i < n; i++)
+ {
+ result.i += array[i].i;
+ result.f += array[i].f;
+ result.ip.x += array[i].ip.x;
+ result.ip.y += array[i].ip.y;
+ }
+
+ /* '*' reductions. */
+#pragma acc parallel
+#pragma acc loop gang worker vector reduction (*:result)
+ for (i = 0; i < n; i++)
+ {
+ result.i *= array[i].i;
+ result.f *= array[i].f;
+ result.ip.x *= array[i].ip.x;
+ result.ip.y *= array[i].ip.y;
+ }
+
+ return 0;
+}
+
+/* Check that default copy maps are generated for loop reductions. */
+/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */
+
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-16.c b/gcc/testsuite/c-c++-common/goacc/reduction-16.c
new file mode 100644
index 0000000..6fb7054
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-16.c
@@ -0,0 +1,30 @@
+/* { dg-compile } */
+#include <stdlib.h>
+
+int foo (int n)
+{
+ int x[5][5];
+ int y[n];
+ int *z = (int *) malloc (5 * sizeof (int));
+
+ #pragma acc parallel
+ {
+ #pragma acc loop reduction(+:x)
+ for (int i = 0; i < 5; i++) ;
+ #pragma acc loop reduction(+:y)
+ for (int i = 0; i < 5; i++) ;
+
+ #pragma acc loop reduction(+:x[2:1][0:5])
+ for (int i = 0; i < 5; i++) ;
+ #pragma acc loop reduction(+:x[0:5][2:1]) /* { dg-error "array section is not contiguous in 'reduction' clause" } */
+ for (int i = 0; i < 5; i++) ;
+
+ #pragma acc loop reduction(+:y[0:5])
+ for (int i = 0; i < 5; i++) ;
+
+ #pragma acc loop reduction(+:z[0:5])
+ for (int i = 0; i < 5; i++) ;
+ }
+
+ return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-9.c b/gcc/testsuite/c-c++-common/goacc/reduction-9.c
index eba1d02..72b2f07 100644
--- a/gcc/testsuite/c-c++-common/goacc/reduction-9.c
+++ b/gcc/testsuite/c-c++-common/goacc/reduction-9.c
@@ -19,15 +19,15 @@ test_parallel ()
for (i = 0; i < 10; i++)
s1.b[3] += 1;
-#pragma acc parallel reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[2].a += 1;
-#pragma acc parallel reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[3].b[4] += 1;
-#pragma acc parallel reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel reduction(+:z[5])
for (i = 0; i < 10; i++)
z[5] += 1;
}
@@ -51,15 +51,15 @@ test_combined ()
for (i = 0; i < 10; i++)
s1.b[3] += 1;
-#pragma acc parallel loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[2].a += 1;
-#pragma acc parallel loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[3].b[4] += 1;
-#pragma acc parallel loop reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc parallel loop reduction(+:z[5])
for (i = 0; i < 10; i++)
z[5] += 1;
@@ -86,15 +86,15 @@ test_loops ()
for (i = 0; i < 10; i++)
s1.b[3] += 1;
-#pragma acc loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[2].a += 1;
-#pragma acc loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */
for (i = 0; i < 10; i++)
s2[3].b[4] += 1;
-#pragma acc loop reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */
+#pragma acc loop reduction(+:z[5])
for (i = 0; i < 10; i++)
z[5] += 1;
}
diff --git a/gcc/testsuite/g++.dg/goacc/reductions-1.C b/gcc/testsuite/g++.dg/goacc/reductions-1.C
index 18f43f4..829a658 100644
--- a/gcc/testsuite/g++.dg/goacc/reductions-1.C
+++ b/gcc/testsuite/g++.dg/goacc/reductions-1.C
@@ -73,19 +73,19 @@ test_parallel ()
for (i = 0; i < 100; i++)
c1.get_d ()[1] += 1;
-#pragma acc parallel reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].a += 1;
-#pragma acc parallel reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_b () += 1;
-#pragma acc parallel reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].c[1] += 1;
-#pragma acc parallel reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_d ()[1] += 1;
@@ -109,19 +109,19 @@ test_parallel ()
c2.get_d ()[1] += 1;
-#pragma acc parallel reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].a += 1;
-#pragma acc parallel reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_b () += 1;
-#pragma acc parallel reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].c[1] += 1;
-#pragma acc parallel reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_d ()[1] += 1;
@@ -144,19 +144,19 @@ test_parallel ()
for (i = 0; i < 100; i++)
s1.get_d ()[1] += 1;
-#pragma acc parallel reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].a += 1;
-#pragma acc parallel reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].get_b () += 1;
-#pragma acc parallel reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].c[1] += 1;
-#pragma acc parallel reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].get_d ()[1] += 1;
@@ -179,30 +179,30 @@ test_parallel ()
for (i = 0; i < 100; i++)
s2.get_d ()[1] += 1;
-#pragma acc parallel reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].a += 1;
-#pragma acc parallel reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_b () += 1;
-#pragma acc parallel reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].c[1] += 1;
-#pragma acc parallel reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_d ()[1] += 1;
// Reductions on arrays.
-#pragma acc parallel reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
a[10] += 1;
-#pragma acc parallel reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
b[10] += 1;
}
@@ -236,19 +236,19 @@ test_combined ()
for (i = 0; i < 100; i++)
c1.get_d ()[1] += 1;
-#pragma acc parallel loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].a += 1;
-#pragma acc parallel loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_b () += 1;
-#pragma acc parallel loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].c[1] += 1;
-#pragma acc parallel loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_d ()[1] += 1;
@@ -272,19 +272,19 @@ test_combined ()
c2.get_d ()[1] += 1;
-#pragma acc parallel loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].a += 1;
-#pragma acc parallel loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_b () += 1;
-#pragma acc parallel loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].c[1] += 1;
-#pragma acc parallel loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_d ()[1] += 1;
@@ -307,19 +307,19 @@ test_combined ()
for (i = 0; i < 100; i++)
s1.get_d ()[1] += 1;
-#pragma acc parallel loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].a += 1;
-#pragma acc parallel loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].get_b () += 1;
-#pragma acc parallel loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].c[1] += 1;
-#pragma acc parallel loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].get_d ()[1] += 1;
@@ -342,30 +342,30 @@ test_combined ()
for (i = 0; i < 100; i++)
s2.get_d ()[1] += 1;
-#pragma acc parallel loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].a += 1;
-#pragma acc parallel loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_b () += 1;
-#pragma acc parallel loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].c[1] += 1;
-#pragma acc parallel loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_d ()[1] += 1;
// Reductions on arrays.
-#pragma acc parallel loop reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
a[10] += 1;
-#pragma acc parallel loop reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc parallel loop reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
b[10] += 1;
}
@@ -402,19 +402,19 @@ test_loop ()
for (i = 0; i < 100; i++)
c1.get_d ()[1] += 1;
-#pragma acc loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].a += 1;
-#pragma acc loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_b () += 1;
-#pragma acc loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].c[1] += 1;
-#pragma acc loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c1a[1].get_d ()[1] += 1;
@@ -438,19 +438,19 @@ test_loop ()
c2.get_d ()[1] += 1;
-#pragma acc loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].a += 1;
-#pragma acc loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_b () += 1;
-#pragma acc loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].c[1] += 1;
-#pragma acc loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
c2a[1].get_d ()[1] += 1;
@@ -473,19 +473,19 @@ test_loop ()
for (i = 0; i < 100; i++)
s1.get_d ()[1] += 1;
-#pragma acc loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].a += 1;
-#pragma acc loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.. token" }
for (i = 0; i < 100; i++)
s1a[1].get_b () += 1;
-#pragma acc loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].c[1] += 1;
-#pragma acc loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s1a[1].get_d ()[1] += 1;
@@ -508,30 +508,30 @@ test_loop ()
for (i = 0; i < 100; i++)
s2.get_d ()[1] += 1;
-#pragma acc loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].a += 1;
-#pragma acc loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_b () += 1;
-#pragma acc loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].c[1] += 1;
-#pragma acc loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" }
for (i = 0; i < 100; i++)
s2a[1].get_d ()[1] += 1;
// Reductions on arrays.
-#pragma acc loop reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
a[10] += 1;
-#pragma acc loop reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" }
+#pragma acc loop reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" }
for (i = 0; i < 100; i++)
b[10] += 1;
}
diff --git a/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90 b/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90
index 5c489ff..f521bcc 100644
--- a/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90
@@ -1,39 +1,42 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-gimple" }
+
program test
implicit none
integer a(10), i
a(:) = 0
-
+
! Array reductions.
-
- !$acc parallel reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" }
+
+ !$acc parallel reduction (+:a)
do i = 1, 10
a = a + 1
end do
!$acc end parallel
!$acc parallel
- !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a)
do i = 1, 10
a = a + 1
end do
!$acc end parallel
!$acc kernels
- !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a)
do i = 1, 10
a = a + 1
end do
!$acc end kernels
- !$acc serial reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc serial reduction (+:a)
do i = 1, 10
a = a + 1
end do
!$acc end serial
!$acc serial
- !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a)
do i = 1, 10
a = a + 1
end do
@@ -41,35 +44,35 @@ program test
! Subarray reductions.
-
- !$acc parallel reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" }
+
+ !$acc parallel reduction (+:a(1:5))
do i = 1, 10
a = a + 1
end do
!$acc end parallel
!$acc parallel
- !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1:5))
do i = 1, 10
a = a + 1
end do
!$acc end parallel
!$acc kernels
- !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1:5))
do i = 1, 10
a = a + 1
end do
!$acc end kernels
- !$acc serial reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc serial reduction (+:a(1:5))
do i = 1, 10
a = a + 1
end do
!$acc end serial
!$acc serial
- !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1:5))
do i = 1, 10
a = a + 1
end do
@@ -77,35 +80,35 @@ program test
! Reductions on array elements.
-
- !$acc parallel reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" }
+
+ !$acc parallel reduction (+:a(1))
do i = 1, 10
a(1) = a(1) + 1
end do
!$acc end parallel
!$acc parallel
- !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1))
do i = 1, 10
a(1) = a(1) + 1
end do
!$acc end parallel
!$acc kernels
- !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1))
do i = 1, 10
a(1) = a(1) + 1
end do
!$acc end kernels
-
- !$acc serial reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" }
+
+ !$acc serial reduction (+:a(1))
do i = 1, 10
a(1) = a(1) + 1
end do
!$acc end serial
!$acc serial
- !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" }
+ !$acc loop reduction (+:a(1))
do i = 1, 10
a(1) = a(1) + 1
end do
@@ -114,3 +117,7 @@ program test
print *, a
end program test
+
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc loop private\\(i\\) reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\)" 9 "gimple" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_parallel reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\) map\\(tofrom:a \\\[len: \[0-9\]+\\\]\\)" 3 "gimple" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_serial reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\) map\\(tofrom:a \\\[len: \[0-9\]+\\\]\\)" 3 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90
index 6a16c8a..a835937 100644
--- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90
@@ -9,17 +9,17 @@ type(t) :: var
allocate (var%arr(1:100))
!$acc enter data copyin(var%arr(10:20))
-! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } }
!$acc exit data delete(var%arr(10:20))
-! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } }
!$acc enter data create(var%arr(20:30))
-! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } }
!$acc exit data finalize delete(var%arr(20:30))
-! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\) finalize;$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\) finalize;$} 1 "original" } }
!$acc enter data copyin(var%arr)
diff --git a/gcc/testsuite/gfortran.dg/goacc/finalize-1.f b/gcc/testsuite/gfortran.dg/goacc/finalize-1.f
index 63beb47..b34c070 100644
--- a/gcc/testsuite/gfortran.dg/goacc/finalize-1.f
+++ b/gcc/testsuite/gfortran.dg/goacc/finalize-1.f
@@ -20,7 +20,7 @@
! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(delete:del_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } }
!$ACC EXIT DATA FINALIZE DELETE (del_f_p(2:5))
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.0\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) del_f_p\\.data \\\[pointer assign, bias: \\(.*int.*\\) parm\\.0\\.data - \\(.*int.*\\) del_f_p\\.data\\\]\\) finalize;$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.0\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) del_f_p\\.data \\\[pointer assign, bias: D.\[0-9\]+\\\]\\) finalize;$" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(delete:MEM <\[^>\]+> \\\[\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\)_\[0-9\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(delete:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:del_f_p\\.data \\\[pointer assign, bias: \[^\\\]\]+\\\]\\) finalize$" 1 "gimple" } }
!$ACC EXIT DATA COPYOUT (cpo_r)
@@ -32,6 +32,6 @@
! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(force_from:cpo_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } }
!$ACC EXIT DATA COPYOUT (cpo_f_p(4:10)) FINALIZE
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.1\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) cpo_f_p\\.data \\\[pointer assign, bias: \\(.*int.*\\) parm\\.1\\.data - \\(.*int.*\\) cpo_f_p\\.data\\\]\\) finalize;$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.1\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) cpo_f_p\\.data \\\[pointer assign, bias: D.\[0-9\]+\\\]\\) finalize;$" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(force_from:MEM <\[^>\]+> \\\[\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\)_\[0-9\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(delete:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:cpo_f_p\\.data \\\[pointer assign, bias: \[^\\\]\]+\\\]\\) finalize$" 1 "gimple" } }
END SUBROUTINE f
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
index 1a26844..dc823b7 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
@@ -77,7 +77,7 @@ program main
!$acc end kernels
end program main
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels map\(to:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "gimple" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels map\(to:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "gimple" } }
! { dg-final { scan-tree-dump-times {(?n)#pragma acc loop private\(i\)$} 2 "gimple" } }
! { dg-final { scan-tree-dump-times {(?n)#pragma acc loop private\(i\) independent$} 1 "gimple" } }
@@ -86,11 +86,11 @@ end program main
! Check that the OpenACC 'kernels' got decomposed into 'data' and an enclosed
! sequence of compute constructs.
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_data_kernels map\(to:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_data_kernels map\(to:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
! As noted above, we get three "old-style" kernel regions, one gang-single region, and one parallelized loop region.
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels async\(-1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 3 "omp_oacc_kernels_decompose" } }
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_parallelized async\(-1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_gang_single async\(-1\) num_gangs\(1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels async\(-1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 3 "omp_oacc_kernels_decompose" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_parallelized async\(-1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_gang_single async\(-1\) num_gangs\(1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } }
!
! 'data' plus five CCs.
! { dg-final { scan-tree-dump-times {(?n)#pragma omp target } 6 "omp_oacc_kernels_decompose" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr70828.f90 b/gcc/testsuite/gfortran.dg/goacc/pr70828.f90
index fcfe086..648bad8 100644
--- a/gcc/testsuite/gfortran.dg/goacc/pr70828.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/pr70828.f90
@@ -18,5 +18,5 @@ program test
!$acc end data
end program test
-! { dg-final { scan-tree-dump-times "omp target oacc_data map\\(tofrom:data\\\[\_\[0-9\]+\\\] \\\[len: _\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: _\[0-9\]+\\\]\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "omp target oacc_data map\\(tofrom:data\\\[D\\.\[0-9\]+\\\] \\\[len: _\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: D\\.\[0-9\]+\\\]\\)" 1 "gimple" } }
! { dg-final { scan-tree-dump-times "omp target oacc_parallel map\\(force_present:data\\\[D\\.\[0-9\]+\\\] \\\[len: D\\.\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: D\\.\[0-9\]+\\\]\\)" 1 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/reduction.f95 b/gcc/testsuite/gfortran.dg/goacc/reduction.f95
index 2d9a111..4ab180a 100644
--- a/gcc/testsuite/gfortran.dg/goacc/reduction.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/reduction.f95
@@ -25,19 +25,14 @@ save i2
common /blk/ i1
!$acc parallel reduction (+:ia2)
-! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
!$acc parallel reduction (+:ra1)
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
!$acc parallel reduction (+:ca1)
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
!$acc parallel reduction (+:da1)
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
!$acc parallel reduction (.and.:la1)
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
!$acc parallel reduction (+:i3, r1, d1, c1)
!$acc end parallel
@@ -73,104 +68,84 @@ common /blk/ i1
!$acc parallel reduction (*:ia1) ! { dg-error "Assumed size" }
! { dg-error "Array 'ia1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end parallel
-!$acc parallel reduction (+:l1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type LOGICAL" }
+!$acc parallel reduction (+:l1) ! { dg-error "Reduction operator \\+ is not valid for 'l1'" }
!$acc end parallel
-!$acc parallel reduction (*:la1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (*:la1) ! { dg-error "Reduction operator \\* is not valid for 'la1'" }
!$acc end parallel
-!$acc parallel reduction (-:a1) ! { dg-error "OMP DECLARE REDUCTION - not found for type CHARACTER" }
+!$acc parallel reduction (-:a1) ! { dg-error "Reduction operator - is not valid for 'a1'" }
!$acc end parallel
-!$acc parallel reduction (+:t1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type TYPE" }
+!$acc parallel reduction (+:t1)
!$acc end parallel
-!$acc parallel reduction (*:ta1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (*:ta1)
!$acc end parallel
-!$acc parallel reduction (.and.:i3) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type INTEGER" }
+!$acc parallel reduction (.and.:i3) ! { dg-error "Reduction operator \\.and\\. is not valid for 'i3'" }
!$acc end parallel
-!$acc parallel reduction (.or.:ia2) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type INTEGER" }
-! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (.or.:ia2) ! { dg-error "Reduction operator \\.or\\. is not valid for 'ia2'" }
!$acc end parallel
-!$acc parallel reduction (.eqv.:r1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type REAL" }
+!$acc parallel reduction (.eqv.:r1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'r1'" }
!$acc end parallel
-!$acc parallel reduction (.neqv.:ra1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type REAL" }
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (.neqv.:ra1) ! { dg-error "Reduction operator \\.neqv. is not valid for 'ra1'" }
!$acc end parallel
-!$acc parallel reduction (.and.:d1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type REAL" }
+!$acc parallel reduction (.and.:d1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'd1'" }
!$acc end parallel
-!$acc parallel reduction (.or.:da1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type REAL" }
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (.or.:da1) ! { dg-error "Reduction operator \\.or\\. is not valid for 'da1'" }
!$acc end parallel
-!$acc parallel reduction (.eqv.:c1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type COMPLEX" }
+!$acc parallel reduction (.eqv.:c1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'c1'" }
!$acc end parallel
-!$acc parallel reduction (.neqv.:ca1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (.neqv.:ca1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ca1'" }
!$acc end parallel
-!$acc parallel reduction (.and.:a1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type CHARACTER" }
+!$acc parallel reduction (.and.:a1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'a1'" }
!$acc end parallel
-!$acc parallel reduction (.or.:t1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type TYPE" }
+!$acc parallel reduction (.or.:t1) ! { dg-error "Reduction operator \\.or\\. is not valid for 't1'" }
!$acc end parallel
-!$acc parallel reduction (.eqv.:ta1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (.eqv.:ta1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'ta1'" }
!$acc end parallel
-!$acc parallel reduction (min:c1) ! { dg-error "OMP DECLARE REDUCTION min not found for type COMPLEX" }
+!$acc parallel reduction (min:c1) ! { dg-error "Reduction operator min is not valid for 'c1'" }
!$acc end parallel
-!$acc parallel reduction (max:ca1) ! { dg-error "OMP DECLARE REDUCTION max not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (max:ca1) ! { dg-error "Reduction operator max is not valid for 'ca1'" }
!$acc end parallel
-!$acc parallel reduction (max:l1) ! { dg-error "OMP DECLARE REDUCTION max not found for type LOGICAL" }
+!$acc parallel reduction (max:l1) ! { dg-error "Reduction operator max is not valid for 'l1'" }
!$acc end parallel
-!$acc parallel reduction (min:la1) ! { dg-error "OMP DECLARE REDUCTION min not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (min:la1) ! { dg-error "Reduction operator min is not valid for 'la1'" }
!$acc end parallel
-!$acc parallel reduction (max:a1) ! { dg-error "OMP DECLARE REDUCTION max not found for type CHARACTER" }
+!$acc parallel reduction (max:a1) ! { dg-error "Reduction operator max is not valid for 'a1'" }
!$acc end parallel
-!$acc parallel reduction (min:t1) ! { dg-error "OMP DECLARE REDUCTION min not found for type TYPE" }
+!$acc parallel reduction (min:t1)
!$acc end parallel
-!$acc parallel reduction (max:ta1) ! { dg-error "OMP DECLARE REDUCTION max not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (max:ta1)
!$acc end parallel
-!$acc parallel reduction (iand:r1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type REAL" }
+!$acc parallel reduction (iand:r1) ! { dg-error "Reduction operator iand is not valid for 'r1'" }
!$acc end parallel
-!$acc parallel reduction (ior:ra1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" }
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (ior:ra1) ! { dg-error "Reduction operator ior is not valid for 'ra1'" }
!$acc end parallel
-!$acc parallel reduction (ieor:d1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type REAL" }
+!$acc parallel reduction (ieor:d1) ! { dg-error "Reduction operator ieor is not valid for 'd1'" }
!$acc end parallel
-!$acc parallel reduction (ior:da1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" }
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (ior:da1) ! { dg-error "Reduction operator ior is not valid for 'da1'" }
!$acc end parallel
-!$acc parallel reduction (iand:c1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type COMPLEX" }
+!$acc parallel reduction (iand:c1) ! { dg-error "Reduction operator iand is not valid for 'c1'" }
!$acc end parallel
-!$acc parallel reduction (ior:ca1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (ior:ca1) ! { dg-error "Reduction operator ior is not valid for 'ca1'" }
!$acc end parallel
-!$acc parallel reduction (ieor:l1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type LOGICAL" }
+!$acc parallel reduction (ieor:l1) ! { dg-error "Reduction operator ieor is not valid for 'l1'" }
!$acc end parallel
-!$acc parallel reduction (iand:la1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (iand:la1) ! { dg-error "Reduction operator iand is not valid for 'la1'" }
!$acc end parallel
-!$acc parallel reduction (ior:a1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type CHARACTER" }
+!$acc parallel reduction (ior:a1) ! { dg-error "Reduction operator ior is not valid for 'a1'" }
!$acc end parallel
-!$acc parallel reduction (ieor:t1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type TYPE" }
+!$acc parallel reduction (ieor:t1)
!$acc end parallel
-!$acc parallel reduction (iand:ta1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc parallel reduction (iand:ta1)
!$acc end parallel
!$acc serial reduction (+:ia2)
-! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
!$acc serial reduction (+:ra1)
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
!$acc serial reduction (+:ca1)
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
!$acc serial reduction (+:da1)
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
!$acc serial reduction (.and.:la1)
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
!$acc serial reduction (+:i3, r1, d1, c1)
!$acc end serial
@@ -206,88 +181,73 @@ common /blk/ i1
!$acc serial reduction (*:ia1) ! { dg-error "Assumed size" }
! { dg-error "Array 'ia1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
!$acc end serial
-!$acc serial reduction (+:l1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type LOGICAL" }
+!$acc serial reduction (+:l1) ! { dg-error "Reduction operator \\+ is not valid for 'l1'" }
!$acc end serial
-!$acc serial reduction (*:la1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (*:la1) ! { dg-error "Reduction operator \\* is not valid for 'la1'" }
!$acc end serial
-!$acc serial reduction (-:a1) ! { dg-error "OMP DECLARE REDUCTION - not found for type CHARACTER" }
+!$acc serial reduction (-:a1) ! { dg-error "Reduction operator - is not valid for 'a1'" }
!$acc end serial
-!$acc serial reduction (+:t1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type TYPE" }
+!$acc serial reduction (+:t1)
!$acc end serial
-!$acc serial reduction (*:ta1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (*:ta1)
!$acc end serial
-!$acc serial reduction (.and.:i3) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type INTEGER" }
+!$acc serial reduction (.and.:i3) ! { dg-error "Reduction operator \\.and\\. is not valid for 'i3'" }
!$acc end serial
-!$acc serial reduction (.or.:ia2) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type INTEGER" }
-! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (.or.:ia2) ! { dg-error "Reduction operator \\.or\\. is not valid for 'ia2'" }
!$acc end serial
-!$acc serial reduction (.eqv.:r1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type REAL" }
+!$acc serial reduction (.eqv.:r1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'r1'" }
!$acc end serial
-!$acc serial reduction (.neqv.:ra1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type REAL" }
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (.neqv.:ra1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ra1'" }
!$acc end serial
-!$acc serial reduction (.and.:d1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type REAL" }
+!$acc serial reduction (.and.:d1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'd1'" }
!$acc end serial
-!$acc serial reduction (.or.:da1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type REAL" }
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (.or.:da1) ! { dg-error "Reduction operator \\.or\\. is not valid for 'da1'" }
!$acc end serial
-!$acc serial reduction (.eqv.:c1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type COMPLEX" }
+!$acc serial reduction (.eqv.:c1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'c1'" }
!$acc end serial
-!$acc serial reduction (.neqv.:ca1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (.neqv.:ca1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ca1'" }
!$acc end serial
-!$acc serial reduction (.and.:a1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type CHARACTER" }
+!$acc serial reduction (.and.:a1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'a1'" }
!$acc end serial
-!$acc serial reduction (.or.:t1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type TYPE" }
+!$acc serial reduction (.or.:t1) ! { dg-error "Reduction operator \\.or\\. is not valid for 't1'" }
!$acc end serial
-!$acc serial reduction (.eqv.:ta1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (.eqv.:ta1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'ta1'" }
!$acc end serial
-!$acc serial reduction (min:c1) ! { dg-error "OMP DECLARE REDUCTION min not found for type COMPLEX" }
+!$acc serial reduction (min:c1) ! { dg-error "Reduction operator min is not valid for 'c1'" }
!$acc end serial
-!$acc serial reduction (max:ca1) ! { dg-error "OMP DECLARE REDUCTION max not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (max:ca1) ! { dg-error "Reduction operator max is not valid for 'ca1'" }
!$acc end serial
-!$acc serial reduction (max:l1) ! { dg-error "OMP DECLARE REDUCTION max not found for type LOGICAL" }
+!$acc serial reduction (max:l1) ! { dg-error "Reduction operator max is not valid for 'l1'" }
!$acc end serial
-!$acc serial reduction (min:la1) ! { dg-error "OMP DECLARE REDUCTION min not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (min:la1) ! { dg-error "Reduction operator min is not valid for 'la1'" }
!$acc end serial
-!$acc serial reduction (max:a1) ! { dg-error "OMP DECLARE REDUCTION max not found for type CHARACTER" }
+!$acc serial reduction (max:a1) ! { dg-error "Reduction operator max is not valid for 'a1'" }
!$acc end serial
-!$acc serial reduction (min:t1) ! { dg-error "OMP DECLARE REDUCTION min not found for type TYPE" }
+!$acc serial reduction (min:t1)
!$acc end serial
-!$acc serial reduction (max:ta1) ! { dg-error "OMP DECLARE REDUCTION max not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (max:ta1)
!$acc end serial
-!$acc serial reduction (iand:r1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type REAL" }
+!$acc serial reduction (iand:r1) ! { dg-error "Reduction operator iand is not valid for 'r1'" }
!$acc end serial
-!$acc serial reduction (ior:ra1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" }
-! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (ior:ra1) ! { dg-error "Reduction operator ior is not valid for 'ra1'" }
!$acc end serial
-!$acc serial reduction (ieor:d1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type REAL" }
+!$acc serial reduction (ieor:d1) ! { dg-error "Reduction operator ieor is not valid for 'd1'" }
!$acc end serial
-!$acc serial reduction (ior:da1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" }
-! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (ior:da1) ! { dg-error "Reduction operator ior is not valid for 'da1'" }
!$acc end serial
-!$acc serial reduction (iand:c1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type COMPLEX" }
+!$acc serial reduction (iand:c1) ! { dg-error "Reduction operator iand is not valid for 'c1'" }
!$acc end serial
-!$acc serial reduction (ior:ca1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type COMPLEX" }
-! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (ior:ca1) ! { dg-error "Reduction operator ior is not valid for 'ca1'" }
!$acc end serial
-!$acc serial reduction (ieor:l1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type LOGICAL" }
+!$acc serial reduction (ieor:l1) ! { dg-error "Reduction operator ieor is not valid for 'l1'" }
!$acc end serial
-!$acc serial reduction (iand:la1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type LOGICAL" }
-! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (iand:la1) ! { dg-error "Reduction operator iand is not valid for 'la1'" }
!$acc end serial
-!$acc serial reduction (ior:a1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type CHARACTER" }
+!$acc serial reduction (ior:a1) ! { dg-error "Reduction operator ior is not valid for 'a1'" }
!$acc end serial
-!$acc serial reduction (ieor:t1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type TYPE" }
+!$acc serial reduction (ieor:t1)
!$acc end serial
-!$acc serial reduction (iand:ta1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type TYPE" }
-! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 }
+!$acc serial reduction (iand:ta1)
!$acc end serial
end subroutine
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90 b/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90
index 74eb894..dc267c2 100644
--- a/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90
@@ -9,17 +9,17 @@ type(t) :: var
allocate (var%arr(1:100))
!$omp target enter data map(to: var%arr(10:20))
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } }
!$omp target exit data map(release: var%arr(10:20))
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } }
!$omp target enter data map(alloc: var%arr(20:30))
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } }
!$omp target exit data map(delete: var%arr(20:30))
-! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(delete:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(delete:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } }
+! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(delete:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(delete:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } }
!$omp target enter data map(to: var%arr)
diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc
index fc0cd39..6be5afd 100644
--- a/gcc/tree-loop-distribution.cc
+++ b/gcc/tree-loop-distribution.cc
@@ -1197,6 +1197,16 @@ generate_memset_builtin (class loop *loop, partition *partition)
/* The new statements will be placed before LOOP. */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
+ if (flag_openacc
+ && gsi_stmt (gsi)
+ && gimple_call_internal_p (gsi_stmt (gsi), IFN_UNIQUE)
+ && (TREE_INT_CST_LOW (gimple_call_arg (gsi_stmt (gsi), 0))
+ == (unsigned HOST_WIDE_INT) IFN_UNIQUE_OACC_FORK))
+ {
+ edge e = split_block (loop_preheader_edge (loop)->src, gsi_stmt (gsi));
+ gsi = gsi_last_bb (e->dest);
+ }
+
nb_bytes = rewrite_to_non_trapping_overflow (builtin->size);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
false, GSI_CONTINUE_LINKING);
@@ -1251,6 +1261,16 @@ generate_memcpy_builtin (class loop *loop, partition *partition)
/* The new statements will be placed before LOOP. */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
+ if (flag_openacc
+ && gsi_stmt (gsi)
+ && gimple_call_internal_p (gsi_stmt (gsi), IFN_UNIQUE)
+ && (TREE_INT_CST_LOW (gimple_call_arg (gsi_stmt (gsi), 0))
+ == (unsigned HOST_WIDE_INT) IFN_UNIQUE_OACC_FORK))
+ {
+ edge e = split_block (loop_preheader_edge (loop)->src, gsi_stmt (gsi));
+ gsi = gsi_last_bb (e->dest);
+ }
+
nb_bytes = rewrite_to_non_trapping_overflow (builtin->size);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
false, GSI_CONTINUE_LINKING);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c
new file mode 100644
index 0000000..6f1b86a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+
+/* Array reductions. */
+
+#include <stdlib.h>
+#include "reduction.h"
+
+#define ng 8
+#define nw 4
+#define vl 32
+
+#define N 10
+
+#define check_reduction_array_op_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all(array, op, type, opr, init, b)
+#define check_reduction_arraysec_op_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all(arraysec, op, type, opr, init, b)
+#define check_reduction_array_macro_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all(array, macro, type, opr, init, b)
+#define check_reduction_arraysec_macro_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all(arraysec, macro, type, opr, init, b)
+
+int
+main (void)
+{
+ const int n = 100;
+ int ints[n];
+ float flts[n];
+ double dbls[n];
+ int cmp_val = 5;
+
+ for (int i = 0; i < n; i++)
+ {
+ ints[i] = i + 1;
+ flts[i] = i + 1;
+ dbls[i] = i + 1;
+ }
+
+ check_reduction_array_op_all (int, +, 0, ints[i]);
+ check_reduction_array_op_all (int, *, 1, ints[i]);
+ check_reduction_array_op_all (int, &, -1, ints[i]);
+ check_reduction_array_op_all (int, |, 0, ints[i]);
+ check_reduction_array_op_all (int, ^, 0, ints[i]);
+ check_reduction_array_op_all (int, &&, 1, (cmp_val > ints[i]));
+ check_reduction_array_op_all (int, ||, 0, (cmp_val > ints[i]));
+ check_reduction_array_macro_all (int, min, n + 1, ints[i]);
+ check_reduction_array_macro_all (int, max, -1, ints[i]);
+
+ check_reduction_array_op_all (float, +, 0, flts[i]);
+ check_reduction_array_op_all (float, *, 1, flts[i]);
+ check_reduction_array_macro_all (float, min, n + 1, flts[i]);
+ check_reduction_array_macro_all (float, max, -1, flts[i]);
+
+ check_reduction_arraysec_op_all (int, +, 0, ints[i]);
+ check_reduction_arraysec_op_all (float, *, 1, flts[i]);
+ check_reduction_arraysec_macro_all (double, min, n + 1, dbls[i]);
+ check_reduction_arraysec_macro_all (double, max, -1, dbls[i]);
+
+ check_reduction_array_op_all (double, +, 0, dbls[i]);
+#if 0
+ /* Currently fails due to unclear issue, presumably unrelated to reduction
+ mechanics. Avoiding for now. */
+ check_reduction_array_op_all (double, *, 1.0, dbls[i]);
+#endif
+ check_reduction_array_macro_all (double, min, n + 1, dbls[i]);
+ check_reduction_array_macro_all (double, max, -1, dbls[i]);
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c
new file mode 100644
index 0000000..43e139f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c
@@ -0,0 +1,91 @@
+/* { dg-do run } */
+
+/* More array reduction tests, different combinations of parallel/loop
+ construct, implied/explicit copy clauses, and subarrays. */
+
+#define ARRAY_BODY(ARRAY, MIN, LEN) \
+ for (int i = 0; i < 10; i++) \
+ for (int j = MIN; j < MIN + LEN; j++) \
+ ARRAY[j] += 1;
+
+int main (void)
+{
+ int o[6] = { 5, 1, 1, 5, 9, 9 };
+ int a[6];
+
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ a[i] = o[i];
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a[1:2])
+ ARRAY_BODY (a, 1, 2)
+ ARRAY_BODY (o, 1, 2)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a[3:2])
+ #pragma acc loop reduction(+:a[3:2])
+ ARRAY_BODY (a, 3, 2)
+ ARRAY_BODY (o, 3, 2)
+ for (int i = 0; i < 6; i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a[0:5])
+ ARRAY_BODY (a, 0, 5)
+ ARRAY_BODY (o, 0, 5)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, 4, 1)
+ ARRAY_BODY (o, 4, 1)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, 3, 3)
+ ARRAY_BODY (o, 3, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#if !defined(ACC_DEVICE_TYPE_host)
+
+ #pragma acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, 1, 3)
+ ARRAY_BODY (o, 1, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel loop reduction(+:a[2:3])
+ ARRAY_BODY (a, 2, 3)
+ ARRAY_BODY (o, 2, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel reduction(+:a)
+ ARRAY_BODY (a, 3, 2)
+ ARRAY_BODY (o, 3, 2)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel reduction(+:a[1:2])
+ ARRAY_BODY (a, 1, 2)
+ ARRAY_BODY (o, 1, 2)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#endif
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c
new file mode 100644
index 0000000..aeae2e0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c
@@ -0,0 +1,90 @@
+/* { dg-do run } */
+
+/* Same as reduction-arrays-2.c test, but with non-constant subarray
+ base indexes. */
+
+#define ARRAY_BODY(ARRAY, MIN, LEN) \
+ for (int i = 0; i < 10; i++) \
+ for (int j = MIN; j < MIN + LEN; j++) \
+ ARRAY[j] += 1;
+
+int zero = 0;
+int one = 1;
+int two = 2;
+int three = 3;
+int four = 4;
+
+int main (void)
+{
+ int o[6] = { 5, 1, 1, 5, 9, 9 };
+ int a[6];
+
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ a[i] = o[i];
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a[one:2])
+ ARRAY_BODY (a, one, 2)
+ ARRAY_BODY (o, one, 2)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a[three:2])
+ #pragma acc loop reduction(+:a[three:2])
+ ARRAY_BODY (a, three, 2)
+ ARRAY_BODY (o, three, 2)
+ for (int i = 0; i < 6; i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a[zero:5])
+ ARRAY_BODY (a, zero, 5)
+ ARRAY_BODY (o, zero, 5)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, four, 1)
+ ARRAY_BODY (o, four, 1)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, three, 3)
+ ARRAY_BODY (o, three, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#if !defined(ACC_DEVICE_TYPE_host)
+
+ #pragma acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, one, 3)
+ ARRAY_BODY (o, one, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel loop reduction(+:a[two:3])
+ ARRAY_BODY (a, two, 3)
+ ARRAY_BODY (o, two, 3)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel reduction(+:a[one:2])
+ ARRAY_BODY (a, one, 2)
+ ARRAY_BODY (o, one, 2)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#endif
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c
new file mode 100644
index 0000000..c095284
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c
@@ -0,0 +1,91 @@
+/* { dg-do run } */
+
+/* Same as reduction-arrays-3.c test, but additionally with
+ non-constant subarray lengths. */
+
+#define ARRAY_BODY(ARRAY, MIN, LEN) \
+ for (int i = 0; i < 10; i++) \
+ for (int j = MIN; j < MIN + LEN; j++) \
+ ARRAY[j] += 1;
+
+int zero = 0;
+int one = 1;
+int two = 2;
+int three = 3;
+int four = 4;
+int five = 5;
+
+int main (void)
+{
+ int o[6] = { 5, 1, 1, 5, 9, 9 };
+ int a[6];
+
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ a[i] = o[i];
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a[one:two])
+ ARRAY_BODY (a, one, two)
+ ARRAY_BODY (o, one, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a[three:two])
+ #pragma acc loop reduction(+:a[three:two])
+ ARRAY_BODY (a, three, two)
+ ARRAY_BODY (o, three, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a[zero:five])
+ ARRAY_BODY (a, zero, five)
+ ARRAY_BODY (o, zero, five)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, four, one)
+ ARRAY_BODY (o, four, one)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, three, three)
+ ARRAY_BODY (o, three, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#if !defined(ACC_DEVICE_TYPE_host)
+
+ #pragma acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, one, three)
+ ARRAY_BODY (o, one, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel loop reduction(+:a[two:three])
+ ARRAY_BODY (a, two, three)
+ ARRAY_BODY (o, two, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel reduction(+:a[one:two])
+ ARRAY_BODY (a, one, two)
+ ARRAY_BODY (o, one, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+#endif
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c
new file mode 100644
index 0000000..4794350
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c
@@ -0,0 +1,89 @@
+/* { dg-do run } */
+
+/* Same as reduction-arrays-4.c test, but reduced arrays are VLAs. */
+
+#define ARRAY_BODY(ARRAY, MIN, LEN) \
+ for (int i = 0; i < 10; i++) \
+ for (int j = MIN; j < MIN + LEN; j++) \
+ ARRAY[j] += 1;
+
+int zero = 0;
+int one = 1;
+int two = 2;
+int three = 3;
+int four = 4;
+int five = 5;
+int six = 6;
+
+int main (void)
+{
+ int init[6] = { 5, 1, 1, 5, 9, 9 };
+ int o[six];
+ int a[six];
+
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ a[i] = o[i] = init[i];
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a[one:two])
+ ARRAY_BODY (a, one, two)
+ ARRAY_BODY (o, one, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a[three:two])
+ #pragma acc loop reduction(+:a[three:two])
+ ARRAY_BODY (a, three, two)
+ ARRAY_BODY (o, three, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a[zero:five])
+ ARRAY_BODY (a, zero, five)
+ ARRAY_BODY (o, zero, five)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, four, one)
+ ARRAY_BODY (o, four, one)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel copy(a)
+ #pragma acc loop reduction(+:a)
+ ARRAY_BODY (a, three, three)
+ ARRAY_BODY (o, three, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, one, three)
+ ARRAY_BODY (o, one, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel loop reduction(+:a[two:three])
+ ARRAY_BODY (a, two, three)
+ ARRAY_BODY (o, two, three)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ #pragma acc parallel reduction(+:a[one:two])
+ ARRAY_BODY (a, one, two)
+ ARRAY_BODY (o, one, two)
+ for (int i = 0; i < sizeof (a) / sizeof (int); i++)
+ if (a[i] != o[i])
+ __builtin_abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c
new file mode 100644
index 0000000..22216ff
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c
@@ -0,0 +1,121 @@
+/* { dg-do run } */
+
+/* Struct reductions. */
+
+#include <stdlib.h>
+#include "reduction.h"
+
+#define ng 8
+#define nw 4
+#define vl 32
+
+#define N 10
+
+typedef struct { int x, y; } int_pair;
+typedef struct { float m, n; } flt_pair;
+typedef struct
+{
+ int i;
+ double d;
+ float f;
+ int a[N];
+ int_pair ip;
+ flt_pair fp;
+} rectype;
+
+static void
+init_struct (rectype *rec, int val)
+{
+ rec->i = val;
+ rec->d = (double) val;
+ rec->f = (float) val;
+ for (int i = 0; i < N; i++)
+ rec->a[i] = val;
+ rec->ip.x = val;
+ rec->ip.y = val;
+ rec->fp.m = (float) val;
+ rec->fp.n = (float) val;
+}
+
+static int
+struct_eq (rectype *a, rectype *b)
+{
+ if (a->i != b->i || a->d != b->d
+ || a->f != b->f
+ || a->ip.x != b->ip.x
+ || a->ip.y != b->ip.y
+ || a->fp.m != b->fp.m
+ || a->fp.n != b->fp.n)
+ return 0;
+
+ for (int i = 0; i < N; i++)
+ if (a->a[i] != b->a[i])
+ return 0;
+ return 1;
+}
+
+#define check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, apply) \
+ { \
+ type res, vres; \
+ init_struct (&res, init); \
+ DO_PRAGMA (acc parallel gwv_par copy(res)) \
+ DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \
+ for (int i = 0; i < n; i++) \
+ { \
+ res.i = apply (op, res.i, b); \
+ res.d = apply (op, res.d, b); \
+ res.f = apply (op, res.f, b); \
+ for (int j = 0; j < N; j++) \
+ res.a[j] = apply (op, res.a[j], b); \
+ res.ip.x = apply (op, res.ip.x, b); \
+ res.ip.y = apply (op, res.ip.y, b); \
+ res.fp.m = apply (op, res.fp.m, b); \
+ res.fp.n = apply (op, res.fp.n, b); \
+ } \
+ \
+ init_struct (&vres, init); \
+ for (int i = 0; i < n; i++) \
+ { \
+ vres.i = apply (op, vres.i, b); \
+ vres.d = apply (op, vres.d, b); \
+ vres.f = apply (op, vres.f, b); \
+ for (int j = 0; j < N; j++) \
+ vres.a[j] = apply (op, vres.a[j], b); \
+ vres.ip.x = apply (op, vres.ip.x, b); \
+ vres.ip.y = apply (op, vres.ip.y, b); \
+ vres.fp.m = apply (op, vres.fp.m, b); \
+ vres.fp.n = apply (op, vres.fp.n, b); \
+ } \
+ \
+ if (!struct_eq (&res, &vres)) \
+ __builtin_abort (); \
+ }
+
+#define operator_apply(op, a, b) (a op b)
+#define check_reduction_struct_op(type, op, init, b, gwv_par, gwv_loop) \
+ check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, operator_apply)
+
+#define function_apply(op, a, b) (op (a, b))
+#define check_reduction_struct_macro(type, op, init, b, gwv_par, gwv_loop) \
+ check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, function_apply)
+
+#define check_reduction_struct_op_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all (struct, op, type, opr, init, b)
+#define check_reduction_struct_macro_all(type, opr, init, b) \
+ check_reduction_xxx_xx_all (struct, macro, type, opr, init, b)
+
+int
+main (void)
+{
+ const int n = 10;
+ int ints[n];
+
+ for (int i = 0; i < n; i++)
+ ints[i] = i + 1;
+
+ check_reduction_struct_op_all (rectype, +, 0, ints[i]);
+ check_reduction_struct_op_all (rectype, *, 1, ints[i]);
+ check_reduction_struct_macro_all (rectype, min, n + 1, ints[i]);
+ check_reduction_struct_macro_all (rectype, max, -1, ints[i]);
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h
index 1b3f8d4..c928578 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h
@@ -37,6 +37,58 @@ DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \
abort (); \
}
+#define check_reduction_array_xx(type, var, var_in_clause, op, init, b, \
+ gwv_par, gwv_loop, apply) \
+ { \
+ type var[N], var ## _check[N]; \
+ for (int i = 0; i < N; i++) \
+ var[i] = var ## _check[i] = (init); \
+ DO_PRAGMA (acc parallel gwv_par copy (var_in_clause)) \
+ DO_PRAGMA (acc loop gwv_loop reduction (op: var_in_clause)) \
+ for (int i = 0; i < n; i++) \
+ for (int j = 0; j < N; j++) \
+ var[j] = apply (op, var[j], (b)); \
+ \
+ for (int i = 0; i < n; i++) \
+ for (int j = 0; j < N; j++) \
+ var ## _check[j] = apply (op, var ## _check[j], (b)); \
+ \
+ for (int j = 0; j < N; j++) \
+ if (var[j] != var ## _check[j]) \
+ abort (); \
+ }
+
+#define operator_apply(op, a, b) (a op b)
+#define check_reduction_array_op(type, op, init, b, gwv_par, gwv_loop) \
+ check_reduction_array_xx (type, v, v, op, init, b, gwv_par, gwv_loop, \
+ operator_apply)
+#define check_reduction_arraysec_op(type, op, init, b, gwv_par, gwv_loop) \
+ check_reduction_array_xx (type, v, v[:N], op, init, b, gwv_par, gwv_loop, \
+ operator_apply)
+
+
+#define function_apply(op, a, b) (op (a, b))
+#define check_reduction_array_macro(type, op, init, b, gwv_par, gwv_loop)\
+ check_reduction_array_xx (type, v, v, op, init, b, gwv_par, gwv_loop, \
+ function_apply)
+#define check_reduction_arraysec_macro(type, op, init, b, gwv_par, gwv_loop)\
+ check_reduction_array_xx (type, v, v[:N], op, init, b, gwv_par, gwv_loop, \
+ function_apply)
+
+#define check_reduction_xxx_xx_all(tclass, form, type, op, init, b) \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, num_gangs (ng), gang); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, num_workers (nw), worker); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, vector_length (vl), vector); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, \
+ num_gangs (ng) num_workers (nw), gang worker); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, \
+ num_gangs (ng) vector_length (vl), gang vector); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, \
+ num_workers (nw) vector_length (vl), worker vector); \
+ check_reduction_ ## tclass ## _ ## form (type, op, init, b, \
+ num_gangs (ng) num_workers (nw) vector_length (vl), \
+ gang worker vector);
+
#define max(a, b) (((a) > (b)) ? (a) : (b))
#define min(a, b) (((a) < (b)) ? (a) : (b))
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90
new file mode 100644
index 0000000..f766524
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90
@@ -0,0 +1,598 @@
+! { dg-do run }
+
+! integer array reductions
+
+program main
+ implicit none
+
+ integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
+ integer :: i, j
+ integer, dimension (n) :: vresult, rg, rw, rv, rc
+ logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult
+ integer, dimension (n) :: array
+
+ do i = 1, n
+ array(i) = i
+ end do
+
+ !
+ ! '+' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(+:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(+:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(+:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(+:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) + array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 1
+ if (count (rw .ne. vresult) .ne. 0) STOP 2
+ if (count (rv .ne. vresult) .ne. 0) STOP 3
+ if (count (rc .ne. vresult) .ne. 0) STOP 4
+
+ !
+ ! '*' reductions
+ !
+
+ rg = 1
+ rw = 1
+ rv = 1
+ rc = 1
+ vresult = 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(*:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(*:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(*:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(*:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) * array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 5
+ if (count (rw .ne. vresult) .ne. 0) STOP 6
+ if (count (rv .ne. vresult) .ne. 0) STOP 7
+ if (count (rc .ne. vresult) .ne. 0) STOP 8
+
+ !
+ ! 'max' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(max:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = max (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(max:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = max (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(max:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = max (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(max:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = max (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = max (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 9
+ if (count (rw .ne. vresult) .ne. 0) STOP 10
+ if (count (rv .ne. vresult) .ne. 0) STOP 11
+ if (count (rc .ne. vresult) .ne. 0) STOP 12
+
+ !
+ ! 'min' reductions
+ !
+
+ rg = n + 1
+ rw = n + 1
+ rv = n + 1
+ rc = n + 1
+ vresult = n + 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(min:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = min (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(min:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = min (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(min:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = min (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(min:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = min (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = min (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 13
+ if (count (rw .ne. vresult) .ne. 0) STOP 14
+ if (count (rv .ne. vresult) .ne. 0) STOP 15
+ if (count (rc .ne. vresult) .ne. 0) STOP 16
+
+ !
+ ! 'iand' reductions
+ !
+
+ rg = 1
+ rw = 1
+ rv = 1
+ rc = 1
+ vresult = 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(iand:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = iand (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(iand:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = iand (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(iand:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = iand (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(iand:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = iand (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = iand (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 17
+ if (count (rw .ne. vresult) .ne. 0) STOP 18
+ if (count (rv .ne. vresult) .ne. 0) STOP 19
+ if (count (rc .ne. vresult) .ne. 0) STOP 20
+
+ !
+ ! 'ior' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(ior:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = ior (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(ior:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = ior (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(ior:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = ior (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(ior:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = ior (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = ior (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 21
+ if (count (rw .ne. vresult) .ne. 0) STOP 22
+ if (count (rv .ne. vresult) .ne. 0) STOP 23
+ if (count (rc .ne. vresult) .ne. 0) STOP 24
+
+ !
+ ! 'ieor' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(ieor:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = ieor (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(ieor:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = ieor (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(ieor:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = ieor (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(ieor:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = ieor (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = ieor (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 25
+ if (count (rw .ne. vresult) .ne. 0) STOP 26
+ if (count (rv .ne. vresult) .ne. 0) STOP 27
+ if (count (rc .ne. vresult) .ne. 0) STOP 28
+
+ !
+ ! '.and.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.and.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.and.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.and.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.and.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 29
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 30
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 31
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 32
+
+ !
+ ! '.or.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.or.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.or.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.or.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.or.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 33
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 34
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 35
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 36
+
+ !
+ ! '.eqv.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.eqv.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.eqv.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.eqv.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.eqv.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 37
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 38
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 39
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 40
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90
new file mode 100644
index 0000000..220871a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90
@@ -0,0 +1,424 @@
+! { dg-do run }
+
+! real array reductions
+
+program main
+ implicit none
+
+ integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
+ integer :: i, j
+ real, dimension (n) :: vresult, rg, rw, rv, rc
+ logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult
+ real, dimension (n) :: array
+
+ do i = 1, n
+ array(i) = i
+ end do
+
+ !
+ ! '+' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(+:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(+:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(+:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(+:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) + array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 1
+ if (count (rw .ne. vresult) .ne. 0) STOP 2
+ if (count (rv .ne. vresult) .ne. 0) STOP 3
+ if (count (rc .ne. vresult) .ne. 0) STOP 4
+
+ !
+ ! '*' reductions
+ !
+
+ rg = 1
+ rw = 1
+ rv = 1
+ rc = 1
+ vresult = 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(*:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(*:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(*:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(*:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) * array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 5
+ if (count (rw .ne. vresult) .ne. 0) STOP 6
+ if (count (rv .ne. vresult) .ne. 0) STOP 7
+ if (count (rc .ne. vresult) .ne. 0) STOP 8
+
+ !
+ ! 'max' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(max:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = max (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(max:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = max (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(max:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = max (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(max:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = max (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = max (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 9
+ if (count (rw .ne. vresult) .ne. 0) STOP 10
+ if (count (rv .ne. vresult) .ne. 0) STOP 11
+ if (count (rc .ne. vresult) .ne. 0) STOP 12
+
+ !
+ ! 'min' reductions
+ !
+
+ rg = n + 1
+ rw = n + 1
+ rv = n + 1
+ rc = n + 1
+ vresult = n + 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(min:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = min (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(min:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = min (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(min:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = min (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(min:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = min (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = min (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 13
+ if (count (rw .ne. vresult) .ne. 0) STOP 14
+ if (count (rv .ne. vresult) .ne. 0) STOP 15
+ if (count (rc .ne. vresult) .ne. 0) STOP 16
+
+ !
+ ! '.and.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.and.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.and.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.and.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.and.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 17
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 18
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 19
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 20
+
+ !
+ ! '.or.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.or.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.or.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.or.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.or.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 21
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 22
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 23
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 24
+
+ !
+ ! '.eqv.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.eqv.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.eqv.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.eqv.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.eqv.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 25
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 26
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 27
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 28
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90
new file mode 100644
index 0000000..d89d8ed
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90
@@ -0,0 +1,424 @@
+! { dg-do run }
+
+! double precision array reductions
+
+program main
+ implicit none
+
+ integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
+ integer :: i, j
+ double precision, dimension (n) :: vresult, rg, rw, rv, rc
+ logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult
+ double precision, dimension (n) :: array
+
+ do i = 1, n
+ array(i) = i
+ end do
+
+ !
+ ! '+' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(+:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(+:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(+:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(+:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) + array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 1
+ if (count (rw .ne. vresult) .ne. 0) STOP 2
+ if (count (rv .ne. vresult) .ne. 0) STOP 3
+ if (count (rc .ne. vresult) .ne. 0) STOP 4
+
+ !
+ ! '*' reductions
+ !
+
+ rg = 1
+ rw = 1
+ rv = 1
+ rc = 1
+ vresult = 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(*:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(*:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(*:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(*:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) * array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 5
+ if (count (rw .ne. vresult) .ne. 0) STOP 6
+ if (count (rv .ne. vresult) .ne. 0) STOP 7
+ if (count (rc .ne. vresult) .ne. 0) STOP 8
+
+ !
+ ! 'max' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(max:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = max (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(max:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = max (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(max:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = max (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(max:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = max (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = max (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 9
+ if (count (rw .ne. vresult) .ne. 0) STOP 10
+ if (count (rv .ne. vresult) .ne. 0) STOP 11
+ if (count (rc .ne. vresult) .ne. 0) STOP 12
+
+ !
+ ! 'min' reductions
+ !
+
+ rg = n + 1
+ rw = n + 1
+ rv = n + 1
+ rc = n + 1
+ vresult = n + 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(min:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = min (rg(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(min:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = min (rw(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(min:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = min (rv(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(min:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = min (rc(j), array(i))
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = min (vresult(j), array(i))
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 13
+ if (count (rw .ne. vresult) .ne. 0) STOP 14
+ if (count (rv .ne. vresult) .ne. 0) STOP 15
+ if (count (rc .ne. vresult) .ne. 0) STOP 16
+
+ !
+ ! '.and.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.and.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.and.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.and.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.and.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .and. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 17
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 18
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 19
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 20
+
+ !
+ ! '.or.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.or.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.or.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.or.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.or.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .or. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 21
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 22
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 23
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 24
+
+ !
+ ! '.eqv.' reductions
+ !
+
+ lrg = .true.
+ lrw = .true.
+ lrv = .true.
+ lrc = .true.
+ lvresult = .true.
+
+ !$acc parallel num_gangs(ng) copy(lrg)
+ !$acc loop reduction(.eqv.:lrg) gang
+ do i = 1, n
+ do j = 1, n
+ lrg(j) = lrg(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(lrw)
+ !$acc loop reduction(.eqv.:lrw) worker
+ do i = 1, n
+ do j = 1, n
+ lrw(j) = lrw(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(lrv)
+ !$acc loop reduction(.eqv.:lrv) vector
+ do i = 1, n
+ do j = 1, n
+ lrv(j) = lrv(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc)
+ !$acc loop reduction(.eqv.:lrc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ lrc(j) = lrc(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5)
+ end do
+ end do
+
+ if (count (lrg .neqv. lvresult) .ne. 0) STOP 25
+ if (count (lrw .neqv. lvresult) .ne. 0) STOP 26
+ if (count (lrv .neqv. lvresult) .ne. 0) STOP 27
+ if (count (lrc .neqv. lvresult) .ne. 0) STOP 28
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90
new file mode 100644
index 0000000..701cbb9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90
@@ -0,0 +1,134 @@
+! { dg-do run }
+
+! complex array reductions
+
+program main
+ implicit none
+
+ integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
+ integer :: i, j
+ complex, dimension (n) :: vresult, rg, rw, rv, rc
+ logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult
+ complex, dimension (n) :: array
+
+ do i = 1, n
+ array(i) = i
+ end do
+
+ !
+ ! '+' reductions
+ !
+
+ rg = 0
+ rw = 0
+ rv = 0
+ rc = 0
+ vresult = 0
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(+:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(+:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(+:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(+:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) + array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) + array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 1
+ if (count (rw .ne. vresult) .ne. 0) STOP 2
+ if (count (rv .ne. vresult) .ne. 0) STOP 3
+ if (count (rc .ne. vresult) .ne. 0) STOP 4
+
+ !
+ ! '*' reductions
+ !
+
+ rg = 1
+ rw = 1
+ rv = 1
+ rc = 1
+ vresult = 1
+
+ !$acc parallel num_gangs(ng) copy(rg)
+ !$acc loop reduction(*:rg) gang
+ do i = 1, n
+ do j = 1, n
+ rg(j) = rg(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_workers(nw) copy(rw)
+ !$acc loop reduction(*:rw) worker
+ do i = 1, n
+ do j = 1, n
+ rw(j) = rw(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel vector_length(vl) copy(rv)
+ !$acc loop reduction(*:rv) vector
+ do i = 1, n
+ do j = 1, n
+ rv(j) = rv(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc)
+ !$acc loop reduction(*:rc) gang worker vector
+ do i = 1, n
+ do j = 1, n
+ rc(j) = rc(j) * array(i)
+ end do
+ end do
+ !$acc end parallel
+
+ ! Verify the results
+ do i = 1, n
+ do j = 1, n
+ vresult(j) = vresult(j) * array(i)
+ end do
+ end do
+
+ if (count (rg .ne. vresult) .ne. 0) STOP 5
+ if (count (rw .ne. vresult) .ne. 0) STOP 6
+ if (count (rv .ne. vresult) .ne. 0) STOP 7
+ if (count (rc .ne. vresult) .ne. 0) STOP 8
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90
new file mode 100644
index 0000000..95e56c9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90
@@ -0,0 +1,68 @@
+! { dg-do run }
+
+! record type reductions
+
+program main
+ implicit none
+
+ type t1
+ integer :: i
+ real :: r
+ end type t1
+
+ type t2
+ real :: r
+ integer :: i
+ double precision :: d
+ end type t2
+
+ double precision, parameter :: e = 0.001
+ integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
+ integer :: i
+ type(t1) :: v1, a1
+ type (t2) :: v2, a2
+
+ v1%i = 0
+ v1%r = 0
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v1)
+ !$acc loop reduction (+:v1)
+ do i = 1, n
+ v1%i = v1%i + 1
+ v1%r = v1%r + 2
+ end do
+ !$acc end parallel
+ a1%i = 0
+ a1%r = 0
+ do i = 1, n
+ a1%i = a1%i + 1
+ a1%r = a1%r + 2
+ end do
+ if (v1%i .ne. a1%i) STOP 1
+ if (v1%r .ne. a1%r) STOP 2
+
+ v2%i = 1
+ v2%r = 1
+ v2%d = 1
+ !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v2)
+ !$acc loop reduction (*:v2)
+ do i = 1, n
+ v2%i = v2%i * 2
+ v2%r = v2%r * 1.1
+ v2%d = v2%d * 1.3
+ end do
+ !$acc end parallel
+ a2%i = 1
+ a2%r = 1
+ a2%d = 1
+ do i = 1, n
+ a2%i = a2%i * 2
+ a2%r = a2%r * 1.1
+ a2%d = a2%d * 1.3
+ end do
+
+ if (v2%i .ne. a2%i) STOP 3
+ if (v2%r .ne. a2%r) STOP 4
+ if (abs (v2%d - a2%d) .ge. e) STOP 5
+
+end program main
+
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90
new file mode 100644
index 0000000..7a36fb2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90
@@ -0,0 +1,98 @@
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+
+#define ARRAY_BODY(ARRAY, MIN, MAX) \
+ do i = 1, 10; \
+ do j = MIN, MAX; \
+ ARRAY(j) = ARRAY(j) + 1; \
+ end do; \
+ end do
+
+program main
+ implicit none
+ integer :: i, j, max = 6, two = 2, three = 3, four = 4, five = 5, six = 6
+ integer :: a(6) = (/ 5, 1, 1, 5, 9, 9 /)
+ integer :: o(6)
+ o = a
+
+ !$acc parallel
+ !$acc loop reduction(+:a(2:3))
+ ARRAY_BODY (a, 2, 3)
+ !$acc end parallel
+ ARRAY_BODY (o, 2, 3)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 1
+ end do
+
+ !$acc parallel copy(a(4:6))
+ !$acc loop reduction(+:a(4:6))
+ ARRAY_BODY (a, 4, 6)
+ !$acc end parallel
+ ARRAY_BODY (o, 4, 6)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 2
+ end do
+
+ !$acc parallel copy(a)
+ !$acc loop reduction(+:a(1:6))
+ ARRAY_BODY (a, 1, 6)
+ !$acc end parallel
+ ARRAY_BODY (o, 1, 6)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 3
+ end do
+
+ !$acc parallel
+ !$acc loop reduction(+:a)
+ ARRAY_BODY (a, 4, 4)
+ !$acc end parallel
+ ARRAY_BODY (o, 4, 4)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 4
+ end do
+
+ !$acc parallel copy(a)
+ !$acc loop reduction(+:a)
+ ARRAY_BODY (a, 4, 6)
+ !$acc end parallel
+ ARRAY_BODY (o, 4, 6)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 5
+ end do
+
+#if !defined(ACC_DEVICE_TYPE_host)
+
+ !$acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, 2, 4)
+ !$acc end parallel loop
+ ARRAY_BODY (o, 2, 4)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 6
+ end do
+
+ !$acc parallel loop reduction(+:a(2:4))
+ ARRAY_BODY (a, 2, 4)
+ !$acc end parallel loop
+ ARRAY_BODY (o, 2, 4)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 7
+ end do
+
+ !$acc parallel reduction(+:a)
+ ARRAY_BODY (a, 3, 4)
+ !$acc end parallel
+ ARRAY_BODY (o, 3, 4)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 8
+ end do
+
+ !$acc parallel reduction(+:a(2:3))
+ ARRAY_BODY (a, 2, 3)
+ !$acc end parallel
+ ARRAY_BODY (o, 2, 3)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 9
+ end do
+#endif
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90
new file mode 100644
index 0000000..c524f2a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90
@@ -0,0 +1,99 @@
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+
+#define ARRAY_BODY(ARRAY, MIN, MAX) \
+ do i = 1, 10; \
+ do j = MIN, MAX; \
+ ARRAY(j) = ARRAY(j) + 1; \
+ end do; \
+ end do
+
+program main
+ implicit none
+ integer :: i, j, max = 6, one = 1, two = 2, three = 3, four = 4, five = 5, six = 6
+ integer :: a(6) = (/ 5, 1, 1, 5, 9, 9 /)
+ integer :: o(6)
+ o = a
+
+ !$acc parallel
+ !$acc loop reduction(+:a(two:three))
+ ARRAY_BODY (a, two, three)
+ !$acc end parallel
+
+ ARRAY_BODY (o, two, three)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 1
+ end do
+
+ !$acc parallel copy(a(four:six))
+ !$acc loop reduction(+:a(four:six))
+ ARRAY_BODY (a, four, six)
+ !$acc end parallel
+ ARRAY_BODY (o, four, six)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 2
+ end do
+
+ !$acc parallel copy(a)
+ !$acc loop reduction(+:a(one:six))
+ ARRAY_BODY (a, one, six)
+ !$acc end parallel
+ ARRAY_BODY (o, one, six)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 3
+ end do
+
+ !$acc parallel
+ !$acc loop reduction(+:a)
+ ARRAY_BODY (a, four, four)
+ !$acc end parallel
+ ARRAY_BODY (o, four, four)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 4
+ end do
+
+ !$acc parallel copy(a)
+ !$acc loop reduction(+:a)
+ ARRAY_BODY (a, four, six)
+ !$acc end parallel
+ ARRAY_BODY (o, four, six)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 5
+ end do
+
+#if !defined(ACC_DEVICE_TYPE_host)
+
+ !$acc parallel loop reduction(+:a)
+ ARRAY_BODY (a, two, four)
+ !$acc end parallel loop
+ ARRAY_BODY (o, two, four)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 6
+ end do
+
+ !$acc parallel loop reduction(+:a(two:four))
+ ARRAY_BODY (a, two, four)
+ !$acc end parallel loop
+ ARRAY_BODY (o, two, four)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 7
+ end do
+
+ !$acc parallel reduction(+:a)
+ ARRAY_BODY (a, three, four)
+ !$acc end parallel
+ ARRAY_BODY (o, three, four)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 8
+ end do
+
+ !$acc parallel reduction(+:a(two:three))
+ ARRAY_BODY (a, two, three)
+ !$acc end parallel
+ ARRAY_BODY (o, two, three)
+ do i = 1, max
+ if (a(i) .ne. o(i)) STOP 9
+ end do
+#endif
+
+end program main