diff options
author | Chung-Lin Tang <cltang@baylibre.com> | 2025-05-02 00:27:53 +0000 |
---|---|---|
committer | Sandra Loosemore <sloosemore@baylibre.com> | 2025-05-15 20:25:51 +0000 |
commit | 5fd60a678116773e99d5fd2d64a118f837e5d6f0 (patch) | |
tree | 2821bef94bfb75dceeec8369d9227e3ba0fe1b9a | |
parent | 3dfe9733eabd74f21999dde36166bdedc5d06b1c (diff) | |
download | gcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.zip gcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.tar.gz gcc-5fd60a678116773e99d5fd2d64a118f837e5d6f0.tar.bz2 |
OpenACC 2.7: Implement reductions for arrays and records
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675222.html
This patch implements reductions for arrays, array sections, and
struct/record types as according to the OpenACC 2.7 specification.
2025-02-23 Chung-Lin Tang <cltang@baylibre.com>
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_variable_list): Adjust parsing of opening
square bracket.
(c_parser_omp_clause_reduction): Adjustments for
OpenACC-specific cases.
* c-typeck.cc (c_oacc_reduction_defined_type_p): New function.
(c_oacc_reduction_code_name): Likewise.
(c_finish_omp_clauses): Handle OpenACC cases using new functions.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_var_list_no_open): Adjust parsing of opening
square bracket.
(cp_parser_omp_clause_reduction): Adjustments for
OpenACC-specific cases.
* semantics.cc (cp_oacc_reduction_defined_type_p): New function.
(cp_oacc_reduction_code_name): Likewise.
(finish_omp_reduction_clause): Handle OpenACC cases using new
functions.
gcc/fortran/ChangeLog:
* openmp.cc (oacc_reduction_defined_type_p): New function.
(resolve_omp_clauses): Adjust OpenACC array reduction error case.
Adjust OMP_LIST_REDUCTION case. Use oacc_reduction_defined_type_p for
OpenACC.
* trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr):
Add 'stmtblock_t *block', and 'bool openacc' parameters. Add array and
array section handling for openacc case. Adjust part of function to be
!openacc only.
(gfc_trans_omp_reduction_list):
Add 'stmtblock_t *block', and 'bool openacc' parameters, pass to calls
to gfc_trans_omp_array_reduction_or_udr.
(gfc_trans_omp_array_section): Adjust setting of OMP_CLAUSE_SIZE to only
OMP_CLAUSE_MAP clauses. Adjust calculations of decls and bias to use
temporary variables instead of tree expression inside clauses.
(gfc_trans_omp_clauses): Add 'block' and 'openacc' arguments to calls to
gfc_trans_omp_reduction_list.
(gfc_trans_omp_do): Pass 'op == EXEC_OACC_LOOP' as 'bool openacc'
parameter in call to gfc_trans_omp_clauses.
gcc/ChangeLog:
* config/gcn/gcn-tree.cc (#include "omp-offload.h"): Add include.
(#include "memmodel.h"): Add include.
(gcn_array_reduction_buffers): New vec<tree>
for holding DECLs for reduction buffer pointer variables.
(gcn_lockfull_update): Add pointer type fold_converts.
(gcn_reduction_update): Additions for handling ARRAY_TYPE, pointer to
ARRAY_TYPE, and RECORD_TYPE reductions.
(gcn_goacc_get_worker_red_decl): Adjust parameters to handle
non-constant offset case.
(gcn_goacc_get_worker_array_reduction_buffer): New function.
(gcn_create_if_else_seq): New function.
(gcn_create_do_while_loop_seq): New function.
(gcn_goacc_reduction_setup): Adjustments to handle arrays and records.
(gcn_goacc_reduction_init): Likewise.
(gcn_goacc_reduction_fini): Likewise.
(gcn_goacc_reduction_teardown): Likewise.
* config/nvptx/nvptx.cc (nvptx_gen_shuffle): Properly generate
V2SI shuffle using vec_extract op.
(nvptx_expand_shared_addr): Adjustments to handle non-constant size.
(nvptx_get_shared_red_addr): Adjust type/alignment calculations to
use TYPE_SIZE/ALIGN_UNIT instead of machine mode based.
(nvptx_get_shared_red_addr): New function with array_max_idx parameter.
(nvptx_reduction_update): Additions for handling ARRAY_TYPE, pointer to
ARRA_TYPE, and RECORD_TYPE reductions.
(nvptx_goacc_reduction_setup): Likewise.
(nvptx_goacc_reduction_init): Likewise.
(nvptx_goacc_reduction_fini): Likewise.
(nvptx_goacc_reduction_teardown): Likewise.
* gimplify.cc (gimplify_scan_omp_clauses): Gimplify inside COMPONENT_REF
and convert codes for OMP_CLAUSE_REDUCTION cases. Add DECL_P check for
do_add/do_add_decl goto case.
(gimplify_adjust_omp_clauses): Avoid GOMP_MAP_POINTER OMP_CLAUSE_SIZE
handling for OpenACC kernels. Call omp_add_variable for ARRAY_REF index.
Peel away array MEM_REF for decl lookup.
* omp-low.cc (struct omp_context):
Add 'hash_map<tree, tree> *block_vars_map' field.
(omp_copy_decl_2): Create/lookup using ctx->block_vars_map first. Add
new copy into ctx->block_vars_map.
(install_var_field): Add 'bool field_may_exist = false' parameter.
Adjust lookup assertions.
(delete_omp_context): Add delete of ctx->block_vars_map.
(scan_sharing_clauses): Adjust calls to install_var_field. Adjust
ARRAY_REF pointer type building to use decl type, rather than generic
ptr_type_node. For ARRAY_REFs on offloaded constructs, also add base
expression as field lookup key.
(omp_reduction_init_op): Add ARRAY_TYPE and RECORD_TYPE init op
construction.
(oacc_array_reduction_bias): New function.
(lower_oacc_reductions): Add array reduction handling code. Arrays use
a different mode of IFN parameters, using additional 'array_addr' and
'array_max_idx' arguments. The LHS var is a simple integer for
dependency ordering.
(lower_omp_target): Adjust 'offload' condition for GOMP_MAP_POINTER
case. Generate BUILT_IN_ALLOCA_WITH_ALIGN to create private copy
for reductions of non-constant size types.
* omp-oacc-neuter-broadcast.cc (worker_single_copy):
Add 'hash_set<tree> *array_reduction_base_vars' parameter. Avoid
propagation for SSA_NAMEs used for array reduction accesses.
(neuter_worker_single): Add 'hash_set<tree> *array_reduction_base_vars'
parameter. Adjust recursive calls to self and worker_single_copy.
(oacc_do_neutering): Add 'hash_set<tree> *array_reduction_base_vars'
parameter. Adjust call to neuter_worker_single.
(execute_omp_oacc_neuter_broadcast): Add local
'hash_set<tree> array_reduction_base_vars' declaration. Collect MEM_REF
base-pointer SSA_NAMEs of arrays into array_reduction_base_vars. Add
'&array_reduction_base_vars' argument to call of oacc_do_neutering.
* omp-offload.cc (#include "cfghooks.h"): Add include.
(oacc_build_array_copy): New function.
(oacc_build_array_copy_loop): New function.
(oacc_build_indexed_ssa_loop): New function.
(default_goacc_reduction): Adjustments to handle arrays.
* omp-offload.h (oacc_build_array_copy): New declaration.
(oacc_build_array_copy_loop): New declaration.
(oacc_build_indexed_ssa_loop): New declaration.
* tree-loop-distribution.cc (generate_memset_builtin): Under OpenACC,
when last stmt of pre-header block is a UNIQUE(OACC_FORK) internal-fn,
split a new basic block to serve as place of insertion, otherwise
may fail later checking because UNIQUE(OACC_FORK) counts as control
flow stmt.
(generate_memcpy_builtin): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-2.c: Adjust test.
* c-c++-common/goacc/reduction-9.c: Adjust test.
* c-c++-common/goacc/reduction-11.c: New test.
* c-c++-common/goacc/reduction-12.c: New test.
* c-c++-common/goacc/reduction-13.c: New test.
* c-c++-common/goacc/reduction-14.c: New test.
* c-c++-common/goacc/reduction-15.c: New test.
* c-c++-common/goacc/reduction-16.c: New test.
* g++.dg/goacc/reductions-1.C: Adjust test.
* gfortran.dg/goacc/array-reduction.f90: Adjust test.
* gfortran.dg/goacc/enter-exit-data-2.f90: Adjust test.
* gfortran.dg/goacc/finalize-1.f: Adjust test.
* gfortran.dg/goacc/kernels-decompose-1.f95: Adjust test.
* gfortran.dg/goacc/pr70828.f90: Adjust test.
* gfortran.dg/goacc/reduction.f95: Adjust test.
* gfortran.dg/gomp/target-enter-exit-data.f90: Adjust test.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/reduction.h
(check_reduction_array_xx): New macro.
(operator_apply): Likewise.
(check_reduction_array_op): Likewise.
(check_reduction_arraysec_op): Likewise.
(function_apply): Likewise.
(check_reduction_array_macro): Likewise.
(check_reduction_arraysec_macro): Likewise.
(check_reduction_xxx_xx_all): Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c: New test.
* testsuite/libgomp.oacc-fortran/reduction-10.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-11.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-12.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-13.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-14.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-15.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-16.f90: New test.
44 files changed, 5130 insertions, 442 deletions
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 6d607e9..fbaa917 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -16945,8 +16945,7 @@ c_parser_omp_variable_list (c_parser *parser, case OMP_CLAUSE_HAS_DEVICE_ADDR: array_section_p = false; dims.truncate (0); - while ((ort != C_ORT_ACC || kind != OMP_CLAUSE_REDUCTION) - && c_parser_next_token_is (parser, CPP_OPEN_SQUARE)) + while (c_parser_next_token_is (parser, CPP_OPEN_SQUARE)) { location_t loc = UNKNOWN_LOCATION; tree low_bound = NULL_TREE, length = NULL_TREE; @@ -18867,13 +18866,21 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind, code = MAX_EXPR; break; } + if (ort == C_ORT_ACC) + goto name_error; reduc_id = c_parser_peek_token (parser)->value; break; } default: - c_parser_error (parser, - "expected %<+%>, %<*%>, %<-%>, %<&%>, " - "%<^%>, %<|%>, %<&&%>, %<||%> or identifier"); + name_error: + if (ort == C_ORT_OMP) + c_parser_error (parser, + "expected %<+%>, %<*%>, %<-%>, %<&%>, " + "%<^%>, %<|%>, %<&&%>, %<||%> or identifier"); + else + c_parser_error (parser, + "expected %<+%>, %<*%>, %<-%>, %<&%>, " + "%<^%>, %<|%>, %<&&%>, %<||%>, %<min%> or %<max%>"); c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, 0); return list; } @@ -18887,6 +18894,11 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind, for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c)) { + OMP_CLAUSE_REDUCTION_CODE (c) = code; + /* OpenACC does not require anything below. */ + if (ort == C_ORT_ACC) + continue; + tree d = OMP_CLAUSE_DECL (c), type; if (TREE_CODE (d) != OMP_ARRAY_SECTION) type = TREE_TYPE (d); @@ -18910,7 +18922,6 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind, } while (TREE_CODE (type) == ARRAY_TYPE) type = TREE_TYPE (type); - OMP_CLAUSE_REDUCTION_CODE (c) = code; if (task) OMP_CLAUSE_REDUCTION_TASK (c) = 1; else if (inscan) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index a1731ff..68640ff 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -16248,6 +16248,68 @@ c_oacc_check_attachments (tree c) return false; } +static bool +c_oacc_reduction_defined_type_p (enum tree_code reduction_code, tree t) +{ + if (TREE_CODE (t) == INTEGER_TYPE) + return true; + + if (FLOAT_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE) + switch (reduction_code) + { + case PLUS_EXPR: + case MULT_EXPR: + case MINUS_EXPR: + case TRUTH_ANDIF_EXPR: + case TRUTH_ORIF_EXPR: + return true; + case MIN_EXPR: + case MAX_EXPR: + return TREE_CODE (t) != COMPLEX_TYPE; + case BIT_AND_EXPR: + case BIT_XOR_EXPR: + case BIT_IOR_EXPR: + return false; + default: + gcc_unreachable (); + } + + if (TREE_CODE (t) == ARRAY_TYPE) + return c_oacc_reduction_defined_type_p (reduction_code, TREE_TYPE (t)); + + if (TREE_CODE (t) == RECORD_TYPE) + { + for (tree fld = TYPE_FIELDS (t); fld; fld = TREE_CHAIN (fld)) + if (TREE_CODE (fld) == FIELD_DECL + && !c_oacc_reduction_defined_type_p (reduction_code, + TREE_TYPE (fld))) + return false; + return true; + } + + return false; +} + +static const char * +c_oacc_reduction_code_name (enum tree_code reduction_code) +{ + switch (reduction_code) + { + case PLUS_EXPR: return "+"; + case MULT_EXPR: return "*"; + case MINUS_EXPR: return "-"; + case TRUTH_ANDIF_EXPR: return "&&"; + case TRUTH_ORIF_EXPR: return "||"; + case MIN_EXPR: return "min"; + case MAX_EXPR: return "max"; + case BIT_AND_EXPR: return "&"; + case BIT_XOR_EXPR: return "^"; + case BIT_IOR_EXPR: return "|"; + default: + gcc_unreachable (); + } +} + /* For all elements of CLAUSES, validate them against their constraints. Remove any elements from the list that are invalid. */ @@ -16447,9 +16509,22 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) break; } } - if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) == NULL_TREE - && (FLOAT_TYPE_P (type) - || TREE_CODE (type) == COMPLEX_TYPE)) + if (ort == C_ORT_ACC) + { + enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c); + if (!c_oacc_reduction_defined_type_p (r_code, TREE_TYPE (t))) + { + const char *r_name = c_oacc_reduction_code_name (r_code); + error_at (OMP_CLAUSE_LOCATION (c), + "%qE has invalid type for %<reduction(%s)%>", + t, r_name); + remove = true; + break; + } + } + else if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) == NULL_TREE + && (FLOAT_TYPE_P (type) + || TREE_CODE (type) == COMPLEX_TYPE)) { enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c); const char *r_name = NULL; diff --git a/gcc/config/gcn/gcn-tree.cc b/gcc/config/gcn/gcn-tree.cc index 4ae29bd..87c4267 100644 --- a/gcc/config/gcn/gcn-tree.cc +++ b/gcc/config/gcn/gcn-tree.cc @@ -35,6 +35,7 @@ #include "varasm.h" #include "omp-low.h" #include "omp-general.h" +#include "omp-offload.h" #include "internal-fn.h" #include "tree-vrp.h" #include "tree-ssanames.h" @@ -44,6 +45,7 @@ #include "cgraph.h" #include "targhooks.h" #include "langhooks-def.h" +#include "memmodel.h" /* }}} */ /* {{{ OpenACC reductions. */ @@ -78,6 +80,9 @@ gcn_global_lock_addr () return build_fold_addr_expr (v); } +/* Pointer variables for array reduction buffers used. */ +static vec<tree> gcn_array_reduction_buffers; + /* Helper function for gcn_reduction_update. Insert code to locklessly update *PTR with *PTR OP VAR just before @@ -259,7 +264,8 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi, /* Build and insert the reduction calculation. */ gimple_seq red_seq = NULL; tree acc_in = make_ssa_name (var_type); - tree ref_in = build_simple_mem_ref (ptr); + tree ref_in + = build_simple_mem_ref (fold_convert (build_pointer_type (var_type), ptr)); TREE_THIS_VOLATILE (ref_in) = 1; gimplify_assign (acc_in, ref_in, &red_seq); @@ -267,7 +273,8 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi, tree update_expr = fold_build2 (op, var_type, ref_in, var); gimplify_assign (acc_out, update_expr, &red_seq); - tree ref_out = build_simple_mem_ref (ptr); + tree ref_out + = build_simple_mem_ref (fold_convert (build_pointer_type (var_type), ptr)); TREE_THIS_VOLATILE (ref_out) = 1; gimplify_assign (ref_out, acc_out, &red_seq); @@ -291,11 +298,144 @@ gcn_lockfull_update (location_t loc, gimple_stmt_iterator *gsi, static tree gcn_reduction_update (location_t loc, gimple_stmt_iterator *gsi, - tree ptr, tree var, tree_code op) + tree ptr, tree var, tree_code op, + tree array_max_idx = NULL_TREE) { tree type = TREE_TYPE (var); tree size = TYPE_SIZE (type); + if (!VAR_P (ptr)) + { + tree t = make_ssa_name (TREE_TYPE (ptr)); + gimple_seq seq = NULL; + gimplify_assign (t, ptr, &seq); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + ptr = t; + } + + if (TREE_CODE (type) == ARRAY_TYPE + || TREE_CODE (type) == POINTER_TYPE) + { + tree array_type; + if (TREE_CODE (type) == ARRAY_TYPE) + { + array_type = TREE_TYPE (var); + } + else if (TREE_CODE (type) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE) + { + array_type = TREE_TYPE (TREE_TYPE (var)); + } + else if (TREE_CODE (type) == POINTER_TYPE) + { + array_type = TREE_TYPE (var); + } + else + gcc_unreachable (); + + tree array_elem_type = TREE_TYPE (array_type); + + gimple *g; + gimple_seq seq = NULL; + tree max_index = array_max_idx; + gcc_assert (array_max_idx); + + tree init_index = make_ssa_name (TREE_TYPE (max_index)); + tree loop_index = make_ssa_name (TREE_TYPE (max_index)); + tree update_index = make_ssa_name (TREE_TYPE (max_index)); + + g = gimple_build_assign (init_index, + build_int_cst (TREE_TYPE (init_index), 0)); + gimple_seq_add_stmt (&seq, g); + gimple *init_end = gimple_seq_last (seq); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + basic_block init_bb = gsi_bb (*gsi); + edge init_edge = split_block (init_bb, init_end); + basic_block loop_bb = init_edge->dest; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + seq = NULL; + g = gimple_build_assign (update_index, PLUS_EXPR, loop_index, + build_int_cst (TREE_TYPE (loop_index), 1)); + gimple_seq_add_stmt (&seq, g); + + g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL); + gimple_seq_add_stmt (&seq, g); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + edge post_edge = split_block (loop_bb, g); + basic_block post_bb = post_edge->dest; + loop_bb = post_edge->src; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + /* Place where we insert reduction code below. */ + gimple_stmt_iterator reduction_code_gsi = gsi_start_bb (loop_bb); + + post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU; + post_edge->probability = profile_probability::even (); + edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE); + loop_edge->probability = profile_probability::even (); + set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb); + set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb); + class loop *new_loop = alloc_loop (); + new_loop->header = loop_bb; + new_loop->latch = loop_bb; + add_loop (new_loop, loop_bb->loop_father); + + gphi *phi = create_phi_node (loop_index, loop_bb); + add_phi_arg (phi, init_index, init_edge, loc); + add_phi_arg (phi, update_index, loop_edge, loc); + + tree var_ptr = fold_convert (build_pointer_type (array_elem_type), + var); + tree idx = fold_build2 (MULT_EXPR, sizetype, + fold_convert (sizetype, loop_index), + TYPE_SIZE_UNIT (array_elem_type)); + var_ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (var_ptr), var_ptr, idx); + tree var_aref = build_simple_mem_ref (var_ptr); + ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, idx); + + gcn_reduction_update (loc, &reduction_code_gsi, + ptr, var_aref, op); + + return build_simple_mem_ref (ptr); + } + else if (TREE_CODE (type) == RECORD_TYPE) + { + for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld)) + if (TREE_CODE (fld) == FIELD_DECL) + { + tree var_fld_ref = build3 (COMPONENT_REF, TREE_TYPE (fld), + var, fld, NULL); + tree ptr_ref = build_simple_mem_ref (ptr); + tree ptr_fld_type + = build_qualified_type (TREE_TYPE (fld), + TYPE_QUALS (TREE_TYPE (ptr_ref))); + tree ptr_fld_ref = build3 (COMPONENT_REF, ptr_fld_type, + ptr_ref, fld, NULL); + + if (TREE_CODE (TREE_TYPE (fld)) == ARRAY_TYPE) + { + tree array_elem_ptr_type + = build_pointer_type (TREE_TYPE (TREE_TYPE (fld))); + gcn_reduction_update + (loc, gsi, + fold_convert (array_elem_ptr_type, + build_fold_addr_expr (ptr_fld_ref)), + build_fold_addr_expr (var_fld_ref), op, + TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (fld)))); + } + else + gcn_reduction_update (loc, gsi, + build_fold_addr_expr (ptr_fld_ref), + var_fld_ref, op); + } + return build_simple_mem_ref (ptr); + } + if (size == TYPE_SIZE (unsigned_type_node) || size == TYPE_SIZE (long_long_unsigned_type_node)) return gcn_lockless_update (loc, gsi, ptr, var, op); @@ -306,7 +446,7 @@ gcn_reduction_update (location_t loc, gimple_stmt_iterator *gsi, /* Return a temporary variable decl to use for an OpenACC worker reduction. */ static tree -gcn_goacc_get_worker_red_decl (tree type, unsigned offset) +gcn_goacc_get_worker_red_decl (tree type, tree offset_expr) { machine_function *machfun = cfun->machine; @@ -317,15 +457,165 @@ gcn_goacc_get_worker_red_decl (tree type, unsigned offset) = build_qualified_type (type, (TYPE_QUALS (type) | ENCODE_QUAL_ADDR_SPACE (ADDR_SPACE_LDS))); - - gcc_assert (offset - < (machfun->reduction_limit - machfun->reduction_base)); - tree ptr_type = build_pointer_type (var_type); - tree addr = build_int_cst (ptr_type, machfun->reduction_base + offset); + tree addr; + if (TREE_CONSTANT (offset_expr)) + { + unsigned offset = TREE_INT_CST_LOW (offset_expr); + gcc_assert (offset + < (machfun->reduction_limit - machfun->reduction_base)); + tree ptr_type = build_pointer_type (var_type); + addr = build_int_cst (ptr_type, machfun->reduction_base + offset); + } + else + { + tree ptr_type = build_pointer_type (var_type); + tree red_base = build_int_cst (ptr_type, machfun->reduction_base); + addr = build2 (POINTER_PLUS_EXPR, ptr_type, + red_base, fold_convert (size_type_node, offset_expr)); + } return build_simple_mem_ref (addr); } +static tree +gcn_goacc_get_worker_array_reduction_buffer (tree array_type, + tree array_max_idx, + gimple_seq *seq) +{ + gcc_assert (!gcn_array_reduction_buffers.is_empty ()); + tree red_buf_ptr = gcn_array_reduction_buffers.last (); + + tree ptr = make_ssa_name (ptr_type_node); + gimplify_assign (ptr, red_buf_ptr, seq); + + tree whole_block_ptr; + if (TREE_CODE (array_type) == ARRAY_TYPE) + whole_block_ptr = fold_convert (build_pointer_type (array_type), ptr); + else + whole_block_ptr = array_type; + + tree arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG); + tree gang_id = make_ssa_name (integer_type_node); + gimple *gang_id_call = gimple_build_call_internal (IFN_GOACC_DIM_POS, 1, arg); + gimple_call_set_lhs (gang_id_call, gang_id); + gimple_seq_add_stmt (seq, gang_id_call); + + tree len = fold_build2 (PLUS_EXPR, size_type_node, array_max_idx, + size_int (1)); + tree elem_size = TYPE_SIZE_UNIT (TREE_TYPE (array_type)); + tree array_size_expr = build2 (MULT_EXPR, size_type_node, len, elem_size); + tree type_size = make_ssa_name (size_type_node); + gimplify_assign (type_size, array_size_expr, seq); + + tree idx = make_ssa_name (size_type_node); + gimplify_assign (idx, build2 (MULT_EXPR, size_type_node, type_size, + fold_convert (size_type_node, gang_id)), seq); + + tree addr = fold_convert (ptr_type_node, whole_block_ptr);; + addr = build2 (POINTER_PLUS_EXPR, ptr_type_node, addr, idx); + addr = fold_convert (build_pointer_type (array_type), addr); + + tree addr_reg = make_ssa_name (build_pointer_type (array_type)); + gimplify_assign (addr_reg, addr, seq); + + return build_simple_mem_ref (addr_reg); +} + +static void +gcn_create_if_else_seq (gimple_stmt_iterator *gsi_p, gimple *split_stmt, + gimple_seq *then_seq, gimple_seq *else_seq) +{ + basic_block init_bb = gsi_bb (*gsi_p); + + edge fallthru_edge = split_block (init_bb, split_stmt); + basic_block then_bb = fallthru_edge->dest; + + /* Reset the iterator. */ + *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p)); + + gimple *then_seq_end = gimple_seq_last (*then_seq); + gsi_insert_seq_before (gsi_p, *then_seq, GSI_SAME_STMT); + + basic_block last_bb = then_bb; + gimple *last_seq_end = then_seq_end; + + basic_block else_bb = NULL; + edge then_else_fallthru_edge = NULL; + if (else_seq) + { + then_else_fallthru_edge = split_block (then_bb, then_seq_end); + else_bb = then_else_fallthru_edge->dest; + + /* Reset the iterator. */ + *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p)); + + gimple *else_seq_end = gimple_seq_last (*else_seq); + gsi_insert_seq_before (gsi_p, *else_seq, GSI_SAME_STMT); + + last_bb = else_bb; + last_seq_end = else_seq_end; + } + + edge post_edge = split_block (last_bb, last_seq_end); + basic_block post_bb = post_edge->dest; + + /* Reset the iterator. */ + *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p)); + + edge if_true_edge = make_edge (init_bb, (else_seq ? else_bb : post_bb), + EDGE_TRUE_VALUE); + if_true_edge->probability = profile_probability::even (); + fallthru_edge->flags = EDGE_FALSE_VALUE; + fallthru_edge->probability = profile_probability::even (); + + post_edge->flags = EDGE_FALLTHRU; + post_edge->probability = profile_probability::always (); + + set_immediate_dominator (CDI_DOMINATORS, then_bb, init_bb); + set_immediate_dominator (CDI_DOMINATORS, post_bb, init_bb); + + if (else_seq) + { + redirect_edge_and_branch (then_else_fallthru_edge, post_bb); + set_immediate_dominator (CDI_DOMINATORS, else_bb, init_bb); + } +} + +static void +gcn_create_do_while_loop_seq (gimple_stmt_iterator *gsi_p, + gimple_seq *body_seq, int edge_flags) +{ + gimple *g = NULL; + basic_block init_bb = gsi_bb (*gsi_p); + edge init_edge = split_block (init_bb, g); + basic_block loop_bb = init_edge->dest; + init_bb = init_edge->src; + + /* Reset the iterator. */ + *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p)); + + gimple_stmt_iterator loop_gsi = gsi_start_bb (loop_bb); + + gimple *body_seq_end = gimple_seq_last (*body_seq); + gsi_insert_seq_before (&loop_gsi, *body_seq, GSI_SAME_STMT); + + edge post_edge = split_block (loop_bb, body_seq_end); + basic_block post_bb = post_edge->dest; + + /* Reset the iterator. */ + *gsi_p = gsi_for_stmt (gsi_stmt (*gsi_p)); + + make_edge (loop_bb, loop_bb, edge_flags); + post_edge->flags = EDGE_FALSE_VALUE; + set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb); + set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb); + + loop *loop = alloc_loop (); + loop->header = loop_bb; + loop->latch = loop_bb; + add_loop (loop, loop_bb->loop_father); +} + /* Expand IFN_GOACC_REDUCTION_SETUP. */ static void @@ -335,35 +625,155 @@ gcn_goacc_reduction_setup (gcall *call) tree lhs = gimple_call_lhs (call); tree var = gimple_call_arg (call, 2); int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + gimple_seq seq = NULL; push_gimplify_context (true); + /* Copy the receiver object. */ + tree ref_to_res = gimple_call_arg (call, 1); + if (level != GOMP_DIM_GANG) { - /* Copy the receiver object. */ - tree ref_to_res = gimple_call_arg (call, 1); - if (!integer_zerop (ref_to_res)) - var = build_simple_mem_ref (ref_to_res); + { + if (!array_p) + var = build_simple_mem_ref (ref_to_res); + } + } + + if (array_p && !integer_zerop (ref_to_res)) + { + gimple_seq condseq = NULL; + + /* Create global variable to store pointer to array reduction buffer. */ + tree reduction_buffer_ptr_type + = build_qualified_type (ptr_type_node, TYPE_QUAL_VOLATILE); + tree reduction_buffer_ptr + = build_decl (UNKNOWN_LOCATION, VAR_DECL, + create_tmp_var_name ("gcn_array_reduction_buf"), + reduction_buffer_ptr_type); + TREE_STATIC (reduction_buffer_ptr) = 1; + TREE_PUBLIC (reduction_buffer_ptr) = 0; + DECL_INITIAL (reduction_buffer_ptr) = 0; + DECL_EXTERNAL (reduction_buffer_ptr) = 0; + + varpool_node::add (reduction_buffer_ptr); + + tree reduction_buffer_ptr_addr = make_ssa_name (ptr_type_node); + gimplify_assign (reduction_buffer_ptr_addr, + build_fold_addr_expr (reduction_buffer_ptr), &condseq); + + tree gang_dim_arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG); + tree gang_pos = make_ssa_name (integer_type_node); + gimple *gang_pos_call = gimple_build_call_internal (IFN_GOACC_DIM_POS, + 1, gang_dim_arg); + gimple_call_set_lhs (gang_pos_call, gang_pos); + gimple_seq_add_stmt (&condseq, gang_pos_call); + gimple *cond = gimple_build_cond (NE_EXPR, gang_pos, integer_zero_node, + NULL, NULL); + gimple_seq_add_stmt (&condseq, cond); + gimple *cond_end = gimple_seq_last (condseq); + gsi_insert_seq_before (&gsi, condseq, GSI_SAME_STMT); + + gimple_seq malloc_seq = NULL; + tree gang_num = make_ssa_name (integer_type_node); + gimple *gang_num_call = gimple_build_call_internal (IFN_GOACC_DIM_SIZE, + 1, gang_dim_arg); + gimple_call_set_lhs (gang_num_call, gang_num); + gimple_seq_add_stmt (&malloc_seq, gang_num_call); + + tree len = fold_build2 (PLUS_EXPR, size_type_node, array_max_idx, + size_int (1)); + tree elem_size = TYPE_SIZE_UNIT (TREE_TYPE (array_type)); + tree malloc_size_expr = build2 (MULT_EXPR, size_type_node, len, + elem_size); + malloc_size_expr = build2 (MULT_EXPR, size_type_node, malloc_size_expr, + fold_convert (size_type_node, gang_num)); + tree malloc_size = make_ssa_name (size_type_node); + gimplify_assign (malloc_size, malloc_size_expr, &malloc_seq); + + tree ptr = make_ssa_name (ptr_type_node); + tree malloc_decl = builtin_decl_explicit (BUILT_IN_MALLOC); + gcall *stmt = gimple_build_call (malloc_decl, 1, malloc_size); + gimple_call_set_lhs (stmt, ptr); + gimple_seq_add_stmt (&malloc_seq, stmt); + + tree atomic_store_decl = builtin_decl_explicit (BUILT_IN_ATOMIC_STORE_8); + gcall *atomic_store + = gimple_build_call (atomic_store_decl, 3, reduction_buffer_ptr_addr, + ptr, build_int_cst (integer_type_node, + MEMMODEL_RELEASE)); + gimple_seq_add_stmt (&malloc_seq, atomic_store); + + gimple_seq wait_seq = NULL; + gimple *nop = gimple_build_nop (); + gimple_seq_add_stmt (&wait_seq, nop); + + gcn_create_if_else_seq (&gsi, cond_end, &malloc_seq, &wait_seq); + + /* Create cmp-swap loop for other gangs to wait for + gcn_array_reduction_buf.* to be properly set by gang zero. */ + gimple_stmt_iterator ngsi = gsi_for_stmt (nop); + + gimple_seq atomic_load_seq = NULL; + tree loadval = make_ssa_name (size_type_node); + tree atomic_load_decl = builtin_decl_explicit (BUILT_IN_ATOMIC_LOAD_8); + gcall *atomic_load + = gimple_build_call (atomic_load_decl, 2, reduction_buffer_ptr_addr, + build_int_cst (integer_type_node, + MEMMODEL_ACQUIRE)); + gimple_call_set_lhs (atomic_load, loadval); + gimple_seq_add_stmt (&atomic_load_seq, atomic_load); + cond = gimple_build_cond (EQ_EXPR, loadval, size_zero_node, + NULL_TREE, NULL_TREE); + gimple_seq_add_stmt (&atomic_load_seq, cond); + + gcn_create_do_while_loop_seq (&ngsi, &atomic_load_seq, EDGE_TRUE_VALUE); + gcn_array_reduction_buffers.safe_push (reduction_buffer_ptr); } if (level == GOMP_DIM_WORKER) { - tree var_type = TREE_TYPE (var); - /* Store incoming value to worker reduction buffer. */ tree offset = gimple_call_arg (call, 5); - tree decl - = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset)); - - gimplify_assign (decl, var, &seq); + if (array_p) + { + tree decl = gcn_goacc_get_worker_array_reduction_buffer + (array_type, array_max_idx, &seq); + tree ptr = make_ssa_name (TREE_TYPE (array_addr)); + gimplify_assign (ptr, build_fold_addr_expr (decl), &seq); + + /* Store incoming value to worker reduction buffer. */ + oacc_build_array_copy (ptr, array_addr, array_max_idx, &seq); + } + else + { + tree var_type = TREE_TYPE (var); + /* Store incoming value to worker reduction buffer. */ + tree decl = gcn_goacc_get_worker_red_decl (var_type, offset); + gimplify_assign (decl, var, &seq); + } } if (lhs) - gimplify_assign (lhs, var, &seq); + gimplify_assign (lhs, unshare_expr (var), &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* Expand IFN_GOACC_REDUCTION_INIT. */ @@ -377,12 +787,55 @@ gcn_goacc_reduction_init (gcall *call) int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); enum tree_code rcode = (enum tree_code) TREE_INT_CST_LOW (gimple_call_arg (call, 4)); - tree init = omp_reduction_init_op (gimple_location (call), rcode, - TREE_TYPE (var)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + + tree init = NULL_TREE; gimple_seq seq = NULL; push_gimplify_context (true); + if (array_p) + { + tree loop_index; + gimple_stmt_iterator loop_body_gsi; + oacc_build_indexed_ssa_loop (gimple_location (call), array_max_idx, &gsi, + &loop_index, &loop_body_gsi); + + tree init_type = TREE_TYPE (array_type); + init = omp_reduction_init_op (gimple_location (call), rcode, + init_type); + gimple_seq seq = NULL; + + tree ptr = fold_convert (ptr_type_node, array_addr); + tree offset = build2 (MULT_EXPR, sizetype, + fold_convert (sizetype, loop_index), + TYPE_SIZE_UNIT (init_type)); + + tree addr = build2 (POINTER_PLUS_EXPR, build_pointer_type (init_type), + ptr, offset); + tree ref = build_simple_mem_ref (addr); + + push_gimplify_context (true); + gimplify_assign (ref, init, &seq); + pop_gimplify_context (NULL); + gsi_insert_seq_before (&loop_body_gsi, seq, GSI_SAME_STMT); + init = var; + } + else + init = omp_reduction_init_op (gimple_location (call), rcode, + TREE_TYPE (var)); + if (level == GOMP_DIM_GANG) { /* If there's no receiver object, propagate the incoming VAR. */ @@ -395,7 +848,9 @@ gcn_goacc_reduction_init (gcall *call) gimplify_assign (lhs, init, &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* Expand IFN_GOACC_REDUCTION_FINI. */ @@ -410,8 +865,13 @@ gcn_goacc_reduction_fini (gcall *call) int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); enum tree_code op = (enum tree_code) TREE_INT_CST_LOW (gimple_call_arg (call, 4)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + gimple_seq seq = NULL; - tree r = NULL_TREE;; + tree r = NULL_TREE; push_gimplify_context (true); @@ -419,11 +879,19 @@ gcn_goacc_reduction_fini (gcall *call) if (level == GOMP_DIM_WORKER) { - tree var_type = TREE_TYPE (var); tree offset = gimple_call_arg (call, 5); - tree decl - = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset)); - + tree decl; + if (array_p) + { + tree array_type = TREE_TYPE (TREE_TYPE (array_addr)); + decl = gcn_goacc_get_worker_array_reduction_buffer + (array_type, array_max_idx, &seq); + } + else + { + tree var_type = TREE_TYPE (var); + decl = gcn_goacc_get_worker_red_decl (var_type, offset); + } accum = build_fold_addr_expr (decl); } else if (integer_zerop (ref_to_res)) @@ -436,14 +904,22 @@ gcn_goacc_reduction_fini (gcall *call) /* UPDATE the accumulator. */ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); seq = NULL; - r = gcn_reduction_update (gimple_location (call), &gsi, accum, var, op); + if (array_p) + { + gcn_reduction_update (gimple_location (call), &gsi, accum, + array_addr, op, array_max_idx); + r = var; + } + else + r = gcn_reduction_update (gimple_location (call), &gsi, accum, var, op); } if (lhs) gimplify_assign (lhs, r, &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* Expand IFN_GOACC_REDUCTION_TEARDOWN. */ @@ -455,28 +931,90 @@ gcn_goacc_reduction_teardown (gcall *call) tree lhs = gimple_call_lhs (call); tree var = gimple_call_arg (call, 2); int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_accum = NULL_TREE; + gimple_seq seq = NULL; push_gimplify_context (true); if (level == GOMP_DIM_WORKER) { - tree var_type = TREE_TYPE (var); - - /* Read the worker reduction buffer. */ tree offset = gimple_call_arg (call, 5); - tree decl - = gcn_goacc_get_worker_red_decl (var_type, TREE_INT_CST_LOW (offset)); - var = decl; + if (array_p) + { + tree array_type = TREE_TYPE (TREE_TYPE (array_addr)); + array_accum = gcn_goacc_get_worker_array_reduction_buffer + (array_type, array_max_idx, &seq); + } + else + { + tree var_type = TREE_TYPE (var); + + /* Read the worker reduction buffer. */ + tree decl = gcn_goacc_get_worker_red_decl (var_type, offset); + var = decl; + } } + /* Write to the receiver object. */ + tree ref_to_res = gimple_call_arg (call, 1); if (level != GOMP_DIM_GANG) { - /* Write to the receiver object. */ - tree ref_to_res = gimple_call_arg (call, 1); - if (!integer_zerop (ref_to_res)) - gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq); + { + if (array_p) + { + tree ptr + = make_ssa_name (build_pointer_type (TREE_TYPE (array_addr))); + gimplify_assign (ptr, build_fold_addr_expr (array_accum), &seq); + oacc_build_array_copy (ref_to_res, ptr, array_max_idx, &seq); + } + else + gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq); + } + else if (array_p) + { + tree ptr + = make_ssa_name (build_pointer_type (TREE_TYPE (array_accum))); + gimplify_assign (ptr, build_fold_addr_expr (array_accum), &seq); + oacc_build_array_copy (array_addr, ptr, array_max_idx, &seq); + } + } + + if (array_p && !integer_zerop (ref_to_res)) + { + gimple_seq condseq = NULL; + tree gang_dim_arg = build_int_cst (unsigned_type_node, GOMP_DIM_GANG); + tree gang_pos = make_ssa_name (integer_type_node); + gimple *gang_pos_call = gimple_build_call_internal (IFN_GOACC_DIM_POS, + 1, gang_dim_arg); + gimple_call_set_lhs (gang_pos_call, gang_pos); + gimple_seq_add_stmt (&condseq, gang_pos_call); + gimple *cond = gimple_build_cond (NE_EXPR, gang_pos, integer_zero_node, + NULL, NULL); + gimple_seq_add_stmt (&condseq, cond); + gimple *cond_end = gimple_seq_last (condseq); + gsi_insert_seq_before (&gsi, condseq, GSI_SAME_STMT); + + gimple_seq free_seq = NULL; + gcc_assert (!gcn_array_reduction_buffers.is_empty ()); + tree red_buf_ptr = gcn_array_reduction_buffers.last (); + + tree ptr = make_ssa_name (ptr_type_node); + gimplify_assign (ptr, red_buf_ptr, &free_seq); + + gcn_array_reduction_buffers.pop (); + + tree free_decl = builtin_decl_explicit (BUILT_IN_FREE); + gcall *stmt = gimple_build_call (free_decl, 1, ptr); + gimple_seq_add_stmt (&free_seq, stmt); + + gcn_create_if_else_seq (&gsi, cond_end, &free_seq, NULL); } if (lhs) @@ -484,7 +1022,8 @@ gcn_goacc_reduction_teardown (gcall *call) pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* Implement TARGET_GOACC_REDUCTION. diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc index e6d5254..ba40a84 100644 --- a/gcc/config/nvptx/nvptx.cc +++ b/gcc/config/nvptx/nvptx.cc @@ -2142,19 +2142,15 @@ nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, nvptx_shuffle_kind kind) break; case E_V2SImode: { - rtx src0 = gen_rtx_SUBREG (SImode, src, 0); - rtx src1 = gen_rtx_SUBREG (SImode, src, 4); - rtx dst0 = gen_rtx_SUBREG (SImode, dst, 0); - rtx dst1 = gen_rtx_SUBREG (SImode, dst, 4); rtx tmp0 = gen_reg_rtx (SImode); rtx tmp1 = gen_reg_rtx (SImode); start_sequence (); - emit_insn (gen_movsi (tmp0, src0)); - emit_insn (gen_movsi (tmp1, src1)); + emit_insn (gen_vec_extractv2sisi (tmp0, src, GEN_INT (0))); + emit_insn (gen_vec_extractv2sisi (tmp1, src, GEN_INT (1))); emit_insn (nvptx_gen_shuffle (tmp0, tmp0, idx, kind)); emit_insn (nvptx_gen_shuffle (tmp1, tmp1, idx, kind)); - emit_insn (gen_movsi (dst0, tmp0)); - emit_insn (gen_movsi (dst1, tmp1)); + emit_insn (gen_vec_setv2si (dst, tmp0, GEN_INT (0))); + emit_insn (gen_vec_setv2si (dst, tmp1, GEN_INT (1))); res = get_insns (); end_sequence (); } @@ -6512,15 +6508,24 @@ nvptx_expand_shared_addr (tree exp, rtx target, unsigned align = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 2)); unsigned offset = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 0)); - unsigned size = TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 1)); + unsigned size = 0; + + tree size_expr = CALL_EXPR_ARG (exp, 1); rtx addr = worker_red_sym; + if (TREE_CONSTANT (size_expr)) + size = TREE_INT_CST_LOW (size_expr); + if (vector) { offload_attrs oa; populate_offload_attrs (&oa); + /* Default size for unknown size expression. */ + if (size == 0) + size = 256; + unsigned int psize = ROUND_UP (size + offset, align); unsigned int pnum = nvptx_mach_max_workers (); vector_red_partition = MAX (vector_red_partition, psize); @@ -6536,7 +6541,8 @@ nvptx_expand_shared_addr (tree exp, rtx target, else { worker_red_align = MAX (worker_red_align, align); - worker_red_size = MAX (worker_red_size, size + offset); + if (size) + worker_red_size = MAX (worker_red_size, size + offset); if (offset) { @@ -7141,16 +7147,32 @@ nvptx_get_shared_red_addr (tree type, tree offset, bool vector) enum nvptx_builtins addr_dim = NVPTX_BUILTIN_WORKER_ADDR; if (vector) addr_dim = NVPTX_BUILTIN_VECTOR_ADDR; - machine_mode mode = TYPE_MODE (type); tree fndecl = nvptx_builtin_decl (addr_dim, true); - tree size = build_int_cst (unsigned_type_node, GET_MODE_SIZE (mode)); - tree align = build_int_cst (unsigned_type_node, - GET_MODE_ALIGNMENT (mode) / BITS_PER_UNIT); + tree size = TYPE_SIZE_UNIT (type); + tree align = build_int_cst (unsigned_type_node, TYPE_ALIGN_UNIT (type)); tree call = build_call_expr (fndecl, 3, offset, size, align); return fold_convert (build_pointer_type (type), call); } +static tree +nvptx_get_shared_red_addr (tree array_elem_type, tree array_max_idx, + tree offset, bool vector) +{ + tree fndecl = nvptx_builtin_decl ((vector + ? NVPTX_BUILTIN_VECTOR_ADDR + : NVPTX_BUILTIN_WORKER_ADDR), + true); + tree align = build_int_cst (unsigned_type_node, + TYPE_ALIGN_UNIT (array_elem_type)); + tree array_length = fold_build2 (PLUS_EXPR, sizetype, array_max_idx, + build_int_cst (sizetype, 1)); + tree size = fold_build2 (MULT_EXPR, sizetype, + TYPE_SIZE_UNIT (array_elem_type), array_length); + tree call = build_call_expr (fndecl, 3, offset, size, align); + return fold_convert (build_pointer_type (array_elem_type), call); +} + /* Emit a SHFL.DOWN using index SHFL of VAR into DEST_VAR. This function will cast the variable if necessary. */ @@ -7456,11 +7478,144 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi, static tree nvptx_reduction_update (location_t loc, gimple_stmt_iterator *gsi, - tree ptr, tree var, tree_code op, int level) + tree ptr, tree var, tree_code op, int level, + tree array_max_idx = NULL_TREE) { tree type = TREE_TYPE (var); tree size = TYPE_SIZE (type); + if (!VAR_P (ptr)) + { + tree t = make_ssa_name (TREE_TYPE (ptr)); + gimple_seq seq = NULL; + gimplify_assign (t, ptr, &seq); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + ptr = t; + } + + if (TREE_CODE (type) == ARRAY_TYPE + || TREE_CODE (type) == POINTER_TYPE) + { + tree array_type; + if (TREE_CODE (type) == ARRAY_TYPE) + { + array_type = TREE_TYPE (var); + } + else if (TREE_CODE (type) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE) + { + array_type = TREE_TYPE (TREE_TYPE (var)); + } + else if (TREE_CODE (type) == POINTER_TYPE) + { + array_type = TREE_TYPE (var); + } + else + gcc_unreachable (); + + tree array_elem_type = TREE_TYPE (array_type); + + gimple *g; + gimple_seq seq = NULL; + tree max_index = array_max_idx; + gcc_assert (array_max_idx); + + tree init_index = make_ssa_name (TREE_TYPE (max_index)); + tree loop_index = make_ssa_name (TREE_TYPE (max_index)); + tree update_index = make_ssa_name (TREE_TYPE (max_index)); + + g = gimple_build_assign (init_index, + build_int_cst (TREE_TYPE (init_index), 0)); + gimple_seq_add_stmt (&seq, g); + gimple *init_end = gimple_seq_last (seq); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + basic_block init_bb = gsi_bb (*gsi); + edge init_edge = split_block (init_bb, init_end); + basic_block loop_bb = init_edge->dest; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + seq = NULL; + g = gimple_build_assign (update_index, PLUS_EXPR, loop_index, + build_int_cst (TREE_TYPE (loop_index), 1)); + gimple_seq_add_stmt (&seq, g); + + g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL); + gimple_seq_add_stmt (&seq, g); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + edge post_edge = split_block (loop_bb, g); + basic_block post_bb = post_edge->dest; + loop_bb = post_edge->src; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + /* Place where we insert reduction code below. */ + gimple_stmt_iterator reduction_code_gsi = gsi_start_bb (loop_bb); + + post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU; + post_edge->probability = profile_probability::even (); + edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE); + loop_edge->probability = profile_probability::even (); + set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb); + set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb); + class loop *new_loop = alloc_loop (); + new_loop->header = loop_bb; + new_loop->latch = loop_bb; + add_loop (new_loop, loop_bb->loop_father); + + gphi *phi = create_phi_node (loop_index, loop_bb); + add_phi_arg (phi, init_index, init_edge, loc); + add_phi_arg (phi, update_index, loop_edge, loc); + + tree var_ptr = fold_convert (build_pointer_type (array_elem_type), + var); + tree idx = fold_build2 (MULT_EXPR, sizetype, + fold_convert (sizetype, loop_index), + TYPE_SIZE_UNIT (array_elem_type)); + var_ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (var_ptr), var_ptr, idx); + tree var_aref = build_simple_mem_ref (var_ptr); + ptr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr),ptr, idx); + + nvptx_reduction_update (loc, &reduction_code_gsi, + ptr, var_aref, op, level); + + return build_simple_mem_ref (ptr); + } + else if (TREE_CODE (type) == RECORD_TYPE) + { + for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld)) + if (TREE_CODE (fld) == FIELD_DECL) + { + tree var_fld_ref = build3 (COMPONENT_REF, TREE_TYPE (fld), + var, fld, NULL); + tree ptr_ref = build_simple_mem_ref (ptr); + tree ptr_fld_type + = build_qualified_type (TREE_TYPE (fld), + TYPE_QUALS (TREE_TYPE (ptr_ref))); + tree ptr_fld_ref = build3 (COMPONENT_REF, ptr_fld_type, + ptr_ref, fld, NULL); + + if (TREE_CODE (TREE_TYPE (fld)) == ARRAY_TYPE) + { + tree array_elem_ptr_type + = build_pointer_type (TREE_TYPE (TREE_TYPE (fld))); + nvptx_reduction_update + (loc, gsi, + fold_convert (array_elem_ptr_type, + build_fold_addr_expr (ptr_fld_ref)), + build_fold_addr_expr (var_fld_ref), op, level, + TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (fld)))); + } + else + nvptx_reduction_update (loc, gsi, + build_fold_addr_expr (ptr_fld_ref), + var_fld_ref, op, level); + } + return build_simple_mem_ref (ptr); + } + if (size == TYPE_SIZE (unsigned_type_node) || size == TYPE_SIZE (long_long_unsigned_type_node)) return nvptx_lockless_update (loc, gsi, ptr, var, op); @@ -7602,6 +7757,19 @@ nvptx_goacc_reduction_setup (gcall *call, offload_attrs *oa) tree lhs = gimple_call_lhs (call); tree var = gimple_call_arg (call, 2); int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + gimple_seq seq = NULL; push_gimplify_context (true); @@ -7611,34 +7779,52 @@ nvptx_goacc_reduction_setup (gcall *call, offload_attrs *oa) /* Copy the receiver object. */ tree ref_to_res = gimple_call_arg (call, 1); - if (!integer_zerop (ref_to_res)) + if (!integer_zerop (ref_to_res) && !array_p) { - ref_to_res = nvptx_adjust_reduction_type (ref_to_res, TREE_TYPE (var), - &seq); + ref_to_res = nvptx_adjust_reduction_type (ref_to_res, + TREE_TYPE (var), &seq); var = build_simple_mem_ref (ref_to_res); } } if (level == GOMP_DIM_WORKER - || (level == GOMP_DIM_VECTOR && oa->vector_length > PTX_WARP_SIZE)) + || (level == GOMP_DIM_VECTOR + && (oa->vector_length > PTX_WARP_SIZE + || array_p + || TREE_CODE (TREE_TYPE (var)) == RECORD_TYPE))) { /* Store incoming value to worker reduction buffer. */ tree offset = gimple_call_arg (call, 5); - tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, - level == GOMP_DIM_VECTOR); - tree ptr = make_ssa_name (TREE_TYPE (call)); - - gimplify_assign (ptr, call, &seq); - tree ref = build_simple_mem_ref (ptr); - TREE_THIS_VOLATILE (ref) = 1; - gimplify_assign (ref, var, &seq); + tree call, ptr; + if (array_p) + { + tree array_elem_type = TREE_TYPE (array_type); + call = nvptx_get_shared_red_addr (array_elem_type, array_max_idx, + offset, level == GOMP_DIM_VECTOR); + ptr = make_ssa_name (TREE_TYPE (call)); + gimplify_assign (ptr, call, &seq); + oacc_build_array_copy (fold_convert (TREE_TYPE (array_addr), ptr), + array_addr, array_max_idx, &seq); + } + else + { + call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, + level == GOMP_DIM_VECTOR); + ptr = make_ssa_name (TREE_TYPE (call)); + gimplify_assign (ptr, call, &seq); + tree ref = build_simple_mem_ref (ptr); + TREE_THIS_VOLATILE (ref) = 1; + gimplify_assign (ref, var, &seq); + } } if (lhs) - gimplify_assign (lhs, var, &seq); + gimplify_assign (lhs, unshare_expr (var), &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* NVPTX implementation of GOACC_REDUCTION_INIT. */ @@ -7652,13 +7838,55 @@ nvptx_goacc_reduction_init (gcall *call, offload_attrs *oa) int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); enum tree_code rcode = (enum tree_code)TREE_INT_CST_LOW (gimple_call_arg (call, 4)); - tree init = omp_reduction_init_op (gimple_location (call), rcode, - TREE_TYPE (var)); + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + + tree init = NULL_TREE; gimple_seq seq = NULL; push_gimplify_context (true); + if (array_p) + { + tree loop_index; + gimple_stmt_iterator loop_body_gsi; + oacc_build_indexed_ssa_loop (gimple_location (call), array_max_idx, &gsi, + &loop_index, &loop_body_gsi); + + tree init_type = TREE_TYPE (array_type); + init = omp_reduction_init_op (gimple_location (call), rcode, + init_type); + gimple_seq seq = NULL; + + tree ptr = fold_convert (ptr_type_node, array_addr); + tree offset = build2 (MULT_EXPR, sizetype, + fold_convert (sizetype, loop_index), + TYPE_SIZE_UNIT (init_type)); + tree addr = build2 (POINTER_PLUS_EXPR, build_pointer_type (init_type), + ptr, offset); + tree ref = build_simple_mem_ref (addr); + + push_gimplify_context (true); + gimplify_assign (ref, init, &seq); + pop_gimplify_context (NULL); + gsi_insert_seq_before (&loop_body_gsi, seq, GSI_SAME_STMT); + init = var; + } + else + init = omp_reduction_init_op (gimple_location (call), rcode, + TREE_TYPE (var)); - if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE) + if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE + && !array_p + && TREE_CODE (TREE_TYPE (var)) != RECORD_TYPE) { /* Initialize vector-non-zeroes to INIT_VAL (OP). */ tree tid = make_ssa_name (integer_type_node); @@ -7723,7 +7951,9 @@ nvptx_goacc_reduction_init (gcall *call, offload_attrs *oa) } pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* NVPTX implementation of GOACC_REDUCTION_FINI. */ @@ -7738,25 +7968,49 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa) int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); enum tree_code op = (enum tree_code)TREE_INT_CST_LOW (gimple_call_arg (call, 4)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + gimple_seq seq = NULL; tree r = NULL_TREE; push_gimplify_context (true); - if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE) + if (level == GOMP_DIM_VECTOR && oa->vector_length == PTX_WARP_SIZE + && !array_p + && TREE_CODE (TREE_TYPE (var)) != RECORD_TYPE) r = nvptx_vector_reduction (gimple_location (call), &gsi, var, op); else { tree accum = NULL_TREE; + tree ptr = NULL_TREE; if (level == GOMP_DIM_WORKER || level == GOMP_DIM_VECTOR) { /* Get reduction buffer address. */ tree offset = gimple_call_arg (call, 5); - tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, - level == GOMP_DIM_VECTOR); - tree ptr = make_ssa_name (TREE_TYPE (call)); - + tree call; + if (array_p) + { + tree array_elem_type = TREE_TYPE (array_type); + call = nvptx_get_shared_red_addr (array_elem_type, array_max_idx, + offset, + level == GOMP_DIM_VECTOR); + } + else + call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, + level == GOMP_DIM_VECTOR); + ptr = make_ssa_name (TREE_TYPE (call)); gimplify_assign (ptr, call, &seq); accum = ptr; } @@ -7774,8 +8028,16 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa) /* UPDATE the accumulator. */ gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); seq = NULL; - r = nvptx_reduction_update (gimple_location (call), &gsi, - accum, var, op, level); + if (array_p) + { + nvptx_reduction_update (gimple_location (call), &gsi, + accum, array_addr, op, level, + array_max_idx); + r = var; + } + else + r = nvptx_reduction_update (gimple_location (call), &gsi, + accum, var, op, level); } } @@ -7783,7 +8045,8 @@ nvptx_goacc_reduction_fini (gcall *call, offload_attrs *oa) gimplify_assign (lhs, r, &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* NVPTX implementation of GOACC_REDUCTION_TEARDOWN. */ @@ -7795,21 +8058,47 @@ nvptx_goacc_reduction_teardown (gcall *call, offload_attrs *oa) tree lhs = gimple_call_lhs (call); tree var = gimple_call_arg (call, 2); int level = TREE_INT_CST_LOW (gimple_call_arg (call, 3)); + + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + bool array_p = !integer_zerop (array_addr); + + tree array_type = NULL_TREE; + if (array_p) + array_type + = (TREE_CODE (TREE_TYPE (array_addr)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (array_addr))) == ARRAY_TYPE + ? TREE_TYPE (TREE_TYPE (array_addr)) + : TREE_TYPE (array_addr)); + + tree ptr = NULL_TREE; gimple_seq seq = NULL; push_gimplify_context (true); if (level == GOMP_DIM_WORKER - || (level == GOMP_DIM_VECTOR && oa->vector_length > PTX_WARP_SIZE)) + || (level == GOMP_DIM_VECTOR && (oa->vector_length > PTX_WARP_SIZE + || array_p))) { /* Read the worker reduction buffer. */ tree offset = gimple_call_arg (call, 5); - tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, - level == GOMP_DIM_VECTOR); - tree ptr = make_ssa_name (TREE_TYPE (call)); - - gimplify_assign (ptr, call, &seq); - var = build_simple_mem_ref (ptr); - TREE_THIS_VOLATILE (var) = 1; + if (array_p) + { + tree array_elem_type = TREE_TYPE (array_type); + tree call + = nvptx_get_shared_red_addr (array_elem_type, array_max_idx, + offset, level == GOMP_DIM_VECTOR); + ptr = make_ssa_name (TREE_TYPE (call)); + gimplify_assign (ptr, call, &seq); + } + else + { + tree call = nvptx_get_shared_red_addr (TREE_TYPE (var), offset, + level == GOMP_DIM_VECTOR); + ptr = make_ssa_name (TREE_TYPE (call)); + gimplify_assign (ptr, call, &seq); + var = build_simple_mem_ref (ptr); + TREE_THIS_VOLATILE (var) = 1; + } } if (level != GOMP_DIM_GANG) @@ -7819,18 +8108,28 @@ nvptx_goacc_reduction_teardown (gcall *call, offload_attrs *oa) if (!integer_zerop (ref_to_res)) { - ref_to_res = nvptx_adjust_reduction_type (ref_to_res, TREE_TYPE (var), - &seq); - gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq); + if (array_p) + oacc_build_array_copy (ref_to_res, ptr, array_max_idx, &seq); + else + { + ref_to_res = nvptx_adjust_reduction_type (ref_to_res, + TREE_TYPE (var), &seq); + gimplify_assign (build_simple_mem_ref (ref_to_res), var, &seq); + } + } + else if (array_p) + { + oacc_build_array_copy (array_addr, ptr, array_max_idx, &seq); } } if (lhs) - gimplify_assign (lhs, var, &seq); + gimplify_assign (lhs, unshare_expr (var), &seq); pop_gimplify_context (NULL); - gsi_replace_with_seq (&gsi, seq, true); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_remove (&gsi, true); } /* NVPTX reduction expander. */ diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 1c104eb..232881d 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -39799,8 +39799,7 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, case OMP_CLAUSE_HAS_DEVICE_ADDR: array_section_p = false; dims.truncate (0); - while ((ort != C_ORT_ACC || kind != OMP_CLAUSE_REDUCTION) - && cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE)) + while (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE)) { location_t loc = UNKNOWN_LOCATION; tree low_bound = NULL_TREE, length = NULL_TREE; @@ -41512,6 +41511,12 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind, code = TRUTH_ANDIF_EXPR; else if (id == ovl_op_identifier (false, TRUTH_ORIF_EXPR)) code = TRUTH_ORIF_EXPR; + if (code == ERROR_MARK && ort == C_ORT_ACC) + { + cp_parser_error (parser, "expected %<+%>, %<*%>, %<-%>, %<&%>, " + "%<^%>, %<|%>, %<&&%>, %<||%>, %<min%> or %<max%>"); + goto resync_fail; + } id = omp_reduction_id (code, id, NULL_TREE); tree scope = parser->scope; if (scope) @@ -41538,6 +41543,10 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind, for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c)) { OMP_CLAUSE_REDUCTION_CODE (c) = code; + /* OpenACC does not require anything below. */ + if (ort == C_ORT_ACC) + continue; + if (task) OMP_CLAUSE_REDUCTION_TASK (c) = 1; else if (inscan) diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index deb82c9..196eb9f 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -7187,6 +7187,69 @@ cp_check_omp_declare_reduction (tree udr) return true; } + +static bool +cp_oacc_reduction_defined_type_p (enum tree_code reduction_code, tree t) +{ + if (TREE_CODE (t) == INTEGER_TYPE) + return true; + + if (FLOAT_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE) + switch (reduction_code) + { + case PLUS_EXPR: + case MULT_EXPR: + case MINUS_EXPR: + case TRUTH_ANDIF_EXPR: + case TRUTH_ORIF_EXPR: + return true; + case MIN_EXPR: + case MAX_EXPR: + return TREE_CODE (t) != COMPLEX_TYPE; + case BIT_AND_EXPR: + case BIT_XOR_EXPR: + case BIT_IOR_EXPR: + return false; + default: + gcc_unreachable (); + } + + if (TREE_CODE (t) == ARRAY_TYPE) + return cp_oacc_reduction_defined_type_p (reduction_code, TREE_TYPE (t)); + + if (TREE_CODE (t) == RECORD_TYPE) + { + for (tree fld = TYPE_FIELDS (t); fld; fld = TREE_CHAIN (fld)) + if (TREE_CODE (fld) == FIELD_DECL + && !cp_oacc_reduction_defined_type_p (reduction_code, + TREE_TYPE (fld))) + return false; + return true; + } + + return false; +} + +static const char * +cp_oacc_reduction_code_name (enum tree_code reduction_code) +{ + switch (reduction_code) + { + case PLUS_EXPR: return "+"; + case MULT_EXPR: return "*"; + case MINUS_EXPR: return "-"; + case TRUTH_ANDIF_EXPR: return "&&"; + case TRUTH_ORIF_EXPR: return "||"; + case MIN_EXPR: return "min"; + case MAX_EXPR: return "max"; + case BIT_AND_EXPR: return "&"; + case BIT_XOR_EXPR: return "^"; + case BIT_IOR_EXPR: return "|"; + default: + gcc_unreachable (); + } +} + /* Helper function of finish_omp_clauses. Clone STMT as if we were making an inline call. But, remap the OMP_DECL1 VAR_DECL (omp_out resp. omp_orig) to PLACEHOLDER @@ -7333,6 +7396,20 @@ finish_omp_reduction_clause (tree c, enum c_omp_region_type ort, return false; } + if (ort == C_ORT_ACC) + { + enum tree_code r_code = OMP_CLAUSE_REDUCTION_CODE (c); + if (!cp_oacc_reduction_defined_type_p (r_code, TREE_TYPE (t))) + { + const char *r_name = cp_oacc_reduction_code_name (r_code); + error_at (OMP_CLAUSE_LOCATION (c), + "%qE has invalid type for %<reduction(%s)%>", + t, r_name); + return true; + } + return false; + } + tree id = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c); type = TYPE_MAIN_VARIANT (type); diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index ddf4e50..563ba57 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -8482,6 +8482,95 @@ oacc_is_loop (gfc_code *code) || code->op == EXEC_OACC_LOOP; } +static bool +oacc_reduction_defined_type_p (enum gfc_omp_reduction_op rop, gfc_typespec *ts, + const char **rop_name = NULL) +{ + gcc_assert (rop != OMP_REDUCTION_USER && rop != OMP_REDUCTION_NONE); + + if (rop_name) + switch (rop) + { + case OMP_REDUCTION_MAX: + *rop_name = "max"; + break; + case OMP_REDUCTION_MIN: + *rop_name = "min"; + break; + case OMP_REDUCTION_IAND: + *rop_name = "iand"; + break; + case OMP_REDUCTION_IOR: + *rop_name = "ior"; + break; + case OMP_REDUCTION_IEOR: + *rop_name = "ieor"; + break; + default: + *rop_name = gfc_op2string ((gfc_intrinsic_op) rop); + break; + } + + if (ts->type == BT_INTEGER) + switch (rop) + { + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return false; + default: + return true; + } + + if (ts->type == BT_LOGICAL) + switch (rop) + { + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return true; + default: + return false; + } + + if (ts->type == BT_REAL || ts->type == BT_COMPLEX) + switch (rop) + { + case OMP_REDUCTION_PLUS: + case OMP_REDUCTION_TIMES: + case OMP_REDUCTION_MINUS: + return true; + + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return false; + + case OMP_REDUCTION_MAX: + case OMP_REDUCTION_MIN: + return ts->type != BT_COMPLEX; + case OMP_REDUCTION_IAND: + case OMP_REDUCTION_IOR: + case OMP_REDUCTION_IEOR: + return false; + default: + gcc_unreachable (); + } + + if (ts->type == BT_DERIVED) + { + for (gfc_component *p = ts->u.derived->components; p; p = p->next) + if (!oacc_reduction_defined_type_p (rop, &p->ts)) + return false; + return true; + } + + return false; +} + static void resolve_scalar_int_expr (gfc_expr *expr, const char *clause) { @@ -9917,8 +10006,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, else n->sym->mark = 1; - /* OpenACC does not support reductions on arrays. */ - if (n->sym->as) + /* OpenACC current only supports array reductions on explicit-shape + arrays. */ + if ((n->sym->as && n->sym->as->type != AS_EXPLICIT) + || n->sym->attr.codimension) gfc_error ("Array %qs is not permitted in reduction at %L", n->sym->name, &n->where); } @@ -10001,6 +10092,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, } } break; + case OMP_LIST_REDUCTION: + if (!openacc) + goto default_case; + gcc_fallthrough (); case OMP_LIST_AFFINITY: case OMP_LIST_DEPEND: case OMP_LIST_MAP: @@ -10009,6 +10104,38 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, case OMP_LIST_CACHE: for (; n != NULL; n = n->next) { + if (openacc && list == OMP_LIST_REDUCTION) + { + if (n->sym->attr.threadprivate) + gfc_error ("THREADPRIVATE object %qs in %s clause at %L", + n->sym->name, name, &n->where); + if (n->sym->attr.cray_pointee) + gfc_error ("Cray pointee %qs in %s clause at %L", + n->sym->name, name, &n->where); + if (n->sym->attr.associate_var) + gfc_error ("Associate name %qs in %s clause at %L", + n->sym->attr.select_type_temporary + ? n->sym->assoc->target->symtree->n.sym->name + : n->sym->name, name, &n->where); + if (n->sym->attr.proc_pointer) + gfc_error ("Procedure pointer %qs in %s clause at %L", + n->sym->name, name, &n->where); + if (n->sym->attr.pointer) + gfc_error ("POINTER object %qs in %s clause at %L", + n->sym->name, name, &n->where); + if (n->sym->attr.cray_pointer) + gfc_error ("Cray pointer %qs in %s clause at %L", + n->sym->name, name, &n->where); + + const char *rop_name; + if (!oacc_reduction_defined_type_p (n->u.reduction_op, + &n->sym->ts, &rop_name)) + { + gfc_error ("Reduction operator %s is not valid for %qs at %L", + rop_name, n->sym->name, &n->where); + break; + } + } if ((list == OMP_LIST_DEPEND || list == OMP_LIST_AFFINITY) && n->u2.ns && !n->u2.ns->resolved) { @@ -10250,6 +10377,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, break; } default: + default_case: for (; n != NULL; n = n->next) { if (n->sym == NULL) @@ -10402,39 +10530,46 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, case OMP_LIST_IN_REDUCTION: case OMP_LIST_TASK_REDUCTION: case OMP_LIST_REDUCTION_INSCAN: - switch (n->u.reduction_op) + if (openacc) { - case OMP_REDUCTION_PLUS: - case OMP_REDUCTION_TIMES: - case OMP_REDUCTION_MINUS: - if (!gfc_numeric_ts (&n->sym->ts)) + if (!oacc_reduction_defined_type_p (n->u.reduction_op, + &n->sym->ts)) bad = true; - break; - case OMP_REDUCTION_AND: - case OMP_REDUCTION_OR: - case OMP_REDUCTION_EQV: - case OMP_REDUCTION_NEQV: - if (n->sym->ts.type != BT_LOGICAL) - bad = true; - break; - case OMP_REDUCTION_MAX: - case OMP_REDUCTION_MIN: - if (n->sym->ts.type != BT_INTEGER - && n->sym->ts.type != BT_REAL) - bad = true; - break; - case OMP_REDUCTION_IAND: - case OMP_REDUCTION_IOR: - case OMP_REDUCTION_IEOR: - if (n->sym->ts.type != BT_INTEGER) - bad = true; - break; - case OMP_REDUCTION_USER: - bad = true; - break; - default: - break; } + else + switch (n->u.reduction_op) + { + case OMP_REDUCTION_PLUS: + case OMP_REDUCTION_TIMES: + case OMP_REDUCTION_MINUS: + if (!gfc_numeric_ts (&n->sym->ts)) + bad = true; + break; + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + if (n->sym->ts.type != BT_LOGICAL) + bad = true; + break; + case OMP_REDUCTION_MAX: + case OMP_REDUCTION_MIN: + if (n->sym->ts.type != BT_INTEGER + && n->sym->ts.type != BT_REAL) + bad = true; + break; + case OMP_REDUCTION_IAND: + case OMP_REDUCTION_IOR: + case OMP_REDUCTION_IEOR: + if (n->sym->ts.type != BT_INTEGER) + bad = true; + break; + case OMP_REDUCTION_USER: + bad = true; + break; + default: + break; + } if (!bad) n->u2.udr = NULL; else diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 521f7cc..cec01475 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -2840,7 +2840,9 @@ omp_udr_find_orig (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED, } static void -gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where) +gfc_trans_omp_array_reduction_or_udr (stmtblock_t *block, tree c, + gfc_omp_namelist *n, locus where, + bool openacc) { gfc_symbol *sym = n->sym; gfc_symtree *root1 = NULL, *root2 = NULL, *root3 = NULL, *root4 = NULL; @@ -2868,6 +2870,98 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where) type = TREE_TYPE (type); } + if (openacc) + { + if ((n->expr == NULL && n->sym->as != NULL) + || (n->expr + && n->expr->ref->type == REF_ARRAY + && n->expr->ref->u.ar.type == AR_FULL)) + { + tree t = build_fold_addr_expr (decl); + t = build2 (MEM_REF, type, t, + build_int_cst (build_pointer_type (integer_type_node), 0)); + OMP_CLAUSE_DECL (c) = t; + return; + } + + if (n->expr + && n->expr->expr_type == EXPR_VARIABLE + && n->expr->ref->type == REF_ARRAY + && !n->expr->ref->next) + { + bool t = gfc_resolve_expr (n->expr); + gcc_assert (t); + + gfc_se se; + bool is_element = n->expr->ref->u.ar.type == AR_ELEMENT; + tree ptr; + gfc_init_se (&se, NULL); + if (is_element) + { + gfc_conv_expr_reference (&se, n->expr); + gfc_add_block_to_block (block, &se.pre); + ptr = se.expr; + + tree elem_type; + tree type + = build_range_type (TREE_TYPE (TREE_TYPE (ptr)), + integer_zero_node, + integer_zero_node); + elem_type = build_array_type (TREE_TYPE (type), type); + gcc_assert (TREE_CODE (ptr) == ADDR_EXPR + && TREE_CODE (TREE_OPERAND (ptr, 0)) == ARRAY_REF); + tree aref = TREE_OPERAND (ptr, 0); + tree array = TREE_OPERAND (aref, 0); + tree offset = TREE_OPERAND (aref, 1); + tree t = build2 (POINTER_PLUS_EXPR, + build_pointer_type (elem_type), + build_fold_addr_expr (array), + fold_convert (size_type_node, offset)); + t = build2 (MEM_REF, elem_type, t, null_pointer_node); + OMP_CLAUSE_DECL (c) = t; + return; + } + else + { + gfc_conv_expr_descriptor (&se, n->expr); + gfc_add_block_to_block (block, &se.pre); + + ptr = gfc_conv_array_data (se.expr); + tree type = TREE_TYPE (TREE_TYPE (TREE_TYPE (ptr))); + tree idx + = gfc_rank_cst[GFC_TYPE_ARRAY_RANK (TREE_TYPE (se.expr)) - 1]; + tree sz + = fold_build2 (MINUS_EXPR, gfc_array_index_type, + gfc_conv_descriptor_ubound_get (se.expr, idx), + gfc_conv_descriptor_lbound_get (se.expr, idx)); + sz = fold_build2 (PLUS_EXPR, gfc_array_index_type, + sz, gfc_index_one_node); + tree domain + = build_index_type (fold_build2 (MINUS_EXPR, + gfc_array_index_type, + sz, gfc_index_one_node)); + tree t, array_type = build_array_type (type, domain, false); + tree offset = create_tmp_var (sizetype); + t = build2 (MINUS_EXPR, sizetype, + fold_convert (sizetype, ptr), + fold_convert (sizetype, + build_fold_addr_expr (decl))); + t = build2 (MODIFY_EXPR, sizetype, offset, t); + gfc_add_expr_to_block (block, t); + + t = build2 (POINTER_PLUS_EXPR, build_pointer_type (array_type), + build_fold_addr_expr (decl), offset); + t = build2 (MEM_REF, array_type, t, null_pointer_node); + OMP_CLAUSE_DECL (c) = t; + return; + } + gcc_assert (se.post.head == NULL_TREE); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr))); + OMP_CLAUSE_DECL (c) = build_fold_indirect_ref (ptr); + return; + } + } + /* Create a fake symbol for init value. */ memset (&init_val_sym, 0, sizeof (init_val_sym)); init_val_sym.ns = sym->ns; @@ -3095,21 +3189,24 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where) poplevel (0, 0); OMP_CLAUSE_REDUCTION_INIT (c) = stmt; - /* Create the merge statement list. */ - pushlevel (); - if (e4) - stmt = gfc_trans_assignment (e3, e4, false, true); - else - stmt = gfc_trans_call (n->u2.udr->combiner, false, - NULL_TREE, NULL_TREE, false); - if (TREE_CODE (stmt) != BIND_EXPR) - stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0)); - else - poplevel (0, 0); - OMP_CLAUSE_REDUCTION_MERGE (c) = stmt; + if (!openacc) + { + /* Create the merge statement list. */ + pushlevel (); + if (e4) + stmt = gfc_trans_assignment (e3, e4, false, true); + else + stmt = gfc_trans_call (n->u2.udr->combiner, false, + NULL_TREE, NULL_TREE, false); + if (TREE_CODE (stmt) != BIND_EXPR) + stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0)); + else + poplevel (0, 0); + OMP_CLAUSE_REDUCTION_MERGE (c) = stmt; - /* And stick the placeholder VAR_DECL into the clause as well. */ - OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) = outer_decl; + /* And stick the placeholder VAR_DECL into the clause as well. */ + OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) = outer_decl; + } gfc_current_locus = old_loc; @@ -3139,8 +3236,10 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_omp_namelist *n, locus where) } static tree -gfc_trans_omp_reduction_list (int kind, gfc_omp_namelist *namelist, tree list, - locus where, bool mark_addressable) +gfc_trans_omp_reduction_list (stmtblock_t *block, int kind, + gfc_omp_namelist *namelist, tree list, + locus where, bool mark_addressable, + bool openacc) { omp_clause_code clause = OMP_CLAUSE_REDUCTION; switch (kind) @@ -3220,7 +3319,8 @@ gfc_trans_omp_reduction_list (int kind, gfc_omp_namelist *namelist, tree list, if (namelist->sym->attr.dimension || namelist->u.reduction_op == OMP_REDUCTION_USER || namelist->sym->attr.allocatable) - gfc_trans_omp_array_reduction_or_udr (node, namelist, where); + gfc_trans_omp_array_reduction_or_udr (block, node, namelist, + where, openacc); list = gfc_trans_add_clause (node, list); } } @@ -3290,13 +3390,15 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd, { tree type = TREE_TYPE (se.expr); gfc_add_block_to_block (block, &se.pre); - OMP_CLAUSE_SIZE (node) = gfc_full_array_size (block, se.expr, - GFC_TYPE_ARRAY_RANK (type)); if (!elemsz) elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type)); elemsz = fold_convert (gfc_array_index_type, elemsz); - OMP_CLAUSE_SIZE (node) = fold_build2 (MULT_EXPR, gfc_array_index_type, - OMP_CLAUSE_SIZE (node), elemsz); + tree size = gfc_full_array_size (block, se.expr, + GFC_TYPE_ARRAY_RANK (type)); + size = fold_build2 (MULT_EXPR, gfc_array_index_type, size, elemsz); + if (OMP_CLAUSE_CODE (node) == OMP_CLAUSE_MAP) + OMP_CLAUSE_SIZE (node) = size; + if (n->expr->ts.type == BT_DERIVED && n->expr->ts.u.derived->attr.alloc_comp) { @@ -3387,6 +3489,11 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd, fold_convert (ptrdiff_type_node, ptr2)); offset = build2 (TRUNC_DIV_EXPR, ptrdiff_type_node, offset, fold_convert (ptrdiff_type_node, elemsz)); + + tree offset_tmp = create_tmp_var (ptrdiff_type_node); + gfc_add_expr_to_block (block, build2 (MODIFY_EXPR, ptrdiff_type_node, + offset_tmp, offset)); + offset = offset_tmp; offset = build4_loc (input_location, ARRAY_REF, TREE_TYPE (TREE_TYPE (decl)), decl, offset, NULL_TREE, NULL_TREE); @@ -3405,9 +3512,14 @@ gfc_trans_omp_array_section (stmtblock_t *block, toc_directive cd, OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind); OMP_CLAUSE_DECL (node3) = decl; } - ptr2 = fold_convert (ptrdiff_type_node, ptr2); - OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node, - ptr, ptr2); + + tree ptr_tmp = create_tmp_var (ptrdiff_type_node); + ptr = fold_build2 (MINUS_EXPR, ptrdiff_type_node, ptr, + fold_convert (ptrdiff_type_node, ptr2)); + gfc_add_expr_to_block (block, build2 (MODIFY_EXPR, ptrdiff_type_node, + ptr_tmp, ptr)); + OMP_CLAUSE_SIZE (node3) = ptr_tmp; + if (n->u.map.readonly) OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1; } @@ -4017,8 +4129,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, case OMP_LIST_TASK_REDUCTION: /* An OpenACC async clause indicates the need to set reduction arguments addressable, to allow asynchronous copy-out. */ - omp_clauses = gfc_trans_omp_reduction_list (list, n, omp_clauses, - where, clauses->async); + omp_clauses = gfc_trans_omp_reduction_list (block, list, n, omp_clauses, + where, clauses->async, + openacc); break; case OMP_LIST_PRIVATE: clause_code = OMP_CLAUSE_PRIVATE; @@ -7605,7 +7718,9 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd = false; - omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc, + (op == EXEC_OACC_LOOP + ? TOC_OPENACC : TOC_OPENMP)); for (i = 0; i < collapse; i++) { diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index b48d114..da44cdf 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -13749,6 +13749,20 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, omp_firstprivatize_variable (ctx, v); omp_notice_variable (ctx, v, true); } + if (TREE_CODE (TREE_OPERAND (decl, 0)) == COMPONENT_REF + || CONVERT_EXPR_CODE_P (TREE_CODE (TREE_OPERAND (decl, 0)))) + { + gimplify_ctxp->into_ssa = false; + if (gimplify_expr (&TREE_OPERAND (decl, 0), pre_p, + NULL, is_gimple_val, fb_rvalue, false) + == GS_ERROR) + { + gimplify_ctxp->into_ssa = saved_into_ssa; + remove = true; + break; + } + gimplify_ctxp->into_ssa = saved_into_ssa; + } decl = TREE_OPERAND (decl, 0); if (TREE_CODE (decl) == POINTER_PLUS_EXPR) { @@ -14326,7 +14340,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, remove = true; break; } - if (DECL_NAME (decl) == NULL_TREE && (flags & GOVD_SHARED) == 0) + if (DECL_P (decl) && DECL_NAME (decl) == NULL_TREE + && (flags & GOVD_SHARED) == 0) { tree t = omp_member_access_dummy_var (decl); if (t) @@ -15877,6 +15892,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, else if ((OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER || (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_REFERENCE) + || (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER + && ctx->region_type != ORT_ACC_KERNELS) || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH) && TREE_CODE (OMP_CLAUSE_SIZE (c)) != INTEGER_CST) { @@ -16065,6 +16082,15 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, OMP_CLAUSE_DECL (c)); remove = true; } + + if (TREE_CODE (*pd) == ARRAY_REF + && DECL_P (TREE_OPERAND (*pd, 1)) + && (ctx->region_type & ORT_TARGET) != 0 + && (ctx->region_type & ORT_ACC) != 0 + && ctx->region_type != ORT_ACC_KERNELS) + omp_add_variable (ctx, TREE_OPERAND (*pd, 1), + GOVD_FIRSTPRIVATE | GOVD_SEEN); + gimplify_omp_ctxp = ctx; break; } @@ -16263,10 +16289,20 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, /* OpenACC reductions need a present_or_copy data clause. Add one if necessary. Emit error when the reduction is private. */ - if (DECL_P (decl) && - (ctx->region_type == ORT_ACC_PARALLEL - || ctx->region_type == ORT_ACC_SERIAL)) + if (ctx->region_type == ORT_ACC_PARALLEL + || ctx->region_type == ORT_ACC_SERIAL) { + if (TREE_CODE (decl) == MEM_REF + && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE) + { + tree addr = TREE_OPERAND (decl, 0); + if (TREE_CODE (addr) == POINTER_PLUS_EXPR) + addr = TREE_OPERAND (addr, 0); + if (TREE_CODE (addr) == ADDR_EXPR + && DECL_P (TREE_OPERAND (addr, 0))) + decl = TREE_OPERAND (addr, 0); + } + n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl); if (n->value & (GOVD_PRIVATE | GOVD_FIRSTPRIVATE)) { diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index ee2b1f3..038374e 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -106,6 +106,11 @@ struct omp_context construct. In the case of a parallel, this is in the child function. */ tree block_vars; + /* A hash map to track variables added through omp_copy_decl_*, to ensure + repeated calls of install_var_local on sam DECL do not get duplicated + local versions. */ + hash_map<tree, tree> *block_vars_map; + /* Label to which GOMP_cancel{,llation_point} and explicit and implicit barriers should jump to during omplower pass. */ tree cancel_label; @@ -596,12 +601,26 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx) static tree omp_copy_decl_2 (tree var, tree name, tree type, omp_context *ctx) { + if (ctx) + { + if (!ctx->block_vars_map) + ctx->block_vars_map = new hash_map<tree, tree> (); + else + { + tree *tp = ctx->block_vars_map->get (var); + if (tp) + return *tp; + } + } + tree copy = copy_var_decl (var, name, type); DECL_CONTEXT (copy) = current_function_decl; if (ctx) { + ctx->block_vars_map->put (var, copy); + DECL_CHAIN (copy) = ctx->block_vars; ctx->block_vars = copy; } @@ -786,7 +805,7 @@ build_sender_ref (tree var, omp_context *ctx) static void install_var_field (tree var, bool by_ref, int mask, omp_context *ctx, - tree key_expr = NULL_TREE) + tree key_expr = NULL_TREE, bool field_may_exist = false) { tree field, type, sfield = NULL_TREE; splay_tree_key key = (splay_tree_key) var; @@ -808,9 +827,9 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx, } } gcc_assert ((mask & 1) == 0 - || !splay_tree_lookup (ctx->field_map, key)); + || !splay_tree_lookup (ctx->field_map, key) || field_may_exist); gcc_assert ((mask & 2) == 0 || !ctx->sfield_map - || !splay_tree_lookup (ctx->sfield_map, key)); + || !splay_tree_lookup (ctx->sfield_map, key) || field_may_exist); gcc_assert ((mask & 3) == 3 || !is_gimple_omp_oacc (ctx->stmt)); @@ -1206,6 +1225,7 @@ delete_omp_context (splay_tree_value value) delete ctx->task_reduction_map; } + delete ctx->block_vars_map; delete ctx->lastprivate_conditional_map; delete ctx->allocate_map; @@ -2001,9 +2021,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER && !OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c) && TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE) - install_var_field (decl, true, 7, ctx, c); + install_var_field (decl, true, 7, ctx, c, true); else - install_var_field (decl, true, 3, ctx, c); + install_var_field (decl, true, 3, ctx, c, true); if (is_gimple_omp_offloaded (ctx->stmt) && !(is_gimple_omp_oacc (ctx->stmt) && OMP_CLAUSE_MAP_IN_REDUCTION (c))) @@ -2033,13 +2053,26 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) } gcc_assert (!splay_tree_lookup (ctx->field_map, (splay_tree_key) decl)); + tree ptr_type = ptr_type_node; + if (TREE_CODE (decl) == ARRAY_REF) + { + tree array_type = TREE_TYPE (TREE_OPERAND (decl, 0)); + ptr_type = build_pointer_type (array_type); + } tree field = build_decl (OMP_CLAUSE_LOCATION (c), - FIELD_DECL, NULL_TREE, ptr_type_node); - SET_DECL_ALIGN (field, TYPE_ALIGN (ptr_type_node)); + FIELD_DECL, NULL_TREE, ptr_type); + SET_DECL_ALIGN (field, TYPE_ALIGN (ptr_type)); insert_field_into_struct (ctx->record_type, field); - splay_tree_insert (ctx->field_map, (splay_tree_key) c, + splay_tree_insert (ctx->field_map, (splay_tree_key) /*decl,xxx*/ c, (splay_tree_value) field); + + if (TREE_CODE (decl) == ARRAY_REF + && is_gimple_omp_offloaded (ctx->stmt) + && !splay_tree_lookup (ctx->field_map, + (splay_tree_key) base)) + splay_tree_insert (ctx->field_map, (splay_tree_key) base, + (splay_tree_value) field); } } break; @@ -4907,6 +4940,33 @@ maybe_lookup_field_in_outer_ctx (tree decl, omp_context *ctx) tree omp_reduction_init_op (location_t loc, enum tree_code op, tree type) { + if (TREE_CODE (type) == ARRAY_TYPE) + { + tree max = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); + if (TREE_CONSTANT (max)) + { + vec<constructor_elt, va_gc> *v = NULL; + HOST_WIDE_INT max_val = tree_to_shwi (max); + tree t = omp_reduction_init_op (loc, op, TREE_TYPE (type)); + for (HOST_WIDE_INT i = 0; i <= max_val; i++) + CONSTRUCTOR_APPEND_ELT (v, size_int (i), t); + return build_constructor (type, v); + } + else + gcc_unreachable (); + } + else if (TREE_CODE (type) == RECORD_TYPE) + { + vec<constructor_elt, va_gc> *v = NULL; + for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld)) + if (TREE_CODE (fld) == FIELD_DECL) + { + tree t = omp_reduction_init_op (loc, op, TREE_TYPE (fld)); + CONSTRUCTOR_APPEND_ELT (v, fld, t); + } + return build_constructor (type, v); + } + switch (op) { case PLUS_EXPR: @@ -7856,6 +7916,76 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p, gimple_seq_add_seq (stmt_list, post_stmt_list); } +/* Give an array reduction clause, and the surrounding map clause that mapped + the array (section), calculate the actual bias for the reduction inside + the OpenACC region, generally just: reduction_bias - map_bias, but + encapsulate the hairy details. */ + +static tree +oacc_array_reduction_bias (location_t loc, tree reduction_clause, + omp_context *ctx, tree map_clause, + omp_context *outer) +{ + tree bias = TREE_OPERAND (OMP_CLAUSE_DECL (reduction_clause), 1); + tree orig_var = TREE_OPERAND (OMP_CLAUSE_DECL (reduction_clause), 0); + if (TREE_CODE (orig_var) == POINTER_PLUS_EXPR) + { + tree b = TREE_OPERAND (orig_var, 1); + b = maybe_lookup_decl (b, ctx); + if (b == NULL) + { + b = TREE_OPERAND (orig_var, 1); + b = maybe_lookup_decl_in_outer_ctx (b, ctx); + } + if (integer_zerop (bias)) + bias = b; + else + { + bias = fold_convert_loc (loc, TREE_TYPE (b), bias); + bias = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (b), b, bias); + } + orig_var = TREE_OPERAND (orig_var, 0); + } + + if (TREE_CODE (orig_var) == INDIRECT_REF + || TREE_CODE (orig_var) == ADDR_EXPR) + orig_var = TREE_OPERAND (orig_var, 0); + + tree map_decl = OMP_CLAUSE_DECL (map_clause); + tree next = OMP_CLAUSE_CHAIN (map_clause); + + tree orig_bias = integer_zero_node; + if (TREE_CODE (map_decl) == ARRAY_REF) + { + if (next && OMP_CLAUSE_CODE (next) == OMP_CLAUSE_MAP + && OMP_CLAUSE_DECL (next) == orig_var + && (OMP_CLAUSE_MAP_KIND (next) == GOMP_MAP_FIRSTPRIVATE_POINTER + || OMP_CLAUSE_MAP_KIND (next) == GOMP_MAP_POINTER)) + { + orig_bias = OMP_CLAUSE_SIZE (next); + if (DECL_P (orig_bias)) + orig_bias = lookup_decl (orig_bias, outer); + orig_bias = fold_convert_loc (loc, pointer_sized_int_node, + orig_bias); + } + else + { + tree idx = TREE_OPERAND (map_decl, 1); + idx = lookup_decl (idx, outer); + idx = fold_convert_loc (loc, pointer_sized_int_node, idx); + orig_bias = fold_build2_loc (loc, MULT_EXPR, + pointer_sized_int_node, idx, + TYPE_SIZE_UNIT (TREE_TYPE (map_decl))); + } + } + + bias = fold_convert_loc (loc, pointer_sized_int_node, bias); + tree adjusted_bias = fold_build2_loc (loc, MINUS_EXPR, + pointer_sized_int_node, + bias, orig_bias); + return adjusted_bias; +} + /* Lower the OpenACC reductions of CLAUSES for compute axis LEVEL (which might be a placeholder). INNER is true if this is an inner axis of a multi-axis loop. FORK and JOIN are (optional) fork and @@ -7895,9 +8025,107 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, tree orig = OMP_CLAUSE_DECL (c); tree orig_clause; + tree array_type = NULL_TREE; + tree array_addr = NULL_TREE, array_max_idx = NULL_TREE; + tree array_bias = NULL_TREE; tree var; - tree ref_to_res = NULL_TREE; + if (TREE_CODE (orig) == MEM_REF) + { + array_type = TREE_TYPE (orig); + + tree bias = TREE_OPERAND (OMP_CLAUSE_DECL (c), 1); + tree orig_var = TREE_OPERAND (OMP_CLAUSE_DECL (c), 0); + + if (TREE_CODE (orig_var) == POINTER_PLUS_EXPR) + { + tree b = TREE_OPERAND (orig_var, 1); + if (is_omp_target (ctx->stmt)) + b = NULL_TREE; + else + b = maybe_lookup_decl (b, ctx); + if (b == NULL) + { + b = TREE_OPERAND (orig_var, 1); + b = maybe_lookup_decl_in_outer_ctx (b, ctx); + } + if (integer_zerop (bias)) + bias = b; + else + { + bias = fold_convert_loc (loc, + TREE_TYPE (b), bias); + bias = fold_build2_loc (loc, PLUS_EXPR, + TREE_TYPE (b), b, bias); + } + orig_var = TREE_OPERAND (orig_var, 0); + } + if (TREE_CODE (orig_var) == INDIRECT_REF + || TREE_CODE (orig_var) == ADDR_EXPR) + orig_var = TREE_OPERAND (orig_var, 0); + + gcc_assert (DECL_P (orig_var)); + + tree local_orig_var = lookup_decl (orig_var, ctx); + tree priv_addr = local_orig_var; + if (TREE_CODE (TREE_TYPE (priv_addr)) == ARRAY_TYPE) + priv_addr = build_fold_addr_expr (priv_addr); + + tree priv_addr_type = build_pointer_type (array_type); + + /* Peel away MEM_REF to get at base array VAR_DECL. */ + tree addr = TREE_OPERAND (orig, 0); + if (TREE_CODE (addr) == POINTER_PLUS_EXPR) + addr = TREE_OPERAND (addr, 0); + if (TREE_CODE (addr) == ADDR_EXPR) + addr = TREE_OPERAND (addr, 0); + else if (INDIRECT_REF_P (addr)) + addr = TREE_OPERAND (addr, 0); + orig = addr; + + if (omp_privatize_by_reference (orig)) + { + gcc_assert (DECL_HAS_VALUE_EXPR_P (priv_addr) + && (TREE_CODE (DECL_VALUE_EXPR (priv_addr)) + == MEM_REF)); + priv_addr = TREE_OPERAND (DECL_VALUE_EXPR (priv_addr), 0); + } + + tree tmp = fold_build2 (POINTER_PLUS_EXPR, ptr_type_node, + fold_convert (ptr_type_node, priv_addr), + fold_convert (sizetype, bias)); + priv_addr = fold_convert (priv_addr_type, tmp); + + tree addr_var = create_tmp_var (priv_addr_type, + ".array_reduction_addr"); + + gimple_seq s = NULL; + gimplify_assign (addr_var, priv_addr, &s); + gimple_seq_add_seq (&before_fork, s); + + var = create_tmp_var (integer_type_node, + ".array_reduction_data_dep"); + gimple_seq_add_stmt (&before_fork, + gimple_build_assign (var, integer_zero_node)); + + array_addr = addr_var; + array_bias = bias; + array_max_idx + = TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (OMP_CLAUSE_DECL (c)))); + tree t = maybe_lookup_decl (array_max_idx, ctx); + if (t) + array_max_idx = t; + } + else + { + var = OMP_CLAUSE_REDUCTION_PRIVATE_DECL (c); + if (!var) + var = maybe_lookup_decl (orig, ctx); + if (!var) + var = orig; + } + tree incoming, outgoing; + tree ref_to_res = NULL_TREE; bool is_private = false; bool is_fpp = false; @@ -7910,12 +8138,6 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, rcode = BIT_IOR_EXPR; tree op = build_int_cst (unsigned_type_node, rcode); - var = OMP_CLAUSE_REDUCTION_PRIVATE_DECL (c); - if (!var) - var = maybe_lookup_decl (orig, ctx); - if (!var) - var = orig; - incoming = outgoing = var; if (!inner) @@ -7967,7 +8189,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, else if (OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_MAP && (OMP_CLAUSE_MAP_KIND (cls) == GOMP_MAP_FIRSTPRIVATE_POINTER) - && orig == OMP_CLAUSE_DECL (cls)) + && orig == OMP_CLAUSE_DECL (cls) + && !array_addr) { is_fpp = true; goto do_lookup; @@ -7988,21 +8211,65 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, orig_clause = cls; break; } + if (gimple_code (outer->stmt) == GIMPLE_OMP_TARGET + && !orig_clause + && !is_private + && maybe_lookup_field (orig, outer)) + orig_clause = orig; if ((orig_clause != NULL_TREE || is_fpp) && !is_private) { tree type = TREE_TYPE (var); - if (is_fpp) + if (is_fpp && !array_addr) { tree x = create_tmp_var (type); gimplify_assign (x, lookup_decl (orig, outer), fork_seq); ref_to_res = x; } - else + else if (orig_clause) { ref_to_res = build_receiver_ref (orig_clause, false, outer); if (omp_privatize_by_reference (orig)) ref_to_res = build_simple_mem_ref (ref_to_res); + + bool ptr_ptr_array = false; + if (TREE_CODE (TREE_TYPE (orig)) == ARRAY_TYPE + && TREE_CODE (TREE_TYPE (ref_to_res)) == POINTER_TYPE + && (TREE_CODE (TREE_TYPE (TREE_TYPE (ref_to_res))) + == POINTER_TYPE)) + { + ref_to_res = build_simple_mem_ref (ref_to_res); + ptr_ptr_array = true; + } + + if (array_bias) + { + tree map_bias = integer_zero_node; + if (ptr_ptr_array) + map_bias = array_bias; + else + { + tree m = gimple_omp_target_clauses (outer->stmt); + for (; m; m = OMP_CLAUSE_CHAIN (m)) + if (OMP_CLAUSE_CODE (m) == OMP_CLAUSE_MAP) + { + tree md = OMP_CLAUSE_DECL (m); + if (orig == md + || (TREE_CODE (md) == ARRAY_REF + && TREE_OPERAND (md, 0) == orig)) + { + map_bias + = oacc_array_reduction_bias (loc, c, ctx, + m, outer); + break; + } + } + } + tree t = fold_convert (ptr_type_node, ref_to_res); + t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t, + fold_convert (sizetype, map_bias)); + ref_to_res = fold_convert (TREE_TYPE (ref_to_res), t); + } } if (POINTER_TYPE_P (type)) @@ -8016,13 +8283,91 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, /* Try to look at enclosing contexts for reduction var, use original if no mapping found. */ tree t = NULL_TREE; - omp_context *c = ctx->outer; - while (c && !t) + omp_context *cp = ctx->outer; + while (cp) + { + t = maybe_lookup_decl (orig, cp); + if (t) + break; + + cp = cp->outer; + } + + if (array_addr) { - t = maybe_lookup_decl (orig, c); - c = c->outer; + if (t) + { + if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE) + { + if (!is_private) + { + gcc_assert (DECL_SIZE (t) + && (TREE_CODE (DECL_SIZE (t)) + != INTEGER_CST) + && DECL_HAS_VALUE_EXPR_P (t)); + t = DECL_VALUE_EXPR (t); + } + + t = fold_convert (ptr_type_node, + build_fold_addr_expr (t)); + if (array_bias) + t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t, + fold_convert (sizetype, array_bias)); + ref_to_res + = fold_convert (build_pointer_type + (TREE_TYPE (orig)), t); + } + else if (TREE_CODE (TREE_TYPE (t)) == POINTER_TYPE) + { + if (array_bias) + t = build2 (POINTER_PLUS_EXPR, ptr_type_node, t, + fold_convert (sizetype, array_bias)); + ref_to_res + = fold_convert (build_pointer_type + (array_type), t); + } + else + gcc_unreachable (); + } + else + { + gcc_assert (!cp && (gimple_code (ctx->stmt) + == GIMPLE_OMP_TARGET)); + + tree mem_ref = NULL_TREE; + tree mem_ref_clause = NULL_TREE; + tree m = gimple_omp_target_clauses (ctx->stmt); + tree orig_val = (DECL_HAS_VALUE_EXPR_P (orig) + ? DECL_VALUE_EXPR (orig) : orig); + for (; m; m = OMP_CLAUSE_CHAIN (m)) + if (OMP_CLAUSE_CODE (m) == OMP_CLAUSE_MAP) + { + tree md = OMP_CLAUSE_DECL (m); + if (orig_val == md + || (TREE_CODE (md) == MEM_REF + && INDIRECT_REF_P (orig_val) + && (TREE_OPERAND (md, 0) + == TREE_OPERAND (orig_val, 0)))) + { + mem_ref = md; + mem_ref_clause = m; + break; + } + } + gcc_assert (mem_ref); + mem_ref = build_receiver_ref (mem_ref_clause, false, ctx); + + if (array_bias) + mem_ref = build2 (POINTER_PLUS_EXPR, ptr_type_node, + mem_ref, fold_convert (sizetype, + array_bias)); + ref_to_res + = fold_convert (build_pointer_type (TREE_TYPE (orig)), + mem_ref); + } } - incoming = outgoing = (t ? t : orig); + else + incoming = outgoing = (t ? t : orig); } has_outer_reduction:; @@ -8031,6 +8376,9 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, if (!ref_to_res) ref_to_res = integer_zero_node; + if (!array_addr) + array_addr = array_max_idx = integer_zero_node; + if (omp_privatize_by_reference (outgoing)) { outgoing = build_simple_mem_ref (outgoing); @@ -8063,25 +8411,29 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, tree setup_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, - TREE_TYPE (var), 6, setup_code, + TREE_TYPE (var), 8, setup_code, unshare_expr (ref_to_res), unshare_expr (incoming), - level, op, off); + level, op, off, + array_addr, array_max_idx); tree init_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, - TREE_TYPE (var), 6, init_code, + TREE_TYPE (var), 8, init_code, unshare_expr (ref_to_res), - unshare_expr (var), level, op, off); + unshare_expr (var), level, op, off, + array_addr, array_max_idx); tree fini_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, - TREE_TYPE (var), 6, fini_code, + TREE_TYPE (var), 8, fini_code, unshare_expr (ref_to_res), - unshare_expr (var), level, op, off); + unshare_expr (var), level, op, off, + array_addr, array_max_idx); tree teardown_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, - TREE_TYPE (var), 6, teardown_code, + TREE_TYPE (var), 8, teardown_code, ref_to_res, unshare_expr (var), - level, op, off); + level, op, off, + array_addr, array_max_idx); gimplify_assign (unshare_expr (var), setup_call, &before_fork); gimplify_assign (unshare_expr (var), init_call, &after_fork); @@ -14283,11 +14635,10 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER && !OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c) - && TREE_CODE (TREE_TYPE (ovar)) == ARRAY_TYPE) + && TREE_CODE (TREE_TYPE (ovar)) == ARRAY_TYPE + && offloaded) { - gcc_assert (offloaded); - tree avar - = create_tmp_var (TREE_TYPE (TREE_TYPE (x))); + tree avar = create_tmp_var (TREE_TYPE (TREE_TYPE (x))); mark_addressable (avar); gimplify_assign (avar, build_fold_addr_expr (var), &ilist); talign = DECL_ALIGN_UNIT (avar); @@ -15279,8 +15630,29 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) gimple_build_assign (t, x)); x = build_fold_addr_expr_loc (clause_loc, t); } - gimple_seq_add_stmt (&new_body, - gimple_build_assign (new_var, x)); + if (offloaded && is_gimple_omp_oacc (ctx->stmt) + && OMP_CLAUSE_MAP_IN_REDUCTION (prev) + && TREE_CODE (type) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE + && !TREE_CONSTANT (TYPE_SIZE_UNIT (TREE_TYPE (type)))) + { + tree array_type = TREE_TYPE (type); + tree atmp + = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN); + tree al = size_int (TYPE_ALIGN (array_type)); + tree sz = TYPE_SIZE_UNIT (array_type); + tree call = build_call_expr_loc (OMP_CLAUSE_LOCATION (c), + atmp, 2, sz, al); + gimplify_assign (x, call, &new_body); + + /* In some cases, we need to preserve the pointer to array + type, as it will be passed into OpenACC reduction + internal-fns, and we require the type for proper copy + generation. */ + TREE_TYPE (x) = TREE_TYPE (new_var); + } + gimple *g = gimple_build_assign (new_var, x); + gimple_seq_add_stmt (&new_body, g); prev = NULL_TREE; } else if (OMP_CLAUSE_CHAIN (c) diff --git a/gcc/omp-oacc-neuter-broadcast.cc b/gcc/omp-oacc-neuter-broadcast.cc index 52addad..1336aa9 100644 --- a/gcc/omp-oacc-neuter-broadcast.cc +++ b/gcc/omp-oacc-neuter-broadcast.cc @@ -148,6 +148,34 @@ local_var_based_p (tree decl) } } +static bool +local_assignment_p (gimple *stmt, hash_set<tree> *array_reduction_base_vars) +{ + if (is_gimple_assign (stmt) + || gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN)) + { + tree lhs = (is_gimple_assign (stmt) + ? gimple_assign_lhs (stmt) : gimple_call_lhs (stmt)); + if (TREE_CODE (lhs) == SSA_NAME + && array_reduction_base_vars->contains (lhs)) + { + use_operand_p use_p; + ssa_op_iter iter; + FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE) + { + tree use = USE_FROM_PTR (use_p); + if (TREE_CODE (use) != SSA_NAME + || SSA_NAME_IS_DEFAULT_DEF (use)) + continue; + if (!array_reduction_base_vars->contains (use)) + return false; + } + return true; + } + } + return false; +} + /* Map of basic blocks to gimple stmts. */ typedef hash_map<basic_block, gimple *> bb_stmt_map_t; @@ -991,7 +1019,8 @@ worker_single_copy (basic_block from, basic_block to, hash_set<tree> *worker_partitioned_uses, tree record_type, record_field_map_t *record_field_map, unsigned HOST_WIDE_INT placement, - bool isolate_broadcasts, bool has_gang_private_write) + bool isolate_broadcasts, bool has_gang_private_write, + hash_set<tree> *array_reduction_base_vars) { /* If we only have virtual defs, we'll have no record type, but we still want to emit single_copy_start and (particularly) single_copy_end to act as @@ -1015,6 +1044,31 @@ worker_single_copy (basic_block from, basic_block to, edge e = split_block (to, gsi_stmt (gsi)); basic_block barrier_block = e->dest; + gimple_seq local_asgns = NULL; + + /* For accesses of variables used in array reductions, instead of + propagating the value for the main thread to all other worker threads + (which doesn't make sense as a reduction private var), move the defs + of such SSA_NAMEs to before the copy block and leave them alone (each + thread should access their own local copy). */ + for (gimple_stmt_iterator i = gsi_after_labels (from); !gsi_end_p (i);) + { + gimple *stmt = gsi_stmt (i); + if (local_assignment_p (stmt, array_reduction_base_vars)) + { + gsi_remove (&i, false); + gimple_seq_add_stmt (&local_asgns, stmt); + } + else + gsi_next (&i); + } + + if (dump_file) + { + fprintf (dump_file, "Local assignments to be moved:\n"); + print_gimple_seq (dump_file, local_asgns, 0, TDF_NONE); + } + gimple_stmt_iterator start = gsi_after_labels (from); tree decl = builtin_decl_explicit (BUILT_IN_GOACC_SINGLE_COPY_START); @@ -1029,6 +1083,9 @@ worker_single_copy (basic_block from, basic_block to, gsi_insert_before (&start, call, GSI_NEW_STMT); update_stmt (call); + if (local_asgns) + gsi_insert_seq_before (&start, local_asgns, GSI_SAME_STMT); + /* The shared-memory range for this block overflowed. Add a barrier before the GOACC_single_copy_start call. */ if (isolate_broadcasts) @@ -1128,6 +1185,12 @@ worker_single_copy (basic_block from, basic_block to, if (gimple_nop_p (def_stmt)) continue; + /* For accesses of variables used in array reductions, skip creating + the barrier phi. Each thread runs same def_stmt to access + local variable, there is no main/worker divide here. */ + if (local_assignment_p (def_stmt, array_reduction_base_vars)) + continue; + /* The barrier phi takes one result from the actual work of the block we're neutering, and the other result is constant zero of the same type. */ @@ -1248,7 +1311,8 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask, hash_set<tree> *partitioned_var_uses, record_field_map_t *record_field_map, blk_offset_map_t *blk_offset_map, - bitmap writes_gang_private) + bitmap writes_gang_private, + hash_set<tree> *array_reduction_base_vars) { unsigned mask = outer_mask | par->mask; @@ -1398,7 +1462,8 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask, &worker_partitioned_uses, record_type, record_field_map, offset, !range_allocated, - has_gang_private_write); + has_gang_private_write, + array_reduction_base_vars); } else worker_single_simple (block, block, &def_escapes_block); @@ -1436,11 +1501,13 @@ neuter_worker_single (parallel_g *par, unsigned outer_mask, if (par->inner) neuter_worker_single (par->inner, mask, worker_single, vector_single, prop_set, partitioned_var_uses, record_field_map, - blk_offset_map, writes_gang_private); + blk_offset_map, writes_gang_private, + array_reduction_base_vars); if (par->next) neuter_worker_single (par->next, outer_mask, worker_single, vector_single, prop_set, partitioned_var_uses, record_field_map, - blk_offset_map, writes_gang_private); + blk_offset_map, writes_gang_private, + array_reduction_base_vars); } static void @@ -1587,7 +1654,8 @@ merge_ranges (splay_tree accum, splay_tree sp) static void oacc_do_neutering (unsigned HOST_WIDE_INT bounds_lo, - unsigned HOST_WIDE_INT bounds_hi) + unsigned HOST_WIDE_INT bounds_hi, + hash_set<tree> *array_reduction_base_vars) { bb_stmt_map_t bb_stmt_map; auto_bitmap worker_single, vector_single; @@ -1792,7 +1860,8 @@ oacc_do_neutering (unsigned HOST_WIDE_INT bounds_lo, neuter_worker_single (par, mask, worker_single, vector_single, &prop_set, &partitioned_var_uses, &record_field_map, - &blk_offset_map, writes_gang_private); + &blk_offset_map, writes_gang_private, + array_reduction_base_vars); record_field_map.empty (); @@ -1831,6 +1900,9 @@ execute_omp_oacc_neuter_broadcast () private_size[i] = 0; } + /* Set of base variables referencing arrays used in array reductions. */ + hash_set<tree> array_reduction_base_vars; + /* Calculate shared memory size required for reduction variables and gang-private memory for this offloaded function. */ basic_block bb; @@ -1869,6 +1941,42 @@ execute_omp_oacc_neuter_broadcast () + tree_to_uhwi (TYPE_SIZE_UNIT (var_type))); reduction_size[level] = MAX (reduction_size[level], limit); + + tree array_addr = gimple_call_arg (call, 6); + if (!integer_zerop (array_addr) + && TREE_CODE (array_addr) == SSA_NAME) + { + /* For array reductions, trace all dependent SSA names + and add to array_reduction_base_vars, which makes + them replicated private for each thread, instead + of being copied around. */ + auto_vec<tree> ssa_name_queue; + ssa_name_queue.safe_push (array_addr); + + tree array_max_idx = gimple_call_arg (call, 7); + if (TREE_CODE (array_max_idx) == SSA_NAME) + ssa_name_queue.safe_push (array_max_idx); + + while (ssa_name_queue.length ()) + { + tree t = ssa_name_queue.pop (); + + array_reduction_base_vars.add (t); + gimple *g = SSA_NAME_DEF_STMT (t); + + use_operand_p use_p; + ssa_op_iter iter; + FOR_EACH_SSA_USE_OPERAND (use_p, g, iter, + SSA_OP_USE) + { + tree use = USE_FROM_PTR (use_p); + if (TREE_CODE (use) != SSA_NAME + || SSA_NAME_IS_DEFAULT_DEF (use)) + continue; + ssa_name_queue.safe_push (use); + } + } + } } } break; @@ -1917,7 +2025,7 @@ execute_omp_oacc_neuter_broadcast () /* Perform worker partitioning unless we know 'num_workers(1)'. */ if (dims[GOMP_DIM_WORKER] != 1) - oacc_do_neutering (bounds_lo, bounds_hi); + oacc_do_neutering (bounds_lo, bounds_hi, &array_reduction_base_vars); return 0; } diff --git a/gcc/omp-offload.cc b/gcc/omp-offload.cc index b5f54b2..3218f69 100644 --- a/gcc/omp-offload.cc +++ b/gcc/omp-offload.cc @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see #include "stringpool.h" #include "attribs.h" #include "cfgloop.h" +#include "cfghooks.h" #include "context.h" #include "convert.h" #include "opts.h" @@ -2151,6 +2152,129 @@ default_goacc_fork_join (gcall *ARG_UNUSED (call), return targetm.have_oacc_join (); } +void +oacc_build_array_copy (tree dst, tree src, tree max_idx, gimple_seq *seq) +{ + push_gimplify_context (true); + + tree len = fold_build2 (PLUS_EXPR, size_type_node, max_idx, size_int (1)); + tree ptr_to_array = (TREE_TYPE (dst) == ptr_type_node ? src : dst); + tree elem_type; + if (TREE_CODE (TREE_TYPE (ptr_to_array)) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (ptr_to_array))) == ARRAY_TYPE) + elem_type = TREE_TYPE (TREE_TYPE (TREE_TYPE (ptr_to_array))); + else + elem_type = TREE_TYPE (TREE_TYPE (ptr_to_array)); + tree elem_size = TYPE_SIZE_UNIT (elem_type); + tree size = fold_build2 (MULT_EXPR, size_type_node, len, elem_size); + + tree memcpy_decl = builtin_decl_implicit (BUILT_IN_MEMCPY); + tree call = build_call_expr (memcpy_decl, 3, dst, src, size); + gimplify_and_add (call, seq); + pop_gimplify_context (NULL); +} + +void +oacc_build_array_copy_loop (location_t loc, tree dst, tree src, tree max_idx, + gimple_stmt_iterator *gsi) +{ + push_gimplify_context (true); + + tree loop_index; + gimple_stmt_iterator loop_body_gsi; + oacc_build_indexed_ssa_loop (loc, max_idx, gsi, + &loop_index, &loop_body_gsi); + gimple_seq copy_seq = NULL; + + tree dst_array_type = TREE_TYPE (TREE_TYPE (dst)); + tree dst_elem_type = build_qualified_type (TREE_TYPE (dst_array_type), + TYPE_QUALS (dst_array_type)); + tree dst_elem_ptr_type = build_pointer_type (dst_elem_type); + tree dst_ptr = fold_convert (dst_elem_ptr_type, dst); + + tree src_array_type = TREE_TYPE (TREE_TYPE (src)); + tree src_elem_type = build_qualified_type (TREE_TYPE (src_array_type), + TYPE_QUALS (src_array_type)); + tree src_elem_ptr_type = build_pointer_type (src_elem_type); + tree src_ptr = fold_convert (src_elem_ptr_type, src); + + tree offset = build2 (MULT_EXPR, sizetype, + loop_index, TYPE_SIZE_UNIT (dst_elem_type)); + + dst_ptr = build2 (POINTER_PLUS_EXPR, dst_elem_ptr_type, dst_ptr, offset); + src_ptr = build2 (POINTER_PLUS_EXPR, src_elem_ptr_type, src_ptr, offset); + + tree dst_mem_ref = build_simple_mem_ref (dst_ptr); + tree src_mem_ref = build_simple_mem_ref (src_ptr); + + gimplify_assign (dst_mem_ref, src_mem_ref, ©_seq); + + gsi_insert_seq_before (&loop_body_gsi, copy_seq, GSI_SAME_STMT); + pop_gimplify_context (NULL); +} + +void +oacc_build_indexed_ssa_loop (location_t loc, tree max_index, + gimple_stmt_iterator *gsi, tree *out_loop_index, + gimple_stmt_iterator *out_loop_body_code_gsi) +{ + gimple *g; + gimple_seq seq = NULL; + + tree init_index = make_ssa_name (TREE_TYPE (max_index)); + tree loop_index = make_ssa_name (TREE_TYPE (max_index)); + tree update_index = make_ssa_name (TREE_TYPE (max_index)); + + g = gimple_build_assign (init_index, + build_int_cst (TREE_TYPE (init_index), 0)); + gimple_seq_add_stmt (&seq, g); + + gimple *init_end = gimple_seq_last (seq); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + basic_block init_bb = gsi_bb (*gsi); + edge init_edge = split_block (init_bb, init_end); + basic_block loop_bb = init_edge->dest; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + seq = NULL; + g = gimple_build_assign (update_index, PLUS_EXPR, loop_index, + build_int_cst (TREE_TYPE (loop_index), 1)); + gimple_seq_add_stmt (&seq, g); + + g = gimple_build_cond (LE_EXPR, update_index, max_index, NULL, NULL); + gimple_seq_add_stmt (&seq, g); + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + + edge post_edge = split_block (loop_bb, g); + basic_block post_bb = post_edge->dest; + loop_bb = post_edge->src; + /* Reset the iterator. */ + *gsi = gsi_for_stmt (gsi_stmt (*gsi)); + + /* Return place where we insert loop body code. */ + gimple_stmt_iterator loop_body_code_gsi = gsi_start_bb (loop_bb); + + post_edge->flags ^= EDGE_FALSE_VALUE | EDGE_FALLTHRU; + post_edge->probability = profile_probability::even (); + edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_TRUE_VALUE); + loop_edge->probability = profile_probability::even (); + set_immediate_dominator (CDI_DOMINATORS, loop_bb, init_bb); + set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb); + class loop *new_loop = alloc_loop (); + new_loop->header = loop_bb; + new_loop->latch = loop_bb; + add_loop (new_loop, loop_bb->loop_father); + + gphi *phi = create_phi_node (loop_index, loop_bb); + add_phi_arg (phi, init_index, init_edge, loc); + add_phi_arg (phi, update_index, loop_edge, loc); + + *out_loop_index = loop_index; + *out_loop_body_code_gsi = loop_body_code_gsi; +} + /* Default goacc.reduction early expander. LHS-opt = IFN_REDUCTION (KIND, RES_PTR, VAR, LEVEL, OP, OFFSET) @@ -2176,29 +2300,44 @@ default_goacc_reduction (gcall *call) if there is one. */ tree ref_to_res = gimple_call_arg (call, 1); + tree array_addr = gimple_call_arg (call, 6); + tree array_max_idx = gimple_call_arg (call, 7); + if (!integer_zerop (ref_to_res)) { - /* Dummy reduction vars that have GOMP_MAP_FIRSTPRIVATE_POINTER data - mappings gets retyped to (void *). Adjust the type of ref_to_res - as appropriate. */ - if (TREE_TYPE (TREE_TYPE (ref_to_res)) != TREE_TYPE (var)) + if (!integer_zerop (array_addr)) { - tree ptype = build_pointer_type (TREE_TYPE (var)); - tree t = make_ssa_name (ptype); - tree expr = fold_build1 (NOP_EXPR, ptype, ref_to_res); - gimple_seq_add_stmt (&seq, gimple_build_assign (t, expr)); - ref_to_res = t; + tree dst, src; + if (code == IFN_GOACC_REDUCTION_SETUP) + dst = array_addr, src = ref_to_res; + else + src = array_addr, dst = ref_to_res; + oacc_build_array_copy (dst, src, array_max_idx, &seq); } - tree dst = build_simple_mem_ref (ref_to_res); - tree src = var; - - if (code == IFN_GOACC_REDUCTION_SETUP) + else { - src = dst; - dst = lhs; - lhs = NULL; + /* Dummy reduction vars that have GOMP_MAP_FIRSTPRIVATE_POINTER data + mappings gets retyped to (void *). Adjust the type of ref_to_res + as appropriate. */ + if (TREE_TYPE (TREE_TYPE (ref_to_res)) != TREE_TYPE (var)) + { + tree ptype = build_pointer_type (TREE_TYPE (var)); + tree t = make_ssa_name (ptype); + tree expr = fold_build1 (NOP_EXPR, ptype, ref_to_res); + gimple_seq_add_stmt (&seq, gimple_build_assign (t, expr)); + ref_to_res = t; + } + tree dst = build_simple_mem_ref (ref_to_res); + tree src = var; + + if (code == IFN_GOACC_REDUCTION_SETUP) + { + src = dst; + dst = lhs; + lhs = NULL; + } + gimple_seq_add_stmt (&seq, gimple_build_assign (dst, src)); } - gimple_seq_add_stmt (&seq, gimple_build_assign (dst, src)); } } diff --git a/gcc/omp-offload.h b/gcc/omp-offload.h index 9911083..c8b697a 100644 --- a/gcc/omp-offload.h +++ b/gcc/omp-offload.h @@ -26,6 +26,16 @@ extern int oacc_get_default_dim (int dim); extern int oacc_get_min_dim (int dim); extern int oacc_fn_attrib_level (tree attr); +/* Used by both NVPTX/GCN OpenACC code. */ +extern void oacc_build_indexed_ssa_loop (location_t loc, tree max_index, + gimple_stmt_iterator *gsi, + tree *out_loop_index, + gimple_stmt_iterator *out_loop_body_code_gsi); +extern void oacc_build_array_copy (tree dst, tree src, tree max_idx, gimple_seq *seq); +extern void oacc_build_array_copy_loop (location_t loc, tree dst, tree src, + tree max_idx, + gimple_stmt_iterator *gsi); + extern GTY(()) vec<tree, va_gc> *offload_funcs; extern GTY(()) vec<tree, va_gc> *offload_vars; extern GTY(()) vec<tree, va_gc> *offload_ind_funcs; diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-2.c b/gcc/testsuite/c-c++-common/goacc/readonly-2.c index 3f52a9f..def81c2 100644 --- a/gcc/testsuite/c-c++-common/goacc/readonly-2.c +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c @@ -12,5 +12,5 @@ int main (void) r = a[8]; } } -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */ -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */ +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-11.c b/gcc/testsuite/c-c++-common/goacc/reduction-11.c new file mode 100644 index 0000000..29eb4b5 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-11.c @@ -0,0 +1,81 @@ +/* { dg-additional-options "-fdump-tree-gimple" } */ +/* Integer array reductions. */ + +#define n 1000 + +int +main(void) +{ + int i, j; + int result[n], array[n]; + int lresult[n]; + + /* '+' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (+:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] += array[i]; + + /* '*' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (*:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] *= array[i]; + + /* 'max' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (max:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] > array[i] ? result[j] : array[i]; + + /* 'min' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (min:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] < array[i] ? result[j] : array[i]; + + /* '&' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (&:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] &= array[i]; + + /* '|' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (|:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] |= array[i]; + + /* '^' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (^:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] ^= array[i]; + + /* '&&' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (&&:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] && (result[j] > array[i]); + + /* '||' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (||:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] || (result[j] > array[i]); + + return 0; +} + +/* Check that default copy maps are generated for loop reductions. */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 9 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-12.c b/gcc/testsuite/c-c++-common/goacc/reduction-12.c new file mode 100644 index 0000000..e9dcc1c --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-12.c @@ -0,0 +1,60 @@ +/* { dg-additional-options "-fdump-tree-gimple" } */ +/* float array reductions. */ + +#define n 1000 + +int +main(void) +{ + int i, j; + float result[n], array[n]; + int lresult[n]; + + /* '+' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (+:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] += array[i]; + + /* '*' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (*:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] *= array[i]; + + /* 'max' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (max:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] > array[i] ? result[j] : array[i]; + + /* 'min' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (min:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] < array[i] ? result[j] : array[i]; + + /* '&&' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (&&:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] && (result[j] > array[i]); + + /* '||' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (||:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] || (result[j] > array[i]); + + return 0; +} + +/* Check that default copy maps are generated for loop reductions. */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 6 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-13.c b/gcc/testsuite/c-c++-common/goacc/reduction-13.c new file mode 100644 index 0000000..8800b44 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-13.c @@ -0,0 +1,60 @@ +/* { dg-additional-options "-fdump-tree-gimple" } */ +/* double array reductions. */ + +#define n 1000 + +int +main(void) +{ + int i, j; + double result[n], array[n]; + int lresult[n]; + + /* '+' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (+:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] += array[i]; + + /* '*' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (*:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] *= array[i]; + + /* 'max' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (max:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] > array[i] ? result[j] : array[i]; + + /* 'min' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (min:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] = result[j] < array[i] ? result[j] : array[i]; + + /* '&&' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (&&:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] && (result[j] > array[i]); + + /* '||' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (||:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] || (result[j] > array[i]); + + return 0; +} + +/* Check that default copy maps are generated for loop reductions. */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 6 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-14.c b/gcc/testsuite/c-c++-common/goacc/reduction-14.c new file mode 100644 index 0000000..48117a3 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-14.c @@ -0,0 +1,46 @@ +/* { dg-additional-options "-fdump-tree-gimple" } */ +/* complex array reductions. */ + +#define n 1000 + +int +main(void) +{ + int i, j; + __complex__ double result[n], array[n]; + int lresult[n]; + + /* '+' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (+:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] += array[i]; + + /* '*' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (*:result) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + result[j] *= array[i]; + + /* '&&' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (&&:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] && (__real__(result[j]) > __real__(array[i])); + + /* '||' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (||:lresult) + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + lresult[j] = lresult[j] || (__real__(result[j]) > __real__(array[i])); + + return 0; +} + +/* Check that default copy maps are generated for loop reductions. */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 4 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:lresult \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-15.c b/gcc/testsuite/c-c++-common/goacc/reduction-15.c new file mode 100644 index 0000000..f01d988 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-15.c @@ -0,0 +1,51 @@ +/* { dg-additional-options "-fdump-tree-gimple" } */ +/* struct reductions. */ + +typedef struct { int x, y; } int_pair; +typedef struct { float m, n; } flt_pair; +typedef struct +{ + int i; + double d; + float f; + int a[4]; + int_pair ip; + flt_pair fp; +} rectype; + +#define n 1000 + +int +main(void) +{ + int i; + rectype result, array[n]; + + /* '+' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (+:result) + for (i = 0; i < n; i++) + { + result.i += array[i].i; + result.f += array[i].f; + result.ip.x += array[i].ip.x; + result.ip.y += array[i].ip.y; + } + + /* '*' reductions. */ +#pragma acc parallel +#pragma acc loop gang worker vector reduction (*:result) + for (i = 0; i < n; i++) + { + result.i *= array[i].i; + result.f *= array[i].f; + result.ip.x *= array[i].ip.x; + result.ip.y *= array[i].ip.y; + } + + return 0; +} + +/* Check that default copy maps are generated for loop reductions. */ +/* { dg-final { scan-tree-dump-times "map\\(tofrom:result \\\[len: \[0-9\]+\\\] \\\[runtime_implicit\\\]\\)" 2 "gimple" } } */ + diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-16.c b/gcc/testsuite/c-c++-common/goacc/reduction-16.c new file mode 100644 index 0000000..6fb7054 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/reduction-16.c @@ -0,0 +1,30 @@ +/* { dg-compile } */ +#include <stdlib.h> + +int foo (int n) +{ + int x[5][5]; + int y[n]; + int *z = (int *) malloc (5 * sizeof (int)); + + #pragma acc parallel + { + #pragma acc loop reduction(+:x) + for (int i = 0; i < 5; i++) ; + #pragma acc loop reduction(+:y) + for (int i = 0; i < 5; i++) ; + + #pragma acc loop reduction(+:x[2:1][0:5]) + for (int i = 0; i < 5; i++) ; + #pragma acc loop reduction(+:x[0:5][2:1]) /* { dg-error "array section is not contiguous in 'reduction' clause" } */ + for (int i = 0; i < 5; i++) ; + + #pragma acc loop reduction(+:y[0:5]) + for (int i = 0; i < 5; i++) ; + + #pragma acc loop reduction(+:z[0:5]) + for (int i = 0; i < 5; i++) ; + } + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/goacc/reduction-9.c b/gcc/testsuite/c-c++-common/goacc/reduction-9.c index eba1d02..72b2f07 100644 --- a/gcc/testsuite/c-c++-common/goacc/reduction-9.c +++ b/gcc/testsuite/c-c++-common/goacc/reduction-9.c @@ -19,15 +19,15 @@ test_parallel () for (i = 0; i < 10; i++) s1.b[3] += 1; -#pragma acc parallel reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[2].a += 1; -#pragma acc parallel reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[3].b[4] += 1; -#pragma acc parallel reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel reduction(+:z[5]) for (i = 0; i < 10; i++) z[5] += 1; } @@ -51,15 +51,15 @@ test_combined () for (i = 0; i < 10; i++) s1.b[3] += 1; -#pragma acc parallel loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[2].a += 1; -#pragma acc parallel loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[3].b[4] += 1; -#pragma acc parallel loop reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc parallel loop reduction(+:z[5]) for (i = 0; i < 10; i++) z[5] += 1; @@ -86,15 +86,15 @@ test_loops () for (i = 0; i < 10; i++) s1.b[3] += 1; -#pragma acc loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc loop reduction(+:s2[2].a) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[2].a += 1; -#pragma acc loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc loop reduction(+:s2[3].b[4]) /* { dg-error "expected '\\\)' before '\\\.' token" } */ for (i = 0; i < 10; i++) s2[3].b[4] += 1; -#pragma acc loop reduction(+:z[5]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ +#pragma acc loop reduction(+:z[5]) for (i = 0; i < 10; i++) z[5] += 1; } diff --git a/gcc/testsuite/g++.dg/goacc/reductions-1.C b/gcc/testsuite/g++.dg/goacc/reductions-1.C index 18f43f4..829a658 100644 --- a/gcc/testsuite/g++.dg/goacc/reductions-1.C +++ b/gcc/testsuite/g++.dg/goacc/reductions-1.C @@ -73,19 +73,19 @@ test_parallel () for (i = 0; i < 100; i++) c1.get_d ()[1] += 1; -#pragma acc parallel reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].a += 1; -#pragma acc parallel reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_b () += 1; -#pragma acc parallel reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].c[1] += 1; -#pragma acc parallel reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_d ()[1] += 1; @@ -109,19 +109,19 @@ test_parallel () c2.get_d ()[1] += 1; -#pragma acc parallel reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].a += 1; -#pragma acc parallel reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_b () += 1; -#pragma acc parallel reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].c[1] += 1; -#pragma acc parallel reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_d ()[1] += 1; @@ -144,19 +144,19 @@ test_parallel () for (i = 0; i < 100; i++) s1.get_d ()[1] += 1; -#pragma acc parallel reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].a += 1; -#pragma acc parallel reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].get_b () += 1; -#pragma acc parallel reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].c[1] += 1; -#pragma acc parallel reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].get_d ()[1] += 1; @@ -179,30 +179,30 @@ test_parallel () for (i = 0; i < 100; i++) s2.get_d ()[1] += 1; -#pragma acc parallel reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].a += 1; -#pragma acc parallel reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_b () += 1; -#pragma acc parallel reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].c[1] += 1; -#pragma acc parallel reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_d ()[1] += 1; // Reductions on arrays. -#pragma acc parallel reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) a[10] += 1; -#pragma acc parallel reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) b[10] += 1; } @@ -236,19 +236,19 @@ test_combined () for (i = 0; i < 100; i++) c1.get_d ()[1] += 1; -#pragma acc parallel loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].a += 1; -#pragma acc parallel loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_b () += 1; -#pragma acc parallel loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].c[1] += 1; -#pragma acc parallel loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_d ()[1] += 1; @@ -272,19 +272,19 @@ test_combined () c2.get_d ()[1] += 1; -#pragma acc parallel loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].a += 1; -#pragma acc parallel loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_b () += 1; -#pragma acc parallel loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].c[1] += 1; -#pragma acc parallel loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_d ()[1] += 1; @@ -307,19 +307,19 @@ test_combined () for (i = 0; i < 100; i++) s1.get_d ()[1] += 1; -#pragma acc parallel loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].a += 1; -#pragma acc parallel loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].get_b () += 1; -#pragma acc parallel loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].c[1] += 1; -#pragma acc parallel loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].get_d ()[1] += 1; @@ -342,30 +342,30 @@ test_combined () for (i = 0; i < 100; i++) s2.get_d ()[1] += 1; -#pragma acc parallel loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].a += 1; -#pragma acc parallel loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_b () += 1; -#pragma acc parallel loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].c[1] += 1; -#pragma acc parallel loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_d ()[1] += 1; // Reductions on arrays. -#pragma acc parallel loop reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) a[10] += 1; -#pragma acc parallel loop reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc parallel loop reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) b[10] += 1; } @@ -402,19 +402,19 @@ test_loop () for (i = 0; i < 100; i++) c1.get_d ()[1] += 1; -#pragma acc loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].a += 1; -#pragma acc loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_b () += 1; -#pragma acc loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].c[1] += 1; -#pragma acc loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c1a[1].get_d ()[1] += 1; @@ -438,19 +438,19 @@ test_loop () c2.get_d ()[1] += 1; -#pragma acc loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].a += 1; -#pragma acc loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c2a[1].get_b ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_b () += 1; -#pragma acc loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].c[1] += 1; -#pragma acc loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:c2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) c2a[1].get_d ()[1] += 1; @@ -473,19 +473,19 @@ test_loop () for (i = 0; i < 100; i++) s1.get_d ()[1] += 1; -#pragma acc loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s1a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].a += 1; -#pragma acc loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s1a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.. token" } for (i = 0; i < 100; i++) s1a[1].get_b () += 1; -#pragma acc loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s1a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].c[1] += 1; -#pragma acc loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s1a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s1a[1].get_d ()[1] += 1; @@ -508,30 +508,30 @@ test_loop () for (i = 0; i < 100; i++) s2.get_d ()[1] += 1; -#pragma acc loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s2a[1].a) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].a += 1; -#pragma acc loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s2a[1].get_b ()) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_b () += 1; -#pragma acc loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s2a[1].c[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].c[1] += 1; -#pragma acc loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:s2a[1].get_d ()[1]) // { dg-error "expected '\\\)' before '\\\.' token" } for (i = 0; i < 100; i++) s2a[1].get_d ()[1] += 1; // Reductions on arrays. -#pragma acc loop reduction(+:a[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:a[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) a[10] += 1; -#pragma acc loop reduction(+:b[10]) // { dg-error "expected '\\\)' before '\\\[' token" } +#pragma acc loop reduction(+:b[10]) // { dg-error "zero length array section in 'reduction' clause" } for (i = 0; i < 100; i++) b[10] += 1; } diff --git a/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90 b/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90 index 5c489ff..f521bcc 100644 --- a/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/array-reduction.f90 @@ -1,39 +1,42 @@ +! { dg-do compile } +! { dg-additional-options "-fdump-tree-gimple" } + program test implicit none integer a(10), i a(:) = 0 - + ! Array reductions. - - !$acc parallel reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" } + + !$acc parallel reduction (+:a) do i = 1, 10 a = a + 1 end do !$acc end parallel !$acc parallel - !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a) do i = 1, 10 a = a + 1 end do !$acc end parallel !$acc kernels - !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a) do i = 1, 10 a = a + 1 end do !$acc end kernels - !$acc serial reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc serial reduction (+:a) do i = 1, 10 a = a + 1 end do !$acc end serial !$acc serial - !$acc loop reduction (+:a) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a) do i = 1, 10 a = a + 1 end do @@ -41,35 +44,35 @@ program test ! Subarray reductions. - - !$acc parallel reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" } + + !$acc parallel reduction (+:a(1:5)) do i = 1, 10 a = a + 1 end do !$acc end parallel !$acc parallel - !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1:5)) do i = 1, 10 a = a + 1 end do !$acc end parallel !$acc kernels - !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1:5)) do i = 1, 10 a = a + 1 end do !$acc end kernels - !$acc serial reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc serial reduction (+:a(1:5)) do i = 1, 10 a = a + 1 end do !$acc end serial !$acc serial - !$acc loop reduction (+:a(1:5)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1:5)) do i = 1, 10 a = a + 1 end do @@ -77,35 +80,35 @@ program test ! Reductions on array elements. - - !$acc parallel reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" } + + !$acc parallel reduction (+:a(1)) do i = 1, 10 a(1) = a(1) + 1 end do !$acc end parallel !$acc parallel - !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1)) do i = 1, 10 a(1) = a(1) + 1 end do !$acc end parallel !$acc kernels - !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1)) do i = 1, 10 a(1) = a(1) + 1 end do !$acc end kernels - - !$acc serial reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" } + + !$acc serial reduction (+:a(1)) do i = 1, 10 a(1) = a(1) + 1 end do !$acc end serial !$acc serial - !$acc loop reduction (+:a(1)) ! { dg-error "Array 'a' is not permitted in reduction" } + !$acc loop reduction (+:a(1)) do i = 1, 10 a(1) = a(1) + 1 end do @@ -114,3 +117,7 @@ program test print *, a end program test + +! { dg-final { scan-tree-dump-times "(?n)#pragma acc loop private\\(i\\) reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\)" 9 "gimple" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_parallel reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\) map\\(tofrom:a \\\[len: \[0-9\]+\\\]\\)" 3 "gimple" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_serial reduction\\(\\+:MEM.*\\\[.*&a.*\\\]\\) map\\(tofrom:a \\\[len: \[0-9\]+\\\]\\)" 3 "gimple" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90 index 6a16c8a..a835937 100644 --- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data-2.f90 @@ -9,17 +9,17 @@ type(t) :: var allocate (var%arr(1:100)) !$acc enter data copyin(var%arr(10:20)) -! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } } !$acc exit data delete(var%arr(10:20)) -! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } } !$acc enter data create(var%arr(20:30)) -! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\);$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma acc enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\);$} 1 "original" } } !$acc exit data finalize delete(var%arr(20:30)) -! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\) finalize;$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma acc exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:var\.arr\.data \[bias: D.[0-9]+\]\) finalize;$} 1 "original" } } !$acc enter data copyin(var%arr) diff --git a/gcc/testsuite/gfortran.dg/goacc/finalize-1.f b/gcc/testsuite/gfortran.dg/goacc/finalize-1.f index 63beb47..b34c070 100644 --- a/gcc/testsuite/gfortran.dg/goacc/finalize-1.f +++ b/gcc/testsuite/gfortran.dg/goacc/finalize-1.f @@ -20,7 +20,7 @@ ! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(delete:del_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } } !$ACC EXIT DATA FINALIZE DELETE (del_f_p(2:5)) -! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.0\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) del_f_p\\.data \\\[pointer assign, bias: \\(.*int.*\\) parm\\.0\\.data - \\(.*int.*\\) del_f_p\\.data\\\]\\) finalize;$" 1 "original" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.0\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) del_f_p\\.data \\\[pointer assign, bias: D.\[0-9\]+\\\]\\) finalize;$" 1 "original" } } ! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(delete:MEM <\[^>\]+> \\\[\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\)_\[0-9\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(delete:del_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:del_f_p\\.data \\\[pointer assign, bias: \[^\\\]\]+\\\]\\) finalize$" 1 "gimple" } } !$ACC EXIT DATA COPYOUT (cpo_r) @@ -32,6 +32,6 @@ ! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(force_from:cpo_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } } !$ACC EXIT DATA COPYOUT (cpo_f_p(4:10)) FINALIZE -! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.1\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) cpo_f_p\\.data \\\[pointer assign, bias: \\(.*int.*\\) parm\\.1\\.data - \\(.*int.*\\) cpo_f_p\\.data\\\]\\) finalize;$" 1 "original" } } +! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:\\*\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\) parm\\.1\\.data \\\[len: \[^\\\]\]+\\\]\\) map\\(release:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:\\(integer\\(kind=1\\)\\\[0:\\\] \\* restrict\\) cpo_f_p\\.data \\\[pointer assign, bias: D.\[0-9\]+\\\]\\) finalize;$" 1 "original" } } ! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_exit_data map\\(force_from:MEM <\[^>\]+> \\\[\\(integer\\(kind=.\\)\\\[0:\\\] \\*\\)_\[0-9\]+\\\] \\\[len: \[^\\\]\]+\\\]\\) map\\(delete:cpo_f_p \\\[pointer set, len: \[0-9\]+\\\]\\) map\\(alloc:cpo_f_p\\.data \\\[pointer assign, bias: \[^\\\]\]+\\\]\\) finalize$" 1 "gimple" } } END SUBROUTINE f diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 index 1a26844..dc823b7 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 @@ -77,7 +77,7 @@ program main !$acc end kernels end program main -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels map\(to:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "gimple" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels map\(to:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "gimple" } } ! { dg-final { scan-tree-dump-times {(?n)#pragma acc loop private\(i\)$} 2 "gimple" } } ! { dg-final { scan-tree-dump-times {(?n)#pragma acc loop private\(i\) independent$} 1 "gimple" } } @@ -86,11 +86,11 @@ end program main ! Check that the OpenACC 'kernels' got decomposed into 'data' and an enclosed ! sequence of compute constructs. -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_data_kernels map\(to:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_data_kernels map\(to:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(tofrom:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } ! As noted above, we get three "old-style" kernel regions, one gang-single region, and one parallelized loop region. -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels async\(-1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 3 "omp_oacc_kernels_decompose" } } -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_parallelized async\(-1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_gang_single async\(-1\) num_gangs\(1\) map\(force_present:a\[_[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: _[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_kernels async\(-1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 3 "omp_oacc_kernels_decompose" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_parallelized async\(-1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target oacc_parallel_kernels_gang_single async\(-1\) num_gangs\(1\) map\(force_present:a\[D.[0-9]+\] \[len: _[0-9]+\]\) map\(alloc:a \[pointer assign, bias: D.[0-9]+\]\) map\(force_present:sum \[len: [0-9]+\]\)$} 1 "omp_oacc_kernels_decompose" } } ! ! 'data' plus five CCs. ! { dg-final { scan-tree-dump-times {(?n)#pragma omp target } 6 "omp_oacc_kernels_decompose" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/pr70828.f90 b/gcc/testsuite/gfortran.dg/goacc/pr70828.f90 index fcfe086..648bad8 100644 --- a/gcc/testsuite/gfortran.dg/goacc/pr70828.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/pr70828.f90 @@ -18,5 +18,5 @@ program test !$acc end data end program test -! { dg-final { scan-tree-dump-times "omp target oacc_data map\\(tofrom:data\\\[\_\[0-9\]+\\\] \\\[len: _\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: _\[0-9\]+\\\]\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "omp target oacc_data map\\(tofrom:data\\\[D\\.\[0-9\]+\\\] \\\[len: _\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: D\\.\[0-9\]+\\\]\\)" 1 "gimple" } } ! { dg-final { scan-tree-dump-times "omp target oacc_parallel map\\(force_present:data\\\[D\\.\[0-9\]+\\\] \\\[len: D\\.\[0-9\]+\\\]\\) map\\(alloc:data \\\[pointer assign, bias: D\\.\[0-9\]+\\\]\\)" 1 "gimple" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/reduction.f95 b/gcc/testsuite/gfortran.dg/goacc/reduction.f95 index 2d9a111..4ab180a 100644 --- a/gcc/testsuite/gfortran.dg/goacc/reduction.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/reduction.f95 @@ -25,19 +25,14 @@ save i2 common /blk/ i1 !$acc parallel reduction (+:ia2) -! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel !$acc parallel reduction (+:ra1) -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel !$acc parallel reduction (+:ca1) -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel !$acc parallel reduction (+:da1) -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel !$acc parallel reduction (.and.:la1) -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel !$acc parallel reduction (+:i3, r1, d1, c1) !$acc end parallel @@ -73,104 +68,84 @@ common /blk/ i1 !$acc parallel reduction (*:ia1) ! { dg-error "Assumed size" } ! { dg-error "Array 'ia1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end parallel -!$acc parallel reduction (+:l1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type LOGICAL" } +!$acc parallel reduction (+:l1) ! { dg-error "Reduction operator \\+ is not valid for 'l1'" } !$acc end parallel -!$acc parallel reduction (*:la1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (*:la1) ! { dg-error "Reduction operator \\* is not valid for 'la1'" } !$acc end parallel -!$acc parallel reduction (-:a1) ! { dg-error "OMP DECLARE REDUCTION - not found for type CHARACTER" } +!$acc parallel reduction (-:a1) ! { dg-error "Reduction operator - is not valid for 'a1'" } !$acc end parallel -!$acc parallel reduction (+:t1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type TYPE" } +!$acc parallel reduction (+:t1) !$acc end parallel -!$acc parallel reduction (*:ta1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (*:ta1) !$acc end parallel -!$acc parallel reduction (.and.:i3) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type INTEGER" } +!$acc parallel reduction (.and.:i3) ! { dg-error "Reduction operator \\.and\\. is not valid for 'i3'" } !$acc end parallel -!$acc parallel reduction (.or.:ia2) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type INTEGER" } -! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (.or.:ia2) ! { dg-error "Reduction operator \\.or\\. is not valid for 'ia2'" } !$acc end parallel -!$acc parallel reduction (.eqv.:r1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type REAL" } +!$acc parallel reduction (.eqv.:r1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'r1'" } !$acc end parallel -!$acc parallel reduction (.neqv.:ra1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type REAL" } -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (.neqv.:ra1) ! { dg-error "Reduction operator \\.neqv. is not valid for 'ra1'" } !$acc end parallel -!$acc parallel reduction (.and.:d1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type REAL" } +!$acc parallel reduction (.and.:d1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'd1'" } !$acc end parallel -!$acc parallel reduction (.or.:da1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type REAL" } -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (.or.:da1) ! { dg-error "Reduction operator \\.or\\. is not valid for 'da1'" } !$acc end parallel -!$acc parallel reduction (.eqv.:c1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type COMPLEX" } +!$acc parallel reduction (.eqv.:c1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'c1'" } !$acc end parallel -!$acc parallel reduction (.neqv.:ca1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (.neqv.:ca1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ca1'" } !$acc end parallel -!$acc parallel reduction (.and.:a1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type CHARACTER" } +!$acc parallel reduction (.and.:a1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'a1'" } !$acc end parallel -!$acc parallel reduction (.or.:t1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type TYPE" } +!$acc parallel reduction (.or.:t1) ! { dg-error "Reduction operator \\.or\\. is not valid for 't1'" } !$acc end parallel -!$acc parallel reduction (.eqv.:ta1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (.eqv.:ta1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'ta1'" } !$acc end parallel -!$acc parallel reduction (min:c1) ! { dg-error "OMP DECLARE REDUCTION min not found for type COMPLEX" } +!$acc parallel reduction (min:c1) ! { dg-error "Reduction operator min is not valid for 'c1'" } !$acc end parallel -!$acc parallel reduction (max:ca1) ! { dg-error "OMP DECLARE REDUCTION max not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (max:ca1) ! { dg-error "Reduction operator max is not valid for 'ca1'" } !$acc end parallel -!$acc parallel reduction (max:l1) ! { dg-error "OMP DECLARE REDUCTION max not found for type LOGICAL" } +!$acc parallel reduction (max:l1) ! { dg-error "Reduction operator max is not valid for 'l1'" } !$acc end parallel -!$acc parallel reduction (min:la1) ! { dg-error "OMP DECLARE REDUCTION min not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (min:la1) ! { dg-error "Reduction operator min is not valid for 'la1'" } !$acc end parallel -!$acc parallel reduction (max:a1) ! { dg-error "OMP DECLARE REDUCTION max not found for type CHARACTER" } +!$acc parallel reduction (max:a1) ! { dg-error "Reduction operator max is not valid for 'a1'" } !$acc end parallel -!$acc parallel reduction (min:t1) ! { dg-error "OMP DECLARE REDUCTION min not found for type TYPE" } +!$acc parallel reduction (min:t1) !$acc end parallel -!$acc parallel reduction (max:ta1) ! { dg-error "OMP DECLARE REDUCTION max not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (max:ta1) !$acc end parallel -!$acc parallel reduction (iand:r1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type REAL" } +!$acc parallel reduction (iand:r1) ! { dg-error "Reduction operator iand is not valid for 'r1'" } !$acc end parallel -!$acc parallel reduction (ior:ra1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" } -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (ior:ra1) ! { dg-error "Reduction operator ior is not valid for 'ra1'" } !$acc end parallel -!$acc parallel reduction (ieor:d1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type REAL" } +!$acc parallel reduction (ieor:d1) ! { dg-error "Reduction operator ieor is not valid for 'd1'" } !$acc end parallel -!$acc parallel reduction (ior:da1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" } -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (ior:da1) ! { dg-error "Reduction operator ior is not valid for 'da1'" } !$acc end parallel -!$acc parallel reduction (iand:c1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type COMPLEX" } +!$acc parallel reduction (iand:c1) ! { dg-error "Reduction operator iand is not valid for 'c1'" } !$acc end parallel -!$acc parallel reduction (ior:ca1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (ior:ca1) ! { dg-error "Reduction operator ior is not valid for 'ca1'" } !$acc end parallel -!$acc parallel reduction (ieor:l1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type LOGICAL" } +!$acc parallel reduction (ieor:l1) ! { dg-error "Reduction operator ieor is not valid for 'l1'" } !$acc end parallel -!$acc parallel reduction (iand:la1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (iand:la1) ! { dg-error "Reduction operator iand is not valid for 'la1'" } !$acc end parallel -!$acc parallel reduction (ior:a1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type CHARACTER" } +!$acc parallel reduction (ior:a1) ! { dg-error "Reduction operator ior is not valid for 'a1'" } !$acc end parallel -!$acc parallel reduction (ieor:t1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type TYPE" } +!$acc parallel reduction (ieor:t1) !$acc end parallel -!$acc parallel reduction (iand:ta1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc parallel reduction (iand:ta1) !$acc end parallel !$acc serial reduction (+:ia2) -! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial !$acc serial reduction (+:ra1) -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial !$acc serial reduction (+:ca1) -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial !$acc serial reduction (+:da1) -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial !$acc serial reduction (.and.:la1) -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial !$acc serial reduction (+:i3, r1, d1, c1) !$acc end serial @@ -206,88 +181,73 @@ common /blk/ i1 !$acc serial reduction (*:ia1) ! { dg-error "Assumed size" } ! { dg-error "Array 'ia1' is not permitted in reduction" "" { target "*-*-*" } .-1 } !$acc end serial -!$acc serial reduction (+:l1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type LOGICAL" } +!$acc serial reduction (+:l1) ! { dg-error "Reduction operator \\+ is not valid for 'l1'" } !$acc end serial -!$acc serial reduction (*:la1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (*:la1) ! { dg-error "Reduction operator \\* is not valid for 'la1'" } !$acc end serial -!$acc serial reduction (-:a1) ! { dg-error "OMP DECLARE REDUCTION - not found for type CHARACTER" } +!$acc serial reduction (-:a1) ! { dg-error "Reduction operator - is not valid for 'a1'" } !$acc end serial -!$acc serial reduction (+:t1) ! { dg-error "OMP DECLARE REDUCTION \\+ not found for type TYPE" } +!$acc serial reduction (+:t1) !$acc end serial -!$acc serial reduction (*:ta1) ! { dg-error "OMP DECLARE REDUCTION \\* not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (*:ta1) !$acc end serial -!$acc serial reduction (.and.:i3) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type INTEGER" } +!$acc serial reduction (.and.:i3) ! { dg-error "Reduction operator \\.and\\. is not valid for 'i3'" } !$acc end serial -!$acc serial reduction (.or.:ia2) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type INTEGER" } -! { dg-error "Array 'ia2' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (.or.:ia2) ! { dg-error "Reduction operator \\.or\\. is not valid for 'ia2'" } !$acc end serial -!$acc serial reduction (.eqv.:r1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type REAL" } +!$acc serial reduction (.eqv.:r1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'r1'" } !$acc end serial -!$acc serial reduction (.neqv.:ra1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type REAL" } -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (.neqv.:ra1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ra1'" } !$acc end serial -!$acc serial reduction (.and.:d1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type REAL" } +!$acc serial reduction (.and.:d1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'd1'" } !$acc end serial -!$acc serial reduction (.or.:da1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type REAL" } -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (.or.:da1) ! { dg-error "Reduction operator \\.or\\. is not valid for 'da1'" } !$acc end serial -!$acc serial reduction (.eqv.:c1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type COMPLEX" } +!$acc serial reduction (.eqv.:c1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'c1'" } !$acc end serial -!$acc serial reduction (.neqv.:ca1) ! { dg-error "OMP DECLARE REDUCTION \\.neqv\\. not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (.neqv.:ca1) ! { dg-error "Reduction operator \\.neqv\\. is not valid for 'ca1'" } !$acc end serial -!$acc serial reduction (.and.:a1) ! { dg-error "OMP DECLARE REDUCTION \\.and\\. not found for type CHARACTER" } +!$acc serial reduction (.and.:a1) ! { dg-error "Reduction operator \\.and\\. is not valid for 'a1'" } !$acc end serial -!$acc serial reduction (.or.:t1) ! { dg-error "OMP DECLARE REDUCTION \\.or\\. not found for type TYPE" } +!$acc serial reduction (.or.:t1) ! { dg-error "Reduction operator \\.or\\. is not valid for 't1'" } !$acc end serial -!$acc serial reduction (.eqv.:ta1) ! { dg-error "OMP DECLARE REDUCTION \\.eqv\\. not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (.eqv.:ta1) ! { dg-error "Reduction operator \\.eqv\\. is not valid for 'ta1'" } !$acc end serial -!$acc serial reduction (min:c1) ! { dg-error "OMP DECLARE REDUCTION min not found for type COMPLEX" } +!$acc serial reduction (min:c1) ! { dg-error "Reduction operator min is not valid for 'c1'" } !$acc end serial -!$acc serial reduction (max:ca1) ! { dg-error "OMP DECLARE REDUCTION max not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (max:ca1) ! { dg-error "Reduction operator max is not valid for 'ca1'" } !$acc end serial -!$acc serial reduction (max:l1) ! { dg-error "OMP DECLARE REDUCTION max not found for type LOGICAL" } +!$acc serial reduction (max:l1) ! { dg-error "Reduction operator max is not valid for 'l1'" } !$acc end serial -!$acc serial reduction (min:la1) ! { dg-error "OMP DECLARE REDUCTION min not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (min:la1) ! { dg-error "Reduction operator min is not valid for 'la1'" } !$acc end serial -!$acc serial reduction (max:a1) ! { dg-error "OMP DECLARE REDUCTION max not found for type CHARACTER" } +!$acc serial reduction (max:a1) ! { dg-error "Reduction operator max is not valid for 'a1'" } !$acc end serial -!$acc serial reduction (min:t1) ! { dg-error "OMP DECLARE REDUCTION min not found for type TYPE" } +!$acc serial reduction (min:t1) !$acc end serial -!$acc serial reduction (max:ta1) ! { dg-error "OMP DECLARE REDUCTION max not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (max:ta1) !$acc end serial -!$acc serial reduction (iand:r1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type REAL" } +!$acc serial reduction (iand:r1) ! { dg-error "Reduction operator iand is not valid for 'r1'" } !$acc end serial -!$acc serial reduction (ior:ra1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" } -! { dg-error "Array 'ra1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (ior:ra1) ! { dg-error "Reduction operator ior is not valid for 'ra1'" } !$acc end serial -!$acc serial reduction (ieor:d1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type REAL" } +!$acc serial reduction (ieor:d1) ! { dg-error "Reduction operator ieor is not valid for 'd1'" } !$acc end serial -!$acc serial reduction (ior:da1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type REAL" } -! { dg-error "Array 'da1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (ior:da1) ! { dg-error "Reduction operator ior is not valid for 'da1'" } !$acc end serial -!$acc serial reduction (iand:c1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type COMPLEX" } +!$acc serial reduction (iand:c1) ! { dg-error "Reduction operator iand is not valid for 'c1'" } !$acc end serial -!$acc serial reduction (ior:ca1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type COMPLEX" } -! { dg-error "Array 'ca1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (ior:ca1) ! { dg-error "Reduction operator ior is not valid for 'ca1'" } !$acc end serial -!$acc serial reduction (ieor:l1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type LOGICAL" } +!$acc serial reduction (ieor:l1) ! { dg-error "Reduction operator ieor is not valid for 'l1'" } !$acc end serial -!$acc serial reduction (iand:la1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type LOGICAL" } -! { dg-error "Array 'la1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (iand:la1) ! { dg-error "Reduction operator iand is not valid for 'la1'" } !$acc end serial -!$acc serial reduction (ior:a1) ! { dg-error "OMP DECLARE REDUCTION ior not found for type CHARACTER" } +!$acc serial reduction (ior:a1) ! { dg-error "Reduction operator ior is not valid for 'a1'" } !$acc end serial -!$acc serial reduction (ieor:t1) ! { dg-error "OMP DECLARE REDUCTION ieor not found for type TYPE" } +!$acc serial reduction (ieor:t1) !$acc end serial -!$acc serial reduction (iand:ta1) ! { dg-error "OMP DECLARE REDUCTION iand not found for type TYPE" } -! { dg-error "Array 'ta1' is not permitted in reduction" "" { target "*-*-*" } .-1 } +!$acc serial reduction (iand:ta1) !$acc end serial end subroutine diff --git a/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90 b/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90 index 74eb894..dc267c2 100644 --- a/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/target-enter-exit-data.f90 @@ -9,17 +9,17 @@ type(t) :: var allocate (var%arr(1:100)) !$omp target enter data map(to: var%arr(10:20)) -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(to:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } } !$omp target exit data map(release: var%arr(10:20)) -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(release:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(release:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } } !$omp target enter data map(alloc: var%arr(20:30)) -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target enter data map\(alloc:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(to:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } } !$omp target exit data map(delete: var%arr(20:30)) -! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(delete:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(delete:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: \(integer\(kind=[0-9]+\)\) parm\.[0-9]+\.data - \(integer\(kind=[0-9]+\)\) var\.arr\.data\]\)$} 1 "original" } } +! { dg-final { scan-tree-dump-times {(?n)#pragma omp target exit data map\(delete:\*\(integer\(kind=[0-9]+\)\[0:\] \*\) parm\.[0-9]+\.data \[len: D\.[0-9]+ \* [0-9]+\]\) map\(delete:var\.arr \[pointer set, len: [0-9]+\]\) map\(attach_detach:\(integer\(kind=[0-9]+\)\[0:\] \*\) var\.arr\.data \[bias: D\.[0-9]+\]\)$} 1 "original" } } !$omp target enter data map(to: var%arr) diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc index fc0cd39..6be5afd 100644 --- a/gcc/tree-loop-distribution.cc +++ b/gcc/tree-loop-distribution.cc @@ -1197,6 +1197,16 @@ generate_memset_builtin (class loop *loop, partition *partition) /* The new statements will be placed before LOOP. */ gsi = gsi_last_bb (loop_preheader_edge (loop)->src); + if (flag_openacc + && gsi_stmt (gsi) + && gimple_call_internal_p (gsi_stmt (gsi), IFN_UNIQUE) + && (TREE_INT_CST_LOW (gimple_call_arg (gsi_stmt (gsi), 0)) + == (unsigned HOST_WIDE_INT) IFN_UNIQUE_OACC_FORK)) + { + edge e = split_block (loop_preheader_edge (loop)->src, gsi_stmt (gsi)); + gsi = gsi_last_bb (e->dest); + } + nb_bytes = rewrite_to_non_trapping_overflow (builtin->size); nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE, false, GSI_CONTINUE_LINKING); @@ -1251,6 +1261,16 @@ generate_memcpy_builtin (class loop *loop, partition *partition) /* The new statements will be placed before LOOP. */ gsi = gsi_last_bb (loop_preheader_edge (loop)->src); + if (flag_openacc + && gsi_stmt (gsi) + && gimple_call_internal_p (gsi_stmt (gsi), IFN_UNIQUE) + && (TREE_INT_CST_LOW (gimple_call_arg (gsi_stmt (gsi), 0)) + == (unsigned HOST_WIDE_INT) IFN_UNIQUE_OACC_FORK)) + { + edge e = split_block (loop_preheader_edge (loop)->src, gsi_stmt (gsi)); + gsi = gsi_last_bb (e->dest); + } + nb_bytes = rewrite_to_non_trapping_overflow (builtin->size); nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE, false, GSI_CONTINUE_LINKING); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c new file mode 100644 index 0000000..6f1b86a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c @@ -0,0 +1,69 @@ +/* { dg-do run } */ + +/* Array reductions. */ + +#include <stdlib.h> +#include "reduction.h" + +#define ng 8 +#define nw 4 +#define vl 32 + +#define N 10 + +#define check_reduction_array_op_all(type, opr, init, b) \ + check_reduction_xxx_xx_all(array, op, type, opr, init, b) +#define check_reduction_arraysec_op_all(type, opr, init, b) \ + check_reduction_xxx_xx_all(arraysec, op, type, opr, init, b) +#define check_reduction_array_macro_all(type, opr, init, b) \ + check_reduction_xxx_xx_all(array, macro, type, opr, init, b) +#define check_reduction_arraysec_macro_all(type, opr, init, b) \ + check_reduction_xxx_xx_all(arraysec, macro, type, opr, init, b) + +int +main (void) +{ + const int n = 100; + int ints[n]; + float flts[n]; + double dbls[n]; + int cmp_val = 5; + + for (int i = 0; i < n; i++) + { + ints[i] = i + 1; + flts[i] = i + 1; + dbls[i] = i + 1; + } + + check_reduction_array_op_all (int, +, 0, ints[i]); + check_reduction_array_op_all (int, *, 1, ints[i]); + check_reduction_array_op_all (int, &, -1, ints[i]); + check_reduction_array_op_all (int, |, 0, ints[i]); + check_reduction_array_op_all (int, ^, 0, ints[i]); + check_reduction_array_op_all (int, &&, 1, (cmp_val > ints[i])); + check_reduction_array_op_all (int, ||, 0, (cmp_val > ints[i])); + check_reduction_array_macro_all (int, min, n + 1, ints[i]); + check_reduction_array_macro_all (int, max, -1, ints[i]); + + check_reduction_array_op_all (float, +, 0, flts[i]); + check_reduction_array_op_all (float, *, 1, flts[i]); + check_reduction_array_macro_all (float, min, n + 1, flts[i]); + check_reduction_array_macro_all (float, max, -1, flts[i]); + + check_reduction_arraysec_op_all (int, +, 0, ints[i]); + check_reduction_arraysec_op_all (float, *, 1, flts[i]); + check_reduction_arraysec_macro_all (double, min, n + 1, dbls[i]); + check_reduction_arraysec_macro_all (double, max, -1, dbls[i]); + + check_reduction_array_op_all (double, +, 0, dbls[i]); +#if 0 + /* Currently fails due to unclear issue, presumably unrelated to reduction + mechanics. Avoiding for now. */ + check_reduction_array_op_all (double, *, 1.0, dbls[i]); +#endif + check_reduction_array_macro_all (double, min, n + 1, dbls[i]); + check_reduction_array_macro_all (double, max, -1, dbls[i]); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c new file mode 100644 index 0000000..43e139f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c @@ -0,0 +1,91 @@ +/* { dg-do run } */ + +/* More array reduction tests, different combinations of parallel/loop + construct, implied/explicit copy clauses, and subarrays. */ + +#define ARRAY_BODY(ARRAY, MIN, LEN) \ + for (int i = 0; i < 10; i++) \ + for (int j = MIN; j < MIN + LEN; j++) \ + ARRAY[j] += 1; + +int main (void) +{ + int o[6] = { 5, 1, 1, 5, 9, 9 }; + int a[6]; + + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + a[i] = o[i]; + + #pragma acc parallel + #pragma acc loop reduction(+:a[1:2]) + ARRAY_BODY (a, 1, 2) + ARRAY_BODY (o, 1, 2) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a[3:2]) + #pragma acc loop reduction(+:a[3:2]) + ARRAY_BODY (a, 3, 2) + ARRAY_BODY (o, 3, 2) + for (int i = 0; i < 6; i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a[0:5]) + ARRAY_BODY (a, 0, 5) + ARRAY_BODY (o, 0, 5) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, 4, 1) + ARRAY_BODY (o, 4, 1) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, 3, 3) + ARRAY_BODY (o, 3, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#if !defined(ACC_DEVICE_TYPE_host) + + #pragma acc parallel loop reduction(+:a) + ARRAY_BODY (a, 1, 3) + ARRAY_BODY (o, 1, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel loop reduction(+:a[2:3]) + ARRAY_BODY (a, 2, 3) + ARRAY_BODY (o, 2, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel reduction(+:a) + ARRAY_BODY (a, 3, 2) + ARRAY_BODY (o, 3, 2) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel reduction(+:a[1:2]) + ARRAY_BODY (a, 1, 2) + ARRAY_BODY (o, 1, 2) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#endif + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c new file mode 100644 index 0000000..aeae2e0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c @@ -0,0 +1,90 @@ +/* { dg-do run } */ + +/* Same as reduction-arrays-2.c test, but with non-constant subarray + base indexes. */ + +#define ARRAY_BODY(ARRAY, MIN, LEN) \ + for (int i = 0; i < 10; i++) \ + for (int j = MIN; j < MIN + LEN; j++) \ + ARRAY[j] += 1; + +int zero = 0; +int one = 1; +int two = 2; +int three = 3; +int four = 4; + +int main (void) +{ + int o[6] = { 5, 1, 1, 5, 9, 9 }; + int a[6]; + + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + a[i] = o[i]; + + #pragma acc parallel + #pragma acc loop reduction(+:a[one:2]) + ARRAY_BODY (a, one, 2) + ARRAY_BODY (o, one, 2) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a[three:2]) + #pragma acc loop reduction(+:a[three:2]) + ARRAY_BODY (a, three, 2) + ARRAY_BODY (o, three, 2) + for (int i = 0; i < 6; i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a[zero:5]) + ARRAY_BODY (a, zero, 5) + ARRAY_BODY (o, zero, 5) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, four, 1) + ARRAY_BODY (o, four, 1) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, three, 3) + ARRAY_BODY (o, three, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#if !defined(ACC_DEVICE_TYPE_host) + + #pragma acc parallel loop reduction(+:a) + ARRAY_BODY (a, one, 3) + ARRAY_BODY (o, one, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel loop reduction(+:a[two:3]) + ARRAY_BODY (a, two, 3) + ARRAY_BODY (o, two, 3) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel reduction(+:a[one:2]) + ARRAY_BODY (a, one, 2) + ARRAY_BODY (o, one, 2) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#endif + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c new file mode 100644 index 0000000..c095284 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c @@ -0,0 +1,91 @@ +/* { dg-do run } */ + +/* Same as reduction-arrays-3.c test, but additionally with + non-constant subarray lengths. */ + +#define ARRAY_BODY(ARRAY, MIN, LEN) \ + for (int i = 0; i < 10; i++) \ + for (int j = MIN; j < MIN + LEN; j++) \ + ARRAY[j] += 1; + +int zero = 0; +int one = 1; +int two = 2; +int three = 3; +int four = 4; +int five = 5; + +int main (void) +{ + int o[6] = { 5, 1, 1, 5, 9, 9 }; + int a[6]; + + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + a[i] = o[i]; + + #pragma acc parallel + #pragma acc loop reduction(+:a[one:two]) + ARRAY_BODY (a, one, two) + ARRAY_BODY (o, one, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a[three:two]) + #pragma acc loop reduction(+:a[three:two]) + ARRAY_BODY (a, three, two) + ARRAY_BODY (o, three, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a[zero:five]) + ARRAY_BODY (a, zero, five) + ARRAY_BODY (o, zero, five) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, four, one) + ARRAY_BODY (o, four, one) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, three, three) + ARRAY_BODY (o, three, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#if !defined(ACC_DEVICE_TYPE_host) + + #pragma acc parallel loop reduction(+:a) + ARRAY_BODY (a, one, three) + ARRAY_BODY (o, one, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel loop reduction(+:a[two:three]) + ARRAY_BODY (a, two, three) + ARRAY_BODY (o, two, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel reduction(+:a[one:two]) + ARRAY_BODY (a, one, two) + ARRAY_BODY (o, one, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + +#endif + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c new file mode 100644 index 0000000..4794350 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c @@ -0,0 +1,89 @@ +/* { dg-do run } */ + +/* Same as reduction-arrays-4.c test, but reduced arrays are VLAs. */ + +#define ARRAY_BODY(ARRAY, MIN, LEN) \ + for (int i = 0; i < 10; i++) \ + for (int j = MIN; j < MIN + LEN; j++) \ + ARRAY[j] += 1; + +int zero = 0; +int one = 1; +int two = 2; +int three = 3; +int four = 4; +int five = 5; +int six = 6; + +int main (void) +{ + int init[6] = { 5, 1, 1, 5, 9, 9 }; + int o[six]; + int a[six]; + + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + a[i] = o[i] = init[i]; + + #pragma acc parallel + #pragma acc loop reduction(+:a[one:two]) + ARRAY_BODY (a, one, two) + ARRAY_BODY (o, one, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a[three:two]) + #pragma acc loop reduction(+:a[three:two]) + ARRAY_BODY (a, three, two) + ARRAY_BODY (o, three, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a[zero:five]) + ARRAY_BODY (a, zero, five) + ARRAY_BODY (o, zero, five) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, four, one) + ARRAY_BODY (o, four, one) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel copy(a) + #pragma acc loop reduction(+:a) + ARRAY_BODY (a, three, three) + ARRAY_BODY (o, three, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel loop reduction(+:a) + ARRAY_BODY (a, one, three) + ARRAY_BODY (o, one, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel loop reduction(+:a[two:three]) + ARRAY_BODY (a, two, three) + ARRAY_BODY (o, two, three) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + #pragma acc parallel reduction(+:a[one:two]) + ARRAY_BODY (a, one, two) + ARRAY_BODY (o, one, two) + for (int i = 0; i < sizeof (a) / sizeof (int); i++) + if (a[i] != o[i]) + __builtin_abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c new file mode 100644 index 0000000..22216ff --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c @@ -0,0 +1,121 @@ +/* { dg-do run } */ + +/* Struct reductions. */ + +#include <stdlib.h> +#include "reduction.h" + +#define ng 8 +#define nw 4 +#define vl 32 + +#define N 10 + +typedef struct { int x, y; } int_pair; +typedef struct { float m, n; } flt_pair; +typedef struct +{ + int i; + double d; + float f; + int a[N]; + int_pair ip; + flt_pair fp; +} rectype; + +static void +init_struct (rectype *rec, int val) +{ + rec->i = val; + rec->d = (double) val; + rec->f = (float) val; + for (int i = 0; i < N; i++) + rec->a[i] = val; + rec->ip.x = val; + rec->ip.y = val; + rec->fp.m = (float) val; + rec->fp.n = (float) val; +} + +static int +struct_eq (rectype *a, rectype *b) +{ + if (a->i != b->i || a->d != b->d + || a->f != b->f + || a->ip.x != b->ip.x + || a->ip.y != b->ip.y + || a->fp.m != b->fp.m + || a->fp.n != b->fp.n) + return 0; + + for (int i = 0; i < N; i++) + if (a->a[i] != b->a[i]) + return 0; + return 1; +} + +#define check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, apply) \ + { \ + type res, vres; \ + init_struct (&res, init); \ + DO_PRAGMA (acc parallel gwv_par copy(res)) \ + DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \ + for (int i = 0; i < n; i++) \ + { \ + res.i = apply (op, res.i, b); \ + res.d = apply (op, res.d, b); \ + res.f = apply (op, res.f, b); \ + for (int j = 0; j < N; j++) \ + res.a[j] = apply (op, res.a[j], b); \ + res.ip.x = apply (op, res.ip.x, b); \ + res.ip.y = apply (op, res.ip.y, b); \ + res.fp.m = apply (op, res.fp.m, b); \ + res.fp.n = apply (op, res.fp.n, b); \ + } \ + \ + init_struct (&vres, init); \ + for (int i = 0; i < n; i++) \ + { \ + vres.i = apply (op, vres.i, b); \ + vres.d = apply (op, vres.d, b); \ + vres.f = apply (op, vres.f, b); \ + for (int j = 0; j < N; j++) \ + vres.a[j] = apply (op, vres.a[j], b); \ + vres.ip.x = apply (op, vres.ip.x, b); \ + vres.ip.y = apply (op, vres.ip.y, b); \ + vres.fp.m = apply (op, vres.fp.m, b); \ + vres.fp.n = apply (op, vres.fp.n, b); \ + } \ + \ + if (!struct_eq (&res, &vres)) \ + __builtin_abort (); \ + } + +#define operator_apply(op, a, b) (a op b) +#define check_reduction_struct_op(type, op, init, b, gwv_par, gwv_loop) \ + check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, operator_apply) + +#define function_apply(op, a, b) (op (a, b)) +#define check_reduction_struct_macro(type, op, init, b, gwv_par, gwv_loop) \ + check_reduction_struct_xx(type, op, init, b, gwv_par, gwv_loop, function_apply) + +#define check_reduction_struct_op_all(type, opr, init, b) \ + check_reduction_xxx_xx_all (struct, op, type, opr, init, b) +#define check_reduction_struct_macro_all(type, opr, init, b) \ + check_reduction_xxx_xx_all (struct, macro, type, opr, init, b) + +int +main (void) +{ + const int n = 10; + int ints[n]; + + for (int i = 0; i < n; i++) + ints[i] = i + 1; + + check_reduction_struct_op_all (rectype, +, 0, ints[i]); + check_reduction_struct_op_all (rectype, *, 1, ints[i]); + check_reduction_struct_macro_all (rectype, min, n + 1, ints[i]); + check_reduction_struct_macro_all (rectype, max, -1, ints[i]); + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h index 1b3f8d4..c928578 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h @@ -37,6 +37,58 @@ DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \ abort (); \ } +#define check_reduction_array_xx(type, var, var_in_clause, op, init, b, \ + gwv_par, gwv_loop, apply) \ + { \ + type var[N], var ## _check[N]; \ + for (int i = 0; i < N; i++) \ + var[i] = var ## _check[i] = (init); \ + DO_PRAGMA (acc parallel gwv_par copy (var_in_clause)) \ + DO_PRAGMA (acc loop gwv_loop reduction (op: var_in_clause)) \ + for (int i = 0; i < n; i++) \ + for (int j = 0; j < N; j++) \ + var[j] = apply (op, var[j], (b)); \ + \ + for (int i = 0; i < n; i++) \ + for (int j = 0; j < N; j++) \ + var ## _check[j] = apply (op, var ## _check[j], (b)); \ + \ + for (int j = 0; j < N; j++) \ + if (var[j] != var ## _check[j]) \ + abort (); \ + } + +#define operator_apply(op, a, b) (a op b) +#define check_reduction_array_op(type, op, init, b, gwv_par, gwv_loop) \ + check_reduction_array_xx (type, v, v, op, init, b, gwv_par, gwv_loop, \ + operator_apply) +#define check_reduction_arraysec_op(type, op, init, b, gwv_par, gwv_loop) \ + check_reduction_array_xx (type, v, v[:N], op, init, b, gwv_par, gwv_loop, \ + operator_apply) + + +#define function_apply(op, a, b) (op (a, b)) +#define check_reduction_array_macro(type, op, init, b, gwv_par, gwv_loop)\ + check_reduction_array_xx (type, v, v, op, init, b, gwv_par, gwv_loop, \ + function_apply) +#define check_reduction_arraysec_macro(type, op, init, b, gwv_par, gwv_loop)\ + check_reduction_array_xx (type, v, v[:N], op, init, b, gwv_par, gwv_loop, \ + function_apply) + +#define check_reduction_xxx_xx_all(tclass, form, type, op, init, b) \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, num_gangs (ng), gang); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, num_workers (nw), worker); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, vector_length (vl), vector); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, \ + num_gangs (ng) num_workers (nw), gang worker); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, \ + num_gangs (ng) vector_length (vl), gang vector); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, \ + num_workers (nw) vector_length (vl), worker vector); \ + check_reduction_ ## tclass ## _ ## form (type, op, init, b, \ + num_gangs (ng) num_workers (nw) vector_length (vl), \ + gang worker vector); + #define max(a, b) (((a) > (b)) ? (a) : (b)) #define min(a, b) (((a) < (b)) ? (a) : (b)) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90 new file mode 100644 index 0000000..f766524 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90 @@ -0,0 +1,598 @@ +! { dg-do run } + +! integer array reductions + +program main + implicit none + + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer :: i, j + integer, dimension (n) :: vresult, rg, rw, rv, rc + logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult + integer, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + ! + ! '+' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(+:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(+:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(+:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(+:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) + array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) + array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 1 + if (count (rw .ne. vresult) .ne. 0) STOP 2 + if (count (rv .ne. vresult) .ne. 0) STOP 3 + if (count (rc .ne. vresult) .ne. 0) STOP 4 + + ! + ! '*' reductions + ! + + rg = 1 + rw = 1 + rv = 1 + rc = 1 + vresult = 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(*:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(*:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(*:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(*:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) * array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) * array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 5 + if (count (rw .ne. vresult) .ne. 0) STOP 6 + if (count (rv .ne. vresult) .ne. 0) STOP 7 + if (count (rc .ne. vresult) .ne. 0) STOP 8 + + ! + ! 'max' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(max:rg) gang + do i = 1, n + do j = 1, n + rg(j) = max (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(max:rw) worker + do i = 1, n + do j = 1, n + rw(j) = max (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(max:rv) vector + do i = 1, n + do j = 1, n + rv(j) = max (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(max:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = max (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = max (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 9 + if (count (rw .ne. vresult) .ne. 0) STOP 10 + if (count (rv .ne. vresult) .ne. 0) STOP 11 + if (count (rc .ne. vresult) .ne. 0) STOP 12 + + ! + ! 'min' reductions + ! + + rg = n + 1 + rw = n + 1 + rv = n + 1 + rc = n + 1 + vresult = n + 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(min:rg) gang + do i = 1, n + do j = 1, n + rg(j) = min (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(min:rw) worker + do i = 1, n + do j = 1, n + rw(j) = min (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(min:rv) vector + do i = 1, n + do j = 1, n + rv(j) = min (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(min:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = min (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = min (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 13 + if (count (rw .ne. vresult) .ne. 0) STOP 14 + if (count (rv .ne. vresult) .ne. 0) STOP 15 + if (count (rc .ne. vresult) .ne. 0) STOP 16 + + ! + ! 'iand' reductions + ! + + rg = 1 + rw = 1 + rv = 1 + rc = 1 + vresult = 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(iand:rg) gang + do i = 1, n + do j = 1, n + rg(j) = iand (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(iand:rw) worker + do i = 1, n + do j = 1, n + rw(j) = iand (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(iand:rv) vector + do i = 1, n + do j = 1, n + rv(j) = iand (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(iand:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = iand (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = iand (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 17 + if (count (rw .ne. vresult) .ne. 0) STOP 18 + if (count (rv .ne. vresult) .ne. 0) STOP 19 + if (count (rc .ne. vresult) .ne. 0) STOP 20 + + ! + ! 'ior' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(ior:rg) gang + do i = 1, n + do j = 1, n + rg(j) = ior (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(ior:rw) worker + do i = 1, n + do j = 1, n + rw(j) = ior (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(ior:rv) vector + do i = 1, n + do j = 1, n + rv(j) = ior (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(ior:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = ior (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = ior (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 21 + if (count (rw .ne. vresult) .ne. 0) STOP 22 + if (count (rv .ne. vresult) .ne. 0) STOP 23 + if (count (rc .ne. vresult) .ne. 0) STOP 24 + + ! + ! 'ieor' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(ieor:rg) gang + do i = 1, n + do j = 1, n + rg(j) = ieor (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(ieor:rw) worker + do i = 1, n + do j = 1, n + rw(j) = ieor (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(ieor:rv) vector + do i = 1, n + do j = 1, n + rv(j) = ieor (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(ieor:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = ieor (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = ieor (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 25 + if (count (rw .ne. vresult) .ne. 0) STOP 26 + if (count (rv .ne. vresult) .ne. 0) STOP 27 + if (count (rc .ne. vresult) .ne. 0) STOP 28 + + ! + ! '.and.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.and.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.and.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.and.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.and.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .and. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 29 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 30 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 31 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 32 + + ! + ! '.or.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.or.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.or.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.or.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.or.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .or. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 33 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 34 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 35 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 36 + + ! + ! '.eqv.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.eqv.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.eqv.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.eqv.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.eqv.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 37 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 38 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 39 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 40 + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90 new file mode 100644 index 0000000..220871a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90 @@ -0,0 +1,424 @@ +! { dg-do run } + +! real array reductions + +program main + implicit none + + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer :: i, j + real, dimension (n) :: vresult, rg, rw, rv, rc + logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult + real, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + ! + ! '+' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(+:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(+:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(+:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(+:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) + array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) + array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 1 + if (count (rw .ne. vresult) .ne. 0) STOP 2 + if (count (rv .ne. vresult) .ne. 0) STOP 3 + if (count (rc .ne. vresult) .ne. 0) STOP 4 + + ! + ! '*' reductions + ! + + rg = 1 + rw = 1 + rv = 1 + rc = 1 + vresult = 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(*:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(*:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(*:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(*:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) * array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) * array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 5 + if (count (rw .ne. vresult) .ne. 0) STOP 6 + if (count (rv .ne. vresult) .ne. 0) STOP 7 + if (count (rc .ne. vresult) .ne. 0) STOP 8 + + ! + ! 'max' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(max:rg) gang + do i = 1, n + do j = 1, n + rg(j) = max (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(max:rw) worker + do i = 1, n + do j = 1, n + rw(j) = max (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(max:rv) vector + do i = 1, n + do j = 1, n + rv(j) = max (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(max:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = max (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = max (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 9 + if (count (rw .ne. vresult) .ne. 0) STOP 10 + if (count (rv .ne. vresult) .ne. 0) STOP 11 + if (count (rc .ne. vresult) .ne. 0) STOP 12 + + ! + ! 'min' reductions + ! + + rg = n + 1 + rw = n + 1 + rv = n + 1 + rc = n + 1 + vresult = n + 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(min:rg) gang + do i = 1, n + do j = 1, n + rg(j) = min (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(min:rw) worker + do i = 1, n + do j = 1, n + rw(j) = min (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(min:rv) vector + do i = 1, n + do j = 1, n + rv(j) = min (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(min:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = min (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = min (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 13 + if (count (rw .ne. vresult) .ne. 0) STOP 14 + if (count (rv .ne. vresult) .ne. 0) STOP 15 + if (count (rc .ne. vresult) .ne. 0) STOP 16 + + ! + ! '.and.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.and.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.and.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.and.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.and.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .and. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 17 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 18 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 19 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 20 + + ! + ! '.or.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.or.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.or.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.or.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.or.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .or. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 21 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 22 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 23 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 24 + + ! + ! '.eqv.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.eqv.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.eqv.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.eqv.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.eqv.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 25 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 26 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 27 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 28 + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90 new file mode 100644 index 0000000..d89d8ed --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90 @@ -0,0 +1,424 @@ +! { dg-do run } + +! double precision array reductions + +program main + implicit none + + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer :: i, j + double precision, dimension (n) :: vresult, rg, rw, rv, rc + logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult + double precision, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + ! + ! '+' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(+:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(+:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(+:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(+:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) + array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) + array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 1 + if (count (rw .ne. vresult) .ne. 0) STOP 2 + if (count (rv .ne. vresult) .ne. 0) STOP 3 + if (count (rc .ne. vresult) .ne. 0) STOP 4 + + ! + ! '*' reductions + ! + + rg = 1 + rw = 1 + rv = 1 + rc = 1 + vresult = 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(*:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(*:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(*:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(*:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) * array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) * array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 5 + if (count (rw .ne. vresult) .ne. 0) STOP 6 + if (count (rv .ne. vresult) .ne. 0) STOP 7 + if (count (rc .ne. vresult) .ne. 0) STOP 8 + + ! + ! 'max' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(max:rg) gang + do i = 1, n + do j = 1, n + rg(j) = max (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(max:rw) worker + do i = 1, n + do j = 1, n + rw(j) = max (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(max:rv) vector + do i = 1, n + do j = 1, n + rv(j) = max (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(max:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = max (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = max (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 9 + if (count (rw .ne. vresult) .ne. 0) STOP 10 + if (count (rv .ne. vresult) .ne. 0) STOP 11 + if (count (rc .ne. vresult) .ne. 0) STOP 12 + + ! + ! 'min' reductions + ! + + rg = n + 1 + rw = n + 1 + rv = n + 1 + rc = n + 1 + vresult = n + 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(min:rg) gang + do i = 1, n + do j = 1, n + rg(j) = min (rg(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(min:rw) worker + do i = 1, n + do j = 1, n + rw(j) = min (rw(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(min:rv) vector + do i = 1, n + do j = 1, n + rv(j) = min (rv(j), array(i)) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(min:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = min (rc(j), array(i)) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = min (vresult(j), array(i)) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 13 + if (count (rw .ne. vresult) .ne. 0) STOP 14 + if (count (rv .ne. vresult) .ne. 0) STOP 15 + if (count (rc .ne. vresult) .ne. 0) STOP 16 + + ! + ! '.and.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.and.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.and.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.and.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.and.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .and. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .and. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 17 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 18 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 19 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 20 + + ! + ! '.or.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.or.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.or.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.or.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.or.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .or. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .or. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 21 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 22 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 23 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 24 + + ! + ! '.eqv.' reductions + ! + + lrg = .true. + lrw = .true. + lrv = .true. + lrc = .true. + lvresult = .true. + + !$acc parallel num_gangs(ng) copy(lrg) + !$acc loop reduction(.eqv.:lrg) gang + do i = 1, n + do j = 1, n + lrg(j) = lrg(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(lrw) + !$acc loop reduction(.eqv.:lrw) worker + do i = 1, n + do j = 1, n + lrw(j) = lrw(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(lrv) + !$acc loop reduction(.eqv.:lrv) vector + do i = 1, n + do j = 1, n + lrv(j) = lrv(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(lrc) + !$acc loop reduction(.eqv.:lrc) gang worker vector + do i = 1, n + do j = 1, n + lrc(j) = lrc(j) .eqv. (array(i) .ge. 5) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + lvresult(j) = lvresult(j) .eqv. (array(i) .ge. 5) + end do + end do + + if (count (lrg .neqv. lvresult) .ne. 0) STOP 25 + if (count (lrw .neqv. lvresult) .ne. 0) STOP 26 + if (count (lrv .neqv. lvresult) .ne. 0) STOP 27 + if (count (lrc .neqv. lvresult) .ne. 0) STOP 28 + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90 new file mode 100644 index 0000000..701cbb9 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90 @@ -0,0 +1,134 @@ +! { dg-do run } + +! complex array reductions + +program main + implicit none + + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer :: i, j + complex, dimension (n) :: vresult, rg, rw, rv, rc + logical, dimension (n) :: lrg, lrw, lrv, lrc, lvresult + complex, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + ! + ! '+' reductions + ! + + rg = 0 + rw = 0 + rv = 0 + rc = 0 + vresult = 0 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(+:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(+:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(+:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) + array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(+:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) + array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) + array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 1 + if (count (rw .ne. vresult) .ne. 0) STOP 2 + if (count (rv .ne. vresult) .ne. 0) STOP 3 + if (count (rc .ne. vresult) .ne. 0) STOP 4 + + ! + ! '*' reductions + ! + + rg = 1 + rw = 1 + rv = 1 + rc = 1 + vresult = 1 + + !$acc parallel num_gangs(ng) copy(rg) + !$acc loop reduction(*:rg) gang + do i = 1, n + do j = 1, n + rg(j) = rg(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_workers(nw) copy(rw) + !$acc loop reduction(*:rw) worker + do i = 1, n + do j = 1, n + rw(j) = rw(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel vector_length(vl) copy(rv) + !$acc loop reduction(*:rv) vector + do i = 1, n + do j = 1, n + rv(j) = rv(j) * array(i) + end do + end do + !$acc end parallel + + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(rc) + !$acc loop reduction(*:rc) gang worker vector + do i = 1, n + do j = 1, n + rc(j) = rc(j) * array(i) + end do + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + do j = 1, n + vresult(j) = vresult(j) * array(i) + end do + end do + + if (count (rg .ne. vresult) .ne. 0) STOP 5 + if (count (rw .ne. vresult) .ne. 0) STOP 6 + if (count (rv .ne. vresult) .ne. 0) STOP 7 + if (count (rc .ne. vresult) .ne. 0) STOP 8 + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90 new file mode 100644 index 0000000..95e56c9 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-14.f90 @@ -0,0 +1,68 @@ +! { dg-do run } + +! record type reductions + +program main + implicit none + + type t1 + integer :: i + real :: r + end type t1 + + type t2 + real :: r + integer :: i + double precision :: d + end type t2 + + double precision, parameter :: e = 0.001 + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32 + integer :: i + type(t1) :: v1, a1 + type (t2) :: v2, a2 + + v1%i = 0 + v1%r = 0 + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v1) + !$acc loop reduction (+:v1) + do i = 1, n + v1%i = v1%i + 1 + v1%r = v1%r + 2 + end do + !$acc end parallel + a1%i = 0 + a1%r = 0 + do i = 1, n + a1%i = a1%i + 1 + a1%r = a1%r + 2 + end do + if (v1%i .ne. a1%i) STOP 1 + if (v1%r .ne. a1%r) STOP 2 + + v2%i = 1 + v2%r = 1 + v2%d = 1 + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v2) + !$acc loop reduction (*:v2) + do i = 1, n + v2%i = v2%i * 2 + v2%r = v2%r * 1.1 + v2%d = v2%d * 1.3 + end do + !$acc end parallel + a2%i = 1 + a2%r = 1 + a2%d = 1 + do i = 1, n + a2%i = a2%i * 2 + a2%r = a2%r * 1.1 + a2%d = a2%d * 1.3 + end do + + if (v2%i .ne. a2%i) STOP 3 + if (v2%r .ne. a2%r) STOP 4 + if (abs (v2%d - a2%d) .ge. e) STOP 5 + +end program main + diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90 new file mode 100644 index 0000000..7a36fb2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-15.f90 @@ -0,0 +1,98 @@ +! { dg-do run } +! { dg-additional-options "-cpp" } + +#define ARRAY_BODY(ARRAY, MIN, MAX) \ + do i = 1, 10; \ + do j = MIN, MAX; \ + ARRAY(j) = ARRAY(j) + 1; \ + end do; \ + end do + +program main + implicit none + integer :: i, j, max = 6, two = 2, three = 3, four = 4, five = 5, six = 6 + integer :: a(6) = (/ 5, 1, 1, 5, 9, 9 /) + integer :: o(6) + o = a + + !$acc parallel + !$acc loop reduction(+:a(2:3)) + ARRAY_BODY (a, 2, 3) + !$acc end parallel + ARRAY_BODY (o, 2, 3) + do i = 1, max + if (a(i) .ne. o(i)) STOP 1 + end do + + !$acc parallel copy(a(4:6)) + !$acc loop reduction(+:a(4:6)) + ARRAY_BODY (a, 4, 6) + !$acc end parallel + ARRAY_BODY (o, 4, 6) + do i = 1, max + if (a(i) .ne. o(i)) STOP 2 + end do + + !$acc parallel copy(a) + !$acc loop reduction(+:a(1:6)) + ARRAY_BODY (a, 1, 6) + !$acc end parallel + ARRAY_BODY (o, 1, 6) + do i = 1, max + if (a(i) .ne. o(i)) STOP 3 + end do + + !$acc parallel + !$acc loop reduction(+:a) + ARRAY_BODY (a, 4, 4) + !$acc end parallel + ARRAY_BODY (o, 4, 4) + do i = 1, max + if (a(i) .ne. o(i)) STOP 4 + end do + + !$acc parallel copy(a) + !$acc loop reduction(+:a) + ARRAY_BODY (a, 4, 6) + !$acc end parallel + ARRAY_BODY (o, 4, 6) + do i = 1, max + if (a(i) .ne. o(i)) STOP 5 + end do + +#if !defined(ACC_DEVICE_TYPE_host) + + !$acc parallel loop reduction(+:a) + ARRAY_BODY (a, 2, 4) + !$acc end parallel loop + ARRAY_BODY (o, 2, 4) + do i = 1, max + if (a(i) .ne. o(i)) STOP 6 + end do + + !$acc parallel loop reduction(+:a(2:4)) + ARRAY_BODY (a, 2, 4) + !$acc end parallel loop + ARRAY_BODY (o, 2, 4) + do i = 1, max + if (a(i) .ne. o(i)) STOP 7 + end do + + !$acc parallel reduction(+:a) + ARRAY_BODY (a, 3, 4) + !$acc end parallel + ARRAY_BODY (o, 3, 4) + do i = 1, max + if (a(i) .ne. o(i)) STOP 8 + end do + + !$acc parallel reduction(+:a(2:3)) + ARRAY_BODY (a, 2, 3) + !$acc end parallel + ARRAY_BODY (o, 2, 3) + do i = 1, max + if (a(i) .ne. o(i)) STOP 9 + end do +#endif + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90 new file mode 100644 index 0000000..c524f2a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-16.f90 @@ -0,0 +1,99 @@ +! { dg-do run } +! { dg-additional-options "-cpp" } + +#define ARRAY_BODY(ARRAY, MIN, MAX) \ + do i = 1, 10; \ + do j = MIN, MAX; \ + ARRAY(j) = ARRAY(j) + 1; \ + end do; \ + end do + +program main + implicit none + integer :: i, j, max = 6, one = 1, two = 2, three = 3, four = 4, five = 5, six = 6 + integer :: a(6) = (/ 5, 1, 1, 5, 9, 9 /) + integer :: o(6) + o = a + + !$acc parallel + !$acc loop reduction(+:a(two:three)) + ARRAY_BODY (a, two, three) + !$acc end parallel + + ARRAY_BODY (o, two, three) + do i = 1, max + if (a(i) .ne. o(i)) STOP 1 + end do + + !$acc parallel copy(a(four:six)) + !$acc loop reduction(+:a(four:six)) + ARRAY_BODY (a, four, six) + !$acc end parallel + ARRAY_BODY (o, four, six) + do i = 1, max + if (a(i) .ne. o(i)) STOP 2 + end do + + !$acc parallel copy(a) + !$acc loop reduction(+:a(one:six)) + ARRAY_BODY (a, one, six) + !$acc end parallel + ARRAY_BODY (o, one, six) + do i = 1, max + if (a(i) .ne. o(i)) STOP 3 + end do + + !$acc parallel + !$acc loop reduction(+:a) + ARRAY_BODY (a, four, four) + !$acc end parallel + ARRAY_BODY (o, four, four) + do i = 1, max + if (a(i) .ne. o(i)) STOP 4 + end do + + !$acc parallel copy(a) + !$acc loop reduction(+:a) + ARRAY_BODY (a, four, six) + !$acc end parallel + ARRAY_BODY (o, four, six) + do i = 1, max + if (a(i) .ne. o(i)) STOP 5 + end do + +#if !defined(ACC_DEVICE_TYPE_host) + + !$acc parallel loop reduction(+:a) + ARRAY_BODY (a, two, four) + !$acc end parallel loop + ARRAY_BODY (o, two, four) + do i = 1, max + if (a(i) .ne. o(i)) STOP 6 + end do + + !$acc parallel loop reduction(+:a(two:four)) + ARRAY_BODY (a, two, four) + !$acc end parallel loop + ARRAY_BODY (o, two, four) + do i = 1, max + if (a(i) .ne. o(i)) STOP 7 + end do + + !$acc parallel reduction(+:a) + ARRAY_BODY (a, three, four) + !$acc end parallel + ARRAY_BODY (o, three, four) + do i = 1, max + if (a(i) .ne. o(i)) STOP 8 + end do + + !$acc parallel reduction(+:a(two:three)) + ARRAY_BODY (a, two, three) + !$acc end parallel + ARRAY_BODY (o, two, three) + do i = 1, max + if (a(i) .ne. o(i)) STOP 9 + end do +#endif + +end program main |