field | value | date |
---|---|---|
author | Thomas Schwinge <thomas@codesourcery.com> | 2015-01-15 21:11:12 +0100 |
committer | Thomas Schwinge <tschwinge@gcc.gnu.org> | 2015-01-15 21:11:12 +0100 |
commit | 41dbbb3789850dfea98dd8984f69806284f87b6e | |
tree | 97a0bb274cc7583206397ba37ab5c0bbe01cb04d /libgomp/testsuite | |
parent | 96a87981994da859c17259d8c4dccb6602476b0e | |
Merge current set of OpenACC changes from gomp-4_0-branch.
contrib/
* gcc_update (files_and_dependencies): Update rules for new
libgomp/plugin/Makefrag.am and libgomp/plugin/configfrag.ac files.
gcc/
* builtin-types.def (BT_FN_VOID_INT_INT_VAR)
(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR_INT_INT_VAR)
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR):
New function types.
* builtins.c: Include "gomp-constants.h".
(expand_builtin_acc_on_device): New function.
(expand_builtin, is_inexpensive_builtin): Handle
BUILT_IN_ACC_ON_DEVICE.
* builtins.def (DEF_GOACC_BUILTIN, DEF_GOACC_BUILTIN_COMPILER):
New macros.
* cgraph.c (cgraph_node::create): Consider flag_openacc next to
flag_openmp.
* config.gcc <nvptx-*> (tm_file): Add nvptx/offload.h.
<*-intelmic-* | *-intelmicemul-*> (tm_file): Add
i386/intelmic-offload.h.
* gcc.c (LINK_COMMAND_SPEC, GOMP_SELF_SPECS): For -fopenacc, link
to libgomp and its dependencies.
* config/arc/arc.h (LINK_COMMAND_SPEC): Likewise.
* config/darwin.h (LINK_COMMAND_SPEC_A): Likewise.
* config/i386/mingw32.h (GOMP_SELF_SPECS): Likewise.
* config/ia64/hpux.h (LIB_SPEC): Likewise.
* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
* doc/generic.texi: Update for OpenACC changes.
* doc/gimple.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/sourcebuild.texi: Likewise.
* gimple-pretty-print.c (dump_gimple_omp_for): Handle
GF_OMP_FOR_KIND_OACC_LOOP.
(dump_gimple_omp_target): Handle GF_OMP_TARGET_KIND_OACC_KERNELS,
GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_DATA,
GF_OMP_TARGET_KIND_OACC_UPDATE,
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA.
Dump more data.
* gimple.c: Update comments for OpenACC changes.
* gimple.def: Likewise.
* gimple.h: Likewise.
(enum gf_mask): Add GF_OMP_FOR_KIND_OACC_LOOP,
GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_KERNELS,
GF_OMP_TARGET_KIND_OACC_DATA, GF_OMP_TARGET_KIND_OACC_UPDATE,
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA.
(gimple_omp_for_cond, gimple_omp_for_set_cond): Sort in the
appropriate place.
(is_gimple_omp_oacc, is_gimple_omp_offloaded): New functions.
* gimplify.c: Include "gomp-constants.h".
Update comments for OpenACC changes.
(is_gimple_stmt): Handle OACC_PARALLEL, OACC_KERNELS, OACC_DATA,
OACC_HOST_DATA, OACC_DECLARE, OACC_UPDATE, OACC_ENTER_DATA,
OACC_EXIT_DATA, OACC_CACHE, OACC_LOOP.
(gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle
OMP_CLAUSE__CACHE_, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_GANG, OMP_CLAUSE_WORKER,
OMP_CLAUSE_VECTOR, OMP_CLAUSE_DEVICE_RESIDENT,
OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_AUTO,
OMP_CLAUSE_SEQ.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Use
GOMP_MAP_* instead of OMP_CLAUSE_MAP_*. Use
OMP_CLAUSE_SET_MAP_KIND.
(gimplify_oacc_cache): New function.
(gimplify_omp_for): Handle OACC_LOOP.
(gimplify_omp_workshare): Handle OACC_KERNELS, OACC_PARALLEL,
OACC_DATA.
(gimplify_omp_target_update): Handle OACC_ENTER_DATA,
OACC_EXIT_DATA, OACC_UPDATE.
(gimplify_expr): Handle OACC_LOOP, OACC_CACHE, OACC_HOST_DATA,
OACC_DECLARE, OACC_KERNELS, OACC_PARALLEL, OACC_DATA,
OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_UPDATE.
(gimplify_body): Consider flag_openacc next to flag_openmp.
* lto-streamer-out.c: Include "gomp-constants.h".
* omp-builtins.def (BUILT_IN_ACC_GET_DEVICE_TYPE)
(BUILT_IN_GOACC_DATA_START, BUILT_IN_GOACC_DATA_END)
(BUILT_IN_GOACC_ENTER_EXIT_DATA, BUILT_IN_GOACC_PARALLEL)
(BUILT_IN_GOACC_UPDATE, BUILT_IN_GOACC_WAIT)
(BUILT_IN_GOACC_GET_THREAD_NUM, BUILT_IN_GOACC_GET_NUM_THREADS)
(BUILT_IN_ACC_ON_DEVICE): New builtins.
* omp-low.c: Include "gomp-constants.h".
Update comments for OpenACC changes.
(struct omp_context): Add reduction_map, gwv_below, gwv_this
members.
(extract_omp_for_data, use_pointer_for_field, install_var_field)
(new_omp_context, delete_omp_context, scan_sharing_clauses)
(create_omp_child_function, scan_omp_for, scan_omp_target)
(check_omp_nesting_restrictions, lower_reduction_clauses)
(build_omp_regions_1, diagnose_sb_0, make_gimple_omp_edges):
Update for OpenACC changes.
(scan_sharing_clauses): Handle OMP_CLAUSE_NUM_GANGS,
OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH,
OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT, OMP_CLAUSE_GANG,
OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_DEVICE_RESIDENT,
OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE__CACHE_, OMP_CLAUSE_INDEPENDENT,
OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ. Use GOMP_MAP_* instead of
OMP_CLAUSE_MAP_*.
(expand_omp_for_static_nochunk, expand_omp_for_static_chunk):
Handle GF_OMP_FOR_KIND_OACC_LOOP.
(expand_omp_target, lower_omp_target): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_KERNELS,
GF_OMP_TARGET_KIND_OACC_UPDATE,
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA,
GF_OMP_TARGET_KIND_OACC_DATA.
(pass_expand_omp::execute, execute_lower_omp)
(pass_diagnose_omp_blocks::gate): Consider flag_openacc next to
flag_openmp.
(offload_symbol_decl): New variable.
(oacc_get_reduction_array_id, oacc_max_threads)
(get_offload_symbol_decl, get_base_type, lookup_oacc_reduction)
(maybe_lookup_oacc_reduction, enclosing_target_ctx)
(oacc_loop_or_target_p, oacc_lower_reduction_var_helper)
(oacc_gimple_assign, oacc_initialize_reduction_data)
(oacc_finalize_reduction_data, oacc_process_reduction_data): New
functions.
(is_targetreg_ctx): Remove function.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__CACHE_,
OMP_CLAUSE_DEVICE_RESIDENT, OMP_CLAUSE_USE_DEVICE,
OMP_CLAUSE_GANG, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ, OMP_CLAUSE_INDEPENDENT,
OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_NUM_GANGS,
OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH.
* tree.c (omp_clause_code_name, walk_tree_1): Update accordingly.
* tree.h (OMP_CLAUSE_GANG_EXPR, OMP_CLAUSE_GANG_STATIC_EXPR)
(OMP_CLAUSE_ASYNC_EXPR, OMP_CLAUSE_WAIT_EXPR)
(OMP_CLAUSE_VECTOR_EXPR, OMP_CLAUSE_WORKER_EXPR)
(OMP_CLAUSE_NUM_GANGS_EXPR, OMP_CLAUSE_NUM_WORKERS_EXPR)
(OMP_CLAUSE_VECTOR_LENGTH_EXPR): New macros.
* tree-core.h: Update comments for OpenACC changes.
(enum omp_clause_map_kind): Remove.
(struct tree_omp_clause): Change type of map_kind member from enum
omp_clause_map_kind to unsigned char.
* tree-inline.c: Update comments for OpenACC changes.
* tree-nested.c: Likewise. Include "gomp-constants.h".
(convert_nonlocal_reference_stmt, convert_local_reference_stmt)
(convert_tramp_reference_stmt, convert_gimple_call): Update for
OpenACC changes. Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*. Use
OMP_CLAUSE_SET_MAP_KIND.
* tree-pretty-print.c: Include "gomp-constants.h".
(dump_omp_clause): Handle OMP_CLAUSE_DEVICE_RESIDENT,
OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE__CACHE_, OMP_CLAUSE_GANG,
OMP_CLAUSE_ASYNC, OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ,
OMP_CLAUSE_WAIT, OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR,
OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_INDEPENDENT. Use GOMP_MAP_*
instead of OMP_CLAUSE_MAP_*.
(dump_generic_node): Handle OACC_PARALLEL, OACC_KERNELS,
OACC_DATA, OACC_HOST_DATA, OACC_DECLARE, OACC_UPDATE,
OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_CACHE, OACC_LOOP.
* tree-streamer-in.c: Include "gomp-constants.h".
(unpack_ts_omp_clause_value_fields): Use GOMP_MAP_* instead of
OMP_CLAUSE_MAP_*. Use OMP_CLAUSE_SET_MAP_KIND.
* tree-streamer-out.c: Include "gomp-constants.h".
(pack_ts_omp_clause_value_fields): Use GOMP_MAP_* instead of
OMP_CLAUSE_MAP_*.
* tree.def (OACC_PARALLEL, OACC_KERNELS, OACC_DATA)
(OACC_HOST_DATA, OACC_LOOP, OACC_CACHE, OACC_DECLARE)
(OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_UPDATE): New tree codes.
* tree.c (omp_clause_num_ops): Update accordingly.
* tree.h (OMP_BODY, OMP_CLAUSES, OMP_LOOP_CHECK, OMP_CLAUSE_SIZE):
Likewise.
(OACC_PARALLEL_BODY, OACC_PARALLEL_CLAUSES, OACC_KERNELS_BODY)
(OACC_KERNELS_CLAUSES, OACC_DATA_BODY, OACC_DATA_CLAUSES)
(OACC_HOST_DATA_BODY, OACC_HOST_DATA_CLAUSES, OACC_CACHE_CLAUSES)
(OACC_DECLARE_CLAUSES, OACC_ENTER_DATA_CLAUSES)
(OACC_EXIT_DATA_CLAUSES, OACC_UPDATE_CLAUSES)
(OACC_KERNELS_COMBINED, OACC_PARALLEL_COMBINED): New macros.
* tree.h (OMP_CLAUSE_MAP_KIND): Cast it to enum gomp_map_kind.
(OMP_CLAUSE_SET_MAP_KIND): New macro.
* varpool.c (varpool_node::get_create): Consider flag_openacc next
to flag_openmp.
* config/i386/intelmic-offload.h: New file.
* config/nvptx/offload.h: Likewise.
gcc/ada/
* gcc-interface/utils.c (DEF_FUNCTION_TYPE_VAR_8)
(DEF_FUNCTION_TYPE_VAR_12): New macros.
gcc/c-family/
* c.opt (fopenacc): New option.
* c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC.
* c-common.c (DEF_FUNCTION_TYPE_VAR_8, DEF_FUNCTION_TYPE_VAR_12):
New macros.
* c-common.h (c_finish_oacc_wait): New prototype.
* c-omp.c: Include "omp-low.h" and "gomp-constants.h".
(c_finish_oacc_wait): New function.
* c-pragma.c (oacc_pragmas): New variable.
(c_pp_lookup_pragma, init_pragma): Handle it.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OACC_CACHE,
PRAGMA_OACC_DATA, PRAGMA_OACC_ENTER_DATA, PRAGMA_OACC_EXIT_DATA,
PRAGMA_OACC_KERNELS, PRAGMA_OACC_LOOP, PRAGMA_OACC_PARALLEL,
PRAGMA_OACC_UPDATE, PRAGMA_OACC_WAIT.
(enum pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ASYNC,
PRAGMA_OACC_CLAUSE_AUTO, PRAGMA_OACC_CLAUSE_COLLAPSE,
PRAGMA_OACC_CLAUSE_COPY, PRAGMA_OACC_CLAUSE_COPYIN,
PRAGMA_OACC_CLAUSE_COPYOUT, PRAGMA_OACC_CLAUSE_CREATE,
PRAGMA_OACC_CLAUSE_DELETE, PRAGMA_OACC_CLAUSE_DEVICE,
PRAGMA_OACC_CLAUSE_DEVICEPTR, PRAGMA_OACC_CLAUSE_FIRSTPRIVATE,
PRAGMA_OACC_CLAUSE_GANG, PRAGMA_OACC_CLAUSE_HOST,
PRAGMA_OACC_CLAUSE_IF, PRAGMA_OACC_CLAUSE_NUM_GANGS,
PRAGMA_OACC_CLAUSE_NUM_WORKERS, PRAGMA_OACC_CLAUSE_PRESENT,
PRAGMA_OACC_CLAUSE_PRESENT_OR_COPY,
PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYIN,
PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYOUT,
PRAGMA_OACC_CLAUSE_PRESENT_OR_CREATE, PRAGMA_OACC_CLAUSE_PRIVATE,
PRAGMA_OACC_CLAUSE_REDUCTION, PRAGMA_OACC_CLAUSE_SELF,
PRAGMA_OACC_CLAUSE_SEQ, PRAGMA_OACC_CLAUSE_VECTOR,
PRAGMA_OACC_CLAUSE_VECTOR_LENGTH, PRAGMA_OACC_CLAUSE_WAIT,
PRAGMA_OACC_CLAUSE_WORKER.
gcc/c/
* c-parser.c: Include "gomp-constants.h".
(c_parser_omp_clause_map): Use enum gomp_map_kind instead of enum
omp_clause_map_kind. Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
Use OMP_CLAUSE_SET_MAP_KIND.
(c_parser_pragma): Handle PRAGMA_OACC_ENTER_DATA,
PRAGMA_OACC_EXIT_DATA, PRAGMA_OACC_UPDATE.
(c_parser_omp_construct): Handle PRAGMA_OACC_CACHE,
PRAGMA_OACC_DATA, PRAGMA_OACC_KERNELS, PRAGMA_OACC_LOOP,
PRAGMA_OACC_PARALLEL, PRAGMA_OACC_WAIT.
(c_parser_omp_clause_name): Handle "auto", "async", "copy",
"copyout", "create", "delete", "deviceptr", "gang", "host",
"num_gangs", "num_workers", "present", "present_or_copy", "pcopy",
"present_or_copyin", "pcopyin", "present_or_copyout", "pcopyout",
"present_or_create", "pcreate", "seq", "self", "vector",
"vector_length", "wait", "worker".
(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
(OACC_ENTER_DATA_CLAUSE_MASK, OACC_EXIT_DATA_CLAUSE_MASK)
(OACC_LOOP_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK)
(OACC_UPDATE_CLAUSE_MASK, OACC_WAIT_CLAUSE_MASK): New macros.
(c_parser_omp_variable_list): Handle OMP_CLAUSE__CACHE_.
(c_parser_oacc_wait_list, c_parser_oacc_data_clause)
(c_parser_oacc_data_clause_deviceptr)
(c_parser_omp_clause_num_gangs, c_parser_omp_clause_num_workers)
(c_parser_oacc_clause_async, c_parser_oacc_clause_wait)
(c_parser_omp_clause_vector_length, c_parser_oacc_all_clauses)
(c_parser_oacc_cache, c_parser_oacc_data, c_parser_oacc_kernels)
(c_parser_oacc_enter_exit_data, c_parser_oacc_loop)
(c_parser_oacc_parallel, c_parser_oacc_update)
(c_parser_oacc_wait): New functions.
* c-tree.h (c_finish_oacc_parallel, c_finish_oacc_kernels)
(c_finish_oacc_data): New prototypes.
* c-typeck.c: Include "gomp-constants.h".
(handle_omp_array_sections): Handle GOMP_MAP_FORCE_DEVICEPTR. Use
GOMP_MAP_* instead of OMP_CLAUSE_MAP_*. Use
OMP_CLAUSE_SET_MAP_KIND.
(c_finish_oacc_parallel, c_finish_oacc_kernels)
(c_finish_oacc_data): New functions.
(c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_,
OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ, OMP_CLAUSE_GANG,
OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, and OMP_CLAUSE_MAP's
GOMP_MAP_FORCE_DEVICEPTR.
gcc/cp/
* parser.c: Include "gomp-constants.h".
(cp_parser_omp_clause_map): Use enum gomp_map_kind instead of enum
omp_clause_map_kind. Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
Use OMP_CLAUSE_SET_MAP_KIND.
(cp_parser_omp_construct, cp_parser_pragma): Handle
PRAGMA_OACC_CACHE, PRAGMA_OACC_DATA, PRAGMA_OACC_ENTER_DATA,
PRAGMA_OACC_EXIT_DATA, PRAGMA_OACC_KERNELS, PRAGMA_OACC_PARALLEL,
PRAGMA_OACC_LOOP, PRAGMA_OACC_UPDATE, PRAGMA_OACC_WAIT.
(cp_parser_omp_clause_name): Handle "async", "copy", "copyout",
"create", "delete", "deviceptr", "host", "num_gangs",
"num_workers", "present", "present_or_copy", "pcopy",
"present_or_copyin", "pcopyin", "present_or_copyout", "pcopyout",
"present_or_create", "pcreate", "vector_length", "wait".
(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
(OACC_EXIT_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
(OACC_LOOP_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK)
(OACC_UPDATE_CLAUSE_MASK, OACC_WAIT_CLAUSE_MASK): New macros.
(cp_parser_omp_var_list_no_open): Handle OMP_CLAUSE__CACHE_.
(cp_parser_oacc_data_clause, cp_parser_oacc_data_clause_deviceptr)
(cp_parser_oacc_clause_vector_length, cp_parser_oacc_wait_list)
(cp_parser_oacc_clause_wait, cp_parser_omp_clause_num_gangs)
(cp_parser_omp_clause_num_workers, cp_parser_oacc_clause_async)
(cp_parser_oacc_all_clauses, cp_parser_oacc_cache)
(cp_parser_oacc_data, cp_parser_oacc_enter_exit_data)
(cp_parser_oacc_kernels, cp_parser_oacc_loop)
(cp_parser_oacc_parallel, cp_parser_oacc_update)
(cp_parser_oacc_wait): New functions.
* cp-tree.h (finish_oacc_data, finish_oacc_kernels)
(finish_oacc_parallel): New prototypes.
* semantics.c: Include "gomp-constants.h".
(handle_omp_array_sections): Handle GOMP_MAP_FORCE_DEVICEPTR. Use
GOMP_MAP_* instead of OMP_CLAUSE_MAP_*. Use
OMP_CLAUSE_SET_MAP_KIND.
(finish_omp_clauses): Handle OMP_CLAUSE_ASYNC,
OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_WAIT, OMP_CLAUSE__CACHE_.
Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
(finish_oacc_data, finish_oacc_kernels, finish_oacc_parallel): New
functions.
gcc/fortran/
* lang.opt (fopenacc): New option.
* cpp.c (cpp_define_builtins): Conditionally define _OPENACC.
* dump-parse-tree.c (show_omp_node): Split part of it into...
(show_omp_clauses): ... this new function.
(show_omp_node, show_code_node): Handle EXEC_OACC_PARALLEL_LOOP,
EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS,
EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
(show_namespace): Update for OpenACC.
* f95-lang.c (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_8)
(DEF_FUNCTION_TYPE_VAR_12, DEF_GOACC_BUILTIN)
(DEF_GOACC_BUILTIN_COMPILER): New macros.
* types.def (BT_FN_VOID_INT_INT_VAR)
(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR_INT_INT_VAR)
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR):
New function types.
* gfortran.h (gfc_statement): Add ST_OACC_PARALLEL_LOOP,
ST_OACC_END_PARALLEL_LOOP, ST_OACC_PARALLEL, ST_OACC_END_PARALLEL,
ST_OACC_KERNELS, ST_OACC_END_KERNELS, ST_OACC_DATA,
ST_OACC_END_DATA, ST_OACC_HOST_DATA, ST_OACC_END_HOST_DATA,
ST_OACC_LOOP, ST_OACC_END_LOOP, ST_OACC_DECLARE, ST_OACC_UPDATE,
ST_OACC_WAIT, ST_OACC_CACHE, ST_OACC_KERNELS_LOOP,
ST_OACC_END_KERNELS_LOOP, ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA,
ST_OACC_ROUTINE.
(struct gfc_expr_list): New data type.
(gfc_get_expr_list): New macro.
(gfc_omp_map_op): Add OMP_MAP_FORCE_ALLOC, OMP_MAP_FORCE_DEALLOC,
OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM, OMP_MAP_FORCE_TOFROM,
OMP_MAP_FORCE_PRESENT, OMP_MAP_FORCE_DEVICEPTR.
(OMP_LIST_FIRST, OMP_LIST_DEVICE_RESIDENT, OMP_LIST_USE_DEVICE)
(OMP_LIST_CACHE): New enumerators.
(struct gfc_omp_clauses): Add async_expr, gang_expr, worker_expr,
vector_expr, num_gangs_expr, num_workers_expr, vector_length_expr,
wait_list, tile_list, async, gang, worker, vector, seq,
independent, wait, par_auto, gang_static, and loc members.
(struct gfc_namespace): Add oacc_declare_clauses member.
(gfc_exec_op): Add EXEC_OACC_KERNELS_LOOP,
EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS,
EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
(gfc_free_expr_list, gfc_resolve_oacc_directive)
(gfc_resolve_oacc_declare, gfc_resolve_oacc_parallel_loop_blocks)
(gfc_resolve_oacc_blocks): New prototypes.
* match.c (match_exit_cycle): Handle EXEC_OACC_LOOP and
EXEC_OACC_PARALLEL_LOOP.
* match.h (gfc_match_oacc_cache, gfc_match_oacc_wait)
(gfc_match_oacc_update, gfc_match_oacc_declare)
(gfc_match_oacc_loop, gfc_match_oacc_host_data)
(gfc_match_oacc_data, gfc_match_oacc_kernels)
(gfc_match_oacc_kernels_loop, gfc_match_oacc_parallel)
(gfc_match_oacc_parallel_loop, gfc_match_oacc_enter_data)
(gfc_match_oacc_exit_data, gfc_match_oacc_routine): New
prototypes.
* openmp.c: Include "diagnostic.h" and "gomp-constants.h".
(gfc_free_omp_clauses): Update for members added to struct
gfc_omp_clauses.
(gfc_match_omp_clauses): Change mask parameter to uint64_t. Add
openacc parameter.
(resolve_omp_clauses): Add openacc parameter. Update for OpenACC.
(struct fortran_omp_context): Add is_openmp member.
(gfc_resolve_omp_parallel_blocks): Initialize it.
(gfc_resolve_do_iterator): Update for OpenACC.
(gfc_resolve_omp_directive): Call
resolve_omp_directive_inside_oacc_region.
(OMP_CLAUSE_PRIVATE, OMP_CLAUSE_FIRSTPRIVATE)
(OMP_CLAUSE_LASTPRIVATE, OMP_CLAUSE_COPYPRIVATE)
(OMP_CLAUSE_SHARED, OMP_CLAUSE_COPYIN, OMP_CLAUSE_REDUCTION)
(OMP_CLAUSE_IF, OMP_CLAUSE_NUM_THREADS, OMP_CLAUSE_SCHEDULE)
(OMP_CLAUSE_DEFAULT, OMP_CLAUSE_ORDERED, OMP_CLAUSE_COLLAPSE)
(OMP_CLAUSE_UNTIED, OMP_CLAUSE_FINAL, OMP_CLAUSE_MERGEABLE)
(OMP_CLAUSE_ALIGNED, OMP_CLAUSE_DEPEND, OMP_CLAUSE_INBRANCH)
(OMP_CLAUSE_LINEAR, OMP_CLAUSE_NOTINBRANCH, OMP_CLAUSE_PROC_BIND)
(OMP_CLAUSE_SAFELEN, OMP_CLAUSE_SIMDLEN, OMP_CLAUSE_UNIFORM)
(OMP_CLAUSE_DEVICE, OMP_CLAUSE_MAP, OMP_CLAUSE_TO)
(OMP_CLAUSE_FROM, OMP_CLAUSE_NUM_TEAMS, OMP_CLAUSE_THREAD_LIMIT)
(OMP_CLAUSE_DIST_SCHEDULE): Use uint64_t.
(OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS)
(OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_COPY, OMP_CLAUSE_COPYOUT)
(OMP_CLAUSE_CREATE, OMP_CLAUSE_PRESENT)
(OMP_CLAUSE_PRESENT_OR_COPY, OMP_CLAUSE_PRESENT_OR_COPYIN)
(OMP_CLAUSE_PRESENT_OR_COPYOUT, OMP_CLAUSE_PRESENT_OR_CREATE)
(OMP_CLAUSE_DEVICEPTR, OMP_CLAUSE_GANG, OMP_CLAUSE_WORKER)
(OMP_CLAUSE_VECTOR, OMP_CLAUSE_SEQ, OMP_CLAUSE_INDEPENDENT)
(OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_DEVICE_RESIDENT)
(OMP_CLAUSE_HOST_SELF, OMP_CLAUSE_OACC_DEVICE, OMP_CLAUSE_WAIT)
(OMP_CLAUSE_DELETE, OMP_CLAUSE_AUTO, OMP_CLAUSE_TILE): New macros.
(gfc_match_omp_clauses): Handle those.
(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
(OACC_LOOP_CLAUSES, OACC_PARALLEL_LOOP_CLAUSES)
(OACC_KERNELS_LOOP_CLAUSES, OACC_HOST_DATA_CLAUSES)
(OACC_DECLARE_CLAUSES, OACC_UPDATE_CLAUSES)
(OACC_ENTER_DATA_CLAUSES, OACC_EXIT_DATA_CLAUSES)
(OACC_WAIT_CLAUSES): New macros.
(gfc_free_expr_list, match_oacc_expr_list, match_oacc_clause_gang)
(gfc_match_omp_map_clause, gfc_match_oacc_parallel_loop)
(gfc_match_oacc_parallel, gfc_match_oacc_kernels_loop)
(gfc_match_oacc_kernels, gfc_match_oacc_data)
(gfc_match_oacc_host_data, gfc_match_oacc_loop)
(gfc_match_oacc_declare, gfc_match_oacc_update)
(gfc_match_oacc_enter_data, gfc_match_oacc_exit_data)
(gfc_match_oacc_wait, gfc_match_oacc_cache)
(gfc_match_oacc_routine, oacc_is_loop)
(resolve_oacc_scalar_int_expr, resolve_oacc_positive_int_expr)
(check_symbol_not_pointer, check_array_not_assumed)
(resolve_oacc_data_clauses, resolve_oacc_deviceptr_clause)
(oacc_compatible_clauses, oacc_is_parallel, oacc_is_kernels)
(omp_code_to_statement, oacc_code_to_statement)
(resolve_oacc_directive_inside_omp_region)
(resolve_omp_directive_inside_oacc_region)
(resolve_oacc_nested_loops, resolve_oacc_params_in_parallel)
(resolve_oacc_loop_blocks, gfc_resolve_oacc_blocks)
(resolve_oacc_loop, resolve_oacc_cache, gfc_resolve_oacc_declare)
(gfc_resolve_oacc_directive): New functions.
* parse.c (next_free): Update for OpenACC. Move some code into...
(verify_token_free): ... this new function.
(next_fixed): Update for OpenACC. Move some code into...
(verify_token_fixed): ... this new function.
(case_executable): Add ST_OACC_UPDATE, ST_OACC_WAIT,
ST_OACC_CACHE, ST_OACC_ENTER_DATA, and ST_OACC_EXIT_DATA.
(case_exec_markers): Add ST_OACC_PARALLEL_LOOP, ST_OACC_PARALLEL,
ST_OACC_KERNELS, ST_OACC_DATA, ST_OACC_HOST_DATA, ST_OACC_LOOP,
ST_OACC_KERNELS_LOOP.
(case_decl): Add ST_OACC_ROUTINE.
(push_state, parse_critical_block, parse_progunit): Update for
OpenACC.
(gfc_ascii_statement): Handle ST_OACC_PARALLEL_LOOP,
ST_OACC_END_PARALLEL_LOOP, ST_OACC_PARALLEL, ST_OACC_END_PARALLEL,
ST_OACC_KERNELS, ST_OACC_END_KERNELS, ST_OACC_KERNELS_LOOP,
ST_OACC_END_KERNELS_LOOP, ST_OACC_DATA, ST_OACC_END_DATA,
ST_OACC_HOST_DATA, ST_OACC_END_HOST_DATA, ST_OACC_LOOP,
ST_OACC_END_LOOP, ST_OACC_DECLARE, ST_OACC_UPDATE, ST_OACC_WAIT,
ST_OACC_CACHE, ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA,
ST_OACC_ROUTINE.
(verify_st_order, parse_spec): Handle ST_OACC_DECLARE.
(parse_executable): Handle ST_OACC_PARALLEL_LOOP,
ST_OACC_KERNELS_LOOP, ST_OACC_LOOP, ST_OACC_PARALLEL,
ST_OACC_KERNELS, ST_OACC_DATA, ST_OACC_HOST_DATA.
(decode_oacc_directive, parse_oacc_structured_block)
(parse_oacc_loop, is_oacc): New functions.
* parse.h (struct gfc_state_data): Add oacc_declare_clauses
member.
(is_oacc): New prototype.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Handle
EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_PARALLEL,
EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS, EXEC_OACC_DATA,
EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP, EXEC_OACC_UPDATE,
EXEC_OACC_WAIT, EXEC_OACC_CACHE, EXEC_OACC_ENTER_DATA,
EXEC_OACC_EXIT_DATA.
(resolve_codes): Call gfc_resolve_oacc_declare.
* scanner.c (openacc_flag, openacc_locus): New variables.
(skip_free_comments): Update for OpenACC. Move some code into...
(skip_omp_attribute): ... this new function.
(skip_oacc_attribute): New function.
(skip_fixed_comments, gfc_next_char_literal): Update for OpenACC.
* st.c (gfc_free_statement): Handle EXEC_OACC_PARALLEL_LOOP,
EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS,
EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
* trans-decl.c (gfc_generate_function_code): Update for OpenACC.
* trans-openmp.c: Include "gomp-constants.h".
(gfc_omp_finish_clause, gfc_trans_omp_clauses): Use GOMP_MAP_*
instead of OMP_CLAUSE_MAP_*. Use OMP_CLAUSE_SET_MAP_KIND.
(gfc_trans_omp_clauses): Handle OMP_LIST_USE_DEVICE,
OMP_LIST_DEVICE_RESIDENT, OMP_LIST_CACHE, and OMP_MAP_FORCE_ALLOC,
OMP_MAP_FORCE_DEALLOC, OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM,
OMP_MAP_FORCE_TOFROM, OMP_MAP_FORCE_PRESENT,
OMP_MAP_FORCE_DEVICEPTR, and gfc_omp_clauses' async, seq,
independent, wait_list, num_gangs_expr, num_workers_expr,
vector_length_expr, vector, vector_expr, worker, worker_expr,
gang, gang_expr members.
(gfc_trans_omp_do): Handle EXEC_OACC_LOOP.
(gfc_convert_expr_to_tree, gfc_trans_oacc_construct)
(gfc_trans_oacc_executable_directive)
(gfc_trans_oacc_wait_directive, gfc_trans_oacc_combined_directive)
(gfc_trans_oacc_declare, gfc_trans_oacc_directive): New functions.
* trans-stmt.c (gfc_trans_block_construct): Update for OpenACC.
* trans-stmt.h (gfc_trans_oacc_directive, gfc_trans_oacc_declare):
New prototypes.
* trans.c (trans_code): Handle EXEC_OACC_CACHE, EXEC_OACC_WAIT,
EXEC_OACC_UPDATE, EXEC_OACC_LOOP, EXEC_OACC_HOST_DATA,
EXEC_OACC_DATA, EXEC_OACC_KERNELS, EXEC_OACC_KERNELS_LOOP,
EXEC_OACC_PARALLEL, EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_ENTER_DATA,
EXEC_OACC_EXIT_DATA.
* gfortran.texi: Update for OpenACC.
* intrinsic.texi: Likewise.
* invoke.texi: Likewise.
gcc/lto/
* lto-lang.c (DEF_FUNCTION_TYPE_VAR_8, DEF_FUNCTION_TYPE_VAR_12):
New macros.
* lto.c: Include "gomp-constants.h".
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_fopenacc): New
procedure.
* g++.dg/goacc-gomp/goacc-gomp.exp: New file.
* g++.dg/goacc/goacc.exp: Likewise.
* gcc.dg/goacc-gomp/goacc-gomp.exp: Likewise.
* gcc.dg/goacc/goacc.exp: Likewise.
* gfortran.dg/goacc/goacc.exp: Likewise.
* c-c++-common/cpp/openacc-define-1.c: New file.
* c-c++-common/cpp/openacc-define-2.c: Likewise.
* c-c++-common/cpp/openacc-define-3.c: Likewise.
* c-c++-common/goacc-gomp/nesting-1.c: Likewise.
* c-c++-common/goacc-gomp/nesting-fail-1.c: Likewise.
* c-c++-common/goacc/acc_on_device-2-off.c: Likewise.
* c-c++-common/goacc/acc_on_device-2.c: Likewise.
* c-c++-common/goacc/asyncwait-1.c: Likewise.
* c-c++-common/goacc/cache-1.c: Likewise.
* c-c++-common/goacc/clauses-fail.c: Likewise.
* c-c++-common/goacc/collapse-1.c: Likewise.
* c-c++-common/goacc/data-1.c: Likewise.
* c-c++-common/goacc/data-2.c: Likewise.
* c-c++-common/goacc/data-clause-duplicate-1.c: Likewise.
* c-c++-common/goacc/deviceptr-1.c: Likewise.
* c-c++-common/goacc/deviceptr-2.c: Likewise.
* c-c++-common/goacc/deviceptr-3.c: Likewise.
* c-c++-common/goacc/if-clause-1.c: Likewise.
* c-c++-common/goacc/if-clause-2.c: Likewise.
* c-c++-common/goacc/kernels-1.c: Likewise.
* c-c++-common/goacc/loop-1.c: Likewise.
* c-c++-common/goacc/loop-private-1.c: Likewise.
* c-c++-common/goacc/nesting-1.c: Likewise.
* c-c++-common/goacc/nesting-data-1.c: Likewise.
* c-c++-common/goacc/nesting-fail-1.c: Likewise.
* c-c++-common/goacc/parallel-1.c: Likewise.
* c-c++-common/goacc/pcopy.c: Likewise.
* c-c++-common/goacc/pcopyin.c: Likewise.
* c-c++-common/goacc/pcopyout.c: Likewise.
* c-c++-common/goacc/pcreate.c: Likewise.
* c-c++-common/goacc/pragma_context.c: Likewise.
* c-c++-common/goacc/present-1.c: Likewise.
* c-c++-common/goacc/reduction-1.c: Likewise.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.
* c-c++-common/goacc/sb-1.c: Likewise.
* c-c++-common/goacc/sb-2.c: Likewise.
* c-c++-common/goacc/sb-3.c: Likewise.
* c-c++-common/goacc/update-1.c: Likewise.
* gcc.dg/goacc/acc_on_device-1.c: Likewise.
* gfortran.dg/goacc/acc_on_device-1.f95: Likewise.
* gfortran.dg/goacc/acc_on_device-2-off.f95: Likewise.
* gfortran.dg/goacc/acc_on_device-2.f95: Likewise.
* gfortran.dg/goacc/assumed.f95: Likewise.
* gfortran.dg/goacc/asyncwait-1.f95: Likewise.
* gfortran.dg/goacc/asyncwait-2.f95: Likewise.
* gfortran.dg/goacc/asyncwait-3.f95: Likewise.
* gfortran.dg/goacc/asyncwait-4.f95: Likewise.
* gfortran.dg/goacc/branch.f95: Likewise.
* gfortran.dg/goacc/cache-1.f95: Likewise.
* gfortran.dg/goacc/coarray.f95: Likewise.
* gfortran.dg/goacc/continuation-free-form.f95: Likewise.
* gfortran.dg/goacc/cray.f95: Likewise.
* gfortran.dg/goacc/critical.f95: Likewise.
* gfortran.dg/goacc/data-clauses.f95: Likewise.
* gfortran.dg/goacc/data-tree.f95: Likewise.
* gfortran.dg/goacc/declare-1.f95: Likewise.
* gfortran.dg/goacc/enter-exit-data.f95: Likewise.
* gfortran.dg/goacc/fixed-1.f: Likewise.
* gfortran.dg/goacc/fixed-2.f: Likewise.
* gfortran.dg/goacc/fixed-3.f: Likewise.
* gfortran.dg/goacc/fixed-4.f: Likewise.
* gfortran.dg/goacc/host_data-tree.f95: Likewise.
* gfortran.dg/goacc/if.f95: Likewise.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
* gfortran.dg/goacc/list.f95: Likewise.
* gfortran.dg/goacc/literal.f95: Likewise.
* gfortran.dg/goacc/loop-1.f95: Likewise.
* gfortran.dg/goacc/loop-2.f95: Likewise.
* gfortran.dg/goacc/loop-3.f95: Likewise.
* gfortran.dg/goacc/loop-tree-1.f90: Likewise.
* gfortran.dg/goacc/omp.f95: Likewise.
* gfortran.dg/goacc/parallel-kernels-clauses.f95: Likewise.
* gfortran.dg/goacc/parallel-kernels-regions.f95: Likewise.
* gfortran.dg/goacc/parallel-tree.f95: Likewise.
* gfortran.dg/goacc/parameter.f95: Likewise.
* gfortran.dg/goacc/private-1.f95: Likewise.
* gfortran.dg/goacc/private-2.f95: Likewise.
* gfortran.dg/goacc/private-3.f95: Likewise.
* gfortran.dg/goacc/pure-elemental-procedures.f95: Likewise.
* gfortran.dg/goacc/reduction-2.f95: Likewise.
* gfortran.dg/goacc/reduction.f95: Likewise.
* gfortran.dg/goacc/routine-1.f90: Likewise.
* gfortran.dg/goacc/routine-2.f90: Likewise.
* gfortran.dg/goacc/sentinel-free-form.f95: Likewise.
* gfortran.dg/goacc/several-directives.f95: Likewise.
* gfortran.dg/goacc/sie.f95: Likewise.
* gfortran.dg/goacc/subarrays.f95: Likewise.
* gfortran.dg/gomp/map-1.f90: Likewise.
* gfortran.dg/openacc-define-1.f90: Likewise.
* gfortran.dg/openacc-define-2.f90: Likewise.
* gfortran.dg/openacc-define-3.f90: Likewise.
* g++.dg/gomp/block-1.C: Update for changed compiler output.
* g++.dg/gomp/block-2.C: Likewise.
* g++.dg/gomp/block-3.C: Likewise.
* g++.dg/gomp/block-5.C: Likewise.
* g++.dg/gomp/target-1.C: Likewise.
* g++.dg/gomp/target-2.C: Likewise.
* g++.dg/gomp/taskgroup-1.C: Likewise.
* g++.dg/gomp/teams-1.C: Likewise.
* gcc.dg/cilk-plus/jump-openmp.c: Likewise.
* gcc.dg/cilk-plus/jump.c: Likewise.
* gcc.dg/gomp/block-1.c: Likewise.
* gcc.dg/gomp/block-10.c: Likewise.
* gcc.dg/gomp/block-2.c: Likewise.
* gcc.dg/gomp/block-3.c: Likewise.
* gcc.dg/gomp/block-4.c: Likewise.
* gcc.dg/gomp/block-5.c: Likewise.
* gcc.dg/gomp/block-6.c: Likewise.
* gcc.dg/gomp/block-7.c: Likewise.
* gcc.dg/gomp/block-8.c: Likewise.
* gcc.dg/gomp/block-9.c: Likewise.
* gcc.dg/gomp/target-1.c: Likewise.
* gcc.dg/gomp/target-2.c: Likewise.
* gcc.dg/gomp/taskgroup-1.c: Likewise.
* gcc.dg/gomp/teams-1.c: Likewise.
include/
* gomp-constants.h: New file.
libgomp/
* Makefile.am (search_path): Add $(top_srcdir)/../include.
(libgomp_la_SOURCES): Add splay-tree.c, libgomp-plugin.c,
oacc-parallel.c, oacc-host.c, oacc-init.c, oacc-mem.c,
oacc-async.c, oacc-plugin.c, oacc-cuda.c.
[USE_FORTRAN] (libgomp_la_SOURCES): Add openacc.f90.
Include $(top_srcdir)/plugin/Makefrag.am.
(nodist_libsubinclude_HEADERS): Add openacc.h.
[USE_FORTRAN] (nodist_finclude_HEADERS): Add openacc_lib.h,
openacc.f90, openacc.mod, openacc_kinds.mod.
(omp_lib.mod): Generalize into...
(%.mod): ... this new rule.
(openacc_kinds.mod, openacc.mod): New rules.
* plugin/configfrag.ac: New file.
* configure.ac: Move plugin/offloading support into it. Include
it. Instantiate testsuite/libgomp-test-support.pt.exp.
* plugin/Makefrag.am: New file.
* testsuite/Makefile.am (OFFLOAD_TARGETS)
(OFFLOAD_ADDITIONAL_OPTIONS, OFFLOAD_ADDITIONAL_LIB_PATHS): Don't
export.
(libgomp-test-support.exp): New rule.
(all-local): Depend on it.
* Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
* config.h.in: Likewise.
* configure: Likewise.
* configure.tgt: Harden shell syntax.
* env.c: Include "oacc-int.h".
(parse_acc_device_type): New function.
(gomp_debug_var, goacc_device_type, goacc_device_num): New
variables.
(initialize_env): Initialize those. Call
goacc_runtime_initialize.
* error.c (gomp_vdebug, gomp_debug, gomp_vfatal): New functions.
(gomp_fatal): Call gomp_vfatal.
* libgomp.h: Include "libgomp-plugin.h" and <stdarg.h>.
(gomp_debug_var, goacc_device_type, goacc_device_num, gomp_vdebug)
(gomp_debug, gomp_verror, gomp_vfatal, gomp_init_targets_once)
(splay_tree_node, splay_tree, splay_tree_key)
(struct target_mem_desc, struct splay_tree_key_s)
(struct gomp_memory_mapping, struct acc_dispatch_t)
(struct gomp_device_descr, gomp_acc_insert_pointer)
(gomp_acc_remove_pointer, target_mem_desc, gomp_copy_from_async)
(gomp_unmap_vars, gomp_init_device, gomp_init_tables)
(gomp_free_memmap, gomp_fini_device): New declarations.
(gomp_vdebug, gomp_debug): New macros.
Include "splay-tree.h".
* libgomp.map (OACC_2.0): New symbol version. Use for
acc_get_num_devices, acc_get_num_devices_h_, acc_set_device_type,
acc_set_device_type_h_, acc_get_device_type,
acc_get_device_type_h_, acc_set_device_num, acc_set_device_num_h_,
acc_get_device_num, acc_get_device_num_h_, acc_async_test,
acc_async_test_h_, acc_async_test_all, acc_async_test_all_h_,
acc_wait, acc_wait_h_, acc_wait_async, acc_wait_async_h_,
acc_wait_all, acc_wait_all_h_, acc_wait_all_async,
acc_wait_all_async_h_, acc_init, acc_init_h_, acc_shutdown,
acc_shutdown_h_, acc_on_device, acc_on_device_h_, acc_malloc,
acc_free, acc_copyin, acc_copyin_32_h_, acc_copyin_64_h_,
acc_copyin_array_h_, acc_present_or_copyin,
acc_present_or_copyin_32_h_, acc_present_or_copyin_64_h_,
acc_present_or_copyin_array_h_, acc_create, acc_create_32_h_,
acc_create_64_h_, acc_create_array_h_, acc_present_or_create,
acc_present_or_create_32_h_, acc_present_or_create_64_h_,
acc_present_or_create_array_h_, acc_copyout, acc_copyout_32_h_,
acc_copyout_64_h_, acc_copyout_array_h_, acc_delete,
acc_delete_32_h_, acc_delete_64_h_, acc_delete_array_h_,
acc_update_device, acc_update_device_32_h_,
acc_update_device_64_h_, acc_update_device_array_h_,
acc_update_self, acc_update_self_32_h_, acc_update_self_64_h_,
acc_update_self_array_h_, acc_map_data, acc_unmap_data,
acc_deviceptr, acc_hostptr, acc_is_present, acc_is_present_32_h_,
acc_is_present_64_h_, acc_is_present_array_h_,
acc_memcpy_to_device, acc_memcpy_from_device,
acc_get_current_cuda_device, acc_get_current_cuda_context,
acc_get_cuda_stream, acc_set_cuda_stream.
(GOACC_2.0): New symbol version. Use for GOACC_data_end,
GOACC_data_start, GOACC_enter_exit_data, GOACC_parallel,
GOACC_update, GOACC_wait, GOACC_get_thread_num,
GOACC_get_num_threads.
(GOMP_PLUGIN_1.0): New symbol version. Use for
GOMP_PLUGIN_malloc, GOMP_PLUGIN_malloc_cleared,
GOMP_PLUGIN_realloc, GOMP_PLUGIN_debug, GOMP_PLUGIN_error,
GOMP_PLUGIN_fatal, GOMP_PLUGIN_async_unmap_vars,
GOMP_PLUGIN_acc_thread.
* libgomp.texi: Update for OpenACC changes, and GOMP_DEBUG
environment variable.
* libgomp_g.h (GOACC_data_start, GOACC_data_end)
(GOACC_enter_exit_data, GOACC_parallel, GOACC_update, GOACC_wait)
(GOACC_get_num_threads, GOACC_get_thread_num): New declarations.
* splay-tree.h (splay_tree_lookup, splay_tree_insert)
(splay_tree_remove): New declarations.
(rotate_left, rotate_right, splay_tree_splay, splay_tree_insert)
(splay_tree_remove, splay_tree_lookup): Move into...
* splay-tree.c: ... this new file.
* target.c: Include "oacc-plugin.h", "oacc-int.h", <assert.h>.
(splay_tree_node, splay_tree, splay_tree_key)
(struct target_mem_desc, struct splay_tree_key_s)
(struct gomp_device_descr): Don't declare.
(num_devices_openmp): New variable.
(gomp_get_num_devices): Use it.
(gomp_init_targets_once): New function.
(gomp_get_num_devices): Use it.
(get_kind, gomp_copy_from_async, gomp_free_memmap)
(gomp_fini_device, gomp_register_image_for_device): New functions.
(gomp_map_vars): Add devaddrs parameter.
(gomp_update): Add mm parameter.
(gomp_init_device): Move most of it into...
(gomp_init_tables): ... this new function.
(gomp_register_images_for_device): Remove function.
(splay_compare, gomp_map_vars, gomp_unmap_vars, gomp_init_device):
Make them hidden instead of static.
(gomp_map_vars_existing, gomp_map_vars, gomp_unmap_vars)
(gomp_update, gomp_init_device, GOMP_target, GOMP_target_data)
(GOMP_target_end_data, GOMP_target_update)
(gomp_load_plugin_for_device, gomp_target_init): Update for
OpenACC changes.
* oacc-async.c: New file.
* oacc-cuda.c: Likewise.
* oacc-host.c: Likewise.
* oacc-init.c: Likewise.
* oacc-int.h: Likewise.
* oacc-mem.c: Likewise.
* oacc-parallel.c: Likewise.
* oacc-plugin.c: Likewise.
* oacc-plugin.h: Likewise.
* oacc-ptx.h: Likewise.
* openacc.f90: Likewise.
* openacc.h: Likewise.
* openacc_lib.h: Likewise.
* plugin/plugin-host.c: Likewise.
* plugin/plugin-nvptx.c: Likewise.
* libgomp-plugin.c: Likewise.
* libgomp-plugin.h: Likewise.
* libgomp_target.h: Remove file after merging content into the
former file. Update all users.
* testsuite/lib/libgomp.exp: Load libgomp-test-support.exp.
(offload_targets_s, offload_targets_s_openacc): New variables.
(check_effective_target_openacc_nvidia_accel_present)
(check_effective_target_openacc_nvidia_accel_selected): New
procedures.
(libgomp_init): Update for OpenACC changes.
* testsuite/libgomp-test-support.exp.in: New file.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
* testsuite/libgomp.oacc-c-c++-common/abort-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/abort-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/abort-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/cache-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/clauses-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/clauses-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-6.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-already-8.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/if-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-empty.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-10.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-11.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-12.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-13.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-14.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-15.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-16.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-17.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-18.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-19.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-21.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-22.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-24.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-25.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-26.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-27.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-28.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-29.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-30.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-31.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-32.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-33.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-35.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-36.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-37.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-38.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-39.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-40.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-41.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-45.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-46.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-49.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-50.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-51.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-52.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-55.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-56.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-57.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-59.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-6.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-61.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-62.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-63.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-64.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-65.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-66.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-67.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-68.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-86.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-87.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-88.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-89.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-9.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-91.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-92.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/nested-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/nested-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/offset-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-empty.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pointer-align-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-initial-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/subr.h: Likewise.
* testsuite/libgomp.oacc-c-c++-common/subr.ptx: Likewise.
* testsuite/libgomp.oacc-c-c++-common/timer.h: Likewise.
* testsuite/libgomp.oacc-c-c++-common/update-1-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/update-1.c: Likewise.
* testsuite/libgomp.oacc-fortran/abort-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/abort-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/asyncwait-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-7.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-8.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-4-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-4.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-5.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-6.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-7.f: Likewise.
* testsuite/libgomp.oacc-fortran/data-already-8.f: Likewise.
* testsuite/libgomp.oacc-fortran/lib-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-10.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/lib-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/lib-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-7.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-8.f90: Likewise.
* testsuite/libgomp.oacc-fortran/map-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pointer-align-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pset-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/subarrays-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/subarrays-2.f90: Likewise.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_name)
(GOMP_OFFLOAD_get_caps, GOMP_OFFLOAD_fini_device): New functions.
Co-Authored-By: Bernd Schmidt <bernds@codesourcery.com>
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
Co-Authored-By: Dmitry Bocharnikov <dmitry.b@samsung.com>
Co-Authored-By: Evgeny Gavrin <e.gavrin@samsung.com>
Co-Authored-By: Ilmir Usmanov <i.usmanov@samsung.com>
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
Co-Authored-By: James Norris <jnorris@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Tobias Burnus <burnus@net-b.de>
Co-Authored-By: Tom de Vries <tom@codesourcery.com>
From-SVN: r219682
Diffstat (limited to 'libgomp/testsuite')
204 files changed, 15315 insertions, 30 deletions
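The `offload_targets_s_openacc` variable that this commit adds to `testsuite/lib/libgomp.exp` is built from the configured offload targets: the comma-separated list is split, `intelmic` is skipped, `nvptx` is renamed to `nvidia`, and `host` is always appended. The real logic is the Tcl code in the `libgomp.exp` diff; what follows is only a minimal C sketch of the same mapping. The names `openacc_target_names` and `append_name` are hypothetical and do not exist in libgomp, and unlike Tcl's `split`, this sketch drops empty list elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append NAME (NAME_LEN bytes) to the space-separated list in OUT.
   Names that would not fit are dropped, which is acceptable for a sketch. */
static void
append_name (char *out, size_t out_size, const char *name, size_t name_len)
{
  size_t used = strlen (out);
  size_t sep = used > 0 ? 1 : 0;

  if (used + sep + name_len + 1 > out_size)
    return;
  if (sep)
    out[used++] = ' ';
  memcpy (out + used, name, name_len);
  out[used + name_len] = '\0';
}

/* Hypothetical helper, not part of libgomp: turn a comma-separated list of
   offload targets into a space-separated list of OpenACC device type names,
   following the rules of the Tcl loop in testsuite/lib/libgomp.exp. */
static void
openacc_target_names (const char *offload_targets, char *out, size_t out_size)
{
  const char *p = offload_targets;

  assert (out != NULL && out_size > 0);
  out[0] = '\0';

  while (p != NULL && *p != '\0')
    {
      size_t len = strcspn (p, ",");
      int is_intelmic = (len == 8 && strncmp (p, "intelmic", len) == 0);
      int is_nvptx = (len == 5 && strncmp (p, "nvptx", len) == 0);

      if (is_nvptx)
        append_name (out, out_size, "nvidia", 6);  /* OpenACC spelling. */
      else if (len > 0 && !is_intelmic)
        append_name (out, out_size, p, len);       /* Passed through as is. */
      /* intelmic is skipped, as in the Tcl code. */

      p += len;
      if (*p == ',')
        p++;
    }

  /* The host device is always tested. */
  append_name (out, out_size, "host", 4);
}
```

Calling `openacc_target_names ("nvptx,intelmic", buf, sizeof buf)` with a sufficiently large `buf` leaves `nvidia host` in it, which corresponds to the list of device types the OpenACC test drivers iterate over.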
diff --git a/libgomp/testsuite/Makefile.am b/libgomp/testsuite/Makefile.am
index 9cc103a..66a9d94 100644
--- a/libgomp/testsuite/Makefile.am
+++ b/libgomp/testsuite/Makefile.am
@@ -12,7 +12,16 @@ _RUNTEST = $(shell if test -f $(top_srcdir)/../dejagnu/runtest; then \
 	echo $(top_srcdir)/../dejagnu/runtest; else echo runtest; fi)
 RUNTEST = "$(_RUNTEST) $(AM_RUNTESTFLAGS)"
 
-# Used for support non-fallback offloading.
-export OFFLOAD_TARGETS = $(offload_targets)
-export OFFLOAD_ADDITIONAL_OPTIONS = $(offload_additional_options)
-export OFFLOAD_ADDITIONAL_LIB_PATHS = $(offload_additional_lib_paths)
+
+# Instead of directly in ../testsuite/libgomp-test-support.exp.in, the
+# following variables have to be "routed through" this Makefile, for expansion
+# of the several (Makefile) variables used therein.
+libgomp-test-support.exp: libgomp-test-support.pt.exp Makefile
+	cp $< $@.tmp
+	echo >> $@.tmp \
+	  'set offload_additional_options "$(offload_additional_options)"'
+	echo >> $@.tmp \
+	  'set offload_additional_lib_paths "$(offload_additional_lib_paths)"'
+	mv $@.tmp $@
+
+all-local: libgomp-test-support.exp
diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in
index 2f845f0..352fc3f 100644
--- a/libgomp/testsuite/Makefile.in
+++ b/libgomp/testsuite/Makefile.in
@@ -35,7 +35,8 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 subdir = testsuite
-DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/Makefile.am
+DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/Makefile.am \
+	$(srcdir)/libgomp-test-support.exp.in
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
 	$(top_srcdir)/../config/depstand.m4 \
@@ -49,12 +50,13 @@ am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
 	$(top_srcdir)/../config/tls.m4 $(top_srcdir)/../ltoptions.m4 \
 	$(top_srcdir)/../ltsugar.m4 $(top_srcdir)/../ltversion.m4 \
 	$(top_srcdir)/../lt~obsolete.m4 $(top_srcdir)/acinclude.m4 \
-	$(top_srcdir)/../libtool.m4 $(top_srcdir)/configure.ac
+	$(top_srcdir)/../libtool.m4 $(top_srcdir)/plugin/configfrag.ac \
+	$(top_srcdir)/configure.ac
 am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
 	$(ACLOCAL_M4)
 mkinstalldirs = $(SHELL) $(top_srcdir)/../mkinstalldirs
 CONFIG_HEADER = $(top_builddir)/config.h
-CONFIG_CLEAN_FILES =
+CONFIG_CLEAN_FILES = libgomp-test-support.pt.exp
 CONFIG_CLEAN_VPATH_FILES =
 SOURCES =
 DEJATOOL = $(PACKAGE)
@@ -71,6 +73,8 @@ CCDEPMODE = @CCDEPMODE@
 CFLAGS = @CFLAGS@
 CPP = @CPP@
 CPPFLAGS = @CPPFLAGS@
+CUDA_DRIVER_INCLUDE = @CUDA_DRIVER_INCLUDE@
+CUDA_DRIVER_LIB = @CUDA_DRIVER_LIB@
 CYGPATH_W = @CYGPATH_W@
 DEFS = @DEFS@
 DEPDIR = @DEPDIR@
@@ -129,6 +133,10 @@ PACKAGE_URL = @PACKAGE_URL@
 PACKAGE_VERSION = @PACKAGE_VERSION@
 PATH_SEPARATOR = @PATH_SEPARATOR@
 PERL = @PERL@
+PLUGIN_NVPTX = @PLUGIN_NVPTX@
+PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
+PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
+PLUGIN_NVPTX_LIBS = @PLUGIN_NVPTX_LIBS@
 RANLIB = @RANLIB@
 SECTION_LDFLAGS = @SECTION_LDFLAGS@
 SED = @SED@
@@ -250,6 +258,8 @@ $(top_srcdir)/configure: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
 $(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
 	cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
 $(am__aclocal_m4_deps):
+libgomp-test-support.pt.exp: $(top_builddir)/config.status $(srcdir)/libgomp-test-support.exp.in
+	cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@
 
 mostlyclean-libtool:
 	-rm -f *.lo
@@ -303,7 +313,7 @@ distclean-DEJAGNU:
 check-am: all-am
 	$(MAKE) $(AM_MAKEFLAGS) check-DEJAGNU
 check: check-am
-all-am: Makefile
+all-am: Makefile all-local
 installdirs:
 install: install-am
 install-exec: install-exec-am
@@ -398,23 +408,31 @@ uninstall-am:
 
 .MAKE: check-am install-am install-strip
 
-.PHONY: all all-am check check-DEJAGNU check-am clean clean-generic \
-	clean-libtool distclean distclean-DEJAGNU distclean-generic \
-	distclean-libtool dvi dvi-am html html-am info info-am install \
-	install-am install-data install-data-am install-dvi \
-	install-dvi-am install-exec install-exec-am install-html \
-	install-html-am install-info install-info-am install-man \
-	install-pdf install-pdf-am install-ps install-ps-am \
-	install-strip installcheck installcheck-am installdirs \
-	maintainer-clean maintainer-clean-generic mostlyclean \
-	mostlyclean-generic mostlyclean-libtool pdf pdf-am ps ps-am \
-	uninstall uninstall-am
-
-
-# Used for support non-fallback offloading.
-export OFFLOAD_TARGETS = $(offload_targets)
-export OFFLOAD_ADDITIONAL_OPTIONS = $(offload_additional_options)
-export OFFLOAD_ADDITIONAL_LIB_PATHS = $(offload_additional_lib_paths)
+.PHONY: all all-am all-local check check-DEJAGNU check-am clean \
+	clean-generic clean-libtool distclean distclean-DEJAGNU \
+	distclean-generic distclean-libtool dvi dvi-am html html-am \
+	info info-am install install-am install-data install-data-am \
+	install-dvi install-dvi-am install-exec install-exec-am \
+	install-html install-html-am install-info install-info-am \
+	install-man install-pdf install-pdf-am install-ps \
+	install-ps-am install-strip installcheck installcheck-am \
+	installdirs maintainer-clean maintainer-clean-generic \
+	mostlyclean mostlyclean-generic mostlyclean-libtool pdf pdf-am \
+	ps ps-am uninstall uninstall-am
+
+
+# Instead of directly in ../testsuite/libgomp-test-support.exp.in, the
+# following variables have to be "routed through" this Makefile, for expansion
+# of the several (Makefile) variables used therein.
+libgomp-test-support.exp: libgomp-test-support.pt.exp Makefile
+	cp $< $@.tmp
+	echo >> $@.tmp \
+	  'set offload_additional_options "$(offload_additional_options)"'
+	echo >> $@.tmp \
+	  'set offload_additional_lib_paths "$(offload_additional_lib_paths)"'
+	mv $@.tmp $@
+
+all-local: libgomp-test-support.exp
 
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 2d6f822..5a6eec1 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -32,6 +32,29 @@ load_gcc_lib timeout-dg.exp
 load_gcc_lib torture-options.exp
 load_gcc_lib fortran-modules.exp
 
+# Try to load a test support file, built during libgomp configuration.
+load_file libgomp-test-support.exp
+
+# Populate offload_targets_s (offloading targets separated by a space), and
+# offload_targets_s_openacc (the same, but with OpenACC names; OpenACC spells
+# some of them a little differently).
+set offload_targets_s [split $offload_targets ","]
+set offload_targets_s_openacc {}
+foreach offload_target_openacc $offload_targets_s {
+    switch $offload_target_openacc {
+        intelmic {
+            # Skip; will all FAIL because of missing
+            # GOMP_OFFLOAD_CAP_OPENACC_200.
+            continue
+        }
+        nvptx {
+            set offload_target_openacc "nvidia"
+        }
+    }
+    lappend offload_targets_s_openacc "$offload_target_openacc"
+}
+lappend offload_targets_s_openacc "host"
+
 set dg-do-what-default run
 
 #
@@ -108,13 +131,9 @@ proc libgomp_init { args } {
     # Compute what needs to be put into LD_LIBRARY_PATH
     set always_ld_library_path ".:${blddir}/.libs"
 
-    # Get offload-related variables from environment (exported by Makefile)
-    set offload_targets [getenv OFFLOAD_TARGETS]
-    set offload_additional_options [getenv OFFLOAD_ADDITIONAL_OPTIONS]
-    set offload_additional_lib_paths [getenv OFFLOAD_ADDITIONAL_LIB_PATHS]
-
     # Add liboffloadmic build directory in LD_LIBRARY_PATH to support
     # non-fallback testing for Intel MIC targets
+    global offload_targets
     if { [string match "*,intelmic,*" ",$offload_targets,"] } {
         append always_ld_library_path ":${blddir}/../liboffloadmic/.libs"
         append always_ld_library_path ":${blddir}/../liboffloadmic/plugin/.libs"
@@ -122,6 +141,7 @@ proc libgomp_init { args } {
         append always_ld_library_path ":${blddir}/../libstdc++-v3/src/.libs"
     }
 
+    global offload_additional_lib_paths
     if { $offload_additional_lib_paths != "" } {
         append always_ld_library_path "${offload_additional_lib_paths}"
     }
@@ -158,9 +178,29 @@ proc libgomp_init { args } {
         lappend ALWAYS_CFLAGS "additional_flags=-B${blddir}/.libs"
         lappend ALWAYS_CFLAGS "additional_flags=-I${blddir}"
         lappend ALWAYS_CFLAGS "ldflags=-L${blddir}/.libs"
+        # The top-level include directory, for gomp-constants.h.
+        lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/../../include"
     }
     lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/.."
 
+    # For build-tree testing, also consider the CUDA paths used for builing.
+    # For installed testing, we assume all that to be provided in the sysroot.
+    if { $blddir != "" } {
+        global cuda_driver_include
+        global cuda_driver_lib
+        if { $cuda_driver_include != "" } {
+            # Stop gfortran from freaking out:
+            # Warning: Nonexistent include directory "[...]"
+            if {[file exists $cuda_driver_include]} {
+                lappend ALWAYS_CFLAGS "additional_flags=-I$cuda_driver_include"
+            }
+        }
+        if { $cuda_driver_lib != "" } {
+            lappend ALWAYS_CFLAGS "additional_flags=-L$cuda_driver_lib"
+            append always_ld_library_path ":$cuda_driver_lib"
+        }
+    }
+
     # We use atomic operations in the testcases to validate results.
     if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
          && [check_effective_target_ia32] } {
@@ -191,6 +231,7 @@ proc libgomp_init { args } {
 
     # Used for support non-fallback offloading.
     # Help GCC to find target mkoffload.
+    global offload_additional_options
     if { $offload_additional_options != "" } {
         lappend ALWAYS_CFLAGS "additional_flags=${offload_additional_options}"
     }
@@ -278,3 +319,29 @@ proc check_effective_target_offload_device { } {
         }
     } ]
 }
+
+# Return 1 if at least one nvidia board is present.
+
+proc check_effective_target_openacc_nvidia_accel_present { } {
+    return [check_runtime openacc_nvidia_accel_present {
+        #include <openacc.h>
+        int main () {
+        return !(acc_get_num_devices (acc_device_nvidia) > 0);
+        }
+    } "" ]
+}
+
+# Return 1 if at least one nvidia board is present, and the nvidia device type
+# is selected by default by means of setting the environment variable
+# ACC_DEVICE_TYPE.
+
+proc check_effective_target_openacc_nvidia_accel_selected { } {
+    if { ![check_effective_target_openacc_nvidia_accel_present] } {
+        return 0;
+    }
+    global offload_target_openacc
+    if { $offload_target_openacc == "nvidia" } {
+        return 1;
+    }
+    return 0;
+}
diff --git a/libgomp/testsuite/libgomp-test-support.exp.in b/libgomp/testsuite/libgomp-test-support.exp.in
new file mode 100644
index 0000000..764bec0
--- /dev/null
+++ b/libgomp/testsuite/libgomp-test-support.exp.in
@@ -0,0 +1,4 @@
+set cuda_driver_include "@CUDA_DRIVER_INCLUDE@"
+set cuda_driver_lib "@CUDA_DRIVER_LIB@"
+
+set offload_targets "@offload_targets@"
diff --git a/libgomp/testsuite/libgomp.oacc-c++/c++.exp b/libgomp/testsuite/libgomp.oacc-c++/c++.exp
new file mode 100644
index 0000000..f486f9b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -0,0 +1,107 @@
+# This whole file adapted from libgomp.c++/c++.exp.
+
+load_lib libgomp-dg.exp
+load_gcc_lib gcc-dg.exp
+
+global shlib_ext
+
+set shlib_ext [get_shlib_extension]
+set lang_link_flags "-lstdc++"
+set lang_test_file_found 0
+set lang_library_path "../libstdc++-v3/src/.libs"
+if [info exists lang_include_flags] then {
+    unset lang_include_flags
+}
+
+# Initialize dg.
+dg-init
+
+# Turn on OpenACC.
+lappend ALWAYS_CFLAGS "additional_flags=-fopenacc"
+
+# Switch into C++ mode. Otherwise, the libgomp.oacc-c-c++-common/*.c
+# files would be compiled as C files.
+set SAVE_GCC_UNDER_TEST "$GCC_UNDER_TEST"
+set GCC_UNDER_TEST "$GCC_UNDER_TEST -x c++"
+
+set blddir [lookfor_file [get_multilibs] libgomp]
+
+
+if { $blddir != "" } {
+    # Look for a static libstdc++ first.
+    if [file exists "${blddir}/${lang_library_path}/libstdc++.a"] {
+        set lang_test_file "${lang_library_path}/libstdc++.a"
+        set lang_test_file_found 1
+    # We may have a shared only build, so look for a shared libstdc++.
+    } elseif [file exists "${blddir}/${lang_library_path}/libstdc++.${shlib_ext}"] {
+        set lang_test_file "${lang_library_path}/libstdc++.${shlib_ext}"
+        set lang_test_file_found 1
+    } else {
+        puts "No libstdc++ library found, will not execute c++ tests"
+    }
+} elseif { [info exists GXX_UNDER_TEST] } {
+    set lang_test_file_found 1
+    # Needs to exist for libgomp.exp.
+    set lang_test_file ""
+} else {
+    puts "GXX_UNDER_TEST not defined, will not execute c++ tests"
+}
+
+if { $lang_test_file_found } {
+    # Gather a list of all tests.
+    set tests [lsort [concat \
+                          [find $srcdir/$subdir *.C] \
+                          [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *.c]]]
+
+    if { $blddir != "" } {
+        set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
+    } else {
+        set ld_library_path "$always_ld_library_path"
+    }
+    append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
+    set_ld_library_path_env_vars
+
+    set flags_file "${blddir}/../libstdc++-v3/scripts/testsuite_flags"
+    if { [file exists $flags_file] } {
+        set libstdcxx_includes [exec sh $flags_file --build-includes]
+    } else {
+        set libstdcxx_includes ""
+    }
+
+    # Test OpenACC with available accelerators.
+    foreach offload_target_openacc $offload_targets_s_openacc {
+        set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1"
+
+        switch $offload_target_openacc {
+            host {
+                set acc_mem_shared 1
+            }
+            host_nonshm {
+                set acc_mem_shared 0
+            }
+            nvidia {
+                # Copy ptx file (TEMPORARY)
+                remote_download host $srcdir/libgomp.oacc-c-c++-common/subr.ptx
+
+                # Where timer.h lives
+                lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/libgomp.oacc-c-c++-common"
+
+                set acc_mem_shared 0
+            }
+            default {
+                set acc_mem_shared 0
+            }
+        }
+        set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared"
+
+        setenv ACC_DEVICE_TYPE $offload_target_openacc
+
+        dg-runtest $tests "$tagopt" $libstdcxx_includes
+    }
+}
+
+# See above.
+set GCC_UNDER_TEST "$SAVE_GCC_UNDER_TEST"
+
+# All done.
+dg-finish
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c
new file mode 100644
index 0000000..f88b9e3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-shouldfail "" { *-*-* } { "*" } { "" } } */
+
+#include <stdlib.h>
+
+int
+main (void)
+{
+
+#pragma acc parallel
+  {
+    abort ();
+  }
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-2.c
new file mode 100644
index 0000000..debb81e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+
+#pragma acc parallel
+  {
+    if (argc != 1)
+      abort ();
+  }
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
new file mode 100644
index 0000000..be7aaa8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-shouldfail "" { *-*-* } { "*" } { "" } } */
+
+#include <stdlib.h>
+
+int
+main (void) +{ + +#pragma acc kernels + { + abort (); + } + + return 0; +} + diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-4.c new file mode 100644 index 0000000..c29ca3f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-4.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ + +#include <stdlib.h> + +int +main (int argc, char **argv) +{ + +#pragma acc kernels + { + if (argc != 1) + abort (); + } + + return 0; +} + diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c new file mode 100644 index 0000000..81ea476 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c @@ -0,0 +1,75 @@ +/* Disable the acc_on_device builtin; we want to test the libgomp library + function. */ +/* { dg-additional-options "-fno-builtin-acc_on_device" } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char *argv[]) +{ + /* Host. */ + + { + if (!acc_on_device (acc_device_none)) + abort (); + if (!acc_on_device (acc_device_host)) + abort (); + if (acc_on_device (acc_device_host_nonshm)) + abort (); + if (acc_on_device (acc_device_not_host)) + abort (); + if (acc_on_device (acc_device_nvidia)) + abort (); + } + + + /* Host via offloading fallback mode. */ + +#pragma acc parallel if(0) + { + if (!acc_on_device (acc_device_none)) + abort (); + if (!acc_on_device (acc_device_host)) + abort (); + if (acc_on_device (acc_device_host_nonshm)) + abort (); + if (acc_on_device (acc_device_not_host)) + abort (); + if (acc_on_device (acc_device_nvidia)) + abort (); + } + + +#if !ACC_DEVICE_TYPE_host + + /* Offloaded. 
*/ + +#pragma acc parallel + { + if (acc_on_device (acc_device_none)) + abort (); + if (acc_on_device (acc_device_host)) + abort (); +#if ACC_DEVICE_TYPE_host_nonshm + if (!acc_on_device (acc_device_host_nonshm)) + abort (); +#else + if (acc_on_device (acc_device_host_nonshm)) + abort (); +#endif + if (!acc_on_device (acc_device_not_host)) + abort (); +#if ACC_DEVICE_TYPE_nvidia + if (!acc_on_device (acc_device_nvidia)) + abort (); +#else + if (acc_on_device (acc_device_nvidia)) + abort (); +#endif + } + +#endif + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c new file mode 100644 index 0000000..22cef6d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c @@ -0,0 +1,466 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda" } */ + +#include <openacc.h> +#include <stdlib.h> +#include "cuda.h" + +#include <stdio.h> +#include <sys/time.h> + +int +main (int argc, char **argv) +{ + CUresult r; + CUstream stream1; + int N = 128; //1024 * 1024; + float *a, *b, *c, *d, *e; + int i; + int nbytes; + + acc_init (acc_device_nvidia); + + nbytes = N * sizeof (float); + + a = (float *) malloc (nbytes); + b = (float *) malloc (nbytes); + c = (float *) malloc (nbytes); + d = (float *) malloc (nbytes); + e = (float *) malloc (nbytes); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + } + +#pragma acc data copy (a[0:N]) copy (b[0:N]) copyin (N) + { + +#pragma acc parallel async + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc wait + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 2.0; + b[i] = 0.0; + } + +#pragma acc data copy (a[0:N]) copy (b[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc 
wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 2.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + } + +#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii]; + } + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 9.0) + abort (); + + if (c[i] != 4.0) + abort (); + + if (d[i] != 1.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 2.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + e[i] = 0.0; + } + +#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii]; + } + +#pragma acc parallel wait (1) async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + e[ii] = a[ii] + b[ii] + c[ii] + d[ii]; + } + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 2.0) + abort (); + + if (b[i] != 4.0) + abort (); + + if (c[i] != 4.0) + abort (); + + if (d[i] != 1.0) + abort (); + + if (e[i] != 11.0) + abort (); + } + + + r = cuStreamCreate (&stream1, CU_STREAM_NON_BLOCKING); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + 
+ acc_set_cuda_stream (1, stream1); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 0.0; + } + +#pragma acc data copy (a[0:N], b[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 7.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + } + +#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii]; + } + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 7.0) + abort (); + + if (b[i] != 49.0) + abort (); + + if (c[i] != 4.0) + abort (); + + if (d[i] != 1.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + e[i] = 0.0; + } + +#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii]; + } + +#pragma acc parallel wait (1) async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + e[ii] = a[ii] + b[ii] + c[ii] + d[ii]; + } + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 9.0) + 
abort (); + + if (c[i] != 4.0) + abort (); + + if (d[i] != 1.0) + abort (); + + if (e[i] != 17.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + e[i] = 0.0; + } + +#pragma acc data copyin (a[0:N], b[0:N], c[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N], c[0:N]) wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 16.0) + abort (); + + if (c[i] != 4.0) + abort (); + } + + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 0.0; + c[i] = 0.0; + d[i] = 0.0; + e[i] = 0.0; + } + +#pragma acc data copyin (a[0:N], b[0:N], c[0:N]) copyin (N) + { + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii]; + } + +#pragma acc parallel async (1) + { + int ii; + + for (ii = 0; ii < N; ii++) + c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N], c[0:N]) async (1) + +#pragma acc wait (1) + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 25.0) + abort (); + + if (c[i] != 4.0) + abort (); + } + + acc_shutdown (acc_device_nvidia); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/cache-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/cache-1.c new file mode 100644 index 0000000..3f1f0bb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/cache-1.c @@ -0,0 +1,48 @@ +int +main (int argc, char **argv) +{ +#define N 2 + int a[N], b[N]; + int i; + + for (i = 0; i < N; i++) + { + a[i] = 3; + b[i] = 0; + } + +#pragma acc parallel copyin (a[0:N]) copyout (b[0:N]) +{ + int ii; + + for (ii = 0; ii < N; ii++) + { + const int idx = ii; + int n = 1; 
+ const int len = n; + +#pragma acc cache (a[0:N]) + +#pragma acc cache (a[0:N], b[0:N]) + +#pragma acc cache (a[0]) + +#pragma acc cache (a[0], a[1], b[0:N]) + +#pragma acc cache (a[idx]) + +#pragma acc cache (a[idx:len]) + + b[ii] = a[ii]; + } +} + + + for (i = 0; i < N; i++) + { + if (a[i] != b[i]) + __builtin_abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-1.c new file mode 100644 index 0000000..51c0cf5 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-1.c @@ -0,0 +1,623 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <openacc.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> + +int +main (int argc, char **argv) +{ + int N = 8; + float *a, *b, *c, *d; + int i; + + a = (float *) malloc (N * sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + c = (float *) malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + } + +#pragma acc parallel copyin (a[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc parallel copyin (a[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 5.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel 
present_or_copyin (a[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc parallel copyin (a[0:N]) present_or_copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 2.0; + } + + d = (float *) acc_copyin (&b[0], N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) present_or_copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 4.0; + } + +#pragma acc parallel copy (a[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = a[ii] + 2; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 7.0; + } + +#pragma acc parallel present_or_copy (a[0:N]) present_or_copy (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = b[ii] + 2; + } + } + + for (i = 
0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 9.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 7.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + d = (float *) acc_copyin (&b[0], N * sizeof (float)); + +#pragma acc parallel present_or_copy (a[0:N]) present_or_copy (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = b[ii] + 2; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 7.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + d = (float *) acc_deviceptr (&a[0]); + acc_unmap_data (&a[0]); + acc_free (d); + + d = (float *) acc_deviceptr (&b[0]); + acc_unmap_data (&b[0]); + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 7.0; + } + +#pragma acc parallel copyin (a[0:N]) create (c[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&c[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 8.0; + } + +#pragma acc parallel copyin (a[0:N]) present_or_create (c[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&c[0], 
(N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 2.0; + b[i] = 5.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) present_or_create (c[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 2.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + d = (float *) acc_deviceptr (c); + + acc_unmap_data (c); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 8.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) present (c[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + acc_unmap_data (c); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 8.0; + } + + acc_copyin (a, N * sizeof (float)); + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (b, d, N * sizeof (float)); + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc parallel present (a[0:N]) present (c[0:N]) present (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + if (!acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, 
(N * sizeof (float)))) + abort (); + + acc_copyout (b, N * sizeof (float)); + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + d = (float *) acc_deviceptr (a); + + acc_unmap_data (a); + + acc_free (d); + + d = (float *) acc_deviceptr (c); + + acc_unmap_data (c); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 6.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) deviceptr (d) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + d[ii] = a[ii]; + b[ii] = d[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel pcopyin (a[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc parallel copyin (a[0:N]) pcopyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 7.0; + } + +#pragma acc parallel copyin (a[0:N]) pcreate (c[0:N]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; 
i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&c[0], (N * sizeof (float)))) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c new file mode 100644 index 0000000..8dc45cb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/clauses-2.c @@ -0,0 +1,67 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <openacc.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> + +int +main (int argc, char **argv) +{ + int N = 8; + float *a, *b, *c, *d; + int i; + + a = (float *) malloc (N * sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + c = (float *) malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 2.0; + b[i] = 5.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) present_or_create (c[0:N+1]) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 2.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + d = (float *) acc_deviceptr (c); + + acc_unmap_data (c); + + acc_free (d); + + return 0; +} +/* { dg-shouldfail "libgomp: \[\h+,\d+\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-1.c new file mode 100644 index 0000000..80fed6c --- /dev/null +++ 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-1.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ + +#include <string.h> +#include <stdlib.h> + +int +main (void) +{ + int i, j, k, l = 0; + int a[3][3][3]; + + memset (a, '\0', sizeof (a)); + #pragma acc parallel + #pragma acc loop collapse(4 - 1) + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + for (k = 0; k < 2; k++) + a[i][j][k] = i + j * 4 + k * 16; + #pragma acc parallel + { + #pragma acc loop collapse(2) reduction(|:l) + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + for (k = 0; k < 2; k++) + if (a[i][j][k] != i + j * 4 + k * 16) + l = 1; + } + if (l) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c new file mode 100644 index 0000000..44a77f7 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c @@ -0,0 +1,37 @@ +/* { dg-do run } */ + +#include <stdlib.h> + +int +main (void) +{ + int i, j, k, l = 0, f = 0, x = 0; + int m1 = 4, m2 = -5, m3 = 17; + + #pragma acc parallel + #pragma acc loop collapse(3) reduction(+:l) + for (i = -2; i < m1; i++) + for (j = m2; j < -2; j++) + { + for (k = 13; k < m3; k++) + { + if ((i + 2) * 12 + (j + 5) * 4 + (k - 13) != 9 + f++) + l++; + } + } + + for (i = -2; i < m1; i++) + for (j = m2; j < -2; j++) + { + for (k = 13; k < m3; k++) + { + if ((i + 2) * 12 + (j + 5) * 4 + (k - 13) != 9 + f++) + x++; + } + } + + if (l != x) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-3.c new file mode 100644 index 0000000..a5be728 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-3.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <string.h> +#include <stdlib.h> +#include <stdio.h> + +int +main (void) +{ + int i2, l = 0, r = 0; + int a[3][3][3]; + + memset (a, '\0', sizeof (a)); + #pragma acc parallel + 
#pragma acc loop collapse(4 - 1) + for (int i = 0; i < 2; i++) + for (int j = 0; j < 2; j++) + for (int k = 0; k < 2; k++) + a[i][j][k] = i + j * 4 + k * 16; +#pragma acc parallel + { + #pragma acc loop collapse(2) reduction(|:l) + for (i2 = 0; i2 < 2; i2++) + for (int j = 0; j < 2; j++) + for (int k = 0; k < 2; k++) + if (a[i2][j][k] != i2 + j * 4 + k * 16) + l += 1; + } + + for (i2 = 0; i2 < 2; i2++) + for (int j = 0; j < 2; j++) + for (int k = 0; k < 2; k++) + if (a[i2][j][k] != i2 + j * 4 + k * 16) + r += 1; + + if (l != r) + abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c new file mode 100644 index 0000000..52dd435 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c @@ -0,0 +1,27 @@ +/* { dg-do run } */ + +#include <string.h> + +int +main (void) +{ + int l = 0; + int b[3][3]; + int i, j; + + memset (b, '\0', sizeof (b)); + +#pragma acc parallel copy(b[0:3][0:3]) copy(l) + { +#pragma acc loop collapse(2) reduction(+:l) + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + if (b[i][j] != 16) + l += 1; + } + + if (l != 2 * 2) + __builtin_abort(); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-1.c new file mode 100644 index 0000000..dabc706 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-1.c @@ -0,0 +1,213 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda -lcublas -lcudart" } */ + +#include <stdio.h> +#include <stdlib.h> +#include <cuda.h> +#include <cuda_runtime_api.h> +#include <cublas_v2.h> +#include <openacc.h> + +void +saxpy (int n, float a, float *x, float *y) +{ + int i; + + for (i = 0; i < n; i++) + { + y[i] = a * x[i] + y[i]; + } +} + +void +context_check (CUcontext ctx1) +{ + CUcontext ctx2, ctx3; + CUresult r; + + r = cuCtxGetCurrent (&ctx2); + if (r != 
CUDA_SUCCESS) + { + fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r); + exit (EXIT_FAILURE); + } + + if (ctx1 != ctx2) + { + fprintf (stderr, "new context established\n"); + exit (EXIT_FAILURE); + } + + ctx3 = (CUcontext) acc_get_current_cuda_context (); + + if (ctx1 != ctx3) + { + fprintf (stderr, "acc_get_current_cuda_context returned wrong value\n"); + exit (EXIT_FAILURE); + } + + return; +} + +int +main (int argc, char **argv) +{ + cublasStatus_t s; + cudaError_t e; + cublasHandle_t h; + CUcontext pctx, ctx; + CUresult r; + int dev; + int i; + const int N = 256; + float *h_X, *h_Y1, *h_Y2; + float *d_X,*d_Y; + float alpha = 2.0f; + float error_norm; + float ref_norm; + + /* Test 1 - cuBLAS creates, OpenACC shares. */ + + s = cublasCreate (&h); + if (s != CUBLAS_STATUS_SUCCESS) + { + fprintf (stderr, "cublasCreate failed: %d\n", s); + exit (EXIT_FAILURE); + } + + r = cuCtxGetCurrent (&pctx); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r); + exit (EXIT_FAILURE); + } + + e = cudaGetDevice (&dev); + if (e != cudaSuccess) + { + fprintf (stderr, "cudaGetDevice failed: %d\n", e); + exit (EXIT_FAILURE); + } + + acc_set_device_num (dev, acc_device_nvidia); + + h_X = (float *) malloc (N * sizeof (float)); + if (!h_X) + { + fprintf (stderr, "malloc failed: for h_X\n"); + exit (EXIT_FAILURE); + } + + h_Y1 = (float *) malloc (N * sizeof (float)); + if (!h_Y1) + { + fprintf (stderr, "malloc failed: for h_Y1\n"); + exit (EXIT_FAILURE); + } + + h_Y2 = (float *) malloc (N * sizeof (float)); + if (!h_Y2) + { + fprintf (stderr, "malloc failed: for h_Y2\n"); + exit (EXIT_FAILURE); + } + + for (i = 0; i < N; i++) + { + h_X[i] = rand () / (float) RAND_MAX; + h_Y2[i] = h_Y1[i] = rand () / (float) RAND_MAX; + } + + d_X = (float *) acc_copyin (&h_X[0], N * sizeof (float)); + if (d_X == NULL) + { + fprintf (stderr, "copyin error h_X\n"); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + d_Y = (float *) acc_copyin (&h_Y1[0], N * sizeof 
(float)); + if (d_Y == NULL) + { + fprintf (stderr, "copyin error h_Y1\n"); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + s = cublasSaxpy (h, N, &alpha, d_X, 1, d_Y, 1); + if (s != CUBLAS_STATUS_SUCCESS) + { + fprintf (stderr, "cublasSaxpy failed: %d\n", s); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + acc_memcpy_from_device (&h_Y1[0], d_Y, N * sizeof (float)); + + context_check (pctx); + + saxpy (N, alpha, h_X, h_Y2); + + error_norm = 0; + ref_norm = 0; + + for (i = 0; i < N; ++i) + { + float diff; + + diff = h_Y1[i] - h_Y2[i]; + error_norm += diff * diff; + ref_norm += h_Y2[i] * h_Y2[i]; + } + + error_norm = (float) sqrt ((double) error_norm); + ref_norm = (float) sqrt ((double) ref_norm); + + if ((fabs (ref_norm) < 1e-7) || ((error_norm / ref_norm) >= 1e-6f)) + { + fprintf (stderr, "math error\n"); + exit (EXIT_FAILURE); + } + + free (h_X); + free (h_Y1); + free (h_Y2); + + acc_free (d_X); + acc_free (d_Y); + + context_check (pctx); + + s = cublasDestroy (h); + if (s != CUBLAS_STATUS_SUCCESS) + { + fprintf (stderr, "cublasDestroy failed: %d\n", s); + exit (EXIT_FAILURE); + } + + acc_shutdown (acc_device_nvidia); + + r = cuCtxGetCurrent (&ctx); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r); + exit (EXIT_FAILURE); + } + + if (!ctx) + { + fprintf (stderr, "Expected context\n"); + exit (EXIT_FAILURE); + } + + if (pctx != ctx) + { + fprintf (stderr, "Unexpected new context\n"); + exit (EXIT_FAILURE); + } + + return EXIT_SUCCESS; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c new file mode 100644 index 0000000..6a52f74 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c @@ -0,0 +1,223 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda -lcublas -lcudart" } */ + +#include <stdio.h> +#include <stdlib.h> +#include <cuda.h> +#include <cuda_runtime_api.h> 
+#include <cublas_v2.h> +#include <openacc.h> + +void +saxpy (int n, float a, float *x, float *y) +{ + int i; + + for (i = 0; i < n; i++) + { + y[i] = a * x[i] + y[i]; + } +} + +void +context_check (CUcontext ctx1) +{ + CUcontext ctx2, ctx3; + CUresult r; + + r = cuCtxGetCurrent (&ctx2); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r); + exit (EXIT_FAILURE); + } + + if (ctx1 != ctx2) + { + fprintf (stderr, "new context established\n"); + exit (EXIT_FAILURE); + } + + ctx3 = (CUcontext) acc_get_current_cuda_context (); + + if (ctx1 != ctx3) + { + fprintf (stderr, "acc_get_current_cuda_context returned wrong value\n"); + exit (EXIT_FAILURE); + } + + return; +} + +int +main (int argc, char **argv) +{ + cublasStatus_t s; + cudaError_t e; + cublasHandle_t h; + CUcontext pctx, ctx; + CUresult r; + int dev; + int i; + const int N = 256; + float *h_X, *h_Y1, *h_Y2; + float *d_X,*d_Y; + float alpha = 2.0f; + float error_norm; + float ref_norm; + + /* Test 2 - cuBLAS creates, OpenACC shares. 
*/ + + s = cublasCreate (&h); + if (s != CUBLAS_STATUS_SUCCESS) + { + fprintf (stderr, "cublasCreate failed: %d\n", s); + exit (EXIT_FAILURE); + } + + r = cuCtxGetCurrent (&pctx); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r); + exit (EXIT_FAILURE); + } + + e = cudaGetDevice (&dev); + if (e != cudaSuccess) + { + fprintf (stderr, "cudaGetDevice failed: %d\n", e); + exit (EXIT_FAILURE); + } + + acc_set_device_num (dev, acc_device_nvidia); + + h_X = (float *) malloc (N * sizeof (float)); + if (h_X == 0) + { + fprintf (stderr, "malloc failed: for h_X\n"); + exit (EXIT_FAILURE); + } + + h_Y1 = (float *) malloc (N * sizeof (float)); + if (h_Y1 == 0) + { + fprintf (stderr, "malloc failed: for h_Y1\n"); + exit (EXIT_FAILURE); + } + + h_Y2 = (float *) malloc (N * sizeof (float)); + if (h_Y2 == 0) + { + fprintf (stderr, "malloc failed: for h_Y2\n"); + exit (EXIT_FAILURE); + } + + for (i = 0; i < N; i++) + { + h_X[i] = rand () / (float) RAND_MAX; + h_Y2[i] = h_Y1[i] = rand () / (float) RAND_MAX; + } + + d_X = (float *) acc_copyin (&h_X[0], N * sizeof (float)); + if (d_X == NULL) + { + fprintf (stderr, "copyin error h_X\n"); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + d_Y = (float *) acc_copyin (&h_Y1[0], N * sizeof (float)); + if (d_Y == NULL) + { + fprintf (stderr, "copyin error h_Y1\n"); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + s = cublasSaxpy (h, N, &alpha, d_X, 1, d_Y, 1); + if (s != CUBLAS_STATUS_SUCCESS) + { + fprintf (stderr, "cublasSaxpy failed: %d\n", s); + exit (EXIT_FAILURE); + } + + context_check (pctx); + + acc_memcpy_from_device (&h_Y1[0], d_Y, N * sizeof (float)); + + context_check (pctx); + +#pragma acc parallel present (h_X[0:N]), copy (h_Y2[0:N]) copyin (alpha) + { + int i; + + for (i = 0; i < N; i++) + { + h_Y2[i] = alpha * h_X[i] + h_Y2[i]; + } + } + + context_check (pctx); + + error_norm = 0; + ref_norm = 0; + + for (i = 0; i < N; ++i) + { + float diff; + + diff = h_Y1[i] - h_Y2[i]; + 
error_norm += diff * diff;
+      ref_norm += h_Y2[i] * h_Y2[i];
+    }
+
+  error_norm = (float) sqrt ((double) error_norm);
+  ref_norm = (float) sqrt ((double) ref_norm);
+
+  if ((fabs (ref_norm) < 1e-7) || ((error_norm / ref_norm) >= 1e-6f))
+    {
+      fprintf (stderr, "math error\n");
+      exit (EXIT_FAILURE);
+    }
+
+  free (h_X);
+  free (h_Y1);
+  free (h_Y2);
+
+  acc_free (d_X);
+  acc_free (d_Y);
+
+  context_check (pctx);
+
+  s = cublasDestroy (h);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasDestroy failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  acc_shutdown (acc_device_nvidia);
+
+  r = cuCtxGetCurrent (&ctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  if (!ctx)
+    {
+      fprintf (stderr, "Expected context\n");
+      exit (EXIT_FAILURE);
+    }
+
+  if (pctx != ctx)
+    {
+      fprintf (stderr, "Unexpected new context\n");
+      exit (EXIT_FAILURE);
+    }
+
+  return EXIT_SUCCESS;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-3.c
new file mode 100644
index 0000000..ccd276c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-3.c
@@ -0,0 +1,200 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda -lcublas -lcudart" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <cuda.h>
+#include <cuda_runtime_api.h>
+#include <cublas_v2.h>
+#include <openacc.h>
+
+void
+saxpy (int n, float a, float *x, float *y)
+{
+  int i;
+
+  for (i = 0; i < n; i++)
+    {
+      y[i] = a * x[i] + y[i];
+    }
+}
+
+void
+context_check (CUcontext ctx1)
+{
+  CUcontext ctx2, ctx3;
+  CUresult r;
+
+  r = cuCtxGetCurrent (&ctx2);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  if (ctx1 != ctx2)
+    {
+      fprintf (stderr, "new context established\n");
+      exit (EXIT_FAILURE);
+    }
+
+  ctx3 = (CUcontext) acc_get_current_cuda_context ();
+
+  if
(ctx1 != ctx3)
+    {
+      fprintf (stderr, "acc_get_current_cuda_context returned wrong value\n");
+      exit (EXIT_FAILURE);
+    }
+
+  return;
+}
+
+int
+main (int argc, char **argv)
+{
+  cublasStatus_t s;
+  cublasHandle_t h;
+  CUcontext pctx;
+  CUresult r;
+  int i;
+  const int N = 256;
+  float *h_X, *h_Y1, *h_Y2;
+  float *d_X,*d_Y;
+  float alpha = 2.0f;
+  float error_norm;
+  float ref_norm;
+
+  /* Test 3 - OpenACC creates, cuBLAS shares. */
+
+  acc_set_device_num (0, acc_device_nvidia);
+
+  r = cuCtxGetCurrent (&pctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  h_X = (float *) malloc (N * sizeof (float));
+  if (h_X == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_X\n");
+      exit (EXIT_FAILURE);
+    }
+
+  h_Y1 = (float *) malloc (N * sizeof (float));
+  if (h_Y1 == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_Y1\n");
+      exit (EXIT_FAILURE);
+    }
+
+  h_Y2 = (float *) malloc (N * sizeof (float));
+  if (h_Y2 == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_Y2\n");
+      exit (EXIT_FAILURE);
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      h_X[i] = rand () / (float) RAND_MAX;
+      h_Y2[i] = h_Y1[i] = rand () / (float) RAND_MAX;
+    }
+
+  d_X = (float *) acc_copyin (&h_X[0], N * sizeof (float));
+  if (d_X == NULL)
+    {
+      fprintf (stderr, "copyin error h_X\n");
+      exit (EXIT_FAILURE);
+    }
+
+  d_Y = (float *) acc_copyin (&h_Y1[0], N * sizeof (float));
+  if (d_Y == NULL)
+    {
+      fprintf (stderr, "copyin error h_Y1\n");
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  s = cublasCreate (&h);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasCreate failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  s = cublasSaxpy (h, N, &alpha, d_X, 1, d_Y, 1);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasSaxpy failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  acc_memcpy_from_device (&h_Y1[0], d_Y, N * sizeof (float));
+
+  context_check (pctx);
+
+  saxpy (N,
alpha, h_X, h_Y2);
+
+  error_norm = 0;
+  ref_norm = 0;
+
+  for (i = 0; i < N; ++i)
+    {
+      float diff;
+
+      diff = h_Y1[i] - h_Y2[i];
+      error_norm += diff * diff;
+      ref_norm += h_Y2[i] * h_Y2[i];
+    }
+
+  error_norm = (float) sqrt ((double) error_norm);
+  ref_norm = (float) sqrt ((double) ref_norm);
+
+  if ((fabs (ref_norm) < 1e-7) || ((error_norm / ref_norm) >= 1e-6f))
+    {
+      fprintf (stderr, "math error\n");
+      exit (EXIT_FAILURE);
+    }
+
+  free (h_X);
+  free (h_Y1);
+  free (h_Y2);
+
+  acc_free (d_X);
+  acc_free (d_Y);
+
+  context_check (pctx);
+
+  s = cublasDestroy (h);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasDestroy failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  acc_shutdown (acc_device_nvidia);
+
+  r = cuCtxGetCurrent (&pctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  if (pctx)
+    {
+      fprintf (stderr, "Unexpected context\n");
+      exit (EXIT_FAILURE);
+    }
+
+  return EXIT_SUCCESS;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
new file mode 100644
index 0000000..71365e8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
@@ -0,0 +1,213 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda -lcublas -lcudart" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <cuda.h>
+#include <cuda_runtime_api.h>
+#include <cublas_v2.h>
+#include <openacc.h>
+
+void
+saxpy (int n, float a, float *x, float *y)
+{
+  int i;
+
+  for (i = 0; i < n; i++)
+    {
+      y[i] = a * x[i] + y[i];
+    }
+}
+
+void
+context_check (CUcontext ctx1)
+{
+  CUcontext ctx2, ctx3;
+  CUresult r;
+
+  r = cuCtxGetCurrent (&ctx2);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  if (ctx1 != ctx2)
+    {
+      fprintf (stderr, "new context established\n");
+      exit (EXIT_FAILURE);
+    }
+
+  ctx3 = (CUcontext) acc_get_current_cuda_context ();
+
+  if (ctx1 != ctx3)
+    {
+      fprintf (stderr, "acc_get_current_cuda_context returned wrong value\n");
+      exit (EXIT_FAILURE);
+    }
+
+  return;
+}
+
+int
+main (int argc, char **argv)
+{
+  cublasStatus_t s;
+  cublasHandle_t h;
+  CUcontext pctx;
+  CUresult r;
+  int i;
+  const int N = 256;
+  float *h_X, *h_Y1, *h_Y2;
+  float *d_X,*d_Y;
+  float alpha = 2.0f;
+  float error_norm;
+  float ref_norm;
+
+  /* Test 4 - OpenACC creates, cuBLAS shares. */
+
+  acc_set_device_num (0, acc_device_nvidia);
+
+  r = cuCtxGetCurrent (&pctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  h_X = (float *) malloc (N * sizeof (float));
+  if (h_X == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_X\n");
+      exit (EXIT_FAILURE);
+    }
+
+  h_Y1 = (float *) malloc (N * sizeof (float));
+  if (h_Y1 == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_Y1\n");
+      exit (EXIT_FAILURE);
+    }
+
+  h_Y2 = (float *) malloc (N * sizeof (float));
+  if (h_Y2 == 0)
+    {
+      fprintf (stderr, "malloc failed: for h_Y2\n");
+      exit (EXIT_FAILURE);
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      h_X[i] = rand () / (float) RAND_MAX;
+      h_Y2[i] = h_Y1[i] = rand () / (float) RAND_MAX;
+    }
+
+#pragma acc parallel copyin (h_X[0:N]), copy (h_Y2[0:N]) copy (alpha)
+  {
+    int i;
+
+    for (i = 0; i < N; i++)
+      {
+        h_Y2[i] = alpha * h_X[i] + h_Y2[i];
+      }
+  }
+
+  r = cuCtxGetCurrent (&pctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  d_X = (float *) acc_copyin (&h_X[0], N * sizeof (float));
+  if (d_X == NULL)
+    {
+      fprintf (stderr, "copyin error h_Y1\n");
+      exit (EXIT_FAILURE);
+    }
+
+  d_Y = (float *) acc_copyin (&h_Y1[0], N * sizeof (float));
+  if (d_Y == NULL)
+    {
+      fprintf (stderr, "copyin error h_Y1\n");
+      exit (EXIT_FAILURE);
+    }
+
+  s = cublasCreate (&h);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasCreate failed: %d\n", s);
+      exit
(EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  s = cublasSaxpy (h, N, &alpha, d_X, 1, d_Y, 1);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasSaxpy failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  acc_memcpy_from_device (&h_Y1[0], d_Y, N * sizeof (float));
+
+  context_check (pctx);
+
+  error_norm = 0;
+  ref_norm = 0;
+
+  for (i = 0; i < N; ++i)
+    {
+      float diff;
+
+      diff = h_Y1[i] - h_Y2[i];
+      error_norm += diff * diff;
+      ref_norm += h_Y2[i] * h_Y2[i];
+    }
+
+  error_norm = (float) sqrt ((double) error_norm);
+  ref_norm = (float) sqrt ((double) ref_norm);
+
+  if ((fabs (ref_norm) < 1e-7) || ((error_norm / ref_norm) >= 1e-6f))
+    {
+      fprintf (stderr, "math error\n");
+      exit (EXIT_FAILURE);
+    }
+
+  free (h_X);
+  free (h_Y1);
+  free (h_Y2);
+
+  acc_free (d_X);
+  acc_free (d_Y);
+
+  context_check (pctx);
+
+  s = cublasDestroy (h);
+  if (s != CUBLAS_STATUS_SUCCESS)
+    {
+      fprintf (stderr, "cublasDestroy failed: %d\n", s);
+      exit (EXIT_FAILURE);
+    }
+
+  context_check (pctx);
+
+  acc_shutdown (acc_device_nvidia);
+
+  r = cuCtxGetCurrent (&pctx);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuCtxGetCurrent failed: %d\n", r);
+      exit (EXIT_FAILURE);
+    }
+
+  if (pctx)
+    {
+      fprintf (stderr, "Unexpected context\n");
+      exit (EXIT_FAILURE);
+    }
+
+  return EXIT_SUCCESS;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-1.c
new file mode 100644
index 0000000..e7564cc
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-1.c
@@ -0,0 +1,188 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int i;
+
+int
+is_mapped (void *p, size_t n)
+{
+#if ACC_MEM_SHARED
+  return 1;
+#else
+  return acc_is_present (p, n);
+#endif
+}
+
+int main(void)
+{
+  int j;
+
+  i = -1;
+  j = -2;
+#pragma acc data copyin (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j
= 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+  if (i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data copyout (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+
+#pragma acc parallel present (i, j)
+    {
+      i = 4;
+      j = 2;
+    }
+  }
+  if (i != 4 || j != 2)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data create (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+  if (i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data present_or_copyin (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+  if (i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data present_or_copyout (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+
+#pragma acc parallel present (i, j)
+    {
+      i = 4;
+      j = 2;
+    }
+  }
+  if (i != 4 || j != 2)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data present_or_copy (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+#if ACC_MEM_SHARED
+  if (i != 2 || j != 1)
+    abort ();
+#else
+  if (i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+#pragma acc data present_or_create (i, j)
+  {
+    if (!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+
+  if (i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data copyin (i, j)
+  {
+#pragma acc data present (i, j)
+    {
+      if
(!is_mapped (&i, sizeof (i)) || !is_mapped (&j, sizeof (j)))
+        abort ();
+      if (i != -1 || j != -2)
+        abort ();
+      i = 2;
+      j = 1;
+      if (i != 2 || j != 1)
+        abort ();
+    }
+  }
+  if (i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+#pragma acc data
+  {
+#if !ACC_MEM_SHARED
+    if (is_mapped (&i, sizeof (i)) || is_mapped (&j, sizeof (j)))
+      abort ();
+#endif
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+  }
+  if (i != 2 || j != 1)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c
new file mode 100644
index 0000000..f867a66
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c
@@ -0,0 +1,162 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  int N = 128; //1024 * 1024;
+  float *a, *b, *c, *d, *e;
+  int i;
+  int nbytes;
+
+  nbytes = N * sizeof (float);
+
+  a = (float *) malloc (nbytes);
+  b = (float *) malloc (nbytes);
+  c = (float *) malloc (nbytes);
+  d = (float *) malloc (nbytes);
+  e = (float *) malloc (nbytes);
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) copyin (b[0:N]) copyin (N) async
+#pragma acc parallel async wait
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = a[i];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait async
+#pragma acc wait
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 3.0)
+        abort ();
+
+      if (b[i] != 3.0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 2.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) copyin (b[0:N]) copyin (N) async (1)
+#pragma acc parallel async (1)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = a[i];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait (1) async (1)
+#pragma acc wait (1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 2.0)
+        abort ();
+
+      if (b[i] != 2.0)
+        abort ();
+    }
+
+  for (i = 0; i
< N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 0.0;
+      c[i] = 0.0;
+      d[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) copyin (b[0:N]) copyin (c[0:N]) copyin (d[0:N]) copyin (N) async (1)
+
+#pragma acc parallel async (1) wait (1)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (2) wait (1)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    c[i] = (a[i] + a[i] + a[i] + a[i]) / a[i];
+
+#pragma acc parallel async (3) wait (1)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) copyout (c[0:N]) copyout (d[0:N]) wait (1, 2, 3) async (1)
+#pragma acc wait (1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 3.0)
+        abort ();
+
+      if (b[i] != 9.0)
+        abort ();
+
+      if (c[i] != 4.0)
+        abort ();
+
+      if (d[i] != 1.0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 2.0;
+      b[i] = 0.0;
+      c[i] = 0.0;
+      d[i] = 0.0;
+      e[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) copyin (b[0:N]) copyin (c[0:N]) copyin (d[0:N]) copyin (e[0:N]) copyin (N) async (1)
+
+#pragma acc parallel async (1) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii];
+
+#pragma acc parallel async (2) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii];
+
+#pragma acc parallel async (3) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii];
+
+#pragma acc parallel wait (1) async (4)
+  for (int ii = 0; ii < N; ii++)
+    e[ii] = a[ii] + b[ii] + c[ii] + d[ii];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) copyout (c[0:N]) copyout (d[0:N]) copyout (e[0:N]) wait (1, 2, 3, 4) async (1)
+#pragma acc wait (1)
+
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 2.0)
+        abort ();
+
+      if (b[i] != 4.0)
+        abort ();
+
+      if (c[i] != 4.0)
+        abort ();
+
+      if (d[i] != 1.0)
+        abort ();
+
+      if (e[i] != 11.0)
+        abort ();
+    }
+
+  return 0;
+}
diff --git
a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-3.c
new file mode 100644
index 0000000..747109f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-3.c
@@ -0,0 +1,166 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  int N = 128; //1024 * 1024;
+  float *a, *b, *c, *d, *e;
+  int i;
+  int nbytes;
+
+  nbytes = N * sizeof (float);
+
+  a = (float *) malloc (nbytes);
+  b = (float *) malloc (nbytes);
+  c = (float *) malloc (nbytes);
+  d = (float *) malloc (nbytes);
+  e = (float *) malloc (nbytes);
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) copyin (b[0:N]) copyin (N) async
+#pragma acc parallel async wait
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = a[i];
+
+#pragma acc update host (a[0:N], b[0:N]) async wait
+#pragma acc wait
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 3.0)
+        abort ();
+
+      if (b[i] != 3.0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 2.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc update device (a[0:N], b[0:N]) async (1)
+#pragma acc parallel async (1)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = a[i];
+
+#pragma acc update host (a[0:N], b[0:N]) async (1) wait (1)
+#pragma acc wait (1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 2.0)
+        abort ();
+
+      if (b[i] != 2.0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 0.0;
+      c[i] = 0.0;
+      d[i] = 0.0;
+    }
+
+#pragma acc update device (a[0:N]) async (1)
+#pragma acc update device (b[0:N]) async (2)
+#pragma acc enter data copyin (c[0:N], d[0:N]) async (3)
+
+#pragma acc parallel async (1) wait (1,2)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (2) wait (1,3)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    c[i] = (a[i] + a[i] + a[i] + a[i]) / a[i];
+
+#pragma acc parallel async (3) wait (1,3)
+#pragma acc loop
+  for (i = 0; i < N;
i++)
+    d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc update host (a[0:N], b[0:N], c[0:N], d[0:N]) async (1) wait (1,2,3)
+#pragma acc wait (1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 3.0)
+        abort ();
+
+      if (b[i] != 9.0)
+        abort ();
+
+      if (c[i] != 4.0)
+        abort ();
+
+      if (d[i] != 1.0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 2.0;
+      b[i] = 0.0;
+      c[i] = 0.0;
+      d[i] = 0.0;
+      e[i] = 0.0;
+    }
+
+#pragma acc update device (a[0:N], b[0:N], c[0:N], d[0:N]) async (1)
+#pragma acc enter data copyin (e[0:N]) async (5)
+
+#pragma acc parallel async (1) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    b[ii] = (a[ii] * a[ii] * a[ii]) / a[ii];
+
+#pragma acc parallel async (2) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    c[ii] = (a[ii] + a[ii] + a[ii] + a[ii]) / a[ii];
+
+#pragma acc parallel async (3) wait (1)
+  for (int ii = 0; ii < N; ii++)
+    d[ii] = ((a[ii] * a[ii] + a[ii]) / a[ii]) - a[ii];
+
+#pragma acc parallel wait (1,5) async (4)
+  for (int ii = 0; ii < N; ii++)
+    e[ii] = a[ii] + b[ii] + c[ii] + d[ii];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) copyout (c[0:N]) copyout (d[0:N]) copyout (e[0:N]) wait (1, 2, 3, 4) async (1)
+#pragma acc exit data delete (N)
+#pragma acc wait (1)
+
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 2.0)
+        abort ();
+
+      if (b[i] != 4.0)
+        abort ();
+
+      if (c[i] != 4.0)
+        abort ();
+
+      if (d[i] != 1.0)
+        abort ();
+
+      if (e[i] != 11.0)
+        abort ();
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c
new file mode 100644
index 0000000..83c0a42
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-1.c
@@ -0,0 +1,19 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+  acc_copyin (&i, sizeof i);
+
+#pragma acc data copy (i)
+  ++i;
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   {
dg-output "Trying to map into device .* object when .* is already mapped" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c
new file mode 100644
index 0000000..137d8ce
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-2.c
@@ -0,0 +1,16 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+#pragma acc data present_or_copy (i)
+#pragma acc data copyout (i)
+  ++i;
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "Trying to map into device .* object when .* is already mapped" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-3.c
new file mode 100644
index 0000000..b993b78
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-3.c
@@ -0,0 +1,17 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+#pragma acc data present_or_copy (i)
+  acc_copyin (&i, sizeof i);
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "already mapped to" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-4.c
new file mode 100644
index 0000000..82523f4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-4.c
@@ -0,0 +1,17 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+  acc_present_or_copyin (&i, sizeof i);
+  acc_copyin (&i, sizeof i);
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "already mapped to" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-5.c
new file mode 100644
index
0000000..4961fe5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-5.c
@@ -0,0 +1,17 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+#pragma acc enter data create (i)
+  acc_copyin (&i, sizeof i);
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "already mapped to" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-6.c
new file mode 100644
index 0000000..77b56a9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-6.c
@@ -0,0 +1,17 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+  acc_present_or_copyin (&i, sizeof i);
+#pragma acc enter data create (i)
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "already mapped to" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-7.c
new file mode 100644
index 0000000..b08417b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-7.c
@@ -0,0 +1,17 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+#pragma acc enter data create (i)
+  acc_create (&i, sizeof i);
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "already mapped to" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-8.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-8.c
new file mode 100644
index 0000000..a50f7de
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-already-8.c
@@ -0,0 +1,16 @@
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+
+#pragma acc data create (i)
+#pragma acc
parallel copyin (i)
+  ++i;
+
+  return 0;
+}
+
+/* { dg-shouldfail "" }
+   { dg-output "Trying to map into device .* object when .* is already mapped" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c
new file mode 100644
index 0000000..e271a37
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int main (void)
+{
+  void *a, *a_1, *a_2;
+
+#define A (void *) 0x123
+  a = A;
+
+#pragma acc data copyout (a_1, a_2)
+#pragma acc kernels deviceptr (a)
+  {
+    a_1 = a;
+    a_2 = &a;
+  }
+
+  if (a != A)
+    abort ();
+  if (a_1 != a)
+    abort ();
+#if ACC_MEM_SHARED
+  if (a_2 != &a)
+    abort ();
+#else
+  if (a_2 == &a)
+    abort ();
+#endif
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/if-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/if-1.c
new file mode 100644
index 0000000..184b355
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/if-1.c
@@ -0,0 +1,613 @@
+/* { dg-do run } */
+/* { dg-additional-options "-fno-builtin-acc_on_device" } */
+
+#include <openacc.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+#define N 32
+
+int
+main(int argc, char **argv)
+{
+  float *a, *b, *d_a, *d_b, exp, exp2;
+  int i;
+  const int one = 1;
+  const int zero = 0;
+  int n;
+
+  a = (float *) malloc (N * sizeof (float));
+  b = (float *) malloc (N * sizeof (float));
+  d_a = (float *) acc_malloc (N * sizeof (float));
+  d_b = (float *) acc_malloc (N * sizeof (float));
+
+  for (i = 0; i < N; i++)
+    a[i] = 4.0;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(1)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 5.0;
+#else
+  exp = 4.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 16.0;
+
+#pragma
acc parallel if(0)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 17.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 8.0;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(one)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 9.0;
+#else
+  exp = 8.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 22.0;
+
+#pragma acc parallel if(zero)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 23.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 16.0;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(true)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 17.0;
+#else
+  exp = 16.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 76.0;
+
+#pragma acc parallel if(false)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 77.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 22.0;
+
+  n = 1;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(n)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 23.0;
+#else
+  exp = 22.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 18.0;
+
+  n = 0;
+
+#pragma acc parallel if(n)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 19.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 49.0;
+
+  n = 1;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(n + n)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 50.0;
+#else
+  exp = 49.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 38.0;
+
+  n = 0;
+
+#pragma acc parallel if(n + n)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 39.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 91.0;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(-2)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 92.0;
+#else
+  exp = 91.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 43.0;
+
+#pragma acc parallel copyin(a[0:N]) copyout(b[0:N]) if(one == 1)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+#if ACC_MEM_SHARED
+  exp = 44.0;
+#else
+  exp = 43.0;
+#endif
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != exp)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    a[i] = 87.0;
+
+#pragma acc parallel if(one == 0)
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        if (acc_on_device (acc_device_host))
+          b[ii] = a[ii] + 1;
+        else
+          b[ii] = a[ii];
+      }
+  }
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 88.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 9.0;
+    }
+
+#if ACC_MEM_SHARED
+  exp = 0.0;
+  exp2 = 0.0;
+#else
+  acc_map_data (a, d_a, N * sizeof (float));
+  acc_map_data (b, d_b, N * sizeof (float));
+  exp = 3.0;
+  exp2 = 9.0;
+#endif
+
+#pragma acc update device(a[0:N], b[0:N]) if(1)
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 0.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc update host(a[0:N], b[0:N]) if(1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != exp)
+        abort();
+
+      if (b[i] != exp2)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 6.0;
+      b[i] = 12.0;
+    }
+
+#pragma acc update device(a[0:N], b[0:N]) if(0)
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 0.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc update host(a[0:N], b[0:N]) if(1)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != exp)
+        abort();
+
+      if (b[i] != exp2)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 26.0;
+      b[i] = 21.0;
+    }
+
+#pragma acc update device(a[0:N], b[0:N]) if(1)
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 0.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc update host(a[0:N], b[0:N]) if(0)
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 0.0)
+        abort();
+
+      if (b[i] != 0.0)
+        abort();
+    }
+
+#if !ACC_MEM_SHARED
+  acc_unmap_data (a);
+  acc_unmap_data (b);
+#endif
+
+  acc_free (d_a);
+  acc_free (d_b);
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 4.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc data copyin(a[0:N]) copyout(b[0:N]) if(1)
+{
+#pragma acc parallel present(a[0:N])
+  {
+    int ii;
+
+    for (ii = 0; ii < N; ii++)
+      {
+        b[ii] = a[ii];
+      }
+  }
+}
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 4.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 8.0;
+      b[i] = 1.0;
+    }
+
+#pragma acc data copyin(a[0:N]) copyout(b[0:N]) if(0)
+{
+#if !ACC_MEM_SHARED
+  if (acc_is_present (a, N * sizeof (float)))
+    abort ();
+#endif
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+}
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 18.0;
+      b[i] = 21.0;
+    }
+
+#pragma acc data copyin(a[0:N]) if(1)
+{
+#if !ACC_MEM_SHARED
+  if
(!acc_is_present (a, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc data copyout(b[0:N]) if(0)
+  {
+#if !ACC_MEM_SHARED
+    if (acc_is_present (b, N * sizeof (float)))
+      abort ();
+#endif
+
+#pragma acc data copyout(b[0:N]) if(1)
+    {
+#pragma acc parallel present(a[0:N]) present(b[0:N])
+      {
+        int ii;
+
+        for (ii = 0; ii < N; ii++)
+          {
+            b[ii] = a[ii];
+          }
+      }
+    }
+
+#if !ACC_MEM_SHARED
+    if (acc_is_present (b, N * sizeof (float)))
+      abort ();
+#endif
+  }
+}
+
+  for (i = 0; i < N; i++)
+    {
+      if (b[i] != 18.0)
+        abort ();
+    }
+
+#pragma acc enter data copyin (b[0:N]) if (0)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (0)
+
+#pragma acc enter data copyin (b[0:N]) if (1)
+
+#if !ACC_MEM_SHARED
+  if (!acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (1)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc enter data copyin (b[0:N]) if (zero)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (zero)
+
+#pragma acc enter data copyin (b[0:N]) if (one)
+
+#if !ACC_MEM_SHARED
+  if (!acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (one)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc enter data copyin (b[0:N]) if (one == 0)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (one == 0)
+
+#pragma acc enter data copyin (b[0:N]) if (one == 1)
+
+#if !ACC_MEM_SHARED
+  if (!acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+#pragma acc exit data delete (b[0:N]) if (one == 1)
+
+#if !ACC_MEM_SHARED
+  if (acc_is_present (b, N * sizeof (float)))
+    abort ();
+#endif
+
+  return 0;
+}
diff --git
a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c new file mode 100644 index 0000000..3acfdf5 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c @@ -0,0 +1,184 @@ +/* { dg-do run } */ + +#include <stdlib.h> + +int i; + +int main (void) +{ + int j, v; + +#if 0 + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) copyin (i, j) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != -1 || j != -2) + abort (); + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) copyout (i, j) + { + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) copy (i, j) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) create (i, j) + { + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != -1 || j != -2) + abort (); +#endif + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) present_or_copyin (i, j) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1) + abort (); +#if ACC_MEM_SHARED + if (i != 2 || j != 1) + abort (); +#else + if (i != -1 || j != -2) + abort (); +#endif + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) present_or_copyout (i, j) + { + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) 
present_or_copy (i, j) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); + + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) present_or_create (i, j) + { + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1) + abort (); +#if ACC_MEM_SHARED + if (i != 2 || j != 1) + abort (); +#else + if (i != -1 || j != -2) + abort (); +#endif + +#if 0 + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) present (i, j) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); +#endif + +#if 0 + i = -1; + j = -2; + v = 0; +#pragma acc kernels /* copyout */ present_or_copyout (v) + { + if (i != -1 || j != -2) + abort (); + i = 2; + j = 1; + if (i != 2 || j != 1) + abort (); + v = 1; + } + if (v != 1 || i != 2 || j != 1) + abort (); +#endif + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-empty.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-empty.c new file mode 100644 index 0000000..a68a7cd --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-empty.c @@ -0,0 +1,6 @@ +int +main (void) +{ +#pragma acc kernels + ; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-1.c new file mode 100644 index 0000000..17129d8 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-1.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ + +#include <openacc.h> + +int +main (int argc, char **argv) +{ + acc_device_t devtype = acc_device_host; + +#if ACC_DEVICE_TYPE_nvidia + devtype = acc_device_nvidia; + + if (acc_get_num_devices (devtype) == 0) + return 0; +#endif + + acc_init (devtype); + + acc_init (devtype); + + return 0; +} + +/* { dg-shouldfail "libgomp: device already active" } 
*/ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-10.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-10.c new file mode 100644 index 0000000..cf1af8c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-10.c @@ -0,0 +1,58 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + void *d; + acc_device_t devtype = acc_device_host; + +#if ACC_DEVICE_TYPE_nvidia + devtype = acc_device_nvidia; + + if (acc_get_num_devices (acc_device_nvidia) == 0) + return 0; +#endif + + acc_init (devtype); + + d = acc_malloc (0); + if (d != NULL) + abort (); + + acc_free (0); + + acc_shutdown (devtype); + + acc_set_device_type (devtype); + + d = acc_malloc (0); + if (d != NULL) + abort (); + + acc_shutdown (devtype); + + acc_init (devtype); + + d = acc_malloc (1024); + if (d == NULL) + abort (); + + acc_free (d); + + acc_shutdown (devtype); + + acc_set_device_type (devtype); + + d = acc_malloc (1024); + if (d == NULL) + abort (); + + acc_free (d); + + acc_shutdown (devtype); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-11.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-11.c new file mode 100644 index 0000000..eccdb8c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-11.c @@ -0,0 +1,23 @@ +/* Only nvptx plugin does the required error checking. 
+ { dg-do run { target openacc_nvidia_accel_selected } } */ + +#include <stdlib.h> +#include <openacc.h> +#include <stdint.h> + +int +main (int argc, char **argv) +{ + const int N = 512; + void *d; + + d = acc_malloc (N); + if (d == NULL) + abort (); + + acc_free ((void *)((uintptr_t) d + (uintptr_t) (N >> 1))); + + return 0; +} + +/* { dg-shouldfail "libgomp: mem free failed 1" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-12.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-12.c new file mode 100644 index 0000000..b46f590 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-12.c @@ -0,0 +1,37 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <string.h> +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + memset (h, 0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-13.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-13.c new file mode 100644 index 0000000..7098ef3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-13.c @@ -0,0 +1,60 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +#include <stdio.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + + if (acc_is_present (h, 1) != 1) + abort (); + + if (acc_is_present (h, N + 1) != 0) + abort (); + + if (acc_is_present (h + 1, N) != 0) + abort (); + + if (acc_is_present (h - 1, N) != 0) + abort (); + + if (acc_is_present (h - 1, N - 1) != 0) + abort (); + + if (acc_is_present (h + N, 0) != 
0) + abort (); + + if (acc_is_present (h + N, N) != 0) + abort (); + + if (acc_is_present (0, N) != 0) + abort (); + + if (acc_is_present (h, 0) != 0) + abort (); + + acc_free (d); + + if (acc_is_present (h, 1) != 0) + abort (); + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-14.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-14.c new file mode 100644 index 0000000..a9632f7 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-14.c @@ -0,0 +1,61 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +#include <stdio.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + + if (acc_is_present (h, 1) != 1) + abort (); + + if (acc_is_present (h + N - 1, 1) != 1) + abort (); + + if (acc_is_present (h - 1, 1) != 0) + abort (); + + if (acc_is_present (h + N, 1) != 0) + abort (); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 1) + abort (); + } + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, N - i) != 1) + abort (); + } + + acc_free (d); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, N - i) != 0) + abort (); + } + + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-15.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-15.c new file mode 100644 index 0000000..4f6a731 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-15.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 0) + abort (); + } + + free (h); + + 
return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-16.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-16.c new file mode 100644 index 0000000..9d277ac --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-16.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + (void) acc_copyin (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,\+256\] already mapped to \[\h+,\+256\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-17.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-17.c new file mode 100644 index 0000000..5ff894c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-17.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + acc_copyout (h, N); + + acc_copyout (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-18.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-18.c new file mode 100644 index 0000000..2bc32637 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-18.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +#include <stdio.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + + acc_free (d); + + acc_copyout (h, N); + + free (h); + + return 0; +} + +/* { 
dg-shouldfail "libgomp: \[\h+,256\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-19.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-19.c new file mode 100644 index 0000000..3581616 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-19.c @@ -0,0 +1,60 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <string.h> +#include <stdlib.h> +#include <openacc.h> + +#include <stdio.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h[N]; + + for (i = 0; i < N; i++) + { + int j; + unsigned char *p; + + h[i] = (unsigned char *) malloc (N); + p = h[i]; + + for (j = 0; j < N; j++) + { + p[j] = i; + } + + (void) acc_copyin (p, N); + } + + for (i = 0; i < N; i++) + { + memset (h[i], 0, i); + } + + for (i = 0; i < N; i++) + { + int j; + unsigned char *p; + + acc_copyout (h[i], N); + + p = h[i]; + + for (j = 0; j < N; j++) + { + if (p[j] != i) + abort (); + } + } + + for (i = 0; i < N; i++) + { + free (h[i]); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-2.c new file mode 100644 index 0000000..9a4501f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-2.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <openacc.h> + +int +main (int argc, char **argv) +{ + acc_device_t devtype = acc_device_host; + +#if ACC_DEVICE_TYPE_nvidia + devtype = acc_device_nvidia; + + if (acc_get_num_devices (acc_device_nvidia) == 0) + return 0; +#endif + + acc_init (devtype); + + acc_shutdown (devtype); + + acc_shutdown (devtype); + + return 0; +} + +/* { dg-shouldfail "libgomp: no device initialized" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-20.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-20.c new file mode 100644 index 0000000..b379a8f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-20.c @@ -0,0 +1,29 
@@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + acc_copyout (h, N + 1); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] surrounds2 \[\h+,\+257\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-21.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-21.c new file mode 100644 index 0000000..3a67400 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-21.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + acc_copyout (h, 0); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,0\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-22.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-22.c new file mode 100644 index 0000000..2b86da8 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-22.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + (void) acc_copyin (h, N); + + acc_copyout (h + 1, N - 1); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] surrounds2 \[\h+,\+255\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-23.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-23.c new file mode 100644 index 0000000..38f236d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-23.c @@ -0,0 +1,39 @@ +/* { dg-do run } */ + 
+#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h1, *h2; + + h1 = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h1[i] = 0xab; + } + + (void) acc_copyin (h1, N); + + h2 = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h2[i] = 0xde; + } + + (void) acc_copyin (h2, N); + + acc_copyout (h1, N + N); + + free (h1); + free (h2); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] surrounds2 \[\h+,\+512\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-24.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-24.c new file mode 100644 index 0000000..d7de8e3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-24.c @@ -0,0 +1,55 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 1) + abort (); + } + + acc_delete (h, N); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 0) + abort (); + } + + d = acc_create (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 1) + abort (); + } + + acc_delete (h, N); + + for (i = 0; i < N; i++) + { + if (acc_is_present (h + i, 1) != 0) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-25.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-25.c new file mode 100644 index 0000000..1145828 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-25.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, N); + if (!d) 
+ abort (); + + d = acc_create (h, N); + if (!d) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] already mapped to \[\h+,256\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-26.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-26.c new file mode 100644 index 0000000..a23f56e --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-26.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, 0); + if (!d) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,\+0\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-27.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-27.c new file mode 100644 index 0000000..074fddb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-27.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (0, N); + if (!d) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\(nil\)\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-28.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-28.c new file mode 100644 index 0000000..027f7cc --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-28.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, N); + if (!d) + abort (); + + acc_delete (0, N); + + free (h); + + return 0; +} + +/* { 
dg-shouldfail "libgomp: \[\(nil\),256\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-29.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-29.c new file mode 100644 index 0000000..a66de0f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-29.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, N); + if (!d) + abort (); + + acc_delete (h, 0); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,0\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c new file mode 100644 index 0000000..e823a41 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ + +#include <openacc.h> + +int +main (int argc, char **argv) +{ + acc_init (acc_device_host); + + acc_shutdown (acc_device_not_host); + + return 0; +} + +/* { dg-shouldfail "libgomp: device 4(4) is initialized" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-30.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-30.c new file mode 100644 index 0000000..ce2bdb4 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-30.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_create (h, N); + if (!d) + abort (); + + acc_delete (h, N - 2); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] surrounds2 \[\h+,\+254\]" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-31.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-31.c new file mode 100644 index 0000000..25ce5a9 --- /dev/null +++ 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-31.c @@ -0,0 +1,27 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_present_or_create (h, N); + if (!d) + abort (); + + if (acc_is_present (h, 1) != 1) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-32.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-32.c new file mode 100644 index 0000000..e3f87a8 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-32.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d1, *d2; + + h = (unsigned char *) malloc (N); + + d1 = acc_present_or_create (h, N); + if (!d1) + abort (); + + d2 = acc_present_or_create (h, N); + if (!d2) + abort (); + + if (d1 != d2) + abort (); + + d2 = acc_pcreate (h, N); + if (!d2) + abort (); + + if (d1 != d2) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-33.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-33.c new file mode 100644 index 0000000..4abaa02 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-33.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d1, *d2; + + h = (unsigned char *) malloc (N); + + d1 = acc_present_or_create (h, N); + if (!d1) + abort (); + + d2 = acc_present_or_create (h, N - 2); + if (!d2) + abort (); + + if (d1 != d2) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-34.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-34.c new file mode 100644 index 
0000000..32d5d51 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-34.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d1, *d2; + + h = (unsigned char *) malloc (N); + + d1 = acc_present_or_create (h, N); + if (!d1) + abort (); + + d2 = acc_present_or_create (h + 2, N); + if (!d2) + abort (); + + if (d1 != d2) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,\+256\] not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-35.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-35.c new file mode 100644 index 0000000..ca8edab --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-35.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_present_or_create (0, N); + if (!d) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\(nil\),+256\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-36.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-36.c new file mode 100644 index 0000000..cb29397 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-36.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + d = acc_present_or_create (h, 0); + if (!d) + abort (); + + acc_delete (h, N); + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,\+0\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-37.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-37.c new file mode 100644 index 
0000000..5a7d533 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-37.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <string.h> +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_present_or_copyin (h, N); + if (!d) + abort (); + + memset (&h[0], 0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c new file mode 100644 index 0000000..05d8498 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c @@ -0,0 +1,64 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <string.h> +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d1, *d2; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d1 = acc_present_or_copyin (h, N); + if (!d1) + abort (); + + for (i = 0; i < N; i++) + { + h[i] = 0xab; + } + + d2 = acc_present_or_copyin (h, N); + if (!d2) + abort (); + + if (d1 != d2) + abort (); + + memset (&h[0], 0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + d2 = acc_pcopyin (h, N); + if (!d2) + abort (); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-39.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-39.c new file mode 100644 index 0000000..db1e0b3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-39.c @@ -0,0 
+1,41 @@ +/* { dg-do run } */ + +#include <string.h> +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_present_or_copyin (0, N); + if (!d) + abort (); + + memset (&h[0], 0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\(nil\),+256\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-4.c new file mode 100644 index 0000000..060275b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-4.c @@ -0,0 +1,13 @@ +/* { dg-do run } */ + +#include <openacc.h> + +int +main (int argc, char **argv) +{ + acc_init ((acc_device_t) 99); + + return 0; +} + +/* { dg-shouldfail "libgomp: device 99 is out of range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-40.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-40.c new file mode 100644 index 0000000..cb6c422 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-40.c @@ -0,0 +1,42 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_present_or_copyin (h, 0); + if (!d) + abort (); + + memset (&h[0], 0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,\+0\] is a bad range" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-41.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-41.c new file mode 100644 index 0000000..01c5f3c --- 
/dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-41.c @@ -0,0 +1,43 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + h[i] = 0xab; + } + + acc_update_device (h, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != 0xab) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c new file mode 100644 index 0000000..d577fe3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c @@ -0,0 +1,35 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + acc_update_device (h, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != 0xab) + abort (); + } + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,256\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c new file mode 100644 index 0000000..ceeb155 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-43.c @@ -0,0 +1,45 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + h[i] = 0xab; + } + + acc_update_device (0, N); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { 
+ if (h[i] != 0xab) + abort (); + } + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\(nil\),256\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-44.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-44.c new file mode 100644 index 0000000..0cabb0d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-44.c @@ -0,0 +1,45 @@ +/* { dg-do run } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + h[i] = 0xab; + } + + acc_update_device (h, 0); + + acc_copyout (h, N); + + for (i = 0; i < N; i++) + { + if (h[i] != 0xab) + abort (); + } + + free (h); + + return 0; +} + +/* { dg-shouldfail "libgomp: \[\h+,0\] is not mapped" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-45.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-45.c new file mode 100644 index 0000000..f9a6294 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-45.c @@ -0,0 +1,50 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + const int N = 256; + int i; + unsigned char *h; + void *d; + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_copyin (h, N); + if (!d) + abort (); + + for (i = 0; i < N; i++) + { + h[i] = 0xab; + } + + acc_update_device (h, N - 2); + + acc_copyout (h, N); + + for (i = 0; i < N - 2; i++) + { + if (h[i] != 0xab) + abort (); + } + + for (i = N - 2; i < N; i++) + { + if (h[i] != i) + abort (); + } + + free (h); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-46.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-46.c new file mode 
100644
index 0000000..b195725
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-46.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_copyin (h, N);
+  if (!d)
+    abort ();
+
+  memset (&h[0], 0, N);
+
+  acc_update_self (h, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_delete (h, N);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c
new file mode 100644
index 0000000..a7ff904
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-47.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_copyin (h, N);
+  if (!d)
+    abort ();
+
+  memset (&h[0], 0, N);
+
+  acc_update_self (0, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_delete (h, N);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \[\(nil\),256\] is not mapped" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-48.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-48.c
new file mode 100644
index 0000000..01d3c6c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-48.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0;
i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_copyin (h, N);
+  if (!d)
+    abort ();
+
+  memset (&h[0], 0, N);
+
+  acc_update_self (h, 0);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_delete (h, N);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \[\h+,0\] is not mapped" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-49.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-49.c
new file mode 100644
index 0000000..a33324c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-49.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_copyin (h, N);
+  if (!d)
+    abort ();
+
+  memset (&h[0], 0, N);
+
+  acc_update_self (h, N - 2);
+
+  for (i = 0; i < N - 2; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  for (i = N - 2; i < N; i++)
+    {
+      if (h[i] != 0)
+        abort ();
+    }
+
+  acc_delete (h, N);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-5.c
new file mode 100644
index 0000000..961a62c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-5.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  if (acc_get_device_type () == acc_device_default)
+    abort ();
+
+  acc_init (acc_device_default);
+
+  if (acc_get_device_type () == acc_device_default)
+    abort ();
+
+  acc_shutdown (acc_device_default);
+
+  if (acc_get_num_devices (acc_device_nvidia) != 0)
+    {
+      acc_init (acc_device_nvidia);
+
+      if (acc_get_device_type () != acc_device_nvidia)
+        abort ();
+
+      acc_shutdown (acc_device_nvidia);
+
+      acc_init
(acc_device_default);
+
+      acc_set_device_type (acc_device_nvidia);
+
+      if (acc_get_device_type () != acc_device_nvidia)
+        abort ();
+
+      acc_shutdown (acc_device_nvidia);
+    }
+
+  return 0;
+
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-50.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-50.c
new file mode 100644
index 0000000..e8294e1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-50.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, N);
+
+  if (acc_is_present (h, N) != 1)
+    abort ();
+
+  acc_unmap_data (h);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-51.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-51.c
new file mode 100644
index 0000000..29d28f2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-51.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h[N];
+  void *d[N];
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = (unsigned char *) malloc (N);
+      d[i] = acc_malloc (N);
+
+      acc_map_data (h[i], d[i], N);
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_is_present (h[i], N) != 1)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      acc_unmap_data (h[i]);
+
+      if (acc_is_present (h[i], N) != 0)
+        abort ();
+
+      acc_free (d[i]);
+      free (h[i]);
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-52.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-52.c
new file mode 100644
index 0000000..780db31
--- /dev/null
+++
b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-52.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (0, d, N);
+
+  acc_unmap_data (h);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \[(nil),+256\]->\[\h+,\+256\] is a bad map" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-53.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-53.c
new file mode 100644
index 0000000..657adde
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-53.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, 0, N);
+
+  acc_unmap_data (h);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \[\h+,\+256\]->\[(nil),\+256\] is a bad map" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-54.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-54.c
new file mode 100644
index 0000000..1f3df80
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-54.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, 0);
+
+  acc_unmap_data (h);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \[\h+,\+0\]->\[\h+,\+0\] is a bad map" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-55.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-55.c
new file mode 100644
index 0000000..286653f
--- /dev/null
+++
b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-55.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <stdlib.h>
+#include <openacc.h>
+#include <stdint.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  int i;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      acc_map_data ((void *)((uintptr_t) h + (uintptr_t) i),
+                    (void *)((uintptr_t) d + (uintptr_t) i), 1);
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_is_present (h + 1, 1) != 1)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      acc_unmap_data (h + i);
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_is_present (h + 1, 1) != 0)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-56.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-56.c
new file mode 100644
index 0000000..e3f5a80
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-56.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, N >> 1);
+
+  if (acc_is_present (h, 1) != 1)
+    abort ();
+
+  if (acc_is_present (h + (N >> 1), 1) != 0)
+    abort ();
+
+  acc_unmap_data (h);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-57.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-57.c
new file mode 100644
index 0000000..f9043a4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-57.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned
char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, N);
+
+  acc_unmap_data (d);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \h+ is not a mapped block" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-58.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-58.c
new file mode 100644
index 0000000..9d6e27d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-58.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, N);
+
+  acc_unmap_data (0);
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: \(nil\) is not a mapped block" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-59.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-59.c
new file mode 100644
index 0000000..2f087ae
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-59.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <stdlib.h>
+#include <openacc.h>
+#include <stdint.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  d = acc_malloc (N);
+
+  acc_map_data (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_hostptr ((void *)((uintptr_t) d + (uintptr_t) i)) !=
+          (void *)((uintptr_t) h + (uintptr_t) i))
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_deviceptr ((void *)((uintptr_t) h + (uintptr_t) i)) !=
+          (void *)((uintptr_t) d + (uintptr_t) i))
+        abort ();
+    }
+
+  acc_unmap_data (h);
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_hostptr ((void *)((uintptr_t) d + (uintptr_t) i)) != 0)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_deviceptr (h + i) != 0)
+        abort ();
+    }
+
+  acc_free
(d);
+
+  free (h);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-6.c
new file mode 100644
index 0000000..afdd480
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-6.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  int devnum;
+
+  if (acc_get_device_type () == acc_device_default)
+    abort ();
+
+  if (acc_get_num_devices (acc_device_nvidia) == 0)
+    return 0;
+
+  acc_set_device_type (acc_device_nvidia);
+
+  if (acc_get_device_type () != acc_device_nvidia)
+    abort ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  acc_set_device_type (acc_device_nvidia);
+
+  if (acc_get_device_type () != acc_device_nvidia)
+    abort ();
+
+  devnum = acc_get_num_devices (acc_device_host);
+  if (devnum != 1)
+    abort ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  if (acc_get_device_type () == acc_device_default)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-60.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-60.c
new file mode 100644
index 0000000..ccae728e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-60.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, h, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_is_present (h + i, 1) != 0)
+        abort ();
+    }
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_is_present (h + i, 1) != 0)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+
return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-61.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-61.c
new file mode 100644
index 0000000..ce66ced
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-61.c
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h[N];
+  void *d[N];
+
+  for (i = 0; i < N; i++)
+    {
+      int j;
+      unsigned char *p;
+
+      h[i] = (unsigned char *) malloc (N);
+
+      p = h[i];
+
+      for (j = 0; j < N; j++)
+        {
+          p[j] = i;
+        }
+
+      d[i] = acc_malloc (N);
+
+      acc_memcpy_to_device (d[i], h[i], N);
+
+      for (j = 0; j < N; j++)
+        {
+          if (acc_is_present (h[i] + j, 1) != 0)
+            abort ();
+        }
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      int j;
+      unsigned char *p;
+
+      memset (h[i], 0, N);
+
+      acc_memcpy_from_device (h[i], d[i], N);
+
+      p = h[i];
+
+      for (j = 0; j < N; j++)
+        {
+          if (p[j] != i)
+            abort ();
+        }
+
+      for (j = 0; j < N; j++)
+        {
+          if (acc_is_present (h[i] + j, 1) != 0)
+            abort ();
+        }
+
+      acc_free (d[i]);
+
+      free (h[i]);
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-62.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-62.c
new file mode 100644
index 0000000..e6178e2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-62.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  acc_init (acc_device_nvidia);
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, h, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_to_device (d, h, N << 1);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort
();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  acc_shutdown (acc_device_nvidia);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid size" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-63.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-63.c
new file mode 100644
index 0000000..ca237ec
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-63.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (0, h, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid device address" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-64.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-64.c
new file mode 100644
index 0000000..850fd2e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-64.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, 0, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid host address" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-65.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-65.c
new file mode 100644
index 0000000..26c8cef
--- /dev/null
+++
b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-65.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, d, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid host or device address" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c
new file mode 100644
index 0000000..398dc2a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  acc_init (acc_device_default);
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, h, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_to_device (d, h, 0);
+
+  acc_memcpy_from_device (h, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  acc_shutdown (acc_device_default);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-67.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-67.c
new file mode 100644
index 0000000..01b8b2d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-67.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N =
256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, h, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (0, d, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid host address" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-68.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-68.c
new file mode 100644
index 0000000..3ff5bd7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-68.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 256;
+  int i;
+  unsigned char *h;
+  void *d;
+
+  h = (unsigned char *) malloc (N);
+
+  for (i = 0; i < N; i++)
+    {
+      h[i] = i;
+    }
+
+  d = acc_malloc (N);
+
+  acc_memcpy_to_device (d, h, N);
+
+  memset (&h[0], 0, N);
+
+  acc_memcpy_from_device (h, 0, N);
+
+  for (i = 0; i < N; i++)
+    {
+      if (h[i] != i)
+        abort ();
+    }
+
+  acc_free (d);
+
+  free (h);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: invalid device address" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c
new file mode 100644
index 0000000..5462f12
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c
@@ -0,0 +1,124 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum =
acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  stream = (CUstream) acc_get_cuda_stream (0);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+      abort ();
+    }
+
+  if (acc_async_test (0) != 0)
+    {
+      fprintf (stderr, "asynchronous operation not running\n");
+      abort ();
+    }
+
+  sleep (1);
+
+  if (acc_async_test (0) != 1)
+    {
+      fprintf (stderr, "found asynchronous operation still running\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff
--git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-7.c
new file mode 100644
index 0000000..e78734b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-7.c
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  if (acc_get_num_devices (acc_device_none) != 0)
+    abort ();
+
+  if (acc_get_num_devices (acc_device_host) == 0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-70.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-70.c
new file mode 100644
index 0000000..912b266
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-70.c
@@ -0,0 +1,136 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  const int N = 10;
+  int i;
+  CUstream streams[N];
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort
();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  for (i = 0; i < N; i++)
+    {
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, streams[i], kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+
+      if (acc_async_test (i) != 0)
+        {
+          fprintf (stderr, "asynchronous operation not running\n");
+          abort ();
+        }
+    }
+
+  sleep ((int) (dtime / 1000.0f) + 1);
+
+  for (i = 0; i < N; i++)
+    {
+      if (acc_async_test (i) != 1)
+        {
+          fprintf (stderr, "found asynchronous operation still running\n");
+          abort ();
+        }
+    }
+
+  acc_unmap_data (a);
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c
new file mode 100644
index 0000000..a045379
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c
@@ -0,0 +1,119 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+
CUresult r;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  acc_set_cuda_stream (0, stream);
+
+  r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+      abort ();
+    }
+
+  if (acc_async_test (1) != 0)
+    {
+      fprintf (stderr, "asynchronous operation not running\n");
+      abort ();
+    }
+
+  sleep ((int) (dtime / 1000.0f) + 1);
+
+  if (acc_async_test (1) != 1)
+    {
+      fprintf (stderr, "found asynchronous operation still running\n");
+      abort ();
+    }
+
+  acc_unmap_data
(a);
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: unknown async \d" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-72.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-72.c
new file mode 100644
index 0000000..e383ba0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-72.c
@@ -0,0 +1,121 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+
kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+      abort ();
+    }
+
+  if (acc_async_test_all () != 0)
+    {
+      fprintf (stderr, "asynchronous operation not running\n");
+      abort ();
+    }
+
+  sleep ((int) (dtime / 1000.f) + 1);
+
+  if (acc_async_test_all () != 1)
+    {
+      fprintf (stderr, "found asynchronous operation still running\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-73.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-73.c
new file mode 100644
index 0000000..43a8b7e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-73.c
@@ -0,0 +1,134 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  const int N = 10;
+  int i;
+  CUstream streams[N];
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+
abort (); + } + + r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); + abort (); + } + + r = cuModuleLoad (&module, "subr.ptx"); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuModuleLoad failed: %d\n", r); + abort (); + } + + r = cuModuleGetFunction (&delay, module, "delay"); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuModuleGetFunction failed: %d\n", r); + abort (); + } + + nbytes = nprocs * sizeof (unsigned long); + + dtime = 200.0; + + dticks = (unsigned long) (dtime * clkrate); + + a = (unsigned long *) malloc (nbytes); + d_a = (unsigned long *) acc_malloc (nbytes); + + acc_map_data (a, d_a, nbytes); + + kargs[0] = (void *) &d_a; + kargs[1] = (void *) &dticks; + + for (i = 0; i < N; i++) + { + streams[i] = (CUstream) acc_get_cuda_stream (i); + if (streams[i] != NULL) + abort (); + + r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + + if (!acc_set_cuda_stream (i, streams[i])) + abort (); + } + + for (i = 0; i < N; i++) + { + r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, streams[i], kargs, 0); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuLaunchKernel failed: %d\n", r); + abort (); + } + + } + + if (acc_async_test_all () != 0) + { + fprintf (stderr, "asynchronous operation not running\n"); + abort (); + } + + sleep ((int) (dtime / 1000.0f) + 1); + + if (acc_async_test_all () != 1) + { + fprintf (stderr, "asynchronous operation not running\n"); + abort (); + } + + acc_unmap_data (a); + + free (a); + acc_free (d_a); + + acc_shutdown (acc_device_nvidia); + + exit (0); +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c new file mode 100644 index 0000000..0726ee4 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c @@ 
-0,0 +1,139 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda" } */ + +#include <stdio.h> +#include <stdlib.h> +#include <openacc.h> +#include <cuda.h> +#include "timer.h" + +int +main (int argc, char **argv) +{ + CUdevice dev; + CUfunction delay; + CUmodule module; + CUresult r; + CUstream stream; + unsigned long *a, *d_a, dticks; + int nbytes; + float atime, dtime; + void *kargs[2]; + int clkrate; + int devnum, nprocs; + + acc_init (acc_device_nvidia); + + devnum = acc_get_device_num (acc_device_nvidia); + + r = cuDeviceGet (&dev, devnum); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGet failed: %d\n", r); + abort (); + } + + r = + cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, + dev); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); + abort (); + } + + r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); + abort (); + } + + r = cuModuleLoad (&module, "subr.ptx"); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuModuleLoad failed: %d\n", r); + abort (); + } + + r = cuModuleGetFunction (&delay, module, "delay"); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuModuleGetFunction failed: %d\n", r); + abort (); + } + + nbytes = nprocs * sizeof (unsigned long); + + dtime = 200.0; + + dticks = (unsigned long) (dtime * clkrate); + + a = (unsigned long *) malloc (nbytes); + d_a = (unsigned long *) acc_malloc (nbytes); + + acc_map_data (a, d_a, nbytes); + + kargs[0] = (void *) &d_a; + kargs[1] = (void *) &dticks; + + stream = (CUstream) acc_get_cuda_stream (0); + if (stream != NULL) + abort (); + + r = cuStreamCreate (&stream, CU_STREAM_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + + if (!acc_set_cuda_stream (0, stream)) + abort (); + + init_timers (1); + + start_timer 
(0); + + r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuLaunchKernel failed: %d\n", r); + abort (); + } + + acc_wait (0); + + atime = stop_timer (0); + + if (atime < dtime) + { + fprintf (stderr, "actual time < delay time\n"); + abort (); + } + + start_timer (0); + + acc_wait (0); + + atime = stop_timer (0); + + if (0.010 < atime) + { + fprintf (stderr, "actual time too long\n"); + abort (); + } + + acc_unmap_data (a); + + fini_timers (); + + free (a); + acc_free (d_a); + + acc_shutdown (acc_device_nvidia); + + exit (0); +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-75.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-75.c new file mode 100644 index 0000000..1942211 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-75.c @@ -0,0 +1,141 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda" } */ + +#include <stdio.h> +#include <unistd.h> +#include <stdlib.h> +#include <openacc.h> +#include <cuda.h> +#include "timer.h" + +int +main (int argc, char **argv) +{ + CUdevice dev; + CUfunction delay; + CUmodule module; + CUresult r; + int N; + int i; + CUstream stream; + unsigned long *a, *d_a, dticks; + int nbytes; + float atime, dtime, hitime, lotime; + void *kargs[2]; + int clkrate; + int devnum, nprocs; + + acc_init (acc_device_nvidia); + + devnum = acc_get_device_num (acc_device_nvidia); + + r = cuDeviceGet (&dev, devnum); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGet failed: %d\n", r); + abort (); + } + + r = + cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, + dev); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); + abort (); + } + + r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); + abort (); + } + 
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  N = nprocs;
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  stream = (CUstream) acc_get_cuda_stream (0);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  init_timers (1);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  start_timer (0);
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+
+      acc_wait (0);
+    }
+
+  atime = stop_timer (0);
+
+  hitime = dtime * N;
+  hitime += hitime * 0.02;
+
+  lotime = dtime * N;
+  lotime -= lotime * 0.02;
+
+  if (atime > hitime || atime < lotime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-76.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-76.c
new file mode 100644
index 0000000..11d9d62
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-76.c
@@ -0,0 +1,147 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  int N;
+  int i;
+  CUstream *streams;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime, hitime, lotime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  N = nprocs;
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  streams = (CUstream *) malloc (N * sizeof (void *));
+
+  for (i = 0; i < N; i++)
+    {
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  init_timers (1);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  start_timer (0);
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, streams[i], kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+
+      acc_wait (i);
+    }
+
+  atime = stop_timer (0);
+
+  hitime = dtime * N;
+  hitime += hitime * 0.02;
+
+  lotime = dtime * N;
+  lotime -= lotime * 0.02;
+
+  if (atime > hitime || atime < lotime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (streams);
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c
new file mode 100644
index 0000000..e47212b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c
@@ -0,0 +1,135 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  acc_set_cuda_stream (0, stream);
+
+  init_timers (1);
+
+  start_timer (0);
+
+  r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+      abort ();
+    }
+
+  acc_wait (1);
+
+  atime = stop_timer (0);
+
+  if (atime < dtime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  start_timer (0);
+
+  acc_wait (1);
+
+  atime = stop_timer (0);
+
+  if (0.010 < atime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: unknown async \d" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-78.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-78.c
new file mode 100644
index 0000000..4f58fb2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-78.c
@@ -0,0 +1,140 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  stream = (CUstream) acc_get_cuda_stream (0);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  init_timers (1);
+
+  start_timer (0);
+
+  r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+      abort ();
+    }
+
+  acc_wait_all ();
+
+  atime = stop_timer (0);
+
+  if (atime < dtime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  start_timer (0);
+
+  acc_wait_all ();
+
+  atime = stop_timer (0);
+
+  if (0.010 < atime)
+    {
+      fprintf (stderr, "actual time too long\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c
new file mode 100644
index 0000000..ef3df13
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c
@@ -0,0 +1,167 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  int N;
+  int i;
+  CUstream stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime, hitime, lotime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  devnum = 2;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  N = nprocs;
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (1, stream))
+    abort ();
+
+  stream = (CUstream) acc_get_cuda_stream (0);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  init_timers (1);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  start_timer (0);
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+    }
+
+  acc_wait_async (0, 1);
+
+  if (acc_async_test (0) != 0)
+    abort ();
+
+  if (acc_async_test (1) != 0)
+    abort ();
+
+  acc_wait (1);
+
+  atime = stop_timer (0);
+
+  if (acc_async_test (0) != 1)
+    abort ();
+
+  if (acc_async_test (1) != 1)
+    abort ();
+
+  hitime = dtime * N;
+  hitime += hitime * 0.02;
+
+  lotime = dtime * N;
+  lotime -= lotime * 0.02;
+
+  if (atime > hitime || atime < lotime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-80.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-80.c
new file mode 100644
index 0000000..0b5ec24
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-80.c
@@ -0,0 +1,132 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  CUstream stream;
+  int N;
+  int i;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 200.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  N = nprocs;
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  acc_set_cuda_stream (1, stream);
+
+  init_timers (1);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  start_timer (0);
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+    }
+
+  acc_wait_async (1, 1);
+
+  acc_wait (1);
+
+  atime = stop_timer (0);
+
+  if (atime < dtime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  return 0;
+}
+
+/* { dg-shouldfail "libgomp: identical parameters" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-81.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-81.c
new file mode 100644
index 0000000..d5f18f0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-81.c
@@ -0,0 +1,211 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay;
+  CUmodule module;
+  CUresult r;
+  int N;
+  int i;
+  CUstream *streams, stream;
+  unsigned long *a, *d_a, dticks;
+  int nbytes;
+  float atime, dtime;
+  void *kargs[2];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay, module, "delay");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = nprocs * sizeof (unsigned long);
+
+  dtime = 500.0;
+
+  dticks = (unsigned long) (dtime * clkrate);
+
+  N = nprocs;
+
+  a = (unsigned long *) malloc (nbytes);
+  d_a = (unsigned long *) acc_malloc (nbytes);
+
+  acc_map_data (a, d_a, nbytes);
+
+  streams = (CUstream *) malloc (N * sizeof (void *));
+
+  for (i = 0; i < N; i++)
+    {
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  init_timers (1);
+
+  kargs[0] = (void *) &d_a;
+  kargs[1] = (void *) &dticks;
+
+  stream = (CUstream) acc_get_cuda_stream (N);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (N, stream))
+    abort ();
+
+  start_timer (0);
+
+  for (i = 0; i < N; i++)
+    {
+      r = cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, streams[i], kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+    }
+
+  acc_wait_all_async (N);
+
+  for (i = 0; i <= N; i++)
+    {
+      if (acc_async_test (i) != 0)
+        abort ();
+    }
+
+  acc_wait (N);
+
+  for (i = 0; i <= N; i++)
+    {
+      if (acc_async_test (i) != 1)
+        abort ();
+    }
+
+  atime = stop_timer (0);
+
+  if (atime < dtime)
+    {
+      fprintf (stderr, "actual time < delay time\n");
+      abort ();
+    }
+
+  start_timer (0);
+
+  stream = (CUstream) acc_get_cuda_stream (N + 1);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (N + 1, stream))
+    abort ();
+
+  acc_wait_all_async (N + 1);
+
+  acc_wait (N + 1);
+
+  atime = stop_timer (0);
+
+  if (0.10 < atime)
+    {
+      fprintf (stderr, "actual time too long\n");
+      abort ();
+    }
+
+  start_timer (0);
+
+  acc_wait_all_async (N);
+
+  acc_wait (N);
+
+  atime = stop_timer (0);
+
+  if (0.10 < atime)
+    {
+      fprintf (stderr, "actual time too long\n");
+      abort ();
+    }
+
+  acc_unmap_data (a);
+
+  fini_timers ();
+
+  free (streams);
+  free (a);
+  acc_free (d_a);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-82.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-82.c
new file mode 100644
index 0000000..be30a7f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-82.c
@@ -0,0 +1,144 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  CUdevice dev;
+  CUfunction delay2;
+  CUmodule module;
+  CUresult r;
+  int N;
+  int i;
+  CUstream *streams;
+  unsigned long **a, **d_a, *tid, ticks;
+  int nbytes;
+  void *kargs[3];
+  int clkrate;
+  int devnum, nprocs;
+
+  acc_init (acc_device_nvidia);
+
+  devnum = acc_get_device_num (acc_device_nvidia);
+
+  r = cuDeviceGet (&dev, devnum);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGet failed: %d\n", r);
+      abort ();
+    }
+
+  r =
+    cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
+                          dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, dev);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleLoad (&module, "subr.ptx");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleLoad failed: %d\n", r);
+      abort ();
+    }
+
+  r = cuModuleGetFunction (&delay2, module, "delay2");
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuModuleGetFunction failed: %d\n", r);
+      abort ();
+    }
+
+  nbytes = sizeof (int);
+
+  ticks = (unsigned long) (200.0 * clkrate);
+
+  N = nprocs;
+
+  streams = (CUstream *) malloc (N * sizeof (void *));
+
+  a = (unsigned long **) malloc (N * sizeof (unsigned long *));
+  d_a = (unsigned long **) malloc (N * sizeof (unsigned long *));
+  tid = (unsigned long *) malloc (N * sizeof (unsigned long));
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = (unsigned long *) malloc (sizeof (unsigned long));
+      *a[i] = N;
+      d_a[i] = (unsigned long *) acc_malloc (nbytes);
+      tid[i] = i;
+
+      acc_map_data (a[i], d_a[i], nbytes);
+
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      kargs[0] = (void *) &d_a[i];
+      kargs[1] = (void *) &ticks;
+      kargs[2] = (void *) &tid[i];
+
+      r = cuLaunchKernel (delay2, 1, 1, 1, 1, 1, 1, 0, streams[i], kargs, 0);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuLaunchKernel failed: %d\n", r);
+          abort ();
+        }
+
+      ticks = (unsigned long) (50.0 * clkrate);
+    }
+
+  acc_wait_all_async (0);
+
+  for (i = 0; i < N; i++)
+    {
+      acc_copyout (a[i], nbytes);
+      if (*a[i] != i)
+        abort ();
+    }
+
+  free (streams);
+
+  for (i = 0; i < N; i++)
+    {
+      free (a[i]);
+    }
+
+  free (a);
+  free (d_a);
+  free (tid);
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-83.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-83.c
new file mode 100644
index 0000000..1c2e52b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-83.c
@@ -0,0 +1,58 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include "timer.h"
+
+int
+main (int argc, char **argv)
+{
+  float atime;
+  CUstream stream;
+  CUresult r;
+
+  acc_init (acc_device_nvidia);
+
+  (void) acc_get_device_num (acc_device_nvidia);
+
+  init_timers (1);
+
+  stream = (CUstream) acc_get_cuda_stream (0);
+  if (stream != NULL)
+    abort ();
+
+  r = cuStreamCreate (&stream, CU_STREAM_DEFAULT);
+  if (r != CUDA_SUCCESS)
+    {
+      fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+      abort ();
+    }
+
+  if (!acc_set_cuda_stream (0, stream))
+    abort ();
+
+  start_timer (0);
+
+  acc_wait_all_async (0);
+
+  acc_wait (0);
+
+  atime = stop_timer (0);
+
+  if (0.010 < atime)
+    {
+      fprintf (stderr, "actual time too long\n");
+      abort ();
+    }
+
+  fini_timers ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-84.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-84.c
new file mode 100644
index 0000000..786b908
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-84.c
@@ -0,0 +1,66 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <openacc.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 100;
+  int i;
+  CUstream *streams;
+  CUstream s;
+  CUresult r;
+
+  acc_init (acc_device_nvidia);
+
+  (void) acc_get_device_num (acc_device_nvidia);
+
+  streams = (CUstream *) malloc (N * sizeof (void *));
+
+  for (i = 0; i < N; i++)
+    {
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      int j;
+      int cnt;
+
+      cnt = 0;
+
+      s = streams[i];
+
+      for (j = 0; j < N; j++)
+        {
+          if (s == streams[j])
+            cnt++;
+        }
+
+      if (cnt != 1)
+        abort ();
+    }
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-85.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-85.c
new file mode 100644
index 0000000..cf925a7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-85.c
@@ -0,0 +1,52 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-lcuda" } */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+#include <stdio.h>
+#include <cuda.h>
+
+int
+main (int argc, char **argv)
+{
+  const int N = 100;
+  int i;
+  CUstream *streams;
+  CUstream s;
+  CUresult r;
+
+  acc_init (acc_device_nvidia);
+
+  (void) acc_get_device_num (acc_device_nvidia);
+
+  streams = (CUstream *) malloc (N * sizeof (void *));
+
+  for (i = 0; i < N; i++)
+    {
+      streams[i] = (CUstream) acc_get_cuda_stream (i);
+      if (streams[i] != NULL)
+        abort ();
+
+      r = cuStreamCreate (&streams[i], CU_STREAM_DEFAULT);
+      if (r != CUDA_SUCCESS)
+        {
+          fprintf (stderr, "cuStreamCreate failed: %d\n", r);
+          abort ();
+        }
+
+      if (!acc_set_cuda_stream (i, streams[i]))
+        abort ();
+    }
+
+  s = NULL;
+
+  if (acc_set_cuda_stream (N + 1, s) != 0)
+    abort ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  exit (0);
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-86.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-86.c
new file mode 100644
index 0000000..b8a8ee9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-86.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  if (acc_get_num_devices (acc_device_nvidia) == 0)
+    return 0;
+
+  if (acc_get_current_cuda_device () != 0)
+    abort ();
+
+  acc_init (acc_device_host);
+
+  if (acc_get_current_cuda_device () != 0)
+    abort ();
+
+  acc_shutdown (acc_device_host);
+
+  if (acc_get_num_devices (acc_device_nvidia) == 0)
+    return 0;
+
+  if (acc_get_current_cuda_device () != 0)
+    abort ();
+
+  acc_init (acc_device_nvidia);
+
+  if (acc_get_current_cuda_device () == 0)
+    abort ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  if (acc_get_current_cuda_device () != 0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-87.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-87.c
new file mode 100644
index 0000000..147d443
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-87.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+  if (acc_get_num_devices (acc_device_nvidia) == 0)
+    return 0;
+
+  if (acc_get_current_cuda_context () != 0)
+    abort ();
+
+  acc_init (acc_device_host);
+
+  if (acc_get_current_cuda_context () != 0)
+    abort ();
+
+  acc_shutdown (acc_device_host);
+
+  if (acc_get_num_devices (acc_device_nvidia) == 0)
+    return 0;
+
+  if (acc_get_current_cuda_context () != 0)
+    abort ();
+
+  acc_init (acc_device_nvidia);
+
+  if (acc_get_current_cuda_context () == 0)
+    abort ();
+
+  acc_shutdown (acc_device_nvidia);
+
+  if (acc_get_current_cuda_context () != 0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-output "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-88.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-88.c
new file mode 100644
index 0000000..10f4ad8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-88.c
@@ -0,0 +1,111 @@
+/* { dg-do run } */
+
+#include <stdio.h>
+#include <pthread.h>
+#include <string.h>
+#include <stdlib.h>
+#include <ctype.h>
+#include <openacc.h>
+
+unsigned char *x;
+void *d_x;
+const int N = 256;
+
+static void *
+test (void
*arg) +{ + int i; + + if (acc_get_current_cuda_context () != NULL) + abort (); + + if (acc_is_present (x, N) != 1) + abort (); + + memset (x, 0, N); + + acc_copyout (x, N); + + for (i = 0; i < N; i++) + { + if (x[i] != i) + abort (); + + x[i] = N - i - 1; + } + + d_x = acc_copyin (x, N); + + return 0; +} + +int +main (int argc, char **argv) +{ + const int nthreads = 1; + int i; + pthread_attr_t attr; + pthread_t *tid; + + if (acc_get_num_devices (acc_device_nvidia) == 0) + return 0; + + acc_init (acc_device_nvidia); + + x = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + x[i] = i; + } + + d_x = acc_copyin (x, N); + + if (acc_is_present (x, N) != 1) + abort (); + + if (pthread_attr_init (&attr) != 0) + perror ("pthread_attr_init failed"); + + tid = (pthread_t *) malloc (nthreads * sizeof (pthread_t)); + + for (i = 0; i < nthreads; i++) + { + if (pthread_create (&tid[i], &attr, &test, (void *) (unsigned long) (i)) + != 0) + perror ("pthread_create failed"); + } + + if (pthread_attr_destroy (&attr) != 0) + perror ("pthread_attr_destroy failed"); + + for (i = 0; i < nthreads; i++) + { + void *res; + + if (pthread_join (tid[i], &res) != 0) + perror ("pthread join failed"); + } + + if (acc_is_present (x, N) != 1) + abort (); + + memset (x, 0, N); + + acc_copyout (x, N); + + for (i = 0; i < N; i++) + { + if (x[i] != N - i - 1) + abort (); + } + + if (acc_is_present (x, N) != 0) + abort (); + + acc_shutdown (acc_device_nvidia); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-89.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-89.c new file mode 100644 index 0000000..061c409 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-89.c @@ -0,0 +1,118 @@ +/* { dg-do run } */ + +#include <stdio.h> +#include <pthread.h> +#include <string.h> +#include <stdlib.h> +#include <errno.h> +#include <ctype.h> +#include <openacc.h> + +unsigned char **x; +void **d_x; +const int N = 16; +const int 
NTHREADS = 32; + +static void * +test (void *arg) +{ + int i; + int tid; + unsigned char *p; + int devnum; + + tid = (int) (long) arg; + + devnum = acc_get_device_num (acc_device_nvidia); + acc_set_device_num (devnum, acc_device_nvidia); + + if (acc_get_current_cuda_context () == NULL) + abort (); + + p = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + p[i] = tid; + } + + x[tid] = p; + + d_x[tid] = acc_copyin (p, N); + + return 0; +} + +int +main (int argc, char **argv) +{ + int i; + pthread_attr_t attr; + pthread_t *tid; + + if (acc_get_num_devices (acc_device_nvidia) == 0) + return 0; + + acc_init (acc_device_nvidia); + + x = (unsigned char **) malloc (NTHREADS * N); + d_x = (void **) malloc (NTHREADS * N); + + if (pthread_attr_init (&attr) != 0) + perror ("pthread_attr_init failed"); + + tid = (pthread_t *) malloc (NTHREADS * sizeof (pthread_t)); + + for (i = 0; i < NTHREADS; i++) + { + if (pthread_create (&tid[i], &attr, &test, (void *) (unsigned long) (i)) + != 0) + perror ("pthread_create failed"); + } + + if (pthread_attr_destroy (&attr) != 0) + perror ("pthread_attr_destroy failed"); + + for (i = 0; i < NTHREADS; i++) + { + void *res; + + if (pthread_join (tid[i], &res) != 0) + perror ("pthread join failed"); + } + + for (i = 0; i < NTHREADS; i++) + { + if (acc_is_present (x[i], N) != 1) + abort (); + } + + for (i = 0; i < NTHREADS; i++) + { + memset (x[i], 0, N); + acc_copyout (x[i], N); + } + + for (i = 0; i < NTHREADS; i++) + { + unsigned char *p; + int j; + + p = x[i]; + + for (j = 0; j < N; j++) + { + if (p[j] != i) + abort (); + } + + if (acc_is_present (x[i], N) != 0) + abort (); + } + + acc_shutdown (acc_device_nvidia); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-9.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-9.c new file mode 100644 index 0000000..84045db --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-9.c @@ -0,0 +1,70 @@ +/* { dg-do run } */ + 
+#include <stdlib.h> +#include <openacc.h> + +int +main (int argc, char **argv) +{ + int i; + int num_devices; + int devnum; + acc_device_t devtype = acc_device_host; + +#if ACC_DEVICE_TYPE_nvidia + devtype = acc_device_nvidia; +#endif + + num_devices = acc_get_num_devices (devtype); + if (num_devices == 0) + return 0; + + acc_init (devtype); + + for (i = 0; i < num_devices; i++) + { + acc_set_device_num (i, devtype); + devnum = acc_get_device_num (devtype); + if (devnum != i) + abort (); + } + + acc_shutdown (devtype); + + num_devices = acc_get_num_devices (devtype); + if (num_devices == 0) + abort (); + + for (i = 0; i < num_devices; i++) + { + acc_set_device_num (i, devtype); + devnum = acc_get_device_num (devtype); + if (devnum != i) + abort (); + } + + acc_shutdown (devtype); + + acc_init (devtype); + + acc_set_device_num (0, devtype); + + devnum = acc_get_device_num (devtype); + if (devnum != 0) + abort (); + + if (num_devices > 1) + { + acc_set_device_num (1, (acc_device_t) 0); + + devnum = acc_get_device_num (devtype); + if (devnum != 0) + abort (); + } + + acc_shutdown (devtype); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-90.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-90.c new file mode 100644 index 0000000..d17755b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-90.c @@ -0,0 +1,137 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda" } */ + +#include <pthread.h> +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <unistd.h> +#include <errno.h> +#include <ctype.h> +#include <openacc.h> +#include <cuda.h> + +unsigned char **x; +void **d_x; +const int N = 16; +const int NTHREADS = 32; + +static void * +test (void *arg) +{ + int i; + int tid; + unsigned char *p; + int devnum; + + tid = (int) (long) arg; + + devnum = acc_get_device_num (acc_device_nvidia); + acc_set_device_num (devnum, acc_device_nvidia); 
+ + if (acc_get_current_cuda_context () == NULL) + abort (); + + p = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + p[i] = tid; + } + + x[tid] = p; + + d_x[tid] = acc_copyin (p, N); + + acc_wait_all (); + + return 0; +} + +int +main (int argc, char **argv) +{ + int i; + pthread_attr_t attr; + pthread_t *tid; + CUresult r; + CUstream s; + + acc_init (acc_device_nvidia); + + x = (unsigned char **) malloc (NTHREADS * N); + d_x = (void **) malloc (NTHREADS * N); + + if (pthread_attr_init (&attr) != 0) + perror ("pthread_attr_init failed"); + + tid = (pthread_t *) malloc (NTHREADS * sizeof (pthread_t)); + + r = cuStreamCreate (&s, CU_STREAM_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + + if (!acc_set_cuda_stream (0, s)) + abort (); + + for (i = 0; i < NTHREADS; i++) + { + if (pthread_create (&tid[i], &attr, &test, (void *) (unsigned long) (i)) + != 0) + perror ("pthread_create failed"); + } + + if (pthread_attr_destroy (&attr) != 0) + perror ("pthread_attr_destroy failed"); + + for (i = 0; i < NTHREADS; i++) + { + void *res; + + if (pthread_join (tid[i], &res) != 0) + perror ("pthread join failed"); + } + + + for (i = 0; i < NTHREADS; i++) + { + if (acc_is_present (x[i], N) != 1) + abort (); + } + + acc_get_cuda_stream (1); + + for (i = 0; i < NTHREADS; i++) + { + memset (x[i], 0, N); + acc_copyout (x[i], N); + } + + acc_wait_all (); + + for (i = 0; i < NTHREADS; i++) + { + unsigned char *p; + int j; + + p = x[i]; + + for (j = 0; j < N; j++) + { + if (p[j] != i) + abort (); + } + + if (acc_is_present (x[i], N) != 0) + abort (); + } + + acc_shutdown (acc_device_nvidia); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-91.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-91.c new file mode 100644 index 0000000..e00ef4f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-91.c @@ -0,0 +1,84 @@ +/* { dg-do run { target 
openacc_nvidia_accel_selected } } */ +/* { dg-additional-options "-lcuda" } */ + +#include <stdlib.h> +#include <unistd.h> +#include <openacc.h> +#include <sys/time.h> +#include <stdio.h> +#include <cuda.h> + +int +main (int argc, char **argv) +{ + const int N = 1024 * 1024; + int i; + unsigned char *h; + void *d; + float async, sync; + struct timeval start, stop; + CUresult r; + CUstream s; + + acc_init (acc_device_nvidia); + + h = (unsigned char *) malloc (N); + + for (i = 0; i < N; i++) + { + h[i] = i; + } + + d = acc_malloc (N); + + acc_map_data (h, d, N); + + gettimeofday (&start, NULL); + + for (i = 0; i < 100; i++) + { +#pragma acc update device(h[0:N]) + } + + gettimeofday (&stop, NULL); + + sync = (float) (stop.tv_sec - start.tv_sec); + sync += (float) ((stop.tv_usec - start.tv_usec) / 1000000.0); + + gettimeofday (&start, NULL); + + r = cuStreamCreate (&s, CU_STREAM_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + + if (!acc_set_cuda_stream (0, s)) + abort (); + + for (i = 0; i < 100; i++) + { +#pragma acc update device(h[0:N]) async(0) + } + + acc_wait_all (); + + gettimeofday (&stop, NULL); + + async = (float) (stop.tv_sec - start.tv_sec); + async += (float) ((stop.tv_usec - start.tv_usec) / 1000000.0); + + if (async > (sync * 1.5)) + abort (); + + acc_free (d); + + free (h); + + acc_shutdown (acc_device_nvidia); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-92.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-92.c new file mode 100644 index 0000000..18193e0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-92.c @@ -0,0 +1,112 @@ +/* { dg-do run } */ + +#include <pthread.h> +#include <stdio.h> +#include <stdlib.h> +#include <errno.h> +#include <ctype.h> +#include <openacc.h> + +unsigned char **x; +void **d_x; +const int N = 32; +const int NTHREADS = 32; + +static void * +test (void *arg) +{ + int i; + int tid; + 
unsigned char *p; + int devnum; + + tid = (int) (long) arg; + + devnum = acc_get_device_num (acc_device_nvidia); + acc_set_device_num (devnum, acc_device_nvidia); + + if (acc_get_current_cuda_context () == NULL) + abort (); + + acc_copyout (x[tid], N); + + p = x[tid]; + + for (i = 0; i < N; i++) + { + if (p[i] != i) + abort (); + } + + return 0; +} + +int +main (int argc, char **argv) +{ + int i; + pthread_attr_t attr; + pthread_t *tid; + unsigned char *p; + + if (acc_get_num_devices (acc_device_nvidia) == 0) + return 0; + + acc_init (acc_device_nvidia); + + x = (unsigned char **) malloc (NTHREADS * N); + d_x = (void **) malloc (NTHREADS * N); + + for (i = 0; i < N; i++) + { + int j; + + p = (unsigned char *) malloc (N); + + x[i] = p; + + for (j = 0; j < N; j++) + { + p[j] = j; + } + + d_x[i] = acc_copyin (p, N); + } + + if (pthread_attr_init (&attr) != 0) + perror ("pthread_attr_init failed"); + + tid = (pthread_t *) malloc (NTHREADS * sizeof (pthread_t)); + + acc_get_cuda_stream (1); + + for (i = 0; i < NTHREADS; i++) + { + if (pthread_create (&tid[i], &attr, &test, (void *) (unsigned long) (i)) + != 0) + perror ("pthread_create failed"); + } + + if (pthread_attr_destroy (&attr) != 0) + perror ("pthread_attr_destroy failed"); + + for (i = 0; i < NTHREADS; i++) + { + void *res; + + if (pthread_join (tid[i], &res) != 0) + perror ("pthread join failed"); + } + + for (i = 0; i < NTHREADS; i++) + { + if (acc_is_present (x[i], N) != 0) + abort (); + } + + acc_shutdown (acc_device_nvidia); + + return 0; +} + +/* { dg-output "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-1.c new file mode 100644 index 0000000..ededf2b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-1.c @@ -0,0 +1,680 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <openacc.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include 
<stdbool.h> + +int +main (int argc, char **argv) +{ + int N = 8; + float *a, *b, *c, *d; + int i; + + a = (float *) malloc (N * sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + c = (float *) malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + } + +#pragma acc data copyin (a[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc data copyin (a[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 5.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc data present_or_copyin (a[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc data copyin (a[0:N]) present_or_copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N 
* sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 2.0; + } + + d = (float *) acc_copyin (&b[0], N * sizeof (float)); + +#pragma acc data copyin (a[0:N]) present_or_copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 4.0; + } + +#pragma acc data copy (a[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = a[ii] + 2; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 7.0; + } + +#pragma acc data present_or_copy (a[0:N]) present_or_copy (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = b[ii] + 2; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 9.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 7.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + d = (float *) acc_copyin (&b[0], N * sizeof (float)); + +#pragma acc data present_or_copy (a[0:N]) present_or_copy (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = a[ii] + 1; + b[ii] = b[ii] + 2; + } + } + } + + for (i = 0; i < 
N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 7.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + d = (float *) acc_deviceptr (&a[0]); + acc_unmap_data (&a[0]); + acc_free (d); + + d = (float *) acc_deviceptr (&b[0]); + acc_unmap_data (&b[0]); + acc_free (d); + + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 7.0; + } + +#pragma acc data copyin (a[0:N]) create (c[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&c[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 8.0; + } + +#pragma acc data copyin (a[0:N]) present_or_create (c[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&c[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 2.0; + b[i] = 5.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc data copyin (a[0:N]) present_or_create (c[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 2.0) + abort (); + + if (b[i] != 2.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof 
(float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + d = (float *) acc_deviceptr (c); + + acc_unmap_data (c); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 8.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + +#pragma acc data copyin (a[0:N]) present (c[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + acc_unmap_data (c); + + if (acc_is_present (c, (N * sizeof (float)))) + abort (); + + acc_free (d); + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (c, d, N * sizeof (float)); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (b, d, N * sizeof (float)); + + if (!acc_is_present (b, (N * sizeof (float)))) + abort (); + + d = (float *) acc_malloc (N * sizeof (float)); + acc_map_data (a, d, N * sizeof (float)); + + if (!acc_is_present (a, (N * sizeof (float)))) + abort (); + +#pragma acc data present (a[0:N]) present (c[0:N]) present (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + a[ii] = 1.0; + c[ii] = 2.0; + b[ii] = 4.0; + } + } + } + + if (!acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (b, (N * sizeof (float)))) + abort (); + + if (!acc_is_present (c, (N * sizeof (float)))) + abort (); + + acc_copyout (b, N * sizeof (float)); + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + d = (float *) 
acc_deviceptr (a); + + acc_unmap_data (a); + + acc_free (d); + + d = (float *) acc_deviceptr (c); + + acc_unmap_data (c); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 6.0; + } + + d = (float *) acc_malloc (N * sizeof (float)); + +#pragma acc parallel copyin (a[0:N]) deviceptr (d) copyout (b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + d[ii] = a[ii]; + b[ii] = d[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (acc_is_present (a, (N * sizeof (float)))) + abort (); + + if (acc_is_present (b, (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + + d = (float *) acc_copyin (&a[0], N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc data pcopyin (a[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + acc_free (d); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc data copyin (a[0:N]) pcopyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + } + + for (i = 0; i < N; i++) + { + if (b[i] != 6.0) + abort (); + } + + if (acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 7.0; + } + +#pragma acc data copyin (a[0:N]) pcreate (c[0:N]) copyout (b[0:N]) + { +#pragma acc parallel + { + int ii; + + for (ii = 0; ii < N; ii++) + { + c[ii] = a[ii]; + b[ii] = c[ii]; + } + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (acc_is_present (&a[0], 
(N * sizeof (float))))
+    abort ();
+
+  if (acc_is_present (&b[0], (N * sizeof (float))))
+    abort ();
+
+  if (acc_is_present (&c[0], (N * sizeof (float))))
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
new file mode 100644
index 0000000..c164598
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
@@ -0,0 +1,141 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int
+main (int argc, char *argv[])
+{
+#define N 10
+  char a[N];
+  int i;
+
+  for (i = 0; i < N; ++i)
+    a[i] = 0;
+
+#pragma acc data copy (a)
+  {
+#pragma acc parallel present (a)
+    {
+      int j;
+
+      for (j = 0; j < N; ++j)
+        a[j] = j;
+    }
+  }
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+  for (i = 0; i < N; ++i)
+    a[i] = 0;
+
+#pragma acc data copy (a)
+  {
+#pragma acc kernels present (a)
+    {
+      int j;
+
+      for (j = 0; j < N; ++j)
+        a[j] = j;
+    }
+  }
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+  for (i = 0; i < N; ++i)
+    a[i] = 0;
+
+#pragma acc data copy (a)
+  {
+#pragma acc data present (a)
+    {
+#pragma acc parallel present (a)
+      {
+        int j;
+
+        for (j = 0; j < N; ++j)
+          a[j] = j;
+      }
+    }
+  }
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+#pragma acc data copy (a)
+  {
+#pragma acc data present (a)
+    {
+#pragma acc kernels present (a)
+      {
+        int j;
+
+        for (j = 0; j < N; ++j)
+          a[j] = j;
+      }
+    }
+  }
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+  for (i = 0; i < N; ++i)
+    a[i] = 0;
+
+#pragma acc enter data copyin (a)
+
+#pragma acc data present (a)
+  {
+#pragma acc parallel present (a)
+    {
+      int j;
+
+      for (j = 0; j < N; ++j)
+        a[j] = j;
+    }
+  }
+
+#pragma acc exit data copyout (a)
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+#pragma acc enter data copyin (a)
+
+#pragma acc data present (a)
+  {
+#pragma acc kernels present (a)
+    {
+      int j;
+
+      for (j = 0; j < N; ++j)
+        a[j] =
j;
+    }
+  }
+
+#pragma acc exit data copyout (a)
+
+  for (i = 0; i < N; ++i)
+    {
+      if (a[i] != i)
+        abort ();
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/offset-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/offset-1.c
new file mode 100644
index 0000000..0bae23a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/offset-1.c
@@ -0,0 +1,97 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+int
+main(int argc, char **argv)
+{
+  int N = 8;
+  float *a, *b;
+  int i;
+
+  a = (float *) malloc(N * sizeof (float));
+  b = (float *) malloc(N * sizeof (float));
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 2.0;
+      b[i] = 5.0;
+    }
+
+#pragma acc parallel copyin(a[2:4]) copyout(b[2:4])
+  {
+    b[2] = a[2];
+    b[3] = a[3];
+  }
+
+  for (i = 2; i < 4; i++)
+    {
+      if (a[i] != 2.0)
+        abort();
+
+      if (b[i] != 2.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 1.0;
+    }
+
+#pragma acc parallel copyin(a[0:4]) copyout(b[0:4])
+  {
+    b[0] = a[0];
+    b[1] = a[1];
+    b[2] = a[2];
+    b[3] = a[3];
+  }
+
+  for (i = 0; i < 4; i++)
+    {
+      if (a[i] != 3.0)
+        abort();
+
+      if (b[i] != 3.0)
+        abort();
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 9.0;
+      b[i] = 6.0;
+    }
+
+#pragma acc parallel copyin(a[0:4]) copyout(b[4:4])
+  {
+    b[4] = a[0];
+    b[5] = a[1];
+    b[6] = a[2];
+    b[7] = a[3];
+  }
+
+  for (i = 0; i < 4; i++)
+    {
+      if (a[i] != 9.0)
+        abort();
+    }
+
+  for (i = 4; i < 8; i++)
+    {
+      if (b[i] != 9.0)
+        abort();
+    }
+
+  if (acc_is_present (a, (N * sizeof (float))))
+    abort();
+
+  if (acc_is_present (b, (N * sizeof (float))))
+    abort();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-1.c
new file mode 100644
index 0000000..fd9df33
--- /dev/null
+++
b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-1.c
@@ -0,0 +1,206 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int i;
+
+int main(void)
+{
+  int j, v;
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) copyin (i, j)
+  {
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) copyout (i, j)
+  {
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) copy (i, j)
+  {
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) create (i, j)
+  {
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) present_or_copyin (i, j)
+  {
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1)
+    abort ();
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) present_or_copyout (i, j)
+  {
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v)
present_or_copy (i, j)
+  {
+    if (i != -1 || j != -2)
+      abort ();
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+
+  i = -1;
+  j = -2;
+  v = 0;
+#pragma acc parallel /* copyout */ present_or_copyout (v) present_or_create (i, j)
+  {
+    i = 2;
+    j = 1;
+    if (i != 2 || j != 1)
+      abort ();
+    v = 1;
+  }
+  if (v != 1)
+    abort ();
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+  v = 0;
+
+#pragma acc data copyin (i, j)
+  {
+#pragma acc parallel /* copyout */ present_or_copyout (v) present (i, j)
+    {
+      if (i != -1 || j != -2)
+        abort ();
+      i = 2;
+      j = 1;
+      if (i != 2 || j != 1)
+        abort ();
+      v = 1;
+    }
+  }
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  i = -1;
+  j = -2;
+  v = 0;
+
+#pragma acc data copyin(i, j)
+  {
+#pragma acc parallel /* copyout */ present_or_copyout (v)
+    {
+      if (i != -1 || j != -2)
+        abort ();
+      i = 2;
+      j = 1;
+      if (i != 2 || j != 1)
+        abort ();
+      v = 1;
+    }
+  }
+#if ACC_MEM_SHARED
+  if (v != 1 || i != 2 || j != 1)
+    abort ();
+#else
+  if (v != 1 || i != -1 || j != -2)
+    abort ();
+#endif
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-empty.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-empty.c
new file mode 100644
index 0000000..8e3bb43
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-empty.c
@@ -0,0 +1,6 @@
+int
+main (void)
+{
+#pragma acc parallel
+  ;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pointer-align-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pointer-align-1.c
new file mode 100644
index 0000000..f7d5b9b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pointer-align-1.c
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+
+/* PR middle-end/63247 */
+
+#include <stdlib.h>
+
+int
+main(int argc,
char **argv)
+{
+#define N 4
+  short a[N];
+
+  a[0] = 10;
+  a[1] = 10;
+  a[2] = 10;
+  a[3] = 10;
+
+#pragma acc parallel copy(a[1:N-1])
+  {
+    a[1] = 51;
+    a[2] = 52;
+    a[3] = 53;
+  }
+
+  if (a[0] != 10)
+    abort ();
+  if (a[1] != 51)
+    abort ();
+  if (a[2] != 52)
+    abort ();
+  if (a[3] != 53)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/present-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/present-1.c
new file mode 100644
index 0000000..f331f1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/present-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+int
+main (int argc, char **argv)
+{
+  int N = 8;
+  float *a, *b, *c, *d;
+  int i;
+
+  a = (float *) malloc (N * sizeof (float));
+  b = (float *) malloc (N * sizeof (float));
+  c = (float *) malloc (N * sizeof (float));
+
+  d = (float *) acc_malloc (N * sizeof (float));
+  acc_map_data (c, d, N * sizeof (float));
+
+#pragma acc data present (a[0:N]) present (c[0:N]) present (b[0:N])
+  {
+#pragma acc parallel
+    {
+      int ii;
+
+      for (ii = 0; ii < N; ii++)
+        {
+          c[ii] = a[ii];
+          b[ii] = c[ii];
+        }
+    }
+  }
+
+  d = (float *) acc_deviceptr (c);
+  acc_unmap_data (c);
+  acc_free (d);
+
+  free (a);
+  free (b);
+  free (c);
+
+  return 0;
+}
+/* { dg-shouldfail "libgomp: present clause: !acc_is_present" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/present-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/present-2.c
new file mode 100644
index 0000000..41efa70
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/present-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <openacc.h>
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  int N = 8;
+  float *a, *b;
+  int i;
+
+  a = (float *) malloc (N *
sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 4.0; + b[i] = 0.0; + } + +#pragma acc data copyin(a[0:N]) copyout(b[0:N]) + { + +#pragma acc parallel present(a[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + { + b[ii] = a[ii]; + } + } + + } + + for (i = 0; i < N; i++) + { + if (a[i] != 4.0) + abort (); + + if (b[i] != 4.0) + abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c new file mode 100644 index 0000000..acf9540 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c @@ -0,0 +1,174 @@ +/* { dg-do run } */ + +/* Integer reductions. */ + +#include <stdlib.h> +#include <stdbool.h> + +#define vl 32 + +int +main(void) +{ + const int n = 1000; + int i; + int vresult, result, array[n]; + bool lvresult, lresult; + + for (i = 0; i < n; i++) + array[i] = i; + + result = 0; + vresult = 0; + + /* '+' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (+:result) + for (i = 0; i < n; i++) + result += array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult += array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* '*' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (*:result) + for (i = 0; i < n; i++) + result *= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult *= array[i]; + + if (result != vresult) + abort (); + +// result = 0; +// vresult = 0; +// +// /* 'max' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result > array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult > array[i] ? 
vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); +// +// result = 0; +// vresult = 0; +// +// /* 'min' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result < array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult < array[i] ? vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); + + result = 0; + vresult = 0; + + /* '&' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (&:result) + for (i = 0; i < n; i++) + result &= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult &= array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* '|' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (|:result) + for (i = 0; i < n; i++) + result |= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult |= array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* '^' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (^:result) + for (i = 0; i < n; i++) + result ^= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult ^= array[i]; + + if (result != vresult) + abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '&&' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (&&:lresult) + for (i = 0; i < n; i++) + lresult = lresult && (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult && (result > array[i]); + + if (lresult != lvresult) + abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '||' reductions. 
*/ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (||:lresult) + for (i = 0; i < n; i++) + lresult = lresult || (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult || (result > array[i]); + + if (lresult != lvresult) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c new file mode 100644 index 0000000..c2ec110 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c @@ -0,0 +1,126 @@ +/* { dg-do run } */ + +/* float reductions. */ + +#include <stdlib.h> +#include <stdbool.h> +#include <math.h> + +#define vl 32 + +int +main(void) +{ + const int n = 1000; + int i; + float vresult, result, array[n]; + bool lvresult, lresult; + + for (i = 0; i < n; i++) + array[i] = i; + + result = 0; + vresult = 0; + + /* '+' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (+:result) + for (i = 0; i < n; i++) + result += array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult += array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* '*' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (*:result) + for (i = 0; i < n; i++) + result *= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult *= array[i]; + + if (fabs(result - vresult) > .0001) + abort (); +// result = 0; +// vresult = 0; +// +// /* 'max' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result > array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult > array[i] ? 
vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); +// +// result = 0; +// vresult = 0; +// +// /* 'min' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result < array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult < array[i] ? vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '&&' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (&&:lresult) + for (i = 0; i < n; i++) + lresult = lresult && (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult && (result > array[i]); + + if (lresult != lvresult) + abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '||' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (||:lresult) + for (i = 0; i < n; i++) + lresult = lresult || (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult || (result > array[i]); + + if (lresult != lvresult) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-3.c new file mode 100644 index 0000000..58b49ff --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-3.c @@ -0,0 +1,126 @@ +/* { dg-do run } */ + +/* double reductions. */ + +#include <stdlib.h> +#include <stdbool.h> +#include <math.h> + +#define vl 32 + +int +main(void) +{ + const int n = 1000; + int i; + double vresult, result, array[n]; + bool lvresult, lresult; + + for (i = 0; i < n; i++) + array[i] = i; + + result = 0; + vresult = 0; + + /* '+' reductions. 
*/ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (+:result) + for (i = 0; i < n; i++) + result += array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult += array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* '*' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (*:result) + for (i = 0; i < n; i++) + result *= array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult *= array[i]; + + if (fabs(result - vresult) > .0001) + abort (); +// result = 0; +// vresult = 0; +// +// /* 'max' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result > array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult > array[i] ? vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); +// +// result = 0; +// vresult = 0; +// +// /* 'min' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result < array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult < array[i] ? vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '&&' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (&&:lresult) + for (i = 0; i < n; i++) + lresult = lresult && (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult && (result > array[i]); + + if (lresult != lvresult) + abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '||' reductions. 
*/ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (||:lresult) + for (i = 0; i < n; i++) + lresult = lresult || (result > array[i]); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult || (result > array[i]); + + if (lresult != lvresult) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c new file mode 100644 index 0000000..c8a9a6c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c @@ -0,0 +1,129 @@ +/* { dg-do run } */ + +/* complex reductions. */ + +#include <stdlib.h> +#include <stdbool.h> +#include <math.h> +#include <complex.h> + +#define vl 32 + +int +main(void) +{ + const int n = 1000; + int i; + double complex vresult, result, array[n]; + bool lvresult, lresult; + + for (i = 0; i < n; i++) + array[i] = i; + + result = 0; + vresult = 0; + + /* '+' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (+:result) + for (i = 0; i < n; i++) + result += array[i]; + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + vresult += array[i]; + + if (result != vresult) + abort (); + + result = 0; + vresult = 0; + + /* Needs support for complex multiplication. */ + +// /* '*' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (*:result) +// for (i = 0; i < n; i++) +// result *= array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult *= array[i]; +// +// if (fabs(result - vresult) > .0001) +// abort (); +// result = 0; +// vresult = 0; + +// /* 'max' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result > array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult > array[i] ? 
vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); +// +// result = 0; +// vresult = 0; +// +// /* 'min' reductions. */ +// #pragma acc parallel vector_length (vl) +// #pragma acc loop reduction (+:result) +// for (i = 0; i < n; i++) +// result = result < array[i] ? result : array[i]; +// +// /* Verify the reduction. */ +// for (i = 0; i < n; i++) +// vresult = vresult < array[i] ? vresult : array[i]; +// +// printf("%d != %d\n", result, vresult); +// if (result != vresult) +// abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '&&' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (&&:lresult) + for (i = 0; i < n; i++) + lresult = lresult && (creal(result) > creal(array[i])); + + /* Verify the reduction. */ + for (i = 0; i < n; i++) + lvresult = lresult && (creal(result) > creal(array[i])); + + if (lresult != lvresult) + abort (); + + result = 5; + vresult = 5; + + lresult = false; + lvresult = false; + + /* '||' reductions. */ +#pragma acc parallel vector_length (vl) +#pragma acc loop reduction (||:lresult) + for (i = 0; i < n; i++) + lresult = lresult || (creal(result) > creal(array[i])); + + /* Verify the reduction. 
*/ + for (i = 0; i < n; i++) + lvresult = lresult || (creal(result) > creal(array[i])); + + if (lresult != lvresult) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c new file mode 100644 index 0000000..757b8be --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c @@ -0,0 +1,32 @@ +#include <stdio.h> +#include <stdlib.h> + +int +main (void) +{ + int s1 = 2, s2 = 5, v1 = 2, v2 = 5; + int n = 100; + int i; + +#pragma acc parallel vector_length (1000) +#pragma acc loop reduction (+:s1, s2) + for (i = 0; i < n; i++) + { + s1 = s1 + 3; + s2 = s2 + 2; + } + + for (i = 0; i < n; i++) + { + v1 = v1 + 3; + v2 = v2 + 2; + } + + if (s1 != v1) + abort (); + + if (s2 != v2) + abort (); + + return 0; +}
\ No newline at end of file
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-initial-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-initial-1.c
new file mode 100644
index 0000000..81cf865
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-initial-1.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+int
+main(void)
+{
+#define I 5
+#define N 11
+#define A 8
+
+  int a = A;
+  int s = I;
+
+#pragma acc parallel vector_length(N)
+  {
+    int i;
+#pragma acc loop reduction(+:s)
+    for (i = 0; i < N; ++i)
+      s += a;
+  }
+
+  if (s != I + N * A)
+    __builtin_abort();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.h b/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.h
new file mode 100644
index 0000000..9db236c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.h
@@ -0,0 +1,46 @@
+
+#if ACC_DEVICE_TYPE_nvidia
+
+#pragma acc routine nohost
+static int clock (void)
+{
+  int thetime;
+
+  asm __volatile__ ("mov.u32 %0, %%clock;" : "=r"(thetime));
+
+  return thetime;
+}
+
+#endif
+
+void
+delay (unsigned long *d_o, unsigned long delay)
+{
+  int start, ticks;
+
+  start = clock ();
+
+  ticks = 0;
+
+  while (ticks < delay)
+    ticks = clock () - start;
+
+  return;
+}
+
+void
+delay2 (unsigned long *d_o, unsigned long delay, unsigned long tid)
+{
+  int start, ticks;
+
+  start = clock ();
+
+  ticks = 0;
+
+  while (ticks < delay)
+    ticks = clock () - start;
+
+  d_o[0] = tid;
+
+  return;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.ptx b/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.ptx
new file mode 100644
index 0000000..6f748fc
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/subr.ptx
@@ -0,0 +1,148 @@
+// BEGIN PREAMBLE
+	.version 3.1
+	.target sm_30
+	.address_size 64
+// END PREAMBLE
+
+// BEGIN FUNCTION DEF: clock
+.func (.param.u32 %out_retval)clock
+{
+.reg.u32 %retval;
+	.reg.u64 %hr10;
+	.reg.u32 %r22;
+	.reg.u32 %r23;
+	.reg.u32 %r24;
+	.local.align 8
.b8 %frame[8]; + // #APP +// 7 "subr.c" 1 + mov.u32 %r24, %clock; +// 0 "" 2 + // #NO_APP + st.local.u32 [%frame], %r24; + ld.local.u32 %r22, [%frame]; + mov.u32 %r23, %r22; + mov.u32 %retval, %r23; + st.param.u32 [%out_retval], %retval; + ret; + } +// END FUNCTION DEF +// BEGIN GLOBAL FUNCTION DEF: delay +.visible .entry delay(.param.u64 %in_ar1, .param.u64 %in_ar2) +{ + .reg.u64 %ar1; + .reg.u64 %ar2; + .reg.u64 %hr10; + .reg.u64 %r22; + .reg.u32 %r23; + .reg.u64 %r24; + .reg.u64 %r25; + .reg.u32 %r26; + .reg.u32 %r27; + .reg.u32 %r28; + .reg.u32 %r29; + .reg.u32 %r30; + .reg.u64 %r31; + .reg.pred %r32; + .local.align 8 .b8 %frame[24]; + ld.param.u64 %ar1, [%in_ar1]; + ld.param.u64 %ar2, [%in_ar2]; + mov.u64 %r24, %ar1; + st.u64 [%frame+8], %r24; + mov.u64 %r25, %ar2; + st.local.u64 [%frame+16], %r25; + { + .param.u32 %retval_in; + { + call (%retval_in), clock; + } + ld.param.u32 %r26, [%retval_in]; +} + st.local.u32 [%frame+4], %r26; + mov.u32 %r27, 0; + st.local.u32 [%frame], %r27; + bra $L4; +$L5: + { + .param.u32 %retval_in; + { + call (%retval_in), clock; + } + ld.param.u32 %r28, [%retval_in]; +} + mov.u32 %r23, %r28; + ld.local.u32 %r30, [%frame+4]; + sub.u32 %r29, %r23, %r30; + st.local.u32 [%frame], %r29; +$L4: + ld.local.s32 %r22, [%frame]; + ld.local.u64 %r31, [%frame+16]; + setp.lo.u64 %r32,%r22,%r31; + @%r32 bra $L5; + ret; + } +// END FUNCTION DEF +// BEGIN GLOBAL FUNCTION DEF: delay2 +.visible .entry delay2(.param.u64 %in_ar1, .param.u64 %in_ar2, .param.u64 %in_ar3) +{ + .reg.u64 %ar1; + .reg.u64 %ar2; + .reg.u64 %ar3; + .reg.u64 %hr10; + .reg.u64 %r22; + .reg.u32 %r23; + .reg.u64 %r24; + .reg.u64 %r25; + .reg.u64 %r26; + .reg.u32 %r27; + .reg.u32 %r28; + .reg.u32 %r29; + .reg.u32 %r30; + .reg.u32 %r31; + .reg.u64 %r32; + .reg.pred %r33; + .reg.u64 %r34; + .reg.u64 %r35; + .local.align 8 .b8 %frame[32]; + ld.param.u64 %ar1, [%in_ar1]; + ld.param.u64 %ar2, [%in_ar2]; + ld.param.u64 %ar3, [%in_ar3]; + mov.u64 %r24, %ar1; + st.local.u64 [%frame+8], 
%r24; + mov.u64 %r25, %ar2; + st.local.u64 [%frame+16], %r25; + mov.u64 %r26, %ar3; + st.local.u64 [%frame+24], %r26; + { + .param.u32 %retval_in; + { + call (%retval_in), clock; + } + ld.param.u32 %r27, [%retval_in]; +} + st.local.u32 [%frame+4], %r27; + mov.u32 %r28, 0; + st.local.u32 [%frame], %r28; + bra $L8; +$L9: + { + .param.u32 %retval_in; + { + call (%retval_in), clock; + } + ld.param.u32 %r29, [%retval_in]; +} + mov.u32 %r23, %r29; + ld.local.u32 %r31, [%frame+4]; + sub.u32 %r30, %r23, %r31; + st.local.u32 [%frame], %r30; +$L8: + ld.local.s32 %r22, [%frame]; + ld.local.u64 %r32, [%frame+16]; + setp.lo.u64 %r33,%r22,%r32; + @%r33 bra $L9; + ld.local.u64 %r34, [%frame+8]; + ld.local.u64 %r35, [%frame+24]; + st.u64 [%r34], %r35; + ret; + } +// END FUNCTION DEF diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/timer.h b/libgomp/testsuite/libgomp.oacc-c-c++-common/timer.h new file mode 100644 index 0000000..53749da --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/timer.h @@ -0,0 +1,103 @@ + +#include <stdio.h> +#include <cuda.h> + +static int _Tnum_timers; +static CUevent *_Tstart_events, *_Tstop_events; +static CUstream _Tstream; + +void +init_timers (int ntimers) +{ + int i; + CUresult r; + + _Tnum_timers = ntimers; + + _Tstart_events = (CUevent *) malloc (_Tnum_timers * sizeof (CUevent)); + _Tstop_events = (CUevent *) malloc (_Tnum_timers * sizeof (CUevent)); + + r = cuStreamCreate (&_Tstream, CU_STREAM_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuStreamCreate failed: %d\n", r); + abort (); + } + + for (i = 0; i < _Tnum_timers; i++) + { + r = cuEventCreate (&_Tstart_events[i], CU_EVENT_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventCreate failed: %d\n", r); + abort (); + } + + r = cuEventCreate (&_Tstop_events[i], CU_EVENT_DEFAULT); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventCreate failed: %d\n", r); + abort (); + } + } +} + +void +fini_timers (void) +{ + int i; + + for (i = 0; i < 
_Tnum_timers; i++) + { + cuEventDestroy (_Tstart_events[i]); + cuEventDestroy (_Tstop_events[i]); + } + + cuStreamDestroy (_Tstream); + + free (_Tstart_events); + free (_Tstop_events); +} + +void +start_timer (int timer) +{ + CUresult r; + + r = cuEventRecord (_Tstart_events[timer], _Tstream); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventRecord failed: %d\n", r); + abort (); + } +} + +float +stop_timer (int timer) +{ + CUresult r; + float etime; + + r = cuEventRecord (_Tstop_events[timer], _Tstream); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventRecord failed: %d\n", r); + abort (); + } + + r = cuEventSynchronize (_Tstop_events[timer]); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventSynchronize failed: %d\n", r); + abort (); + } + + r = cuEventElapsedTime (&etime, _Tstart_events[timer], _Tstop_events[timer]); + if (r != CUDA_SUCCESS) + { + fprintf (stderr, "cuEventElapsedTime failed: %d\n", r); + abort (); + } + + return etime; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1-2.c new file mode 100644 index 0000000..c7e7257 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1-2.c @@ -0,0 +1,282 @@ +/* Copy of update-1.c with self exchanged with host for #pragma acc update. 
*/ + +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <openacc.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> + +int +main (int argc, char **argv) +{ + int N = 8; + float *a, *b, *c; + float *d_a, *d_b, *d_c; + int i; + + a = (float *) malloc (N * sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + c = (float *) malloc (N * sizeof (float)); + + d_a = (float *) acc_malloc (N * sizeof (float)); + d_b = (float *) acc_malloc (N * sizeof (float)); + d_c = (float *) acc_malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + } + + acc_map_data (a, d_a, N * sizeof (float)); + acc_map_data (b, d_b, N * sizeof (float)); + acc_map_data (c, d_c, N * sizeof (float)); + +#pragma acc update device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host 
(a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 6.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 7.0; + b[i] = 2.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 7.0) + abort (); + + if (b[i] != 7.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc update device (a[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 9.0) + abort (); + + if (b[i] != 9.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + } + +#pragma acc update device (a[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + } + +#pragma acc update device (a[0:N >> 1]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } 
+ +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < (N >> 1); i++) + { + if (a[i] != 6.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + for (i = (N >> 1); i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1.c new file mode 100644 index 0000000..dff139f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-1.c @@ -0,0 +1,280 @@ +/* { dg-do run } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include <openacc.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> + +int +main (int argc, char **argv) +{ + int N = 8; + float *a, *b, *c; + float *d_a, *d_b, *d_c; + int i; + + a = (float *) malloc (N * sizeof (float)); + b = (float *) malloc (N * sizeof (float)); + c = (float *) malloc (N * sizeof (float)); + + d_a = (float *) acc_malloc (N * sizeof (float)); + d_b = (float *) acc_malloc (N * sizeof (float)); + d_c = (float *) acc_malloc (N * sizeof (float)); + + for (i = 0; i < N; i++) + { + a[i] = 3.0; + b[i] = 0.0; + } + + acc_map_data (a, d_a, N * sizeof (float)); + acc_map_data (b, d_b, N * sizeof (float)); + acc_map_data (c, d_c, N * sizeof (float)); + +#pragma acc update device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 3.0) + abort (); + + if (b[i] != 3.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc update 
device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + b[i] = 1.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update self (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + b[i] = 0.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 6.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 7.0; + b[i] = 2.0; + } + +#pragma acc update device (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 7.0) + abort (); + + if (b[i] != 7.0) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 9.0; + } + +#pragma acc update device (a[0:N]) + +#pragma acc 
parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < N; i++) + { + if (a[i] != 9.0) + abort (); + + if (b[i] != 9.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + for (i = 0; i < N; i++) + { + a[i] = 5.0; + } + +#pragma acc update device (a[0:N]) + + for (i = 0; i < N; i++) + { + a[i] = 6.0; + } + +#pragma acc update device (a[0:N >> 1]) + +#pragma acc parallel present (a[0:N], b[0:N]) + { + int ii; + + for (ii = 0; ii < N; ii++) + b[ii] = a[ii]; + } + +#pragma acc update host (a[0:N], b[0:N]) + + for (i = 0; i < (N >> 1); i++) + { + if (a[i] != 6.0) + abort (); + + if (b[i] != 6.0) + abort (); + } + + for (i = (N >> 1); i < N; i++) + { + if (a[i] != 5.0) + abort (); + + if (b[i] != 5.0) + abort (); + } + + if (!acc_is_present (&a[0], (N * sizeof (float)))) + abort (); + + if (!acc_is_present (&b[0], (N * sizeof (float)))) + abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c/c.exp b/libgomp/testsuite/libgomp.oacc-c/c.exp new file mode 100644 index 0000000..c0c70bb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c/c.exp @@ -0,0 +1,71 @@ +# This whole file adapted from libgomp.c/c.exp. + +if [info exists lang_library_path] then { + unset lang_library_path + unset lang_link_flags +} +if [info exists lang_test_file] then { + unset lang_test_file +} +if [info exists lang_include_flags] then { + unset lang_include_flags +} + +load_lib libgomp-dg.exp +load_gcc_lib gcc-dg.exp + +# If a testcase doesn't have special options, use these. +if ![info exists DEFAULT_CFLAGS] then { + set DEFAULT_CFLAGS "-O2" +} + +# Initialize dg. +dg-init + +# Turn on OpenACC. +lappend ALWAYS_CFLAGS "additional_flags=-fopenacc" + +# Gather a list of all tests. 
+set tests [lsort [concat \ + [find $srcdir/$subdir *.c] \ + [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *.c]]] + +set ld_library_path $always_ld_library_path +append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST] +set_ld_library_path_env_vars + +# Test OpenACC with available accelerators. +set SAVE_ALWAYS_CFLAGS "$ALWAYS_CFLAGS" +foreach offload_target_openacc $offload_targets_s_openacc { + set ALWAYS_CFLAGS "$SAVE_ALWAYS_CFLAGS" + set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1" + + switch $offload_target_openacc { + host { + set acc_mem_shared 1 + } + host_nonshm { + set acc_mem_shared 0 + } + nvidia { + # Copy ptx file (TEMPORARY) + remote_download host $srcdir/libgomp.oacc-c-c++-common/subr.ptx + + # Where timer.h lives + lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/libgomp.oacc-c-c++-common" + + set acc_mem_shared 0 + } + default { + set acc_mem_shared 0 + } + } + set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared" + + setenv ACC_DEVICE_TYPE $offload_target_openacc + + dg-runtest $tests "$tagopt" $DEFAULT_CFLAGS +} + +# All done. +dg-finish diff --git a/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90 new file mode 100644 index 0000000..52b030b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90 @@ -0,0 +1,10 @@ +! { dg-shouldfail "" { *-*-* } { "*" } { "" } } + +program main + implicit none + + !$acc parallel + call abort + !$acc end parallel + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/abort-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/abort-2.f90 new file mode 100644 index 0000000..2ba2bcb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/abort-2.f90 @@ -0,0 +1,13 @@ +program main + implicit none + + integer :: argc + argc = command_argument_count () + + !$acc parallel copyin(argc) + if (argc .ne. 
0) then + call abort + end if + !$acc end parallel + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 new file mode 100644 index 0000000..4488818 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 @@ -0,0 +1,52 @@ +! { dg-additional-options "-cpp" } +! TODO: Have to disable the acc_on_device builtin for we want to test the +! libgomp library function? The command line option +! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not for +! Fortran. + +use openacc +implicit none + +! Host. + +if (.not. acc_on_device (acc_device_none)) call abort +if (.not. acc_on_device (acc_device_host)) call abort +if (acc_on_device (acc_device_host_nonshm)) call abort +if (acc_on_device (acc_device_not_host)) call abort +if (acc_on_device (acc_device_nvidia)) call abort + + +! Host via offloading fallback mode. + +!$acc parallel if(.false.) +if (.not. acc_on_device (acc_device_none)) call abort +if (.not. acc_on_device (acc_device_host)) call abort +if (acc_on_device (acc_device_host_nonshm)) call abort +if (acc_on_device (acc_device_not_host)) call abort +if (acc_on_device (acc_device_nvidia)) call abort +!$acc end parallel + + +#if !ACC_DEVICE_TYPE_host + +! Offloaded. + +!$acc parallel +if (acc_on_device (acc_device_none)) call abort +if (acc_on_device (acc_device_host)) call abort +#if ACC_DEVICE_TYPE_host_nonshm +if (.not. acc_on_device (acc_device_host_nonshm)) call abort +#else +if (acc_on_device (acc_device_host_nonshm)) call abort +#endif +if (.not. acc_on_device (acc_device_not_host)) call abort +#if ACC_DEVICE_TYPE_nvidia +if (.not. 
acc_on_device (acc_device_nvidia)) call abort +#else +if (acc_on_device (acc_device_nvidia)) call abort +#endif +!$acc end parallel + +#endif + +end diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f new file mode 100644 index 0000000..0047a19 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f @@ -0,0 +1,52 @@ +! { dg-additional-options "-cpp" } +! TODO: Have to disable the acc_on_device builtin for we want to test +! the libgomp library function? The command line option +! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not +! for Fortran. + + USE OPENACC + IMPLICIT NONE + +!Host. + + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT + + +!Host via offloading fallback mode. + +!$ACC PARALLEL IF(.FALSE.) + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +!$ACC END PARALLEL + + +#if !ACC_DEVICE_TYPE_host + +! Offloaded. + +!$ACC PARALLEL + IF (ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT +#if ACC_DEVICE_TYPE_host_nonshm + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT +#else + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT +#endif + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT +#if ACC_DEVICE_TYPE_nvidia + IF (.NOT. 
ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +#else + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +#endif +!$ACC END PARALLEL + +#endif + + END diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f new file mode 100644 index 0000000..49d7a72 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f @@ -0,0 +1,52 @@ +! { dg-additional-options "-cpp" } +! TODO: Have to disable the acc_on_device builtin for we want to test +! the libgomp library function? The command line option +! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not +! for Fortran. + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + +!Host. + + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT + + +!Host via offloading fallback mode. + +!$ACC PARALLEL IF(.FALSE.) + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +!$ACC END PARALLEL + + +#if !ACC_DEVICE_TYPE_host + +! Offloaded. + +!$ACC PARALLEL + IF (ACC_ON_DEVICE (ACC_DEVICE_NONE)) CALL ABORT + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST)) CALL ABORT +#if ACC_DEVICE_TYPE_host_nonshm + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT +#else + IF (ACC_ON_DEVICE (ACC_DEVICE_HOST_NONSHM)) CALL ABORT +#endif + IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) CALL ABORT +#if ACC_DEVICE_TYPE_nvidia + IF (.NOT. 
ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +#else + IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) CALL ABORT +#endif +!$ACC END PARALLEL + +#endif + + END diff --git a/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-1.f90 new file mode 100644 index 0000000..b6e637b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-1.f90 @@ -0,0 +1,135 @@ +! { dg-do run } + +program asyncwait + integer, parameter :: N = 64 + real, allocatable :: a(:), b(:), c(:), d(:), e(:) + integer i + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + allocate (d(N)) + allocate (e(N)) + + a(:) = 3.0 + b(:) = 0.0 + + !$acc data copy (a(1:N)) copy (b(1:N)) + + !$acc parallel async + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc wait + !$acc end data + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 3.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + + !$acc data copy (a(1:N)) copy (b(1:N)) + + !$acc parallel async (1) + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc end data + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 2.0) call abort + end do + + a(:) = 3.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + + !$acc data copy (a(1:N)) copy (b(1:N)) copy (c(1:N)) copy (d(1:N)) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + !$acc loop + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc end data + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 9.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 
1.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + e(:) = 0.0 + + !$acc data copy (a(1:N), b(1:N), c(1:N), d(1:N), e(1:N)) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + !$acc loop + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + !$acc loop + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc parallel wait (1) async (1) + !$acc loop + do i = 1, N + e(i) = a(i) + b(i) + c(i) + d(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc end data + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 4.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 1.0) call abort + if (e(i) .ne. 11.0) call abort + end do +end program asyncwait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-2.f90 new file mode 100644 index 0000000..bade52b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-2.f90 @@ -0,0 +1,40 @@ +! { dg-do run } + +program parallel_wait + integer, parameter :: N = 64 + real, allocatable :: a(:), b(:), c(:) + integer i + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + + !$acc parallel async (0) + !$acc loop + do i = 1, N + a(i) = 1 + end do + !$acc end parallel + + !$acc parallel async (1) + !$acc loop + do i = 1, N + b(i) = 1 + end do + !$acc end parallel + + !$acc parallel wait (0, 1) + !$acc loop + do i = 1, N + c(i) = a(i) + b(i) + end do + !$acc end parallel + + do i = 1, N + if (c(i) .ne. 
2.0) call abort + end do + + deallocate (a) + deallocate (b) + deallocate (c) +end program parallel_wait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-3.f90 new file mode 100644 index 0000000..d48dc11 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/asyncwait-3.f90 @@ -0,0 +1,42 @@ +! { dg-do run } + +program parallel_wait + integer, parameter :: N = 64 + real, allocatable :: a(:), b(:), c(:) + integer i + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + + !$acc parallel async (0) + !$acc loop + do i = 1, N + a(i) = 1 + end do + !$acc end parallel + + !$acc parallel async (1) + !$acc loop + do i = 1, N + b(i) = 1 + end do + !$acc end parallel + + !$acc wait (0, 1) + + !$acc parallel + !$acc loop + do i = 1, N + c(i) = a(i) + b(i) + end do + !$acc end parallel + + do i = 1, N + if (c(i) .ne. 2.0) call abort + end do + + deallocate (a) + deallocate (b) + deallocate (c) +end program parallel_wait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-1.f90 new file mode 100644 index 0000000..4c07bc2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-1.f90 @@ -0,0 +1,27 @@ +! { dg-do run } + +program collapse1 + integer :: i, j, k, a(1:3, 4:6, 5:7) + logical :: l + l = .false. + a(:, :, :) = 0 + !$acc parallel + !$acc loop collapse(4 - 1) + do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + a(i, j, k) = i + j + k + end do + end do + end do + !$acc loop collapse(2) reduction(.or.:l) + do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + if (a(i, j, k) .ne. (i + j + k)) l = .true. 
+ end do + end do + end do + !$acc end parallel + if (l) call abort +end program collapse1 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-2.f90 new file mode 100644 index 0000000..ca3b638 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-2.f90 @@ -0,0 +1,25 @@ +! { dg-do run } + +program collapse2 + integer :: i, j, k, a(1:3, 4:6, 5:7) + logical :: l + l = .false. + a(:, :, :) = 0 + !$acc parallel + !$acc loop collapse(4 - 1) + do 164 i = 1, 3 + do 164 j = 4, 6 + do 164 k = 5, 7 + a(i, j, k) = i + j + k +164 end do + !$acc loop collapse(2) reduction(.or.:l) +firstdo: do i = 1, 3 + do j = 4, 6 + do k = 5, 7 + if (a(i, j, k) .ne. (i + j + k)) l = .true. + end do + end do + end do firstdo + !$acc end parallel + if (l) call abort +end program collapse2 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-3.f90 new file mode 100644 index 0000000..50e6100 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-3.f90 @@ -0,0 +1,28 @@ +! { dg-do run } + +program collapse3 + integer :: a(3,3,3), k, kk, kkk, l, ll, lll + !$acc parallel + !$acc loop collapse(3) + do 115 k=1,3 +dokk: do kk=1,3 + do kkk=1,3 + a(k,kk,kkk) = 1 + enddo + enddo dokk +115 continue + !$acc end parallel + if (any(a(1:3,1:3,1:3).ne.1)) call abort + + !$acc parallel + !$acc loop collapse(3) +dol: do 120 l=1,3 +doll: do ll=1,3 + do lll=1,3 + a(l,ll,lll) = 2 + enddo + enddo doll +120 end do dol + !$acc end parallel + if (any(a(1:3,1:3,1:3).ne.2)) call abort +end program collapse3 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-4.f90 new file mode 100644 index 0000000..41b66db --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-4.f90 @@ -0,0 +1,40 @@ +! { dg-do run } + +! 
collapse3.f90:test1 +program collapse4 + integer :: i, j, k, a(1:7, -3:5, 12:19), b(1:7, -3:5, 12:19) + logical :: l, r + l = .false. + r = .false. + a(:, :, :) = 0 + b(:, :, :) = 0 + !$acc parallel + !$acc loop collapse (3) reduction (.or.:l) + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + end do + end do + end do + !$acc end parallel + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + r = r.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + r = r.or.k.lt.13.or.k.gt.18 + if (.not.l) b(i, j, k) = b(i, j, k) + 1 + end do + end do + end do + if (l .neqv. r) call abort + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + if (a(i, j, k) .ne. b(i, j, k)) call abort + end do + end do + end do +end program collapse4 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-5.f90 new file mode 100644 index 0000000..8c20f04 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-5.f90 @@ -0,0 +1,48 @@ +! { dg-do run } + +! collapse3.f90:test2 +program collapse5 + integer :: i, j, k, a(1:7, -3:5, 12:19), b(1:7, -3:5, 12:19) + integer :: v1, v2, v3, v4, v5, v6 + logical :: l, r + l = .false. + r = .false. + a(:, :, :) = 0 + b(:, :, :) = 0 + v1 = 3 + v2 = 6 + v3 = -2 + v4 = 4 + v5 = 13 + v6 = 18 + !$acc parallel + !$acc loop collapse (3) reduction (.or.:l) + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + m = i * 100 + j * 10 + k + end do + end do + end do + !$acc end parallel + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + r = r.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + r = r.or.k.lt.13.or.k.gt.18 + if (.not.l) b(i, j, k) = b(i, j, k) + 1 + end do + end do + end do + if (l .neqv. r) call abort + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + if (a(i, j, k) .ne. 
b(i, j, k)) call abort + end do + end do + end do +end program collapse5 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-6.f90 new file mode 100644 index 0000000..7404b91 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-6.f90 @@ -0,0 +1,50 @@ +! { dg-do run } + +! collapse3.f90:test3 +program collapse6 + integer :: i, j, k, a(1:7, -3:5, 12:19), b(1:7, -3:5, 12:19) + integer :: v1, v2, v3, v4, v5, v6, v7, v8, v9 + logical :: l, r + l = .false. + r = .false. + a(:, :, :) = 0 + b(:, :, :) = 0 + v1 = 3 + v2 = 6 + v3 = -2 + v4 = 4 + v5 = 13 + v6 = 18 + v7 = 1 + v8 = 1 + v9 = 1 + !$acc parallel + !$acc loop collapse (3) reduction (.or.:l) + do i = v1, v2, v7 + do j = v3, v4, v8 + do k = v5, v6, v9 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + end do + end do + end do + !$acc end parallel + do i = v1, v2, v7 + do j = v3, v4, v8 + do k = v5, v6, v9 + r = r.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + r = r.or.k.lt.13.or.k.gt.18 + if (.not.r) b(i, j, k) = b(i, j, k) + 1 + end do + end do + end do + if (l .neqv. r) call abort + do i = v1, v2, v7 + do j = v3, v4, v8 + do k = v5, v6, v9 + if (a(i, j, k) .ne. b(i, j, k)) call abort + end do + end do + end do +end program collapse6 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-7.f90 new file mode 100644 index 0000000..12efd8c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-7.f90 @@ -0,0 +1,40 @@ +! { dg-do run } + +! collapse3.f90:test4 +program collapse7 + integer :: i, j, k, a(1:7, -3:5, 12:19), b(1:7, -3:5, 12:19) + logical :: l, r + l = .false. + r = .false. 
+ a(:, :, :) = 0 + b(:, :, :) = 0 + !$acc parallel + !$acc loop collapse (3) reduction (.or.:l) + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + end do + end do + end do + !$acc end parallel + do i = 2, 6 + do j = -2, 4 + do k = 13, 18 + r = r.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + r = r.or.k.lt.13.or.k.gt.18 + if (.not.r) b(i, j, k) = b(i, j, k) + 1 + end do + end do + end do + if (l .neqv. r) call abort + do i = 1, 7 + do j = -3, 5 + do k = 12, 19 + if (a(i, j, k) .ne. b(i, j, k)) call abort + end do + end do + end do +end program collapse7 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/collapse-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/collapse-8.f90 new file mode 100644 index 0000000..04fbcfe --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/collapse-8.f90 @@ -0,0 +1,47 @@ +! { dg-do run } + +! collapse3.f90:test5 +program collapse8 + integer :: i, j, k, a(1:7, -3:5, 12:19), b(1:7, -3:5, 12:19) + integer :: v1, v2, v3, v4, v5, v6 + logical :: l, r + l = .false. + r = .false. + a(:, :, :) = 0 + b(:, :, :) = 0 + v1 = 3 + v2 = 6 + v3 = -2 + v4 = 4 + v5 = 13 + v6 = 18 + !$acc parallel + !$acc loop collapse (3) reduction (.or.:l) + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + l = l.or.k.lt.13.or.k.gt.18 + if (.not.l) a(i, j, k) = a(i, j, k) + 1 + end do + end do + end do + !$acc end parallel + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + r = r.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4 + r = r.or.k.lt.13.or.k.gt.18 + if (.not.r) b(i, j, k) = b(i, j, k) + 1 + end do + end do + end do + if (l .neqv. r) call abort + do i = v1, v2 + do j = v3, v4 + do k = v5, v6 + if (a(i, j, k) .ne. 
b(i, j, k)) call abort + end do + end do + end do +end program collapse8 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-1.f90 new file mode 100644 index 0000000..5e94e2d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-1.f90 @@ -0,0 +1,45 @@ +! { dg-do run } + +program test + integer, parameter :: N = 8 + real, allocatable :: a(:), b(:) + + allocate (a(N)) + allocate (b(N)) + + a(:) = 3.0 + b(:) = 0.0 + + !$acc enter data copyin (a(1:N), b(1:N)) + + !$acc parallel + do i = 1, n + b(i) = a (i) + end do + !$acc end parallel + + !$acc exit data copyout (a(1:N), b(1:N)) + + do i = 1, n + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 3.0) call abort + end do + + a(:) = 5.0 + b(:) = 1.0 + + !$acc enter data copyin (a(1:N), b(1:N)) + + !$acc parallel + do i = 1, n + b(i) = a (i) + end do + !$acc end parallel + + !$acc exit data copyout (a(1:N), b(1:N)) + + do i = 1, n + if (a(i) .ne. 5.0) call abort + if (b(i) .ne. 5.0) call abort + end do +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 new file mode 100644 index 0000000..8736c2a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 @@ -0,0 +1,31 @@ +! { dg-do run } + +program test + integer, parameter :: N = 8 + real, allocatable :: a(:,:), b(:,:) + + allocate (a(N,N)) + allocate (b(N,N)) + + a(:,:) = 3.0 + b(:,:) = 0.0 + + !$acc enter data copyin (a(1:N,1:N), b(1:N,1:N)) + + !$acc parallel + do i = 1, n + do j = 1, n + b(j,i) = a (j,i) + end do + end do + !$acc end parallel + + !$acc exit data copyout (a(1:N,1:N), b(1:N,1:N)) + + do i = 1, n + do j = 1, n + if (a(j,i) .ne. 3.0) call abort + if (b(j,i) .ne. 
3.0) call abort + end do + end do +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-3.f90 new file mode 100644 index 0000000..9868cb0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-3.f90 @@ -0,0 +1,131 @@ +! { dg-do run } + +program asyncwait + real, allocatable :: a(:), b(:), c(:), d(:), e(:) + integer i, N + + N = 64 + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + allocate (d(N)) + allocate (e(N)) + + a(:) = 3.0 + b(:) = 0.0 + + !$acc enter data copyin (a(1:N)) copyin (b(1:N)) copyin (N) async + + !$acc parallel async wait + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc wait + !$acc exit data copyout (a(1:N)) copyout (b(1:N)) + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 3.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + + !$acc enter data copyin (a(1:N)) copyin (b(1:N)) async (1) + + !$acc parallel async (1) wait (1) + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc exit data copyout (a(1:N)) copyout (b(1:N)) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 2.0) call abort + end do + + a(:) = 3.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + + !$acc enter data copyin (a(1:N)) create (b(1:N)) create (c(1:N)) create (d(1:N)) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc exit data copyout (a(1:N)) copyout (b(1:N)) copyout (c(1:N)) copyout (d(1:N)) + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 9.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 
1.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + e(:) = 0.0 + + !$acc enter data copyin (a(1:N)) create (b(1:N)) create (c(1:N)) create (d(1:N)) copyin (e(1:N)) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc parallel wait (1) async (1) + do i = 1, N + e(i) = a(i) + b(i) + c(i) + d(i) + end do + !$acc end parallel + + !$acc wait (1) + !$acc exit data copyout (a(1:N)) copyout (b(1:N)) copyout (c(1:N)) copyout (d(1:N)) copyout (e(1:N)) + !$acc exit data delete (N) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 4.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 1.0) call abort + if (e(i) .ne. 11.0) call abort + end do +end program asyncwait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-4-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-4-2.f90 new file mode 100644 index 0000000..16a8598 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-4-2.f90 @@ -0,0 +1,138 @@ +! Copy of data-4.f90 with self exchanged with host for !acc update. + +! { dg-do run } + +program asyncwait + real, allocatable :: a(:), b(:), c(:), d(:), e(:) + integer i, N + + N = 64 + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + allocate (d(N)) + allocate (e(N)) + + a(:) = 3.0 + b(:) = 0.0 + + !$acc enter data copyin (a(1:N)) copyin (b(1:N)) copyin (N) async + + !$acc parallel async wait + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc update self (a(1:N), b(1:N)) async wait + !$acc wait + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 
3.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + + !$acc update device (a(1:N), b(1:N)) async (1) + + !$acc parallel async (1) wait (1) + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc update host (a(1:N), b(1:N)) async (1) wait (1) + !$acc wait (1) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 2.0) call abort + end do + + a(:) = 3.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + + !$acc enter data copyin (c(1:N), d(1:N)) async (1) + !$acc update device (a(1:N), b(1:N)) async (1) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc update self (a(1:N), b(1:N), c(1:N), d(1:N)) async (1) wait (1) + + !$acc wait (1) + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 9.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 1.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + e(:) = 0.0 + + !$acc enter data copyin (e(1:N)) async (1) + !$acc update device (a(1:N), b(1:N), c(1:N), d(1:N)) async (1) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc parallel wait (1) async (1) + do i = 1, N + e(i) = a(i) + b(i) + c(i) + d(i) + end do + !$acc end parallel + + !$acc update self (a(1:N), b(1:N), c(1:N), d(1:N), e(1:N)) async (1) wait (1) + !$acc wait (1) + !$acc exit data delete (N, a(1:N), b(1:N), c(1:N), d(1:N), e(1:N)) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 4.0) call abort + if (c(i) .ne. 
4.0) call abort + if (d(i) .ne. 1.0) call abort + if (e(i) .ne. 11.0) call abort + end do +end program asyncwait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-4.f90 new file mode 100644 index 0000000..f6886b0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-4.f90 @@ -0,0 +1,136 @@ +! { dg-do run } + +program asyncwait + real, allocatable :: a(:), b(:), c(:), d(:), e(:) + integer i, N + + N = 64 + + allocate (a(N)) + allocate (b(N)) + allocate (c(N)) + allocate (d(N)) + allocate (e(N)) + + a(:) = 3.0 + b(:) = 0.0 + + !$acc enter data copyin (a(1:N)) copyin (b(1:N)) copyin (N) async + + !$acc parallel async wait + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc update host (a(1:N), b(1:N)) async wait + !$acc wait + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 3.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + + !$acc update device (a(1:N), b(1:N)) async (1) + + !$acc parallel async (1) wait (1) + !$acc loop + do i = 1, N + b(i) = a(i) + end do + !$acc end parallel + + !$acc update self (a(1:N), b(1:N)) async (1) wait (1) + !$acc wait (1) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 2.0) call abort + end do + + a(:) = 3.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + + !$acc enter data copyin (c(1:N), d(1:N)) async (1) + !$acc update device (a(1:N), b(1:N)) async (1) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc update host (a(1:N), b(1:N), c(1:N), d(1:N)) async (1) wait (1) + + !$acc wait (1) + + do i = 1, N + if (a(i) .ne. 3.0) call abort + if (b(i) .ne. 9.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 
1.0) call abort + end do + + a(:) = 2.0 + b(:) = 0.0 + c(:) = 0.0 + d(:) = 0.0 + e(:) = 0.0 + + !$acc enter data copyin (e(1:N)) async (1) + !$acc update device (a(1:N), b(1:N), c(1:N), d(1:N)) async (1) + + !$acc parallel async (1) + do i = 1, N + b(i) = (a(i) * a(i) * a(i)) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + c(i) = (a(i) * 4) / a(i) + end do + !$acc end parallel + + !$acc parallel async (1) + do i = 1, N + d(i) = ((a(i) * a(i) + a(i)) / a(i)) - a(i) + end do + !$acc end parallel + + !$acc parallel wait (1) async (1) + do i = 1, N + e(i) = a(i) + b(i) + c(i) + d(i) + end do + !$acc end parallel + + !$acc update host (a(1:N), b(1:N), c(1:N), d(1:N), e(1:N)) async (1) wait (1) + !$acc wait (1) + !$acc exit data delete (N, a(1:N), b(1:N), c(1:N), d(1:N), e(1:N)) + + do i = 1, N + if (a(i) .ne. 2.0) call abort + if (b(i) .ne. 4.0) call abort + if (c(i) .ne. 4.0) call abort + if (d(i) .ne. 1.0) call abort + if (e(i) .ne. 11.0) call abort + end do +end program asyncwait diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-1.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-1.f new file mode 100644 index 0000000..ac220ab --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-1.f @@ -0,0 +1,17 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + + CALL ACC_COPYIN (I) + +!$ACC DATA COPY (I) + I = 0 +!$ACC END DATA + + END + +! { dg-shouldfail "" } +! { dg-output "Trying to map into device .* object when .* is already mapped" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-2.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-2.f new file mode 100644 index 0000000..2c5254b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-2.f @@ -0,0 +1,16 @@ +! 
{ dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + + INTEGER I + +!$ACC DATA PRESENT_OR_COPY (I) +!$ACC DATA COPYOUT (I) + I = 0 +!$ACC END DATA +!$ACC END DATA + + END + +! { dg-shouldfail "" } +! { dg-output "Trying to map into device .* object when .* is already mapped" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-3.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-3.f new file mode 100644 index 0000000..c41de28 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-3.f @@ -0,0 +1,15 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + +!$ACC DATA PRESENT_OR_COPY (I) + CALL ACC_COPYIN (I) +!$ACC END DATA + + END + +! { dg-shouldfail "" } +! { dg-output "already mapped to" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-4.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-4.f new file mode 100644 index 0000000..f54bf58 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-4.f @@ -0,0 +1,14 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + + CALL ACC_PRESENT_OR_COPYIN (I) + CALL ACC_COPYIN (I) + + END + +! { dg-shouldfail "" } +! { dg-output "already mapped to" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-5.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-5.f new file mode 100644 index 0000000..9a3e94f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-5.f @@ -0,0 +1,14 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + +!$ACC ENTER DATA CREATE (I) + CALL ACC_COPYIN (I) + + END + +! { dg-shouldfail "" } +! 
{ dg-output "already mapped to" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-6.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-6.f new file mode 100644 index 0000000..eaf5d98 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-6.f @@ -0,0 +1,14 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + + CALL ACC_PRESENT_OR_COPYIN (I) +!$ACC ENTER DATA CREATE (I) + + END + +! { dg-shouldfail "" } +! { dg-output "already mapped to" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-7.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-7.f new file mode 100644 index 0000000..d96bf0b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-7.f @@ -0,0 +1,14 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + INCLUDE "openacc_lib.h" + + INTEGER I + +!$ACC ENTER DATA CREATE (I) + CALL ACC_CREATE (I) + + END + +! { dg-shouldfail "" } +! { dg-output "already mapped to" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-already-8.f b/libgomp/testsuite/libgomp.oacc-fortran/data-already-8.f new file mode 100644 index 0000000..16da048 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-already-8.f @@ -0,0 +1,16 @@ +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } + + IMPLICIT NONE + + INTEGER I + +!$ACC DATA CREATE (I) +!$ACC PARALLEL COPYIN (I) + I = 0 +!$ACC END PARALLEL +!$ACC END DATA + + END + +! { dg-shouldfail "" } +! { dg-output "Trying to map into device .* object when .* is already mapped" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp new file mode 100644 index 0000000..a8f62e8 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp @@ -0,0 +1,98 @@ +# This whole file adapted from libgomp.fortran/fortran.exp. 
+ +load_lib libgomp-dg.exp +load_gcc_lib gcc-dg.exp +load_gcc_lib gfortran-dg.exp + +global shlib_ext +global ALWAYS_CFLAGS + +set shlib_ext [get_shlib_extension] +set lang_library_path "../libgfortran/.libs" +set lang_link_flags "-lgfortran" +if [info exists lang_include_flags] then { + unset lang_include_flags +} +set lang_test_file_found 0 +set quadmath_library_path "../libquadmath/.libs" + + +# Initialize dg. +dg-init + +# Turn on OpenACC. +lappend ALWAYS_CFLAGS "additional_flags=-fopenacc" + +if { $blddir != "" } { + set lang_source_re {^.*\.[fF](|90|95|03|08)$} + set lang_include_flags "-fintrinsic-modules-path=${blddir}" + # Look for a static libgfortran first. + if [file exists "${blddir}/${lang_library_path}/libgfortran.a"] { + set lang_test_file "${lang_library_path}/libgfortran.a" + set lang_test_file_found 1 + # We may have a shared only build, so look for a shared libgfortran. + } elseif [file exists "${blddir}/${lang_library_path}/libgfortran.${shlib_ext}"] { + set lang_test_file "${lang_library_path}/libgfortran.${shlib_ext}" + set lang_test_file_found 1 + } else { + puts "No libgfortran library found, will not execute fortran tests" + } +} elseif [info exists GFORTRAN_UNDER_TEST] { + set lang_test_file_found 1 + # Needs to exist for libgomp.exp. + set lang_test_file "" +} else { + puts "GFORTRAN_UNDER_TEST not defined, will not execute fortran tests" +} + +if { $lang_test_file_found } { + # Gather a list of all tests. + set tests [lsort [find $srcdir/$subdir *.\[fF\]{,90,95,03,08}]] + + if { $blddir != "" } { + if { [file exists "${blddir}/${quadmath_library_path}/libquadmath.a"] + || [file exists "${blddir}/${quadmath_library_path}/libquadmath.${shlib_ext}"] } { + lappend ALWAYS_CFLAGS "ldflags=-L${blddir}/${quadmath_library_path}/" + # Allow for spec subsitution. 
+ lappend ALWAYS_CFLAGS "additional_flags=-B${blddir}/${quadmath_library_path}/" + set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}:${blddir}/${quadmath_library_path}" + } else { + set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}" + } + } else { + set ld_library_path "$always_ld_library_path" + } + append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST] + set_ld_library_path_env_vars + + # Test OpenACC with available accelerators. + foreach offload_target_openacc $offload_targets_s_openacc { + set tagopt "-DACC_DEVICE_TYPE_$offload_target_openacc=1" + + switch $offload_target_openacc { + host { + set acc_mem_shared 1 + } + host_nonshm { + set acc_mem_shared 0 + } + nvidia { + set acc_mem_shared 0 + } + default { + set acc_mem_shared 0 + } + } + set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared" + + setenv ACC_DEVICE_TYPE $offload_target_openacc + + # For Fortran we're doing torture testing, as Fortran has far more tests + # with arrays etc. that testing just -O0 or -O2 is insufficient, that is + # typically not the case for C/C++. + gfortran-dg-runtest $tests "$tagopt" "" + } +} + +# All done. +dg-finish diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-1.f90 new file mode 100644 index 0000000..51dc452 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-1.f90 @@ -0,0 +1,13 @@ +use openacc + +if (acc_get_num_devices (acc_device_host) .ne. 1) call abort +call acc_set_device_type (acc_device_host) +if (acc_get_device_type () .ne. acc_device_host) call abort +call acc_set_device_num (0, acc_device_host) +if (acc_get_device_num (acc_device_host) .ne. 
0) call abort +call acc_shutdown (acc_device_host) + +call acc_init (acc_device_host) +call acc_shutdown (acc_device_host) + +end diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-10.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-10.f90 new file mode 100644 index 0000000..a54d6a7 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-10.f90 @@ -0,0 +1,82 @@ +! { dg-do run } + +program main + implicit none + include "openacc_lib.h" + + integer, target :: a_3d_i(10, 10, 10) + complex a_3d_c(10, 10, 10) + real a_3d_r(10, 10, 10) + + integer i, j, k + complex c + real r + integer, parameter :: i_size = sizeof (i) + integer, parameter :: c_size = sizeof (c) + integer, parameter :: r_size = sizeof (r) + + if (acc_get_num_devices (acc_device_nvidia) .eq. 0) call exit + + call acc_init (acc_device_nvidia) + + call set3d (.FALSE., a_3d_i, a_3d_c, a_3d_r) + + call acc_copyin (a_3d_i) + call acc_copyin (a_3d_c) + call acc_copyin (a_3d_r) + + if (acc_is_present (a_3d_i) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_c) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_r) .neqv. .TRUE.) call abort + + do i = 1, 10 + do j = 1, 10 + do k = 1, 10 + if (acc_is_present (a_3d_i(i, j, k), i_size) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_c(i, j, k), i_size) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_r(i, j, k), i_size) .neqv. .TRUE.) 
call abort + end do + end do + end do + + call acc_shutdown (acc_device_nvidia) + +contains + + subroutine set3d (clear, a_i, a_c, a_r) + logical clear + integer, dimension (:,:,:), intent (inout) :: a_i + complex, dimension (:,:,:), intent (inout) :: a_c + real, dimension (:,:,:), intent (inout) :: a_r + + integer i, j, k + integer lb1, ub1, lb2, ub2, lb3, ub3 + + lb1 = lbound (a_i, 1) + ub1 = ubound (a_i, 1) + + lb2 = lbound (a_i, 2) + ub2 = ubound (a_i, 2) + + lb3 = lbound (a_i, 3) + ub3 = ubound (a_i, 3) + + do i = lb1, ub1 + do j = lb2, ub2 + do k = lb3, ub3 + if (clear) then + a_i(i, j, k) = 0 + a_c(i, j, k) = cmplx (0.0, 0.0) + a_r(i, j, k) = 0.0 + else + a_i(i, j, k) = i + a_c(i, j, k) = cmplx (i, j) + a_r(i, j, k) = i + end if + end do + end do + end do + + end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-2.f b/libgomp/testsuite/libgomp.oacc-fortran/lib-2.f new file mode 100644 index 0000000..a9d70b2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-2.f @@ -0,0 +1,13 @@ + USE OPENACC + + IF (ACC_GET_NUM_DEVICES (ACC_DEVICE_HOST) .NE. 1) CALL ABORT + CALL ACC_SET_DEVICE_TYPE (ACC_DEVICE_HOST) + IF (ACC_GET_DEVICE_TYPE () .NE. ACC_DEVICE_HOST) CALL ABORT + CALL ACC_SET_DEVICE_NUM (0, ACC_DEVICE_HOST) + IF (ACC_GET_DEVICE_NUM (ACC_DEVICE_HOST) .NE. 0) CALL ABORT + CALL ACC_SHUTDOWN (ACC_DEVICE_HOST) + + CALL ACC_INIT (ACC_DEVICE_HOST) + CALL ACC_SHUTDOWN (ACC_DEVICE_HOST) + + END diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-3.f b/libgomp/testsuite/libgomp.oacc-fortran/lib-3.f new file mode 100644 index 0000000..56d2cd2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-3.f @@ -0,0 +1,13 @@ + INCLUDE "openacc_lib.h" + + IF (ACC_GET_NUM_DEVICES (ACC_DEVICE_HOST) .NE. 1) CALL ABORT + CALL ACC_SET_DEVICE_TYPE (ACC_DEVICE_HOST) + IF (ACC_GET_DEVICE_TYPE () .NE. ACC_DEVICE_HOST) CALL ABORT + CALL ACC_SET_DEVICE_NUM (0, ACC_DEVICE_HOST) + IF (ACC_GET_DEVICE_NUM (ACC_DEVICE_HOST) .NE. 
0) CALL ABORT + CALL ACC_SHUTDOWN (ACC_DEVICE_HOST) + + CALL ACC_INIT (ACC_DEVICE_HOST) + CALL ACC_SHUTDOWN (ACC_DEVICE_HOST) + + END diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-4.f90 new file mode 100644 index 0000000..3a2b661 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-4.f90 @@ -0,0 +1,35 @@ +! { dg-do run } + +program main + use openacc + implicit none + + integer n + + if (acc_get_num_devices (acc_device_host) .ne. 1) call abort + + if (acc_get_num_devices (acc_device_none) .ne. 0) call abort + + call acc_init (acc_device_host) + + if (acc_get_device_type () .ne. acc_device_host) call abort + + call acc_set_device_type (acc_device_host) + + if (acc_get_device_type () .ne. acc_device_host) call abort + + n = 0 + + call acc_set_device_num (n, acc_device_host) + + if (acc_get_device_num (acc_device_host) .ne. 0) call abort + + if (.NOT. acc_async_test (n) ) call abort + + call acc_wait (n) + + call acc_wait_all () + + call acc_shutdown (acc_device_host) + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-5.f90 new file mode 100644 index 0000000..e68eb89 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-5.f90 @@ -0,0 +1,31 @@ +! { dg-do run } + +program main + use openacc + implicit none + + integer n + + if (acc_get_num_devices (acc_device_nvidia) .eq. 0) call exit + + call acc_init (acc_device_nvidia) + + n = 0 + + call acc_set_device_num (n, acc_device_nvidia) + + if (acc_get_device_num (acc_device_nvidia) .ne. 0) call abort + + if (acc_get_num_devices (acc_device_nvidia) .gt. 1) then + + n = 1 + + call acc_set_device_num (n, acc_device_nvidia) + + if (acc_get_device_num (acc_device_nvidia) .ne. 
1) call abort + + end if + + call acc_shutdown (acc_device_nvidia) + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-6.f90 new file mode 100644 index 0000000..401ad66 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-6.f90 @@ -0,0 +1,35 @@ +! { dg-do run } + +program main + implicit none + include "openacc_lib.h" + + integer n + + if (acc_get_num_devices (acc_device_host) .ne. 1) call abort + + if (acc_get_num_devices (acc_device_none) .ne. 0) call abort + + call acc_init (acc_device_host) + + if (acc_get_device_type () .ne. acc_device_host) call abort + + call acc_set_device_type (acc_device_host) + + if (acc_get_device_type () .ne. acc_device_host) call abort + + n = 0 + + call acc_set_device_num (n, acc_device_host) + + if (acc_get_device_num (acc_device_host) .ne. 0) call abort + + if (.NOT. acc_async_test (n) ) call abort + + call acc_wait (n) + + call acc_wait_all () + + call acc_shutdown (acc_device_host) + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-7.f90 new file mode 100644 index 0000000..422df53 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-7.f90 @@ -0,0 +1,31 @@ +! { dg-do run } + +program main + implicit none + include "openacc_lib.h" + + integer n + + if (acc_get_num_devices (acc_device_nvidia) .eq. 0) call exit + + call acc_init (acc_device_nvidia) + + n = 0 + + call acc_set_device_num (n, acc_device_nvidia) + + if (acc_get_device_num (acc_device_nvidia) .ne. 0) call abort + + if (acc_get_num_devices (acc_device_nvidia) .gt. 1) then + + n = 1 + + call acc_set_device_num (n, acc_device_nvidia) + + if (acc_get_device_num (acc_device_nvidia) .ne. 
1) call abort + + end if + + call acc_shutdown (acc_device_nvidia) + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-8.f90 new file mode 100644 index 0000000..ad758b2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-8.f90 @@ -0,0 +1,83 @@ +! { dg-do run } + +program main + use openacc + use iso_c_binding + implicit none + + integer, target :: a_3d_i(10, 10, 10) + complex a_3d_c(10, 10, 10) + real a_3d_r(10, 10, 10) + + integer i, j, k + complex c + real r + integer, parameter :: i_size = sizeof (i) + integer, parameter :: c_size = sizeof (c) + integer, parameter :: r_size = sizeof (r) + + if (acc_get_num_devices (acc_device_nvidia) .eq. 0) call exit + + call acc_init (acc_device_nvidia) + + call set3d (.FALSE., a_3d_i, a_3d_c, a_3d_r) + + call acc_copyin (a_3d_i) + call acc_copyin (a_3d_c) + call acc_copyin (a_3d_r) + + if (acc_is_present (a_3d_i) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_c) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_r) .neqv. .TRUE.) call abort + + do i = 1, 10 + do j = 1, 10 + do k = 1, 10 + if (acc_is_present (a_3d_i(i, j, k), i_size) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_c(i, j, k), i_size) .neqv. .TRUE.) call abort + if (acc_is_present (a_3d_r(i, j, k), i_size) .neqv. .TRUE.) 
call abort + end do + end do + end do + + call acc_shutdown (acc_device_nvidia) + +contains + + subroutine set3d (clear, a_i, a_c, a_r) + logical clear + integer, dimension (:,:,:), intent (inout) :: a_i + complex, dimension (:,:,:), intent (inout) :: a_c + real, dimension (:,:,:), intent (inout) :: a_r + + integer i, j, k + integer lb1, ub1, lb2, ub2, lb3, ub3 + + lb1 = lbound (a_i, 1) + ub1 = ubound (a_i, 1) + + lb2 = lbound (a_i, 2) + ub2 = ubound (a_i, 2) + + lb3 = lbound (a_i, 3) + ub3 = ubound (a_i, 3) + + do i = lb1, ub1 + do j = lb2, ub2 + do k = lb3, ub3 + if (clear) then + a_i(i, j, k) = 0 + a_c(i, j, k) = cmplx (0.0, 0.0) + a_r(i, j, k) = 0.0 + else + a_i(i, j, k) = i + a_c(i, j, k) = cmplx (i, j) + a_r(i, j, k) = i + end if + end do + end do + end do + + end subroutine + +end program diff --git a/libgomp/testsuite/libgomp.oacc-fortran/map-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/map-1.f90 new file mode 100644 index 0000000..082dd8a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/map-1.f90 @@ -0,0 +1,97 @@ +program map + integer, parameter :: n = 20, c = 10 + integer :: i, a(n), b(n) + + a(:) = 0 + b(:) = 0 + + ! COPY + + !$acc parallel copy (a) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + do i = 1, n + b(i) = i + end do + + call check (a, b, n) + + ! COPYOUT + + a(:) = 0 + + !$acc parallel copyout (a) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + do i = 1, n + if (a(i) .ne. b(i)) call abort + end do + call check (a, b, n) + + ! COPYIN + + a(:) = 0 + + !$acc parallel copyout (a) copyin (b) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + call check (a, b, n) + + ! PRESENT_OR_COPY + + !$acc parallel pcopy (a) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + call check (a, b, n) + + ! 
PRESENT_OR_COPYOUT + + a(:) = 0 + + !$acc parallel pcopyout (a) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + call check (a, b, n) + + ! PRESENT_OR_COPYIN + + a(:) = 0 + + !$acc parallel pcopyout (a) pcopyin (b) + !$acc loop + do i = 1, n + a(i) = i + end do + !$acc end parallel + + call check (a, b, n) +end program map + +subroutine check (a, b, n) + integer :: n, a(n), b(n) + integer :: i + + do i = 1, n + if (a(i) .ne. b(i)) call abort + end do +end subroutine check diff --git a/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-1.f b/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-1.f new file mode 100644 index 0000000..db3c6b1 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-1.f @@ -0,0 +1,9 @@ +! { dg-do run } + + program main + implicit none + include "openacc_lib.h" + + if (openacc_version .ne. 201306) call abort; + + end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-2.f90 new file mode 100644 index 0000000..a14ecdd --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/openacc_version-2.f90 @@ -0,0 +1,9 @@ +! { dg-do run } + +program main + use openacc + implicit none + + if (openacc_version .ne. 201306) call abort; + +end program main diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pointer-align-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pointer-align-1.f90 new file mode 100644 index 0000000..a5e1fcb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/pointer-align-1.f90 @@ -0,0 +1,21 @@ +! PR middle-end/63247 + +program test + implicit none + + integer(kind=2) a(4) + + a = 10; + + !$acc parallel copy(a(2:4)) + a(2) = 52 + a(3) = 53 + a(4) = 54 + !$acc end parallel + + if (a(1) .ne. 10) call abort + if (a(2) .ne. 52) call abort + if (a(3) .ne. 53) call abort + if (a(4) .ne. 
54) call abort + +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pset-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pset-1.f90 new file mode 100644 index 0000000..1a1d4c7 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/pset-1.f90 @@ -0,0 +1,229 @@ +! { dg-do run } + +program test + implicit none + integer, allocatable :: a1(:) + integer, allocatable :: b1(:) + integer, allocatable :: c1(:) + integer, allocatable :: b2(:,:) + integer, allocatable :: c3(:,:,:) + + allocate (a1(5)) + if (.not.allocated (a1)) call abort() + + a1 = 10 + + !$acc parallel copy(a1(1:5)) + a1(1) = 1 + a1(2) = 2 + a1(3) = 3 + a1(4) = 4 + a1(5) = 5 + !$acc end parallel + + if (a1(1) .ne. 1) call abort + if (a1(2) .ne. 2) call abort + if (a1(3) .ne. 3) call abort + if (a1(4) .ne. 4) call abort + if (a1(5) .ne. 5) call abort + + deallocate(a1) + + allocate (a1(0:4)) + if (.not.allocated (a1)) call abort() + + a1 = 10 + + !$acc parallel copy(a1(0:4)) + a1(0) = 1 + a1(1) = 2 + a1(2) = 3 + a1(3) = 4 + a1(4) = 5 + !$acc end parallel + + if (a1(0) .ne. 1) call abort + if (a1(1) .ne. 2) call abort + if (a1(2) .ne. 3) call abort + if (a1(3) .ne. 4) call abort + if (a1(4) .ne. 5) call abort + + deallocate(a1) + + allocate (b2(5,5)) + if (.not.allocated (b2)) call abort() + + b2 = 11 + + !$acc parallel copy(b2(1:5,1:5)) + b2(1,1) = 1 + b2(2,2) = 2 + b2(3,3) = 3 + b2(4,4) = 4 + b2(5,5) = 5 + !$acc end parallel + + if (b2(1,1) .ne. 1) call abort + if (b2(2,2) .ne. 2) call abort + if (b2(3,3) .ne. 3) call abort + if (b2(4,4) .ne. 4) call abort + if (b2(5,5) .ne. 5) call abort + + deallocate(b2) + + allocate (b2(0:4,0:4)) + if (.not.allocated (b2)) call abort() + + b2 = 11 + + !$acc parallel copy(b2(0:4,0:4)) + b2(0,0) = 1 + b2(1,1) = 2 + b2(2,2) = 3 + b2(3,3) = 4 + b2(4,4) = 5 + !$acc end parallel + + if (b2(0,0) .ne. 1) call abort + if (b2(1,1) .ne. 2) call abort + if (b2(2,2) .ne. 3) call abort + if (b2(3,3) .ne. 4) call abort + if (b2(4,4) .ne. 
5) call abort + + deallocate(b2) + + allocate (c3(5,5,5)) + if (.not.allocated (c3)) call abort() + + c3 = 12 + + !$acc parallel copy(c3(1:5,1:5,1:5)) + c3(1,1,1) = 1 + c3(2,2,2) = 2 + c3(3,3,3) = 3 + c3(4,4,4) = 4 + c3(5,5,5) = 5 + !$acc end parallel + + if (c3(1,1,1) .ne. 1) call abort + if (c3(2,2,2) .ne. 2) call abort + if (c3(3,3,3) .ne. 3) call abort + if (c3(4,4,4) .ne. 4) call abort + if (c3(5,5,5) .ne. 5) call abort + + deallocate(c3) + + allocate (c3(0:4,0:4,0:4)) + if (.not.allocated (c3)) call abort() + + c3 = 12 + + !$acc parallel copy(c3(0:4,0:4,0:4)) + c3(0,0,0) = 1 + c3(1,1,1) = 2 + c3(2,2,2) = 3 + c3(3,3,3) = 4 + c3(4,4,4) = 5 + !$acc end parallel + + if (c3(0,0,0) .ne. 1) call abort + if (c3(1,1,1) .ne. 2) call abort + if (c3(2,2,2) .ne. 3) call abort + if (c3(3,3,3) .ne. 4) call abort + if (c3(4,4,4) .ne. 5) call abort + + deallocate(c3) + + allocate (a1(5)) + if (.not.allocated (a1)) call abort() + + allocate (b1(5)) + if (.not.allocated (b1)) call abort() + + allocate (c1(5)) + if (.not.allocated (c1)) call abort() + + a1 = 10 + b1 = 3 + c1 = 7 + + !$acc parallel copyin(a1(1:5)) create(c1(1:5)) copyout(b1(1:5)) + c1(1) = a1(1) + c1(2) = a1(2) + c1(3) = a1(3) + c1(4) = a1(4) + c1(5) = a1(5) + + b1(1) = c1(1) + b1(2) = c1(2) + b1(3) = c1(3) + b1(4) = c1(4) + b1(5) = c1(5) + !$acc end parallel + + if (b1(1) .ne. 10) call abort + if (b1(2) .ne. 10) call abort + if (b1(3) .ne. 10) call abort + if (b1(4) .ne. 10) call abort + if (b1(5) .ne. 
10) call abort + + deallocate(a1) + deallocate(b1) + deallocate(c1) + + allocate (a1(0:4)) + if (.not.allocated (a1)) call abort() + + allocate (b1(0:4)) + if (.not.allocated (b1)) call abort() + + allocate (c1(0:4)) + if (.not.allocated (c1)) call abort() + + a1 = 10 + b1 = 3 + c1 = 7 + + !$acc parallel copyin(a1(0:4)) create(c1(0:4)) copyout(b1(0:4)) + c1(0) = a1(0) + c1(1) = a1(1) + c1(2) = a1(2) + c1(3) = a1(3) + c1(4) = a1(4) + + b1(0) = c1(0) + b1(1) = c1(1) + b1(2) = c1(2) + b1(3) = c1(3) + b1(4) = c1(4) + !$acc end parallel + + if (b1(0) .ne. 10) call abort + if (b1(1) .ne. 10) call abort + if (b1(2) .ne. 10) call abort + if (b1(3) .ne. 10) call abort + if (b1(4) .ne. 10) call abort + + deallocate(a1) + deallocate(b1) + deallocate(c1) + + allocate (a1(5)) + if (.not.allocated (a1)) call abort() + + a1 = 10 + + !$acc parallel copy(a1(2:3)) + a1(2) = 2 + a1(3) = 3 + !$acc end parallel + + if (a1(1) .ne. 10) call abort + if (a1(2) .ne. 2) call abort + if (a1(3) .ne. 3) call abort + if (a1(4) .ne. 10) call abort + if (a1(5) .ne. 10) call abort + + deallocate(a1) + +end program test diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-1.f90 new file mode 100644 index 0000000..89e7fe7 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-1.f90 @@ -0,0 +1,225 @@ +! { dg-do run } + +! Integer reductions + +program reduction_1 + implicit none + + integer, parameter :: n = 10, vl = 2 + integer :: i, vresult, result + logical :: lresult, lvresult + integer, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + result = 0 + vresult = 0 + + ! '+' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(+:result) + do i = 1, n + result = result + array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult + array(i) + end do + + if (result.ne.vresult) call abort + + result = 0 + vresult = 0 + + ! 
'*' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(*:result) + do i = 1, n + result = result * array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult * array(i) + end do + + if (result.ne.vresult) call abort + + result = 0 + vresult = 0 + + ! 'max' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(max:result) + do i = 1, n + result = max (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = max (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! 'min' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(min:result) + do i = 1, n + result = min (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = min (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! 'iand' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(iand:result) + do i = 1, n + result = iand (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = iand (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! 'ior' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(ior:result) + do i = 1, n + result = ior (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = ior (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 0 + vresult = 0 + + ! 'ieor' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(ieor:result) + do i = 1, n + result = ieor (result, array(i)) + end do + !$acc end parallel + + ! 
Verify the results + do i = 1, n + vresult = ieor (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.and.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.and.:lresult) + do i = 1, n + lresult = lresult .and. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .and. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.or.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.or.:lresult) + do i = 1, n + lresult = lresult .or. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .or. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.eqv.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.eqv.:lresult) + do i = 1, n + lresult = lresult .eqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .eqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.neqv.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.neqv.:lresult) + do i = 1, n + lresult = lresult .neqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .neqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort +end program reduction_1 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-2.f90 new file mode 100644 index 0000000..d3659c9 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-2.f90 @@ -0,0 +1,170 @@ +! { dg-do run } + +! 
real reductions + +program reduction_2 + implicit none + + integer, parameter :: n = 10, vl = 2 + integer :: i + real, parameter :: e = .001 + real :: vresult, result + logical :: lresult, lvresult + real, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + result = 0 + vresult = 0 + + ! '+' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(+:result) + do i = 1, n + result = result + array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult + array(i) + end do + + if (abs (result - vresult) .ge. e) call abort + + result = 1 + vresult = 1 + + ! '*' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(*:result) + do i = 1, n + result = result * array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult * array(i) + end do + + if (result.ne.vresult) call abort + + result = 0 + vresult = 0 + + ! 'max' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(max:result) + do i = 1, n + result = max (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = max (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! 'min' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(min:result) + do i = 1, n + result = min (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = min (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! '.and.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.and.:lresult) + do i = 1, n + lresult = lresult .and. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .and. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. 
+ lvresult = .false. + + ! '.or.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.or.:lresult) + do i = 1, n + lresult = lresult .or. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .or. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.eqv.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.eqv.:lresult) + do i = 1, n + lresult = lresult .eqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .eqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.neqv.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.neqv.:lresult) + do i = 1, n + lresult = lresult .neqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .neqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort +end program reduction_2 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-3.f90 new file mode 100644 index 0000000..2b8005d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-3.f90 @@ -0,0 +1,170 @@ +! { dg-do run } + +! double precision reductions + +program reduction_3 + implicit none + + integer, parameter :: n = 10, vl = 2 + integer :: i + double precision, parameter :: e = .001 + double precision :: vresult, result + logical :: lresult, lvresult + double precision, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + result = 0 + vresult = 0 + + ! '+' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(+:result) + do i = 1, n + result = result + array(i) + end do + !$acc end parallel + + ! 
Verify the results + do i = 1, n + vresult = vresult + array(i) + end do + + if (abs (result - vresult) .ge. e) call abort + + result = 1 + vresult = 1 + + ! '*' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(*:result) + do i = 1, n + result = result * array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult * array(i) + end do + + if (result.ne.vresult) call abort + + result = 0 + vresult = 0 + + ! 'max' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(max:result) + do i = 1, n + result = max (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = max (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! 'min' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(min:result) + do i = 1, n + result = min (result, array(i)) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = min (vresult, array(i)) + end do + + if (result.ne.vresult) call abort + + result = 1 + vresult = 1 + + ! '.and.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.and.:lresult) + do i = 1, n + lresult = lresult .and. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .and. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.or.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.or.:lresult) + do i = 1, n + lresult = lresult .or. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .or. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.eqv.' 
reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.eqv.:lresult) + do i = 1, n + lresult = lresult .eqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .eqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort + + lresult = .false. + lvresult = .false. + + ! '.neqv.' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(.neqv.:lresult) + do i = 1, n + lresult = lresult .neqv. (array(i) .ge. 5) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + lvresult = lvresult .neqv. (array(i) .ge. 5) + end do + + if (result.ne.vresult) call abort +end program reduction_3 diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90 new file mode 100644 index 0000000..12f7a33 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90 @@ -0,0 +1,54 @@ +! { dg-do run } + +! complex reductions + +program reduction_4 + implicit none + + integer, parameter :: n = 10, vl = 32 + integer :: i + complex :: vresult, result + complex, dimension (n) :: array + + do i = 1, n + array(i) = i + end do + + result = 0 + vresult = 0 + + ! '+' reductions + + !$acc parallel vector_length(vl) num_gangs(1) + !$acc loop reduction(+:result) + do i = 1, n + result = result + array(i) + end do + !$acc end parallel + + ! Verify the results + do i = 1, n + vresult = vresult + array(i) + end do + + if (result .ne. vresult) call abort + + result = 1 + vresult = 1 + +! ! '*' reductions +! +! !$acc parallel vector_length(vl) +! !$acc loop reduction(*:result) +! do i = 1, n +! result = result * array(i) +! end do +! !$acc end parallel +! +! ! Verify the results +! do i = 1, n +! vresult = vresult * array(i) +! end do +! +! 
if (result.ne.vresult) call abort
+end program reduction_4
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90
new file mode 100644
index 0000000..df44a7a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90
@@ -0,0 +1,32 @@
+! { dg-do run }
+
+! subroutine reduction
+
+program reduction
+  integer, parameter :: n = 40, c = 10
+  integer :: i, vsum, sum
+
+  call redsub (sum, n, c)
+
+  vsum = 0
+
+  ! Verify the results
+  do i = 1, n
+    vsum = vsum + c
+  end do
+
+  if (sum.ne.vsum) call abort ()
+end program reduction
+
+subroutine redsub(sum, n, c)
+  integer :: sum, n, c
+
+  sum = 0
+
+  !$acc parallel vector_length(n) copyin (n, c) num_gangs(1)
+  !$acc loop reduction(+:sum)
+  do i = 1, n
+    sum = sum + c
+  end do
+  !$acc end parallel
+end subroutine redsub
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90
new file mode 100644
index 0000000..6325431
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90
@@ -0,0 +1,30 @@
+! { dg-do run }
+
+program reduction
+  implicit none
+
+  integer, parameter :: n = 100
+  integer :: i, s1, s2, vs1, vs2
+
+  s1 = 0
+  s2 = 0
+  vs1 = 0
+  vs2 = 0
+
+  !$acc parallel vector_length (1000)
+  !$acc loop reduction(+:s1, s2)
+  do i = 1, n
+    s1 = s1 + 1
+    s2 = s2 + 2
+  end do
+  !$acc end parallel
+
+  ! Verify the results
+  do i = 1, n
+    vs1 = vs1 + 1
+    vs2 = vs2 + 2
+  end do
+
+  if (s1.ne.vs1) call abort ()
+  if (s2.ne.vs2) call abort ()
+end program reduction
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90
new file mode 100644
index 0000000..3390515
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90
@@ -0,0 +1,32 @@
+! { dg-do run }
+!
{ dg-options "-fno-inline" }
+
+  interface
+    recursive function fact (x)
+      !$acc routine
+      integer, intent(in) :: x
+      integer :: fact
+    end function fact
+  end interface
+  integer, parameter :: n = 10
+  integer :: a(n), i
+  !$acc parallel
+  !$acc loop
+  do i = 1, n
+    a(i) = fact (i)
+  end do
+  !$acc end parallel
+  do i = 1, n
+    if (a(i) .ne. fact(i)) call abort
+  end do
+end
+recursive function fact (x) result (res)
+  !$acc routine
+  integer, intent(in) :: x
+  integer :: res
+  if (x < 1) then
+    res = 1
+  else
+    res = x * fact (x - 1)
+  end if
+end function fact
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90
new file mode 100644
index 0000000..3d418b6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90
@@ -0,0 +1,29 @@
+! { dg-do run }
+! { dg-options "-fno-inline" }
+
+  module m1
+    contains
+    recursive function fact (x) result (res)
+      !$acc routine
+      integer, intent(in) :: x
+      integer :: res
+      if (x < 1) then
+        res = 1
+      else
+        res = x * fact (x - 1)
+      end if
+    end function fact
+  end module m1
+  use m1
+  integer, parameter :: n = 10
+  integer :: a(n), i
+  !$acc parallel
+  !$acc loop
+  do i = 1, n
+    a(i) = fact (i)
+  end do
+  !$acc end parallel
+  do i = 1, n
+    if (a(i) .ne. fact(i)) call abort
+  end do
+end
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90
new file mode 100644
index 0000000..d233a63
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90
@@ -0,0 +1,27 @@
+! { dg-do run }
+! { dg-options "-fno-inline" }
+
+  integer, parameter :: n = 10
+  integer :: a(n), i
+  integer, external :: fact
+  !$acc routine (fact)
+  !$acc parallel
+  !$acc loop
+  do i = 1, n
+    a(i) = fact (i)
+  end do
+  !$acc end parallel
+  do i = 1, n
+    if (a(i) .ne.
fact(i)) call abort
+  end do
+end
+recursive function fact (x) result (res)
+  !$acc routine
+  integer, intent(in) :: x
+  integer :: res
+  if (x < 1) then
+    res = 1
+  else
+    res = x * fact (x - 1)
+  end if
+end function fact
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-4.f90
new file mode 100644
index 0000000..3e5fb09
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-4.f90
@@ -0,0 +1,23 @@
+! { dg-do run }
+! { dg-options "-fno-inline" }
+
+  integer, parameter :: n = 10
+  integer :: a(n), i
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc parallel
+  !$acc loop
+  do i = 1, n
+    call incr(a(i))
+  end do
+  !$acc end parallel
+  do i = 1, n
+    if (a(i) .ne. (i + 1)) call abort
+  end do
+end
+subroutine incr (x)
+  !$acc routine
+  integer, intent(inout) :: x
+  x = x + 1
+end subroutine incr
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/subarrays-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/subarrays-1.f90
new file mode 100644
index 0000000..b39414f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/subarrays-1.f90
@@ -0,0 +1,97 @@
+program subarrays
+  integer, parameter :: n = 20, c = 10
+  integer :: i, a(n), b(n)
+
+  a(:) = 0
+  b(:) = 0
+
+  ! COPY
+
+  !$acc parallel copy (a(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  do i = 1, n
+    b(i) = i
+  end do
+
+  call check (a, b, n)
+
+  ! COPYOUT
+
+  a(:) = 0
+
+  !$acc parallel copyout (a(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  do i = 1, n
+    if (a(i) .ne. b(i)) call abort
+  end do
+  call check (a, b, n)
+
+  ! COPYIN
+
+  a(:) = 0
+
+  !$acc parallel copyout (a(1:n)) copyin (b(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  ! PRESENT_OR_COPY
+
+  !$acc parallel pcopy (a(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  !
PRESENT_OR_COPYOUT
+
+  a(:) = 0
+
+  !$acc parallel pcopyout (a(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  ! PRESENT_OR_COPYIN
+
+  a(:) = 0
+
+  !$acc parallel pcopyout (a(1:n)) pcopyin (b(1:n))
+  !$acc loop
+  do i = 1, n
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+end program subarrays
+
+subroutine check (a, b, n)
+  integer :: n, a(n), b(n)
+  integer :: i
+
+  do i = 1, n
+    if (a(i) .ne. b(i)) call abort
+  end do
+end subroutine check
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/subarrays-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/subarrays-2.f90
new file mode 100644
index 0000000..81799f6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/subarrays-2.f90
@@ -0,0 +1,100 @@
+program subarrays
+  integer, parameter :: n = 20, c = 10, low = 5, high = 10
+  integer :: i, a(n), b(n)
+
+  a(:) = 0
+  b(:) = 0
+
+  ! COPY
+
+  !$acc parallel copy (a(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  do i = low, high
+    b(i) = i
+  end do
+
+  call check (a, b, n)
+
+  ! COPYOUT
+
+  a(:) = 0
+
+  !$acc parallel copyout (a(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  do i = low, high
+    if (a(i) .ne. b(i)) call abort
+  end do
+  call check (a, b, n)
+
+  ! COPYIN
+
+  a(:) = 0
+
+  !$acc parallel copyout (a(low:high)) copyin (b(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = b(i)
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  ! PRESENT_OR_COPY
+
+  a(:) = 0
+
+  !$acc parallel pcopy (a(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  ! PRESENT_OR_COPYOUT
+
+  a(:) = 0
+
+  !$acc parallel pcopyout (a(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = i
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+
+  !
PRESENT_OR_COPYIN
+
+  a(:) = 0
+
+  !$acc parallel pcopyout (a(low:high)) &
+  !$acc & pcopyin (b(low:high))
+  !$acc loop
+  do i = low, high
+    a(i) = b(i)
+  end do
+  !$acc end parallel
+
+  call check (a, b, n)
+end program subarrays
+
+subroutine check (a, b, n)
+  integer :: n, a(n), b(n)
+  integer :: i
+
+  do i = 1, n
+    if (a(i) .ne. b(i)) call abort
+  end do
+end subroutine check |