path: root/libgomp/testsuite
Age | Commit message | Author | Files | Lines
2023-02-22 | Add '-Wno-complain-wrong-lang', and use it in ↵ | Thomas Schwinge | 16 | -17/+21
'gcc/testsuite/lib/target-supports.exp:check_compile' and elsewhere I noticed that GCC/Rust recently lost all LTO variants in torture testing: PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O0 (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O1 (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 (test for excess errors) -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O3 -g (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -Os (test for excess errors) Etc. The reason is that when probing for availability of LTO, we run into: spawn [...]/build-gcc/gcc/testsuite/rust/../../gccrs -B[...]/build-gcc/gcc/testsuite/rust/../../ -fdiagnostics-plain-output -frust-incomplete-and-experimental-compiler-do-not-use -flto -c -o lto8274.o lto8274.c cc1: warning: command-line option '-frust-incomplete-and-experimental-compiler-do-not-use' is valid for Rust but not for C For GCC/Rust testing, this flag is (as of recently) defaulted in 'gcc/testsuite/lib/rust.exp:rust_init': lappend ALWAYS_RUSTFLAGS "additional_flags=-frust-incomplete-and-experimental-compiler-do-not-use" A few more "command-line option [...] is valid for [...] but not for [...]" instances were found in the test suite logs, when more than one language is involved. 
With '-Wno-complain-wrong-lang' used in 'gcc/testsuite/lib/target-supports.exp:check_compile', we get back: PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O0 (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O1 (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 (test for excess errors) +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -O3 -g (test for excess errors) PASS: rust/compile/torture/all_doc_comment_line_blocks.rs -Os (test for excess errors) Etc., and in total: === rust Summary for unix === # of expected passes [-4990-]{+6718+} # of expected failures [-39-]{+51+} Anything that 'gcc/opts-global.cc:complain_wrong_lang' might do is cut short by '-Wno-complain-wrong-lang', not just the one 'warning' diagnostic. This corresponds to what already exists via 'lang_hooks.complain_wrong_lang_p'. The 'gcc/opts-common.cc:prune_options' changes follow the same rationale as PR67640 "driver passes -fdiagnostics-color= always last": we need to process '-Wno-complain-wrong-lang' early, so that it properly affects other options appearing before it on the command line. gcc/ * common.opt (-Wcomplain-wrong-lang): New. * doc/invoke.texi (-Wno-complain-wrong-lang): Document it. * opts-common.cc (prune_options): Handle it. * opts-global.cc (complain_wrong_lang): Use it. gcc/testsuite/ * gcc.dg/Wcomplain-wrong-lang-1.c: New. * gcc.dg/Wcomplain-wrong-lang-2.c: Likewise. * gcc.dg/Wcomplain-wrong-lang-3.c: Likewise. * gcc.dg/Wcomplain-wrong-lang-4.c: Likewise. * gcc.dg/Wcomplain-wrong-lang-5.c: Likewise. * lib/target-supports.exp (check_compile): Use '-Wno-complain-wrong-lang'. * g++.dg/abi/empty12.C: Likewise. 
* g++.dg/abi/empty13.C: Likewise. * g++.dg/abi/empty14.C: Likewise. * g++.dg/abi/empty15.C: Likewise. * g++.dg/abi/empty16.C: Likewise. * g++.dg/abi/empty17.C: Likewise. * g++.dg/abi/empty18.C: Likewise. * g++.dg/abi/empty19.C: Likewise. * g++.dg/abi/empty22.C: Likewise. * g++.dg/abi/empty25.C: Likewise. * g++.dg/abi/empty26.C: Likewise. * gfortran.dg/bind-c-contiguous-1.f90: Likewise. * gfortran.dg/bind-c-contiguous-4.f90: Likewise. * gfortran.dg/bind-c-contiguous-5.f90: Likewise. libgomp/ * testsuite/libgomp.fortran/alloc-10.f90: Use '-Wno-complain-wrong-lang'. * testsuite/libgomp.fortran/alloc-11.f90: Likewise. * testsuite/libgomp.fortran/alloc-7.f90: Likewise. * testsuite/libgomp.fortran/alloc-9.f90: Likewise. * testsuite/libgomp.fortran/allocate-1.f90: Likewise. * testsuite/libgomp.fortran/depend-4.f90: Likewise. * testsuite/libgomp.fortran/depend-5.f90: Likewise. * testsuite/libgomp.fortran/depend-6.f90: Likewise. * testsuite/libgomp.fortran/depend-7.f90: Likewise. * testsuite/libgomp.fortran/depend-inoutset-1.f90: Likewise. * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Likewise. * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Likewise. * testsuite/libgomp.fortran/order-reproducible-1.f90: Likewise. * testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Remove left-over 'dg-prune-output'.
2023-02-16 | libgomp: Fix comment typo | Jakub Jelinek | 1 | -1/+1

I saw
FAIL: libgomp.fortran/target-nowait-array-section.f90 -O execution test
in my last x86_64-linux bootstrap.

From quick skimming, it might just be an unreliable test, which assumes that asynchronous execution wouldn't produce an ordered sequence, but can't that happen even with asynchronous execution?

That said, while skimming the test, I noticed a comment typo, and this patch fixes that up.

2023-02-16 Jakub Jelinek <jakub@redhat.com>

	* testsuite/libgomp.fortran/target-nowait-array-section.f90: Fix comment typo and improve its wording.
2023-02-15 | libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET | Tobias Burnus | 1 | -2/+4

libgomp/
	* target.c (gomp_target_rev): Dereference ptr to get device address.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: Add test for unallocated allocatable.
2023-02-15 | libgomp: Fix 'target enter data' with always pointer | Tobias Burnus | 1 | -0/+22

As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that with 'target enter data' both are passed together to gomp_map_vars_internal.

libgomp/ChangeLog:
	* target.c (gomp_map_vars_internal): Add 'i > 0' before doing a kind check.
	(GOMP_target_enter_exit_data): If the next map item is GOMP_MAP_ALWAYS_POINTER map it together with the current item.
	* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.
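
The pairing rule described in this commit can be illustrated with a small model. This is only a sketch under stated assumptions: the `map_kind` enum and `group_map_items` function are hypothetical names invented here, and the code does not reproduce the logic of `GOMP_target_enter_exit_data`. It shows the idea that an item followed by an always-pointer entry is handed over as a group of two.

```c
#include <stddef.h>

/* Hypothetical, simplified map kinds; the real GOMP_MAP_* constants
   live in gomp-constants.h and are not reproduced here.  */
enum map_kind { MAP_TO, MAP_ALLOC, MAP_ALWAYS_POINTER };

/* Split N map items into groups.  An item that is immediately followed
   by an always-pointer entry forms a group of two, because the pointer
   entry operates on the item before it.  The length of each group is
   written to GROUP_LEN (which must have room for N entries); the
   number of groups is returned.  */
size_t
group_map_items (const enum map_kind *kinds, size_t n, size_t *group_len)
{
  size_t groups = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t len = 1;
      if (i + 1 < n && kinds[i + 1] == MAP_ALWAYS_POINTER)
        {
          len = 2;
          i++;  /* The pointer entry is consumed with the current item.  */
        }
      group_len[groups++] = len;
    }
  return groups;
}
```

For the sequence to, always-pointer, alloc this yields two groups, the first of length two and the second of length one.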
2023-02-09 | OpenMP/Fortran: Partially fix non-rect loop nests [PR107424] | Tobias Burnus | 6 | -0/+1740

This patch ensures that loop bounds depending on outer loop vars use the proper TREE_VEC format. It additionally gives a sorry if such an outer var has a non-one/non-minus-one increment as currently a count variable is used in this case (see PR).

Finally, it avoids 'count' and just uses a local loop variable if the step increment is +/-1.

	PR fortran/107424

gcc/fortran/ChangeLog:
	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and 'non_unit_incr' members.
	(gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it; use normal loop bounds for unit stride - and only create local loop var.

libgomp/ChangeLog:
	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.

gcc/testsuite/ChangeLog:
	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.
2023-02-07 | Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation | Thomas Schwinge | 1 | -0/+2

Fix-up for recent commit 0b1ce70a813b98ef2893779d14ad6c90c5d06a71 "libgomp: Fix reverse offload issues".

libgomp/
	* testsuite/libgomp.fortran/reverse-offload-6.f90: Fix nvptx offloading compilation.
2023-02-03 | libgomp: Fix reverse offload issues | Tobias Burnus | 1 | -0/+32

If there is nothing to map, skip the mapping and avoid attempting to copy 0 bytes from addrs, sizes and kinds.

Additionally, it could happen that a non-allocated address was deallocated, such as a pointer set, leading to a free for the actual data.

libgomp/
	* target.c (gomp_target_rev): Handle mapnum == 0 and avoid freeing not allocated memory.
	* testsuite/libgomp.fortran/reverse-offload-6.f90: New test.
2023-02-01 | Fortran: Extend align-clause checks of OpenMP's allocate directive | Tobias Burnus | 2 | -2/+44

gcc/fortran/ChangeLog:
	* openmp.cc (resolve_omp_clauses): Check also for power of two.

libgomp/ChangeLog:
	* testsuite/libgomp.fortran/allocate-3.f90: Fix ALIGN usage, remove unused -fdump-tree-original.
	* testsuite/libgomp.fortran/allocate-4.f90: New.
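
The "power of two" condition mentioned in this ChangeLog entry is commonly tested with a single bit operation. The following is a minimal sketch of that test, not the gfortran code: `align_is_power_of_two` is a hypothetical helper name, and how `resolve_omp_clauses` actually performs the check is not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from gfortran: returns true when ALIGN
   is a positive power of two.  A power of two has exactly one bit set,
   so clearing the lowest set bit with 'align & (align - 1)' must leave
   zero.  Zero itself is rejected explicitly.  */
bool
align_is_power_of_two (uint64_t align)
{
  return align != 0 && (align & (align - 1)) == 0;
}
```

With this helper, 64 is accepted while 48 and 0 are rejected.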
2023-01-27 | OpenMP/Fortran: Fix has_device_addr clause splitting [PR108558] | Tobias Burnus | 1 | -0/+59

gcc/fortran/ChangeLog:
	PR fortran/108558
	* trans-openmp.cc (gfc_split_omp_clauses): Handle has_device_addr.

libgomp/ChangeLog:
	PR fortran/108558
	* testsuite/libgomp.fortran/has_device_addr.f90: New test.
2023-01-19 | openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459] | Jakub Jelinek | 1 | -0/+41

For the case where a collapse(2) inner loop has an init expression dependent on a non-constant multiple of the outer iterator and the condition upper bound expression doesn't depend on the outer iterator, expand_omp_for_init_counts was using fold_unary (NEGATE_EXPR, ...). This just returns NULL if the expression can't be folded; we need fold_build1 instead.

2023-01-19 Jakub Jelinek <jakub@redhat.com>

	PR middle-end/108459
	* omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather than fold_unary for NEGATE_EXPR.
	* testsuite/libgomp.c/pr108459.c: New test.
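
To make the loop shape described above concrete, here is a small sketch of a collapse(2) nest whose inner loop starts at a non-constant multiple of the outer iterator while its upper bound is independent of it. It is an illustration written for this log, not the contents of pr108459.c; the function name `count_iterations` and the bounds are assumptions. Without `-fopenmp` the pragma is ignored and the loop runs sequentially with the same result.

```c
/* Hypothetical example of a non-rectangular loop nest: the inner
   loop's start value 'k * i' depends on the outer iterator 'i' through
   the run-time multiplier K, while the upper bound N does not depend
   on 'i'.  Returns the total number of inner iterations.  */
int
count_iterations (int k, int n)
{
  int count = 0;
  #pragma omp parallel for collapse(2) reduction(+:count)
  for (int i = 0; i < 4; i++)
    for (int j = k * i; j < n; j++)
      count++;
  return count;
}
```

For k = 1 and n = 4 the inner loop runs 4, 3, 2 and 1 times, giving 10 iterations in total.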
2023-01-16 | Update copyright years. | Jakub Jelinek | 4 | -4/+4
2023-01-05 | openmp: Fix up finish_omp_target_clauses [PR108286] | Jakub Jelinek | 1 | -0/+29

The comment in the loop says that we shouldn't add a map clause if such a clause exists already, but the loop was actually using OMP_CLAUSE_DECL on any clause. Target construct can have various clauses which don't have OMP_CLAUSE_DECL at all (e.g. nowait, device or if) or clause where it means something different (e.g. privatization clauses, allocate, depend).

So, only check OMP_CLAUSE_DECL on OMP_CLAUSE_MAP clauses.

2023-01-05 Jakub Jelinek <jakub@redhat.com>

	PR c++/108286
	* semantics.cc (finish_omp_target_clauses): Ignore clauses other than OMP_CLAUSE_MAP.
	* testsuite/libgomp.c++/pr108286.C: New test.
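
The fix amounts to filtering by clause kind before looking at the clause's declaration. The sketch below models that with a plain linked list; every name in it (`clause_kind`, `struct clause`, `already_mapped`) is hypothetical and stands in for GCC's tree-based clause chain, which is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified clause kinds.  */
enum clause_kind { CLAUSE_MAP, CLAUSE_NOWAIT, CLAUSE_DEVICE, CLAUSE_FIRSTPRIVATE };

struct clause
{
  enum clause_kind kind;
  const char *decl;       /* Names a mapped variable only for CLAUSE_MAP.  */
  struct clause *next;
};

/* Return true if NAME already appears in a map clause.  Clauses of any
   other kind are skipped before their 'decl' field is inspected, since
   it may be absent or carry a different meaning there.  */
bool
already_mapped (const struct clause *clauses, const char *name)
{
  for (const struct clause *c = clauses; c != NULL; c = c->next)
    {
      if (c->kind != CLAUSE_MAP)
        continue;
      if (c->decl != NULL && strcmp (c->decl, name) == 0)
        return true;
    }
  return false;
}
```

In this model a variable that only appears in a firstprivate clause is not reported as mapped, so a map clause would still be added for it.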
2022-12-21 | openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180] | Jakub Jelinek | 1 | -0/+55

DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR of this->field used just during gimplification and omp lowering/expansion to privatize individual fields in methods when needed. As the following testcase shows, when not in templates, they were handled right, but in templates we actually called cp_finish_decl on them and that can result in their destruction, which is obviously undesirable; we should only destruct the privatized copies of them created in omp lowering.

Fixed thusly.

2022-12-21 Jakub Jelinek <jakub@redhat.com>

	PR c++/108180
	* pt.cc (tsubst_expr): Don't call cp_finish_decl on DECL_OMP_PRIVATIZED_MEMBER vars.
	* testsuite/libgomp.c++/pr108180.C: New test.
2022-12-16 | Remove libgomp/testsuite/libgomp.fortran/allocate-4.f90 [PR108056] | Tobias Burnus | 1 | -42/+0

Commit r13-4716-ge205ec03f0794aeac3e8a89e947c12624d5a274e accidentally included a testcase of another patch that is pending review: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html

libgomp/
	PR libfortran/108056
	* testsuite/libgomp.fortran/allocate-4.f90: Remove accidentally added file.
2022-12-15 | libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056] | Tobias Burnus | 1 | -0/+42

Since GCC 12, the conversion between the array descriptor formats - the internal (GFC) and the C binding one (CFI) - moved to the compiler itself such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only used with older code (GCC 9 to 11).

The newly added checks caused asserts as older code did not pass the proper values (e.g. real(4) as effective argument arrived as BT_ASSUME type as the effective type got lost in between).

As proposed in the PR, revert to the GCC 11 version - known bugs are better than some fixes and new issues. Still, GCC 12 is much better in terms of TS29113 support and should really be used.

This patch uses the current libgomp version of the GCC 11 branch, except it fixes the GFC version number (which is 0), uses calloc instead of malloc, and sets the lower bound to 1 instead of keeping it as is for CFI_attribute_other.

libgfortran/ChangeLog:
	PR libfortran/108056
	* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc, gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for those backward-compatibility-only functions.
2022-12-14 | OpenMP/Fortran: Combined directives with map/firstprivate of same symbol | Julian Brown | 1 | -0/+41

This patch fixes a case where a combined directive (e.g. "!$omp target parallel ...") contains both a map and a firstprivate clause for the same variable. When the combined directive is split into two nested directives, the outer "target" gets the "map" clause, and the inner "parallel" gets the "firstprivate" clause, like so:

    !$omp target parallel map(x) firstprivate(x)

    -->

    !$omp target map(x)
    !$omp parallel firstprivate(x)
    ...

When there is no map of the same variable, the firstprivate is distributed to both directives, e.g. for 'y' in:

    !$omp target parallel map(x) firstprivate(y)

    -->

    !$omp target map(x) firstprivate(y)
    !$omp parallel firstprivate(y)
    ...

This is not a recent regression, but appears to fix a long-standing ICE. (The included testcase is based on one by Tobias.)

2022-12-06 Julian Brown <julian@codesourcery.com>

gcc/fortran/
	* trans-openmp.cc (gfc_add_firstprivate_if_unmapped): New function.
	(gfc_split_omp_clauses): Call above.

libgomp/
	* testsuite/libgomp.fortran/combined-directive-splitting-1.f90: New test.
2022-12-10 | libgomp: Handle OpenMP's reverse offloads | Tobias Burnus | 5 | -0/+467

This commit enables reverse offload for nvptx such that gomp_target_rev actually gets called. It also fills in the latter function to do all of the following: finding the host function for the device func ptr and copying the arguments to the host, processing the mapping/firstprivate, calling the host function, copying back the data and freeing as needed.

The data handling is made easier by assuming that all host variables either existed before (and are in the mapping) or that those are device variables not yet available on the host. Thus, the reverse mapping can do without refcounts etc. Note that the spec disallows inside a target region device-affecting constructs other than target plus ancestor device-modifier and it also limits the clauses permitted on this construct.

For the function addresses, an additional splay tree is used; for the lookup of mapped variables, the existing splay-tree is used. Unfortunately, its data structure requires a full walk of the tree. Additionally, the just mapped variables are recorded in a separate data structure, requiring an extra lookup. While the lookup is slow, assuming that only few variables get mapped in each reverse offload construct and that reverse offload is the exception and not performance critical, this seems to be acceptable.

libgomp/ChangeLog:
	* libgomp.h (struct target_mem_desc): Predeclare; move below after 'reverse_splay_tree_node' and add rev_array member.
	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
	(reverse_splay_tree_node, reverse_splay_tree, reverse_splay_tree_key): New typedef.
	(struct gomp_device_descr): Add mem_map_rev member.
	* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support for GOMP_REQUIRES_REVERSE_OFFLOAD.
	* splay-tree.h (splay_tree_callback_stop): New typedef; like splay_tree_callback but returning int not void.
	(splay_tree_foreach_lazy): Define; like splay_tree_foreach but taking splay_tree_callback_stop as argument.
	* splay-tree.c (splay_tree_foreach_internal_lazy, splay_tree_foreach_lazy): New; but early exit if callback returns nonzero.
	* target.c: Instantiate splay_tree_c with splay_tree_prefix 'reverse'.
	(gomp_map_lookup_rev): New.
	(gomp_load_image_to_device): Handle reverse-offload function lookup table.
	(gomp_unload_image_from_device): Free devicep->mem_map_rev.
	(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup, gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int, gomp_map_cdata_lookup): New auxiliary structs and functions for gomp_target_rev.
	(gomp_target_rev): Implement reverse offloading and its mapping.
	(gomp_target_init): Init current_device.mem_map_rev.root.
	* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without mapping of on-device allocated variables.
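
The "lazy" foreach described in the splay-tree entries is a tree walk whose callback can end the traversal early by returning nonzero. The following is a minimal sketch of that pattern on a plain binary tree. The names (`struct node`, `callback_stop`, `tree_foreach_lazy`, `stop_at_target`) are hypothetical, the in-order visiting sequence is an assumption, and libgomp's actual splay-tree types and traversal order are not reproduced.

```c
#include <stddef.h>

/* Hypothetical, simplified tree node.  */
struct node
{
  int key;
  struct node *left, *right;
};

/* Callback type that returns nonzero to stop the walk, mirroring the
   'returning int not void' idea from the ChangeLog.  */
typedef int (*callback_stop) (int key, void *data);

/* Walk the tree in order; return 1 as soon as FUNC returns nonzero,
   without visiting the remaining nodes, and 0 otherwise.  */
int
tree_foreach_lazy (const struct node *n, callback_stop func, void *data)
{
  if (n == NULL)
    return 0;
  if (tree_foreach_lazy (n->left, func, data))
    return 1;
  if (func (n->key, data))
    return 1;
  return tree_foreach_lazy (n->right, func, data);
}

/* Example callback state: the key to look for and a visit counter.  */
struct search
{
  int target;
  int visited;
};

/* Example callback: count the visit and stop once TARGET is seen.  */
int
stop_at_target (int key, void *data)
{
  struct search *s = data;
  s->visited++;
  return key == s->target;
}
```

Searching a three-node tree with keys 1, 2, 3 for key 2 stops after two visits, whereas a key that is not present forces the full walk of three visits.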
2022-12-09 | Fortran/OpenMP: align/allocator modifiers to the allocate clause | Tobias Burnus | 2 | -0/+53

gcc/fortran/ChangeLog:
	* dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE output.
	* gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
	(gfc_free_omp_namelist): Add bool arg.
	* match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction, gfc_match_omp_flush): Update call.
	(gfc_match_omp_clauses): Match align/allocator modifiers in 'allocate' clause.
	(resolve_omp_clauses): Resolve align.
	* st.cc (gfc_free_statement): Update call.
	* trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.

libgomp/ChangeLog:
	* libgomp.texi (5.1 Impl. Status): Split allocate clause/directive item about 'align'; mark clause as 'Y' and directive as 'N'.
	* testsuite/libgomp.fortran/allocate-2.f90: New test.
	* testsuite/libgomp.fortran/allocate-3.f90: New test.
2022-12-06 | OpenMP: omp_get_max_teams, omp_set_num_teams, and ↵ | Marcel Vollweiler | 7 | -25/+757
omp_{gs}et_teams_thread_limit on offload devices This patch adds support for omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of device-specific ICV values (specified as environment variables or changed on a device). In order to reuse device-specific ICV values, a copy back mechanism is implemented that copies ICV values back from device to the host. Additionally, a limitation of the number of teams on gcn offload devices is implemented. The number of teams is limited by twice the number of compute units (one team is executed on one compute unit). This avoids queueing unnessecary many teams and a corresponding allocation of large amounts of memory. Without that limitation the memory allocation for a large number of user-specified teams can result in an "memory access fault". A limitation of the number of teams is already also implemented for nvptx devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c). gcc/ChangeLog: * gimplify.cc (optimize_target_teams): Set initial num_teams_upper to "-2" instead of "1" for non-existing num_teams clause in order to disambiguate from the case of an existing num_teams clause with value 1. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to allow processing of device-specific values. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * icv-device.c (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. (omp_set_teams_thread_limit): Likewise. * icv.c (omp_set_teams_thread_limit): Removed. (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. * libgomp.texi: Updated documentation for nvptx and gcn corresponding to the limitation of the number of teams. 
* plugin/plugin-gcn.c (limit_teams): New helper function that limits the number of teams by twice the number of compute units. (parse_target_attributes): Limit the number of teams on gcn offload devices. * target.c (get_gomp_offload_icvs): Added teams_thread_limit_var handling. (gomp_load_image_to_device): Added a size check for the ICVs struct variable. (gomp_copy_back_icvs): New function that is used in GOMP_target_ext to copy back the ICV values from device to host. (GOMP_target_ext): Update the number of teams and threads in the kernel args also considering device-specific values. * testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading of OMP_TEAMS_THREAD_LIMIT from the environment. * testsuite/libgomp.c-c++-common/icv-5.c: Extended. * testsuite/libgomp.c-c++-common/icv-6.c: Extended. * testsuite/libgomp.c-c++-common/icv-7.c: Extended. * testsuite/libgomp.c-c++-common/icv-9.c: New test. * testsuite/libgomp.fortran/icv-5.f90: New test. * testsuite/libgomp.fortran/icv-6.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-teams-1.c: Adapt expected values for num_teams from "1" to "-2" in cases without num_teams clause. * g++.dg/gomp/target-teams-1.C: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
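
The team limit described in this commit (at most twice the number of compute units) is a simple clamp. The sketch below shows the arithmetic only. The name `limit_teams` is taken from the ChangeLog, but the signature used here is an assumption: the real helper in plugin-gcn.c may take different arguments, for example a device descriptor instead of a plain compute-unit count.

```c
/* Sketch of the clamp described above.  The parameters are
   hypothetical: TEAMS is the requested number of teams and
   COMPUTE_UNITS the number of compute units of the device.
   Returns TEAMS limited to twice the number of compute units.  */
int
limit_teams (int teams, int compute_units)
{
  int max_teams = 2 * compute_units;
  return teams > max_teams ? max_teams : teams;
}
```

Under these assumptions a request for 1000 teams on a device with 60 compute units is reduced to 120, while a request for 8 teams is left unchanged.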
2022-11-30 | amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors | Paul-Antoine Arras | 7 | -0/+106

Add support for gfx803 as an alias for fiji. Add test cases for all supported 'isa' values.

gcc/ChangeLog:
	* config/gcn/gcn.cc (gcn_omp_device_kind_arch_isa): Add gfx803.
	* config/gcn/t-omp-device: Add gfx803.

libgomp/ChangeLog:
	* testsuite/libgomp.c/declare-variant-4-fiji.c: New test.
	* testsuite/libgomp.c/declare-variant-4-gfx803.c: New test.
	* testsuite/libgomp.c/declare-variant-4-gfx900.c: New test.
	* testsuite/libgomp.c/declare-variant-4-gfx906.c: New test.
	* testsuite/libgomp.c/declare-variant-4-gfx908.c: New test.
	* testsuite/libgomp.c/declare-variant-4-gfx90a.c: New test.
	* testsuite/libgomp.c/declare-variant-4.h: New header file.
2022-11-25 | OpenMP: Generate SIMD clones for functions with "declare target" | Sandra Loosemore | 4 | -0/+123
This patch causes the IPA simdclone pass to generate clones for functions with the "omp declare target" attribute as if they had "omp declare simd", provided the function appears to be suitable for SIMD execution. The filter is conservative, rejecting functions that write memory or that call other functions not known to be safe. A new option -fopenmp-target-simd-clone is added to control this transformation; it's enabled for offload processing at -O2 and higher. gcc/ChangeLog: * common.opt (fopenmp-target-simd-clone): New option. (target_simd_clone_device): New enum to go with it. * doc/invoke.texi (-fopenmp-target-simd-clone): Document. * flag-types.h (enum omp_target_simd_clone_device_kind): New. * omp-simd-clone.cc (auto_simd_fail): New function. (auto_simd_check_stmt): New function. (plausible_type_for_simd_clone): New function. (ok_for_auto_simd_clone): New function. (simd_clone_create): Add force_local argument, make the symbol have internal linkage if it is true. (expand_simd_clones): Also check for cloneable functions with "omp declare target". Pass explicit_p argument to simd_clone.compute_vecsize_and_simdlen target hook. * opts.cc (default_options_table): Add -fopenmp-target-simd-clone. * target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): Add bool explicit_p argument. * doc/tm.texi: Regenerated. * config/aarch64/aarch64.cc (aarch64_simd_clone_compute_vecsize_and_simdlen): Update. * config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): Update. * config/i386/i386.cc (ix86_simd_clone_compute_vecsize_and_simdlen): Update. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-simd-clone-1.C: New. * g++.dg/gomp/target-simd-clone-2.C: New. * gcc.dg/gomp/target-simd-clone-1.c: New. * gcc.dg/gomp/target-simd-clone-2.c: New. * gcc.dg/gomp/target-simd-clone-3.c: New. * gcc.dg/gomp/target-simd-clone-4.c: New. * gcc.dg/gomp/target-simd-clone-5.c: New. * gcc.dg/gomp/target-simd-clone-6.c: New. * gcc.dg/gomp/target-simd-clone-7.c: New. 
* gcc.dg/gomp/target-simd-clone-8.c: New. * lib/scanoffloadipa.exp: New. libgomp/ChangeLog: * testsuite/lib/libgomp.exp: Load scanoffloadipa.exp library. * testsuite/libgomp.c/target-simd-clone-1.c: New. * testsuite/libgomp.c/target-simd-clone-2.c: New. * testsuite/libgomp.c/target-simd-clone-3.c: New.
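
As an illustration of the kind of function this commit is about, the sketch below marks a small helper with "declare target" and calls it from an offloaded SIMD loop. The helper only reads its arguments, writes no memory and calls nothing else, which matches the conservative filter described in the commit message. Whether a given GCC version actually creates a clone for it is not claimed here. The names `scale_add` and `saxpy` are made up for this example, and without `-fopenmp` the pragmas are ignored and the code runs sequentially on the host.

```c
#pragma omp declare target
/* Hypothetical candidate: pure arithmetic on its arguments, with no
   memory writes and no calls to other functions.  */
float
scale_add (float a, float x, float y)
{
  return a * x + y;
}
#pragma omp end declare target

/* Calls the helper from an offloaded loop, computing
   y[i] = a * x[i] + y[i] for the first N elements.  */
void
saxpy (int n, float a, const float *x, float *y)
{
  #pragma omp target teams distribute parallel for simd map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = scale_add (a, x[i], y[i]);
}
```

According to the commit message, the transformation is controlled by -fopenmp-target-simd-clone and is enabled for offload processing at -O2 and higher.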
2022-11-25 | libgomp: Add no-target-region rev offload test + fix plugin-nvptx | Tobias Burnus | 1 | -0/+49

OpenMP permits a 'target device(ancestor:1)' to be called without being enclosed in a target region - using the current device (i.e. the host) in that case. This commit adds a testcase for this.

In case of nvptx, the missing on-device 'GOMP_target_ext' call means that neither it nor the associated on-device GOMP_REV_OFFLOAD_VAR variable is linked in from nvptx's libgomp.a. Thus, handle the failing cuModuleGetGlobal gracefully by disabling reverse offload and assuming that the failure is fine.

libgomp/ChangeLog:
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Use unsigned int for 'i' to match 'fn_entries'; regard absent GOMP_REV_OFFLOAD_VAR as valid and the code having no reverse-offload code.
	* testsuite/libgomp.c-c++-common/reverse-offload-2.c: New test.
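
The situation described in the first paragraph can be sketched as follows. This is an illustration written for this log, not the contents of reverse-offload-2.c; the function name `add_on_host` is made up. It assumes a compiler that accepts the 'ancestor' device modifier; without `-fopenmp` the pragmas are ignored and the block simply runs on the host.

```c
#pragma omp requires reverse_offload

/* Hypothetical example: the 'target' construct below is not nested in
   any other target region, so according to the commit message it runs
   on the current device, which is the host.  Returns X + 1.  */
int
add_on_host (int x)
{
  int result = 0;
  #pragma omp target device(ancestor: 1) map(to: x) map(from: result)
  result = x + 1;
  return result;
}
```

Called from ordinary host code, the function behaves like a plain increment.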
2022-11-04 | Remove support for Intel MIC offloading | Thomas Schwinge | 4 | -75/+0
... after its deprecation in GCC 12. * Makefile.def: Remove module 'liboffloadmic'. * Makefile.in: Regenerate. * configure.ac: Remove 'liboffloadmic' handling. * configure: Regenerate. contrib/ * gcc-changelog/git_commit.py (default_changelog_locations): Remove 'liboffloadmic'. * gcc_update (files_and_dependencies): Remove 'liboffloadmic' files. * update-copyright.py (GCCCmdLine): Remove 'liboffloadmic' comment. gcc/ * config.gcc [target *-intelmic-* | *-intelmicemul-*]: Remove. * config/i386/i386-options.cc (ix86_omp_device_kind_arch_isa) [ACCEL_COMPILER]: Remove. * config/i386/intelmic-mkoffload.cc: Remove. * config/i386/intelmic-offload.h: Likewise. * config/i386/t-intelmic: Likewise. * config/i386/t-omp-device: Likewise. * configure.ac [target *-intelmic-* | *-intelmicemul-*]: Remove. * configure: Regenerate. * doc/install.texi (--enable-offload-targets=[...]): Update. * doc/sourcebuild.texi: Remove 'liboffloadmic' documentation. include/ * gomp-constants.h (GOMP_DEVICE_INTEL_MIC): Comment out. (GOMP_VERSION_INTEL_MIC): Remove. libgomp/ * libgomp-plugin.h (OFFLOAD_TARGET_TYPE_INTEL_MIC): Remove. * libgomp.texi (OpenMP Context Selectors): Remove Intel MIC documentation. * plugin/configfrag.ac <enable_offload_targets> [*-intelmic-* | *-intelmicemul-*]: Remove. * configure: Regenerate. * testsuite/lib/libgomp.exp (libgomp_init): Remove 'liboffloadmic' handling. (offload_target_to_openacc_device_type) [$offload_target = *-intelmic*]: Remove. (check_effective_target_offload_device_intel_mic) (check_effective_target_offload_device_any_intel_mic): Remove. * testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_intel_mic, on_device_arch_intel_mic, any_device_arch) (any_device_arch_intel_mic): Remove. * testsuite/libgomp.c-c++-common/target-45.c: Remove 'offload_device_any_intel_mic' XFAIL. * testsuite/libgomp.fortran/target10.f90: Likewise. liboffloadmic/ * ChangeLog: Remove. * Makefile.am: Likewise. * Makefile.in: Likewise. * aclocal.m4: Likewise. 
* configure: Likewise. * configure.ac: Likewise. * configure.tgt: Likewise. * doc/doxygen/config: Likewise. * doc/doxygen/header.tex: Likewise. * include/coi/common/COIEngine_common.h: Likewise. * include/coi/common/COIEvent_common.h: Likewise. * include/coi/common/COIMacros_common.h: Likewise. * include/coi/common/COIPerf_common.h: Likewise. * include/coi/common/COIResult_common.h: Likewise. * include/coi/common/COISysInfo_common.h: Likewise. * include/coi/common/COITypes_common.h: Likewise. * include/coi/sink/COIBuffer_sink.h: Likewise. * include/coi/sink/COIPipeline_sink.h: Likewise. * include/coi/sink/COIProcess_sink.h: Likewise. * include/coi/source/COIBuffer_source.h: Likewise. * include/coi/source/COIEngine_source.h: Likewise. * include/coi/source/COIEvent_source.h: Likewise. * include/coi/source/COIPipeline_source.h: Likewise. * include/coi/source/COIProcess_source.h: Likewise. * liboffloadmic_host.spec.in: Likewise. * liboffloadmic_target.spec.in: Likewise. * plugin/Makefile.am: Likewise. * plugin/Makefile.in: Likewise. * plugin/aclocal.m4: Likewise. * plugin/configure: Likewise. * plugin/configure.ac: Likewise. * plugin/libgomp-plugin-intelmic.cpp: Likewise. * plugin/offload_target_main.cpp: Likewise. * runtime/cean_util.cpp: Likewise. * runtime/cean_util.h: Likewise. * runtime/coi/coi_client.cpp: Likewise. * runtime/coi/coi_client.h: Likewise. * runtime/coi/coi_server.cpp: Likewise. * runtime/coi/coi_server.h: Likewise. * runtime/compiler_if_host.cpp: Likewise. * runtime/compiler_if_host.h: Likewise. * runtime/compiler_if_target.cpp: Likewise. * runtime/compiler_if_target.h: Likewise. * runtime/dv_util.cpp: Likewise. * runtime/dv_util.h: Likewise. * runtime/emulator/coi_common.h: Likewise. * runtime/emulator/coi_device.cpp: Likewise. * runtime/emulator/coi_device.h: Likewise. * runtime/emulator/coi_host.cpp: Likewise. * runtime/emulator/coi_host.h: Likewise. * runtime/emulator/coi_version_asm.h: Likewise. 
* runtime/emulator/coi_version_linker_script.map: Likewise. * runtime/liboffload_error.c: Likewise. * runtime/liboffload_error_codes.h: Likewise. * runtime/liboffload_msg.c: Likewise. * runtime/liboffload_msg.h: Likewise. * runtime/mic_lib.f90: Likewise. * runtime/offload.h: Likewise. * runtime/offload_common.cpp: Likewise. * runtime/offload_common.h: Likewise. * runtime/offload_engine.cpp: Likewise. * runtime/offload_engine.h: Likewise. * runtime/offload_env.cpp: Likewise. * runtime/offload_env.h: Likewise. * runtime/offload_host.cpp: Likewise. * runtime/offload_host.h: Likewise. * runtime/offload_iterator.h: Likewise. * runtime/offload_omp_host.cpp: Likewise. * runtime/offload_omp_target.cpp: Likewise. * runtime/offload_orsl.cpp: Likewise. * runtime/offload_orsl.h: Likewise. * runtime/offload_table.cpp: Likewise. * runtime/offload_table.h: Likewise. * runtime/offload_target.cpp: Likewise. * runtime/offload_target.h: Likewise. * runtime/offload_target_main.cpp: Likewise. * runtime/offload_timer.h: Likewise. * runtime/offload_timer_host.cpp: Likewise. * runtime/offload_timer_target.cpp: Likewise. * runtime/offload_trace.cpp: Likewise. * runtime/offload_trace.h: Likewise. * runtime/offload_util.cpp: Likewise. * runtime/offload_util.h: Likewise. * runtime/ofldbegin.cpp: Likewise. * runtime/ofldend.cpp: Likewise. * runtime/orsl-lite/include/orsl-lite.h: Likewise. * runtime/orsl-lite/lib/orsl-lite.c: Likewise. * runtime/orsl-lite/version.txt: Likewise.
2022-11-03 | OpenMP/Fortran: 'target update' with DT components | Tobias Burnus | 2 | -0/+234

OpenMP 5.0 permits using arrays with derived type components for the list items to the 'from'/'to' clauses of the 'target update' directive.

gcc/fortran/ChangeLog:
	* openmp.cc (gfc_match_omp_clauses): Permit derived types for the 'to' and 'from' clauses of 'target update'.
	* trans-openmp.cc (gfc_trans_omp_clauses): Fixes for derived-type changes; fix size for scalars.

libgomp/ChangeLog:
	* testsuite/libgomp.fortran/target-11.f90: New test.
	* testsuite/libgomp.fortran/target-13.f90: New test.
2022-11-02 | Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] | Thomas Schwinge | 2 | -27/+146

	PR libgomp/106643
	PR fortran/96668

libgomp/
	* oacc-mem.c (goacc_enter_data_internal): Support OpenACC 'declare create' with Fortran allocatable arrays, part II.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90: Adjust.
	* testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.
2022-11-02Support OpenACC 'declare create' with Fortran allocatable arrays, part I ↵Thomas Schwinge2-0/+680
[PR106643] PR libgomp/106643 libgomp/ * oacc-mem.c (goacc_enter_data_internal): Support OpenACC 'declare create' with Fortran allocatable arrays, part I. * testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90: New. * testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'Thomas Schwinge1-0/+402
libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'Thomas Schwinge1-0/+278
... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted for missing support for OpenACC "Changes from Version 2.0 to 2.5": "The 'declare create' directive with a Fortran 'allocatable' has new behavior". Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete' manually. libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'Cesar Philippidis1-0/+268
libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-10-28OpenACC: Don't gang-privatize artificial variables [PR90115]Julian Brown5-37/+22
This patch prevents compiler-generated artificial variables from being treated as privatization candidates for OpenACC. The rationale is that e.g. "gang-private" variables actually must be shared by each worker and vector spawned within a particular gang, but that sharing is not necessary for any compiler-generated variable (at least at present, but no such need is anticipated either). Variables on the stack (and machine registers) are already private per-"thread" (gang, worker and/or vector), and that's fine for artificial variables. We're restricting this to blocks, as we still need to understand what it means for a 'DECL_ARTIFICIAL' to appear in a 'private' clause. Several tests need their scan output patterns adjusted to compensate. 2022-10-14 Julian Brown <julian@codesourcery.com> PR middle-end/90115 gcc/ * omp-low.cc (oacc_privatization_candidate_p): Artificial vars are not privatization candidates. libgomp/ * testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output. * testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise. * testsuite/libgomp.oacc-fortran/if-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/print-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-10-21Restore 'libgomp.oacc-c-c++-common/nvptx-sese-1.c' SESE regions checking ↵Thomas Schwinge1-1/+1
[PR107195, PR107344] That is, adjust for optimization introduced with recent commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04 "[PR107195] Set range to zero when nonzero mask is 0", where GCC now understands that after 'r *= 2;', 'r & 1' will never hold here, and thus transforms/optimizes/"disturbs" the original code such that GCC/nvptx's later "Neuter whole SESE regions" optimization no longer is applicable to it: UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}" Same for C++. It's unclear to me if this is an actual "problem", which optimization is "more important", so I've filed PR107344 "GCC/nvptx SESE region optimization" to capture this question, and here restore what we intend to be testing (to my understanding) in 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'. PR tree-optimization/107195 PR target/107344 libgomp/ * testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c: Restore SESE regions checking.
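The range fact mentioned in the commit above (after 'r *= 2;', 'r & 1' never holds) can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from 'nvptx-sese-1.c'; the sketch only shows why a range-aware optimizer may fold such a test away.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not the test's actual code: after doubling, the
   low bit of r is always clear, so this function always returns false.
   An optimizer that tracks nonzero-bit masks can therefore remove any
   branch guarded by the result.  */
bool
low_bit_set_after_doubling (unsigned int r)
{
  r *= 2;                 /* Unsigned wraparound is well defined in C.  */
  return (r & 1) != 0;
}

/* Spot-check the invariant for a few inputs, including wraparound.  */
void
check_low_bit_invariant (void)
{
  assert (!low_bit_set_after_doubling (0u));
  assert (!low_bit_set_after_doubling (7u));
  assert (!low_bit_set_after_doubling (~0u));
}
```

Once the branch is folded, the control flow the nvptx back end sees differs from what the test was written to exercise, which is presumably why the scan pattern stopped matching.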
2022-10-20libgomp: Add offload_device_gcn check, add requires-4a.c testTobias Burnus3-0/+64
Duplicate libgomp.c-c++-common/requires-4.c (as ...-4a.c) but using heap-allocated instead of static memory for a variable. This change and the added offload_device_gcn check prepare for pseudo-USM, where the device hardware cannot access all host memory but only managed and pinned memory; for those, requires-4.c will fail and the new check permits adding target { ! { offload_device_nvptx || offload_device_gcn } } to requires-4.c; however, it has not been added yet as pseudo-USM support is not yet on mainline. (Review is pending for the USM patches.) include/ChangeLog: * gomp-constants.h (GOMP_DEVICE_HSA): Comment out unused define. libgomp/ChangeLog: * testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn): New. * testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn, on_device_arch_gcn): New. * testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from requires-4.c but using heap-allocated memory.
2022-10-20Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421]Thomas Schwinge1-0/+100
After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5 "amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]", "big" private data now works for GCN offloading, too. PR target/105421 libgomp/ * testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.
2022-10-17Fix nvptx-specific '-foffload-options' syntax in ↵Thomas Schwinge1-1/+1
'libgomp.c/reverse-offload-sm30.c' That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too. Fix test case added in recent commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible". libgomp/ * testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific '-foffload-options' syntax.
2022-10-13libgomp: Add Fortran testcases for omp_in_explicit_taskTobias Burnus7-0/+247
Fortranized testcases of commits r13-3257-ga58a965eb73 and r13-3258-g0ec4e93fb9f. libgomp/ChangeLog: * testsuite/libgomp.fortran/task-7.f90: New test. * testsuite/libgomp.fortran/task-8.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-1.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-2.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-3.f90: New test. * testsuite/libgomp.fortran/task-reduction-17.f90: New test. * testsuite/libgomp.fortran/task-reduction-18.f90: New test.
2022-10-12libgomp: Add omp_in_explicit_task supportJakub Jelinek3-0/+168
This is pretty straightforward, if gomp_thread ()->task is NULL, it can't be explicit task, otherwise if gomp_thread ()->task->kind == GOMP_TASK_IMPLICIT, it is an implicit task, otherwise explicit task. 2022-10-12 Jakub Jelinek <jakub@redhat.com> * omp.h.in (omp_in_explicit_task): Declare. * omp_lib.h.in (omp_in_explicit_task): Likewise. * omp_lib.f90.in (omp_in_explicit_task): New interface. * libgomp.map (OMP_5.2): New symbol version, export omp_in_explicit_task and omp_in_explicit_task_. * task.c (omp_in_explicit_task): New function. * fortran.c (omp_in_explicit_task): Add ialias_redirect. (omp_in_explicit_task_): New function. * libgomp.texi (OpenMP 5.2): Mark omp_in_explicit_task as implemented. * testsuite/libgomp.c-c++-common/task-in-explicit-1.c: New test. * testsuite/libgomp.c-c++-common/task-in-explicit-2.c: New test. * testsuite/libgomp.c-c++-common/task-in-explicit-3.c: New test.
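The decision described in the commit above can be sketched in a few lines of C. The types and names here ('struct task', 'struct thread', 'TASK_IMPLICIT', 'in_explicit_task') are simplified stand-ins assumed for illustration; they are not libgomp's real definitions, which live in libgomp.h and task.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for libgomp's internal task bookkeeping.  */
enum task_kind { TASK_IMPLICIT, TASK_EXPLICIT };

struct task { enum task_kind kind; };
struct thread { struct task *task; };

/* Mirrors the three cases from the commit message: no task at all,
   an implicit task, or anything else (treated as explicit).  */
bool
in_explicit_task (const struct thread *thr)
{
  assert (thr != NULL);
  if (thr->task == NULL)
    return false;                 /* No task: cannot be explicit.  */
  return thr->task->kind != TASK_IMPLICIT;
}
```

A caller would pass the current thread's descriptor; a thread with a NULL task or an implicit task yields false, and any other task kind yields true.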
2022-10-12libgomp: Fix up creation of artificial teamsJakub Jelinek4-0/+93
When not in explicit parallel/target/teams construct, we in some cases create an artificial parallel with a single thread (either to handle target nowait or for task reduction purposes). In those cases, it handled again artificially created implicit task (created by gomp_new_icv for cases where we needed to write to some ICVs), but as the testcases show, didn't take into account possibility of this being done from explicit task(s). The code would destroy/free the previous task and replace it with the new implicit task. If task is an explicit task (when teams is NULL, all explicit tasks behave like if (0)), it is a pointer to a local stack variable, so freeing it doesn't work, and additionally we shouldn't lose the explicit tasks - the new implicit task should instead replace the ancestor task which is the first implicit one. 2022-10-12 Jakub Jelinek <jakub@redhat.com> * task.c (gomp_create_artificial_team): Fix up handling of invocations from within explicit task. * target.c (GOMP_target_ext): Likewise. * testsuite/libgomp.c/task-7.c: New test. * testsuite/libgomp.c/task-8.c: New test. * testsuite/libgomp.c-c++-common/task-reduction-17.c: New test. * testsuite/libgomp.c-c++-common/task-reduction-18.c: New test.
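The fix described above replaces "the ancestor task which is the first implicit one" rather than the current (possibly explicit, stack-allocated) task. A minimal sketch of that ancestor search follows, assuming a simplified task structure with a parent link; the names are hypothetical and the real logic in gomp_create_artificial_team is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a task with a link to its parent task.  */
enum task_kind { TASK_IMPLICIT, TASK_EXPLICIT };

struct task
{
  enum task_kind kind;
  struct task *parent;
};

/* Walk up from T through any explicit tasks and return the first
   implicit task found, or NULL if the chain contains none.  */
struct task *
first_implicit_ancestor (struct task *t)
{
  while (t != NULL && t->kind != TASK_IMPLICIT)
    t = t->parent;
  return t;
}

/* Small self-check: two nested explicit tasks under one implicit task.  */
void
check_first_implicit_ancestor (void)
{
  struct task root = { TASK_IMPLICIT, NULL };
  struct task outer = { TASK_EXPLICIT, &root };
  struct task inner = { TASK_EXPLICIT, &outer };
  assert (first_implicit_ancestor (&inner) == &root);
}
```

Only the task returned by such a search would be a candidate for replacement; the explicit tasks on the way up are left untouched, so they are neither freed nor lost.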
2022-09-30Fortran: Update use_device_ptr for OpenMP 5.1 [PR105318]Tobias Burnus1-0/+159
OpenMP 5.1 added has_device_addr and relaxed the restrictions for is_device_ptr, including processing non-type(c_ptr) arguments as if has_device_addr was used. (There is a semantic difference.) For completeness, the same kind of change was made for 'use_device_ptr', where non-type(c_ptr) arguments now use use_device_addr. Finally, a warning for 'device(omp_{initial,invalid}_device)' was silenced along the way, as it affected the new testcase. PR fortran/105318 gcc/fortran/ChangeLog: * openmp.cc (resolve_omp_clauses): Update is_device_ptr restrictions for OpenMP 5.1 and map to has_device_addr where applicable; map use_device_ptr to use_device_addr where applicable. Silence integer-range warning for device(omp_{initial,invalid}_device). libgomp/ChangeLog: * testsuite/libgomp.fortran/is_device_ptr-2.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/is_device_ptr-1.f90: Remove dg-error. * gfortran.dg/gomp/is_device_ptr-2.f90: Likewise. * gfortran.dg/gomp/is_device_ptr-3.f90: Update tree-scan-dump.
2022-09-24openmp, c: Tighten up c_tree_equal [PR106981]Jakub Jelinek1-0/+19
This patch changes c_tree_equal to work more like cp_tree_equal, be more strict in what it accepts. The ICE on the first testcase was due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which ICEs if the two constants have different precision, but as the second testcase shows, being too lenient in it can also lead to miscompilation of valid OpenMP programs where we think certain expression is the same even when it isn't and can be guaranteed at runtime to represent different memory location. So, the patch looks through only NON_LVALUE_EXPRs and for constants as well as casts requires that the types match before actually comparing the constant values or recursing on the cast operands. 2022-09-24 Jakub Jelinek <jakub@redhat.com> PR c/106981 gcc/c/ * c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and t2 have different types. gcc/testsuite/ * c-c++-common/gomp/pr106981.c: New test. libgomp/ * testsuite/libgomp.c-c++-common/pr106981.c: New test.
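The rule from the commit above (require matching types before comparing constant values) can be sketched as follows. This is an analogy only: 'struct int_cst' and 'int_cst_equal' are hypothetical, and precision stands in for the full type check that c_tree_equal performs on real trees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer constant node: a value together
   with the bit precision of its type.  */
struct int_cst
{
  unsigned int precision;
  unsigned long long value;
};

/* Compare two constants strictly: a mismatch in precision means
   "not equal" and the values are never compared, which avoids both
   the mixed-precision comparison and false positives.  */
bool
int_cst_equal (const struct int_cst *a, const struct int_cst *b)
{
  assert (a != NULL && b != NULL);
  if (a->precision != b->precision)
    return false;
  return a->value == b->value;
}
```

With this shape, a 32-bit constant 1 and a 64-bit constant 1 compare unequal, which is the conservative answer the stricter c_tree_equal is meant to give.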
2022-09-14OpenMP/OpenACC struct sibling list gimplification extension and reworkJulian Brown4-0/+483
This patch refactors struct sibling-list processing in gimplify.cc, and adjusts some related mapping-clause processing in the Fortran FE and omp-low.cc accordingly. 2022-09-13 Julian Brown <julian@codesourcery.com> gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER mappings for POINTER_TYPE_P decls. gcc/ * gimplify.cc (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS. (GOMP_FIRSTPRIVATE_IMPLICIT): Renumber. (insert_struct_comp_map): Refactor function into... (build_omp_struct_comp_nodes): This new function. Remove list handling and improve self-documentation. (extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move code to strip outer parts of address out of function, but strip no-op conversions. (omp_mapping_group): Add DELETED field for use during reindexing. (omp_strip_components_and_deref, omp_strip_indirections): New functions. (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling. (omp_gather_mapping_groups): Initialise DELETED field for new groups. (omp_index_mapping_groups): Notice DELETED groups when (re)indexing. (omp_siblist_insert_node_after, omp_siblist_move_node_after, omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New helper functions. (omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT node groups for sibling lists. Outlined from gimplify_scan_omp_clauses. (omp_build_struct_sibling_lists): New function. (gimplify_scan_omp_clauses): Remove struct_map_to_clause, struct_seen_clause, struct_deref_set. Call omp_build_struct_sibling_lists as pre-pass instead of handling sibling lists in the function's main processing loop. (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS handling, unused now. * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect struct references, and references to pointers to structs also. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: New test. 
* g++.dg/gomp/member-array-omp.C: New test. * g++.dg/gomp/target-3.C: Update expected output. * g++.dg/gomp/target-lambda-1.C: Likewise. * g++.dg/gomp/target-this-2.C: Likewise. * c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here. * c-c++-common/gomp/target-50.c: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test. * testsuite/libgomp.oacc-c++/deep-copy-17.C: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move test to here, make "run" test.
2022-09-12nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possibleTobias Burnus6-0/+21
Reverse offload requires at least -misa=sm_35; with this patch, a warning instead of an error is shown, still permitting reverse offload for all other configured device types. This is achieved by not calling GOMP_offload_register_ver (and by no longer generating pointless 'static const char' variables, once this is known). The tool_name as progname change adds "nvptx " and "gcn " to the "mkoffload: warning/error:" diagnostic. gcc/ChangeLog: * config/nvptx/mkoffload.cc (process): Replace a fatal_error by a warning + not enabling offloading if -misa=sm_30 prevents reverse offload. (main): Use tool_name as progname for diagnostic. * config/gcn/mkoffload.cc (main): Likewise. libgomp/ChangeLog: * libgomp.texi (Offload-Target Specifics: nvptx): Document that reverse offload requires >= -march=sm_35. * testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx with -misa=sm_35. * testsuite/libgomp.c-c++-common/requires-5.c: Likewise. * testsuite/libgomp.c-c++-common/requires-6.c: Likewise. * testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise. * testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise. * testsuite/libgomp.c/reverse-offload-sm30.c: New test.
2022-09-12libgomp: Fix up icv-6.c [PR106894]Jakub Jelinek1-9/+17
The thing is, make check or make check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'" in libgomp obj dir work fine, but make -j32 -k check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'" fails. The thing is that the testcase as written relies on OMP_NUM_THREADS not being set in environment (as it takes priority over OMP_NUM_THREADS_ALL for the host). So, if either a user has OMP_NUM_THREADS=42 in the environment by himself, or when doing make check with -jN, we trigger: if test $$num_cpus -gt 8 && test -z "$$OMP_NUM_THREADS"; then \ OMP_NUM_THREADS=8; export OMP_NUM_THREADS; \ echo @@@ libgomp OMP_NUM_THREADS adjusted to 8 because of parallel make check and too many CPUs; \ fi; \ in libgomp/testsuite/Makefile.am and so the test fails. 2022-09-12 Jakub Jelinek <jakub@redhat.com> PR libgomp/106894 * testsuite/libgomp.c-c++-common/icv-6.c: Include string.h. (main): Avoid tests for which corresponding non-_ALL suffixed variable is in the environment, or for OMP_NUM_TEAMS on the device OMP_NUM_TEAMS_DEV_?.
2022-09-08OpenMP, libgomp: Environment variable syntax extensionMarcel Vollweiler6-0/+263
This patch considers the environment variable syntax extension for device-specific variants of environment variables from OpenMP 5.1 (see OpenMP 5.1 specification, p. 75 and p. 639). An environment variable (e.g. OMP_NUM_TEAMS) can have different suffixes: _DEV (e.g. OMP_NUM_TEAMS_DEV): affects all devices but not the host. _DEV_<device> (e.g. OMP_NUM_TEAMS_DEV_42): affects only device with number <device>. no suffix (e.g. OMP_NUM_TEAMS): affects only the host. Future OpenMP versions will also introduce the suffix _ALL (see discussion https://github.com/OpenMP/spec/issues/3179). This is also considered in this patch: _ALL (e.g. OMP_NUM_TEAMS_ALL): affects all devices and the host. The precedence is as follows (descending). For the host: 1. no suffix 2. _ALL For devices: 1. _DEV_<device> 2. _DEV 3. _ALL That means, _DEV_<device> is used whenever available. Otherwise _DEV is used if available, and finally _ALL. If there is no value for any of the variable variants, default values are used as before. This patch concerns parsing (a), storing (b), output (c) and transmission to the device (d): (a) The actual number of devices and the numbering are not known when parsing the environment variables. Thus all environment variables are iterated and searched for device-specific ones. (b) Only configured device-specific variables are stored. Thus, a linked list is used. (c) The output is done in omp_display_env (see specification p. 468f). Global ICVs are tagged with [all], see https://github.com/OpenMP/spec/issues/3179. ICVs which are not global but aren't handled device-specific yet are tagged with [host]. omp_display_env outputs the initial values of the ICVs. That is why a dedicated data structure is introduced for the initial values only (gomp_initial_icv_list). (d) Device-specific ICVs are transmitted to the device via GOMP_ADDITIONAL_ICVS. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_default_device): Return device- specific ICV. 
(omp_get_max_teams): Added for GCN devices. (omp_set_num_teams): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_default_device): Return device- specific ICV. (omp_get_max_teams): Added for NVPTX devices. (omp_set_num_teams): Likewise. (ialias): Likewise. * env.c (struct gomp_icv_list): New struct to store entries of initial ICV values. (struct gomp_offload_icv_list): New struct to store entries of device- specific ICV values that are copied to the device and back. (struct gomp_default_icv_values): New struct to store default values of ICVs according to the OpenMP standard. (parse_schedule): Generalized for different variants of OMP_SCHEDULE. (print_env_var_error): Function that prints an error for invalid values for ICVs. (parse_unsigned_long_1): Removed getenv. Generalized. (parse_unsigned_long): Likewise. (parse_int_1): Likewise. (parse_int): Likewise. (parse_int_secure): Likewise. (parse_unsigned_long_list): Likewise. (parse_target_offload): Likewise. (parse_bind_var): Likewise. (parse_stacksize): Likewise. (parse_boolean): Likewise. (parse_wait_policy): Likewise. (parse_allocator): Likewise. (omp_display_env): Extended to output different variants of environment variables. (print_schedule): New helper function for omp_display_env which prints the values of run_sched_var. (print_proc_bind): New helper function for omp_display_env which prints the values of proc_bind_var. (enum gomp_parse_type): Collection of types used for parsing environment variables. (ENTRY): Preprocess string lengths of environment variables. (OMP_VAR_CNT): Preprocess table size. (OMP_HOST_VAR_CNT): Likewise. (INT_MAX_STR_LEN): Constant for the maximal number of digits of a device number. (gomp_get_icv_flag): Returns if a flag for a particular ICV is set. (gomp_set_icv_flag): Sets a flag for a particular ICV. (print_device_specific_icvs): New helper function for omp_display_env to print device specific ICV values. 
(get_device_num): New helper function for parse_device_specific. Extracts the device number from an environment variable name. (get_icv_member_addr): Gets the memory address for a particular member of an ICV struct. (gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list. (initialize_icvs): New function to initialize a gomp_initial_icvs struct. (add_initial_icv_to_list): Adds an ICV struct to gomp_initial_icv_list. (startswith): Checks if a string starts with a given prefix. (initialize_env): Extended to parse the new syntax of environment variables. * icv-device.c (omp_get_max_teams): Added. (ialias): Likewise. (omp_set_num_teams): Likewise. * icv.c (omp_set_num_teams): Moved to icv-device.c. (omp_get_max_teams): Likewise. (ialias): Likewise. * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Removed. (GOMP_ADDITIONAL_ICVS): New target-side struct that holds the designated ICVs of the target device. * libgomp.h (enum gomp_icvs): Collection of ICVs. (enum gomp_device_num): Definition of device numbers for _ALL, _DEV, and no suffix. (enum gomp_env_suffix): Collection of possible suffixes of environment variables. (struct gomp_initial_icvs): Contains all ICVs for which we need to store initial values. (struct gomp_default_icv):New struct to hold ICVs for which we need to store initial values. (struct gomp_icv_list): Definition of a linked list that is used for storing ICVs for the devices and also for _DEV, _ALL, and without suffix. (struct gomp_offload_icvs): New struct to hold ICVs that are copied to a device. (struct gomp_offload_icv_list): Definition of a linked list that holds device-specific ICVs that are copied to devices. (gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list. (gomp_get_icv_flag): Returns if a flag for a particular ICV is set. * libgomp.texi: Updated. * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Extended to read further ICVs from the offload image. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise. 
* target.c (gomp_get_offload_icv_item): Get a list item of gomp_offload_icv_list. (get_gomp_offload_icvs): New. Returns the ICV values depending on the device num and the variable hierarchy. (gomp_load_image_to_device): Extended to copy further ICVs to a device. * testsuite/libgomp.c-c++-common/icv-5.c: New test. * testsuite/libgomp.c-c++-common/icv-6.c: New test. * testsuite/libgomp.c-c++-common/icv-7.c: New test. * testsuite/libgomp.c-c++-common/icv-8.c: New test. * testsuite/libgomp.c-c++-common/omp-display-env-1.c: New test. * testsuite/libgomp.c-c++-common/omp-display-env-2.c: New test.
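The suffix precedence described in the "Environment variable syntax extension" commit above (host: no suffix, then _ALL; devices: _DEV_<device>, then _DEV, then _ALL) can be sketched as a lookup over a small name/value table. This is a sketch of the precedence only and makes several assumptions: 'struct env_entry', 'env_lookup' and 'resolve_icv_env' are hypothetical names, a negative device number stands for the host, and libgomp itself parses the real environment into linked lists rather than doing lookups like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the environment: a flat name/value table.  */
struct env_entry
{
  const char *name;
  const char *value;
};

/* Return the value stored under NAME, or NULL if it is not set.  */
static const char *
env_lookup (const struct env_entry *env, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (env[i].name, name) == 0)
      return env[i].value;
  return NULL;
}

/* Resolve BASE (e.g. "OMP_NUM_TEAMS") for DEVICE; a negative DEVICE
   means the host.  Returns NULL when no variant is set, in which case
   a caller would fall back to the default value.  */
const char *
resolve_icv_env (const struct env_entry *env, size_t n,
                 const char *base, int device)
{
  char name[128];
  const char *val;

  assert (base != NULL);
  if (device < 0)
    {
      /* Host: the unsuffixed variable wins over _ALL.  */
      if ((val = env_lookup (env, n, base)) != NULL)
        return val;
    }
  else
    {
      /* Device: _DEV_<device> first, then _DEV.  */
      snprintf (name, sizeof name, "%s_DEV_%d", base, device);
      if ((val = env_lookup (env, n, name)) != NULL)
        return val;
      snprintf (name, sizeof name, "%s_DEV", base);
      if ((val = env_lookup (env, n, name)) != NULL)
        return val;
    }
  /* Last resort for both host and devices: _ALL.  */
  snprintf (name, sizeof name, "%s_ALL", base);
  return env_lookup (env, n, name);
}
```

The table-based lookup keeps the sketch independent of the process environment; note that the unsuffixed variable is never consulted for a device, matching the rule that it affects only the host.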
2022-09-08openmp: Implement doacross(sink: omp_cur_iteration - 1)Jakub Jelinek4-0/+888
This patch implements doacross(sink: omp_cur_iteration - 1) that the previous patchset emitted a sorry on during omp expansion. It can be implemented with existing library functions. To recap, depend(source)/doacross(source:)/doacross(source:omp_cur_iteration) is implemented calling GOMP_doacross_post or GOMP_doacross_ull_post, called with an array of long or unsigned long long elements, one for all collapsed loops together and one for each further ordered loop if any. We initialize that array in each thread when grabbing further set of iterations and update it at the end of loops, so that it represents the current iteration (as 0 based counters). When the worksharing loop is created, we tell the library through another similar array the counts (the loop needs to be rectangular) in each dimension, first element is count of all logical iterations in the collapsed loops. depend(sink:v1 op N1, v2 op N2, ...) is then implemented by conditionally calling GOMP_doacross_wait/GOMP_doacross_ull_wait. For N? of 0 there is no check, otherwise if it wants to wait in a particular dimension for a previous iteration, we check that the corresponding iterator isn't the first one (or first few), where the previous iterator in that dimension would be out of range, and similarly for checking of next iteration in a dimension that it isn't the last one (or last few) where it would be similarly out of bounds. Then the collapsed loop counters are folded into a single 0 based counter (first argument) and then other 0 based iterations counters on what iteration it should wait for. Now, doacross(sink: omp_cur_iteration - 1) is supposed to wait for the previous logical iteration in the combined iteration space of all ordered loops. For the very first iteration in that combined iteration space it does nothing, there is no previous iteration. 
And similarly it does nothing if there are more ordered loops than collapsed loops and it isn't the first logical iteration of the combined loops inside of the collapsed loops, because as implemented we know the previous iteration in that case is always executed by the same thread as the current one. In the implementation, we use the same value as is stored in the first element of the array for GOMP_doacross_post/GOMP_doacross_ull_post, if that value is 0, we do nothing. The rest is different based on if ordered argument is equal to collapse or not. If it is, then we otherwise call GOMP_doacross_wait/GOMP_doacross_ull_wait with a single argument, one less than that counter we compare against 0. If ordered argument is bigger than collapse, we add a per-thread boolean variable .first.N, which we set to true at the start of the outermost ordered loop inside of the collapsed set of loops and set to false at the end of the innermost ordered loop. If .first.N is false, we don't do anything (we know the previous iteration was handled by the current thread and by my reading of the spec we don't need to emit even a memory barrier in that case, because it is just synchronization with the same thread), otherwise we call GOMP_doacross_wait/GOMP_doacross_ull_wait with the first argument one less than the counter we compare against 0, and then one less than 2nd and following counts of iterations we pass to the workshare initialization. If say .counts.N passed to the workshare initialization is { 256, 13, 5, 2 } for collapse(3) ordered(6) loop, then GOMP_doacross_post/GOMP_doacross_ull_post is called with arguments equal to .ordereda.N[0] - 1, 12, 4, 1. 2022-09-08 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-expand.cc (expand_omp_ordered_sink): Add CONT_BB argument. Add doacross(sink:omp_cur_iteration-1) support. (expand_omp_ordered_source_sink): Clear counts[fd->ordered + 1]. Adjust expand_omp_ordered_sink caller. 
(expand_omp_for_ordered_loops): If counts[fd->ordered + 1] is non-NULL, set that variable to true at the start of outermost non-collapsed loop and set it to false at the end of innermost ordered loop. (expand_omp_for_generic): If fd->ordered, allocate 1 + (fd->ordered - fd->collapse) further elements in counts array. Copy to counts + 2 + fd->ordered the counts of fd->collapse .. fd->ordered - 1 loop if any. gcc/testsuite/ * c-c++-common/gomp/doacross-7.c: New test. libgomp/ * libgomp.texi (OpenMP 5.2): Mention that omp_cur_iteration is now fully supported. * testsuite/libgomp.c/doacross-4.c: New test. * testsuite/libgomp.c/doacross-5.c: New test. * testsuite/libgomp.c/doacross-6.c: New test. * testsuite/libgomp.c/doacross-7.c: New test.
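The argument computation described in the doacross commit above can be sketched as a plain C function. This models only the decision logic from the commit message, under stated assumptions: 'prev_iteration_wait_args' is a hypothetical name, 'counts' plays the role of .counts.N, 'cur' is the value stored in the first element of the post array, and 'first' plays the role of .first.N. The code that omp-expand.cc actually emits is GIMPLE, not a library call like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fill OUT with the arguments a wait on the previous logical iteration
   would receive and return how many were written; 0 means no wait is
   needed.  NCOUNTS is 1 when ordered == collapse, and larger when
   there are further ordered loops inside the collapsed ones.  */
size_t
prev_iteration_wait_args (const unsigned long *counts, size_t ncounts,
                          unsigned long cur, bool first,
                          unsigned long *out)
{
  assert (counts != NULL && out != NULL && ncounts >= 1);

  /* Very first logical iteration: there is nothing to wait for.  */
  if (cur == 0)
    return 0;

  /* ordered > collapse and not the first inner iteration: the previous
     iteration was run by this same thread, so no wait is needed.  */
  if (ncounts > 1 && !first)
    return 0;

  /* Previous iteration of the collapsed loops...  */
  out[0] = cur - 1;
  /* ...and the last iteration of every further ordered loop.  */
  for (size_t i = 1; i < ncounts; i++)
    out[i] = counts[i] - 1;
  return ncounts;
}
```

For the { 256, 13, 5, 2 } example from the commit message, a nonzero 'cur' with 'first' set yields the arguments cur - 1, 12, 4, 1.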
2022-08-26OpenMP: Support reverse offload (middle end part)Tobias Burnus4-0/+193
gcc/ChangeLog: * internal-fn.cc (expand_GOMP_TARGET_REV): New. * internal-fn.def (GOMP_TARGET_REV): New. * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark 'omp target device_ancestor_host' as in_other_partition and don't error if absent. * omp-low.cc (create_omp_child_function): Mark as 'noclone'. * omp-expand.cc (expand_omp_target): For reverse offload, remove sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create empty-body nohost function. * omp-offload.cc (execute_omp_device_lower): Handle IFN_GOMP_TARGET_REV. (pass_omp_target_link::execute): For ACCEL_COMPILER, don't nullify fn argument for reverse offload libgomp/ChangeLog: * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but refer to 'requires'. * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test. * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test. * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test. * testsuite/libgomp.fortran/reverse-offload-1.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry. * c-c++-common/gomp/target-device-ancestor-4.c: Likewise. * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise. * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise. * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to scan-tree-dump-times. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise. * c-c++-common/goacc/kernels-loop-2.c: Likewise. * c-c++-common/goacc/kernels-loop-3.c: Likewise. * c-c++-common/goacc/kernels-loop-data-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise. 
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise. * c-c++-common/goacc/kernels-loop-data.c: Likewise. * c-c++-common/goacc/kernels-loop-g.c: Likewise. * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. * c-c++-common/goacc/kernels-loop-n.c: Likewise. * c-c++-common/goacc/kernels-loop-nest.c: Likewise. * c-c++-common/goacc/kernels-loop.c: Likewise. * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise. * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/kernels-loop-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data.f95: Likewise. * gfortran.dg/goacc/kernels-loop-n.f95: Likewise. * gfortran.dg/goacc/kernels-loop.f95: Likewise. * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise.
2022-08-17OpenMP: Fix var replacement with 'simd' and linear-step vars [PR106548]Tobias Burnus1-0/+254
gcc/ChangeLog: PR middle-end/106548 * omp-low.cc (lower_rec_input_clauses): Use build_outer_var_ref for 'simd' linear-step values that are variable. libgomp/ChangeLog: PR middle-end/106548 * testsuite/libgomp.c/linear-2.c: New test.
2022-07-29Add libgomp.c-c++-common/pr106449-2.cTobias Burnus1-0/+64
This run-time test tests pointer-based iteration with collapse, similar to the '(parallel) simd' test for PR106449 but for 'for'. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/pr106449-2.c: New test.
2022-07-29openmp: Fix up handling of non-rectangular simd loops with pointer type ↵Jakub Jelinek1-0/+62
iterators [PR106449] There were 2 issues visible on this new testcase. One was that we didn't have special POINTER_TYPE_P handling in a few spots of expand_omp_simd: for pointers we need to use POINTER_PLUS_EXPR and need to have the non-pointer part in sizetype; for non-rectangular loops, on the other hand, we can rely on multiplication factor 1, since pointers can't be multiplied. Without those changes we'd ICE. The other issue was that we put the n2 expression directly into a comparison in a condition and regimplified that; for the &a[512] case, with gimplification being destructive, that unfortunately meant modification of the original fd->loops[?].n2. Fixed by unsharing the expression. This was causing a runtime failure on the testcase. 2022-07-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/106449 * omp-expand.cc (expand_omp_simd): Fix up handling of pointer iterators in non-rectangular simd loops. Unshare fd->loops[i].n2 or n2 before regimplifying it inside of a condition. * testsuite/libgomp.c-c++-common/pr106449.c: New test.
2022-07-12XFAIL 'offloading_enabled' diagnostics issue in ↵Thomas Schwinge1-3/+4
'libgomp.oacc-c-c++-common/reduction-5.c' [PR101551] Fix-up for recent commit 06b2a2abe26554c6f9365676683d67368cbba206 "Enhance '_Pragma' diagnostics verification in OMP C/C++ test cases". Supposedly it's the same issue as in <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101551#c2>, where I'd noted that: | [...] with an offloading-enabled build of GCC we're losing | "note: in expansion of macro '[...]'" diagnostics. | (Effectively '-ftrack-macro-expansion=0'?) PR middle-end/101551 libgomp/ * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: XFAIL 'offloading_enabled' diagnostics issue.
2022-07-11Enhance '_Pragma' diagnostics verification in OMP C/C++ test casesThomas Schwinge1-3/+5
Follow-up to recent commit 0587cef3d7962a8b0f44779589ba2920dd3d71e5 "c: Fix location for _Pragma tokens [PR97498]". gcc/testsuite/ * c-c++-common/gomp/pragma-3.c: Enhance '_Pragma' diagnostics verification. * c-c++-common/gomp/pragma-5.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Enhance '_Pragma' diagnostics verification.