path: root/gcc
Age | Commit message | Author | Files | Lines
2022-06-21 | graphite: Extend SCoP detection dump output | Frederik Harwath | 2 files | -23/+175
Extend the dump output so that GCC developers can more easily understand why Graphite refuses to include a loop in a SCoP. gcc/ChangeLog: * graphite-scop-detection.cc (scop_detection::can_represent_loop): Output reason for failure to dump file. (scop_detection::harmful_loop_in_region): Likewise. (scop_detection::graphite_can_represent_expr): Likewise. (scop_detection::stmt_has_simple_data_refs_p): Likewise. (scop_detection::stmt_simple_for_scop_p): Likewise. (print_sese_loop_numbers): New function. (scop_detection::add_scop): Use from here to print loops in rejected SCoP.
2022-06-21 | openacc: Move pass_oacc_device_lower after pass_graphite | Frederik Harwath | 43 files | -79/+417
The OpenACC device lowering pass must run after the Graphite pass to allow for the use of Graphite for automatic parallelization of kernels regions in the future. Experimentation has shown that it is best, performancewise, to run pass_oacc_device_lower together with the related passes pass_oacc_loop_designation and pass_oacc_gimple_workers early after pass_graphite in pass_tree_loop, at least if the other tree loop passes are not adjusted. In particular, to enable vectorization which is crucial for GCN offloading, device lowering should happen before pass_vectorize. To bring the loops contained in the offloading functions into the shape expected by the loop vectorizer, we have to make sure that some passes that previously were executed only once before pass_tree_loop are also executed on the offloading functions. To ensure the execution of pass_oacc_device_lower if pass_tree_loop does not execute (no loops, no optimizations), we introduce two further copies of the pass to the pipeline that run if there are no loops or if no optimization is performed. gcc/ChangeLog: * omp-general.cc (oacc_get_fn_dim_size): Return 0 on missing "dims". * omp-offload.cc (pass_oacc_loop_designation::clone): New member function. (pass_oacc_gimple_workers::clone): Likewise. (pass_oacc_gimple_device_lower::clone): Likewise. * passes.cc (pass_data_no_loop_optimizations): New pass_data. (class pass_no_loop_optimizations): New pass. (make_pass_no_loop_optimizations): New function. * passes.def: Move pass_oacc_{loop_designation, gimple_workers, device_lower} into tree_loop, and add copies to pass_tree_no_loop and to new pass_no_loop_optimizations. Add copies of passes pass_ccp, pass_ipa_warn, pass_complete_unrolli, pass_backprop, pass_phiprop, pass_fix_loops after the OpenACC passes in pass_tree_loop. * tree-ssa-loop-ivcanon.cc (pass_complete_unroll::clone): New member function. (pass_complete_unrolli::clone): Likewise. * tree-ssa-loop.cc (pass_fix_loops::clone): Likewise. 
(pass_tree_loop_init::clone): Likewise. (pass_tree_loop_done::clone): Likewise. * tree-ssa-phiprop.cc (pass_phiprop::clone): Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust expected output to pass name changes due to the pass reordering and cloning. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise. gcc/testsuite/ChangeLog: * gcc.dg/goacc/loop-processing-1.c: Adjust expected output to pass name changes due to the pass reordering and cloning. * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * c-c++-common/goacc/routine-nohost-1.c: Likewise. * c-c++-common/unroll-1.c: Likewise. * c-c++-common/unroll-4.c: Likewise. * gcc.dg/goacc/loop-processing-1.c: Likewise. * gcc.dg/tree-ssa/backprop-1.c: Likewise. * gcc.dg/tree-ssa/backprop-2.c: Likewise. * gcc.dg/tree-ssa/backprop-3.c: Likewise. * gcc.dg/tree-ssa/backprop-4.c: Likewise. * gcc.dg/tree-ssa/backprop-5.c: Likewise. * gcc.dg/tree-ssa/backprop-6.c: Likewise. * gcc.dg/tree-ssa/cunroll-1.c: Likewise. * gcc.dg/tree-ssa/cunroll-3.c: Likewise. * gcc.dg/tree-ssa/cunroll-9.c: Likewise. * gcc.dg/tree-ssa/ldist-17.c: Likewise. * gcc.dg/tree-ssa/loop-38.c: Likewise. * gcc.dg/tree-ssa/pr21463.c: Likewise. * gcc.dg/tree-ssa/pr45427.c: Likewise. * gcc.dg/tree-ssa/pr61743-1.c: Likewise. * gcc.dg/unroll-2.c: Likewise. * gcc.dg/unroll-3.c: Likewise. * gcc.dg/unroll-4.c: Likewise. 
* gcc.dg/unroll-5.c: Likewise. * gcc.dg/vect/vect-profile-1.c: Likewise. * c-c++-common/goacc/device-lowering-debug-optimization.c: New test. * c-c++-common/goacc/device-lowering-no-loops.c: New test. * c-c++-common/goacc/device-lowering-no-optimization.c: New test. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2022-06-21 | Fortran: delinearize multi-dimensional array accesses | Sandra Loosemore | 16 files | -95/+292
The Fortran front end presently linearizes accesses to multi-dimensional arrays by combining the indices for the various dimensions into a series of explicit multiplies and adds with refactoring to allow CSE of invariant parts of the computation. Unfortunately this representation interferes with Graphite-based loop optimizations. It is difficult to recover the original multi-dimensional form of the access by the time loop optimizations run because parts of it have already been optimized away or into a form that is not easily recognizable, so it seems better to have the Fortran front end produce delinearized accesses to begin with, a set of nested ARRAY_REFs similar to the existing behavior of the C and C++ front ends. This is a long-standing problem that has previously been discussed e.g. in PR 14741 and PR61000. This patch is an initial implementation for explicit array accesses only; it doesn't handle the accesses generated during scalarization of whole-array or array-section operations, which follow a different code path. gcc/ * expr.cc (get_inner_reference): Handle NOP_EXPR like VIEW_CONVERT_EXPR. gcc/fortran/ * lang.opt (-param=delinearize=): New. * trans-array.cc (get_class_array_vptr): New, split from... (build_array_ref): ...here. (get_array_lbound, get_array_ubound): New, split from... (gfc_conv_array_ref): ...here. Additional code refactoring plus support for delinearization of the array access. gcc/testsuite/ * gfortran.dg/assumed_type_2.f90: Adjust patterns. * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise. * gfortran.dg/graphite/block-3.f90: Remove xfails. * gfortran.dg/graphite/block-4.f90: Likewise. * gfortran.dg/inline_matmul_24.f90: Adjust patterns. * gfortran.dg/no_arg_check_2.f90: Likewise. * gfortran.dg/pr32921.f: Likewise. * gfortran.dg/reassoc_4.f: Disable delinearization for this test. Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
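To make the difference between the two access forms concrete, here is a minimal C++ sketch. It is an analogy only, not the Fortran front end's actual output; the extents `ROWS` and `COLS` and both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative extents; these values are assumptions, not taken from the patch.
constexpr std::size_t ROWS = 4;
constexpr std::size_t COLS = 3;

// Linearized form: both indices are folded by hand into one flat offset,
// which hides the two-dimensional structure from later loop analyses.
double sum_linearized(const double *a) {
  assert(a != nullptr);
  double total = 0.0;
  for (std::size_t i = 0; i < ROWS; ++i)
    for (std::size_t j = 0; j < COLS; ++j)
      total += a[i * COLS + j];
  return total;
}

// Delinearized form: each dimension keeps its own subscript, comparable to
// the nested ARRAY_REFs that the C and C++ front ends produce.
double sum_delinearized(const double (&a)[ROWS][COLS]) {
  double total = 0.0;
  for (std::size_t i = 0; i < ROWS; ++i)
    for (std::size_t j = 0; j < COLS; ++j)
      total += a[i][j];
  return total;
}
```

Both functions read the same elements in the same order; only the second keeps the per-dimension subscripts visible to an optimizer.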
2022-06-21 | Fix gimple_debug_cfg declaration | Frederik Harwath | 2 files | -1/+6
Silence a warning. The argument type did not match the definition. gcc/ChangeLog: * tree-cfg.h (gimple_debug_cfg): Change argument type from int to dump_flags_t.
2022-06-21 | gomp/target-device-ancestor-*.f90: Fix testcase of OG11 | Tobias Burnus | 4 files | -10/+17
Contrary to GCC 12 mainline, OG11 defers the error for 'omp requires reverse_offload' until runtime (via libgomp). Update the testcases accordingly. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/target-device-ancestor-2.f90: Remove dg-error for the requires-reverse_offload sorry. * gfortran.dg/gomp/target-device-ancestor-3.f90: Likewise. * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
2022-06-21 | OpenMP: Fix target device ancestor tests according to reverse_offload. | Marcel Vollweiler | 4 files | -3/+16
This patch removes the expectation that 'requires reverse_offload' is unsupported from some 'target device ancestor' tests, which were introduced in commit 03be3cfeef7b3811acb6c4a8da2fc5c1e25d3e4c. This is necessary since commit f5bfc65f9a6e1f69b17d3740d043d2fbda339e05 changed the behaviour for reverse_offload. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-device-ancestor-2.c: Remove message for unsupported reverse offload. * c-c++-common/gomp/target-device-ancestor-3.c: Likewise. * c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
2022-06-21 | openacc: fix ICE for non-decl expression in non-contiguous array base-pointer | Chung-Lin Tang | 4 files | -9/+43
Currently, we do not support cases like struct-members as the base-pointer for an OpenACC non-contiguous array. Mark such cases as unsupported in the C/C++ front-ends, instead of ICEing on them. gcc/c/ChangeLog: * c-typeck.cc (handle_omp_array_sections_1): Robustify non-contiguous array check and reject non-DECL base-pointer cases as unsupported. gcc/cp/ChangeLog: * semantics.cc (handle_omp_array_sections_1): Robustify non-contiguous array check and reject non-DECL base-pointer cases as unsupported.
2022-06-21 | [og11] OpenMP/OpenACC: Move array_ref/indirect_ref handling code out of extract_base_bit_offset | Julian Brown | 2 files | -20/+33
At Richard Biener's suggestion, this patch undoes the following patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571712.html and moves the stripping of ARRAY_REFS/INDIRECT_REFS out of extract_base_bit_offset and back into the (two) call sites of the function. The difference between the two ways of looking through these nodes comes down to (I think) what processing has been done on the clause in question already: in the case where BASE_REF is non-NULL, we are processing an OMP_CLAUSE_DECL for the first time. Conversely, when BASE_REF is NULL, we are processing a node from the sorted list that is being constructed after a GOMP_MAP_STRUCT node. 2021-06-07 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (extract_base_bit_offset): Don't look through ARRAY_REFs or INDIRECT_REFs here. (build_struct_group): Reinstate previous behaviour for handling ARRAY_REFs/INDIRECT_REFs.
2022-06-21 | [og11] Rework indirect struct handling for OpenACC in gimplify.c | Julian Brown | 8 files | -58/+246
This patch reworks indirect struct handling in gimplify.c (i.e. for struct components mapped with "mystruct->a[0:n]", "mystruct->b", etc.), for OpenACC. The key observation leading to these changes was that component mappings of references-to-structures is already implemented and working, and indirect struct component handling via a pointer can work quite similarly. That lets us remove some earlier, special-case handling for mapping indirect struct component accesses for OpenACC, which required the pointed-to struct to be manually mapped before the indirect component mapping. With this patch, you can map struct components directly (e.g. an array slice "mystruct->a[0:n]") just like you can map a non-indirect struct component slice ("mystruct.a[0:n]"). Both references-to-pointers (with the former syntax) and references to structs (with the latter syntax) work now. For Fortran class pointers, we no longer re-use GOMP_MAP_TO_PSET for the class metadata (the structure that points to the class data and vptr) -- it is instead treated as any other struct. For C++, the struct handling also works for class members ("this->foo"), without having to explicitly map "this[:1]" first. For OpenACC, we permit chained indirect component references ("mystruct->a->b[0:n]"), though only the last part of such mappings will trigger an attach/detach operation. To properly use such a construct on the target, you must still manually map "mystruct->a[:1]" first -- but there's no need to map "mystruct[:1]" explicitly before that. This version of the patch avoids altering code paths for OpenMP, where possible. 2021-06-02 Julian Brown <julian@codesourcery.com> gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER mappings for POINTER_TYPE_P decls. gcc/ * gimplify.cc (extract_base_bit_offset): Add BASE_IND and OPENMP parameters. Handle pointer-typed indirect references for OpenACC alongside reference-typed ones. 
(strip_components_and_deref, aggregate_base_p): New functions. (build_struct_group): Add pointer type indirect ref handling, including chained references, for OpenACC. Also handle references to structs for OpenACC. Conditionalise bits for OpenMP only where appropriate. (gimplify_scan_omp_clauses): Rework pointer-type indirect structure access handling to work more like the reference-typed handling for OpenACC only. * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect struct references, and references to pointers to structs also. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: New test. * g++.dg/gomp/member-array-omp.C: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test. * testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
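As a rough illustration of the kind of mapping this commit describes, the sketch below maps an array slice through a struct pointer. The `Vec` type and `scale_in_place` function are hypothetical, and the clause choice (`copy`) is an assumption rather than something taken from the new tests. When built without OpenACC support, the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>

// Hypothetical struct with a pointer component; not taken from the testsuite.
struct Vec {
  int *a;  // mapped below as the slice s->a[0:n]
  int n;
};

// Doubles every element of s->a. The slice is mapped directly through the
// pointer `s`, without a separate manual mapping of s[:1] beforehand.
void scale_in_place(Vec *s) {
  assert(s != nullptr && s->a != nullptr);
  const int n = s->n;
#pragma acc parallel loop copy(s->a[0:n])
  for (int i = 0; i < n; ++i)
    s->a[i] *= 2;
}
```

The same loop written against a non-indirect struct would use `s.a[0:n]` in the clause instead.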
2022-06-21 | [og11] Refactor struct lowering for OpenACC/OpenMP in gimplify.c | Julian Brown | 2 files | -344/+602
This patch is a second attempt at refactoring struct component mapping handling for OpenACC/OpenMP during gimplification, after the patch I posted here: https://gcc.gnu.org/pipermail/gcc-patches/2018-November/510503.html And improved here, post-review: https://gcc.gnu.org/pipermail/gcc-patches/2019-November/533394.html This patch goes further, in that the struct-handling code is outlined into its own function (to create the "GOMP_MAP_STRUCT" node and the sorted list of nodes immediately following it, from a set of mappings of components of a given struct or derived type). I've also gone through the list-handling code and attempted to add comments documenting how it works to the best of my understanding, and broken out a couple of helper functions in order to (hopefully) have the code self-document better also. 2021-06-02 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (insert_struct_comp_map): Refactor function into... (build_struct_comp_nodes): This new function. Remove list handling and improve self-documentation. (insert_node_after, move_node_after, move_nodes_after, move_concat_nodes_after): New helper functions. (build_struct_group): New function to build up GOMP_MAP_STRUCT node groups to map struct components. Outlined from... (gimplify_scan_omp_clauses): Here. Call above function.
2022-06-21 | [og11] Unify ARRAY_REF/INDIRECT_REF stripping code in extract_base_bit_offset | Julian Brown | 2 files | -21/+16
For historical reasons, it seems that extract_base_bit_offset unnecessarily used two different ways to strip ARRAY_REF/INDIRECT_REF nodes from component accesses. I verified that the two ways of performing the operation gave the same results across the whole testsuite (and several additional benchmarks). The code was like this since an earlier "mechanical" refactoring by me, first posted here: https://gcc.gnu.org/pipermail/gcc-patches/2018-November/510503.html It was never clear to me if there was an important semantic difference between the two ways of stripping the base before calling get_inner_reference, but it appears that there is not, so one can go away. 2021-06-02 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (extract_base_bit_offset): Unify ARRAY_REF/INDIRECT_REF stripping code in first call/subsequent call cases.
2022-06-21 | [og11] Rewrite GOMP_MAP_ATTACH_DETACH mappings unconditionally | Julian Brown | 2 files | -9/+7
It never makes sense for a GOMP_MAP_ATTACH_DETACH mapping to survive beyond gimplify.c, so this patch rewrites such mappings to GOMP_MAP_ATTACH or GOMP_MAP_DETACH unconditionally (rather than checking for a list of types of OpenACC or OpenMP constructs), in cases where it hasn't otherwise been done already in the preceding code. 2021-06-02 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (gimplify_scan_omp_clauses): Simplify condition for changing GOMP_MAP_ATTACH_DETACH to GOMP_MAP_ATTACH or GOMP_MAP_DETACH.
2022-06-21 | c-c++-common/gomp/map-6.c: Fix dg-error due to mapping changes | Tobias Burnus | 2 files | -2/+6
OpenMP 5 relaxed some repetition rules such that some double-use warnings no longer occur; that patch is not yet on mainline. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Remove two dg-error.
2022-06-21 | Add -Wopenacc-parallelism to tests only in OG11 | Kwok Cheung Yeung | 2 files | -0/+8
2021-04-30 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Add -Wopenacc-parallelism option.
2022-06-21 | Update expected messages in data-clause-1 tests | Kwok Cheung Yeung | 3 files | -2/+7
The patch 'Merge non-contiguous array support patches' handles one of the non-contiguous cases such that it is no longer an error. 2021-04-29 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/data-clause-1.c (foo): Remove expected message. * g++.dg/goacc/data-clause-1.C (foo): Remove expected message.
2022-06-21 | Update expected messages in kernels-decompose-2 tests | Kwok Cheung Yeung | 3 files | -8/+15
This changes expected messages that differ between mainline and OG11. On OG10, these messages were added in the patch: 081a01963ca8 Update expected messages, errors and warnings for "kernels" tests 2021-04-29 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-2.c (main): Update expected messages. * gfortran.dg/goacc/kernels-decompose-2.f95 (main): Update expected messages.
2022-06-21 | Fix is_oacc_parallel_or_serial for kernel regions | Kwok Cheung Yeung | 2 files | -1/+11
2021-04-07 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * omp-low.cc (is_oacc_parallel_or_serial): Handle GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED and GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE.
2022-06-21 | Update expected messages in OpenACC tests | Kwok Cheung Yeung | 12 files | -165/+186
This updates the types of messages expected in the test, and the '-fopt-info' option used to request them. The phrasing of the expected messages has also changed somewhat and has been adjusted to match. 2021-04-07 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Update additional options and expected messages. * c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Likewise. * c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c: Likewise. * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise. * c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Likewise. * c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c: Likewise. * c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
2022-06-21 | DWARF: late code range fixup | Andrew Stubbs | 2 files | -23/+64
Ensure that the parent DWARF subprograms of offload kernel functions have a code range, and are therefore not discarded by GDB. This is only necessary when the parent function does not actually exist in the final binary, which is commonly the case within the offload device's binary. This patch replaces 808bdf1bb29 and fdcb23540a2. It should be squashed with those before being posted upstream. gcc/ * dwarf2out.cc (notional_parents_list): New file variable. (gen_subprogram_die): Record offload kernel functions in notional_parents_list. (fixup_notional_parents): New function. (dwarf2out_finish): Call fixup_notional_parents. (dwarf2out_c_finalize): Reset notional_parents_list.
2022-06-21 | openmp: Scale type precision of collapsed iterator variable | Kwok Cheung Yeung | 3 files | -7/+35
This sets the type precision of the collapsed iterator variable to the sum of the precision of the collapsed loop variables, up to a maximum of sizeof(long long) (i.e. 64-bits). 2021-03-01 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * omp-expand.cc (expand_oacc_for): Convert .tile variable to diff_type before multiplying. * omp-general.cc (omp_extract_for_data): Use accumulated precision of all collapsed for-loops as precision of iteration variable, up to the precision of a long long. libgomp/ * testsuite/libgomp.c-c++-common/collapse-4.c: New. * testsuite/libgomp.fortran/collapse5.f90: New.
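The precision rule described above can be sketched as a small helper. The function name `collapsed_precision` is hypothetical and this is not the code in omp_extract_for_data; it only restates the arithmetic, assuming precisions are given in bits.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: precision (in bits) for the single iterator of a
// collapsed loop nest, given the precision of each original loop variable.
// The running sum is capped at the width of long long.
unsigned collapsed_precision(const unsigned *loop_precisions, std::size_t count) {
  assert(loop_precisions != nullptr && count > 0);
  const unsigned cap = sizeof(long long) * CHAR_BIT;
  unsigned total = 0;
  for (std::size_t i = 0; i < count; ++i)
    total = std::min(cap, total + loop_precisions[i]);
  return total;
}
```

For example, collapsing two loops with 32-bit variables calls for a 64-bit iterator on platforms where long long is 64 bits wide, since the product of the two trip counts may not fit in 32 bits.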
2022-06-21 | Allow static constexpr fields in mappable types for C++ | Chung-Lin Tang | 5 files | -1/+48
This patch is a merge of: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01246.html Static members in general disqualify a C++ class from being target mappable, but static constexpr members are optimized away by inlining, so they should not interfere. OpenMP 5.0 in general lifts the static member limitation, so this patch will probably be further adjusted later. 2021-03-03 Chung-Lin Tang <cltang@codesourcery.com> gcc/cp/ChangeLog: * decl2.cc (cp_omp_mappable_type_1): Allow fields with DECL_DECLARED_CONSTEXPR_P to be mapped. gcc/testsuite/ChangeLog: * g++.dg/goacc/static-constexpr-1.C: New test. * g++.dg/gomp/static-constexpr-1.C: New test.
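A class of the kind this commit is concerned with might look like the following sketch. The `Particle` type and `coordinate_sum` function are invented for illustration and are not taken from static-constexpr-1.C. Built without OpenMP support, the pragma is ignored and the block runs on the host.

```cpp
#include <cassert>

// Hypothetical mappable type with a static constexpr member.
struct Particle {
  static constexpr int dims = 3;  // compile-time constant, not stored per object
  double pos[dims];
};

// Sums the coordinates inside a target region that maps a Particle object.
double coordinate_sum(const Particle *p) {
  assert(p != nullptr);
  Particle local = *p;
  double total = 0.0;
#pragma omp target map(to: local) map(tofrom: total)
  {
    for (int i = 0; i < Particle::dims; ++i)
      total += local.pos[i];
  }
  return total;
}
```

Here `dims` is only ever used as a constant value, which is the situation the commit message argues should not prevent mapping.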
2022-06-21 | DWARF: fix ICE caused by offload debug fix | Andrew Stubbs | 2 files | -3/+20
This should be squashed with 808bdf1bb29 and fdcb23540a2 to go to mainline. gcc/ * dwarf2out.cc (gen_subprogram_die): Replace existing low/high PC attributes, rather than ICE.
2022-06-21 | OpenMP 5.0: requires directive | Chung-Lin Tang | 13 files | -15/+117
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563393.html This patch completes more of the reverse_offload, unified_address, and unified_shared_memory clauses for the OpenMP 5.0 requires directive, including runtime verification of the offload target. (Currently, no offload devices actually support the above features; only warning messages are emitted.) This may possibly be reverted/updated when a final patch is approved for mainline. 2021-02-02 Chung-Lin Tang <cltang@codesourcery.com> gcc/c/ChangeLog: * c-parser.cc (c_parser_declaration_or_fndef): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has "omp declare target" attribute. (c_parser_omp_target_data): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask. (c_parser_omp_target_enter_data): Likewise. (c_parser_omp_target_exit_data): Likewise. (c_parser_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. gcc/cp/ChangeLog: * parser.cc (cp_parser_simple_declaration): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has "omp declare target" attribute. (cp_parser_omp_target_data): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask. (cp_parser_omp_target_enter_data): Likewise. (cp_parser_omp_target_exit_data): Likewise. (cp_parser_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. gcc/fortran/ChangeLog: * openmp.cc (gfc_check_omp_requires): Fix REVERSE_OFFLOAD typo. (gfc_match_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. * parse.cc ("tree.h"): Add include. ("omp-general.h"): Likewise. (gfc_parse_file): Add code to merge omp_requires to omp_requires_mask. gcc/ChangeLog: * omp-offload.cc (omp_finish_file): Add code to create OpenMP requires mask variable in .gnu.gomp_requires section if needed. gcc/testsuite/ChangeLog: * c-c++-common/gomp/requires-4.c: Remove prune of "not supported yet". 
* gfortran.dg/gomp/requires-4.f90: Fix REVERSE_OFFLOAD typo. * gfortran.dg/gomp/requires-8.f90: Likewise. include/ChangeLog: * gomp-constants.h (GOMP_REQUIRES_UNIFIED_ADDRESS): New symbol. (GOMP_REQUIRES_UNIFIED_SHARED_MEMORY): Likewise. (GOMP_REQUIRES_REVERSE_OFFLOAD): Likewise. libgcc/ChangeLog: * offloadstuff.c (__requires_mask_table): New symbol to mark start of .gnu.gomp_requires section. (__requires_mask_table_end): New symbol to mark end of .gnu.gomp_requires section. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_supported_features): New declaration. * libgomp.h (struct gomp_device_descr): New 'supported_features_func' plugin hook field. * oacc-host.c (host_supported_features): New host hook function. (host_dispatch): Initialize 'supported_features_func' host hook. * plugin/plugin-gcn.c (GOMP_OFFLOAD_supported_features): New function. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_supported_features): Likewise. * target.c (<stdio.h>): Add include of standard header. (gomp_requires_mask): New static variable. (__requires_mask_table): New declaration. (__requires_mask_table_end): Likewise. (gomp_load_plugin_for_device): Add loading of 'supported_features' hook. (gomp_target_init): Add code to summarize .gnu._gomp_requires section mask values, emit error if inconsistency found. * testsuite/libgomp.c-c++-common/requires-1.c: New test. * testsuite/libgomp.c-c++-common/requires-1-aux.c: New file linked with above test. * testsuite/libgomp.c-c++-common/requires-2.c: New test. * testsuite/libgomp.c-c++-common/requires-2-aux.c: New file linked with above test. liboffloadmic/ChangeLog: * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_supported_features): New function.
2022-06-21 | OpenMP 5.0: Allow multiple clauses mapping same variable | Chung-Lin Tang | 6 files | -30/+86
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562081.html This patch allows multiple clauses on the same construct to map the same variable, which was not valid in OpenMP 4.5 but is now allowed in 5.0. This may possibly be reverted/updated when a final patch is approved for mainline. 2021-02-01 Chung-Lin Tang <cltang@codesourcery.com> gcc/cp/ChangeLog: * semantics.cc (finish_omp_clauses): Adjust to allow duplicate mapped variables for OpenMP. gcc/ChangeLog: * omp-low.cc (install_var_field): Add new 'tree key_expr = NULL_TREE' default parameter. Set splay-tree lookup key to key_expr instead of var if key_expr is non-NULL. Adjust call to install_parm_decl. Update comments. (scan_sharing_clauses): Use clause tree expression as splay-tree key for map/to/from and OpenACC firstprivate cases when installing the variable field into the send/receive record type. (maybe_lookup_field_in_outer_ctx): Add code to search through construct clauses instead of entirely based on splay-tree lookup. (lower_oacc_reductions): Adjust to find map-clause of reduction variable, then create receiver-ref. (lower_omp_target): Adjust to lookup var field using clause expression. gcc/testsuite/ChangeLog: * c-c++-common/gomp/clauses-2.c: Adjust testcase.
2022-06-21 | Correct fix offload dwarf info | Andrew Stubbs | 2 files | -3/+6
The previous patch wasn't quite right, apparently. Somehow the behaviour changed after another clean build? This tweak fixes it. This patch should be squashed with fdcb23540a2 to go to mainline. gcc/ChangeLog: * dwarf2out.cc (gen_subprogram_die): Check offload attributes only.
2022-06-21 | DWARF address space for variables | Andrew Stubbs | 2 files | -0/+14
Add DWARF address class attributes for variables that exist outside the generic address space. In particular, this is the case for gang-private variables in OpenACC offload kernels. gcc/ChangeLog: * dwarf2out.cc (add_location_or_const_value_attribute): Set DW_AT_address_class, if appropriate.
2022-06-21 | Fix offload dwarf info | Andrew Stubbs | 2 files | -0/+19
Add a notional code range to the notional parent function of offload kernel functions. This is enough to prevent GDB discarding them. gcc/ChangeLog: * dwarf2out.cc (gen_subprogram_die): Add high/low_pc attributes for parents of offload kernels.
2022-06-21 | [og10] vect: Add target hook to prefer gather/scatter instructions | Julian Brown | 6 files | -2/+35
For AMD GCN, the instructions available for loading/storing vectors are always scatter/gather operations (i.e. there are separate addresses for each vector lane), so the current heuristic to avoid gather/scatter operations with too many elements in get_group_load_store_type is counterproductive. Avoiding such operations in that function can subsequently lead to a missed vectorization opportunity whereby later analyses in the vectorizer try to use a very wide array type which is not available on this target, and thus it bails out. The attached patch adds a target hook to override the "single_element_p" heuristic in that function, and activates it for GCN. This allows much better code to be generated for affected loops. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add documentation hook. * doc/tm.texi: Regenerate. * target.def (prefer_gather_scatter): Add target hook under vectorizer. * tree-vect-stmts.cc (get_group_load_store_type): Optionally prefer gather/scatter instructions to scalar/elementwise fallback. * config/gcn/gcn.cc (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define hook.
2022-06-21 | [og10] openacc: Adjust loop lowering for AMD GCN | Julian Brown | 2 files | -35/+131
This patch adjusts OpenACC loop lowering in the AMD GCN target compiler in such a way that the autovectorizer can vectorize the "vector" dimension of those loops in more cases. Rather than generating "SIMT" code that executes a scalar instruction stream for each lane of a vector in lockstep, for GCN we model the GPU like a typical CPU, with separate instructions to operate on scalar and vector data. That means that unlike other offload targets, we rely on the autovectorizer to handle the innermost OpenACC parallelism level, which is "vector". Because of this, the OpenACC builtin functions to return the current vector lane and the vector width return 0 and 1 respectively, despite the native vector width being 64 elements wide. This allows generated code to work with our chosen compilation model, but the way loops are lowered in omp-offload.c:oacc_xform_loop does not understand the discrepancy between logical (OpenACC) and physical vector sizes correctly. That means that if a loop is partitioned over e.g. the worker AND vector dimensions, we actually lower with unit vector size -- meaning that if we then autovectorize, we end up trying to vectorize over the "worker" dimension rather than the vector one! Then, because the number of workers is not fixed at compile time, that means the autovectorizer has a hard time analysing the loop and thus vectorization often fails entirely. We can fix this by deducing the true vector width in oacc_xform_loop, and using that when we are on a "non-SIMT" offload target. We can then rearrange how loops are lowered in that function so that the loop form fed to the autovectorizer is more amenable to vectorization -- namely, the innermost step is set to process each loop iteration sequentially. For some benchmarks, allowing vectorization to succeed leads to quite impressive performance improvements -- I've observed between 2.5x and 40x on one machine/GPU combination. 
The low-level builtins available to user code (__builtin_goacc_parlevel_id and __builtin_goacc_parlevel_size) continue to return 0/1 respectively for the vector dimension for AMD GCN, even if their containing loop is vectorized -- that's a quirk that we might possibly want to address at some later date. Only non-"chunking" loops are handled at present. "Chunking" loops are still lowered as before. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * omp-offload.cc (oacc_thread_numbers): Add VF_BY_VECTORIZER parameter. Add overloaded wrapper for previous arguments & behaviour. (oacc_xform_loop): Lower vector loops to iterate a multiple of omp_max_vf times over contiguous steps on non-SIMT targets. libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Adjust for loop lowering changes. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
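The change in iteration layout described in this commit can be pictured with a toy model. The sketch below is an assumption-laden simplification, not the output of oacc_xform_loop: the function names, the `workers` and `vf` parameters, and the idea of recording an "owner" per iteration are all illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unit vector size: consecutive iterations are dealt out round-robin across
// workers, so each worker's own iterations are `workers` apart.
std::vector<std::size_t> assign_round_robin(std::size_t n, std::size_t workers) {
  assert(workers > 0);
  std::vector<std::size_t> owner(n);
  for (std::size_t i = 0; i < n; ++i)
    owner[i] = i % workers;
  return owner;
}

// Contiguous steps: each worker takes runs of `vf` adjacent iterations, so
// the innermost loop advances by one and is a natural vectorization target.
std::vector<std::size_t> assign_blocked(std::size_t n, std::size_t workers,
                                        std::size_t vf) {
  assert(workers > 0 && vf > 0);
  std::vector<std::size_t> owner(n);
  for (std::size_t w = 0; w < workers; ++w)
    for (std::size_t base = w * vf; base < n; base += workers * vf)
      for (std::size_t v = 0; v < vf && base + v < n; ++v)  // unit-stride inner loop
        owner[base + v] = w;
  return owner;
}
```

With 8 iterations, 2 workers and a vector factor of 2, the first layout alternates owners on every iteration, while the second gives each worker pairs of adjacent iterations.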
2022-06-21 | dwarf: Multi-register CFI address support | Andrew Stubbs | 3 files | -1/+36
Add support for architectures such as AMD GCN, in which the pointer size is larger than the register size. This allows the CFI information to include multi-register locations for the stack pointer, frame pointer, and return address. Note that this uses a newly proposed DWARF operator DW_OP_LLVM_piece_end, which is currently only recognized by the ROCGDB debugger from AMD. The exact name and encoding for this operator is subject to change if and when the DWARF standard accepts it. gcc/ChangeLog: * dwarf2cfi.cc (get_cfa_from_loc_descr): Support register spans with DW_OP_piece and DW_OP_LLVM_piece_end. * dwarf2out.cc (build_cfa_loc): Support register spans. include/ChangeLog: * dwarf2.def (DW_OP_LLVM_piece_end): New extension operator.
2022-06-21  Relax some restrictions on the loop bound in kernels loop annotation.  Sandra Loosemore  5  -9/+205
OpenACC loop semantics require that the loop bound be computable before entering the loop, rather than the C/C++ semantics where the end test is evaluated on every iteration. Formerly the kernels loop annotator permitted only constants and variables not modified in the loop body in the loop bound expression. This patch relaxes those restrictions somewhat to allow many forms of expressions involving such constants and variables, including calls to constant functions. 2020-08-30 Sandra Loosemore <sandra@codesourcery.com> gcc/c-family/ * c-omp.cc (end_test_ok_for_annotation_r): New. (end_test_ok_for_annotation): New. (check_and_annotate_for_loop): Use the new helper function. gcc/testsuite/ * c-c++-common/goacc/kernels-loop-annotation-21.c: New. * c-c++-common/goacc/kernels-loop-annotation-22.c: New.
2022-06-21  Clean up loop variable extraction in OpenACC kernels loop annotation.  Sandra Loosemore  2  -98/+104
The code for identifying annotatable loops in OpenACC kernels regions previously looked for the loop variable as the left-hand side of the comparison in the loop end test. However, front end optimizations sometimes switch the sense of the comparison, making this method unreliable. In particular, it's ambiguous when both operands to the end test comparison are local variables. This patch reorders the loop processing to identify the loop variable from the initializer, rather than the end test. The processing of the end test then just checks that one of the operands to the comparison matches the variable appearing in the initializer. Much of the patch is code refactoring, moving the initializer analysis out of annotate_for_loop to check_and_annotate_for_loop so it can be performed earlier. 2020-08-30 Sandra Loosemore <sandra@codesourcery.com> gcc/c-family/ * c-omp.cc (annotate_for_loop): Move initializer processing... (check_and_annotate_for_loop): ... to here. Allow the loop variable as either operand to the condition.
2022-06-21  Fix patterns in Fortran tests for kernels loop annotation.  Sandra Loosemore  15  -14/+32
Several of the Fortran tests for kernels loop annotation were failing due to changes in the formatting of "acc loop" constructs in the dump file. Now the "auto" clause appears first, instead of after "private". 2020-08-23 Sandra Loosemore <sandra@codesourcery.com> gcc/testsuite/ * gfortran.dg/goacc/kernels-loop-annotation-1.f95: Update expected output. * gfortran.dg/goacc/kernels-loop-annotation-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-3.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-4.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-5.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-6.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-7.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-8.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-11.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-12.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-13.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-14.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-15.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-16.f95: Likewise.
2022-06-21  Permit calls to builtins and intrinsics in kernels loops.  Sandra Loosemore  7  -7/+85
This tweak to the OpenACC kernels loop annotation relaxes the restrictions on function calls in the loop body. Normally calls to functions not explicitly marked with a parallelism attribute are not permitted, but C/C++ builtins and Fortran intrinsics have known semantics so we can generally permit those without restriction. If any turn out to be problematic, we can add code to recognize them here, or in the processing of the "auto" annotations. 2020-08-22 Sandra Loosemore <sandra@codesourcery.com> gcc/c-family/ * c-omp.cc (annotate_loops_in_kernels_regions): Test for calls to builtins. gcc/fortran/ * openmp.cc (check_expr_for_invalid_calls): Check for intrinsic functions. gcc/testsuite/ * c-c++-common/goacc/kernels-loop-annotation-20.c: New. * gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.
2022-06-21  Update dg-* in gfortran.dg/gomp/pr67500.f90  Tobias Burnus  2  -4/+9
Contrary to GCC 11, OG10 uses an error instead of a warning, cf. commit 271c7fef548a86676d304b1eb2be5c0d47280bd6. gcc/testsuite/ * gfortran.dg/gomp/pr67500.f90: Change dg-warning to dg-error.
2022-06-21  Annotate inner loops in "acc kernels loop" directives (Fortran).  Sandra Loosemore  8  -9/+151
Normally, explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However, there seems to be an expectation in user code that the combined "kernels loop" directive should still allow annotation of inner loops. This patch implements this behavior in Fortran. 2020-08-19 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * openmp.cc (annotate_do_loops_in_kernels): Handle EXEC_OACC_KERNELS_LOOP separately to permit annotation of inner loops in a combined "acc kernels loop" directive. gcc/testsuite/ * gfortran.dg/goacc/kernels-loop-annotation-18.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-19.f95: New. * gfortran.dg/goacc/combined-directives.f90: Adjust expected patterns. * gfortran.dg/goacc/private-explicit-kernels-1.f95: Likewise. * gfortran.dg/goacc/private-predetermined-kernels-1.f95: Likewise.
2022-06-21  Annotate inner loops in "acc kernels loop" directives (C/C++).  Sandra Loosemore  6  -13/+78
Normally, explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However, there seems to be an expectation in user code that the combined "kernels loop" directive should still allow annotation of inner loops. This patch implements this behavior for C and C++. 2020-08-19 Sandra Loosemore <sandra@codesourcery.com> gcc/c-family/ * c-omp.cc (annotate_loops_in_kernels_regions): Process inner loops in combined "acc kernels loop" directives. gcc/testsuite/ * c-c++-common/goacc/kernels-loop-annotation-18.c: New. * c-c++-common/goacc/kernels-loop-annotation-19.c: New. * c-c++-common/goacc/combined-directives.c: Adjust expected patterns.
2022-06-21  Add a "combined" flag for "acc kernels loop" etc directives.  Sandra Loosemore  8  -13/+57
2020-08-19 Sandra Loosemore <sandra@codesourcery.com> gcc/ * tree.h (OACC_LOOP_COMBINED): New. gcc/c/ * c-parser.cc (c_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/cp/ * parser.cc (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_do): Add combined parameter, use it to set OACC_LOOP_COMBINED. Update all call sites.
2022-06-21  Fix gfortran.dg/goacc/pr70828.f90 testcase  Kwok Cheung Yeung  2  -2/+7
Array mapping was changed by the patch '[OpenMP, Fortran] Add structure/derived-type element mapping'. 2020-08-19 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * gfortran.dg/goacc/pr70828.f90: Update expected output in Gimple dump.
2022-06-21  XFAIL tests in gfortran.dg/goacc/loop-2-kernels.f95  Kwok Cheung Yeung  2  -8/+12
The C-equivalent version of the test (c-c++-common/goacc/loop-2-kernels.c) has these tests XFAILed in the commit 'Make new OpenACC kernels conversion the default; adjust and add tests' (commit 757f56ddc43fd80bb8740222ec352111b26d66e9), so the Fortran version should be XFAILed too. 2020-07-24 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * gfortran.dg/goacc/loop-2-kernels.f95: Add XFAILs.
2022-06-21  Fix failures in c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c testcase  Kwok Cheung Yeung  2  -8/+13
This should have been part of 'Update expected messages, errors and warnings for "kernels" tests' (commit 081a01963ca8db7ddaaf5871d281321454fd3246). 2020-07-24 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c (main): Remove obsolete expected messages.
2022-06-21  Fix goacc/noncontig_array-1.c testcase  Kwok Cheung Yeung  2  -0/+5
2020-06-02 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/noncontig_array-1.c: Dump Gimple pass.
2022-06-21  OpenACC: fix privatization of by-reference arrays  Tobias Burnus  2  -1/+7
Replacing a by-reference variable in a private clause with a local variable makes sense; however, for arrays, the size is not directly known from the type. This causes an ICE via create_tmp_var, which indirectly invokes force_constant_size in this case, but the latter only handled Ada. gcc/ChangeLog: * gimplify.cc (localize_reductions): Do not create local variable for privatized arrays.
2022-06-21  Fix bug in processing of array dimensions in data clauses.  Sandra Loosemore  2  -0/+15
The g++ front end wraps the array length and low_bound values in NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail. The test case c-c++-common/goacc/kernels-loop-annotation-1.c was tickling this bug and giving bogus errors in g++ because it was falling through to dynamic array code instead of recognizing the constant bounds. This patch was posted upstream here https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542694.html but has not yet been committed. It may be that some other fix for this problem is implemented on mainline instead; check before merging this patch. 2020-03-31 Sandra Loosemore <sandra@codesourcery.com> gcc/cp/ * semantics.cc (handle_omp_array_sections_1): Call STRIP_NOPS on length and low_bound. (handle_omp_array_sections): Likewise.
2022-06-21  Additional Fortran testsuite fixes for kernels loops annotation pass.  Sandra Loosemore  4  -9/+18
These testsuite fixes are specific to the og10 branch, so are being segregated from the ones that apply to mainline in a separate commit from the main Fortran kernels loop annotation patch. 2020-03-27 Sandra Loosemore <sandra@codesourcery.com> gcc/testsuite/ * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust line numbering. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Add -fno-openacc-kernels-annotate-loops.
2022-06-21  Kernels loops annotation: Fortran.  Sandra Loosemore  34  -0/+1025
This patch implements the Fortran support for adding "#pragma acc loop auto" annotations to loops in OpenACC kernels regions. It implements the same -fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options that were previously added (and documented) for the C/C++ front ends. 2020-03-27 Sandra Loosemore <sandra@codesourcery.com> Gergö Barany <gergo@codesourcery.com> gcc/fortran/ * gfortran.h (gfc_oacc_annotate_loops_in_kernels_regions): Declare. * lang.opt (Wopenacc-kernels-annotate-loops): New. (fopenacc-kernels-annotate-loops): New. * openmp.cc: Include options.h. (enum annotation_state, enum annotation_result): New. (check_code_for_invalid_calls): New. (check_expr_for_invalid_calls): New. (check_for_invalid_calls): New. (annotate_do_loop): New. (annotate_do_loops_in_kernels): New. (compute_goto_targets): New. (gfc_oacc_annotate_loops_in_kernels_regions): New. * parse.cc (gfc_parse_file): Handle -fopenacc-kernels-annotate-loops. gcc/testsuite/ * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add -fno-openacc-kernels-annotate-loops option. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/common-block-3.f90: Likewise. * gfortran.dg/goacc/kernels-loop-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data.f95: Likewise. * gfortran.dg/goacc/kernels-loop-n.f95: Likewise. * gfortran.dg/goacc/kernels-loop.f95: Likewise. * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise. * gfortran.dg/goacc/kernels-loop-annotation-1.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-2.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-3.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-4.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-5.f95: New. 
* gfortran.dg/goacc/kernels-loop-annotation-6.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-7.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-8.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-9.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-10.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-11.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-12.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-13.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-14.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-15.f95: New. * gfortran.dg/goacc/kernels-loop-annotation-16.f95: New.
2022-06-21  Add -fno-openacc-kernels-annotate-loops option to more testcases.  Sandra Loosemore  4  -0/+10
The testcases being tweaked here are present on the og10 branch but not on FSF mainline. 2020-03-27 Sandra Loosemore <sandra@codesourcery.com> gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-2.c: Add -fno-openacc-kernels-annotate-loops. * c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise. * c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
2022-06-21  Kernels loops annotation: C and C++.  Sandra Loosemore  49  -61/+1402
This patch detects loops in kernels regions that are candidates for parallelization, and adds "#pragma acc loop auto" annotations to them. This annotation is controlled by the -fopenacc-kernels-annotate-loops option, which is enabled by default. -Wopenacc-kernels-annotate-loops can be used to produce diagnostics about loops that cannot be annotated. 2020-03-27 Sandra Loosemore <sandra@codesourcery.com> Kernels loops annotation: C and C++. gcc/c-family/ * c-common.h (c_oacc_annotate_loops_in_kernels_regions): Declare. * c-omp.cc: Include tree-iterator.h (enum annotation_state): New. (struct annotation_info): New. (do_not_annotate_loop): New. (do_not_annotate_loop_nest): New. (annotation_error): New. (c_finish_omp_for_internal): Split from c_finish_omp_for. Use annotation_error function. Code refactoring to avoid destructive changes that cannot be undone in case of error. (is_local_var): New. (lang_specific_unwrap_initializer): New. (annotate_for_loop): New. (check_and_annotate_for_loop): New. (annotate_loops_in_kernels_regions): New. (c_oacc_annotate_loops_in_kernels_regions): New. * c.opt (Wopenacc-kernels-annotate-loops): New. (fopenacc-kernels-annotate-loops): New. gcc/c/ * c-decl.cc (c_unwrap_for_init): New. (finish_function): Call c_oacc_annotate_loops_in_kernels_regions. gcc/cp/ * decl.cc (cp_unwrap_for_init): New. (finish_function): Call c_oacc_annotate_loops_in_kernels_regions. gcc/ * doc/invoke.texi (Option Summary): Add entries for -Wopenacc-kernels-annotate-loops and -fno-openacc-kernels-annotate-loops. (Warning Options): Document -Wopenacc-kernels-annotate-loops. (Optimization Options): Document -fno-openacc-kernels-annotate-loops. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Add -fno-openacc-kernels-annotate-loops option. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-counter-var-redundant-load.c: Likewise. * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise. 
* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise. * c-c++-common/goacc/kernels-double-reduction.c: Likewise. * c-c++-common/goacc/kernels-loop-2.c: Likewise. * c-c++-common/goacc/kernels-loop-3.c: Likewise. * c-c++-common/goacc/kernels-loop-data-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise. * c-c++-common/goacc/kernels-loop-data-update.c: Likewise. * c-c++-common/goacc/kernels-loop-data.c: Likewise. * c-c++-common/goacc/kernels-loop-g.c: Likewise. * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. * c-c++-common/goacc/kernels-loop-n.c: Likewise. * c-c++-common/goacc/kernels-loop-nest.c: Likewise. * c-c++-common/goacc/kernels-loop.c: Likewise. * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise. * c-c++-common/goacc/kernels-reduction.c: Likewise. * c-c++-common/goacc/kernels-loop-annotation-1.c: New. * c-c++-common/goacc/kernels-loop-annotation-2.c: New. * c-c++-common/goacc/kernels-loop-annotation-3.c: New. * c-c++-common/goacc/kernels-loop-annotation-4.c: New. * c-c++-common/goacc/kernels-loop-annotation-5.c: New. * c-c++-common/goacc/kernels-loop-annotation-6.c: New. * c-c++-common/goacc/kernels-loop-annotation-7.c: New. * c-c++-common/goacc/kernels-loop-annotation-8.c: New. * c-c++-common/goacc/kernels-loop-annotation-9.c: New. * c-c++-common/goacc/kernels-loop-annotation-10.c: New. * c-c++-common/goacc/kernels-loop-annotation-11.c: New. * c-c++-common/goacc/kernels-loop-annotation-12.c: New. * c-c++-common/goacc/kernels-loop-annotation-13.c: New. * c-c++-common/goacc/kernels-loop-annotation-14.c: New. * c-c++-common/goacc/kernels-loop-annotation-15.c: New. * c-c++-common/goacc/kernels-loop-annotation-16.c: New. * c-c++-common/goacc/kernels-loop-annotation-17.c: New.
2022-06-21  Fix vector handling for firstprivate of <= pointer size  Tobias Burnus  2  -1/+7
Test case is the existing libgomp.oacc-c++/firstprivate-mappings-1.C. * omp-low.c (convert_from_firstprivate_int): Use VIEW_CONVERT also for vectors.
2022-06-21  Fortran "declare create"/allocate support for OpenACC  Julian Brown  13  -27/+222
2018-10-04 Cesar Philippidis <cesar@codesourcery.com> Julian Brown <julian@codesourcery.com> gcc/ * omp-low.cc (scan_sharing_clauses): Update handling of OpenACC declare create, declare copyin and declare deviceptr to have local lifetimes. (convert_to_firstprivate_int): Handle pointer types. (convert_from_firstprivate_int): Likewise. Create local storage for the values being pointed to. Add new orig_type argument. (lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}. Add orig_type argument to convert_from_firstprivate_int call. Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT. Don't privatize firstprivate VLAs. * tree-pretty-print.cc (dump_omp_clause): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}. gcc/fortran/ * gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE, OMP_MAP_DECLARE_DEALLOCATE. (gfc_omp_clauses): Add update_allocatable. * trans-array.cc (gfc_array_allocate): Call gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create attribute set. * trans-decl.cc (find_module_oacc_declare_clauses): Relax oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to OMP_MAP_TO, in order to match OpenACC 2.5 semantics. * trans-openmp.cc (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER (for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for allocatable scalar decls. Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} clauses. (gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER for allocatable scalar data clauses inside acc update directives. (gfc_trans_oacc_declare_allocate): New function. * trans-stmt.cc (gfc_trans_allocate): Call gfc_trans_oacc_declare_allocate for decls with oacc_declare_create attribute set. (gfc_trans_deallocate): Likewise. * trans.h (gfc_trans_oacc_declare_allocate): Declare. gcc/testsuite/ * gfortran.dg/goacc/declare-allocatable-1.f90: New test. include/ * gomp-constants.h (enum gomp_map_kind): Define GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4. 
libgomp/ * oacc-mem.c (gomp_acc_declare_allocate): New function. * oacc-parallel.c (GOACC_enter_exit_data): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}. * testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test. * testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test. * testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test. * testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test. * testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test. 2020-02-19 Julian Brown <julian@codesourcery.com> gcc/fortran/ * trans-openmp.cc (gfc_omp_check_optional_argument): Handle non-decl case. gcc/ * gimplify.cc (gimplify_scan_omp_clauses): Handle GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE. libgomp/ * libgomp.h (gomp_acc_declare_allocate): Remove prototype. * oacc-mem.c (gomp_acc_declare_allocate): Make static. Add POINTER argument. Use acc_delete instead of acc_free. Handle scalar mappings. (find_group_last): Handle GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE groupings. (goacc_enter_data_internal): Fix kind check for GOMP_MAP_DECLARE_ALLOCATE. Pass new pointer argument to gomp_acc_declare_allocate. (goacc_exit_data_internal): Unlock device mutex around gomp_acc_declare_allocate call. Pass new pointer argument. Handle group pointer mapping for deallocate. 2021-04-07 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * oacc-mem.c (goacc_enter_data_internal): Unlock mutex before calling gomp_acc_declare_allocate and relock it afterwards.