aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-02-09openmp: Non-rectangular loop support for non-composite worksharing loops and ↵Jakub Jelinek7-47/+1146
distribute This implements the fallback mentioned in https://gcc.gnu.org/pipermail/gcc/2020-June/232874.html Special cases for triangular loops etc. to follow later, also composite constructs not supported yet (need to check the passing of temporaries around) and lastprivate might not give the same answers as serial loop if the last innermost body iteration isn't the last one for some of the outer loops (that will need to be solved separately together with rectangular loops that have no innermost body iterations, but some of the outer loops actually iterate). Also, simd needs work. 2020-06-27 Jakub Jelinek <jakub@redhat.com> * omp-general.h (struct omp_for_data_loop): Add non_rect_referenced member, move outer member. (struct omp_for_data): Add first_nonrect and last_nonrect members. * omp-general.c (omp_extract_for_data): Initialize first_nonrect, last_nonrect and non_rect_referenced members. * omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular loops. (expand_omp_for_init_vars): Add nonrect_bounds parameter. Handle non-rectangular loops. (extract_omp_for_update_vars): Likewise. (expand_omp_for_generic, expand_omp_for_static_nochunk, expand_omp_for_static_chunk, expand_omp_simd, expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): Adjust expand_omp_for_init_vars and extract_omp_for_update_vars callers. (expand_omp_for): Don't sorry on non-composite worksharing-loop or distribute. * testsuite/libgomp.c/loop-17.c: New test. * testsuite/libgomp.c/loop-18.c: New test. (cherry picked from commit aed3ab253dada2b7d2ed63cc6a8e15e263d5dd35)
2021-02-09openmp: Fix two pastos in non-rect loop OpenMP lowering.Jakub Jelinek2-2/+9
2020-06-24 Jakub Jelinek <jakub@redhat.com> * omp-low.c (lower_omp_for): Fix two pastos. (cherry picked from commit f0008858dec9b16da153b948834abb20b9f1ab32)
2021-02-09openmp: Compute triangular loop number of iterations at compile timeJakub Jelinek2-25/+210
2020-06-22 Jakub Jelinek <jakub@redhat.com> * omp-general.c (omp_extract_for_data): For triangular loops with all loop invariant expressions constant where the innermost loop is executed at least once compute number of iterations at compile time. (cherry picked from commit c154b8bc56831e4d9d421c52c8fcf95c570255ad)
2021-02-09openmp: Initial part of OpenMP 5.0 non-rectangular loop supportJakub Jelinek28-146/+1038
OpenMP 5.0 adds support for non-rectangular loop collapses, e.g. triangular and more complex. This patch deals just with the diagnostics so that they aren't rejected immediately as before. As the spec generally requires as before that the iteration variable initializer and bound in the comparison as invariant vs. the outermost loop, and just add some exceptional forms that can violate that, we need to avoid folding the expressions until we can detect them and in order to avoid folding it later on, I chose to use a TREE_VEC in those expressions to hold the var_outer * expr1 + expr2 triplet, the patch adds pretty-printing of that, gimplification etc. and just sorry_at during omp expansion for now. The next step will be to implement the different cases of that one by one. 2020-06-16 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_FOR_NON_RECTANGULAR): Define. * gimplify.c (gimplify_omp_for): Diagnose schedule, ordered or dist_schedule clause on non-rectangular loops. Handle gimplification of non-rectangular lb/b expressions. When changing iteration variable, adjust also non-rectangular lb/b expressions referencing that. * omp-general.h (struct omp_for_data_loop): Add m1, m2 and outer members. (struct omp_for_data): Add non_rect member. * omp-general.c (omp_extract_for_data): Handle non-rectangular loops. Fill in non_rect, m1, m2 and outer. * omp-low.c (lower_omp_for): Handle non-rectangular lb/b expressions. * omp-expand.c (expand_omp_for): Emit sorry_at for unsupported non-rectangular loop cases and assert for cases that can't be non-rectangular. * tree-pretty-print.c (dump_mem_ref): Formatting fix. (dump_omp_loop_non_rect_expr): New function. (dump_generic_node): Handle non-rectangular OpenMP loops. * tree-pretty-print.h (dump_omp_loop_non_rect_expr): Declare. * gimple-pretty-print.c (dump_gimple_omp_for): Handle non-rectangular OpenMP loops. gcc/c-family/ * c-common.h (c_omp_check_loop_iv_exprs): Add an int argument. * c-omp.c (struct c_omp_check_loop_iv_data): Add maybe_nonrect and idx members. (c_omp_is_loop_iterator): New function. (c_omp_check_loop_iv_r): Use it. Add support for silent scanning if outer loop iterator is present. Perform duplicate checking through hash_set in the function rather than expecting caller to do that. Pass NULL instead of d->ppset to walk_tree_1. (c_omp_check_nonrect_loop_iv): New function. (c_omp_check_loop_iv): Use it. Fill in new members, allow non-rectangular loop forms, diagnose multiple associated loops with the same iterator. Pass NULL instead of &pset to walk_tree_1. (c_omp_check_loop_iv_exprs): Likewise. gcc/c/ * c-parser.c (c_parser_expr_no_commas): Save, clear and restore c_in_omp_for. (c_parser_omp_for_loop): Set c_in_omp_for around some calls to avoid premature c_fully_fold. Defer explicit c_fully_fold calls to after c_finish_omp_for. * c-tree.h (c_in_omp_for): Declare. * c-typeck.c (c_in_omp_for): Define. (build_modify_expr): Avoid c_fully_fold if c_in_omp_for. (digest_init): Likewise. (build_binary_op): Likewise. gcc/cp/ * semantics.c (handle_omp_for_class_iterator): Adjust c_omp_check_loop_iv_exprs caller. (finish_omp_for): Likewise. Don't call fold_build_cleanup_point_expr before calling c_finish_omp_for and c_omp_check_loop_iv, move it after those calls. * pt.c (tsubst_omp_for_iterator): Handle non-rectangular loops. gcc/testsuite/ * c-c++-common/gomp/loop-6.c: New test. * gcc.dg/gomp/loop-1.c: Don't expect diagnostics on valid non-rectangular loops. * gcc.dg/gomp/loop-2.c: New test. * g++.dg/gomp/loop-1.C: Don't expect diagnostics on valid non-rectangular loops. * g++.dg/gomp/loop-2.C: Likewise. * g++.dg/gomp/loop-5.C: New test. * g++.dg/gomp/loop-6.C: New test. (cherry picked from commit 1160ec9a141faf1c4c0496c7412c8febeb623962)
2021-02-08Enable gimplify GOMP_MAP_STRUCT handling of (COMPONENT_REF (INDIRECT_REF ↵Chung-Lin Tang6-17/+78
...)) map clauses. This patch tries to allow map(A->ptr) to be properly handled the same way as map(B.ptr) expressions. map(struct:*A) clauses are now produced during gimplify. This patch, as of time of commit, is only pushed to devel/omp/gcc-10, not yet submitted as mainline patch to upstream. 2021-02-08 Chung-Lin Tang <cltang@codesourcery.com> gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> *. Adjust struct map handling to handle cases of *A and A->B expressions. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-3.C: Adjust testcase gimple scanning. * g++.dg/gomp/target-this-2.C: Likewise. * g++.dg/gomp/target-this-3.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c++/target-23.C: New testcase.
2021-02-02OpenMP 5.0: requires directiveChung-Lin Tang21-8/+236
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563393.html This patch completes more of the reverse_offload, unified_address, and unified_shared_memory clauses for the OpenMP 5.0 requires directive, including runtime verification of the offload target. (currently no offload devices actually support above features, only warning messages are emitted) This may possibly reverted/updated when a final patch is approved for mainline. 2021-02-02 Chung-Lin Tang <cltang@codesourcery.com> gcc/c/ChangeLog: * c-parser.c (c_parser_declaration_or_fndef): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has "omp declare target" attribute. (c_parser_omp_target_data): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask. (c_parser_omp_target_enter_data): Likewise. (c_parser_omp_target_exit_data): Likewise. (c_parser_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. gcc/cp/ChangeLog: * parser.c (cp_parser_simple_declaration): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has "omp declare target" attribute. (cp_parser_omp_target_data): Set OMP_REQUIRES_TARGET_USED in omp_requires_mask. (cp_parser_omp_target_enter_data): Likewise. (cp_parser_omp_target_exit_data): Likewise. (cp_parser_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. gcc/fortran/ChangeLog: * openmp.c (gfc_check_omp_requires): Fix REVERSE_OFFLOAD typo. (gfc_match_omp_requires): Adjust to only mention "not implemented yet" for OMP_REQUIRES_DYNAMIC_ALLOCATORS. * parse.c ("tree.h"): Add include. ("omp-general.h"): Likewise. (gfc_parse_file): Add code to merge omp_requires to omp_requires_mask. gcc/ChangeLog: * omp-offload.c (omp_finish_file): Add code to reate OpenMP requires mask variable in .gnu.gomp_requires section if needed. gcc/testsuite/ChangeLog: * c-c++-common/gomp/requires-4.c: Remove prune of "not supported yet". * gcc/testsuite/gfortran.dg/gomp/requires-4.f90: Fix REVERSE_OFFLOAD typo. * gcc/testsuite/gfortran.dg/gomp/requires-8.f90: Likewise. include/ChangeLog: * gomp-constants.h (GOMP_REQUIRES_UNIFIED_ADDRESS): New symbol. (GOMP_REQUIRES_UNIFIED_SHARED_MEMORY): Likewise. (GOMP_REQUIRES_REVERSE_OFFLOAD): Likewise. libgcc/ChangeLog: * offloadstuff.c (__requires_mask_table): New symbol to mark start of .gnu.gomp_requires section. (__requires_mask_table_end): New symbol to mark end of .gnu.gomp_requires section. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_supported_features): New declaration. * libgomp.h (struct gomp_device_descr): New 'supported_features_func' plugin hook field. * oacc-host.c (host_supported_features): New host hook function. (host_dispatch): Initialize 'supported_features_func' host hook. * plugin/plugin-gcn.c (GOMP_OFFLOAD_supported_features): New function. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_supported_features): Likewise. * target.c (<stdio.h>): Add include of standard header. (gomp_requires_mask): New static variable. (__requires_mask_table): New declaration. (__requires_mask_table_end): Likewise. (gomp_load_plugin_for_device): Add loading of 'supported_features' hook. (gomp_target_init): Add code to summarize .gnu._gomp_requires section mask values, emit error if inconsistency found. * testsuite/libgomp.c-c++-common/requires-1.c: New test. * testsuite/libgomp.c-c++-common/requires-1-aux.c: New file linked with above test. * testsuite/libgomp.c-c++-common/requires-2.c: New test. * testsuite/libgomp.c-c++-common/requires-2-aux.c: New file linked with above test. liboffloadmic/ChangeLog: * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_supported_features): New function.
2021-02-02OpenMP 5.0: map this[:1] in C++ non-static member functions (PR 92120)Chung-Lin Tang18-87/+963
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558975.html This patch creates automatic mapping of map(this[:1]) and pointer members as zero-length array sections, as specified by the OpenMP 5.0 specification. This may possibly reverted/updated when a final patch is approved for mainline. 2021-02-02 Chung-Lin Tang <cltang@codesourcery.com> PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (set_omp_target_this_expr): Likewise. * lambda.c (lambda_expr_this_capture): Add call to set_omp_target_this_expr. * parser.c (cp_parser_omp_target): Factor out code, change to call finish_omp_target, add re-initing call to set_omp_target_this_expr. * semantics.c (omp_target_this_expr): New static variable. (omp_target_ptr_members_accessed): New static hash_map for tracking accessed non-static pointer-type members. (finish_non_static_data_member): Add call to set_omp_target_this_expr. Add recording of non-static pointer-type members access. (finish_this_expr): Add call to set_omp_target_this_expr. (set_omp_target_this_expr): New function to set omp_target_this_expr. (finish_omp_target): New function with code merged from cp_parser_omp_target, plus code to implement this[:1] and __closure map clauses for OpenMP. (handle_omp_array_sections_1): Move code to peel of '*' for reference-based COMPONENT_REFs before FIELD_DECL transforming. (finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. (gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase.
2021-02-01OpenMP 5.0: Allow multiple clauses mapping same variableChung-Lin Tang4-37/+82
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562081.html This patch now allows multiple clauses on the same construct to map the same variable, which was not valid in OpenMP 4.5, but now allowed in 5.0. This may possibly reverted/updated when a final patch is approved for mainline. 2021-02-01 Chung-Lin Tang <cltang@codesourcery.com> gcc/c/ChangeLog: * c-typeck.c (c_finish_omp_clauses): Adjust to allow duplicate mapped variables for OpenMP. gcc/cp/ChangeLog: * semantics.c (finish_omp_clauses): Adjust to allow duplicate mapped variables for OpenMP. gcc/ChangeLog: * omp-low.c (install_parm_decl): Add new 'tree key_expr' parameter. Use key_expr as splay-tree key instead of var itself. (install_var_field): Add new 'tree key_expr = NULL_TREE' default parameter. Set splay-tree lookup key to key_expr instead of var if key_expr is non-NULL. Adjust call to install_parm_decl. Update comments. (scan_sharing_clauses): Use clause tree expression as splay-tree key for map/to/from and OpenACC firstprivate cases when installing the variable field into the send/receive record type. (maybe_lookup_field_in_outer_ctx): Add code to search through construct clauses instead of entirely based on splay-tree lookup. (lower_oacc_reductions): Adjust to find map-clause of reduction variable, then create receiver-ref. (lower_omp_target): Adjust to lookup var field using clause expression. gcc/testsuite/ChangeLog: * c-c++-common/gomp/clauses-2.c: Adjust testcase.
2021-01-27OpenMP 5.0 Structure element mappingChung-Lin Tang11-155/+688
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561139.html This patch implements the changes to the behavior of target mapping structure elements, as specified in OpenMP 5.0. This may possibly reverted/updated when a final patch is approved for mainline. libgomp/ChangeLog: * hashtab.h (htab_clear): New function with initialization code factored out from... (htab_create): ...here, adjust to use htab_clear function. * libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of special refcount values, add comments. (REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL. (REFCOUNT_LINK): Likewise. (REFCOUNT_STRUCTELEM): New special refcount range for structure element siblings. (REFCOUNT_STRUCTELEM_P): Macro for testing for structure element sibling maps. (REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling. (REFCOUNT_STRUCTELEM_FLAG_LAST): Flag to indicate last sibling. (REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag. (REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag. (struct splay_tree_key_s): Add structelem_refcount and structelem_refcount_ptr fields into a union with dynamic_refcount. Add comments. (gomp_map_vars): Delete declaration. (gomp_map_vars_async): Likewise. (gomp_unmap_vars): Likewise. (gomp_unmap_vars_async): Likewise. (goacc_map_vars): New declaration. (goacc_unmap_vars): Likewise. * oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars. (goacc_enter_datum): Likewise. (goacc_enter_data_internal): Likewise. * oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars and goacc_unmap_vars. (GOACC_data_start): Adjust to use goacc_map_vars. (GOACC_data_end): Adjust to use goacc_unmap_vars. * target.c (hash_entry_type): New typedef. (htab_alloc): New function hook for hashtab.h. (htab_free): Likewise. (htab_hash): Likewise. (htab_eq): Likewise. (hashtab.h): Add file include. (gomp_increment_refcount): New function. (gomp_decrement_refcount): Likewise. (gomp_map_vars_existing): Add refcount_set parameter, adjust to use gomp_increment_refcount. (gomp_map_fields_existing): Add refcount_set parameter, adjust calls to gomp_map_vars_existing. (gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p variable to guard OpenMP specific paths, adjust calls to gomp_map_vars_existing, add structure element sibling splay_tree_key sequence creation code, adjust Fortran map case to avoid increment under OpenMP. (gomp_map_vars): Adjust to static, add refcount_set parameter, manage local refcount_set if caller passed in NULL, adjust call to gomp_map_vars_internal. (gomp_map_vars_async): Adjust and rename into... (goacc_map_vars): ...this new function, adjust call to gomp_map_vars_internal. (gomp_remove_splay_tree_key): New function with code factored out from gomp_remove_var_internal. (gomp_remove_var_internal): Add code to handle removing multiple splay_tree_key sequence for structure elements, adjust code to use gomp_remove_splay_tree_key for splay-tree key removal. (gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use gomp_decrement_refcount. (gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage local refcount_set if caller passed in NULL, adjust call to gomp_unmap_vars_internal. (gomp_unmap_vars_async): Adjust and rename into... (goacc_unmap_vars): ...this new function, adjust call to gomp_unmap_vars_internal. (GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and gomp_unmap_vars. (GOMP_target_ext): Likewise. (gomp_target_data_fallback): Adjust call to gomp_map_vars. (GOMP_target_data): Likewise. (GOMP_target_data_ext): Likewise. (GOMP_target_end_data): Adjust call to gomp_unmap_vars. (gomp_exit_data): Add refcount_set parameter, adjust to use gomp_decrement_refcount, adjust to queue splay-tree keys for removal after main loop. (GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to gomp_map_vars and gomp_exit_data. (gomp_target_task_fn): Likewise. * testsuite/libgomp.c-c++-common/refcount-1.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.
2021-01-26amdgcn: Allow V64DFmode min/max reductionsAndrew Stubbs1-1/+2
I don't know why these were disabled. There're no direct min/max DPP instructions for this mode, but the "use moves" strategy works fine. gcc/ChangeLog: * config/gcn/gcn.c (gcn_expand_reduc_scalar): Use move instructions for V64DFmode min/max reductions. Backport from d9f50366102a8ca3521e4854f7716bd013c8ea0a.
2021-01-22openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738]Kwok Cheung Yeung7-54/+147
This adds support for the task detach clause to taskwait, and fixes a number of problems related to semaphores that may lead to a hang in some circumstances. 2021-01-21 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ PR libgomp/98738 * libgomp.h (enum gomp_task_kind): Add GOMP_TASK_DETACHED. * task.c (task_fulfilled_p): Check detach field as well. (GOMP_task): Add thread to debug messages. Use address of task as the event handle. (gomp_barrier_handle_tasks): Fix indentation. Use address of task as event handle. Set kind of suspended detach task to GOMP_TASK_DETACHED and decrement task_running_count. Move finish_cancelled block out of else branch. Skip decrement of task_running_count if task kind is GOMP_TASK_DETACHED. (GOMP_taskwait): Finish fulfilled detach tasks. Update comment. Queue detach tasks that have not been fulfilled. (omp_fulfill_event): Use address of task as event handle. Post to taskwait_sem and taskgroup_sem if necessary. Check task_running_count before calling gomp_team_barrier_wake. * testsuite/libgomp.c-c++-common/task-detach-5.c (main): Change data-sharing of detach events on enclosing parallel to private. * testsuite/libgomp.c-c++-common/task-detach-6.c (main): Likewise. * testsuite/libgomp.fortran/task-detach-5.f90 (task_detach_5): Likewise. * testsuite/libgomp.fortran/task-detach-6.f90 (task_detach_6): Likewise.
2021-01-22libgomp: Fix up GOMP_task on s390xJakub Jelinek2-11/+21
On Wed, Jan 20, 2021 at 05:04:39PM +0100, Florian Weimer wrote: > Sorry, this appears to cause OpenMP task state corruption in RPM. We > have only seen this on s390x. Haven't actually verified it, but my suspection is that this is a caller stack corruption. We play with fire with the GOMP_task API/ABI extensions, the GOMP_task function used to be: void GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *), long arg_size, long arg_align, bool if_clause, unsigned flags); and later: void GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *), long arg_size, long arg_align, bool if_clause, unsigned flags, void **depend); and later: void GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *), long arg_size, long arg_align, bool if_clause, unsigned flags, void **depend, int priority); and now: void GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *), long arg_size, long arg_align, bool if_clause, unsigned flags, void **depend, int priority, void *detach) and which of those depend, priority and detach argument is present depends on the bits in flags. I'm afraid the compiler just decided to spill the detach = NULL store in if ((flags & GOMP_TASK_FLAG_DETACH) == 0) detach = NULL; on s390x into the argument stack slot. Not a problem if the caller passes all those 10 arguments, but if not, can clobber random stack location. This hack should fix it up. Priority doesn't need changing, but I've changed it anyway just to be safe. With the patch none of the 3 arguments are ever modified, so I'd hope gcc doesn't decide to spill something unrelated there. 2021-01-20 Jakub Jelinek <jakub@redhat.com> * task.c (GOMP_task): Rename priority argument to priority_arg, add priority automatic variable and modify that variable. Instead of clearing detach argument when GOMP_TASK_FLAG_DETACH bit is not set, check flags for that bit. (cherry picked from commit 0bb27b81a762d3c607bd25409337c749f836c0cd)
2021-01-22Fix gfortran.dg/gomp/task-detach-1.f90 for non 64bit pointersTobias Burnus2-1/+10
gcc/testsuite/ChangeLog: PR fortran/98763 * gfortran.dg/gomp/task-detach-1.f90: Use integer(1) to avoid missing diagnostic issues with c_intptr_t == default integer kind. (cherry picked from commit a95538b6c5a9ea480e341da9ca8fbf201417dba5)
2021-01-22openmp: Don't ICE on detach clause with erroneous decl [PR98742]Jakub Jelinek4-0/+33
Similarly to how we handle erroneous operands to e.g. allocate clause, this change just removes those clauses instead of accessing TYPE_MAIN_VARIANT of its type, which doesn't work on error_mark_node. Also, just for good measure, bails out if TYPE_NAME is NULL. 2021-01-20 Jakub Jelinek <jakub@redhat.com> PR c++/98742 * semantics.c (finish_omp_clauses) <case OMP_CLAUSE_DETACH>: If error_operand_p, remove clause without further checking. Check for non-NULL TYPE_NAME. * c-c++-common/gomp/task-detach-2.c: New test. (cherry picked from commit 7ab1abf3b82a3bcfff9b7bc596166fef6a0d83ab)
2021-01-22openmp: Don't optimize shared to firstprivate on task with depend clauseJakub Jelinek4-0/+77
The attached testcase is miscompiled, because we optimize shared clauses to firstprivate when task body can't modify the variable even when the task has depend clause. That is wrong, because firstprivate means the variable will be copied immediately when the task is created, while with depend clause some other task might change it later before the dependencies are satisfied and the task should observe the value only after the change. 2020-12-18 Jakub Jelinek <jakub@redhat.com> * gimplify.c (struct gimplify_omp_ctx): Add has_depend member. (gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND appears on OMP_TASK. (gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force GOVD_WRITTEN on shared variables if task construct has depend clause. * testsuite/libgomp.c/task-6.c: New test. (cherry picked from commit 8b60459465252c7d47b58abf83fae2aa84915b03)
2021-01-22RTEMS: Fix libgomp buildSebastian Huber2-0/+12
libgomp/ * config/rtems/sem.h (gomp_sem_getcount): New function. (cherry picked from commit 0f951b3dd34b355579b4c9a9e287d32ac771bc67)
2021-01-22libgomp: Don't access gomp_sem_t as int using atomics unconditionallyJakub Jelinek6-2/+62
This patch introduces gomp_sem_getcount wrapper, which uses sem_getvalue for POSIX and atomic loads for linux futex and accel. rtems for now remains broken. 2021-01-18 Jakub Jelinek <jakub@redhat.com> * config/linux/sem.h (gomp_sem_getcount): New function. * config/posix/sem.h (gomp_sem_getcount): New function. * config/posix/sem.c (gomp_sem_getcount): New function. * config/accel/sem.h (gomp_sem_getcount): New function. * task.c (task_fulfilled_p): Use gomp_sem_getcount. (omp_fulfill_event): Likewise. (cherry picked from commit d3b41bde961713ff4af7e18011126434c497edba)
2021-01-22openmp: Add support for the OpenMP 5.0 task detach clauseKwok Cheung Yeung58-62/+1341
2021-01-16 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * builtin-types.def (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename to... (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR): ...this. Add extra argument. * gimplify.c (omp_default_clause): Ensure that event handle is firstprivate in a task region. (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_DETACH. (gimplify_adjust_omp_clauses): Likewise. * omp-builtins.def (BUILT_IN_GOMP_TASK): Change function type to BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR. * omp-expand.c (expand_task_call): Add GOMP_TASK_FLAG_DETACH to flags if detach clause specified. Add detach argument when generating call to GOMP_task. * omp-low.c (scan_sharing_clauses): Setup data environment for detach clause. (finish_taskreg_scan): Move field for variable containing the event handle to the front of the struct. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DETACH. Fix ordering. * tree-nested.c (convert_nonlocal_omp_clauses): Handle OMP_CLAUSE_DETACH clause. (convert_local_omp_clauses): Handle OMP_CLAUSE_DETACH clause. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_DETACH. * tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_DETACH. Fix ordering. (omp_clause_code_name): Add entry for OMP_CLAUSE_DETACH. Fix ordering. (walk_tree_1): Handle OMP_CLAUSE_DETACH. gcc/c-family/ * c-pragma.h (pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DETACH. Redefine PRAGMA_OACC_CLAUSE_DETACH. gcc/c/ * c-parser.c (c_parser_omp_clause_detach): New. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH clause. (OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH. * c-typeck.c (c_finish_omp_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH clause. Prevent use of detach with mergeable and overriding the data sharing mode of the event handle. gcc/cp/ * parser.c (cp_parser_omp_clause_detach): New. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH. (OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH. * pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_DETACH clause. * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_DETACH clause. Prevent use of detach with mergeable and overriding the data sharing mode of the event handle. gcc/fortran/ * dump-parse-tree.c (show_omp_clauses): Handle detach clause. * frontend-passes.c (gfc_code_walker): Walk detach expression. * gfortran.h (struct gfc_omp_clauses): Add detach field. (gfc_c_intptr_kind): New. * openmp.c (gfc_free_omp_clauses): Free detach clause. (gfc_match_omp_detach): New. (enum omp_mask1): Add OMP_CLAUSE_DETACH. (enum omp_mask2): Remove OMP_CLAUSE_DETACH. (gfc_match_omp_clauses): Handle OMP_CLAUSE_DETACH for OpenMP. (OMP_TASK_CLAUSES): Add OMP_CLAUSE_DETACH. (resolve_omp_clauses): Prevent use of detach with mergeable and overriding the data sharing mode of the event handle. * trans-openmp.c (gfc_trans_omp_clauses): Handle detach clause. * trans-types.c (gfc_c_intptr_kind): New. (gfc_init_kinds): Initialize gfc_c_intptr_kind. * types.def (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename to... (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PTR): ...this. Add extra argument. gcc/testsuite/ * c-c++-common/gomp/task-detach-1.c: New. * g++.dg/gomp/task-detach-1.C: New. * gcc.dg/gomp/task-detach-1.c: New. * gfortran.dg/gomp/task-detach-1.f90: New. include/ * gomp-constants.h (GOMP_TASK_FLAG_DETACH): New. libgomp/ * fortran.c (omp_fulfill_event_): New. * libgomp.h (struct gomp_task): Add detach and completion_sem fields. (struct gomp_team): Add task_detach_queue and task_detach_count fields. * libgomp.map (OMP_5.0.1): Add omp_fulfill_event and omp_fulfill_event_. * libgomp_g.h (GOMP_task): Add extra argument. * omp.h.in (enum omp_event_handle_t): New. (omp_fulfill_event): New. * omp_lib.f90.in (omp_event_handle_kind): New. (omp_fulfill_event): New. * omp_lib.h.in (omp_event_handle_kind): New. (omp_fulfill_event): Declare. * priority_queue.c (priority_tree_find): New. (priority_list_find): New. (priority_queue_find): New. * priority_queue.h (priority_queue_predicate): New. (priority_queue_find): New. * task.c (gomp_init_task): Initialize detach field. (task_fulfilled_p): New. (GOMP_task): Add detach argument. Ignore detach argument if GOMP_TASK_FLAG_DETACH not set in flags. Initialize completion_sem field. Copy address of completion_sem into detach argument and into the start of the data record. Wait for detach event if task not deferred. (gomp_barrier_handle_tasks): Queue tasks with unfulfilled events. Remove completed tasks and requeue dependent tasks. (omp_fulfill_event): New. * team.c (gomp_new_team): Initialize task_detach_queue and task_detach_count fields. (free_team): Free task_detach_queue field. * testsuite/libgomp.c-c++-common/task-detach-1.c: New testcase. * testsuite/libgomp.c-c++-common/task-detach-2.c: New testcase. * testsuite/libgomp.c-c++-common/task-detach-3.c: New testcase. * testsuite/libgomp.c-c++-common/task-detach-4.c: New testcase. * testsuite/libgomp.c-c++-common/task-detach-5.c: New testcase. * testsuite/libgomp.c-c++-common/task-detach-6.c: New testcase. * testsuite/libgomp.fortran/task-detach-1.f90: New testcase. * testsuite/libgomp.fortran/task-detach-2.f90: New testcase. * testsuite/libgomp.fortran/task-detach-3.f90: New testcase. * testsuite/libgomp.fortran/task-detach-4.f90: New testcase. * testsuite/libgomp.fortran/task-detach-5.f90: New testcase. * testsuite/libgomp.fortran/task-detach-6.f90: New testcase. (cherry picked from commit a6d22fb21c6f1ad7e8b6b722bfc0e7e11f50cb92)
2021-01-21Target mapping C++ members inside member functionsChung-Lin Tang4-16/+77
This is a merge of: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562467.html This patch fixes some problems with target mapping when inside C++ member functions: 1. Allow deref '->' in map clauses. 2. Allow this[X] in map clauses. 3. Create map(this->member) from map(member), when encountering member's FIELD_DECL. This may possibly reverted/updated when a final patch is approved for mainline. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-3.C: New test. * g++.dg/gomp/this-2.C: Adjust testcase.
2021-01-16Correct fix offload dwarf infoAndrew Stubbs1-3/+2
The previous patch wasn't quite right, apparently. Somehow the behaviour changed after another clean build? This tweak fixes it. This patch should be squashed with fdcb23540a2 to go to mainline. gcc/ChangeLog: * dwarf2out.c (gen_subprogram_die): Check offload attributes only.
2021-01-15DWARF address space for variablesAndrew Stubbs1-0/+9
Add DWARF address class attributes for variables that exist outside the generic address space. In particular, this is the case for gang-private variables in OpenACC offload kernels. gcc/ChangeLog: * dwarf2out.c (add_location_or_const_value_attribute): Set DW_AT_address_class, if appropriate.
2021-01-15amdgcn: DWARF address spacesAndrew Stubbs1-0/+28
Map GCN address spaces to the proposed DWARF address spaces defined by AMD. gcc/ChangeLog: * config/gcn/gcn.c: Include dwarf2.h. (gcn_addr_space_debug): New function. (TARGET_ADDR_SPACE_DEBUG): New hook.
2021-01-15amdgcn: Fix DWARF variables with allocaAndrew Stubbs1-4/+17
Require a frame pointer for entry functions that use alloca because it isn't possible to encode the DWARF frame otherwise. Adjust the CFA definition expressions accordingly. gcc/ChangeLog: * config/gcn/gcn.c (gcn_expand_prologue): Use the frame pointer for the DWARF CFA, if it exists. (gcn_frame_pointer_rqd): Require a frame pointer for entry functions that use alloca.
2021-01-15Fix offload dwarf infoAndrew Stubbs1-0/+14
Add a notional code range to the notional parent function of offload kernel functions. This is enough to prevent GDB discarding them. gcc/ChangeLog: * dwarf2out.c (gen_subprogram_die): Add high/low_pc attributes for parents of offload kernels.
2021-01-13[og10] vect: Add target hook to prefer gather/scatter instructionsJulian Brown5-2/+24
For AMD GCN, the instructions available for loading/storing vectors are always scatter/gather operations (i.e. there are separate addresses for each vector lane), so the current heuristic to avoid gather/scatter operations with too many elements in get_group_load_store_type is counterproductive. Avoiding such operations in that function can subsequently lead to a missed vectorization opportunity whereby later analyses in the vectorizer try to use a very wide array type which is not available on this target, and thus it bails out. The attached patch adds a target hook to override the "single_element_p" heuristic in the function as a target hook, and activates it for GCN. This allows much better code to be generated for affected loops. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add documentation hook. * doc/tm.texi: Regenerate. * target.def (prefer_gather_scatter): Add target hook under vectorizer. * tree-vect-stmts.c (get_group_load_store_type): Optionally prefer gather/scatter instructions to scalar/elementwise fallback. * config/gcn/gcn.c (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define hook.
2021-01-13[og10] openacc: Adjust loop lowering for AMD GCNJulian Brown7-43/+214
This patch adjusts OpenACC loop lowering in the AMD GCN target compiler in such a way that the autovectorizer can vectorize the "vector" dimension of those loops in more cases. Rather than generating "SIMT" code that executes a scalar instruction stream for each lane of a vector in lockstep, for GCN we model the GPU like a typical CPU, with separate instructions to operate on scalar and vector data. That means that unlike other offload targets, we rely on the autovectorizer to handle the innermost OpenACC parallelism level, which is "vector". Because of this, the OpenACC builtin functions to return the current vector lane and the vector width return 0 and 1 respectively, despite the native vector width being 64 elements wide. This allows generated code to work with our chosen compilation model, but the way loops are lowered in omp-offload.c:oacc_xform_loop does not understand the discrepancy between logical (OpenACC) and physical vector sizes correctly. That means that if a loop is partitioned over e.g. the worker AND vector dimensions, we actually lower with unit vector size -- meaning that if we then autovectorize, we end up trying to vectorize over the "worker" dimension rather than the vector one! Then, because the number of workers is not fixed at compile time, that means the autovectorizer has a hard time analysing the loop and thus vectorization often fails entirely. We can fix this by deducing the true vector width in oacc_xform_loop, and using that when we are on a "non-SIMT" offload target. We can then rearrange how loops are lowered in that function so that the loop form fed to the autovectorizer is more amenable to vectorization -- namely, the innermost step is set to process each loop iteration sequentially. For some benchmarks, allowing vectorization to succeed leads to quite impressive performance improvements -- I've observed between 2.5x and 40x on one machine/GPU combination. The low-level builtins available to user code (__builtin_goacc_parlevel_id and __builtin_goacc_parlevel_size) continue to return 0/1 respectively for the vector dimension for AMD GCN, even if their containing loop is vectorized -- that's a quirk that we might possibly want to address at some later date. Only non-"chunking" loops are handled at present. "Chunking" loops are still lowered as before. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * omp-offload.c (oacc_thread_numbers): Add VF_BY_VECTORIZER parameter. Add overloaded wrapper for previous arguments & behaviour. (oacc_xform_loop): Lower vector loops to iterate a multiple of omp_max_vf times over contiguous steps on non-SIMT targets. libgomp/testsuite/ * libgomp.oacc-c-c++-common/loop-gwv-1.c: Adjust for loop lowering changes. * libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. * libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise. * libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
2021-01-12amdgcn: Remove dead code for fixed v0 registerJulian Brown1-4/+0
This patch removes code to fix the v0 register in gcn_conditional_register_usage that was missed out of the previous patch removing the need for that: https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534284.html Backport from mainline: 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_conditional_register_usage): Remove dead code to fix v0 register. (cherry picked from commit 7993fe1877a689463d8c71a0873e5cc8db080273)
2021-01-12amdgcn: Fix exec register live-on-entry to BB in md-reorgJulian Brown1-1/+16
This patch fixes a corner case in the AMD GCN md-reorg pass when the EXEC register is live on entry to a BB, and could be clobbered by code inserted by the pass before a use in (e.g.) a different BB. Backport from mainline: 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_md_reorg): Fix case where EXEC reg is live on entry to a BB. (cherry picked from commit 3df6fac0080468d1521775e82a5e060f0b1c78ca)
2021-01-12amdgcn: Improve FP division accuracyJulian Brown3-20/+81
GCN has a reciprocal-approximation instruction but no hardware divide. This patch adjusts the open-coded reciprocal approximation/Newton-Raphson refinement steps to use fused multiply-add instructions as is necessary to obtain a properly-rounded result, and adds further refinement steps to correctly round the full division result. The patterns in question are still guarded by a flag_reciprocal_math condition, and do not yet support denormals. Backport from mainline: 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn-valu.md (recip<mode>2<exec>, recip<mode>2): Use unspec for reciprocal-approximation instructions. (div<mode>3): Use fused multiply-accumulate operations for reciprocal refinement and division result. * config/gcn/gcn.md (UNSPEC_RCP): New unspec constant. gcc/testsuite/ * gcc.target/gcn/fpdiv.c: New test. (cherry picked from commit c8812bac8ee39f73ea881e4f6260acf5590b4190)
2021-01-12amdgcn: Fix subdf3 patternJulian Brown1-1/+1
This patch fixes a typo in the subdf3 pattern that meant it had a non-standard name and thus the compiler would emit a libcall rather than the proper hardware instruction for DFmode subtraction. Backport from mainline: 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn-valu.md (subdf): Rename to... (subdf3): This. (cherry picked from commit abb3993e49c04bd40e42f196f55785cc3fd81682)
2021-01-05nvptx: Cache stacks block for OpenMP kernel launchJulian Brown2-18/+115
2021-01-05 Julian Brown <julian@codesourcery.com> libgomp/ * plugin/plugin-nvptx.c (SOFTSTACK_CACHE_LIMIT): New define. (struct ptx_device): Add omp_stacks struct. (nvptx_open_device): Initialise cached-stacks housekeeping info. (nvptx_close_device): Free cached stacks block and mutex. (nvptx_stacks_free): New function. (nvptx_alloc): Add SUPPRESS_ERRORS parameter. (GOMP_OFFLOAD_alloc): Add strategies for freeing soft-stacks block. (nvptx_stacks_alloc): Rename to... (nvptx_stacks_acquire): This. Cache stacks block between runs if same size or smaller is required. (nvptx_stacks_free): Remove. (GOMP_OFFLOAD_run): Call nvptx_stacks_acquire and lock stacks block during kernel execution. (cherry picked from commit 6b577a17b26347e78c8b9167f24fc5c9d9724270)
2021-01-05Fortran: Delay vtab generation until after parsing [PR92587]Tobias Burnus5-3/+100
Forward port of GCC 10 commit 6f3f06e431c181d3e51d31f49a2bf0be2944ae93 which is a backport of mainline commit ba9fa684053917a07bfa8f4742da0e196e72b9a2 gcc/fortran/ChangeLog: PR fortran/92587 * match.c (gfc_match_assignment): Move gfc_find_vtab call from here ... * resolve.c (gfc_resolve_code): ... to here. gcc/testsuite/ChangeLog: PR fortran/92587 * gfortran.dg/finalize_37.f90: New test. (cherry picked from commit ba9fa684053917a07bfa8f4742da0e196e72b9a2)
2020-12-18openmp: Fix g++.dg/gomp/declare-target-3.C testcase when offloading is disabledKwok Cheung Yeung2-7/+15
This is a backport from mainline of commit bfb37fa4dd49ee775ae90355464265a2f60c1067. 2020-12-18 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/testsuite/ * g++.dg/gomp/declare-target-3.C: Only check .offload_var_table entries if offloading is enabled.
2020-12-18OpenMP: Add implicit declare target for nested proceduresTobias Burnus4-0/+67
This is a backport from mainline of commit 8b0a63e47cd83f4e8534d0d201739bdd10f321a2. gcc/ChangeLog: * omp-offload.c (omp_discover_implicit_declare_target): Also handled nested functions. libgomp/ChangeLog: * testsuite/libgomp.fortran/declare-target-3.f90: New test.
2020-12-18openmp: Implicitly add 'declare target' directives for dynamic initializers ↵Kwok Cheung Yeung12-5/+158
in C++ This is a backport from mainline (commit 3af02d32cce2ff1ff11d078cf8094305f57ca179). 2020-12-18 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * langhooks-def.h (lhd_get_decl_init): New. (lhd_finish_decl_inits): New. (LANG_HOOKS_GET_DECL_INIT): New. (LANG_HOOKS_OMP_FINISH_DECL_INITS): New. (LANG_HOOKS_DECLS): Add LANG_HOOKS_GET_DECL_INIT and LANG_HOOKS_OMP_FINISH_DECL_INITS. * langhooks.c (lhd_omp_get_decl_init): New. (lhd_omp_finish_decl_inits): New. * langhooks.h (struct lang_hooks_for_decls): Add omp_get_decl_init and omp_finish_decl_inits. * omp-offload.c (omp_discover_declare_target_var_r): Use get_decl_init langhook in place of DECL_INITIAL. Call omp_finish_decl_inits langhook at end of function. gcc/cp/ * cp-lang.c (cxx_get_decl_init): New. (cxx_omp_finish_decl_inits): New. (LANG_HOOKS_GET_DECL_INIT): New. (LANG_HOOKS_OMP_FINISH_DECL_INITS): New. * cp-tree.h (dynamic_initializers): New. * decl.c (dynamic_initializers): New. * decl2.c (c_parse_final_cleanups): Add initializer entries from vars to dynamic_initializers. gcc/testsuite/ * g++.dg/gomp/declare-target-3.C: New.
2020-12-07openmp: Implement OpenMP 5.0 base-pointer attachement and clause orderingChung-Lin Tang15-102/+615
This patch implements some parts of the target variable mapping changes specified in OpenMP 5.0, including base-pointer attachment/detachment behavior for array section list-items in map clauses, and ordering of map clauses according to map kind. 2020-11-10 Chung-Lin Tang <cltang@codesourcery.com> gcc/c-family/ChangeLog: * c-common.h (c_omp_adjust_map_clauses): New declaration. * c-omp.c (struct map_clause): Helper type for c_omp_adjust_map_clauses. (c_omp_adjust_map_clauses): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_target_data): Add use of new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as handled map clause kind. (c_parser_omp_target_enter_data): Likewise. (c_parser_omp_target_exit_data): Likewise. (c_parser_omp_target): Likewise. * c-typeck.c (handle_omp_array_sections): Adjust COMPONENT_REF case to use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type. (c_finish_omp_clauses): Adjust bitmap checks to allow struct decl and same struct field access to co-exist on OpenMP construct. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_target_data): Add use of new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as handled map clause kind. (cp_parser_omp_target_enter_data): Likewise. (cp_parser_omp_target_exit_data): Likewise. (cp_parser_omp_target): Likewise. * semantics.c (handle_omp_array_sections): Adjust COMPONENT_REF case to use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type. Fix interaction between reference case and attach/detach. (finish_omp_clauses): Adjust bitmap checks to allow struct decl and same struct field access to co-exist on OpenMP construct. gcc/ChangeLog: * gimplify.c (is_or_contains_p): New static helper function. (omp_target_reorder_clauses): New function. (gimplify_scan_omp_clauses): Add use of omp_target_reorder_clauses to reorder clause list according to OpenMP 5.0 rules. Add handling of GOMP_MAP_ATTACH_DETACH for OpenMP cases. * omp-low.c (is_omp_target): New static helper function. (scan_sharing_clauses): Add scan phase handling of GOMP_MAP_ATTACH/DETACH for OpenMP cases. (lower_omp_target): Add lowering handling of GOMP_MAP_ATTACH/DETACH for OpenMP cases. gcc/testsuite/ChangeLog: * c-c++-common/gomp/clauses-2.c: Remove dg-error cases now valid. * gfortran.dg/gomp/map-2.f90: Likewise. * c-c++-common/gomp/map-5.c: New testcase. libgomp/ChangeLog: * libgomp.h (enum gomp_map_vars_kind): Adjust enum values to be bit-flag usable. * oacc-mem.c (acc_map_data): Adjust gomp_map_vars argument flags to 'GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_ENTER_DATA'. (goacc_enter_datum): Likewise for call to gomp_map_vars_async. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_internal): Change checks of GOMP_MAP_VARS_ENTER_DATA to use bit-and (&). Adjust use of gomp_attach_pointer for OpenMP cases. (gomp_exit_data): Add handling of GOMP_MAP_DETACH. (GOMP_target_enter_exit_data): Add handling of GOMP_MAP_ATTACH. * testsuite/libgomp.c-c++-common/ptr-attach-1.c: New testcase.
2020-11-19openmp: Retire nest-var ICV for OpenMP 5.1Kwok Cheung Yeung6-40/+132
This removes the nest-var ICV, expressing nesting in terms of the max-active-levels-var ICV instead. The max-active-levels-var ICV is now per data environment rather than per device. This is a backport from mainline (commit 6fae7eda968db658c280ad6f94fe6906a15af0c9). 2020-11-18 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * env.c (gomp_global_icv): Remove nest_var field. Add max_active_levels_var field. (gomp_max_active_levels_var): Remove. (parse_boolean): Return true on success. (handle_omp_display_env): Express OMP_NESTED in terms of max_active_levels_var. Change format specifier for max_active_levels_var. (initialize_env): Set max_active_levels_var from OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and OMP_PROC_BIND. * icv.c (omp_set_nested): Express in terms of max_active_levels_var. (omp_get_nested): Likewise. (omp_set_max_active_levels): Use max_active_levels_var field instead of gomp_max_active_levels_var. (omp_get_max_active_levels): Likewise. * libgomp.h (struct gomp_task_icv): Remove nest_var field. Add max_active_levels_var field. (gomp_supported_active_levels): Set to UCHAR_MAX. (gomp_max_active_levels_var): Delete. * libgomp.texi (omp_get_nested): Update documentation. (omp_set_nested): Likewise. (OMP_MAX_ACTIVE_LEVELS): Likewise. (OMP_NESTED): Likewise. (OMP_NUM_THREADS): Likewise. (OMP_PROC_BIND): Likewise. * parallel.c (gomp_resolve_num_threads): Replace reference to nest_var with max_active_levels_var. Use max_active_levels_var field instead of gomp_max_active_levels_var.
2020-11-13openmp: Mark deprecated symbols in OpenMP 5.0Kwok Cheung Yeung33-15/+121
This is a backport from mainline (commit 10508db867934264bbc2578f1f454c19fa558fd3). 2020-11-05 Ulrich Drepper <drepper@redhat.com> Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * Makefile.am (%.mod): Add -cpp and -fopenmp to compile flags. * Makefile.in: Regenerate. * fortran.c: Wrap uses of omp_set_nested and omp_get_nested with pragmas to ignore -Wdeprecated-declarations warnings. * icv.c: Likewise. * omp.h.in (__GOMP_DEPRECATED_5_0): Define. Mark omp_lock_hint_* enum values, omp_lock_hint_t, omp_set_nested, and omp_get_nested with __GOMP_DEPRECATED_5_0. * omp_lib.f90.in: Mark omp_get_nested and omp_set_nested as deprecated. * testsuite/libgomp.c++/affinity-1.C: Add -Wno-deprecated-declarations to test options. * testsuite/libgomp.c/affinity-1.c: Likewise. * testsuite/libgomp.c/affinity-2.c: Likewise. * testsuite/libgomp.c/appendix-a/a.15.1.c: Likewise. * testsuite/libgomp.c/lib-1.c: Likewise. * testsuite/libgomp.c/nested-1.c: Likewise. * testsuite/libgomp.c/nested-2.c: Likewise. * testsuite/libgomp.c/nested-3.c: Likewise. * testsuite/libgomp.c/pr32362-1.c: Likewise. * testsuite/libgomp.c/pr32362-2.c: Likewise. * testsuite/libgomp.c/pr32362-3.c: Likewise. * testsuite/libgomp.c/pr35549.c: Likewise. * testsuite/libgomp.c/pr42942.c: Likewise. * testsuite/libgomp.c/pr61200.c: Likewise. * testsuite/libgomp.c/sort-1.c: Likewise. * testsuite/libgomp.c/target-5.c: Likewise. * testsuite/libgomp.c/target-6.c: Likewise. * testsuite/libgomp.c/teams-1.c: Likewise. * testsuite/libgomp.c/thread-limit-1.c: Likewise. * testsuite/libgomp.c/thread-limit-2.c: Likewise. * testsuite/libgomp.c/thread-limit-4.c: Likewise. * testsuite/libgomp.fortran/affinity1.f90: Likewise. * testsuite/libgomp.fortran/lib1.f90: Likewise. * testsuite/libgomp.fortran/lib2.f: Likewise. * testsuite/libgomp.fortran/nested1.f90: Likewise. * testsuite/libgomp.fortran/teams1.f90: Likewise.
2020-11-13openmp: Add test for OMP_TARGET_OFFLOAD=mandatory for cases where it must ↵Jakub Jelinek2-0/+40
not fail This is a backport from mainline (commit 17c5b7e1dc47bab6e6cedbf4b2d88cef3283533e). 2020-10-22 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/target-41.c: New test.
2020-11-11openmp: Change omp_get_initial_device () to match OpenMP 5.1 requirementsJakub Jelinek8-21/+97
> Therefore, I think until omp_get_initial_device () value is changed, we The following so far untested patch implements that change. OpenMP 4.5 said for omp_get_initial_device: The value of the device number is implementation defined. If it is between 0 and one less than omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is outside that range, then it is only valid for use with the device memory routines and not in the device clause. and OpenMP 5.0 similarly, but OpenMP 5.1 says: The value of the device number is the value returned by the omp_get_num_devices routine. As the new value is compatible with what has been required earlier, I think we can change it already now. This is a backport from mainline (commit 74c9882b80bda50b37c9555498de7123c6bdb9e4). 2020-10-22 Jakub Jelinek <jakub@redhat.com> * icv.c (omp_get_initial_device): Remove including corresponding ialias. * icv-device.c (omp_get_initial_device): New function. Return gomp_get_num_devices (). Add ialias. * target.c (resolve_device): Don't fail with OMP_TARGET_OFFLOAD=mandatory if device_id is equal to gomp_get_num_devices (). (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr, omp_pause_resource): Use gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the first use in the functions, in uses dominated by the gomp_get_num_devices call use num_devices_openmp instead. * libgomp.texi (omp_get_initial_device): Document. * config/gcn/icv-device.c (omp_get_initial_device): New function. Add ialias. * config/nvptx/icv-device.c (omp_get_initial_device): Likewise. * testsuite/libgomp.c/target-40.c: New test.
2020-11-11openmp: Implement support for OMP_TARGET_OFFLOAD environment variableKwok Cheung Yeung5-44/+192
This implements support for the OMP_TARGET_OFFLOAD environment variable introduced in the OpenMP 5.0 standard, which controls how offloading is handled. It may be set to MANDATORY (abort if offloading cannot be performed), DISABLED (no offloading to devices) or DEFAULT (offload to device if possible, fall back to host if not). This is a backport from mainline (commits 1bfc07d150790fae93184a79a7cce897655cb37b, 35f258f4bbba7fa044f90b4f14d1bc942db58089 and 121a8812c45b3155ccbd268b000ad00a778e81e8). 2020-10-20 Kwok Cheung Yeung <kcy@codesourcery.com> Jakub Jelinek <jakub@redhat.com> libgomp/ * env.c (gomp_target_offload_var): New. (parse_target_offload): New. (handle_omp_display_env): Print value of OMP_TARGET_OFFLOAD. (initialize_env): Parse OMP_TARGET_OFFLOAD. * libgomp.h (gomp_target_offload_t): New. (gomp_target_offload_var): New. * libgomp.texi (OMP_TARGET_OFFLOAD): New section. * target.c (resolve_device): Generate error if device not found and offloading is mandatory. (gomp_target_fallback): Generate error if offloading is mandatory. (GOMP_target): Add argument in call to gomp_target_fallback. (GOMP_target_ext): Likewise. (gomp_target_data_fallback): Generate error if offloading is mandatory. (GOMP_target_data): Add argument in call to gomp_target_data_fallback. (GOMP_target_data_ext): Likewise. (gomp_target_task_fn): Add argument in call to gomp_target_fallback. (gomp_target_init): Return early if offloading is disabled. Inside of the function, use automatic variables corresponding to num_devices, num_devices_openmp and devices global variables and update the globals only at the end of the function.
2020-11-11openmp: Add support for the omp_get_supported_active_levels runtime library ↵Kwok Cheung Yeung12-6/+100
routine This patch implements the omp_get_supported_active_levels runtime routine from the OpenMP 5.0 specification, which returns the maximum number of active nested parallel regions supported by this implementation. The current maximum (set using the omp_set_max_active_levels routine or the OMP_MAX_ACTIVE_LEVELS environment variable) cannot exceed this number. This is a backport from mainline (commits 8949b985dbaf07d433bd57d2883e1e5414f20e75 and 445567b22a3c535be0b1861b393e9a0b050f2b1e). 2020-10-13 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * env.c (gomp_max_active_levels_var): Initialize to gomp_supported_active_levels. (initialize_env): Limit gomp_max_active_levels_var to be at most equal to gomp_supported_active_levels. * fortran.c (omp_get_supported_active_levels): Add ialias_redirect. (omp_get_supported_active_levels_): New. * icv.c (omp_set_max_active_levels): Limit gomp_max_active_levels_var to at most equal to gomp_supported_active_levels. (omp_get_supported_active_levels): New. * libgomp.h (gomp_supported_active_levels): New. * libgomp.map (OMP_5.0.1): Add omp_get_supported_active_levels and omp_get_supported_active_levels_. * libgomp.texi (omp_get_max_active_levels): Modify description. (omp_get_supported_active_levels): New. (omp_set_max_active_levels): Update. Add reference to omp_get_supported_active_levels. * omp.h.in (omp_get_supported_active_levels): New. * omp_lib.f90.in (omp_get_supported_active_levels): New. * omp_lib.h.in (omp_get_supported_active_levels): New. * testsuite/libgomp.c/lib-2.c (main): Check omp_get_max_active_levels against omp_get_supported_active_levels. * testsuite/libgomp.fortran/lib4.f90 (lib4): Likewise.
2020-09-28Merge remote-tracking branch 'origin/releases/gcc-10' into devel/omp/gcc-10Tobias Burnus70-443/+1635
Merged up to a6c47f4ce26639bfbc72821ae629b9af7744a9d7 (2020-09-28)
2020-09-28OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390)Tobias Burnus5-33/+46
Backport mainline version; updates commit ef509d1985aa53a8c0875c25ad4050ea807be10e for later review-comment changes. gcc/ChangeLog: PR middle-end/96390 * omp-offload.c (omp_discover_declare_target_tgt_fn_r): Handle alias nodes. libgomp/ChangeLog: PR middle-end/96390 * testsuite/libgomp.c++/pr96390.C: New test. * testsuite/libgomp.c-c++-common/pr96390.c: New test. (cherry picked from commit 2a10a2c0689db280ee3a94164504b7196b8370f4)
2020-09-28gomp/pr94874.c: Update scan-tree-dumpTobias Burnus2-2/+7
g++ on OG10 adds an additional temporary compared to mainline. * c-c++-common/gomp/pr94874.c: Update scan-tree-dump.
2020-09-28testsuite: [aarch64] Fix aarch64/advsimd-intrinsics/v{trn,uzp,zip}_half.cChristophe Lyon3-9/+0
Since r11-3402 (g:65c9878641cbe0ed898aa7047b7b994e9d4a5bb1), the vtrn_half, vuzp_half and vzip_half started failing with vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no linkage vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no linkage vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no linkage This is because r11-3402 now always declares float64x2 variables for aarch64, leading to a duplicate declaration in these testcases. The fix is simply to remove these now useless declarations. These tests are skipped on arm*, so there is no impact on that target. 2020-09-25 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ PR target/71233 * gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Remove declarations of vector, vector2, vector_res for float64x2 type. * gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise. (cherry picked from commit 8c775bf447e190024fa08c55e38db94dd013a393)
2020-09-28AArch64: Implement missing p128<->f64 reinterpret intrinsicsKyrylo Tkachov3-2/+39
This patch implements the missing reinterprets to and from poly128_t and float64x2_t. I've plugged in the appropriate testing in the advsimd-intrinsics.exp too. Bootstrapped and tested on aarch64-none-linux-gnu. Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm testing isn't affected. gcc/ PR target/71233 * config/aarch64/arm_neon.h (vreinterpretq_f64_p128, vreinterpretq_p128_f64): Define. gcc/testsuite/ PR target/71233 * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (clean_results): Add float64x2_t cleanup. (DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable. * gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64. (cherry picked from commit 65c9878641cbe0ed898aa7047b7b994e9d4a5bb1)
2020-09-28AArch64: Implement missing vrndns_f32 intrinsicKyrylo Tkachov3-2/+21
This patch implements the missing vrndns_f32 intrinsic. This operates on a scalar float32_t value. It can be mapped down to a __builtin_aarch64_frintnsf builtin. This patch does that. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ PR target/71233 * config/aarch64/aarch64-simd-builtins.def (frintn): Use BUILTIN_VHSDF_HSDF for modes. Remove explicit hf instantiation. * config/aarch64/arm_neon.h (vrndns_f32): Define. gcc/testsuite/ PR target/71233 * gcc.target/aarch64/simd/vrndns_f32_1.c: New test. (cherry picked from commit 02b5377b3766804059b7824330d33d0e1cef2e5b)
2020-09-28AArch64: Implement missing _p64 intrinsics for vector permutesKyrylo Tkachov2-0/+111
This patch implements some missing vector permute intrinsics operating on poly64x2_t types. They are implemented identically to their uint64x2_t brethren. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ PR target/71233 * config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64, vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define. gcc/testsuite/ PR target/71233 * gcc.target/aarch64/simd/trn_zip_p64_1.c: New test. (cherry picked from commit e8e818399d70c5a5a3d30a54d305c6e2b92e2c66)
2020-09-28AArch64: Implement vldrq_p128 intrinsicKyrylo Tkachov2-0/+20
This patch implements the missing vldrq_p128 intrinsic that just loads from the appropriate pointer. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ PR target/71233 * config/aarch64/arm_neon.h (vldrq_p128): Define. gcc/testsuite/ PR target/71233 * gcc.target/aarch64/simd/vldrq_p128_1.c: New test. (cherry picked from commit f2868e4bcff2c7b882d01231f039459c00e59d7b)