aboutsummaryrefslogtreecommitdiff
path: root/libgomp
AgeCommit message (Collapse)AuthorFilesLines
2023-12-12Daily bump.GCC Administrator1-0/+15
2023-12-11libgfortran: Replace mutex with rwlockLipeng Zhu3-0/+73
This patch try to introduce the rwlock and split the read/write to unit_root tree and unit_cache with rwlock instead of the mutex to increase CPU efficiency. In the get_gfc_unit function, the percentage to step into the insert_unit function is around 30%, in most instances, we can get the unit in the phase of reading the unit_cache or unit_root tree. So split the read/write phase by rwlock would be an approach to make it more parallel. BTW, the IPC metrics can gain around 9x in our test server with 220 cores. The benchmark we used is https://github.com/rwesson/NEAT libgcc/ChangeLog: * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro. (__gthrw): New function. (__gthread_rwlock_rdlock): New function. (__gthread_rwlock_tryrdlock): New function. (__gthread_rwlock_wrlock): New function. (__gthread_rwlock_trywrlock): New function. (__gthread_rwlock_unlock): New function. libgfortran/ChangeLog: * io/async.c (DEBUG_LINE): New macro. * io/async.h (RWLOCK_DEBUG_ADD): New macro. (CHECK_RDLOCK): New macro. (CHECK_WRLOCK): New macro. (TAIL_RWLOCK_DEBUG_QUEUE): New macro. (IN_RWLOCK_DEBUG_QUEUE): New macro. (RDLOCK): New macro. (WRLOCK): New macro. (RWUNLOCK): New macro. (RD_TO_WRLOCK): New macro. (INTERN_RDLOCK): New macro. (INTERN_WRLOCK): New macro. (INTERN_RWUNLOCK): New macro. * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in a comment. (unit_lock): Remove including associated internal_proto. (unit_rwlock): New declarations including associated internal_proto. (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock instead of __gthread_mutex_lock and __gthread_mutex_unlock on unit_lock. * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (st_write_done_worker): Likewise. * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules' comment. Use unit_rwlock variable instead of unit_lock variable. (get_gfc_unit_from_unit_root): New function. (get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (close_units): Likewise. (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on unit_lock. * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock.
2023-12-11aarch64: enable mixed-types for aarch64 simdclonesAndre Vieira2-3/+13
This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the 'Vector Function Application Binary Interface Specification for AArch64'. Additionally this patch also restricts combinations of simdlen and return/argument types that map to vectors larger than 128 bits as we currently do not have a way to represent these types in a way that is consistent internally and externally. gcc/ChangeLog: * config/aarch64/aarch64.cc (lane_size): New function. (aarch64_simd_clone_compute_vecsize_and_simdlen): Determine simdlen according to NDS rule and reject combination of simdlen and types that lead to vectors larger than 128bits. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add aarch64 targets to vect_simd_clones. * c-c++-common/gomp/declare-variant-14.c: Adapt test for aarch64. * c-c++-common/gomp/pr60823-1.c: Likewise. * c-c++-common/gomp/pr60823-2.c: Likewise. * c-c++-common/gomp/pr60823-3.c: Likewise. * g++.dg/gomp/attrs-10.C: Likewise. * g++.dg/gomp/declare-simd-1.C: Likewise. * g++.dg/gomp/declare-simd-3.C: Likewise. * g++.dg/gomp/declare-simd-4.C: Likewise. * g++.dg/gomp/declare-simd-7.C: Likewise. * g++.dg/gomp/declare-simd-8.C: Likewise. * g++.dg/gomp/pr88182.C: Likewise. * gcc.dg/declare-simd.c: Likewise. * gcc.dg/gomp/declare-simd-1.c: Likewise. * gcc.dg/gomp/declare-simd-3.c: Likewise. * gcc.dg/gomp/pr87887-1.c: Likewise. * gcc.dg/gomp/pr87895-1.c: Likewise. * gcc.dg/gomp/pr89246-1.c: Likewise. * gcc.dg/gomp/pr99542.c: Likewise. * gcc.dg/gomp/simd-clones-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-1.c: Likewise. * gcc.dg/vect/vect-simd-clone-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-4.c: Likewise. * gcc.dg/vect/vect-simd-clone-5.c: Likewise. * gcc.dg/vect/vect-simd-clone-6.c: Likewise. * gcc.dg/vect/vect-simd-clone-7.c: Likewise. * gcc.dg/vect/vect-simd-clone-8.c: Likewise. * gfortran.dg/gomp/declare-simd-2.f90: Likewise. * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Likewise. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. * gfortran.dg/gomp/pr79154-1.f90: Likewise. * gfortran.dg/gomp/pr83977.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-1.c: Adapt test for aarch64. * testsuite/libgomp.fortran/declare-simd-1.f90: Likewise.
2023-12-11OpenMP: Minor '!$omp allocators' cleanupTobias Burnus1-0/+3
gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): Set 'fn spec'. libgomp/ChangeLog: * libgomp_g.h (GOMP_add_alloc, GOMP_is_alloc): Add.
2023-12-09Daily bump.GCC Administrator1-0/+22
2023-12-08OpenMP/Fortran: Implement omp allocators/allocate for ptr/allocatablesTobias Burnus12-7/+417
This commit adds -fopenmp-allocators which enables support for 'omp allocators' and 'omp allocate' that are associated with a Fortran allocate-stmt. If such a construct is encountered, an error is shown, unless the -fopenmp-allocators flag is present. With -fopenmp -fopenmp-allocators, those constructs get turned into GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp) ensures deallocation and reallocation (via intrinsic assignments) are properly directed to GOMP_free/omp_realloc - while normal Fortran allocations are processed by free/realloc. In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the version field of the Fortran array discriptor is (mis)used: 0 indicates the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars, there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the pointer address to a splay_tree while GOMP_is_alloc(ptr) will return true it was previously added but also removes it from the list. Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of omp-builtins.def and libgomp gains the mentioned two new function. gcc/ChangeLog: * builtin-types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New. * omp-builtins.def (BUILT_IN_GOMP_REALLOC): New. * builtins.cc (builtin_fnspec): Handle it. * gimple-ssa-warn-access.cc (fndecl_alloc_p, matching_alloc_calls_p): Likewise. * gimple.cc (nonfreeing_call_p): Likewise. * predict.cc (expr_expected_value_1): Likewise. * tree-ssa-ccp.cc (evaluate_stmt): Likewise. * tree.cc (fndecl_dealloc_argno): Likewise. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_ALLOCATE and EXEC_OMP_ALLOCATORS. * f95-lang.cc (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST): Add 'ECF_LEAF | ECF_MALLOC' to existing 'ECF_NOTHROW'. (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): Define. * gfortran.h (gfc_omp_clauses): Add contained_in_target_construct. * invoke.texi (-fopenacc, -fopenmp): Update based on C version. (-fopenmp-simd): New, based on C version. (-fopenmp-allocators): New. * lang.opt (fopenmp-allocators): Add. * openmp.cc (resolve_omp_clauses): For allocators/allocate directive, add target and no dynamic_allocators diagnostic and more invalid diagnostic. * parse.cc (decode_omp_directive): Set contains_teams_construct. * trans-array.h (gfc_array_allocate): Update prototype. (gfc_conv_descriptor_version): New prototype. * trans-decl.cc (gfc_init_default_dt): Fix comment. * trans-array.cc (gfc_conv_descriptor_version): New. (gfc_array_allocate): Support GOMP_alloc allocation. (gfc_alloc_allocatable_for_assignment, structure_alloc_comps): Handle GOMP_free/omp_realloc as needed. * trans-expr.cc (gfc_conv_procedure_call): Likewise. (alloc_scalar_allocatable_for_assignment): Likewise. * trans-intrinsic.cc (conv_intrinsic_move_alloc): Likewise. * trans-openmp.cc (gfc_trans_omp_allocators, gfc_trans_omp_directive): Handle allocators/allocate directive. (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New. * trans-stmt.h (gfc_trans_allocate): Update prototype. * trans-stmt.cc (gfc_trans_allocate): Support GOMP_alloc. * trans-types.cc (gfc_get_dtype_rank_type): Set version field. * trans.cc (gfc_allocate_using_malloc, gfc_allocate_allocatable): Update to handle GOMP_alloc. (gfc_deallocate_with_status, gfc_deallocate_scalar_with_status): Handle GOMP_free. (trans_code): Update call. * trans.h (gfc_allocate_allocatable, gfc_allocate_using_malloc): Update prototype. (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New prototype. * types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New. libgomp/ChangeLog: * allocator.c (struct fort_alloc_splay_tree_key_s, fort_alloc_splay_compare, GOMP_add_alloc, GOMP_is_alloc): New. * libgomp.h: Define splay_tree_static for 'reverse' splay tree. * libgomp.map (GOMP_5.1.2): New; add GOMP_add_alloc and GOMP_is_alloc; move GOMP_target_map_indirect_ptr from ... (GOMP_5.1.1): ... here. * libgomp.texi (Impl. Status, Memory management): Update for allocators/allocate directives. * splay-tree.c: Handle splay_tree_static define to declare all functions as static. (splay_tree_lookup_node): New. * splay-tree.h: Handle splay_tree_decl_only define. (splay_tree_lookup_node): New prototype. * target.c: Define splay_tree_static for 'reverse'. * testsuite/libgomp.fortran/allocators-1.f90: New test. * testsuite/libgomp.fortran/allocators-2.f90: New test. * testsuite/libgomp.fortran/allocators-3.f90: New test. * testsuite/libgomp.fortran/allocators-4.f90: New test. * testsuite/libgomp.fortran/allocators-5.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-14.f90: Add coarray and not-listed tests. * gfortran.dg/gomp/allocate-5.f90: Remove sorry dg-message. * gfortran.dg/bind_c_array_params_2.f90: Update expected dump for dtype '.version=0'. * gfortran.dg/gomp/allocate-16.f90: New test. * gfortran.dg/gomp/allocators-3.f90: New test. * gfortran.dg/gomp/allocators-4.f90: New test.
2023-12-07Daily bump.GCC Administrator1-0/+75
2023-12-06amdgcn, libgomp: low-latency allocatorAndrew Stubbs7-10/+188
This implements the OpenMP low-latency memory allocator for AMD GCN using the small per-team LDS memory (Local Data Store). Since addresses can now refer to LDS space, the "Global" address space is no-longer compatible. This patch therefore switches the backend to use entirely "Flat" addressing (which supports both memories). A future patch will re-enable "global" instructions for cases where it is known to be safe to do so. gcc/ChangeLog: * config/gcn/gcn-builtins.def (DISPATCH_PTR): New built-in. * config/gcn/gcn.cc (gcn_init_machine_status): Disable global addressing. (gcn_expand_builtin_1): Implement GCN_BUILTIN_DISPATCH_PTR. libgomp/ChangeLog: * config/gcn/libgomp-gcn.h (TEAM_ARENA_START): Move to here. (TEAM_ARENA_FREE): Likewise. (TEAM_ARENA_END): Likewise. (GCN_LOWLAT_HEAP): New. * config/gcn/team.c (LITTLEENDIAN_CPU): New, and import hsa.h. (__gcn_lowlat_init): New prototype. (gomp_gcn_enter_kernel): Initialize the low-latency heap. * libgomp.h (TEAM_ARENA_START): Move to libgomp.h. (TEAM_ARENA_FREE): Likewise. (TEAM_ARENA_END): Likewise. * plugin/plugin-gcn.c (lowlat_size): New variable. (print_kernel_dispatch): Label the group_segment_size purpose. (init_environment_variables): Read GOMP_GCN_LOWLAT_POOL. (create_kernel_dispatch): Pass low-latency head allocation to kernel. (run_kernel): Use shadow; don't assume values. * testsuite/libgomp.c/omp_alloc-traits.c: Enable for amdgcn. * config/gcn/allocator.c: New file. * libgomp.texi: Document low-latency implementation details.
2023-12-06openmp, nvptx: low-lat memory access traitsAndrew Stubbs10-6/+166
The NVPTX low latency memory is not accessible outside the team that allocates it, and therefore should be unavailable for allocators with the access trait "all". This change means that the omp_low_lat_mem_alloc predefined allocator no longer works (but omp_cgroup_mem_alloc still does). libgomp/ChangeLog: * allocator.c (MEMSPACE_VALIDATE): New macro. (omp_init_allocator): Use MEMSPACE_VALIDATE. (omp_aligned_alloc): Use OMP_LOW_LAT_MEM_ALLOC_INVALID. (omp_aligned_calloc): Likewise. (omp_realloc): Likewise. * config/nvptx/allocator.c (nvptx_memspace_validate): New function. (MEMSPACE_VALIDATE): New macro. (OMP_LOW_LAT_MEM_ALLOC_INVALID): New define. * libgomp.texi: Document low-latency implementation details. * testsuite/libgomp.c/omp_alloc-1.c (main): Add gnu_lowlat. * testsuite/libgomp.c/omp_alloc-2.c (main): Add gnu_lowlat. * testsuite/libgomp.c/omp_alloc-3.c (main): Add gnu_lowlat. * testsuite/libgomp.c/omp_alloc-4.c (main): Add access trait. * testsuite/libgomp.c/omp_alloc-5.c (main): Add gnu_lowlat. * testsuite/libgomp.c/omp_alloc-6.c (main): Add access trait. * testsuite/libgomp.c/omp_alloc-traits.c: New test.
2023-12-06libgomp, nvptx: low-latency memory allocatorAndrew Stubbs12-105/+1239
This patch adds support for allocating low-latency ".shared" memory on NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory can be allocated, reallocated, and freed using a basic but fast algorithm, is thread safe and the size of the low-latency heap can be configured using the GOMP_NVPTX_LOWLAT_POOL environment variable. The use of the PTX dynamic_smem_size feature means that low-latency allocator will not work with the PTX 3.1 multilib. For now, the omp_low_lat_mem_alloc allocator also works, but that will change when I implement the access traits. libgomp/ChangeLog: * allocator.c (MEMSPACE_ALLOC): New macro. (MEMSPACE_CALLOC): New macro. (MEMSPACE_REALLOC): New macro. (MEMSPACE_FREE): New macro. (predefined_alloc_mapping): New array. Add _Static_assert to match. (ARRAY_SIZE): New macro. (omp_aligned_alloc): Use MEMSPACE_ALLOC. Implement fall-backs for predefined allocators. Simplify existing fall-backs. (omp_free): Use MEMSPACE_FREE. (omp_calloc): Use MEMSPACE_CALLOC. Implement fall-backs for predefined allocators. Simplify existing fall-backs. (omp_realloc): Use MEMSPACE_REALLOC, MEMSPACE_ALLOC, and MEMSPACE_FREE. Implement fall-backs for predefined allocators. Simplify existing fall-backs. * config/nvptx/team.c (__nvptx_lowlat_pool): New asm variable. (__nvptx_lowlat_init): New prototype. (gomp_nvptx_main): Call __nvptx_lowlat_init. * libgomp.texi: Update memory space table. * plugin/plugin-nvptx.c (lowlat_pool_size): New variable. (GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar. (GOMP_OFFLOAD_run): Apply lowlat_pool_size. * basic-allocator.c: New file. * config/nvptx/allocator.c: New file. * testsuite/libgomp.c/omp_alloc-1.c: New test. * testsuite/libgomp.c/omp_alloc-2.c: New test. * testsuite/libgomp.c/omp_alloc-3.c: New test. * testsuite/libgomp.c/omp_alloc-4.c: New test. * testsuite/libgomp.c/omp_alloc-5.c: New test. * testsuite/libgomp.c/omp_alloc-6.c: New test. Co-authored-by: Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2023-12-01Daily bump.GCC Administrator1-0/+46
2023-11-30Fix 'libgomp.c/declare-variant-4-*.c', add 'libgomp.c/declare-variant-4.c'Thomas Schwinge8-12/+31
These test cases being 'dg-skip-if [...] { ! amdgcn-*-* }' meant they were only ever considered for 'amdgcn-*-*' target -- that is, GCN *target*, not GCN *offload target*, what is intended to be tested here, but for which they thus always were UNSUPPORTED. Use the same style of 'dg-[...]' directives as for the nvptx offloading test cases, 'libgomp.c/declare-variant-3*.c'. Fix-up for commit 1fd508744eccda9ad9c6d6fcce5b2ea9c568818d "amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors". libgomp/ * testsuite/libgomp.c/declare-variant-4-fiji.c: Adjust. * testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise. * testsuite/libgomp.c/declare-variant-4.h: Likewise. * testsuite/libgomp.c/declare-variant-4.c: New.
2023-11-30Spin 'dg-do run' part of 'libgomp.c/declare-variant-3-sm30.c' off into new ↵Thomas Schwinge3-1/+14
'libgomp.c/declare-variant-3.c' Having nvptx offloading configured doesn't imply being able to run nvptx offloading test cases on the test host. Also, make 'libgomp.c/declare-variant-3.c' work for all non-offloading and offloading cases. Fix-up for commit 59b8ade88774b4dcf1691a8f650cdbb86cc30862 "[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c". libgomp/ * testsuite/libgomp.c/declare-variant-3-sm30.c: Turn 'dg-do run' into 'dg-do link'. * testsuite/libgomp.c/declare-variant-3.c: New. * testsuite/libgomp.c/declare-variant-3.h: Extend.
2023-11-30In 'libgomp.c/declare-variant-{3,4}-*.c', restrict 'scan-offload-tree-dump's ↵Thomas Schwinge12-12/+12
to 'only_for_offload_target [...]' ... to care for the case where not just one but both of GCN and nvptx offloading are enabled. In that case, we currently get: UNRESOLVED: libgomp.c/declare-variant-3-sm30.c scan-amdgcn-amdhsa-offload-tree-dump optimized "= f30 \\(\\);" ... in addition to: PASS: libgomp.c/declare-variant-3-sm30.c scan-nvptx-none-offload-tree-dump optimized "= f30 \\(\\);" Etc. Fix-up for commit 59b8ade88774b4dcf1691a8f650cdbb86cc30862 "[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c", and commit 1fd508744eccda9ad9c6d6fcce5b2ea9c568818d "amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors". libgomp/ * testsuite/libgomp.c/declare-variant-3-sm30.c: Restrict 'scan-offload-tree-dump' to 'only_for_offload_target nvptx-none'. * testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise. * testsuite/libgomp.c/declare-variant-4-fiji.c: Restrict 'scan-offload-tree-dump' to 'only_for_offload_target amdgcn-amdhsa'. * testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
2023-11-30Fix 'libgomp.c/declare-variant-3-*.c' compilation for configurations where ↵Thomas Schwinge6-0/+6
GCN offloading is enabled in addition to nvptx The GCN offloading compiler doesn't like '-misa=sm_30' etc.; restrict to '-foffload=nvptx-none' compilation only. Fix-up for commit 59b8ade88774b4dcf1691a8f650cdbb86cc30862 "[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c". libgomp/ * testsuite/libgomp.c/declare-variant-3-sm30.c: 'dg-additional-options -foffload=nvptx-none'. * testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise. * testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise.
2023-11-30Daily bump.GCC Administrator1-0/+8
2023-11-29In 'libgomp.c/target-simd-clone-{1,2,3}.c', restrict ↵Thomas Schwinge3-5/+5
'scan-offload-ipa-dump's to 'only_for_offload_target amdgcn-amdhsa' This gets rid of UNRESOLVEDs if nvptx offloading compilation is enabled in addition to GCN: PASS: libgomp.c/target-simd-clone-1.c (test for excess errors) PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit" -UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit" PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit" -UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit" PASS: libgomp.c/target-simd-clone-2.c (test for excess errors) PASS: libgomp.c/target-simd-clone-2.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone" -UNRESOLVED: libgomp.c/target-simd-clone-2.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone" PASS: libgomp.c/target-simd-clone-3.c (test for excess errors) PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "device doesn't match" -UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump simdclone "device doesn't match" PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone" -UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone" Minor fix-up for commit 309e2d95e3b930c6f15c8a5346b913158404c76d 'OpenMP: Generate SIMD clones for functions with "declare target"'. libgomp/ * testsuite/libgomp.c/target-simd-clone-1.c: Restrict 'scan-offload-ipa-dump's to 'only_for_offload_target amdgcn-amdhsa'. * testsuite/libgomp.c/target-simd-clone-2.c: Likewise. * testsuite/libgomp.c/target-simd-clone-3.c: Likewise.
2023-11-25Daily bump.GCC Administrator1-0/+5
2023-11-24OpenMP: Accept argument to depobj's destroy clauseTobias Burnus1-1/+1
Since OpenMP 5.2, the destroy clause takes an depend argument as argument; for the depobj directive, it the new argument is optional but, if present, it must be identical to the directive's argument. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_depobj): Accept optionally an argument to the destroy clause. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_depobj): Accept optionally an argument to the destroy clause. gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_depobj): Accept optionally an argument to the destroy clause. libgomp/ChangeLog: * libgomp.texi (5.2 Impl. Status): An argument to the destroy clause is now supported. gcc/testsuite/ChangeLog: * c-c++-common/gomp/depobj-3.c: New test. * gfortran.dg/gomp/depobj-3.f90: New test.
2023-11-23Daily bump.GCC Administrator1-0/+7
2023-11-22Adjust 'libgomp.c/declare-variant-{3,4}-[...]' for inter-procedural value ↵Thomas Schwinge2-0/+15
range propagation ..., that is, commit 53ba8d669550d3a1f809048428b97ca607f95cf5 "inter-procedural value range propagation", after which we see: [-PASS:-]{+FAIL:+} libgomp.c/declare-variant-3-sm30.c scan-nvptx-none-offload-tree-dump optimized "= f30 \\(\\);" Etc. That's due to: @@ -144,13 +144,11 @@ __attribute__((omp target entrypoint, noclone)) void main._omp_fn.0 (const struct .omp_data_t.3 & restrict .omp_data_i) { - int _3; int * _5; <bb 2> [local count: 1073741824]: - _3 = f30 (); _5 = *.omp_data_i_4(D).v; - *_5 = _3; + *_5 = 30; return; It's nice to see this optimization work here, too, but it does interfere with how we're currently testing OpenMP 'declare variant'. libgomp/ * testsuite/libgomp.c/declare-variant-3.h (f30, f35, f53, f70) (f75, f80, f): Add '__attribute__ ((noipa))'. * testsuite/libgomp.c/declare-variant-4.h (gfx803, gfx900, gfx906) (gfx908, gfx90a, f): Likewise.
2023-11-16Daily bump.GCC Administrator1-0/+6
2023-11-15amdgcn: Add Accelerator VGPR registersAndrew Stubbs1-2/+22
Add the new CDNA register file. We don't support any of the specialized instructions that use these registers, but they're useful to relieve register pressure without spilling to stack. Co-authored-by: Andrew Jenner <andrew@codesourcery.com> gcc/ChangeLog: * config/gcn/constraints.md: Add "a" AVGPR constraint. * config/gcn/gcn-valu.md (*mov<mode>): Add AVGPR alternatives. (*mov<mode>_4reg): Likewise. (@mov<mode>_sgprbase): Likewise. (gather<mode>_insn_1offset<exec>): Likewise. (gather<mode>_insn_1offset_ds<exec>): Likewise. (gather<mode>_insn_2offsets<exec>): Likewise. (scatter<mode>_expr<exec_scatter>): Likewise. (scatter<mode>_insn_1offset_ds<exec_scatter>): Likewise. (scatter<mode>_insn_2offsets<exec_scatter>): Likewise. * config/gcn/gcn.cc (MAX_NORMAL_AVGPR_COUNT): Define. (gcn_class_max_nregs): Handle AVGPR_REGS and ALL_VGPR_REGS. (gcn_hard_regno_mode_ok): Likewise. (gcn_regno_reg_class): Likewise. (gcn_spill_class): Allow spilling to AVGPRs on TARGET_CDNA1_PLUS. (gcn_sgpr_move_p): Handle AVGPRs. (gcn_secondary_reload): Reload AVGPRs via VGPRs. (gcn_conditional_register_usage): Handle AVGPRs. (gcn_vgpr_equivalent_register_operand): New function. (gcn_valid_move_p): Check for validity of AVGPR moves. (gcn_compute_frame_offsets): Handle AVGPRs. (gcn_memory_move_cost): Likewise. (gcn_register_move_cost): Likewise. (gcn_vmem_insn_p): Handle TYPE_VOP3P_MAI. (gcn_md_reorg): Handle AVGPRs. (gcn_hsa_declare_function_name): Likewise. (print_reg): Likewise. (gcn_dwarf_register_number): Likewise. * config/gcn/gcn.h (FIRST_AVGPR_REG): Define. (AVGPR_REGNO): Define. (LAST_AVGPR_REG): Define. (SOFT_ARG_REG): Update. (FRAME_POINTER_REGNUM): Update. (DWARF_LINK_REGISTER): Update. (FIRST_PSEUDO_REGISTER): Update. (AVGPR_REGNO_P): Define. (enum reg_class): Add AVGPR_REGS and ALL_VGPR_REGS. (REG_CLASS_CONTENTS): Add new register classes and add entries for AVGPRs to all classes. (REGISTER_NAMES): Add AVGPRs. * config/gcn/gcn.md (FIRST_AVGPR_REG, LAST_AVGPR_REG): Define. (AP_REGNUM, FP_REGNUM): Update. (define_attr "type"): Add vop3p_mai. (define_attr "unit"): Handle vop3p_mai. (define_attr "gcn_version"): Add "cdna2". (define_attr "enabled"): Handle cdna2. (*mov<mode>_insn): Add AVGPR alternatives. (*movti_insn): Likewise. * config/gcn/mkoffload.cc (isa_has_combined_avgprs): New. (process_asm): Process avgpr_count. * config/gcn/predicates.md (gcn_avgpr_register_operand): New. (gcn_avgpr_hard_register_operand): New. * doc/md.texi: Document the "a" constraint. gcc/testsuite/ChangeLog: * gcc.target/gcn/avgpr-mem-double.c: New test. * gcc.target/gcn/avgpr-mem-int.c: New test. * gcc.target/gcn/avgpr-mem-long.c: New test. * gcc.target/gcn/avgpr-mem-short.c: New test. * gcc.target/gcn/avgpr-spill-double.c: New test. * gcc.target/gcn/avgpr-spill-int.c: New test. * gcc.target/gcn/avgpr-spill-long.c: New test. * gcc.target/gcn/avgpr-spill-short.c: New test. libgomp/ChangeLog: * plugin/plugin-gcn.c (max_isa_vgprs): New. (run_kernel): CDNA2 devices have more VGPRs.
2023-11-14Daily bump.GCC Administrator1-0/+5
2023-11-10libgomp.texi: Update OpenMP 6.0-preview implementation-status listTobias Burnus1-15/+55
libgomp/ChangeLog: * libgomp.texi (OpenMP Impl. Status): Update for OpenMP TR12; renamed section from TR11.
2023-11-08Daily bump.GCC Administrator1-0/+46
2023-11-07Fix libgomp build on targets that are not Linux-based or acceleratorsKwok Cheung Yeung1-0/+0
The patch 'openmp: Add support for the 'indirect' clause in C/C++' introduced a new file target-indirect.c into the Makefile sources, but that file was only present in config/linux/ and config/accel/, so targets that are not Linux-based or GPU accelerators will not pick it up and fail to build. This is fixed by making the version in config/linux/ the default by moving it into the base directory of libgomp. 2023-11-07 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * config/linux/target-indirect.c: Move to... * target-indirect.c: ...here.
2023-11-07openmp: Add support for the 'indirect' clause in C/C++Kwok Cheung Yeung18-13/+445
This adds support for the 'indirect' clause in the 'declare target' directive. Functions declared as indirect may be called via function pointers passed from the host in offloaded code. Virtual calls to member functions via the object pointer in C++ are currently not supported in target regions. 2023-11-07 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/c-family/ * c-attribs.cc (c_common_attribute_table): Add attribute for indirect functions. * c-pragma.h (enum parma_omp_clause): Add entry for indirect clause. gcc/c/ * c-decl.cc (c_decl_attributes): Add attribute for indirect functions. * c-lang.h (c_omp_declare_target_attr): Add indirect field. * c-parser.cc (c_parser_omp_clause_name): Handle indirect clause. (c_parser_omp_clause_indirect): New. (c_parser_omp_all_clauses): Handle indirect clause. (OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (c_parser_omp_declare_target): Handle indirect clause. Emit error message if device_type or indirect clauses used alone. Emit error if indirect clause used with device_type that is not 'any'. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (c_parser_omp_begin): Handle indirect clause. * c-typeck.cc (c_finish_omp_clauses): Handle indirect clause. gcc/cp/ * cp-tree.h (cp_omp_declare_target_attr): Add indirect field. * decl2.cc (cplus_decl_attributes): Add attribute for indirect functions. * parser.cc (cp_parser_omp_clause_name): Handle indirect clause. (cp_parser_omp_clause_indirect): New. (cp_parser_omp_all_clauses): Handle indirect clause. (handle_omp_declare_target_clause): Add extra parameter. Add indirect attribute for indirect functions. (OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (cp_parser_omp_declare_target): Handle indirect clause. Emit error message if device_type or indirect clauses used alone. Emit error if indirect clause used with device_type that is not 'any'. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask. (cp_parser_omp_begin): Handle indirect clause. * semantics.cc (finish_omp_clauses): Handle indirect clause. gcc/ * lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect functions. (output_offload_tables): Write indirect functions. (input_offload_tables): read indirect functions. * lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New. * omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New. * omp-offload.cc (offload_ind_funcs): New. (omp_discover_implicit_declare_target): Add functions marked with 'omp declare target indirect' to indirect functions list. (omp_finish_file): Add indirect functions to section for offload indirect functions. (execute_omp_device_lower): Redirect indirect calls on target by passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR. (pass_omp_device_lower::gate): Run pass_omp_device_lower if indirect functions are present on an accelerator device. * omp-offload.h (offload_ind_funcs): New. * tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT. * tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT. (omp_clause_code_name): Likewise. * tree.h (OMP_CLAUSE_INDIRECT_EXPR): New. * config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs section. Count number of indirect functions. (process_obj): Emit number of indirect functions. * config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New. (process): Emit offload_ind_func_table in PTX code. Emit indirect function names and count in image. * config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark indirect functions in PTX code with IND_FUNC_MAP. gcc/testsuite/ * c-c++-common/gomp/declare-target-7.c: Update expected error message. * c-c++-common/gomp/declare-target-indirect-1.c: New. * c-c++-common/gomp/declare-target-indirect-2.c: New. * g++.dg/gomp/attrs-21.C (v12): Update expected error message. * g++.dg/gomp/declare-target-indirect-1.C: New. * gcc.dg/gomp/attrs-21.c (v12): Update expected error message. include/ * gomp-constants.h (GOMP_VERSION): Increment to 3. (GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New. libgcc/ * offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New. (__offload_ind_func_table): New. (__offload_ind_funcs_end): New. (__OFFLOAD_TABLE__): Add entries for indirect functions. libgomp/ * Makefile.am (libgomp_la_SOURCES): Add target-indirect.c. * Makefile.in: Regenerate. * libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define. (GOMP_OFFLOAD_load_image): Add extra argument. * libgomp.h (struct indirect_splay_tree_key_s): New. (indirect_splay_tree_node, indirect_splay_tree, indirect_splay_tree_key): New. (indirect_splay_compare): New. * libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr. * libgomp.texi (OpenMP 5.1): Update documentation on indirect calls in target region and on indirect clause. (Other new OpenMP 5.2 features): Add entry for virtual function calls. * libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype. * oacc-host.c (host_load_image): Add extra argument. * target.c (gomp_load_image_to_device): If the GOMP_VERSION is high enough, read host indirect functions table and pass to load_image_func. * config/accel/target-indirect.c: New. * config/linux/target-indirect.c: New. * config/gcn/team.c (build_indirect_map): Add prototype. (gomp_gcn_enter_kernel): Initialize support for indirect function calls on GCN target. * config/nvptx/team.c (build_indirect_map): Add prototype. (gomp_nvptx_main): Initialize support for indirect function calls on NVPTX target. * plugin/plugin-gcn.c (struct gcn_image_desc): Add field for indirect functions count. (GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION is high enough, build address translation table and copy it to target memory. * plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect functions count. (GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION is high enough, Build address translation table and copy it to target memory. * testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New. * testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New. * testsuite/libgomp.c++/declare-target-indirect-1.C: New.
2023-11-06Daily bump.GCC Administrator1-0/+5
2023-11-05openmp: Mention C attribute syntax in documentationJakub Jelinek1-1/+1
This patch mentions the C attribute syntax support in the libgomp documentation. 2023-11-05 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (Enabling OpenMP): Adjust wording for attribute syntax supported also in C.
2023-11-01Daily bump.GCC Administrator1-0/+5
2023-10-31Add OpenACC 'acc_map_data' variant to 'libgomp.oacc-c-c++-common/deep-copy-8.c'Thomas Schwinge1-2/+27
libgomp/ * testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: Add OpenACC 'acc_map_data' variant.
2023-10-26Daily bump.GCC Administrator1-0/+21
2023-10-25Handle OpenACC 'self' clause for compute constructs in OpenACC 'kernels' ↵Thomas Schwinge2-20/+12
decomposition ... to fix up recent commit 3a3596389c2e539cb8fd5dc5784a4e2afe193a2a "OpenACC 2.7: Implement self clause for compute constructs" for that case. gcc/ * omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1): Handle 'OMP_CLAUSE_SELF' like 'OMP_CLAUSE_IF'. * omp-expand.cc (expand_omp_target): Handle 'OMP_CLAUSE_SELF' for 'GF_OMP_TARGET_KIND_OACC_DATA_KERNELS'. gcc/testsuite/ * c-c++-common/goacc/self-clause-2.c: Verify '--param=openacc-kernels=decompose'. * gfortran.dg/goacc/kernels-tree.f95: Adjust. libgomp/ * oacc-parallel.c (GOACC_data_start): Handle 'GOACC_FLAG_LOCAL_DEVICE'. (GOACC_parallel_keyed): Simplify accordingly. * testsuite/libgomp.oacc-fortran/self-1.f90: Adjust.
2023-10-25Extend test suite coverage for OpenACC 'self' clause for compute constructsThomas Schwinge5-0/+1045
... on top of what was provided in recent commit 3a3596389c2e539cb8fd5dc5784a4e2afe193a2a "OpenACC 2.7: Implement self clause for compute constructs". gcc/testsuite/ * c-c++-common/goacc/if-clause-2.c: Enhance. * c-c++-common/goacc/self-clause-1.c: Likewise. * c-c++-common/goacc/self-clause-2.c: Likewise. * gfortran.dg/goacc/if.f95: Likewise. * gfortran.dg/goacc/kernels-tree.f95: Likewise. * gfortran.dg/goacc/parallel-tree.f95: Likewise. * gfortran.dg/goacc/self.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/if-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/self-1.c: Likewise. * testsuite/libgomp.oacc-fortran/if-1.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/if-self-1.c: New. * testsuite/libgomp.oacc-fortran/self-1.f90: Likewise.
2023-10-25OpenACC 2.7: Implement self clause for compute constructsChung-Lin Tang2-0/+973
This patch implements the 'self' clause for compute constructs: parallel, kernels, and serial. This clause conditionally uses the local device (the host mult-core CPU) as the executing device of the compute region. The actual implementation of the "local device" device type inside libgomp (presumably using pthreads) is still not yet completed, so the libgomp side is still implemented the exact same as host-fallback mode. (so as of now, it essentially behaves like the 'if' clause with the condition inverted) gcc/c/ChangeLog: * c-parser.cc (c_parser_oacc_compute_clause_self): New function. (c_parser_oacc_all_clauses): Add new 'bool compute_p = false' parameter, add parsing of self clause when compute_p is true. (OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF. (OACC_PARALLEL_CLAUSE_MASK): Likewise, (OACC_SERIAL_CLAUSE_MASK): Likewise. (c_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to set compute_p argument to true. * c-typeck.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case. gcc/cp/ChangeLog: * parser.cc (cp_parser_oacc_compute_clause_self): New function. (cp_parser_oacc_all_clauses): Add new 'bool compute_p = false' parameter, add parsing of self clause when compute_p is true. (OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF. (OACC_PARALLEL_CLAUSE_MASK): Likewise, (OACC_SERIAL_CLAUSE_MASK): Likewise. (cp_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to set compute_p argument to true. * pt.cc (tsubst_omp_clauses): Add OMP_CLAUSE_SELF case. * semantics.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case, merged with OMP_CLAUSE_IF case. gcc/fortran/ChangeLog: * gfortran.h (typedef struct gfc_omp_clauses): Add self_expr field. * openmp.cc (enum omp_mask2): Add OMP_CLAUSE_SELF. (gfc_match_omp_clauses): Add handling for OMP_CLAUSE_SELF. (OACC_PARALLEL_CLAUSES): Add OMP_CLAUSE_SELF. (OACC_KERNELS_CLAUSES): Likewise. (OACC_SERIAL_CLAUSES): Likewise. (resolve_omp_clauses): Add handling for omp_clauses->self_expr. * trans-openmp.cc (gfc_trans_omp_clauses): Add handling of clauses->self_expr and building of OMP_CLAUSE_SELF tree clause. (gfc_split_omp_clauses): Add handling of self_expr field copy. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Add OMP_CLAUSE_SELF case. (gimplify_adjust_omp_clauses): Likewise. * omp-expand.cc (expand_omp_target): Add OMP_CLAUSE_SELF expansion code, * omp-low.cc (scan_sharing_clauses): Add OMP_CLAUSE_SELF case. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_SELF enum. * tree-nested.cc (convert_nonlocal_omp_clauses): Add OMP_CLAUSE_SELF case. (convert_local_omp_clauses): Likewise. * tree-pretty-print.cc (dump_omp_clause): Add OMP_CLAUSE_SELF case. * tree.cc (omp_clause_num_ops): Add OMP_CLAUSE_SELF entry. (omp_clause_code_name): Likewise. * tree.h (OMP_CLAUSE_SELF_EXPR): New macro. gcc/testsuite/ChangeLog: * c-c++-common/goacc/self-clause-1.c: New test. * c-c++-common/goacc/self-clause-2.c: New test. * gfortran.dg/goacc/self.f95: New test. include/ChangeLog: * gomp-constants.h (GOACC_FLAG_LOCAL_DEVICE): New flag bit value. libgomp/ChangeLog: * oacc-parallel.c (GOACC_parallel_keyed): Add code to handle GOACC_FLAG_LOCAL_DEVICE case. * testsuite/libgomp.oacc-c-c++-common/self-1.c: New test.
2023-10-23Daily bump.GCC Administrator1-0/+7
2023-10-22Config,Darwin: Allow for configuring Darwin to use embedded runpath.Iain Sandoe4-9/+130
Recent Darwin versions place contraints on the use of run paths specified in environment variables. This breaks some assumptions in the GCC build. This change allows the user to configure a Darwin build to use '@rpath/libraryname.dylib' in library names and then to add an embedded runpath to executables (and libraries with dependents). The embedded runpath is added by default unless the user adds '-nodefaultrpaths' to the link line. For an installed compiler, it means that any executable built with that compiler will reference the runtimes installed with the compiler (equivalent to hard-coding the library path into the name of the library). During build-time configurations any "-B" entries will be added to the runpath thus the newly-built libraries will be found by exes. Since the install name is set in libtool, that decision needs to be available here (but might also cause dependent ones in Makefiles, so we need to export a conditional). This facility is not available for Darwin 8 or earlier, however the existing environment variable runpath does work there. We default this on for systems where the external DYLD_LIBRARY_PATH does not work and off for Darwin 8 or earlier. For systems that can use either method, if the value is unset, we use the default (which is currently DYLD_LIBRARY_PATH). ChangeLog: * configure: Regenerate. * configure.ac: Do not add default runpaths to GCC exes when we are building -static-libstdc++/-static-libgcc (the default). * libtool.m4: Add 'enable-darwin-at-runpath'. Act on the enable flag to alter Darwin libraries to use @rpath names. gcc/ChangeLog: * aclocal.m4: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. * config/darwin.h: Handle Darwin rpaths. * config/darwin.opt: Handle Darwin rpaths. * Makefile.in: Handle Darwin rpaths. gcc/ada/ChangeLog: * gcc-interface/Makefile.in: Handle Darwin rpaths. gcc/jit/ChangeLog: * Make-lang.in: Handle Darwin rpaths. libatomic/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libbacktrace/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libcc1/ChangeLog: * configure: Regenerate. libffi/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. libgcc/ChangeLog: * config/t-slibgcc-darwin: Generate libgcc_s with an @rpath name. * config.host: Handle Darwin rpaths. libgfortran/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths libgm2/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * aclocal.m4: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. * libm2cor/Makefile.am: Handle Darwin rpaths. * libm2cor/Makefile.in: Regenerate. * libm2iso/Makefile.am: Handle Darwin rpaths. * libm2iso/Makefile.in: Regenerate. * libm2log/Makefile.am: Handle Darwin rpaths. * libm2log/Makefile.in: Regenerate. * libm2min/Makefile.am: Handle Darwin rpaths. * libm2min/Makefile.in: Regenerate. * libm2pim/Makefile.am: Handle Darwin rpaths. * libm2pim/Makefile.in: Regenerate. libgomp/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths libitm/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libobjc/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libphobos/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. * libdruntime/Makefile.am: Handle Darwin rpaths. * libdruntime/Makefile.in: Regenerate. * src/Makefile.am: Handle Darwin rpaths. * src/Makefile.in: Regenerate. libquadmath/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libsanitizer/ChangeLog: * asan/Makefile.am: Handle Darwin rpaths. * asan/Makefile.in: Regenerate. * configure: Regenerate. * hwasan/Makefile.am: Handle Darwin rpaths. * hwasan/Makefile.in: Regenerate. * lsan/Makefile.am: Handle Darwin rpaths. * lsan/Makefile.in: Regenerate. * tsan/Makefile.am: Handle Darwin rpaths. * tsan/Makefile.in: Regenerate. * ubsan/Makefile.am: Handle Darwin rpaths. * ubsan/Makefile.in: Regenerate. libssp/ChangeLog: * Makefile.am: Handle Darwin rpaths. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Handle Darwin rpaths. libstdc++-v3/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. * src/Makefile.am: Handle Darwin rpaths. * src/Makefile.in: Regenerate. libvtv/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. lto-plugin/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths. zlib/ChangeLog: * configure: Regenerate. * configure.ac: Handle Darwin rpaths.
2023-10-21Daily bump.GCC Administrator1-0/+12
2023-10-20amdgcn: add -march=gfx1030 EXPERIMENTALAndrew Stubbs2-3/+9
Accept the architecture configure option and resolve build failures. This is enough to build binaries, but I've not got a device to test it on, so there are probably runtime issues to fix. The cache control instructions might be unsafe (or too conservative), and the kernel metadata might be off. Vector reductions will need to be reworked for RDNA2. In principle, it would be better to use wavefrontsize32 for this architecture, but that would mean switching everything to allow SImode masks, so wavefrontsize64 it is. The multilib is not included in the default configuration so either configure --with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,.... The majority of this patch has no effect on other devices, but changing from using scalar writes for the exit value to vector writes means we don't need the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2). gcc/ChangeLog: * config.gcc: Allow --with-arch=gfx1030. * config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack. (ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set. * config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030. (TARGET_GFX1030): New. (TARGET_RDNA2): New. * config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2. (addc<mode>3<exec_vcc>): Add RDNA2 syntax variant. (subc<mode>3<exec_vcc>): Likewise. (<convop><mode><vndi>2_exec): Add RDNA2 alternatives. (vec_cmp<mode>di): Likewise. (vec_cmp<u><mode>di): Likewise. (vec_cmp<mode>di_exec): Likewise. (vec_cmp<u><mode>di_exec): Likewise. (vec_cmp<mode>di_dup): Likewise. (vec_cmp<mode>di_dup_exec): Likewise. (reduc_<reduc_op>_scal_<mode>): Disable for RDNA2. (*<reduc_op>_dpp_shr_<mode>): Likewise. (*plus_carry_dpp_shr_<mode>): Likewise. (*plus_carry_in_dpp_shr_<mode>): Likewise. * config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030. (gcn_global_address_p): RDNA2 only allows smaller offsets. (gcn_addr_space_legitimate_address_p): Likewise. (gcn_omp_device_kind_arch_isa): Recognise gfx1030. (gcn_expand_epilogue): Use VGPRs instead of SGPRs. (output_file_start): Configure gfx1030. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__; (ASSEMBLER_DIALECT): New. * config/gcn/gcn.md (rdna): New define_attr. (enabled): Use "rdna" attribute. (gcn_return): Remove s_dcache_wb. (addcsi3_scalar): Add RDNA2 syntax variant. (addcsi3_scalar_zero): Likewise. (addptrdi3): Likewise. (mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA. (*memory_barrier): Add RDNA2 syntax variant. (atomic_load<mode>): Add RDNA2 cache control variants, and disable scalar atomics for RDNA2. (atomic_store<mode>): Likewise. (atomic_exchange<mode>): Likewise. * config/gcn/gcn.opt (gpu_type): Add gfx1030. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (main): Recognise -march=gfx1030. * config/gcn/t-omp-device: Add gfx1030 isa. libgcc/ChangeLog: * config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (isa_hsa_name): Recognise gfx1030. (isa_code): Likewise. * team.c (defined): Remove s_endpgm.
2023-10-20omp_lib.f90.in: Deprecate omp_lock_hint_* for OpenMP 5.0Tobias Burnus1-0/+4
The omp_lock_hint_* parameters were deprecated in favor of omp_sync_hint_*. While omp.h contained deprecation markers for those, the omp_lib module only contained them for omp_{g,s}_nested. Note: The -Wdeprecated-declarations warning will only become active once openmp_version / _OPENMP is bumped from 201511 (4.5) to 201811 (5.0). libgomp/ChangeLog: * omp_lib.f90.in: Tag omp_lock_hint_* as being deprecated when _OPENMP >= 201811.
2023-10-16Daily bump.GCC Administrator1-0/+17
2023-10-15libgomp.texi: Update "Enabling OpenMP" + OpenACC / invoke.texi: ↵Tobias Burnus1-13/+18
-fopenacc/-fopenmp update The OpenACC specification does not mention the '!$ ' sentinel for conditional compilation and the feature was removed in r11-5572-g1d6f6ac693a860 for PR fortran/98011; update libgomp.texi for this and update a leftover comment. - Additionally, some other updates are done as well. libgomp/ * libgomp.texi (Enabling OpenMP): Update for C/C++ attributes; improve wording especially for Fortran; mention -fopenmp-simd. (Enabling OpenACC): Minor cleanup; remove conditional compilation sentinel. gcc/ * doc/invoke.texi (-fopenacc, -fopenmp, -fopenmp-simd): Use @samp not @code; document more completely the supported Fortran sentinels. gcc/fortran * scanner.cc (skip_free_comments, skip_fixed_comments): Remove leftover 'OpenACC' from comments about OpenMP's conditional compilation sentinel.
2023-10-15libgomp.texi: Improve "OpenACC Environment Variables"Tobias Burnus1-11/+20
None of the ACC_* env vars was documented; in particular, the valid valids for ACC_DEVICE_TYPE found to be lacking as those are not document in the OpenACC spec. GCC_ACC_NOTIFY was removed as I failed to find any traces of it but the addition to the documentation in commit r6-6185-gcdf6119dad04dd ("libgomp.texi: Updates for OpenACC."). It seems to be planned as GCC version of the ACC_NOTIFY env var used by another compiler for offloading debugging. libgomp/ * libgomp.texi (ACC_DEVICE_TYPE, ACC_DEVICE_NUM, ACC_PROFLIB): Actually document what the function does. (GCC_ACC_NOTIFY): Remove unused env var.
2023-10-15libgomp.texi: Use present not future tenseTobias Burnus1-62/+62
libgomp/ChangeLog: * libgomp.texi: Replace most future tense by present tense.
2023-10-15Daily bump.GCC Administrator1-0/+20
2023-10-14libgomp.fortran/allocate-6.f90: Run with -fdump-tree-gimpleTobias Burnus1-2/+3
libgomp/ * testsuite/libgomp.fortran/allocate-6.f90: Add missing dg-additional-options "-fdump-tree-gimple"; fix scan.
2023-10-14libgomp.texi: Note to 'Memory allocation' sect and missing mem-memory routinesTobias Burnus1-15/+367
This commit completes the documentation of the OpenMP memory-management routines, except for the unimplemented TR11 additions. It also makes clear in the 'Memory allocation' section of the 'OpenMP-Implementation Specifics' chapter under which condition OpenMP managed memory/allocators are used. libgomp/ChangeLog: * libgomp.texi: Fix some typos. (Memory Management Routines): Document remaining 5.x routines. (Memory allocation): Make clear when the section applies.
2023-10-14Fortran: Support OpenMP's 'allocate' directive for stack varsTobias Burnus5-2/+653
gcc/fortran/ChangeLog: * gfortran.h (ext_attr_t): Add omp_allocate flag. * match.cc (gfc_free_omp_namelist): Void deleting same u2.allocator multiple times now that a sequence can use the same one. * openmp.cc (gfc_match_omp_clauses, gfc_match_omp_allocate): Use same allocator expr multiple times. (is_predefined_allocator): Make static. (gfc_resolve_omp_allocate): Update/extend restriction checks; remove sorry message. (resolve_omp_clauses): Reject corarrays in allocate/allocators directive. * parse.cc (check_omp_allocate_stmt): Permit procedure pointers here (rejected later) for less misleading diagnostic. * trans-array.cc (gfc_trans_auto_array_allocation): Propagate size for GOMP_alloc and location to which it should be added to. * trans-decl.cc (gfc_trans_deferred_vars): Handle 'omp allocate' for stack variables; sorry for static variables/common blocks. * trans-openmp.cc (gfc_trans_omp_clauses): Evaluate 'allocate' clause's allocator only once; fix adding expressions to the block. (gfc_trans_omp_single): Pass a block to gfc_trans_omp_clauses. gcc/ChangeLog: * gimplify.cc (gimplify_bind_expr): Handle Fortran's 'omp allocate' for stack variables. libgomp/ChangeLog: * libgomp.texi (OpenMP Impl. Status): Mention that Fortran now supports the allocate directive for stack variables. * testsuite/libgomp.fortran/allocate-5.f90: New test. * testsuite/libgomp.fortran/allocate-6.f90: New test. * testsuite/libgomp.fortran/allocate-7.f90: New test. * testsuite/libgomp.fortran/allocate-8.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/allocate-14.c: Fix directive name. * c-c++-common/gomp/allocate-15.c: Likewise. * c-c++-common/gomp/allocate-9.c: Fix comment typo. * gfortran.dg/gomp/allocate-4.f90: Remove sorry dg-error. * gfortran.dg/gomp/allocate-7.f90: Likewise. * gfortran.dg/gomp/allocate-10.f90: New test. * gfortran.dg/gomp/allocate-11.f90: New test. * gfortran.dg/gomp/allocate-12.f90: New test. * gfortran.dg/gomp/allocate-13.f90: New test. * gfortran.dg/gomp/allocate-14.f90: New test. * gfortran.dg/gomp/allocate-15.f90: New test. * gfortran.dg/gomp/allocate-8.f90: New test. * gfortran.dg/gomp/allocate-9.f90: New test.
2023-10-13Daily bump.GCC Administrator1-0/+12