Age | Commit message | Author | Files | Lines
2023-03-10 | libgomp: Merge 'gomp_map_vars_openacc' into 'goacc_map_vars' [PR76739] | Thomas Schwinge | 5 | -28/+27
Upstream has 'goacc_map_vars'; merge the new 'gomp_map_vars_openacc' into it. (Maybe the latter didn't exist yet when the former was originally added?) No functional change. Clean-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". PR other/76739 libgomp/ * libgomp.h (goacc_map_vars): Add 'struct goacc_ncarray_info *' formal parameter. (gomp_map_vars_openacc): Remove. * target.c (goacc_map_vars): Adjust. (gomp_map_vars_openacc): Remove. * oacc-mem.c (acc_map_data, goacc_enter_datum) (goacc_enter_data_internal): Adjust. * oacc-parallel.c (GOACC_parallel_keyed, GOACC_data_start): Adjust.
2023-03-10 | Revert "OpenACC profiling-interface fixes for asynchronous operations" | Thomas Schwinge | 6 | -194/+113
There are occasional execution failures; these changes need to be reviewed. This reverts og12 commit 719f93c8618a134f90b5b661ab70c918d659ad05. libgomp/ * oacc-host.c: Revert "OpenACC profiling-interface fixes for asynchronous operations" changes. * oacc-mem.c: Likewise. * oacc-parallel.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise.
2023-03-10 | Revert "Revert changes to acc_prof-init-1.c and acc_prof-parallel-1.c" | Thomas Schwinge | 3 | -0/+45
... as a prerequisite for reverting "OpenACC profiling-interface fixes for asynchronous operations". This reverts og12 commit b845d2f62e7da1c4cfdfee99690de94b648d076d. libgomp/ * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Revert "Revert changes to acc_prof-init-1.c and acc_prof-parallel-1.c" changes. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise.
2023-03-06 | amdgcn: Add instruction patterns for conditional min/max operations | Paul-Antoine Arras | 52 | -2/+1448
gcc/ChangeLog: * config/gcn/gcn-valu.md (<expander><mode>3_exec): Add patterns for {s|u}{max|min} in QI, HI and DI modes. (<expander><mode>3): Add pattern for {s|u}{max|min} in DI mode. (cond_<fexpander><mode>): Add pattern for cond_f{max|min}. (cond_<expander><mode>): Add pattern for cond_{s|u}{max|min}. * config/gcn/gcn.cc (gcn_spill_class): Allow the exec register to be saved in SGPRs. gcc/testsuite/ChangeLog: * gcc.target/gcn/cond_fmaxnm_1.c: New test. * gcc.target/gcn/cond_fmaxnm_1_run.c: New test. * gcc.target/gcn/cond_fmaxnm_2.c: New test. * gcc.target/gcn/cond_fmaxnm_2_run.c: New test. * gcc.target/gcn/cond_fmaxnm_3.c: New test. * gcc.target/gcn/cond_fmaxnm_3_run.c: New test. * gcc.target/gcn/cond_fmaxnm_4.c: New test. * gcc.target/gcn/cond_fmaxnm_4_run.c: New test. * gcc.target/gcn/cond_fmaxnm_5.c: New test. * gcc.target/gcn/cond_fmaxnm_5_run.c: New test. * gcc.target/gcn/cond_fmaxnm_6.c: New test. * gcc.target/gcn/cond_fmaxnm_6_run.c: New test. * gcc.target/gcn/cond_fmaxnm_7.c: New test. * gcc.target/gcn/cond_fmaxnm_7_run.c: New test. * gcc.target/gcn/cond_fmaxnm_8.c: New test. * gcc.target/gcn/cond_fmaxnm_8_run.c: New test. * gcc.target/gcn/cond_fminnm_1.c: New test. * gcc.target/gcn/cond_fminnm_1_run.c: New test. * gcc.target/gcn/cond_fminnm_2.c: New test. * gcc.target/gcn/cond_fminnm_2_run.c: New test. * gcc.target/gcn/cond_fminnm_3.c: New test. * gcc.target/gcn/cond_fminnm_3_run.c: New test. * gcc.target/gcn/cond_fminnm_4.c: New test. * gcc.target/gcn/cond_fminnm_4_run.c: New test. * gcc.target/gcn/cond_fminnm_5.c: New test. * gcc.target/gcn/cond_fminnm_5_run.c: New test. * gcc.target/gcn/cond_fminnm_6.c: New test. * gcc.target/gcn/cond_fminnm_6_run.c: New test. * gcc.target/gcn/cond_fminnm_7.c: New test. * gcc.target/gcn/cond_fminnm_7_run.c: New test. * gcc.target/gcn/cond_fminnm_8.c: New test. * gcc.target/gcn/cond_fminnm_8_run.c: New test. * gcc.target/gcn/cond_smax_1.c: New test. * gcc.target/gcn/cond_smax_1_run.c: New test. 
* gcc.target/gcn/cond_smin_1.c: New test. * gcc.target/gcn/cond_smin_1_run.c: New test. * gcc.target/gcn/cond_umax_1.c: New test. * gcc.target/gcn/cond_umax_1_run.c: New test. * gcc.target/gcn/cond_umin_1.c: New test. * gcc.target/gcn/cond_umin_1_run.c: New test. * gcc.target/gcn/smax_1.c: New test. * gcc.target/gcn/smax_1_run.c: New test. * gcc.target/gcn/smin_1.c: New test. * gcc.target/gcn/smin_1_run.c: New test. * gcc.target/gcn/umax_1.c: New test. * gcc.target/gcn/umax_1_run.c: New test. * gcc.target/gcn/umin_1.c: New test. * gcc.target/gcn/umin_1_run.c: New test. (cherry picked from commit 553ff2524f412be4e02e2ffb1a0a3dc3e2280742)
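For orientation, conditional min/max patterns of this kind are aimed at loops of roughly the following shape, where the max is applied only in lanes selected by a predicate. This is a hedged sketch in C++, not a copy of the new gcn tests; the function name `cond_smax` and its signature are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: apply a signed max only where pred[i] is
// nonzero; other elements pass a[i] through unchanged. A vectorizer may
// turn the guarded assignment into a masked (conditional) max instruction
// on targets that provide one.
void cond_smax(int* out, const int* a, const int* b,
               const unsigned char* pred, std::size_t n) {
  assert(n == 0 || (out && a && b && pred));
  for (std::size_t i = 0; i < n; ++i) {
    int m = a[i] > b[i] ? a[i] : b[i];  // signed max of the two inputs
    out[i] = pred[i] ? m : a[i];        // inactive lanes keep a[i]
  }
}
```

Whether a given compiler actually emits a conditional max for this loop depends on the target and optimization flags; the sketch only shows the source-level shape.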
2023-03-02 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 3 | -1/+41
Merge up to r12-9210-gb3f9d2cf7dd5488800f867a6aae076465ecb391b (2nd Mar 2023)
2023-03-02 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-03-01 | OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546] | Tobias Burnus | 7 | -2/+132
For is_device_ptr, optional checks should only be done before calling libgomp; afterwards, the argument is NULL either because it is absent or, by chance, because it is unallocated or unassociated (for pointers/allocatables). Additionally, this fixes an issue with explicit mapping for 'type(c_ptr)'. PR middle-end/108546 gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of type(C_ptr) variables. gcc/ChangeLog: * omp-low.cc (lower_omp_target): Remove optional handling on the receiver side, i.e. inside target (data), for use_device_ptr. libgomp/ChangeLog: * testsuite/libgomp.fortran/is_device_ptr-3.f90: New test. * testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test. (cherry picked from commit 96ff97ff6574666a5509ae9fa596e7f2b6ad4f88)
2023-03-01 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-28 | Daily bump. | GCC Administrator | 3 | -1/+41
2023-02-27 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 38 | -59/+288
Merge up to r12-9207-gb8e496d132ec087c9db5951fea23551dcc831d8c (27th Feb 2023)
2023-02-27 | Update dg-dump-scan for "Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings" | Tobias Burnus | 3 | -4/+12
Follow-up to commit 55a18d4744258e3909568e425f9f473c49f9d13f "Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings" updating the dumps. * For the goacc testcase, 'to' changed to 'release' and due to 'finally' then to 'delete', which can be regarded as bugfix. * For pr78260-2.f90, the calculation moved inside the 'if(...->data == NULL)' block to handle deferred-string length vars better, esp. when 'optional'. gcc/testsuite/: * gfortran.dg/goacc/finalize-1.f: Update scan-tree-dump-times for mapping changes. * gfortran.dg/gomp/pr78260-2.f90: Likewise.
2023-02-27 | asan: adjust module name for global variables | Martin Liska | 2 | -2/+12
As mentioned in the PR, when we use LTO, we wrongly use ltrans output file name as a module name of a global variable. That leads to a non-reproducible output. After the suggested change, we emit context name of normal global variables. And for artificial variables (like .Lubsan_data3), we use aux_base_name (e.g. "./a.ltrans0.ltrans"). PR sanitizer/108834 gcc/ChangeLog: * asan.cc (asan_add_global): Use proper TU name for normal global variables (and aux_base_name for the artificial one). gcc/testsuite/ChangeLog: * c-c++-common/asan/global-overflow-1.c: Test line and column info for a global variable. (cherry picked from commit 94c9b1bb79f63d000ebb05efc155c149325e332d)
2023-02-26 | rs6000/test: Adjust some test cases on partial vector [PR96373] | Kewen Lin | 15 | -14/+45
As Richard pointed out in [1] and the testing on Power10, the proposed fix for PR96373 requires some updates on a few rs6000 test cases which adopt partial vector. This patch is to fix all of them with one extra option "-fno-trapping-math" as Richard suggested. Besides, the original test case also failed on Power10 without Richard's proposed fix, this patch adds it together for a bit better testing coverage. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610728.html PR target/96373 gcc/testsuite/ChangeLog: * gcc.target/powerpc/p9-vec-length-epil-1.c: Add -fno-trapping-math. * gcc.target/powerpc/p9-vec-length-epil-2.c: Likewise. * gcc.target/powerpc/p9-vec-length-epil-3.c: Likewise. * gcc.target/powerpc/p9-vec-length-epil-4.c: Likewise. * gcc.target/powerpc/p9-vec-length-epil-5.c: Likewise. * gcc.target/powerpc/p9-vec-length-epil-6.c: Likewise. * gcc.target/powerpc/p9-vec-length-epil-8.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-1.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-2.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-3.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-4.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-5.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-6.c: Likewise. * gcc.target/powerpc/p9-vec-length-full-8.c: Likewise. * gcc.target/powerpc/pr96373.c: New test. (cherry picked from commit 4f5a1198065dc078f8099db628da7b06a2666f34)
2023-02-27 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-26 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-25 | Daily bump. | GCC Administrator | 2 | -1/+9
2023-02-24 | RTEMS: Tune multilib selection | Sebastian Huber | 1 | -8/+9
gcc/ChangeLog: * config/riscv/t-rtems: Keep only -mcmodel=medany 64-bit multilibs. Add non-compact 32-bit multilibs. (cherry picked from commit 35a067020e41d97bc3be15b518b3dc2a64b4aae2)
2023-02-24 | Daily bump. | GCC Administrator | 2 | -1/+37
2023-02-23 | tree-optimization/108888 - call if-conversion | Richard Biener | 4 | -7/+41
The following makes sure to predicate only the calls that need it. PR tree-optimization/108888 * tree-if-conv.cc (if_convertible_stmt_p): Set PLF_2 on calls to predicate. (predicate_statements): Only predicate calls with PLF_2. * g++.dg/torture/pr108888.C: New testcase. (cherry picked from commit 31cc5821223a096ef61743bff520f4a0dbba5872)
2023-02-23 | vect: inbranch SIMD clones | Andrew Stubbs | 27 | -32/+721
There has been support for generating "inbranch" SIMD clones for a long time, but nothing actually uses them (as far as I can see). This patch adds support for a subset of possible cases (those using mask_mode == VOIDmode). The other cases fail to vectorize, just as before, so there should be no regressions. The subset of support should cover all cases needed by amdgcn, at present. gcc/ChangeLog: * internal-fn.cc (expand_MASK_CALL): New. * internal-fn.def (MASK_CALL): New. * internal-fn.h (expand_MASK_CALL): New prototype. * omp-simd-clone.cc (simd_clone_adjust_argument_types): Set vector_type for mask arguments also. * tree-if-conv.cc: Include cgraph.h. (if_convertible_stmt_p): Do if conversions for calls to SIMD calls. (predicate_statements): Convert functions to IFN_MASK_CALL. * tree-vect-loop.cc (vect_get_datarefs_in_loop): Recognise IFN_MASK_CALL as a SIMD function call. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle IFN_MASK_CALL as an inbranch SIMD function call. Generate the mask vector arguments. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16.c: New test. * gcc.dg/vect/vect-simd-clone-16b.c: New test. * gcc.dg/vect/vect-simd-clone-16c.c: New test. * gcc.dg/vect/vect-simd-clone-16d.c: New test. * gcc.dg/vect/vect-simd-clone-16e.c: New test. * gcc.dg/vect/vect-simd-clone-16f.c: New test. * gcc.dg/vect/vect-simd-clone-17.c: New test. * gcc.dg/vect/vect-simd-clone-17b.c: New test. * gcc.dg/vect/vect-simd-clone-17c.c: New test. * gcc.dg/vect/vect-simd-clone-17d.c: New test. * gcc.dg/vect/vect-simd-clone-17e.c: New test. * gcc.dg/vect/vect-simd-clone-17f.c: New test. * gcc.dg/vect/vect-simd-clone-18.c: New test. * gcc.dg/vect/vect-simd-clone-18b.c: New test. * gcc.dg/vect/vect-simd-clone-18c.c: New test. * gcc.dg/vect/vect-simd-clone-18d.c: New test. * gcc.dg/vect/vect-simd-clone-18e.c: New test. * gcc.dg/vect/vect-simd-clone-18f.c: New test. (cherry picked from commit 3da77f217c8b2089ecba3eb201e727c3fcdcd19d)
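As a rough illustration of what an "inbranch" SIMD clone is for, the sketch below declares a function with the OpenMP `declare simd inbranch` directive and calls it under a condition inside a SIMD loop. The names `scale_if_positive` and `process` are hypothetical and not taken from the new tests; whether the vectorizer actually uses the masked clone depends on the target and compiler options.

```cpp
#include <cassert>
#include <cstddef>

// 'inbranch' states that the function is called conditionally from SIMD
// loops, so its vector clone takes an extra mask argument. When OpenMP is
// not enabled (no -fopenmp or -fopenmp-simd), the pragmas are ignored and
// the code runs as ordinary scalar C++.
#pragma omp declare simd inbranch
int scale_if_positive(int x) { return x * 2; }

void process(int* data, std::size_t n) {
  assert(n == 0 || data != nullptr);
#pragma omp simd
  for (std::size_t i = 0; i < n; ++i)
    if (data[i] > 0)  // the call sits inside a branch
      data[i] = scale_if_positive(data[i]);
}
```

Elements that fail the condition are left untouched, which is the behaviour a masked call has to preserve lane by lane.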
2023-02-23 | libstdc++: Simplify three helper functions into one | Matthias Kretz | 1 | -5/+6
Broadcast is a very common function. This should reduce compile-time effort. Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/108030 * include/experimental/bits/simd.h (__vector_broadcast): Implement via __vector_broadcast_impl instead of __call_with_n_evaluations + 2 lambdas. (__vector_broadcast_impl): New. (cherry picked from commit 2e29e2fbeb8936e5c85cefaf547cba42e17e137b)
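The general technique of replacing a callback-per-element helper with a single pack expansion can be sketched as follows. This is an assumption-laden illustration, not the libstdc++ code: `broadcast` and `broadcast_impl` are hypothetical names, and `std::array` stands in for the vector type.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a vector broadcast: build N copies of x.
template <typename T, std::size_t... Is>
constexpr std::array<T, sizeof...(Is)>
broadcast_impl(T x, std::index_sequence<Is...>) {
  // One initializer per index; the index value itself is discarded.
  return {{((void)Is, x)...}};
}

template <std::size_t N, typename T>
constexpr std::array<T, N> broadcast(T x) {
  return broadcast_impl(x, std::make_index_sequence<N>{});
}
```

A call such as `broadcast<4>(7)` instantiates one helper and no lambdas, which is the kind of saving in compile-time effort the commit message refers to.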
2023-02-23 | libstdc++: Fix -Wsign-compare issue | Matthias Kretz | 1 | -1/+1
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd_builtin.h (_S_set): Compare as int. The actual range of these indexes is very small. (cherry picked from commit ffa39f7120f6e83a567d7a83ff4437f6b41036ea)
2023-02-23 | libstdc++: Add missing constexpr on simd shift implementation | Matthias Kretz | 1 | -4/+4
Resolves -Wtautological-compare warnings about `if (__builtin_is_constant_evaluated())` in the implementations of these functions. Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd_x86.h (_S_bit_shift_left) (_S_bit_shift_right): Declare constexpr. The implementation was already expecting constexpr evaluation. (cherry picked from commit fa37ac2b59ed1c379b35dbf9bd58f7849f9fd5b5)
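A minimal sketch of the pattern involved, assuming a compiler that provides `__builtin_is_constant_evaluated` (GCC 9+ or Clang 9+): the check only makes sense inside a function that is itself declared constexpr, since otherwise it presumably can never be true. The function `shift_left` below is hypothetical and is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <limits>

// Hypothetical example of a constexpr function with separate paths for
// constant evaluation and run time.
constexpr unsigned shift_left(unsigned x, unsigned n) {
  // Precondition: shifting by the full width or more is undefined.
  assert(n < static_cast<unsigned>(std::numeric_limits<unsigned>::digits));
  if (__builtin_is_constant_evaluated()) {
    // Path taken during constant evaluation: plain arithmetic only.
    unsigned r = x;
    for (unsigned i = 0; i < n; ++i)
      r *= 2u;
    return r;
  }
  // Run-time path; a real library might call a target intrinsic here.
  return x << n;
}
```

Note that the check uses a plain `if`, not `if constexpr`; inside the condition of `if constexpr` the builtin is always evaluated as a constant expression and so cannot select the run-time path.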
2023-02-23 | libgomp: no need to attach USM pointers | Andrew Stubbs | 3 | -0/+43
Fix a bug in which Fortran pointers inside derived types caused a runtime error when Unified Shared Memory was active. libgomp/ChangeLog: * target.c (gomp_attach_pointer): Check for USM. * testsuite/libgomp.fortran/usm-3.f90: New test.
2023-02-23 | libstdc++: Fix uses of non-reserved names in simd header | Matthias Kretz | 1 | -11/+11
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h (__extract_part, split): Use reserved name for template parameter. (cherry picked from commit bb920f561e983c64d146f173dc4ebc098441a962)
2023-02-23 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-22 | Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings | Tobias Burnus | 11 | -125/+1808
Previously, array descriptors might have been mapped as 'alloc' instead of 'to' for 'alloc', not updating the array bounds. The 'alloc' could also appear for 'data exit', failing with a libgomp assert. In some cases, either array descriptors or deferred-length string's length variable was not mapped. And, finally, some offset calculations with array-sections mappings went wrong. The testcases contain some commented-out tests which require follow-up work and for which PRs exist. Those mostly relate to deferred-length strings which have several issues beyond OpenMP support. This is the OG12 variant of the submitted but unreviewed GCC 13/mainline patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612387.html gcc/fortran/ChangeLog: * trans-decl.cc (gfc_get_symbol_decl): Add attributes such as 'declare target' also to hidden artificial variable for deferred-length character variables. * trans-openmp.cc (gfc_trans_omp_array_section, gfc_trans_omp_clauses, gfc_trans_omp_target_exit_data): Improve mapping of array descriptors and deferred-length string variables. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Remove Fortran special case. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-enter-data-3.f90: Uncomment 'target exit data'. * testsuite/libgomp.fortran/target-enter-data-4.f90: New test. * testsuite/libgomp.fortran/target-enter-data-5.f90: New test. * testsuite/libgomp.fortran/target-enter-data-6.f90: New test. * testsuite/libgomp.fortran/target-enter-data-7.f90: New test.
2023-02-22 | Fix: Fortran/OpenMP: align/allocator modifiers to the allocate clause | Tobias Burnus | 2 | -8/+13
When merging r13-4584-gb2e1c49b4a4 to OG12 as commit 58e0579ed87, the 'align' handling seemingly ended up in the wrong clause. (Result: libgomp.fortran/allocate-2a.f90 FAILED; now fixed.) gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Move align modifier handling from OMP_LIST_ALLOCATOR to OMP_LIST_ALLOCATE.
2023-02-22 | Daily bump. | GCC Administrator | 2 | -1/+22
2023-02-21 | libstdc++: Add noexcept-specifier to std::reference_wrapper::operator() | Jonathan Wakely | 2 | -1/+17
This isn't required by the standard, but there's an LWG issue suggesting to add it. Also use __invoke_result instead of result_of, to match the spec in recent standards. libstdc++-v3/ChangeLog: * include/bits/refwrap.h (reference_wrapper::operator()): Add noexcept-specifier and use __invoke_result instead of result_of. * testsuite/20_util/reference_wrapper/invoke-noexcept.cc: New test. (cherry picked from commit e47df5eb56c4e7aca0d3e50826e5aaa1887fa446)
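To show what a conditional noexcept-specifier on a forwarding call operator looks like, here is a small C++17 sketch. The class `ref_view` and the two functors are hypothetical and are not the libstdc++ code; the standard traits `std::invoke_result_t` and `std::is_nothrow_invocable_v` are used in place of the internal `__invoke_result`.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical minimal reference wrapper, for illustration only.
template <typename T>
class ref_view {
  T* ptr_;

public:
  explicit ref_view(T& t) noexcept : ptr_(&t) {}

  // The call operator is noexcept exactly when invoking the wrapped
  // callable with these arguments cannot throw.
  template <typename... Args>
  std::invoke_result_t<T&, Args...> operator()(Args&&... args) const
      noexcept(std::is_nothrow_invocable_v<T&, Args...>) {
    return std::invoke(*ptr_, std::forward<Args>(args)...);
  }
};

struct NoThrowInc {
  int operator()(int x) const noexcept { return x + 1; }
};

struct MayThrowInc {
  int operator()(int x) const { return x + 1; }
};
```

With this shape, `noexcept(w(1))` is true for a wrapper around `NoThrowInc` and false for one around `MayThrowInc`, so callers can query the exception guarantee through the wrapper.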
2023-02-21 | libstdc++: Fix std::filesystem errors with -fkeep-inline-functions [PR108636] | Jonathan Wakely | 3 | -8/+23
With -fkeep-inline-functions there are linker errors when including <filesystem>. This happens because there are some filesystem::path constructors defined inline which call non-exported functions defined in the library. That's usually not a problem, because those constructors are only called by code that's also inside the library. But when the header is compiled with -fkeep-inline-functions those inline functions are emitted even though they aren't called. That then creates an undefined reference to the other library internals. The fix is to just move the private constructors into the library where they are called. That way they are never even seen by users, and so not compiled even if -fkeep-inline-functions is used. libstdc++-v3/ChangeLog: PR libstdc++/108636 * include/bits/fs_path.h (path::path(string_view, _Type)) (path::_Cmpt::_Cmpt(string_view, _Type, size_t)): Move inline definitions to ... * src/c++17/fs_path.cc: ... here. * testsuite/27_io/filesystem/path/108636.cc: New test. (cherry picked from commit db8d6fc572ec316ccfcf70b1dffe3be0b1b37212)
2023-02-21 | Daily bump. | GCC Administrator | 4 | -1/+39
2023-02-20 | c++: ICE with redundant capture [PR108829] | Marek Polacek | 3 | -3/+28
Here we crash in is_capture_proxy: /* Location wrappers should be stripped or otherwise handled by the caller before using this predicate. */ gcc_checking_assert (!location_wrapper_p (decl)); We only crash with the redundant capture: int abyPage = [=, abyPage] { ... } because prune_lambda_captures is only called when there was a default capture, and with [=] only abyPage won't be in LAMBDA_EXPR_CAPTURE_LIST. The problem is that LAMBDA_CAPTURE_EXPLICIT_P wasn't propagated correctly and so var_to_maybe_prune proceeded where it shouldn't. Co-Authored by: Patrick Palka <ppalka@redhat.com> PR c++/108829 gcc/cp/ChangeLog: * pt.cc (prepend_one_capture): Set LAMBDA_CAPTURE_EXPLICIT_P. (tsubst_lambda_expr): Pass LAMBDA_CAPTURE_EXPLICIT_P to prepend_one_capture. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-108829-2.C: New test. * g++.dg/cpp0x/lambda/lambda-108829.C: New test. (cherry picked from commit 02d8ab3e4e2f3d9dc12157a98c976d6698e71e29)
2023-02-20 | Prototype 'GOMP_enable_pinned_mode' | Thomas Schwinge | 2 | -0/+3
Fix-up for og12 commit 842df187487f5b16ae29bbe7e9acd79661a9df48 "openmp: -foffload-memory=pinned". No functional change. libgomp/ * libgomp_g.h (GOMP_enable_pinned_mode): New.
2023-02-20 | Attempt to not just register but allocate OpenMP pinned memory using a device: ChangeLog | Thomas Schwinge | 2 | -0/+34
... forgotten in og12 commit 4bd844f3e0202b3d083f0784f4343570c88bb86c "Attempt to not just register but allocate OpenMP pinned memory using a device".
2023-02-20 | Attempt to not just register but allocate OpenMP pinned memory using a device | Thomas Schwinge | 7 | -87/+98
... instead of 'mmap' plus attempting to register using a device. Implemented for nvptx offloading via 'cuMemHostAlloc'. This re-works og12 commit a5a4800e92773da7126c00a9c79b172494d58ab5 "Attempt to register OpenMP pinned memory using a device instead of 'mlock'". include/ * cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): Remove. libgomp/ * config/linux/allocator.c (linux_memspace_alloc): Add 'init0' formal parameter. Adjust all users. (linux_memspace_alloc, linux_memspace_free): Attempt to allocate OpenMP pinned memory using a device instead of 'mmap' plus attempting to register using a device. * libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked) (GOMP_OFFLOAD_unregister_page_locked): Remove. (GOMP_OFFLOAD_page_locked_host_alloc) (GOMP_OFFLOAD_page_locked_host_free): New. * libgomp.h (gomp_register_page_locked) (gomp_unregister_page_locked): Remove. (gomp_page_locked_host_alloc, gomp_page_locked_host_free): New. (struct gomp_device_descr): Remove 'register_page_locked_func', 'unregister_page_locked_func'. Add 'page_locked_host_alloc_func', 'page_locked_host_free_func'. * plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister) (cuMemHostUnregister): Remove. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked) (GOMP_OFFLOAD_unregister_page_locked): Remove. (GOMP_OFFLOAD_page_locked_host_alloc) (GOMP_OFFLOAD_page_locked_host_free): New. * target.c (gomp_register_page_locked) (gomp_unregister_page_locked): Remove. (gomp_page_locked_host_alloc, gomp_page_locked_host_free): Add. (gomp_load_plugin_for_device): Don't handle 'register_page_locked', 'unregister_page_locked'. Handle 'page_locked_host_alloc', 'page_locked_host_free'. Suggested-by: Andrew Stubbs <ams@codesourcery.com>
2023-02-20 | aarch64: Fix up bfmlal lane pattern [PR104921] | Alex Coplan | 4 | -1/+28
As the testcase shows, this pattern had an incorrect constraint leading to GCC's output getting rejected by the assembler. This patch fixes the constraint accordingly. The test is split into two: one that can run without bf16 support from the assembler and another that checks that the output actually assembles when such support is available. gcc/ChangeLog: PR target/104921 * config/aarch64/aarch64-simd.md (aarch64_bfmlal<bt>_lane<q>v4sf): Use correct constraint for operand 3. gcc/testsuite/ChangeLog: PR target/104921 * gcc.target/aarch64/pr104921-1.c: New test. * gcc.target/aarch64/pr104921-2.c: New test. * gcc.target/aarch64/pr104921.x: Include file for new tests. (cherry picked from commit 277e1f30a5e4e634304a7b8a532825119f0ea47f)
2023-02-20 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 26 | -48/+383
Merge up to r12-9189-gc6e3ecca0e3dcf567d0c843a4987e52591041372 (20th Feb 2023)
2023-02-20 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-19 | Daily bump. | GCC Administrator | 2 | -1/+14
2023-02-18 | LoongArch: Fix multiarch tuple canonization | Xi Ruoyao | 2 | -8/+8
Multiarch tuple will be coded in file or directory names in multiarch-aware distros, so one ABI should have only one multiarch tuple. For example, "--target=loongarch64-linux-gnu --with-abi=lp64s" and "--target=loongarch64-linux-gnusf" should both set multiarch tuple to "loongarch64-linux-gnusf". Before this commit, "--target=loongarch64-linux-gnu --with-abi=lp64s --disable-multilib" will produce wrong result (loongarch64-linux-gnu). A recent LoongArch psABI revision mandates "loongarch64-linux-gnu" to be used for -mabi=lp64d (instead of "loongarch64-linux-gnuf64") for some non-technical reason [1]. Note that we cannot make "loongarch64-linux-gnuf64" an alias for "loongarch64-linux-gnu" because to implement such an alias, we must create thousands of symlinks in the distro and doing so would be completely impractical. This commit also aligns GCC with the revision. Tested by building cross compilers with --enable-multiarch and multiple combinations of --target=loongarch64-linux-gnu*, --with-abi=lp64{s,f,d}, and --{enable,disable}-multilib; then running "xgcc --print-multiarch" and manually verifying the result by eye. [1]: https://github.com/loongson/LoongArch-Documentation/pull/80 gcc/ChangeLog: * config.gcc (triplet_abi): Set its value based on $with_abi, instead of $target. (la_canonical_triplet): Set it after $triplet_abi is set correctly. * config/loongarch/t-linux (MULTILIB_OSDIRNAMES): Make the multiarch tuple for lp64d "loongarch64-linux-gnu" (without "f64" suffix). (cherry picked from commit 017849d9d88f021770a90f12fffec9aa2425ed27)
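The intended one-ABI-to-one-tuple mapping can be written down as a small lookup. This C++17 sketch is only an illustration of the rule stated in the message: the function `multiarch_tuple` is hypothetical (the real logic lives in config.gcc and t-linux), and it covers only the two ABIs whose tuples the message spells out, returning an empty result for anything else rather than guessing.

```cpp
#include <string_view>

// Hypothetical helper: map a LoongArch ABI name to its multiarch tuple,
// keyed on the ABI alone and not on the spelling of --target.
constexpr std::string_view multiarch_tuple(std::string_view abi) {
  if (abi == "lp64d")
    return "loongarch64-linux-gnu";  // no "f64" suffix
  if (abi == "lp64s")
    return "loongarch64-linux-gnusf";
  return {};  // not covered by this sketch
}
```

Keying on the ABI is what makes "--target=loongarch64-linux-gnu --with-abi=lp64s" and "--target=loongarch64-linux-gnusf" agree on the same tuple.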
2023-02-18 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-02-17 | Daily bump. | GCC Administrator | 4 | -1/+63
2023-02-16 | Attempt to register OpenMP pinned memory using a device instead of 'mlock' | Thomas Schwinge | 15 | -19/+460
Implemented for nvptx offloading via 'cuMemHostRegister'. This means: (a) not running into 'mlock' limitations, and (b) the device is aware of this and may optimize host <-> device memory transfers. This re-works og12 commit ab7520b3b4cd9fdabfd63652badde478955bd3b5 "libgomp: pinned memory". include/ * cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): New. libgomp/ * config/linux/allocator.c (linux_memspace_alloc) (linux_memspace_free, linux_memspace_realloc): Attempt to register OpenMP pinned memory using a device instead of 'mlock'. * libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked) (GOMP_OFFLOAD_unregister_page_locked): New. * libgomp.h (gomp_register_page_locked) (gomp_unregister_page_locked): New. (struct gomp_device_descr): Add 'register_page_locked_func', 'unregister_page_locked_func'. * plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister) (cuMemHostUnregister): New. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked) (GOMP_OFFLOAD_unregister_page_locked): New. * target.c (gomp_register_page_locked) (gomp_unregister_page_locked): New. (gomp_load_plugin_for_device): Handle 'register_page_locked', 'unregister_page_locked'. * testsuite/libgomp.c/alloc-pinned-1.c: Adjust. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise. * testsuite/libgomp.c/alloc-pinned-3.c: Likewise. * testsuite/libgomp.c/alloc-pinned-4.c: Likewise. * testsuite/libgomp.c/alloc-pinned-5.c: Likewise. * testsuite/libgomp.c/alloc-pinned-6.c: Likewise.
2023-02-16 | In 'libgomp/allocator.c:omp_realloc', route 'free' through 'MEMSPACE_FREE' | Thomas Schwinge | 2 | -1/+13
... to not run into a SIGSEGV if a non-'malloc'-based allocation is 'free'd here. Fix-up for og12 commit c5d1d7651297a273321154a5fe1b01eba9dcf604 "libgomp, nvptx: low-latency memory allocator". libgomp/ * allocator.c (omp_realloc): Route 'free' through 'MEMSPACE_FREE'.
2023-02-16 | Clarify/verify OpenMP 'omp_calloc' zero-initialization for pinned memory | Thomas Schwinge | 7 | -0/+60
Clarification for og12 commit ab7520b3b4cd9fdabfd63652badde478955bd3b5 "libgomp: pinned memory". No functional change. libgomp/ * config/linux/allocator.c (linux_memspace_alloc) (linux_memspace_calloc): Clarify zero-initialization for pinned memory. * testsuite/libgomp.c/alloc-pinned-1.c: Verify zero-initialization for pinned memory. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise. * testsuite/libgomp.c/alloc-pinned-3.c: Likewise. * testsuite/libgomp.c/alloc-pinned-4.c: Likewise. * testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
2023-02-16 | Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space' | Thomas Schwinge | 4 | -3/+11
Clean-up for og12 commit 84914e197d91a67b3d27db0e4c69a433462983a5 "openmp, nvptx: ompx_unified_shared_mem_alloc". No functional change. libgomp/ * config/linux/allocator.c (linux_memspace_calloc): Elide (innocuous) duplicate 'if' condition. * config/nvptx/allocator.c (nvptx_memspace_free): Explicitly handle 'memspace == ompx_host_mem_space'. * libgomp.h (gomp_is_usm_ptr): Remove.
2023-02-16 | Un-break nvptx libgomp build | Thomas Schwinge | 2 | -1/+4
    In file included from [...]/libgomp/config/nvptx/allocator.c:49:
    [...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid preprocessing directive #deine; did you mean #define?
       52 | #deine BASIC_ALLOC_YIELD
          |  ^~~~~
          |  define
Yes, indeed. Fix-up for og12 commit 9583738a62a33a276b2aad980a27e77097f95924 "nvptx, libgomp: Move the low-latency allocator code". libgomp/ * basic-allocator.c (BASIC_ALLOC_YIELD): instead of '#deine', '#define' it.
2023-02-16 | libstdc++: Fix incorrect function call in -ffast-math optimization | Matthias Kretz | 1 | -2/+2
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd_math.h (__hypot): Bitcasting between scalars requires the __bit_cast helper function instead of simd_bit_cast. (cherry picked from commit a5de17d9120dde7e6598a05ea4d1556c2783c69b)
2023-02-16 | libstdc++: Fix incorrect __builtin_is_constant_evaluated calls | Matthias Kretz | 1 | -9/+12
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd_x86.h (_SimdImplX86::_S_not_equal_to, _SimdImplX86::_S_less) (_SimdImplX86::_S_less_equal): Do not call __builtin_is_constant_evaluated in constexpr-if. (cherry picked from commit 1fd3836463c65f695831ef04c7dbda1e7a1794ba)