Age  Commit message  Author  Files  Lines
2023-01-26restrict gcc.dg/pr107554.c to 64bit platformsRichard Biener1-1/+1
The following avoids exceeding the maximum object size on 32bit platforms. * gcc.dg/pr107554.c: Restrict to lp64. (cherry picked from commit e7ebdf51ea514ad0b2272ecfa97d6ec72a527e40)
2023-01-26Daily bump.GCC Administrator3-1/+39
2023-01-25aarch64: fix warning emission for ABI break since GCC 9.1Christophe Lyon15-7/+1132
While looking at PR 105549, which is about fixing the ABI break introduced in GCC 9.1 in parameter alignment with bit-fields, we noticed that the GCC 9.1 warning is not emitted in all the cases where it should be. This patch fixes that and the next patch in the series fixes the GCC 9.1 break. We split this into two patches since patch #2 introduces a new ABI break starting with GCC 13.1. This way, patch #1 can be back-ported to release branches if needed to fix the GCC 9.1 warning issue. The main idea is to add a new global boolean that indicates whether we're expanding the start of a function, so that aarch64_layout_arg can emit warnings for callees as well as callers. This removes the need for aarch64_function_arg_boundary to warn (with its incomplete information). However, in the first patch there are still cases where we emit warnings where we should not; this is fixed in patch #2 where we can distinguish between GCC 9.1 and GCC 13.1 ABI breaks properly. The fix in aarch64_function_arg_boundary (replacing & with &&) addresses what looks like an oversight in a previous commit in this area which changed 'abi_break' from a boolean to an integer. We also take the opportunity to fix the comment above aarch64_function_arg_alignment since the value of the abi_break parameter was changed in a previous commit, no longer matching the description. 2022-11-28 Christophe Lyon <christophe.lyon@arm.com> Richard Sandiford <richard.sandiford@arm.com> gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix comment. (aarch64_layout_arg): Factorize warning conditions. (aarch64_function_arg_boundary): Fix typo. * function.cc (currently_expanding_function_start): New variable. (expand_function_start): Handle currently_expanding_function_start. * function.h (currently_expanding_function_start): Declare. gcc/testsuite/ChangeLog: * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning.h: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning.h: New test. (cherry picked from commit 3df1a115be22caeab3ffe7afb12e71adb54ff132)
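To illustrate why replacing & with && matters once 'abi_break' is an integer rather than a boolean, here is a minimal C++ sketch. The function names and the values 2 and 1 are hypothetical and are not taken from aarch64.cc; the point is only that the bitwise AND of two nonzero integers can be zero while their logical AND is true.

```cpp
#include <cassert>

// Hypothetical stand-ins for a condition of the form "abi_break OP warn_flag".
// Neither function is GCC code; they only contrast the two operators.

// Bitwise AND: true only if the two integers share a set bit.
bool warn_bitwise(int abi_break, int warn_flag)
{
  return (abi_break & warn_flag) != 0;
}

// Logical AND: true whenever both integers are nonzero.
bool warn_logical(int abi_break, int warn_flag)
{
  return abi_break && warn_flag;
}

// Usage check with made-up values: 2 and 1 share no bit, yet both are nonzero.
void check_and_operators()
{
  assert(!warn_bitwise(2, 1));
  assert(warn_logical(2, 1));
}
```

With a boolean 'abi_break' (0 or 1) both forms agree, which is presumably why the difference only surfaced after the type change.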
2023-01-25Daily bump.GCC Administrator3-1/+73
2023-01-24tree-optimization/108164 - undefined overflow with IV vectorizationRichard Biener2-5/+26
vect_update_ivs_after_vectorizer can end up emitting a signed IV update when the loop body performed an unsigned computation. The following makes sure to perform that update in the type of the loop update type to avoid undefined behavior on overflow. PR tree-optimization/108164 * tree-vect-loop-manip.cc (vect_update_ivs_after_vectorizer): Perform vect_step_op_add update in the appropriate type. * gcc.dg/pr108164.c: New testcase. (cherry picked from commit ec459469f8a75d96a9b26694554efcc900d411de)
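As a source-level analogy for the fix above (not the GIMPLE the vectorizer emits), the following sketch performs an induction-variable update in an unsigned type, where overflow wraps instead of being undefined. The helper name and the values are hypothetical.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper, not GCC code: computes start + niters * step in an
// unsigned type, where overflow wraps modulo 2^N instead of being undefined.
// Doing the same arithmetic in 'int' would be undefined behavior on overflow.
unsigned advance_iv_wrapping(unsigned start, unsigned niters, unsigned step)
{
  return start + niters * step;
}

// Usage check: (UINT_MAX - 1) + 2 * 2 wraps around to 2.
void check_advance_iv()
{
  assert(advance_iv_wrapping(UINT_MAX - 1u, 2u, 2u) == 2u);
}
```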
2023-01-24tree-optimization/108076 - if-conversion and forced labelsRichard Biener2-2/+29
When doing if-conversion we simply throw away labels without checking whether they are possibly targets of non-local gotos or have their address taken. The following rectifies this and refuses to if-convert such loops. PR tree-optimization/108076 * tree-if-conv.cc (if_convertible_loop_p_1): Reject blocks with non-local or forced labels that we later remove labels from. * gcc.dg/torture/pr108076.c: New testcase. (cherry picked from commit b4fddbe9592e9feb37ce567d90af822b75995531)
2023-01-24middle-end/107994 - ICE after error with comparison gimplificationRichard Biener1-2/+4
The following avoids passing down error_mark_node to fold_convert. PR middle-end/107994 * gimplify.cc (gimplify_expr): Catch erroneous comparison operand. (cherry picked from commit 845b514e8a150447ba041294586af76a6ac05158)
2023-01-24tree-optimization/107554 - fix ICE in strlen optimizationRichard Biener2-1/+13
The following fixes a wrongly typed variable causing an ICE. PR tree-optimization/107554 * tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Use unsigned HOST_WIDE_INT type for the strlen. * gcc.dg/pr107554.c: New testcase. Co-Authored-By: Nikita Voronov <nik_1357@mail.ru> (cherry picked from commit 81de4037454275f8ed6d858fbc129e832c6147ef)
2023-01-24driver: fix environ corruption after putenv() [PR106624]Sergei Trofimovich1-1/+1
The bug appeared after r13-2010-g1270ccda70ca09 "Factor out jobserver_active_p" slightly changed `putenv()` use from allocating to non-allocating: -xputenv (concat ("MAKEFLAGS=", dup, NULL)); +xputenv (jinfo.skipped_makeflags.c_str ()); `xputenv()` (and `putenv()`) don't copy strings and only store the pointer in the `environ` global table. As a result `environ` got corrupted as soon as the `jinfo.skipped_makeflags` store got deallocated. This started causing bootstrap crashes in `execv()` calls: xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address The change restores memory allocation for the `xputenv()` argument. gcc/ PR driver/106624 * gcc.cc (driver::detect_jobserver): Allocate storage for the xputenv() argument using xstrdup(). (cherry picked from commit 2b403297b111c990c331b5bbb6165b061ad2259b)
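The ownership rule described above can be sketched with plain POSIX `putenv()`. This is an illustration only: `set_env_owned`, `demo_makeflags` and the variable name `DEMO_MAKEFLAGS` are hypothetical, and plain `strdup()` stands in for GCC's `xstrdup()`. Because `putenv()` stores the pointer it is given, the string must outlive the environment entry, so the sketch hands it a heap copy that is intentionally never freed.

```cpp
#include <cassert>
#include <stdlib.h>   // putenv, getenv, free (putenv is POSIX)
#include <string.h>   // strdup (POSIX)
#include <string>

// Hypothetical helper: give putenv() its own heap copy of "NAME=value".
// putenv() keeps the pointer, so the copy must stay allocated for as long as
// the entry is part of the environment.
bool set_env_owned(const std::string &assignment)
{
  char *copy = strdup(assignment.c_str());
  if (copy == nullptr)
    return false;
  if (putenv(copy) != 0)
    {
      free(copy);
      return false;
    }
  return true;
}

// The std::string is destroyed before getenv() runs; the lookup is still
// valid because the environment refers to the duplicated buffer.
std::string demo_makeflags()
{
  {
    std::string flags = "DEMO_MAKEFLAGS=-j4";
    bool ok = set_env_owned(flags);
    assert(ok);
    (void) ok;
  }
  const char *value = getenv("DEMO_MAKEFLAGS");
  return value ? value : "";
}
```

Passing `flags.c_str()` directly to `putenv()` instead would leave `environ` pointing into freed storage once `flags` goes out of scope, which is the failure mode the commit describes.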
2023-01-24Update 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode': 'libgomp/ChangeLog.omp'Thomas Schwinge1-0/+5
2023-01-24Update 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode'Thomas Schwinge1-4/+6
libgomp/ * libgomp.texi (nvptx): Update for 'nvptx, libgfortran: Switch out of "minimal" mode'.
2023-01-24Make 'libgcc/config/nvptx/crt0.c' build '--without-headers'Thomas Schwinge2-1/+11
..., where it currently fails: [...]/libgcc/config/nvptx/crt0.c:22:10: fatal error: stdlib.h: No such file or directory 22 | #include <stdlib.h> | ^~~~~~~~~~ Fix-up for "nvptx: Support global constructors/destructors via 'collect2'". libgcc/ * config/nvptx/crt0.c [!HAVE_STDLIB_H]: Don't '#include <stdlib.h>'. (atexit): Prototype.
2023-01-24Daily bump.GCC Administrator4-1/+45
2023-01-23Fortran: error recovery for invalid CLASS component [PR108434]Harald Anlauf2-2/+13
gcc/fortran/ChangeLog: PR fortran/108434 * expr.cc (class_allocatable): Prevent NULL pointer dereference or invalid read. (class_pointer): Likewise. gcc/testsuite/ChangeLog: PR fortran/108434 * gfortran.dg/pr108434.f90: New test. (cherry picked from commit 117848f425a3c0eda85517b4bdaf2ebe3bc705c2)
2023-01-23Merge branch 'releases/gcc-12' into devel/omp/gcc-12Tobias Burnus10-24/+185
Merge up to r12-9058-ge1357577e6e39430869e294f94c2c547717b960f (23rd Jan 2023)
2023-01-23PR 106101: IBM zSystems: Fix strict_low_part problemAndreas Krebbel5-22/+116
This avoids generating illegal (strict_low_part (reg ...)) RTXs. This required two changes: 1. Do not use gen_lowpart to generate the inner expression of a STRICT_LOW_PART. gen_lowpart might fold the SUBREG either because there is already a paradoxical subreg or because it can directly be applied to the register. A new wrapper function makes sure that we always end up having an actual SUBREG. 2. Change the movstrict patterns to enforce a SUBREG as inner operand of the STRICT_LOW_PARTs. The new predicate introduced for the destination operand requires a SUBREG expression with a register_operand as inner operand. However, since reload strips away the majority of the SUBREGs we have to accept single registers as well once we reach reload. Bootstrapped and regression tested on IBM zSystems 64 bit. gcc/ChangeLog: PR target/106101 * config/s390/predicates.md (subreg_register_operand): New predicate. * config/s390/s390-protos.h (s390_gen_lowpart_subreg): New function prototype. * config/s390/s390.cc (s390_gen_lowpart_subreg): New function. (s390_expand_insv): Use s390_gen_lowpart_subreg instead of gen_lowpart. * config/s390/s390.md ("*get_tp_64", "*zero_extendhisi2_31") ("*zero_extendqisi2_31", "*zero_extendqihi2_31"): Likewise. ("movstrictqi", "movstricthi", "movstrictsi"): Use the subreg_register_operand predicate instead of register_operand. gcc/testsuite/ChangeLog: PR target/106101 * gcc.c-torture/compile/pr106101.c: New test. (cherry picked from commit 585a21bab3ec688c2039bff2922cc372d8558283)
2023-01-23Daily bump.GCC Administrator1-1/+1
2023-01-22Daily bump.GCC Administrator3-1/+11
2023-01-21Backported from master:Jerry DeLisle2-1/+58
PR fortran/106731 gcc/fortran/ChangeLog: * trans-array.cc (gfc_trans_auto_array_allocation): Remove gcc_assert (!TREE_STATIC()). gcc/testsuite/ChangeLog: * gfortran.dg/pr106731.f90: New test.
2023-01-21Daily bump.GCC Administrator1-1/+1
2023-01-20nvptx, libgfortran: Switch out of "minimal" modeThomas Schwinge10-52/+33
..., in order to enable (portions of) Fortran I/O, for example. libgfortran/ChangeLog: * configure: Regenerate. * configure.ac: No longer set LIBGFOR_MINIMAL for nvptx. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-print-1.f90: Adjust. * testsuite/libgomp.fortran/target-print-1-nvptx.f90: Remove. * testsuite/libgomp.oacc-fortran/print-1.f90: Adjust. * testsuite/libgomp.oacc-fortran/print-1-nvptx.f90: Remove. * testsuite/libgomp.oacc-fortran/error_stop-2.f: Adjust. * testsuite/libgomp.oacc-fortran/stop-2.f: Likewise. Co-authored-by: Andrew Stubbs <ams@codesourcery.com>
2023-01-20nvptx, libgcc: Stub unwinding implementationThomas Schwinge3-1/+44
Adding stub '_Unwind_Backtrace', '_Unwind_GetIPInfo' functions is necessary for linking libbacktrace, as a normal (non-'LIBGFOR_MINIMAL') configuration of libgfortran wants to do, for example. The file 'libgcc/config/nvptx/unwind-nvptx.c' is copied from 'libgcc/config/gcn/unwind-gcn.c'. libgcc/ChangeLog: * config/nvptx/t-nvptx: Add unwind-nvptx.c. * config/nvptx/unwind-nvptx.c: New file. Co-authored-by: Andrew Stubbs <ams@codesourcery.com>
2023-01-20nvptx: Support global constructors/destructors via 'collect2' for offloadingThomas Schwinge4-2/+185
This extends "nvptx: Support global constructors/destructors via 'collect2'" for offloading. libgcc/ * config/nvptx/crtstuff.c ["mgomp"] (__do_global_ctors__entry__mgomp) (__do_global_dtors__entry__mgomp): New. [!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry): New. libgomp/ * plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New. (nvptx_close_device, GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image): Call it.
2023-01-20nvptx: Support global constructors/destructors via 'collect2'Thomas Schwinge12-7/+139
The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" generally, and for global destructors support: newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'". gcc/ * collect2.cc (write_c_file_glob): Allow for 'COLLECT2_MAIN_REFERENCE' override. * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. * config/nvptx/nvptx.h: Adjust. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may appear in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', '__do_global_dtors'. * config/nvptx/crtstuff.c: New. * config/nvptx/t-nvptx: Adjust.
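For readers unfamiliar with the 'constructor' function attribute mentioned above, here is a small host-side sketch, assuming a compiler that implements this GNU extension (GCC or Clang) and a target that honors constructor priorities. All names and the priorities 201 and 202 are hypothetical; the sketch has not been run on nvptx.

```cpp
#include <cassert>

// Statically zero-initialized, so both are valid before any constructor runs.
static int init_order[2];
static int init_count = 0;

// Functions marked 'constructor' run before main(); a lower priority number
// runs earlier. Priorities 0 to 100 are reserved for the implementation.
__attribute__((constructor(201))) static void early_init()
{
  init_order[init_count++] = 201;
}

__attribute__((constructor(202))) static void late_init()
{
  init_order[init_count++] = 202;
}

// Usage check, to be called from main() or later.
void check_constructor_order()
{
  assert(init_count == 2);
  assert(init_order[0] == 201);
  assert(init_order[1] == 202);
}
```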
2023-01-20nvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'Thomas Schwinge5-10/+28
As I have reported to Nvidia on 2022-12-01 in 'NVIDIA Incident Report (3891704): ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"', 'ptxas' has an inscrutable error mode for duplicate declarations: ptxas softstack-decl-1.o, line 11; error : '.extern' variable '__nvptx_stacks' cannot be resolved by a '.static' ptxas fatal : Ptx assembly aborted due to errors nvptx-as: ptxas returned 255 exit status ptxas uniform-simt-decl-1.o, line 12; error : '.extern' variable '__nvptx_uni' cannot be resolved by a '.static' ptxas fatal : Ptx assembly aborted due to errors nvptx-as: ptxas returned 255 exit status This is inscrutable, because (a) what is "cannot be resolved by a '.static'" supposed to tell me (there is no '.static' in PTX?), and (b) why aren't repeated declarations just verified to match the first, but otherwise a no-op (like in other programming languages)? gcc/ * config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice '__nvptx_stacks', '__nvptx_uni' declarations. (nvptx_file_end): Don't emit duplicate declarations for those. gcc/testsuite/ * gcc.target/nvptx/softstack-decl-1.c: Make 'dg-do assemble', adjust. * gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
2023-01-20Add 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'Thomas Schwinge3-0/+52
... to document the status quo re implicit (via 'need_softstack_decl', 'need_unisimt_decl') and explicit declarations of '__nvptx_stacks', '__nvptx_uni'. gcc/testsuite/ * gcc.target/nvptx/softstack-decl-1.c: New. * gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
2023-01-20nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp executionThomas Schwinge7-1/+66
For example, this allows for '-muniform-simt' code to be executed single-threaded, which currently fails (device-side 'trap'), as the 0xffffffff mask isn't correct if not all 32 threads of a warp are active. The same issue/fix, I suppose but have not verified, would apply if we were to allow for OpenACC 'vector_length' smaller than 32, for example for OpenACC 'serial'. We use 'nvptx_uniform_warp_check' only for PTX ISA version less than 6.0. Otherwise we're using 'nvptx_warpsync', which emits 'bar.warp.sync 0xffffffff', which evidently appears to do the right thing. (I've tested '-muniform-simt' code executing single-threaded.) gcc/ * config/nvptx/nvptx.md (nvptx_uniform_warp_check): Make fit for non-full-warp execution. gcc/testsuite/ * gcc.target/nvptx/nvptx.exp (check_effective_target_default_ptx_isa_version_at_least_6_0): New. * gcc.target/nvptx/uniform-simt-5.c: New. libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Assert what we know about 'blockDimX'.
2023-01-20Clean up after newlib "nvptx: In offloading execution, map '_exit' to 'abort' [GCC PR85463]"Thomas Schwinge10-33/+50
PR target/85463 libgfortran/ * runtime/minimal.c [__nvptx__] (exit): Don't override. libgomp/ * config/nvptx/error.c (exit): Don't override. * testsuite/libgomp.oacc-fortran/error_stop-1.f: Update. * testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise. * testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise. * testsuite/libgomp.oacc-fortran/stop-1.f: Likewise. * testsuite/libgomp.oacc-fortran/stop-2.f: Likewise. * testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.
2023-01-20Fix 'libgomp.c/simd-math-1.c' configuration, againThomas Schwinge2-1/+3
Tobias pointed out that as of my recent og12 commit e7d4bcb974915bfe95be6c385641fc66a4201581 "Fix 'libgomp.c/simd-math-1.c' configuration", in GCC configurations without GCN offloading configured, we'd get: xgcc: error: GCC is not configured to support 'amdgcn-amdhsa' as '-foffload=' argument ("Interestingly", GCC doesn't complain for '-foffload-options=-lm' if there are no offload targets configured...) libgomp/ * testsuite/libgomp.c/simd-math-1.c: Fix configuration, again.
2023-01-20Force '--param openacc-kernels=parloops' in 'libgomp.oacc-c-c++-common/abort-3.c'Thomas Schwinge2-0/+60
libgomp/ * testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force '--param openacc-kernels=parloops'.
2023-01-20Fix 'libgomp.c/simd-math-1.c' configurationThomas Schwinge2-2/+6
If nvptx offloading is configured in addition to GCN, we see: FAIL: libgomp.c/simd-math-1.c (test for excess errors) UNRESOLVED: libgomp.c/simd-math-1.c compilation failed to produce executable x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: unrecognized command-line option '-mstack-size=3000000' Thus, restrict that option to GCN offloading compilation, and on the other hand, there's no reason to skip this test for non-GCN offloading execution: even if not SIMD-vectorized there, we still benefit from correctness testing. libgomp/ * testsuite/libgomp.c/simd-math-1.c: Fix configuration.
2023-01-20Daily bump.GCC Administrator1-1/+1
2023-01-19Merge branch 'releases/gcc-12' into devel/omp/gcc-12Tobias Burnus6-24/+1180
Merge up to r12-9052-g61ef24af3ce8ec9c5eb65770f8047d98f42a93bf (19th Jan 2023)
2023-01-19openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459]Jakub Jelinek4-2/+60
For the case where a collapse(2) inner loop has an init expression dependent on a non-constant multiple of the outer iterator and the condition upper bound expression doesn't depend on the outer iterator, expand_omp_for_init_counts was using fold_unary (NEGATE_EXPR, ...). That just returns NULL if the expression can't be folded; we need fold_build1 instead. 2023-01-19 Jakub Jelinek <jakub@redhat.com> PR middle-end/108459 * omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather than fold_unary for NEGATE_EXPR. * testsuite/libgomp.c/pr108459.c: New test. (cherry picked from commit 46644ec99cb355845b23bb1d02775c057ed8ee88)
2023-01-19Daily bump.GCC Administrator2-1/+11
2023-01-18libstdc++: Avoid recursion in __nothrow_wait_cv::wait [PR105730]Jonathan Wakely1-1/+21
The commit r12-5877-g9e18a25331fa25 removed the incorrect noexcept-specifier from std::condition_variable::wait and gave the new symbol version @@GLIBCXX_3.4.30. It also redefined the original symbol std::condition_variable::wait(unique_lock<mutex>&)@GLIBCXX_3.4.11 as an alias for a new symbol, __gnu_cxx::__nothrow_wait_cv::wait, which still has the incorrect noexcept guarantee. That __nothrow_wait_cv::wait is just a wrapper around the real condition_variable::wait which adds noexcept and so terminates on a __forced_unwind exception. This doesn't work on uclibc, possibly due to a dynamic linker bug. When __nothrow_wait_cv::wait calls the condition_variable::wait function it binds to the alias symbol, which means it just calls itself recursively until the stack overflows. This change avoids the possibility of a recursive call by changing the __nothrow_wait_cv::wait function so that instead of calling condition_variable::wait it re-implements it. This requires accessing the private _M_cond member of condition_variable, so we need to use the trick of instantiating a template with the member-pointer of the private member. libstdc++-v3/ChangeLog: PR libstdc++/105730 * src/c++11/compatibility-condvar.cc (__nothrow_wait_cv::wait): Access private data member of base class and call its wait member. (cherry picked from commit ee4af2ed0b7322884ec4ff537564683c3749b813)
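The "trick of instantiating a template with the member-pointer of the private member" mentioned above can be sketched as follows. This is a generic illustration, assuming a hypothetical class `Widget` with a private field `secret_`; it is not the code in compatibility-condvar.cc. It relies on the rule that access checking is not applied to names used in an explicit template instantiation.

```cpp
#include <cassert>

// Hypothetical class with a private data member we want to reach.
class Widget
{
  int secret_ = 42;
};

// Instantiating this template defines a friend function that returns the
// member pointer passed as a template argument.
template <typename Tag, typename Tag::type Ptr>
struct MemberPtrExporter
{
  friend typename Tag::type get_member_ptr(Tag) { return Ptr; }
};

// Tag type naming the member-pointer type and declaring the friend, so the
// function can be found by argument-dependent lookup.
struct WidgetSecretTag
{
  using type = int Widget::*;
  friend type get_member_ptr(WidgetSecretTag);
};

// Explicit instantiation: naming the private member here is permitted.
template struct MemberPtrExporter<WidgetSecretTag, &Widget::secret_>;

// Reads the private member through the exported member pointer.
int read_secret(Widget &w)
{
  return w.*get_member_ptr(WidgetSecretTag{});
}

// Usage check.
void check_read_secret()
{
  Widget w;
  assert(read_secret(w) == 42);
}
```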
2023-01-18Daily bump.GCC Administrator1-1/+1
2023-01-17libgomp: Add forgotten Changelog.omp entriesAndrew Stubbs1-0/+18
2023-01-17Daily bump.GCC Administrator3-1/+19
2023-01-16Add cpplib ka.poJoseph Myers1-0/+1110
* ka.po: New.
2023-01-16libstdc++: Unblock atomic wait on non-futex platforms [PR106183]Jonathan Wakely1-22/+20
When using a mutex and condition variable, the notifying thread needs to increment _M_ver while holding the mutex lock, and the waiting thread needs to re-check after locking the mutex. This avoids a missed notification as described in the PR. By moving the increment of _M_ver to the base _M_notify we can make the use of the mutex local to the use of the condition variable, and simplify the code a little. We can use a relaxed store because the mutex already provides sequential consistency. Also we don't need to check whether __addr == &_M_ver because we know that's always true for platforms that use a condition variable, and so we also know that we always need to use notify_all() not notify_one(). Reviewed-by: Thomas Rodgers <trodgers@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/106183 * include/bits/atomic_wait.h (__waiter_pool_base::_M_notify): Move increment of _M_ver here. [!_GLIBCXX_HAVE_PLATFORM_WAIT]: Lock mutex around increment. Use relaxed memory order and always notify all waiters. (__waiter_base::_M_do_wait) [!_GLIBCXX_HAVE_PLATFORM_WAIT]: Check value again after locking mutex. (__waiter_base::_M_notify): Remove increment of _M_ver. (cherry picked from commit af98cb88eb4be6a1668ddf966e975149bf8610b1)
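The locking discipline described above can be sketched in isolation. The class below is a simplified, hypothetical model (the names `VersionedEvent`, `ver_` and `wait_while_equal` are not from atomic_wait.h): the notifier bumps the version while holding the mutex, and the waiter re-checks the version after locking, so a notification cannot slip in between the waiter's check and its call to `wait()`.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical, simplified model of a version counter guarded by a mutex
// and condition variable. Not the libstdc++ implementation.
class VersionedEvent
{
  std::mutex mtx_;
  std::condition_variable cv_;
  std::atomic<unsigned> ver_{0};

public:
  unsigned version() const { return ver_.load(std::memory_order_acquire); }

  // Blocks while the version still equals 'old'. The check happens with the
  // mutex held, so it cannot race with the increment in notify_all().
  void wait_while_equal(unsigned old)
  {
    std::unique_lock<std::mutex> lock(mtx_);
    while (ver_.load(std::memory_order_relaxed) == old)
      cv_.wait(lock);
  }

  // Increments the version under the mutex, then wakes every waiter.
  // A relaxed increment suffices here because the mutex orders it.
  void notify_all()
  {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      ver_.fetch_add(1, std::memory_order_relaxed);
    }
    cv_.notify_all();
  }
};

// Single-threaded usage check: after a notification, a waiter holding the
// old version returns immediately instead of blocking.
void check_versioned_event()
{
  VersionedEvent ev;
  unsigned old = ev.version();
  ev.notify_all();
  ev.wait_while_equal(old);
  assert(ev.version() == old + 1);
}
```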
2023-01-16Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107706]Tobias Burnus6-5/+45
gcc/fortran/ChangeLog: PR fortran/107706 * openmp.cc (gfc_resolve_omp_assumptions): Reject nonscalars. gcc/testsuite/ChangeLog: PR fortran/107706 * gfortran.dg/gomp/assume-2.f90: Update dg-error. * gfortran.dg/gomp/assumes-2.f90: Likewise. * gfortran.dg/gomp/assume-5.f90: New test. (cherry picked from commit 2ce55247a8bf32985a96ed63a7a92d36746723dc)
2023-01-16Merge branch 'releases/gcc-12' into devel/omp/gcc-12Tobias Burnus17-52/+532
Merge up to r12-9046-gd369eb486bdc720e4c50563226dbbb11a0226b5d (16th Jan 2023)
2023-01-16Daily bump.GCC Administrator1-1/+1
2023-01-15Daily bump.GCC Administrator1-1/+1
2023-01-14Daily bump.GCC Administrator1-1/+1
2023-01-13libgomp, amdgcn: Switch USM to 128-byte alignmentAndrew Stubbs1-1/+2
This should optimize cache-lines on the AMD GPUs somewhat. libgomp/ChangeLog: * usm-allocator.c (ALIGN): Use 128-byte alignment.
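As a rough illustration of what rounding allocations to a 128-byte boundary means, here is a generic align-up helper. The names `kUsmAlign` and `align_up` are hypothetical, and the actual definition of the ALIGN macro in usm-allocator.c is not shown in this log; only the value 128 comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant; the value 128 is taken from the commit message.
constexpr std::size_t kUsmAlign = 128;

// Rounds n up to the next multiple of 'alignment'.
// Assumes 'alignment' is a nonzero power of two.
constexpr std::size_t align_up(std::size_t n, std::size_t alignment = kUsmAlign)
{
  return (n + alignment - 1) & ~(alignment - 1);
}

// Usage check.
void check_align_up()
{
  assert(align_up(0) == 0);
  assert(align_up(1) == 128);
  assert(align_up(128) == 128);
  assert(align_up(129) == 256);
}
```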
2023-01-13Daily bump.GCC Administrator1-1/+1
2023-01-12Daily bump.GCC Administrator3-1/+28
2023-01-11amdgcn, libgomp: custom USM allocatorAndrew Stubbs2-89/+347
There were problems with critical driver data sharing pages with USM data, so this new allocator implementation moves USM to entirely different pages. libgomp/ChangeLog: * plugin/plugin-gcn.c: Include sys/mman.h and unistd.h. (usm_heap_create): New function. (struct usm_splay_tree_key_s): Delete function. (usm_splay_compare): Delete function. (splay_tree_prefix): Delete define. (GOMP_OFFLOAD_usm_alloc): Use new allocator. (GOMP_OFFLOAD_usm_free): Likewise. (GOMP_OFFLOAD_is_usm_ptr): Likewise. (gomp_fatal): Delete macro. (splay_tree_c): Delete. * usm-allocator.c: New file.