This optional header is used to bring in the definition of the
struct __ifunc_arg_t type. Since it was added to glibc only
recently, the previous implementation had to check whether the
header is present and, if not, provide its own definition.
This creates dead code, because one of the two alternatives is
never tested. The ABI specification for ifunc resolvers allows
creating one's own ABI-compatible definition of this type, which is
the right way of doing it.
In addition to improving consistency, the new approach also makes
it easier to add new fields to struct __ifunc_arg_t without having
to work around situations where the definition imported from the
header lacks these new fields.
The ABI allows defining as many hwcap fields in this struct as needed,
provided that at runtime we only access the fields that are permitted
by the _size value.
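For illustration, a minimal sketch of such a local, ABI-compatible
definition with a _size-guarded accessor; the helper name and the exact
field set beyond _hwcap2 are illustrative, not taken from the patch:

#include <stddef.h>

/* Locally defined, ABI-compatible layout; sys/ifunc.h is no longer used.  */
typedef struct __ifunc_arg_t
{
  unsigned long _size;    /* Size of this struct, filled in by the caller.  */
  unsigned long _hwcap;
  unsigned long _hwcap2;
  unsigned long _hwcap3;
  unsigned long _hwcap4;
} __ifunc_arg_t;

#define _IFUNC_ARG_HWCAP (1ULL << 62)

/* Hypothetical helper: read _hwcap3 only if _size says it exists.  */
static inline unsigned long
get_hwcap3 (unsigned long hwcap, const __ifunc_arg_t *arg)
{
  if ((hwcap & _IFUNC_ARG_HWCAP)
      && arg->_size >= offsetof (__ifunc_arg_t, _hwcap3) + sizeof (arg->_hwcap3))
    return arg->_hwcap3;
  return 0;
}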
gcc/
* config/aarch64/aarch64.cc (build_ifunc_arg_type):
Add new fields _hwcap3 and _hwcap4.
libatomic/
* config/linux/aarch64/host-config.h (__ifunc_arg_t):
Remove sys/ifunc.h and add new fields _hwcap3 and _hwcap4.
libgcc/
* config/aarch64/cpuinfo.c (__ifunc_arg_t): Likewise.
(__init_cpu_features): Obtain and assign values for the
fields _hwcap3 and _hwcap4.
(__init_cpu_features_constructor): Check _size in the
arg argument.
|
|
This is just a port of the previously posted patch to mingw, which
clearly has the same problems.
2025-04-16 Jakub Jelinek <jakub@redhat.com>
PR libgcc/101075
PR libgcc/119796
* config/mingw/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.
|
|
As mentioned in the PR (and I think in PR101075 too), we can run into
deadlock with libat_lock_n calls with larger n.
As mentioned in PR66842, we use multiple locks (normally 64 mutexes,
one for each 64-byte cache line in a 4KiB page) and currently can lock
more than one lock, in particular for n in [0, 64] a single lock, for
n in [65, 128] 2 locks, for n in [129, 192] 3 locks, etc.
There are two problems with this:
1) we can deadlock if there is some wrap-around, because the locks are
always acquired in order from addr_hash (ptr) up to
locks[NLOCKS-1].mutex and then, if needed, from locks[0].mutex onwards;
so if e.g. 2 threads perform libat_lock_n with n = 2048+64, in one
case with a pointer starting at a page boundary and in the other at
page boundary + 2048 bytes, the first thread can lock the first
32 mutexes, the second thread can lock the last 32 mutexes, and
then the first thread waits for lock 32 held by the second thread and
the second thread waits for lock 0 held by the first thread;
fixed below by always locking the locks in order of increasing
index; if there is a wrap-around, this is done in 2 loops, first
locking some locks at the start of the array and then some at the
end of it
2) the number of locks seems to be determined solely by the
n value; I think that is wrong, as we don't know the structure alignment
on the libatomic side - it could very well be a 1-byte-aligned struct -
and so how many cache lines are actually (partly or fully) occupied
by the atomic access depends not just on the size, but also on
ptr % WATCH_SIZE; e.g. a 2-byte structure at address page_boundary+63
should IMHO take 2 locks because it occupies the first and second
cache lines
Note, before this patch it locked exactly one lock for n = 0, while
with this patch it can lock either no locks at all (if the access is at
a cacheline boundary) or 1 (otherwise).
Dunno if libatomic APIs can be called for zero sizes and whether
we actually care that much how many mutexes are locked in that case,
because one can't actually read/write anything into zero-sized memory.
If you think it is important, I could add else if (nlocks == 0) nlocks = 1;
in both spots.
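A minimal sketch of the locking order described above, assuming the
usual libatomic layout (NLOCKS mutexes, each watching one WATCH_SIZE
sized, cache-line aligned region); names loosely follow
config/posix/lock.c:

#include <stddef.h>
#include <stdint.h>
#include <pthread.h>

#define WATCH_SIZE 64
#define NLOCKS 64

static struct { pthread_mutex_t mutex; } locks[NLOCKS];

static void
lock_n_sketch (void *ptr, size_t n)
{
  uintptr_t h = ((uintptr_t) ptr / WATCH_SIZE) % NLOCKS;
  /* Count occupied cache lines, not just bytes: include the offset
     within the first line, so a 2-byte object at offset 63 needs 2.  */
  size_t nlocks = (n + (uintptr_t) ptr % WATCH_SIZE + WATCH_SIZE - 1) / WATCH_SIZE;

  if (nlocks >= NLOCKS)
    {
      /* Everything is covered: lock all mutexes, in increasing order.  */
      for (size_t i = 0; i < NLOCKS; ++i)
        pthread_mutex_lock (&locks[i].mutex);
      return;
    }
  if (h + nlocks > NLOCKS)
    /* Wrap-around: first lock the ones at the start of the array...  */
    for (size_t i = 0; i < h + nlocks - NLOCKS; ++i)
      pthread_mutex_lock (&locks[i].mutex);
  /* ...then the ones from h to the end, so indices only ever increase.  */
  for (size_t i = h; i < h + nlocks && i < NLOCKS; ++i)
    pthread_mutex_lock (&locks[i].mutex);
}

libat_unlock_n then releases the same set of mutexes.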
2025-04-16 Jakub Jelinek <jakub@redhat.com>
PR libgcc/101075
PR libgcc/119796
* config/posix/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.
|
|
Simplify and clean up the ifunc selection logic. Since LRCPC3 does
not imply LSE2, has_rcpc3() should also check that LSE2 is enabled.
Passes regress and bootstrap, OK for commit?
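A hedged sketch of the added early check, assuming the Linux HWCAP
names (HWCAP_USCAT for LSE2, HWCAP2_LRCPC3) and the two-argument
resolver signature used elsewhere in this file:

static inline bool
has_rcpc3 (unsigned long hwcap, const __ifunc_arg_t *features)
{
  /* LDIAPP/STILP are only single-copy atomic on cores that also
     implement LSE2, so bail out early if LSE2 is absent.  */
  if (!(hwcap & HWCAP_USCAT))
    return false;
  return (hwcap & _IFUNC_ARG_HWCAP)
         && (features->_hwcap2 & HWCAP2_LRCPC3);
}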
libatomic:
* config/linux/aarch64/host-config.h (has_lse2): Cleanup.
(has_lse128): Likewise.
(has_rcpc3): Add early check for LSE2.
|
|
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (FEATURE_1_GCS): Define.
(GCS_FLAG): Define if GCS is enabled.
(GNU_PROPERTY): Add GCS_FLAG.
|
|
PR target/104688
libatomic/ChangeLog:
* config/x86/init.c (__libat_feat1_init): Don't clear
bit_AVX on ZHAOXIN CPUs.
|
|
Check the result of __get_cpuid and process FEAT1_REGISTER only when
__get_cpuid returns success. Use __cpuid instead of nested __get_cpuid.
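A sketch of the pattern, using the <cpuid.h> interfaces named above;
the FEAT1_REGISTER handling is abbreviated:

#include <cpuid.h>

static unsigned int
feat1_sketch (void)
{
  unsigned int eax, ebx, ecx = 0, edx = 0;
  unsigned int feat1 = 0;

  /* Process FEAT1_REGISTER (here: ecx of leaf 1) only when __get_cpuid
     reports success.  */
  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    {
      feat1 = ecx;
      /* Any further leaf can use __cpuid directly - the max-level
         check has already been done - instead of a nested __get_cpuid.  */
      __cpuid (0, eax, ebx, ecx, edx);
      /* ... e.g. inspect the vendor string in ebx/edx/ecx ...  */
    }
  return feat1;
}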
libatomic/ChangeLog:
* config/x86/init.c (__libat_feat1_init): Check the result of
__get_cpuid and process FEAT1_REGISTER only when __get_cpuid
returns success. Use __cpuid instead of nested __get_cpuid.
|
|
The introduction of the optional RCPC3 architectural extension for
Armv8.2-A upwards provides additional support for the release
consistency model, introducing the Load-Acquire RCpc Pair Ordered, and
Store-Release Pair Ordered operations in the form of LDIAPP and STILP.
These operations are single-copy atomic on cores which also implement
LSE2 and, as such, support for these operations is added to Libatomic
and employed accordingly when the LSE2 and RCPC3 features are detected
in a given core at runtime.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (libat_load_16): Add LRCPC3
variant.
(libat_store_16): Likewise.
* config/linux/aarch64/host-config.h (HWCAP2_LRCPC3): New.
(LSE2_LRCPC3_ATOP): Previously LSE2_ATOP. New ifuncs guarded
under it.
(has_rcpc3): New.
|
|
At present, `atomic_16.S' groups different implementations of the
same functions together in the file. Therefore, as an example,
the LSE2 implementation of `load_16' follows on immediately from its
core implementation, as does the `store_16' LSE2 implementation.
Such extension-dependent implementations rely on ifunc support, and
are therefore guarded by the relevant preprocessor macro, i.e.
`#if HAVE_IFUNC'.
Having to apply these guards on a per-function basis adds unnecessary
clutter to the file and makes its maintenance more error-prone.
We therefore reorganize the layout of the file in such a way that all
core implementations needing no `#ifdef's are placed first, followed
by all ifunc-dependent implementations, which can all be guarded by a
single `#if HAVE_IFUNC', greatly reducing the overall number of
required `#ifdef' macros.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S: Reorganize functions in
file.
(HAVE_FEAT_LSE2): Delete.
|
|
By querying previously-defined file-identifier macros, `host-config.h'
is able to get information about its environment and, based on this
information, select more appropriate function-specific ifunc
selectors. This reduces the number of unnecessary feature tests that
need to be carried out in order to find the best atomic implementation
for a function at run-time.
An immediate benefit of this is that we can further fine-tune the
architectural requirements for each atomic function without risk of
incurring the maintenance and runtime-performance penalties of having
to maintain an ifunc selector with a huge number of alternatives, most
of which are irrelevant for any particular function. Consequently,
for AArch64 targets, we relax the architectural requirements of
`compare_exchange_16', which now requires only LSE as opposed to the
newer LSE2.
The new flexibility provided by this approach also means that certain
functions can now be called directly, doing away with ifunc selectors
altogether when only a single implementation is available on a given
target. As per the macro expansion framework laid out in
`libatomic_i.h', such functions should have their names prefixed with
`__atomic_' as opposed to `libat_'. This is the same prefix applied
to function names when Libatomic is configured with
`--disable-gnu-indirect-function'.
To achieve this, these functions unconditionally apply the aliasing
rule that at present is conditionally applied only when libatomic is
built without ifunc support, which ensures that the default
`libat_##NAME' is accessible via the equivalent `__atomic_##NAME' too.
This is ensured by using the new `ENTRY_ALIASED' macro.
Finally, this means we are able to do away with a whole set of
function aliases that were needed until now, thus considerably
cleaning up the implementation.
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S: Remove unnecessary
aliasing.
(LSE): New.
(ENTRY_ALIASED): Likewise.
* config/linux/aarch64/host-config.h (LSE_ATOP): New.
(LSE2_ATOP): Likewise.
(LSE128_ATOP): Likewise.
(IFUNC_COND_1): Make its definition conditional on above 3
macros.
(IFUNC_NCOND): Likewise.
|
|
Given the lack of support for the LSE128 instructions in all but the
most up-to-date version of Binutils (2.42), the build-time test for
assembler support for these instructions often leads to Libatomic
being built without support for LSE128-dependent atomic function
implementations. This ultimately leads to different people having
different versions of Libatomic on their machines, depending on
which assembler was available at compilation time.
Furthermore, the conditional inclusion of these atomic function
implementations predicated on assembler support leads to a series of
`#if HAVE_FEAT_LSE128' guards scattered throughout the codebase and
the need for a series of aliases when the feature flag evaluates
to false. The preprocessor macro guards, together with the
conditional aliasing, lead to code that is cumbersome to understand
and maintain.
Both of the issues highlighted above will only get worse with the
coming support for LRCPC3 atomics which under the current scheme will
also require build-time checks.
Consequently, a better option for both consistency across builds and
code cleanliness is to have recourse to the `.inst' directive. By
replacing all novel assembly instructions with their hexadecimal
representation within `.inst's, we ensure both that the Libatomic
code is considerably cleaner and that all machines build the same
binary, irrespective of the binutils version available at compile
time.
This patch therefore removes all configure checks for LSE128 support
in the assembler, along with all the guards and aliases that were
associated with `HAVE_FEAT_LSE128'.
libatomic/ChangeLog:
* acinclude.m4 (LIBAT_TEST_FEAT_AARCH64_LSE128): Delete.
* auto-config.h.in (HAVE_FEAT_LSE128): Likewise.
* config/linux/aarch64/atomic_16.S: Replace all LSE128
instructions with equivalent `.inst' directives.
(HAVE_FEAT_LSE128): Remove all references.
* configure: Regenerate.
* configure.ac: Remove call to LIBAT_TEST_FEAT_AARCH64_LSE128.
|
|
Clean up the macros to add the libat_ prefixes in atomic_16.S. Emit the
alias to __atomic_<op> when ifuncs are not enabled in the ENTRY macro.
libatomic:
* config/linux/aarch64/atomic_16.S: Add __libat_ prefix in the
LSE2/LSE128/CORE macros, remove elsewhere. Add ATOMIC macro.
|
|
Fix libatomic build to support --disable-gnu-indirect-function on AArch64.
Always build atomic_16.S, add aliases to the __atomic_ functions if !HAVE_IFUNC.
Include auto-config.h in atomic_16.S to avoid having to pass defines via
makefiles. Fix build if HWCAP_ATOMICS/CPUID are not defined.
libatomic:
PR target/113986
* Makefile.in: Regenerated.
* Makefile.am: Make atomic_16.S not depend on HAVE_IFUNC.
Remove predefine of HAVE_FEAT_LSE128.
* acinclude.m4: Remove ARCH_AARCH64_HAVE_LSE128.
* configure: Regenerated.
* config/linux/aarch64/atomic_16.S: Add __atomic_ alias if !HAVE_IFUNC.
* config/linux/aarch64/host-config.h: Correctly handle !HAVE_IFUNC.
Add defines for HWCAP_ATOMICS and HWCAP_CPUID.
|
|
The exception defines in <fenv.h> do not match the exception bits
in the FPU status register on hppa-linux and hppa64-hpux11.11. On
linux, they match the trap enable bits. On 64-bit hpux, they match
the exception bits for IA64. The IA64 bits are in a different
order and location than the HPPA ones. HP uses table lookups to
reorder the bits in code that tests and raises exceptions.
All the architectures that I looked at just pass the FPU status
register to __atomic_feraiseexcept(). The simplest approach for
hppa is to define FE_INEXACT, etc., to match the status register
and not include <fenv.h>.
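A minimal sketch of that approach; the bit values here are
illustrative placeholders, not the real PA-RISC status-register
positions:

/* FE_* defined to match the hppa status-register exception bits
   directly, so the register can be passed straight through.  */
#define FE_INEXACT    0x01  /* illustrative values */
#define FE_UNDERFLOW  0x02
#define FE_OVERFLOW   0x04
#define FE_DIVBYZERO  0x08
#define FE_INVALID    0x10
#define FE_ALL_EXCEPT (FE_INEXACT | FE_UNDERFLOW | FE_OVERFLOW \
                       | FE_DIVBYZERO | FE_INVALID)

extern void __atomic_feraiseexcept (int);

static void
raise_from_status_sketch (unsigned int fpsr)
{
  /* No table-based reordering needed: the bits already line up.  */
  __atomic_feraiseexcept (fpsr & FE_ALL_EXCEPT);
}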
2024-02-03 John David Anglin <danglin@gcc.gnu.org>
libatomic/ChangeLog:
PR target/59778
* configure.tgt (hppa*): Set ARCH.
* config/pa/fenv.c: New file.
|
|
At present, evaluation of both `has_lse2(hwcap)' and
`has_lse128(hwcap)' may require issuing an `mrs' instruction to query
a system register. This instruction, when issued from user-space,
results in a trap by the kernel, which then returns the value read
from the system register. Given the undesirable computational
expense of the context switch, it is important to implement
mechanisms to forgo the operation wherever possible.
In light of this, given how other architectural requirements serving
as prerequisites have long been assigned HWCAP bits by the kernel, we
can inexpensively query for their availability before attempting to
read any system registers. Where one of these early tests fails, we
can assert that the main feature of interest (be it LSE2 or LSE128)
cannot be present, allowing us to return from the function early and
skip the unnecessary, expensive kernel-mediated access to system
registers.
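As a sketch of the idea (HWCAP names as in the Linux kernel; the
ID-register field extraction is illustrative):

#include <stdbool.h>

static bool
has_lse2_sketch (unsigned long hwcap)
{
  /* LSE2 requires LSE, so a missing HWCAP_ATOMICS bit lets us return
     early without touching any system register.  */
  if (!(hwcap & HWCAP_ATOMICS))
    return false;
  if (hwcap & HWCAP_USCAT)
    return true;
  /* Only query ID_AA64MMFR2_EL1 (a kernel-emulated mrs) when the
     cheap tests were inconclusive and CPUID emulation is available.  */
  if (!(hwcap & HWCAP_CPUID))
    return false;
  unsigned long mmfr2;
  asm volatile ("mrs %0, ID_AA64MMFR2_EL1" : "=r" (mmfr2));
  return ((mmfr2 >> 32) & 0xf) != 0;  /* AT field (illustrative).  */
}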
libatomic/ChangeLog:
* config/linux/aarch64/host-config.h (has_lse2): Add test for LSE.
(has_lse128): Add test for LSE2.
|
|
The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:
* LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
value held in a pair of registers, with original data loaded into
the same 2 registers.
* LDSETP - Atomic OR (bitset) of a location with 128-bit value held
in a pair of registers, with original data loaded into the same 2
registers.
* SWPP - Atomic swap of one 128-bit value with 128-bit value held
in a pair of registers.
It is worth noting that, in keeping with existing 128-bit atomic
operations in `atomic_16.S', we have chosen to merge certain
less-restrictive orderings into more restrictive ones. This is done
to minimize the number of branches in the atomic functions, reducing
both the likelihood of branch mispredictions and, by keeping the code
small, the need for extra fetch cycles.
Past benchmarking has revealed that acquire is typically slightly
faster than release (5-10%), such that for the most frequently used
atomics (CAS and SWP) it makes sense to add support for acquire, as
well as release.
Likewise, it was identified that combining acquire and release typically
results in little to no penalty, such that it is of negligible benefit
to distinguish between release and acquire-release, making the
combined release/acq_rel/seq_cst a worthwhile design choice.
This patch adds the logic required to make use of these when the
architectural feature is present and a suitable assembler available.
In order to do this, the following changes are made:
1. Add a configure-time check to check for LSE128 support in the
assembler.
2. Edit host-config.h so that when N == 16, nifunc = 2.
3. Where available due to LSE128, implement the second ifunc, making
use of the novel instructions.
4. For atomic functions unable to make use of these new
instructions, define a new alias which causes the _i1 function
variant to point ahead to the corresponding _i2 implementation.
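Sketched in host-config.h terms (the shapes follow the ChangeLog
below; the exact conditions and their ordering are illustrative):

/* First (preferred) alternative: LSE128; second: LSE2.  */
#if defined (HAVE_FEAT_LSE128)
# define IFUNC_COND_1	(has_lse128 (hwcap))
#endif
#define IFUNC_COND_2	(has_lse2 (hwcap))
/* Two ifunc alternatives, for 16-byte operands only.  */
#define IFUNC_NCOND(N)	(2 * ((N) == 16))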
libatomic/ChangeLog:
* Makefile.am (AM_CPPFLAGS): add conditional setting of
-DHAVE_FEAT_LSE128.
* acinclude.m4 (LIBAT_TEST_FEAT_AARCH64_LSE128): New.
* config/linux/aarch64/atomic_16.S (LSE128): New macro
definition.
(libat_exchange_16): New LSE128 variant.
(libat_fetch_or_16): Likewise.
(libat_or_fetch_16): Likewise.
(libat_fetch_and_16): Likewise.
(libat_and_fetch_16): Likewise.
* config/linux/aarch64/host-config.h (IFUNC_COND_2): New.
(IFUNC_NCOND): Add operand size checking.
(has_lse2): Renamed from `ifunc1`.
(has_lse128): New.
(HWCAP2_LSE128): Likewise.
* configure.ac: Add call to
LIBAT_TEST_FEAT_AARCH64_LSE128.
* configure (ac_subst_vars): Regenerated via autoreconf.
* Makefile.in: Likewise.
* auto-config.h.in: Likewise.
|
|
With support for new atomic features in Armv9.4-a being indicated by
HWCAP2 bits, Libatomic's ifunc resolver must now query its second
argument, of type __ifunc_arg_t*.
We therefore make this argument known to libatomic, allowing us to
query hwcap2 bits in the following manner:
bool
resolver (unsigned long hwcap, const __ifunc_arg_t *features)
{
  return (features->_hwcap2 & HWCAP2_<FEAT_NAME>);
}
libatomic/ChangeLog:
* config/linux/aarch64/host-config.h (__ifunc_arg_t):
Conditionally-defined if `sys/ifunc.h' not found.
(_IFUNC_ARG_HWCAP): Likewise.
(IFUNC_COND_1): Pass __ifunc_arg_t argument to ifunc.
(ifunc1): Modify function signature to accept __ifunc_arg_t
argument.
* configure.tgt: Add second `const __ifunc_arg_t *features'
argument to IFUNC_RESOLVER_ARGS.
|
|
The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i<n>' suffixes to functions
cumbersome to work with. It is awkward to remember which ifunc maps
onto which arch feature and makes the code harder to maintain when new
ifuncs are added and their suffixes possibly altered.
This patch uses pre-processor `#define' statements to map each suffix to
a descriptive feature name macro, for example:
#define LSE(NAME) NAME##_i1
Where we wish to generate ifunc names with the pre-processor's token
concatenation feature, we add a level of indirection to previous macro
calls. If before we would have had `MACRO(<name>_i<n>)', we now have
`MACRO_FEAT(name, feature)'. Where we wish to refer to base
functionality (i.e., functions where ifunc suffixes are absent), the
original `MACRO(<name>)' may be used to bypass suffixing.
Consequently, for base functionality, where the ifunc suffix is
absent, the macro interface remains the same. For example, the entry
and endpoints of `libat_store_16' remain defined by:
ENTRY (libat_store_16)
and
END (libat_store_16)
For the LSE2 implementation of the same 16-byte atomic store, we now
have:
ENTRY_FEAT (libat_store_16, LSE2)
and
END_FEAT (libat_store_16, LSE2)
For the aliasing of function names, we define the following new
implementation of the ALIAS macro:
ALIAS (FN_BASE_NAME, FROM_SUFFIX, TO_SUFFIX)
Defining the `CORE(NAME)' macro to be the identity operator, it
returns the base function name unaltered and allows us to alias
target-specific ifuncs to the corresponding base implementation.
For example, we'd alias the LSE2 `libat_exchange_16' to its base
implementation with:
ALIAS (libat_exchange_16, LSE2, CORE)
libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (CORE): New macro.
(LSE2): Likewise.
(ENTRY_FEAT): Likewise.
(ENTRY_FEAT1): Likewise.
(END_FEAT): Likewise.
(END_FEAT1): Likewise.
(ALIAS): Modify macro to take in `arch' arguments.
(ALIAS1): New.
|
|
|
Enable lock-free 128-bit atomics on AArch64. This is backwards compatible with
existing binaries (as for these GCC always calls into libatomic, so all 128-bit
atomic uses in a process are switched), gives better performance than locking
atomics and is what most users expect.
128-bit atomic loads use a load/store exclusive loop if LSE2 is not supported.
This results in an implicit store which is invisible to software as long as the
given address is writeable (which will be true when using atomics in real code).
This doesn't yet change __atomic_is_lock_free even though all atomics
are finally lock-free on AArch64.
libatomic:
* config/linux/aarch64/atomic_16.S: Implement lock-free ARMv8.0 atomics.
(libat_exchange_16): Merge RELEASE and ACQ_REL/SEQ_CST cases.
* config/linux/aarch64/host-config.h: Use atomic_16.S for baseline v8.0.
|
|
Add support for ifunc selection based on CPUID register. Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.
Reviewed-by: Kyrylo.Tkachov@arm.com
libatomic:
* config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
selection.
|
|
The LSE2 ifunc for 16-byte atomic load requires a barrier before the LDP -
without it, it effectively has Load-AcquirePC semantics similar to LDAPR,
which is less restrictive than what __ATOMIC_SEQ_CST requires. This patch
fixes this and adds comments to make it easier to see which sequence is
used for each case. Use a load/store exclusive loop for stores to
simplify testing that the memory ordering is correct (it is slightly
faster too).
libatomic/
PR libgcc/108891
* config/linux/aarch64/atomic_16.S: Fix libat_load_16_i1.
Add comments describing the memory order.
|
|
This is a follow-up to commit a4c6bd0821099f6b8c0f64a96ffd9d01a025c413
introducing a runtime check for alignment for 16 byte atomic
compare-exchange, load, and store.
libatomic/ChangeLog:
* config/s390/cas_n.c: New file.
* config/s390/load_n.c: New file.
* config/s390/store_n.c: New file.
|
|
|
Recently, mingw-w64 has received an updated <msxml.h> from Wine, which
is included indirectly by <windows.h> if `WIN32_LEAN_AND_MEAN` is not
defined. The `IXMLDOMDocument` class has a member function named
`abort()`, which gets affected by our `abort()` macro in "system.h".
`WIN32_LEAN_AND_MEAN` should, nevertheless, always be defined. This
excludes 'APIs such as Cryptography, DDE, RPC, Shell, and Windows
Sockets' [1], and speeds up compilation of these files a bit.
[1] https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers
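The pattern applied throughout is simply (the define must precede the
include):

#define WIN32_LEAN_AND_MEAN
#include <windows.h>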
gcc/
PR middle-end/108300
* config/xtensa/xtensa-dynconfig.c: Define `WIN32_LEAN_AND_MEAN`
before <windows.h>.
* diagnostic-color.cc: Likewise.
* plugin.cc: Likewise.
* prefix.cc: Likewise.
gcc/ada/
PR middle-end/108300
* adaint.c: Define `WIN32_LEAN_AND_MEAN` before `#include
<windows.h>`.
* cio.c: Likewise.
* ctrl_c.c: Likewise.
* expect.c: Likewise.
* gsocket.h: Likewise.
* mingw32.h: Likewise.
* mkdir.c: Likewise.
* rtfinal.c: Likewise.
* rtinit.c: Likewise.
* seh_init.c: Likewise.
* sysdep.c: Likewise.
* terminals.c: Likewise.
* tracebak.c: Likewise.
gcc/jit/
PR middle-end/108300
* jit-w32.h: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libatomic/
PR middle-end/108300
* config/mingw/lock.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libffi/
PR middle-end/108300
* src/aarch64/ffi.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgcc/
PR middle-end/108300
* config/i386/enable-execute-stack-mingw32.c: Define
`WIN32_LEAN_AND_MEAN` before <windows.h>.
* libgcc2.c: Likewise.
* unwind-generic.h: Likewise.
libgfortran/
PR middle-end/108300
* intrinsics/sleep.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgomp/
PR middle-end/108300
* config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libiberty/
PR middle-end/108300
* make-temp-file.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* pex-win32.c: Likewise.
libssp/
PR middle-end/108300
* ssp.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libstdc++-v3/
PR middle-end/108300
* src/c++11/system_error.cc: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
* src/c++11/thread.cc: Likewise.
* src/c++17/fs_ops.cc: Likewise.
* src/filesystem/ops.cc: Likewise.
libvtv/
PR middle-end/108300
* vtv_malloc.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* vtv_rts.cc: Likewise.
* vtv_utils.cc: Likewise.
|
|
Add support for AArch64 LSE and LSE2 to libatomic. Disable outline atomics,
and use LSE ifuncs for 1-8 byte atomics and LSE2 ifuncs for 16-byte atomics.
On Neoverse V1, 16-byte atomics are ~4x faster due to avoiding locks.
Note this is safe since we swap all 16-byte atomics using the same ifunc,
so they either use locks or LSE2 atomics, but never a mix. This also improves
ABI compatibility with LLVM: its inlined 16-byte atomics are compatible with
the new libatomic if LSE2 is supported.
libatomic/
* Makefile.in: Regenerated with automake 1.15.1.
* Makefile.am: Add atomic_16.S for AArch64.
* configure.tgt: Disable outline atomics in AArch64 build.
* config/linux/aarch64/atomic_16.S: New file - implementation of
ifuncs for 16-byte atomics.
* config/linux/aarch64/host-config.h: Enable ifuncs, use LSE
(HWCAP_ATOMICS) for 1-8-byte atomics and LSE2 (HWCAP_USCAT) for
16-byte atomics.
|
|
We got a response from AMD in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688#c10
so the following patch starts treating AMD with AVX and CMPXCHG16B
ISAs like Intel by using vmovdqa for atomic load/store in libatomic.
We still don't have confirmation from Zhaoxin and VIA (anything else
with CPUs featuring AVX and CX16?).
2022-11-15 Jakub Jelinek <jakub@redhat.com>
PR target/104688
* config/x86/init.c (__libat_feat1_init): Don't clear
bit_AVX on AMD CPUs.
|
|
Similar to AArch64, the Arm implementation of 128-bit atomics is
broken.
For 128-bit atomics we rely on pthread barriers to correctly guard the
address in the pointer to get the right memory ordering. However, for
128-bit atomics the address under the lock is different from the
original pointer.
This means that one of the values under the atomic operation is not
protected properly, and so we fail when the user has requested
sequential consistency, as there's no barrier to enforce this
requirement.
As such, users have resorted to adding an
#ifdef GCC
<emit barrier>
#endif
around the use of these atomics.
This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST
was requested. I have hand-verified that the barriers are inserted
for atomic seq cst.
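A hedged sketch of the shape of the fix, using the hook names from the
ChangeLog below (the real definitions live in config/arm/host-config.h):

static inline void
pre_seq_barrier (int model)
{
  if (model == __ATOMIC_SEQ_CST)
    /* Full fence only for sequentially consistent requests.  */
    __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

post_seq_barrier follows the same shape, so non-seq_cst callers pay
nothing extra.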
libatomic/ChangeLog:
PR target/102218
* config/arm/host-config.h (pre_seq_barrier, post_seq_barrier,
pre_post_seq_barrier): Require barrier on __ATOMIC_SEQ_CST.
|
|
The AArch64 implementation of 128-bit atomics is broken.
For 128-bit atomics we rely on pthread barriers to correctly guard the
address in the pointer to get the right memory ordering. However, for
128-bit atomics the address under the lock is different from the
original pointer.
This means that one of the values under the atomic operation is not
protected properly, and so we fail when the user has requested
sequential consistency, as there's no barrier to enforce this
requirement.
As such, users have resorted to adding an
#ifdef GCC
<emit barrier>
#endif
around the use of these atomics.
This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST
was requested. To remedy the performance hit, I think we should revisit
using a similar approach to outline atomics for the 128-bit atomics.
Note that I believe I need the empty file due to the include_next chain,
but I am not entirely sure. I have hand-verified that the barriers are
inserted for atomic seq cst.
libatomic/ChangeLog:
PR target/102218
* config/aarch64/aarch64-config.h: New file.
* config/aarch64/host-config.h: New file.
|
|
As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"
The following patch deals with it just on the libatomic library side so
far; currently (since ~ 2017) we emit all the __atomic_* 16-byte
builtins as library calls, and this is something that we can hopefully
backport.
The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks the AVX and CMPXCHG16B bits
and, on non-Intel, clears the AVX bit during detection for now (if AMD
comes with the same guarantee, we could revert the config/x86/init.c
hunk). It implements the 16-byte atomic load as vmovdqa and the 16-byte
atomic store as vmovdqa followed by mfence.
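A hedged sketch of the new load/store pair under those assumptions;
intrinsics stand in for the raw vmovdqa (compiled with -mavx, the
aligned load/store map to the VEX-encoded moves):

#include <emmintrin.h>

__int128
atomic_load_16_sketch (const __int128 *ptr)
{
  /* Aligned 16-byte load; atomic per the quoted SDM guarantee when
     AVX is present and ptr is 16-byte aligned.  */
  __m128i v = _mm_load_si128 ((const __m128i *) ptr);
  __int128 r;
  __builtin_memcpy (&r, &v, sizeof (r));
  return r;
}

void
atomic_store_16_sketch (__int128 *ptr, __int128 val)
{
  __m128i v;
  __builtin_memcpy (&v, &val, sizeof (v));
  _mm_store_si128 ((__m128i *) ptr, v);
  _mm_mfence ();  /* make the store globally visible for seq_cst */
}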
2022-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.
|
|
|
Resolves:
PR bootstrap/101379 - libatomic arm build failure after r12-2132 due to -Warray-bounds on a constant address
libatomic/ChangeLog:
PR bootstrap/101379
* config/linux/arm/host-config.h (__kernel_helper_version): New
function. Adjust shadow macro.
|
|
|
AIX caches shared objects in archives with read-other permission.
libgomp and libatomic might be in use during the build or testing, which
may cause archiver operations on them to fail. This patch adjusts the
Makefile fragments to delete the library archives before creating fresh
archives containing both the 32 bit and 64 bit shared objects.
libatomic/ChangeLog:
2020-10-11 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Delete and recreate libatomic before creating
FAT library.
libgomp/ChangeLog:
2020-10-11 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Delete and recreate libgomp before creating
FAT library.
|
|
AIX FAT libraries should be built with the version of AR chosen by configure.
The GNU Make $(AR) variable includes the AIX -X32_64 option needed
by the default Makefile rules to accept both 32 bit and 64 bit object files.
The -X32_64 option conflicts with ar archiving objects of the same name
used to build FAT libraries.
This patch changes the Makefile fragments for AIX FAT libraries to use $(AR),
but strips the -X32_64 option from the Make variable.
libgcc/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/rs6000/t-slibgcc-aix: Use $(AR) without -X32_64.
libatomic/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
libgomp/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
libstdc++-v3/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/os/aix/t-aix: Use $(AR) without -X32_64.
libgfortran/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
|
|
Add nvptx support to libatomic.
Given that atomic_test_and_set is not implemented for nvptx (PR96964),
the compiler translates __atomic_test_and_set by falling back onto the
"Failing all else, assume a single threaded environment and simply
perform the operation" case in expand_atomic_test_and_set, so it
doesn't map onto an actual atomic operation.
Still, that counts as supported for the configure test of libatomic, so we
end up with HAVE_ATOMIC_TAS_1/2/4/8/16 == 1, and the corresponding
__atomic_test_and_set_1/2/4/8/16 in libatomic all using that non-atomic
implementation.
Fix this by adding an atomic_test_and_set expansion for nvptx that uses
libatomic's __atomic_test_and_set_1.
This again makes the configure tests for HAVE_ATOMIC_TAS_1/2/4/8/16 fail, so
instead we use this case in tas_n.c:
...
/* If this type is smaller than word-sized, fall back to a word-sized
compare-and-swap loop. */
bool
SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
...
which for __atomic_test_and_set_8 uses INVERT_MASK_8.
Add INVERT_MASK_8 in libatomic_i.h, as well as MASK_8.
Tested libatomic testsuite on nvptx.
gcc/ChangeLog:
PR target/96964
* config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
expansion.
libatomic/ChangeLog:
PR target/96898
* configure.tgt: Add nvptx.
* libatomic_i.h (MASK_8, INVERT_MASK_8): New macro definition.
* config/nvptx/host-config.h: New file.
* config/nvptx/lock.c: New file.
|
|
The FAT libraries config fragments need to know which library is native
and which is a multilib to choose the correct multilib from which to
append the additional object file or shared object file. Testing the
top-level archive is fragile because it will fail if rebuilding. This
patch tests the compiler preprocessing macros for the 64 bit AIX specific
__64BIT__ to determine the native mode of the compiler in MULTILIBTOP.
2020-07-14 David Edelsohn <dje.gcc@gmail.com>
libatomic/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgcc/ChangeLog
* config/rs6000/t-slibgcc-aix: Set BITS from compiler cpp macro.
libgfortran/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgomp/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libstdc++-v3/ChangeLog
* config/os/aix/t-aix: Set BITS from compiler cpp macro.
|
|
This patch adds the ability to configure GCC on AIX to build as a
64 bit application and to build target libraries "FAT" libraries in both
32 bit and 64 bit mode.
The patch adds makefile fragment hooks to target libraries that allows
them to include target-specific rules. The target specific rules for
AIX place both 32 bit and 64 bit objects and shared objects
in archives at the top-level, not multilib subdirectories. The
multilibs are built in subdirectories, but must be combined during the
last parts of the target library build process. Because of the way
that GCC bootstrap works, the libraries must be combined during the
multiple stages of GCC bootstrap, not solely when installed in the
final destination, so the libraries are correct at the end of
each target library build stage, not solely an install recipe.
gcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.gcc: Use t-aix64, biarch64 and default64 for cpu_is_64bit.
* config/rs6000/aix72.h (ASM_SPEC): Remove aix64 option.
(ASM_SPEC32): New.
(ASM_SPEC64): New.
(ASM_CPU_SPEC): Remove vsx and altivec options.
(CPP_SPEC_COMMON): Rename from CPP_SPEC.
(CPP_SPEC32): New.
(CPP_SPEC64): New.
(CPLUSPLUS_CPP_SPEC): Rename to CPLUSPLUS_CPP_SPEC_COMMON..
(TARGET_DEFAULT): Only define if not BIARCH.
(LIB_SPEC_COMMON): Rename from LIB_SPEC.
(LIB_SPEC32): New.
(LIB_SPEC64): New.
(LINK_SPEC_COMMON): Rename from LINK_SPEC.
(LINK_SPEC32): New.
(LINK_SPEC64): New.
(STARTFILE_SPEC): Add 64 bit version of crtcxa and crtdbase.
(ASM_SPEC): Define 32 and 64 bit alternatives using DEFAULT_ARCH64_P.
(CPP_SPEC): Same.
(CPLUSPLUS_CPP_SPEC): Same.
(LIB_SPEC): Same.
(LINK_SPEC): Same.
(SUBTARGET_EXTRA_SPECS): Add new 32/64 specs.
* config/rs6000/defaultaix64.h: New file.
* config/rs6000/t-aix64: New file.
libgcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.host (extra_parts): Add crtcxa_64 and crtdbase_64.
* config/rs6000/t-aix-cxa: Explicitly compile 32 bit with -maix32
and 64 bit with -maix64.
* config/rs6000/t-slibgcc-aix: Remove extra @multilib_dir@ level.
Build and install AIX-style FAT libraries.
libgomp/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libstdc++-v3/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host (aix*): Define tmake_file.
* config/os/aix/t-aix: New file.
libatomic/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libgfortran/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host: Add system configury stanza. Define tmake_file.
* config/t-aix: New file.
|
|
The Windows ABI (MinGW) differs from the Linux ABI when bitfields are
involved. The following patch adds __attribute__ ((gcc_struct)) to
struct fenv in order to match the layout of the x87 state image in
memory.
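A sketch of the annotated struct (field list follows the x87 state
image as in glibc's fenv layout; the bitfield members are what make
the ms_struct and gcc_struct layouts diverge):

struct fenv
{
  unsigned short int __control_word;
  unsigned short int __unused1;
  unsigned short int __status_word;
  unsigned short int __unused2;
  unsigned short int __tags;
  unsigned short int __unused3;
  unsigned int __eip;
  unsigned short int __cs_selector;
  unsigned int __opcode : 11;  /* bitfields: layout differs under MS ABI */
  unsigned int __unused4 : 5;
  unsigned int __data_offset;
  unsigned short int __data_selector;
  unsigned short int __unused5;
} __attribute__ ((gcc_struct));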
2020-06-01 Uroš Bizjak <ubizjak@gmail.com>
libatomic/ChangeLog:
* config/x86/fenv.c (struct fenv): Add __attribute__ ((gcc_struct)).
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (struct fenv):
Add __attribute__ ((gcc_struct)).
libgfortran/ChangeLog:
PR libfortran/95418
* config/fpu-387.h (struct fenv): Add __attribute__ ((gcc_struct)).
|
|
Introduce math_force_eval_div to use generic division to generate
INEXACT as well as INVALID and DIVZERO exceptions.
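The gist, as a sketch (a volatile store stands in here for the
asm-based force-evaluation in the actual patch):

#define __math_force_eval_div(x, y) \
  do { volatile double __r = (double) (x) / (y); (void) __r; } while (0)

static void
raise_exceptions_sketch (void)
{
  volatile float zero = 0.0f;
  __math_force_eval_div (zero, zero);  /* 0/0 -> INVALID  */
  __math_force_eval_div (1.0f, zero);  /* 1/0 -> DIVZERO  */
  __math_force_eval_div (1.0f, 3.0f);  /* 1/3 -> INEXACT  */
}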
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__sfp_handle_exceptions): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__atomic_feraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(local_feraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
(struct fenv): Define named struct instead of typedef.
|
|
Introduce math_force_eval to evaluate generic division to generate
INVALID and DIVZERO exceptions.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): New define.
(__sfp_handle_exceptions): Use __math_force_eval to evaluate
generic division to generate INVALID and DIVZERO exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): New define.
(__atomic_feraiseexcept): Use __math_force_eval to evaluate
generic division to generate INVALID and DIVZERO exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): New define.
(local_feraiseexcept): Use __math_force_eval to evaluate
generic division to generate INVALID and DIVZERO exceptions.
|
|
According to the "Intel 64 and IA32 Arch SDM", Vol. 3:
"Because SIMD floating-point exceptions are precise and occur immediately,
the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT
instruction, or another SSE/SSE2/SSE3 instruction will catch a pending
unmasked SIMD floating-point exception."
Remove unneeded assignments to volatile memory.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__sfp_handle_exceptions) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libatomic/ChangeLog:
* config/x86/fenv.c (__atomic_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libgfortran/ChangeLog:
* config/fpu-387.h (local_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
|
|
From-SVN: r279813
|
|
From-SVN: r267494
|
|
2018-06-21 Christophe Lyon <christophe.lyon@linaro.org>
libatomic/
* config/arm/arm-config.h (__ARM_ARCH__): Remove definitions, use
__ARM_ARCH instead. Use __ARM_FEATURE_LDREX to define HAVE_STREX
and HAVE_STREXBHD.
libgcc/
* config/arm/lib1funcs.S (__ARM_ARCH__): Remove definitions, use
__ARM_ARCH and __ARM_FEATURE_CLZ instead.
(HAVE_ARM_CLZ): Remove definition, use __ARM_FEATURE_CLZ instead.
* config/arm/ieee754-df.S: Use __ARM_FEATURE_CLZ instead of
__ARM_ARCH__.
* config/arm/ieee754-sf.S: Likewise.
* config/arm/libunwind.S: Use __ARM_ARCH instead of __ARM_ARCH__.
From-SVN: r261841
|
|
PR libgcc/60790
x86: Do not assume ELF constructors run before IFUNC resolvers.
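In sketch form (names follow the ChangeLog entries below; the exact
bit tested is illustrative):

#include <cpuid.h>

extern unsigned int __libat_feat1;
extern unsigned int __libat_feat1_init (void);

static inline unsigned int
load_feat1 (void)
{
  /* The ifunc resolver may run before ELF constructors, so lazily
     initialize the cached CPUID word on first use.  */
  unsigned int feat1 = __atomic_load_n (&__libat_feat1, __ATOMIC_RELAXED);
  if (feat1 == 0)
    feat1 = __libat_feat1_init ();
  return feat1;
}

#define IFUNC_COND_1	(load_feat1 () & bit_CMPXCHG16B)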
* config/x86/host-config.h (libat_feat1_ecx, libat_feat1_edx):
Remove declarations.
(__libat_feat1, __libat_feat1_init): Declare.
(FEAT1_REGISTER): Define.
(load_feat1): New function.
(IFUNC_COND_1): Adjust.
* config/x86/init.c (libat_feat1_ecx, libat_feat1_edx)
(init_cpuid): Remove definitions.
(__libat_feat1): New variable.
(__libat_feat1_init): New function.
From-SVN: r260603
|
|
The compiler builtin will use the hardware instruction cdsg if the
memory operand is properly aligned, and will fall back to the
library call otherwise.
If the compiler is able to detect for one use that the location is
aligned, but fails to do so for another, the hw instruction and the
sw fallback would be mixed on the same memory location. To avoid
this, the library fallback also has to use the hardware instruction
if possible.
libatomic/ChangeLog:
2018-03-09 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/exch_n.c: New file.
* configure.tgt: Add the config directory for s390.
From-SVN: r258384
|
|
From-SVN: r256169
|
|
2017-12-04 Steve Ellcey <sellcey@cavium.com>
* Makefile.am (ARCH_AARCH64_LINUX): Add IFUNC_OPTIONS and
libatomic_la_LIBADD.
* config/linux/aarch64/host-config.h: New file.
* configure.ac (IFUNC_RESOLVER_ARGS): Define.
(ARCH_AARCH64_LINUX): New conditional for IFUNC builds.
* configure.tgt (aarch64): Set ARCH and try_ifunc.
(aarch64*-*-linux*): Update config_path.
(aarch64*-*-linux*): Set IFUNC_RESOLVER_ARGS.
* libatomic_i.h (GEN_SELECTOR): Add IFUNC_RESOLVER_ARGS argument.
* Makefile.in: Regenerate.
* auto-config.h.in: Regenerate.
* configure: Regenerate.
From-SVN: r255399
|