Age | Commit message (Collapse) | Author | Files | Lines |
|
As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"
The following patch deals with it just on the libatomic library side so far,
currently (since ~ 2017) we emit all the __atomic_* 16-byte builtins as
library calls since and this is something that we can hopefully backport.
The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.
2022-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.
|
|
|
|
|
|
Windows ABI (MinGW) is different than Linux ABI when bitfileds are involved.
The following patch adds __attribute__ ((gcc_struct)) to struct fenv in order
to match the layout of x87 state image in memory.
2020-06-01 Uroš Bizjak <ubizjak@gmail.com>
libatomic/ChangeLog:
* config/x86/fenv.c (struct fenv): Add __attribute__ ((gcc_struct)).
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (struct fenv):
Add __attribute__ ((gcc_struct)).
libgfortran/ChangeLog:
PR libfortran/95418
* config/fpu-387.h (struct fenv): Add __attribute__ ((gcc_struct)).
|
|
Introduce math_force_eval_div to use generic division to generate
INEXACT as well as INVALID and DIVZERO exceptions.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__sfp_handle_exceptions): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__atomic_deraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(local_feraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
(struct fenv): Define named struct instead of typedef.
|
|
Introduce math_force_eval to evaluate generic division to generate
INVALID and DIVZERO exceptions.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): New define.
(__sfp_handle_exceptions): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): New define.
(__atomic_feraiseexcept): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): New define.
(local_feraiseexcept): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
|
|
According to "Intel 64 and IA32 Arch SDM, Vol. 3:
"Because SIMD floating-point exceptions are precise and occur immediately,
the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT
instruction, or another SSE/SSE2/SSE3 instruction will catch a pending
unmasked SIMD floating-point exception."
Remove unneeded assignments to volatile memory.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__sfp_handle_exceptions) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libatomic/ChangeLog:
* config/x86/fenv.c (__atomic_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libgfortran/ChangeLog:
* config/fpu-387.h (local_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
|
|
From-SVN: r279813
|
|
From-SVN: r267494
|
|
PR libgcc/60790
x86: Do not assume ELF constructors run before IFUNC resolvers.
* config/x86/host-config.h (libat_feat1_ecx, libat_feat1_edx):
Remove declarations.
(__libat_feat1, __libat_feat1_init): Declare.
(FEAT1_REGISTER): Define.
(load_feat1): New function.
(IFUNC_COND_1): Adjust.
* config/x86/init.c (libat_feat1_ecx, libat_feat1_edx)
(init_cpuid): Remove definitions.
(__libat_feat1): New variable.
(__libat_feat1_init): New function.
From-SVN: r260603
|
|
From-SVN: r256169
|
|
gcc/
* builtins.c (fold_builtin_atomic_always_lock_free): Make "lock-free"
conditional on existance of a fast atomic load.
* optabs-query.c (can_atomic_load_p): New function.
* optabs-query.h (can_atomic_load_p): Declare it.
* optabs.c (expand_atomic_exchange): Always delegate to libatomic if
no fast atomic load is available for the particular size of access.
(expand_atomic_compare_and_swap): Likewise.
(expand_atomic_load): Likewise.
(expand_atomic_store): Likewise.
(expand_atomic_fetch_op): Likewise.
* testsuite/lib/target-supports.exp
(check_effective_target_sync_int_128): Remove x86 because it provides
no fast atomic load.
(check_effective_target_sync_int_128_runtime): Likewise.
libatomic/
* acinclude.m4: Add #define FAST_ATOMIC_LDST_*.
* auto-config.h.in: Regenerate.
* config/x86/host-config.h (FAST_ATOMIC_LDST_16): Define to 0.
(atomic_compare_exchange_n): New.
* glfree.c (EXACT, LARGER): Change condition and add comments.
From-SVN: r245098
|
|
From-SVN: r243994
|
|
From-SVN: r232055
|
|
From-SVN: r219188
|
|
From-SVN: r206291
|
|
__TARGET_SSE__ is defined.
libgcc/ChangeLog:
2013-12-09 Uros Bizjak <ubizjak@gmail.com>
* config/i386/sfp-exceptions.c (__sfp_handle_exceptions): Emit SSE
instructions when __TARGET_SSE__ is defined.
libatomic/ChangeLog:
2013-12-09 Uros Bizjak <ubizjak@gmail.com>
* config/x86/fenv.c (__atomic_feraiseexcept): Emit SSE
instructions when __TARGET_SSE__ is defined.
From-SVN: r205811
|
|
* config/x86/fenv.c: New file.
From-SVN: r204634
|
|
From-SVN: r195164
|
|
From-SVN: r187018
|