Age | Commit message (Collapse) | Author | Files | Lines |
|
This change updates the atomic libcall support to fix the following
issues:
1) A internal compiler error with -fno-sync-libcalls.
2) When sync libcalls are disabled, we don't generate libcalls for
libatomic.
3) There is no sync libcall support for targets other than linux.
As a result, non-atomic stores are silently emitted for types
smaller or equal to the word size. There are now a few atomic
libcalls in the libgcc code, so we need sync support on all
targets.
2023-01-13 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa-linux.h (TARGET_SYNC_LIBCALL): Delete define.
* config/pa/pa.cc (pa_init_libfuncs): Use MAX_SYNC_LIBFUNC_SIZE
define.
* config/pa/pa.h (TARGET_SYNC_LIBCALLS): Use flag_sync_libcalls.
(MAX_SYNC_LIBFUNC_SIZE): Define.
(TARGET_CPU_CPP_BUILTINS): Define __SOFTFP__ when soft float is
enabled.
* config/pa/pa.md (atomic_storeqi): Emit __atomic_exchange_1
libcall when sync libcalls are disabled.
(atomic_storehi, atomic_storesi, atomic_storedi): Likewise.
(atomic_loaddi): Emit __atomic_load_8 libcall when sync libcalls
are disabled on 32-bit target.
* config/pa/pa.opt (matomic-libcalls): New option.
* doc/invoke.texi (HPPA Options): Update.
libgcc/ChangeLog:
* config.host (hppa*64*-*-linux*): Adjust tmake_file to use
pa/t-pa64-linux.
(hppa*64*-*-hpux11*): Adjust tmake_file to use pa/t-pa64-hpux
instead of pa/t-hpux and pa/t-pa64.
* config/pa/linux-atomic.c: Define u32 type.
(ATOMIC_LOAD): Define new macro to implement atomic_load_1,
atomic_load_2, atomic_load_4 and atomic_load_8. Update sync
defines to use atomic_load calls for type.
(SYNC_LOCK_LOAD_2): New macro to implement __sync_lock_load_8.
* config/pa/sync-libfuncs.c: New file.
* config/pa/t-netbsd (LIB2ADD_ST): Define.
* config/pa/t-openbsd (LIB2ADD_ST): Define.
* config/pa/t-pa64-hpux: New file.
* config/pa/t-pa64-linux: New file.
|
|
GCC 13 has a new implementation of gthr-win32.h which supports C++11
mutexes, threads etc. but this causes an unintended ABI break. The
__gthread_mutex_t type is always used in std::basic_filebuf even in
C++98, so independent of whether C++11 sync primitives work or not.
Because that type changed for the win32 thread model, we have a layout
change in std::basic_filebuf. The member is completely unused, it just
gets passed to the std::__basic_file constructor and ignored. So we
don't need that mutex to actually work, we just need its layout to not
change.
Introduce a new __gthr_win32_legacy_mutex_t struct in gthr-win32.h with
the old layout, and conditionally use that in std::basic_filebuf.
PR libstdc++/108331
libgcc/ChangeLog:
* config/i386/gthr-win32.h (__gthr_win32_legacy_mutex_t): New
struct matching the previous __gthread_mutex_t struct.
(__GTHREAD_LEGACY_MUTEX_T): Define.
libstdc++-v3/ChangeLog:
* config/io/c_io_stdio.h (__c_lock): Define as a typedef for
__GTHREAD_LEGACY_MUTEX_T if defined.
|
|
The patch to convert all thumb1 code in libgcc to unified syntax
omitted changing all swi instructions to the current name: svc.
libgcc/ChangeLog:
* config/arm/lib1funcs.S (clear_cache): Use SVC to conform to
unified syntax.
|
|
|
|
Recently, mingw-w64 has got updated <msxml.h> from Wine which is included
indirectly by <windows.h> if `WIN32_LEAN_AND_MEAN` is not defined. The
`IXMLDOMDocument` class has a member function named `abort()`, which gets
affected by our `abort()` macro in "system.h".
`WIN32_LEAN_AND_MEAN` should, nevertheless, always be defined. This
can exclude 'APIs such as Cryptography, DDE, RPC, Shell, and Windows
Sockets' [1], and speed up compilation of these files a bit.
[1] https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers
gcc/
PR middle-end/108300
* config/xtensa/xtensa-dynconfig.c: Define `WIN32_LEAN_AND_MEAN`
before <windows.h>.
* diagnostic-color.cc: Likewise.
* plugin.cc: Likewise.
* prefix.cc: Likewise.
gcc/ada/
PR middle-end/108300
* adaint.c: Define `WIN32_LEAN_AND_MEAN` before `#include
<windows.h>`.
* cio.c: Likewise.
* ctrl_c.c: Likewise.
* expect.c: Likewise.
* gsocket.h: Likewise.
* mingw32.h: Likewise.
* mkdir.c: Likewise.
* rtfinal.c: Likewise.
* rtinit.c: Likewise.
* seh_init.c: Likewise.
* sysdep.c: Likewise.
* terminals.c: Likewise.
* tracebak.c: Likewise.
gcc/jit/
PR middle-end/108300
* jit-w32.h: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libatomic/
PR middle-end/108300
* config/mingw/lock.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libffi/
PR middle-end/108300
* src/aarch64/ffi.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgcc/
PR middle-end/108300
* config/i386/enable-execute-stack-mingw32.c: Define
`WIN32_LEAN_AND_MEAN` before <windows.h>.
* libgcc2.c: Likewise.
* unwind-generic.h: Likewise.
libgfortran/
PR middle-end/108300
* intrinsics/sleep.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgomp/
PR middle-end/108300
* config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libiberty/
PR middle-end/108300
* make-temp-file.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* pex-win32.c: Likewise.
libssp/
PR middle-end/108300
* ssp.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libstdc++-v3/
PR middle-end/108300
* src/c++11/system_error.cc: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
* src/c++11/thread.cc: Likewise.
* src/c++17/fs_ops.cc: Likewise.
* src/filesystem/ops.cc: Likewise.
libvtv/
PR middle-end/108300
* vtv_malloc.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* vtv_rts.cc: Likewise.
* vtv_utils.cc: Likewise.
|
|
|
|
The parameters fs->data_align and fs->code_align always have fixed
values for a particular target in GCC-generated code. Specialize
execute_cfa_program for these values, to avoid multiplications.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Define
__LIBGCC_DWARF_CIE_DATA_ALIGNMENT__.
libgcc/
* unwind-dw2-execute_cfa.h: New file. Extracted from
the execute_cfa_program function in unwind-dw2.c.
* unwind-dw2.c (execute_cfa_program_generic): New function.
(execute_cfa_program_specialized): Likewise.
(execute_cfa_program): Call execute_cfa_program_specialized
or execute_cfa_program_generic, as appropriate.
|
|
constant"
This reverts commit 97bbdb726aba76ead550e25061029cf0aa78671b.
|
|
This reverts commit cb775ecd6e437de8fdba9a3f173f3787e90e98f2.
|
|
|
|
The parameters fs->data_align and fs->code_align always have fixed
values for a particular target in GCC-generated code. Specialize
execute_cfa_program for these values, to avoid multiplications.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Define
__LIBGCC_DWARF_CIE_DATA_ALIGNMENT__.
libgcc/
* unwind-dw2-execute_cfa.h: New file. Extracted from
the execute_cfa_program function in unwind-dw2.c.
* unwind-dw2.c (execute_cfa_program_generic): New function.
(execute_cfa_program_specialized): Likewise.
(execute_cfa_program): Call execute_cfa_program_specialized
or execute_cfa_program_generic, as appropriate.
|
|
And use that to speed up the libgcc unwinder.
gcc/
* debug.h (dwarf_reg_sizes_constant): Declare.
* dwarf2cfi.cc (dwarf_reg_sizes_constant): New function.
gcc/c-family/
* c-cppbuiltin.cc (__LIBGCC_DWARF_REG_SIZES_CONSTANT__):
Define if constant is known.
libgcc/
* unwind-dw2.c (dwarf_reg_size): New function.
(_Unwind_GetGR, _Unwind_SetGR, _Unwind_SetGRPtr)
(_Unwind_SetSpColumn, uw_install_context_1): Use it.
(uw_init_context_1): Do not initialize dwarf_reg_size_table
if not in use.
|
|
2022 -> 2023
|
|
|
|
Broken by 9149a5b7e0a66b7b94d5b7db3194a975d18dea2f.
CC_NONE is defined by wingdi.h and conflicting with gcc.
Committed as obvious.
libgcc/:
* config/i386/gthr-win32.h: undef CC_NONE
Signed-off-by: Jonathan Yong <10walls@gmail.com>
|
|
|
|
On Darwin, GCC now uses a libgcc_s.1.1 for builtins and forwards the system
unwinder. We do, however, build a backwards compatibility libgcc_s.1.dylib.
However, this is not needed by GCC and can cause incorrect operation when
DYLD_LIBRARY_PATH is in use.
Since we do not need or use it during the build, the solution is to skip the
installation into the $build/gcc directory.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config/t-slibgcc-darwin (install-darwin-libgcc-stubs): Skip the
install of libgcc_s.1.dylib when the installation is into the build
gcc directory.
|
|
|
|
This reimplements the GNU threads library on native Windows (except for the
Objective-C specific subset) using direct Win32 API calls, in lieu of the
implementation based on semaphores. This base implementations requires
Windows XP/Server 2003, which was the default minimal setting of MinGW-W64
until end of 2020. This also adds the support required for the C++11 threads,
using again direct Win32 API calls; this additional layer requires Windows
Vista/Server 2008 and is enabled only if _WIN32_WINNT >= 0x0600.
This also changes libstdc++ to pass -D_WIN32_WINNT=0x0600 but only when the
switch --enable-libstdcxx-threads is passed, which means that C++11 threads
are still disabled by default *unless* MinGW-W64 itself is configured for
Windows Vista/Server 2008 or later by default (this has been the case in
the development version since end of 2020, for earlier versions you can
configure it --with-default-win32-winnt=0x0600 to get the same effect).
I only manually tested it on i686-w64-mingw32 and x86_64-w64-mingw32 but
AdaCore has used it in their C/C++/Ada compilers for 3 years now and the
30_threads chapter of the libstdc++ testsuite was clean at the time.
2022-10-31 Eric Botcazou <ebotcazou@adacore.com>
libgcc/
* config.host (i[34567]86-*-mingw*): Add thread fragment after EH one
as well as new i386/t-slibgcc-mingw fragment.
(x86_64-*-mingw*): Likewise.
* config/i386/gthr-win32.h: If _WIN32_WINNT is at least 0x0600, define
both __GTHREAD_HAS_COND and __GTHREADS_CXX0X to 1.
Error out if _GTHREAD_USE_MUTEX_TIMEDLOCK is 1.
Include stdlib.h instead of errno.h and do not include _mingw.h.
(CONST_CAST2): Add specific definition for C++.
(ATTRIBUTE_UNUSED): New macro.
(__UNUSED_PARAM): Delete.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_objc_data_tls): Use TLS_OUT_OF_INDEXES instead of (DWORD)-1.
(__gthread_objc_init_thread_system): Likewise.
(__gthread_objc_thread_get_data): Minor tweak.
(__gthread_objc_condition_allocate): Use ATTRIBUTE_UNUSED.
(__gthread_objc_condition_deallocate): Likewise.
(__gthread_objc_condition_wait): Likewise.
(__gthread_objc_condition_broadcast): Likewise.
(__gthread_objc_condition_signal): Likewise.
Include sys/time.h.
(__gthr_win32_DWORD): New typedef.
(__gthr_win32_HANDLE): Likewise.
(__gthr_win32_CRITICAL_SECTION): Likewise.
(__gthr_win32_CONDITION_VARIABLE): Likewise.
(__gthread_t): Adjust.
(__gthread_key_t): Likewise.
(__gthread_mutex_t): Likewise.
(__gthread_recursive_mutex_t): Likewise.
(__gthread_cond_t): New typedef.
(__gthread_time_t): Likewise.
(__GTHREAD_MUTEX_INIT_DEFAULT): Delete.
(__GTHREAD_RECURSIVE_MUTEX_INIT_DEFAULT): Likewise.
(__GTHREAD_COND_INIT_FUNCTION): Define.
(__GTHREAD_TIME_INIT): Likewise.
(__gthr_i486_lock_cmp_xchg): Delete.
(__gthr_win32_create): Declare.
(__gthr_win32_join): Likewise.
(__gthr_win32_self): Likewise.
(__gthr_win32_detach): Likewise.
(__gthr_win32_equal): Likewise.
(__gthr_win32_yield): Likewise.
(__gthr_win32_mutex_destroy): Likewise.
(__gthr_win32_cond_init_function): Likewise if __GTHREADS_HAS_COND is 1.
(__gthr_win32_cond_broadcast): Likewise.
(__gthr_win32_cond_signal): Likewise.
(__gthr_win32_cond_wait): Likewise.
(__gthr_win32_cond_timedwait): Likewise.
(__gthr_win32_recursive_mutex_init_function): Delete.
(__gthr_win32_recursive_mutex_lock): Likewise.
(__gthr_win32_recursive_mutex_unlock): Likewise.
(__gthr_win32_recursive_mutex_destroy): Likewise.
(__gthread_create): New inline function.
(__gthread_join): Likewise.
(__gthread_self): Likewise.
(__gthread_detach): Likewise.
(__gthread_equal): Likewise.
(__gthread_yield): Likewise.
(__gthread_cond_init_function): Likewise if __GTHREADS_HAS_COND is 1.
(__gthread_cond_broadcast): Likewise.
(__gthread_cond_signal): Likewise.
(__gthread_cond_wait): Likewise.
(__gthread_cond_timedwait): Likewise.
(__GTHREAD_WIN32_INLINE): New macro.
(__GTHREAD_WIN32_COND_INLINE): Likewise.
(__GTHREAD_WIN32_ACTIVE_P): Likewise.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_once): Minor tweaks.
(__gthread_key_create): Use ATTRIBUTE_UNUSED and TLS_OUT_OF_INDEXES.
(__gthread_key_delete): Minor tweak.
(__gthread_getspecific): Likewise.
(__gthread_setspecific): Likewise.
(__gthread_mutex_init_function): Reimplement.
(__gthread_mutex_destroy): Likewise.
(__gthread_mutex_lock): Likewise.
(__gthread_mutex_trylock): Likewise.
(__gthread_mutex_unlock): Likewise.
(__gthr_win32_abs_to_rel_time): Declare.
(__gthread_recursive_mutex_init_function): Reimplement.
(__gthread_recursive_mutex_destroy): Likewise.
(__gthread_recursive_mutex_lock): Likewise.
(__gthread_recursive_mutex_trylock): Likewise.
(__gthread_recursive_mutex_unlock): Likewise.
(__gthread_cond_destroy): New inline function.
(__gthread_cond_wait_recursive): Likewise.
* config/i386/gthr-win32.c: Delete everything.
Include gthr-win32.h to get the out-of-line version of inline routines.
Add compile-time checks for the local version of the Win32 types.
* config/i386/gthr-win32-cond.c: New file.
* config/i386/gthr-win32-thread.c: Likewise.
* config/i386/t-gthr-win32: Add config/i386/gthr-win32-thread.c to the
EH part, config/i386/gthr-win32-cond.c and config/i386/gthr-win32.c to
the static version of libgcc.
* config/i386/t-slibgcc-mingw: New file.
* config/i386/libgcc-mingw.ver: Likewise.
libstdc++-v3/
* acinclude.m4 (GLIBCXX_EXPORT_FLAGS): Substitute CPPFLAGS.
(GLIBCXX_ENABLE_LIBSTDCXX_TIME): Set ac_has_sched_yield and
ac_has_win32_sleep to yes for MinGW. Change HAVE_WIN32_SLEEP
into _GLIBCXX_USE_WIN32_SLEEP.
(GLIBCXX_CHECK_GTHREADS): Add _WIN32_THREADS to compilation flags for
Win32 threads and force _GTHREAD_USE_MUTEX_TIMEDLOCK to 0 for them.
Add -D_WIN32_WINNT=0x0600 to compilation flags if yes was configured
and add it to CPPFLAGS on success.
* config.h.in: Regenerate.
* configure: Likewise.
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_GET_NPROCS_WIN32):
Define to 1.
* config/os/mingw32/os_defines.h (_GLIBCXX_USE_GET_NPROCS_WIN32): Ditto
* src/c++11/thread.cc (get_nprocs): Provide Win32 implementation if
_GLIBCXX_USE_GET_NPROCS_WIN32 is defined. Replace HAVE_WIN32_SLEEP
with USE_WIN32_SLEEP.
* testsuite/19_diagnostics/headers/system_error/errc_std_c++0x.cc: Add
missing conditional compilation.
* testsuite/lib/libstdc++.exp (check_v3_target_sleep): Add support for
_GLIBCXX_USE_WIN32_SLEEP.
(check_v3_target_nprocs): Likewise for _GLIBCXX_USE_GET_NPROCS_WIN32.
Signed-off-by: Eric Botcazou <ebotcazou@adacore.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
|
|
|
|
When registering an unwind frame with __register_frame_info_bases
we currently initialize that fde object eagerly. This has the
advantage that it is immutable afterwards and we can safely
access it from multiple threads, but it has the disadvantage
that we pay the initialization cost even if the application
never throws an exception.
This commit changes the logic to initialize the objects lazily.
The objects themselves are inserted into the b-tree when
registering the frame, but the sorted fde_vector is
not constructed yet. Only on the first time that an
exception tries to pass through the registered code the
object is initialized. We notice that with a double checking,
first doing a relaxed load of the sorted bit and then re-checking
under a mutex when the object was not initialized yet.
Note that the check must implicitly be safe concering a concurrent
frame deregistration, as trying the deregister a frame that is
on the unwinding path of a concurrent exception is inherently racy.
libgcc/ChangeLog:
* unwind-dw2-fde.c: Initialize fde object lazily when
the first exception tries to pass through.
|
|
When registering a dynamic unwinding frame the fde list is sorted.
Previously, we split the list into a sorted and an unsorted part,
sorted the later using heap sort, and merged both. That can be
quite slow due to the large number of (expensive) comparisons.
This patch replaces that logic with a radix sort instead. The
radix sort uses the same amount of memory as the old logic,
using the second list as auxiliary space, and it includes two
techniques to speed up sorting: First, it computes the pointer
addresses for blocks of values, reducing the decoding overhead.
And it recognizes when the data has reached a sorted state,
allowing for early termination. When running out of memory
we fall back to pure heap sort, as before.
For this test program
\#include <cstdio>
int main(int argc, char** argv) {
return 0;
}
compiled with g++ -O -o hello -static hello.c we get with
perf stat -r 200 on a 5950X the following performance numbers:
old logic:
0,20 msec task-clock
930.834 cycles
3.079.765 instructions
0,00030478 +- 0,00000237 seconds time elapsed
new logic:
0,10 msec task-clock
473.269 cycles
1.239.077 instructions
0,00021119 +- 0,00000168 seconds time elapsed
libgcc/ChangeLog:
* unwind-dw2-fde.c: Use radix sort instead of split+sort+merge.
|
|
|
|
libgcc/
* config/xtensa/xtensa-config-builtin.h (XCHAL_NUM_AREGS)
(XCHAL_ICACHE_SIZE, XCHAL_DCACHE_SIZE, XCHAL_ICACHE_LINESIZE)
(XCHAL_DCACHE_LINESIZE, XCHAL_MMU_MIN_PTE_PAGE_SIZE)
(XSHAL_ABI): Remove stray symbols from macro definitions.
|
|
|
|
Now that gcc provides __XCHAL_* definitions use them instead of XCHAL_*
definitions from the include/xtensa-config.h. That makes libgcc
dynamically configurable for the target xtensa core.
libgcc/
* config/xtensa/crti.S (xtensa-config.h): Replace #inlcude with
xtensa-config-builtin.h.
* config/xtensa/crtn.S: Likewise.
* config/xtensa/lib1funcs.S: Likewise.
* config/xtensa/lib2funcs.S: Likewise.
* config/xtensa/xtensa-config-builtin.h: New File.
|
|
|
|
|
|
BFD ld (and the other linkers) only produce one encoding of these
values. It is not necessary to use the general
read_encoded_value_with_base decoding routine. This avoids the
data-dependent branches in its implementation.
libgcc/
* unwind-dw2-fde-dip.c (find_fde_tail): Special-case encoding
values actually used by BFD ld.
|
|
|
|
'libobjc/thr.c' includes 'gthr.h'. While all the other gthread headers
have `#ifdef _LIBOBJC` checks, and provide a different set of inline
functions, I think having one header provide two completely unrelated
set of APIs is unsatisfactory, complicates maintenance, and hinders
further development.
This commit references a new header for libobjc, and adds a copyright
notice, as in other headers.
libgcc/ChangeLog:
* config/i386/gthr-mcf.h: Include 'gthr_libobjc.h' when building
libobjc, instead of 'gthr.h'
|
|
|
|
This patch adds the new thread model `mcf`, which implements mutexes
and condition variables with the mcfgthread library.
Source code for mcfgthread is available at <https://github.com/lhmouse/mcfgthread>.
config/ChangeLog:
* gthr.m4 (GCC_AC_THREAD_HEADER): Add new case for `mcf` thread
model
gcc/ChangeLog:
* config/i386/mingw-mcfgthread.h: New file
* config/i386/mingw32.h: Add builtin macro and default libraries
for mcfgthread when thread model is `mcf`
* config.gcc: Include 'i386/mingw-mcfgthread.h' when thread model
is `mcf`
* configure.ac: Recognize `mcf` as a valid thread model
* config.in: Regenerate
* configure: Regenerate
libatomic/ChangeLog:
* configure.tgt: Add new case for `mcf` thread model
libgcc/ChangeLog:
* config.host: Add new cases for `mcf` thread model
* config/i386/gthr-mcf.h: New file
* config/i386/t-mingw-mcfgthread: New file
* config/i386/t-slibgcc-cygming: Add mcfgthread for libgcc DLL
* configure: Regenerate
libstdc++-v3/ChangeLog:
* libsupc++/atexit_thread.cc (__cxa_thread_atexit): Use
implementation from mcfgthread if available
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_release,
__cxa_guard_abort): Use implementations from mcfgthread if
available
* configure: Regenerate
|
|
|
|
If the xgcc executable has not been built (or has been removed by 'make
clean') then the command to print the multilib dir fails, and so the
MULTIOSDIR variable is empty. That then causes:
/bin/sh: line 0: test: !=: unary operator expected
We can avoid it by quoting the variable.
libgcc/ChangeLog:
* Makefile.in: Quote variable.
|
|
|
|
If shadow stack is enabled, when unwinding stack, we count how many stack
frames we pop to reach the landing pad and adjust shadow stack by the same
amount. When counting the stack frame, we compare the return address on
normal stack against the return address on shadow stack. If they don't
match, return _URC_FATAL_PHASE2_ERROR for the corrupted return address on
normal stack. Don't check the return address for
1. Non-catchable exception where exception_class == 0. Process will be
terminated.
2. Zero return address which marks the outermost stack frame.
3. Signal stack frame since kernel puts a restore token on shadow stack.
* unwind-generic.h (_Unwind_Frames_Increment): Add the EXC
argument.
* unwind.inc (_Unwind_RaiseException_Phase2): Pass EXC to
_Unwind_Frames_Increment.
(_Unwind_ForcedUnwind_Phase2): Likewise.
* config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
Take the EXC argument. Return _URC_FATAL_PHASE2_ERROR if the
return address on normal stack doesn't match the return address
on shadow stack.
|
|
On many architectures, there is a padding gap after the how array
member, and cfa_how can be moved there. This reduces the size of the
struct and the amount of memory that uw_frame_state_for has to clear.
There is no measurable performance benefit from this on x86-64 (even
though the memset goes from 120 to 112 bytes), but it seems to be a
good idea to do anyway.
libgcc/
* unwind-dw2.h (struct frame_state_reg_info): Move cfa_how member
and reduce its size.
|
|
|
|
Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later). Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).
2022-10-14 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.
|
|
|
|
gcc/ChangeLog:
* gcov-io.cc (gcov_write_summary): Rename to ...
(gcov_write_object_summary): ... this.
* gcov-io.h (GCOV_TAG_OBJECT_SUMMARY_LENGTH): Rename from ...
(GCOV_TAG_SUMMARY_LENGTH): ... this.
libgcc/ChangeLog:
* libgcov-driver.c: Use new function.
* libgcov.h (gcov_write_summary): Rename to ...
(gcov_write_object_summary): ... this.
|
|
|
|
This change adds the configury bits to activate the build of
shared libs on VxWorks ports configured with --enable-shared,
for libraries variants where this is generally supported (rtp,
code model !large - currently not compatible with -fPIC).
Set lt_cv_deplibs_check_method in libtool.m4, so the build of
libraries know how to establish dependencies. This is useful in
configurations such as aarch64 where proper support of LSE relies
on accurate dependency information between libstdc++ and libgcc_s
to begin with.
Regenerate configure scripts to reflect libtool.m4 change.
2022-10-09 Olivier Hainque <hainque@adacore.com>
* libtool.m4 (*vxworks*): When enable_shared, set dynamic_linker
and friends for rtp !large. Assume the linker has the required
abilities and set lt_cv_deplibs_check_method.
gcc/
* config.gcc (*vxworks*): Add t-slibgcc fragment
if enable_shared.
libgcc/
* config.host (*vxworks*): When enable_shared, add
libgcc and crtstuff "shared" fragments for rtp except
large code model.
(aarch64*-wrs-vxworks7*): Remove t-slibgcc-libgcc from
the list of fragments.
2022-10-09 Olivier Hainque <hainque@adacore.com>
gcc/
* configure: Regenerate.
libatomic/
* configure: Regenerate.
libbacktrace/
* configure: Regenerate.
libcc1/
* configure: Regenerate.
libffi/
* configure: Regenerate.
libgfortran/
* configure: Regenerate.
libgomp/
* configure: Regenerate.
libitm/
* configure: Regenerate.
libobjc/
* configure: Regenerate.
liboffloadmic/
* configure: Regenerate.
liboffloadmic/
* plugin/configure: Regenerate.
libphobos/
* configure: Regenerate.
libquadmath/
* configure: Regenerate.
libsanitizer/
* configure: Regenerate.
libssp/
* configure: Regenerate.
libstdc++-v3/
* configure: Regenerate.
libvtv/
* configure: Regenerate.
lto-plugin/
* configure: Regenerate.
zlib/
* configure: Regenerate.
|
|
|
|
Missed one spot in the r13-3108-g146e45914032 change (my sed script
didn't expect nested []s).
2022-10-07 Jakub Jelinek <jakub@redhat.com>
* config/arc/linux-unwind.h (arc_fallback_frame_state): Use
fs->regs.how[X] instead of fs->regs.reg[X].how.
|
|
area in uw_frame_state_for
The following patch implements something that has Florian found as
low hanging fruit in our unwinder and has been discussed in the
https://gcc.gnu.org/wiki/cauldron2022#cauldron2022talks.inprocess_unwinding_bof
talk.
_Unwind_FrameState type seems to be (unlike the pre-GCC 3 frame_state
which has been part of ABI) private to unwind-dw2.c + unwind.inc it
includes, it is always defined on the stack of some entrypoints, initialized
by static uw_frame_state_for and the address of it is also passed to other
static functions or the static inlines handling machine dependent unwinding,
but it isn't fortunately passed to any callbacks or public functions, so I
think we can safely change it any time we want.
Florian mentioned that the structure is large even on x86_64, 384 bytes
there, starts with 328 bytes long element with frame_state_reg_info type
which then starts with an array with __LIBGCC_DWARF_FRAME_REGISTERS__ + 1
elements, each of them is 16 bytes long, on x86_64
__LIBGCC_DWARF_FRAME_REGISTERS__ is just 17 but even that is big, on say
riscv __LIBGCC_DWARF_FRAME_REGISTERS__ is I think 128, on powerpc 111,
on sh 153 etc. And, we memset to zero the whole fs variable with the
_Unwind_FrameState type at the start of the unwinding.
The reason why each element is 16 byte (on 64-bit arches) is that it
contains some pointer or pointer sized integer and then an enum (with just
7 different enumerators) + padding.
The following patch decreases it by moving the enum into a separate
array and using just one byte for each register in that second array.
We could compress it even more, say 4 bits per register, but I don't
want to uglify the code for it too much and make the accesses slower.
Furthermore, the clearing of the object can clear only thos how array
and members after it, because REG_UNSAVED enumerator (0) doesn't actually
need any pointer or pointer sized integer, it is just the other kinds
that need to have there something.
By doing this, on x86_64 the above numbers change to _Unwind_FrameState
type being now 264 bytes long, frame_state_reg_info 208 bytes and we
don't clear the first 144 bytes of the object, so the memset is 120 bytes,
so ~ 31% of the old clearing size. On riscv 64-bit assuming it has same
structure layout rules for the few types used there that would be
~ 2160 bytes of _Unwind_FrameState type before and ~ 1264 bytes after,
with the memset previously ~ 2160 bytes and after ~ 232 bytes after.
We've also talked about possibly adding a number of initially initialized
regs and initializing the rest lazily, but at least for x86_64 with
18 elements in the array that doesn't seem to be worth it anymore,
especially because return address column is 16 there and that is usually the
first thing to be touched. It might theory help with lots of registers if
they are usually untouched, but would uglify and complicate any stores to
how by having to check there for the not initialized yet cases and lazy
initialization, and similarly for all reads of how to do there if below
last initialized one, use how, otherwise imply REG_UNSAVED.
The disadvantage of the patch is that touching reg[x].loc and how[x]
now means 2 cachelines rather than one as before, and I admit beyond
bootstrap/regtest I haven't benchmarked it in any way.
2022-10-06 Jakub Jelinek <jakub@redhat.com>
* unwind-dw2.h (REG_UNSAVED, REG_SAVED_OFFSET, REG_SAVED_REG,
REG_SAVED_EXP, REG_SAVED_VAL_OFFSET, REG_SAVED_VAL_EXP,
REG_UNDEFINED): New anonymous enum, moved from inside of
struct frame_state_reg_info.
(struct frame_state_reg_info): Remove reg[].how element and the
anonymous enum there. Add how element.
* unwind-dw2.c: Include stddef.h.
(uw_frame_state_for): Don't clear first
offsetof (_Unwind_FrameState, regs.how[0]) bytes of *fs.
(execute_cfa_program, __frame_state_for, uw_update_context_1,
uw_update_context): Use fs->regs.how[X] instead of fs->regs.reg[X].how
or fs.regs.how[X] instead of fs.regs.reg[X].how.
* config/sh/linux-unwind.h (sh_fallback_frame_state): Likewise.
* config/bfin/linux-unwind.h (bfin_fallback_frame_state): Likewise.
* config/pa/linux-unwind.h (pa32_fallback_frame_state): Likewise.
* config/pa/hpux-unwind.h (UPDATE_FS_FOR_SAR, UPDATE_FS_FOR_GR,
UPDATE_FS_FOR_FR, UPDATE_FS_FOR_PC, pa_fallback_frame_state):
Likewise.
* config/alpha/vms-unwind.h (alpha_vms_fallback_frame_state):
Likewise.
* config/alpha/linux-unwind.h (alpha_fallback_frame_state): Likewise.
* config/arc/linux-unwind.h (arc_fallback_frame_state,
arc_frob_update_context): Likewise.
* config/riscv/linux-unwind.h (riscv_fallback_frame_state): Likewise.
* config/nios2/linux-unwind.h (NIOS2_REG): Likewise.
* config/nds32/linux-unwind.h (NDS32_PUT_FS_REG): Likewise.
* config/s390/tpf-unwind.h (s390_fallback_frame_state): Likewise.
* config/s390/linux-unwind.h (s390_fallback_frame_state): Likewise.
* config/sparc/sol2-unwind.h (sparc64_frob_update_context,
MD_FALLBACK_FRAME_STATE_FOR): Likewise.
* config/sparc/linux-unwind.h (sparc64_fallback_frame_state,
sparc64_frob_update_context, sparc_fallback_frame_state): Likewise.
* config/i386/sol2-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/w32-unwind.h (i386_w32_fallback_frame_state): Likewise.
* config/i386/linux-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/freebsd-unwind.h (x86_64_freebsd_fallback_frame_state):
Likewise.
* config/i386/dragonfly-unwind.h
(x86_64_dragonfly_fallback_frame_state): Likewise.
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Likewise.
* config/csky/linux-unwind.h (csky_fallback_frame_state): Likewise.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Likewise.
* config/aarch64/freebsd-unwind.h
(aarch64_freebsd_fallback_frame_state): Likewise.
* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
Likewise.
* config/or1k/linux-unwind.h (or1k_fallback_frame_state): Likewise.
* config/mips/linux-unwind.h (mips_fallback_frame_state): Likewise.
* config/loongarch/linux-unwind.h (loongarch_fallback_frame_state):
Likewise.
* config/m68k/linux-unwind.h (m68k_fallback_frame_state): Likewise.
* config/xtensa/linux-unwind.h (xtensa_fallback_frame_state):
Likewise.
* config/rs6000/darwin-fallback.c (set_offset): Likewise.
* config/rs6000/aix-unwind.h (MD_FROB_UPDATE_CONTEXT): Likewise.
* config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Likewise.
* config/rs6000/freebsd-unwind.h (frob_update_context): Likewise.
|
|
|
|
Investigating the reasons for libgcc build failures in a canadian
context, orthogonally to the recent update of vxcrtstuff, exposed
interesting differences in the way include search paths are managed
between a regular Linux->VxWorks cross build and a canadian setup
building a Windows->VxWorks toolchain in a Linux environment.
This change augments the comment attached to LIBGCC2_INCLUDE in
libgcc/config/t-vxworks to better describe the parameters at play.
It also adjusts the addition of options for gcc/include and
gcc/include-fixed to minimize the actual differences for libgcc
in the two kinds of configurations.
2022-03-06 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/t-vxworks (LIBGCC2_INCLUDE): Augment comment. Move
-I options for gcc/include and gcc/include-fixed at the end
and make them -isystem.
|
|
Within gthr-vxworks.h, we prevent C++ errors from missing
declarations in some system headers by prepending their inclusion
with a
#pragma GCC diagnostic ignored "-Wstrict-prototypes"
But Wstrict-prototypes is internally registered as valid for
C/ObjC only, not C++, and this trick in turn triggers a Wpragma
warning with -Wsystem-headers.
This change just arranges to ignore the secondary warning locally.
2021-02-03 Olivier Hainque <hainque@adacore.com>
* config/gthr-vxworks.h: Prevent Wpragma warning for the
pragma diagnostics on Wstrict-prototypes.
|