Age | Commit message (Collapse) | Author | Files | Lines |
|
Also divmod, but only for scalar modes, for now (because there are no complex
int vectors yet).
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_expand_divmod_libfunc): New function.
(gcn_init_libfuncs): Add div and mod functions for all modes.
Add placeholders for divmod functions.
(TARGET_EXPAND_DIVMOD_LIBFUNC): Define.
libgcc/ChangeLog:
* config/gcn/lib2-divmod-di.c: Reimplement like lib2-divmod.c.
* config/gcn/lib2-divmod.c: Likewise.
* config/gcn/lib2-gcn.h: Add new types and prototypes for all the
new vector libfuncs.
* config/gcn/t-amdgcn: Add new files.
* config/gcn/amdgcn_veclib.h: New file.
* config/gcn/lib2-vec_divmod-di.c: New file.
* config/gcn/lib2-vec_divmod-hi.c: New file.
* config/gcn/lib2-vec_divmod-qi.c: New file.
* config/gcn/lib2-vec_divmod.c: New file.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/predcom-2.c: Avoid vectors on amdgcn.
* gcc.dg/unroll-8.c: Likewise.
* gcc.dg/vect/slp-26.c: Change expected results on amdgdn.
* lib/target-supports.exp
(check_effective_target_vect_int_mod): Add amdgcn.
(check_effective_target_divmod): Likewise.
* gcc.target/gcn/simd-math-3-16.c: New test.
* gcc.target/gcn/simd-math-3-2.c: New test.
* gcc.target/gcn/simd-math-3-32.c: New test.
* gcc.target/gcn/simd-math-3-4.c: New test.
* gcc.target/gcn/simd-math-3-8.c: New test.
* gcc.target/gcn/simd-math-3-char-16.c: New test.
* gcc.target/gcn/simd-math-3-char-2.c: New test.
* gcc.target/gcn/simd-math-3-char-32.c: New test.
* gcc.target/gcn/simd-math-3-char-4.c: New test.
* gcc.target/gcn/simd-math-3-char-8.c: New test.
* gcc.target/gcn/simd-math-3-char-run-16.c: New test.
* gcc.target/gcn/simd-math-3-char-run-2.c: New test.
* gcc.target/gcn/simd-math-3-char-run-32.c: New test.
* gcc.target/gcn/simd-math-3-char-run-4.c: New test.
* gcc.target/gcn/simd-math-3-char-run-8.c: New test.
* gcc.target/gcn/simd-math-3-char-run.c: New test.
* gcc.target/gcn/simd-math-3-char.c: New test.
* gcc.target/gcn/simd-math-3-long-16.c: New test.
* gcc.target/gcn/simd-math-3-long-2.c: New test.
* gcc.target/gcn/simd-math-3-long-32.c: New test.
* gcc.target/gcn/simd-math-3-long-4.c: New test.
* gcc.target/gcn/simd-math-3-long-8.c: New test.
* gcc.target/gcn/simd-math-3-long-run-16.c: New test.
* gcc.target/gcn/simd-math-3-long-run-2.c: New test.
* gcc.target/gcn/simd-math-3-long-run-32.c: New test.
* gcc.target/gcn/simd-math-3-long-run-4.c: New test.
* gcc.target/gcn/simd-math-3-long-run-8.c: New test.
* gcc.target/gcn/simd-math-3-long-run.c: New test.
* gcc.target/gcn/simd-math-3-long.c: New test.
* gcc.target/gcn/simd-math-3-run-16.c: New test.
* gcc.target/gcn/simd-math-3-run-2.c: New test.
* gcc.target/gcn/simd-math-3-run-32.c: New test.
* gcc.target/gcn/simd-math-3-run-4.c: New test.
* gcc.target/gcn/simd-math-3-run-8.c: New test.
* gcc.target/gcn/simd-math-3-run.c: New test.
* gcc.target/gcn/simd-math-3-short-16.c: New test.
* gcc.target/gcn/simd-math-3-short-2.c: New test.
* gcc.target/gcn/simd-math-3-short-32.c: New test.
* gcc.target/gcn/simd-math-3-short-4.c: New test.
* gcc.target/gcn/simd-math-3-short-8.c: New test.
* gcc.target/gcn/simd-math-3-short-run-16.c: New test.
* gcc.target/gcn/simd-math-3-short-run-2.c: New test.
* gcc.target/gcn/simd-math-3-short-run-32.c: New test.
* gcc.target/gcn/simd-math-3-short-run-4.c: New test.
* gcc.target/gcn/simd-math-3-short-run-8.c: New test.
* gcc.target/gcn/simd-math-3-short-run.c: New test.
* gcc.target/gcn/simd-math-3-short.c: New test.
* gcc.target/gcn/simd-math-3.c: New test.
* gcc.target/gcn/simd-math-4-char-run.c: New test.
* gcc.target/gcn/simd-math-4-char.c: New test.
* gcc.target/gcn/simd-math-4-long-run.c: New test.
* gcc.target/gcn/simd-math-4-long.c: New test.
* gcc.target/gcn/simd-math-4-run.c: New test.
* gcc.target/gcn/simd-math-4-short-run.c: New test.
* gcc.target/gcn/simd-math-4-short.c: New test.
* gcc.target/gcn/simd-math-4.c: New test.
* gcc.target/gcn/simd-math-5-16.c: New test.
* gcc.target/gcn/simd-math-5-32.c: New test.
* gcc.target/gcn/simd-math-5-4.c: New test.
* gcc.target/gcn/simd-math-5-8.c: New test.
* gcc.target/gcn/simd-math-5-char-16.c: New test.
* gcc.target/gcn/simd-math-5-char-32.c: New test.
* gcc.target/gcn/simd-math-5-char-4.c: New test.
* gcc.target/gcn/simd-math-5-char-8.c: New test.
* gcc.target/gcn/simd-math-5-char-run-16.c: New test.
* gcc.target/gcn/simd-math-5-char-run-32.c: New test.
* gcc.target/gcn/simd-math-5-char-run-4.c: New test.
* gcc.target/gcn/simd-math-5-char-run-8.c: New test.
* gcc.target/gcn/simd-math-5-char-run.c: New test.
* gcc.target/gcn/simd-math-5-char.c: New test.
* gcc.target/gcn/simd-math-5-long-16.c: New test.
* gcc.target/gcn/simd-math-5-long-32.c: New test.
* gcc.target/gcn/simd-math-5-long-4.c: New test.
* gcc.target/gcn/simd-math-5-long-8.c: New test.
* gcc.target/gcn/simd-math-5-long-run-16.c: New test.
* gcc.target/gcn/simd-math-5-long-run-32.c: New test.
* gcc.target/gcn/simd-math-5-long-run-4.c: New test.
* gcc.target/gcn/simd-math-5-long-run-8.c: New test.
* gcc.target/gcn/simd-math-5-long-run.c: New test.
* gcc.target/gcn/simd-math-5-long.c: New test.
* gcc.target/gcn/simd-math-5-run-16.c: New test.
* gcc.target/gcn/simd-math-5-run-32.c: New test.
* gcc.target/gcn/simd-math-5-run-4.c: New test.
* gcc.target/gcn/simd-math-5-run-8.c: New test.
* gcc.target/gcn/simd-math-5-run.c: New test.
* gcc.target/gcn/simd-math-5-short-16.c: New test.
* gcc.target/gcn/simd-math-5-short-32.c: New test.
* gcc.target/gcn/simd-math-5-short-4.c: New test.
* gcc.target/gcn/simd-math-5-short-8.c: New test.
* gcc.target/gcn/simd-math-5-short-run-16.c: New test.
* gcc.target/gcn/simd-math-5-short-run-32.c: New test.
* gcc.target/gcn/simd-math-5-short-run-4.c: New test.
* gcc.target/gcn/simd-math-5-short-run-8.c: New test.
* gcc.target/gcn/simd-math-5-short-run.c: New test.
* gcc.target/gcn/simd-math-5-short.c: New test.
* gcc.target/gcn/simd-math-5.c: New test.
|
|
The HImode libfuncs weren't called and trying to enable them fails because
TARGET_PROMOTE_FUNCTION_MODE wants to widen the arguments but the signedness
isn't known.
libgcc/ChangeLog:
* config/gcn/lib2-gcn.h (QItype, UQItype, HItype, UHItype): Delete.
(__divhi3, __modhi3, __udivhi3, __umodhi3): Delete.
* config/gcn/t-amdgcn: Don't build lib2-divmod-hi.c.
* config/gcn/lib2-divmod-hi.c: Removed.
|
|
One of my workmates found there is a warning like:
libgcc/config/rs6000/morestack.S:402: Warning: ignoring
incorrect section type for .init_array.00000
when compiling libgcc/config/rs6000/morestack.S.
Since commit r13-6545 touched that file recently, which was
suspected to be responsible for this warning, I did some
investigation and found this is a warning staying for a long
time. For section .init_stack*, it's preferred to use
section type SHT_INIT_ARRAY. So this patch is use
"@init_array" to replace "@progbits".
Although the warning is trivial, Segher suggested me to
post this to fix it, in order to avoid any possible
misunderstanding/confusion on the warning.
As Alan confirmed, this doesn't require a premise check
on if the existing binutils supports "@init_array" or not,
"because if you want split-stack to work, you must link
with gold, any version of binutils that has gold has an
assembler that understands @init_array". (Thanks Alan!)
libgcc/ChangeLog:
* config/i386/morestack.S: Use @init_array rather than
@progbits for section type of section .init_array.
* config/rs6000/morestack.S: Likewise.
* config/s390/morestack.S: Likewise.
|
|
speculation_barrier for MIPS needs sync+jr.hb (r2+),
so we implement __speculation_barrier in libgcc, like arm32 does.
gcc/ChangeLog:
* config/mips/mips-protos.h (mips_emit_speculation_barrier): New
prototype.
* config/mips/mips.cc (speculation_barrier_libfunc): New static
variable.
(mips_init_libfuncs): Initialize it.
(mips_emit_speculation_barrier): New function.
* config/mips/mips.md (speculation_barrier): Call
mips_emit_speculation_barrier.
libgcc/ChangeLog:
* config/mips/lib1funcs.S: New file.
define __speculation_barrier and include mips16.S.
* config/mips/t-mips: define LIB1ASMSRC as mips/lib1funcs.S.
define LIB1ASMFUNCS as _speculation_barrier.
set version info for __speculation_barrier.
* config/mips/libgcc-mips.ver: New file.
* config/mips/t-mips16: don't define LIB1ASMSRC as mips16.S
included in lib1funcs.S now.
|
|
Tools from later versions of the OS deprecate or fail to support
earlier OS revisions.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
|
|
Replace LR.aq/SC.rl pairs with the SEQ_CST LR.aqrl/SC.rl pairs
recommended by table A.6 of the ISA manual.
2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
libgcc/ChangeLog:
* config/riscv/atomic.c: Change LR.aq/SC.rl pairs into
sequentially consistent LR.aqrl/SC.rl pairs.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This patch aligns the configuration to the actual PRU capabilities. It
also reduces the size of the affected libgcc functions.
For a real-world project using integer arithmetics the savings
are significant:
Before:
text data bss dec hex filename
3688 865 544 5097 13e9 hc-sr04-range-sensor.elf
With TARGET_HAS_NO_HW_DIVIDE defined:
text data bss dec hex filename
2824 865 544 4233 1089 hc-sr04-range-sensor.elf
Execution speed also appears to have improved. The moddi3 function is
now executed in half the CPU cycles.
libgcc/ChangeLog:
* config/pru/t-pru (HOST_LIBGCC2_CFLAGS): Add
-DTARGET_HAS_NO_HW_DIVIDE.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
With this, execution time for e.g. __moddi3 go from 59 to 40 cycles in
the "fast" case or from 290 to 200 cycles in the "slow" case (when the
!TARGET_HAS_NO_HW_DIVIDE variant calls division and modulus functions
for 32-bit SImode), as exposed by gcc.c-torture/execute/arith-rand-ll.c
compiled for -march=v10.
Unfortunately, it just puts a performance improvement "dent" of 0.07%
in a arith-rand-ll.c-based performance test - where all loops are also
reduced to 1/10.
The size of every affected libgcc function is reduced to less than
half and they are all now leaf functions.
* config/cris/t-cris (HOST_LIBGCC2_CFLAGS): Add
-DTARGET_HAS_NO_HW_DIVIDE.
|
|
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.
This patch changes the default behavior to inline subword atomic calls
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.
gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.
2023-04-18 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
PR target/104338
* config/riscv/riscv-protos.h: Add helper function stubs.
* config/riscv/riscv.cc: Add helper functions for subword masking.
* config/riscv/riscv.opt: Add command-line flag.
* config/riscv/sync.md: Add masking logic and inline asm for fetch_and_op,
fetch_and_nand, CAS, and exchange ops.
* doc/invoke.texi: Add blurb regarding command-line flag.
libgcc/ChangeLog:
PR target/104338
* config/riscv/atomic.c: Add reference to duplicate logic.
gcc/testsuite/ChangeLog:
PR target/104338
* gcc.target/riscv/inline-atomics-1.c: New test.
* gcc.target/riscv/inline-atomics-2.c: New test.
* gcc.target/riscv/inline-atomics-3.c: New test.
* gcc.target/riscv/inline-atomics-4.c: New test.
* gcc.target/riscv/inline-atomics-5.c: New test.
* gcc.target/riscv/inline-atomics-6.c: New test.
* gcc.target/riscv/inline-atomics-7.c: New test.
* gcc.target/riscv/inline-atomics-8.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
reversed direction [PR109402]
muldi3 will deallocate stack space after the call to __save_r26_r31,
then re-allocate the space a short while later. If an interrupt
occurs in that window, it can clobber items on the stack.
PR target/109402
libgcc/
* config/v850/lib1funcs.S (___muldi3): Remove unnecessary
stack manipulations.
|
|
The millicode division and remainder routines trap division by zero.
The unwinder needs these directives to unwind divide by zero traps.
2023-04-05 John David Anglin <danglin@gcc.gnu.org>
libgcc/ChangeLog:
PR target/109374
* config/pa/milli64.S (RETURN_COLUMN): Define.
($$divI): Add CFI directives.
($$divU): Likewise.
($$remI): Likewise.
($$remU): Likewise.
|
|
We should always carry the exceptions forward. This bug was found when
working on testing glibc math tests, many tests were failing with
Overflow and Underflow flags not set. This was traced to here.
libgcc/ChangeLog:
* config/or1k/sfp-machine.h (FP_HANDLE_EXCEPTIONS): Remove
statement clearing existing exceptions.
|
|
gcc/
* config/xtensa/linux.h (TARGET_ASM_FILE_END): New macro.
libgcc/
* config/xtensa/crti.S: Add .note.GNU-stack section on linux.
* config/xtensa/crtn.S: Likewise.
* config/xtensa/lib1funcs.S: Likewise.
* config/xtensa/lib2funcs.S: Likewise.
|
|
x86_64/i686 has for a few months working std::bfloat16_t support, __bf16
there is no longer a storage only type, but can be used for arithmetics
and is supported in libgcc and libstdc++.
The following patch adds similar support for AArch64.
Unlike the x86 changes, this one keeps the old __bf16 mangling of
u6__bf16 rather than DF16b (so an exception from Itanium ABI), but
otherwise __bf16 and decltype (0.0bf16) are the same type and both
in C++ act as extended floating-point type.
2023-03-13 Jakub Jelinek <jakub@redhat.com>
gcc/
* config/aarch64/aarch64.h (aarch64_bf16_type_node): Remove.
(aarch64_bf16_ptr_type_node): Adjust comment.
* config/aarch64/aarch64.cc (aarch64_gimplify_va_arg_expr): Use
bfloat16_type_node rather than aarch64_bf16_type_node.
(aarch64_libgcc_floating_mode_supported_p,
aarch64_scalar_mode_supported_p): Also support BFmode.
(aarch64_invalid_conversion, aarch64_invalid_unary_op): Remove.
(aarch64_invalid_binary_op): Remove BFmode related rejections.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP): Don't redefine.
* config/aarch64/aarch64-builtins.cc (aarch64_bf16_type_node): Remove.
(aarch64_int_or_fp_type): Use bfloat16_type_node rather than
aarch64_bf16_type_node.
(aarch64_init_simd_builtin_types): Likewise.
(aarch64_init_bf16_types): Likewise. Don't create bfloat16_type_node,
which is created in tree.cc already.
* config/aarch64/aarch64-sve-builtins.def (svbfloat16_t): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c:
Don't expect one __bf16 related error.
* gcc.target/aarch64/bfloat16_vector_typecheck_1.c: Adjust or remove
dg-error directives for __bf16 being an extended arithmetic type.
* gcc.target/aarch64/bfloat16_vector_typecheck_2.c: Likewise.
* gcc.target/aarch64/bfloat16_scalar_typecheck.c: Likewise.
* g++.target/aarch64/bfloat_cpp_typecheck.C: Don't expect two __bf16
related errors.
libgcc/
* config/aarch64/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf dfbf sfbf hfbf.
(softfp_extras): Add floatdibf floatundibf floattibf floatuntibf.
* config/aarch64/libgcc-softfp.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,t,h}fbf2.
* config/aarch64/sfp-machine.h (_FP_NANFRAC_B, _FP_NANSIGN_B): Define.
* soft-fp/floatundibf.c: New file.
* soft-fp/floatdibf.c: New file.
libstdc++-v3/
* config/abi/pre/gnu.ver (CXXABI_1.3.14): Also export __bf16 tinfos
if it isn't mangled as DF16b but u6__bf16.
|
|
While DI <-> BF conversions can be handled (and are) through
DI <-> XF <-> BF and for narrower integral modes even sometimes
through DF or SF, because XFmode has 64-bit mantissa and so all
the DImode values are exactly representable in XFmode.
That is not the case for TImode, and while e.g. the HF -> TI
conversions are IMHO useless in libgcc, because HFmode has
-65504.0f16, 65504.0f16 range, all the integers will be already
representable in SImode (or even HImode for unsigned) and so
I think HF -> DI -> TI conversions are faster and valid,
BFmode has roughly the same range as SFmode and so we absolutely need
the TI -> BF conversions to avoid double rounding.
As for BF -> TI conversions, they can be either also implemented
in libgcc, or they can be implemented (as done in this commit)
as BF -> SF -> TI conversions with the same code generation used
elsewhere, just doing the 16-bit left shift of the bits - I think
we don't need to handle sNaNs during the BF -> SF part because
SF -> TI (which is already a libcall too) will handle that too.
The BF -> SF -> TI path avoids wasting
32: 0000000000015e10 321 FUNC GLOBAL DEFAULT 13 __fixbfti@@GCC_13.0.0
89: 0000000000015f60 299 FUNC GLOBAL DEFAULT 13 __fixunsbfti@@GCC_13.0.0
2023-03-10 Jakub Jelinek <jakub@redhat.com>
PR target/107703
* optabs.cc (expand_fix): For conversions from BFmode to integral,
use shifts to convert it to SFmode first and then convert SFmode
to integral.
* soft-fp/floattibf.c: New file.
* soft-fp/floatuntibf.c: New file.
* config/i386/libgcc-glibc.ver: Export __float{,un}tibf @ GCC_13.0.0.
* config/i386/64/t-softfp (softfp_extras): Add floattibf and
floatuntibf.
(CFLAGS-floattibf.c, CFLAGS-floatunstibf.c): Add -msse2.
|
|
As PR108727 shows, when cleanup code called by the stack
unwinder calls function _Unwind_Resume, it goes via plt
stub like:
function 00000000.plt_call._Unwind_Resume:
=> 0x0000000010003580 <+0>: std r2,40(r1)
0x0000000010003584 <+4>: ld r12,-31760(r2)
0x0000000010003588 <+8>: mtctr r12
0x000000001000358c <+12>: ld r2,-31752(r2)
0x0000000010003590 <+16>: cmpldi r2,0
0x0000000010003594 <+20>: bnectr+
0x0000000010003598 <+24>: b 0x100031a4
<_Unwind_Resume@plt>
It wants to save TOC base (r2) to r1 + 40, but we only
bump the stack segment by 32 bytes as follows:
stdu %r29,-32(%r3)
It means the access is out of the stack segment allocated
by __generic_morestack, once the touch area isn't writable
like this failure shows, it would cause segment fault.
So fix the bump size with one reasonable value PARAMS.
PR libgcc/108727
libgcc/ChangeLog:
* config/rs6000/morestack.S (__morestack): Use PARAMS for new stack
bump size.
|
|
This patch updates the IEEE 128-bit types used in libgcc.
At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system. The
build dies when it is trying to build the _mulkc3.c and _divkc3 modules.
This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise. The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.
While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type. We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.
I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up. I developed this alternative
patch that changes libgcc so that it does not tickle the issue. I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.
2023-03-06 Michael Meissner <meissner@linux.ibm.com>
libgcc/
PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.
|
|
gcc/ChangeLog:
* config/riscv/riscv.h (RISCV_DWARF_VLENB): New.
(DWARF_FRAME_REGISTERS): New.
(DWARF_REG_TO_UNWIND_COLUMN): New.
libgcc/ChangeLog:
* config.host (riscv*-*-*): Add config/riscv/value-unwind.h.
* config/riscv/value-unwind.h: New.
|
|
I have noticed some warnings when building GCC for arm-eabi:
pr-support.c:110:7: warning: variable ‘set_pac_sp’ set but not used [-Wunused-but-set-variable]
pr-support.c:109:7: warning: variable ‘set_pac’ set but not used [-Wunused-but-set-variable]
This small patch avoids them by defining these two variables undef
TARGET_HAVE_PACBTI, like the code which actually uses them.
libgcc/
* config/arm/pr-support.c (__gnu_unwind_execute): Use
TARGET_HAVE_PACBTI to define set_pac and set_pac_sp.
|
|
Tested by building a toolchain and compiling gnumach for x86_64 [1].
This is the basic version without unwind support which I think is only
required to implement exceptions.
[1]
https://github.com/flavioc/cross-hurd/blob/master/bootstrap-kernel.sh.
gcc/ChangeLog:
* config.gcc: Recognize x86_64-*-gnu* targets and include
i386/gnu64.h.
* config/i386/gnu64.h: Define configuration for new target
including ld.so location.
libgcc/ChangeLog:
* config.host: Recognize x86_64-*-gnu* targets.
* config/i386/gnu-unwind.h: Update to handle __x86_64__ with a
TODO for now.
Signed-off-by: Flavio Cruz <flaviocruz@gmail.com>
|
|
This patch adds support for Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and "0xb5" instruction is encounter during runtime
stack-unwinding, we use effective vsp as modifier in pointer authentication.
On completion of stack unwinding if "0xb5" instruction is not encountered
then CFA will be used as modifier in pointer authentication.
[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
libgcc/ChangeLog:
2022-11-09 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode
"0xb5".
|
|
This patch adds authentication for when the stack is unwound when an
exception is taken. All the changes here are done to the runtime code
in libgcc's unwinder code for Arm target. All the changes are guarded
under defined (__ARM_FEATURE_PAC_DEFAULT) and activated only if the
+pacbti feature is switched on for the architecture. This means that
switching on the target feature via -march or -mcpu is sufficient and
-mbranch-protection need not be enabled. This ensures that the
unwinder is authenticated only if the PACBTI instructions are
available in the non-NOP space as it uses AUTG. Just generating
PAC/AUT instructions using -mbranch-protection will not enable
authentication on the unwinder.
Pre-approved with the requested changes here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586555.html>.
gcc/ChangeLog:
* ginclude/unwind-arm-common.h (_Unwind_VRS_RegClass): Introduce
new pseudo register class _UVRSC_PAC.
libgcc/ChangeLog:
* config/arm/pr-support.c (__gnu_unwind_execute): Decode
exception opcode (0xb4) for saving RA_AUTH_CODE and authenticate
with AUTG if found.
* config/arm/unwind-arm.c (struct pseudo_regs): New.
(phase1_vrs): Introduce new field to store pseudo-reg state.
(phase2_vrs): Likewise.
(_Unwind_VRS_Get): Load pseudo register state from virtual reg set.
(_Unwind_VRS_Set): Store pseudo register state to virtual reg set.
(_Unwind_VRS_Pop): Load pseudo register value from stack into VRS.
Co-Authored-By: Tejas Belagod <tbelagod@arm.com>
Co-Authored-By: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
|
|
A recent change only initializes the regs.how[] during Dwarf unwinding
which resulted in an uninitialized offset used in return address signing
and random failures during unwinding. The fix is to encode the return
address signing state in REG_UNSAVED and a new state REG_UNSAVED_ARCHEXT.
libgcc/
PR target/107678
* unwind-dw2.h (REG_UNSAVED_ARCHEXT): Add new enum.
* unwind-dw2.c (uw_update_context_1): Add REG_UNSAVED_ARCHEXT case.
* unwind-dw2-execute_cfa.h: Use REG_UNSAVED_ARCHEXT/REG_UNSAVED to
encode the return address signing state.
* config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr)
Check current return address signing state.
(aarch64_frob_update_contex): Remove.
|
|
|
|
This change updates the atomic libcall support to fix the following
issues:
1) A internal compiler error with -fno-sync-libcalls.
2) When sync libcalls are disabled, we don't generate libcalls for
libatomic.
3) There is no sync libcall support for targets other than linux.
As a result, non-atomic stores are silently emitted for types
smaller or equal to the word size. There are now a few atomic
libcalls in the libgcc code, so we need sync support on all
targets.
2023-01-13 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa-linux.h (TARGET_SYNC_LIBCALL): Delete define.
* config/pa/pa.cc (pa_init_libfuncs): Use MAX_SYNC_LIBFUNC_SIZE
define.
* config/pa/pa.h (TARGET_SYNC_LIBCALLS): Use flag_sync_libcalls.
(MAX_SYNC_LIBFUNC_SIZE): Define.
(TARGET_CPU_CPP_BUILTINS): Define __SOFTFP__ when soft float is
enabled.
* config/pa/pa.md (atomic_storeqi): Emit __atomic_exchange_1
libcall when sync libcalls are disabled.
(atomic_storehi, atomic_storesi, atomic_storedi): Likewise.
(atomic_loaddi): Emit __atomic_load_8 libcall when sync libcalls
are disabled on 32-bit target.
* config/pa/pa.opt (matomic-libcalls): New option.
* doc/invoke.texi (HPPA Options): Update.
libgcc/ChangeLog:
* config.host (hppa*64*-*-linux*): Adjust tmake_file to use
pa/t-pa64-linux.
(hppa*64*-*-hpux11*): Adjust tmake_file to use pa/t-pa64-hpux
instead of pa/t-hpux and pa/t-pa64.
* config/pa/linux-atomic.c: Define u32 type.
(ATOMIC_LOAD): Define new macro to implement atomic_load_1,
atomic_load_2, atomic_load_4 and atomic_load_8. Update sync
defines to use atomic_load calls for type.
(SYNC_LOCK_LOAD_2): New macro to implement __sync_lock_load_8.
* config/pa/sync-libfuncs.c: New file.
* config/pa/t-netbsd (LIB2ADD_ST): Define.
* config/pa/t-openbsd (LIB2ADD_ST): Define.
* config/pa/t-pa64-hpux: New file.
* config/pa/t-pa64-linux: New file.
|
|
GCC 13 has a new implementation of gthr-win32.h which supports C++11
mutexes, threads etc. but this causes an unintended ABI break. The
__gthread_mutex_t type is always used in std::basic_filebuf even in
C++98, so independent of whether C++11 sync primitives work or not.
Because that type changed for the win32 thread model, we have a layout
change in std::basic_filebuf. The member is completely unused, it just
gets passed to the std::__basic_file constructor and ignored. So we
don't need that mutex to actually work, we just need its layout to not
change.
Introduce a new __gthr_win32_legacy_mutex_t struct in gthr-win32.h with
the old layout, and conditionally use that in std::basic_filebuf.
PR libstdc++/108331
libgcc/ChangeLog:
* config/i386/gthr-win32.h (__gthr_win32_legacy_mutex_t): New
struct matching the previous __gthread_mutex_t struct.
(__GTHREAD_LEGACY_MUTEX_T): Define.
libstdc++-v3/ChangeLog:
* config/io/c_io_stdio.h (__c_lock): Define as a typedef for
__GTHREAD_LEGACY_MUTEX_T if defined.
|
|
The patch to convert all thumb1 code in libgcc to unified syntax
omitted changing all swi instructions to the current name: svc.
libgcc/ChangeLog:
* config/arm/lib1funcs.S (clear_cache): Use SVC to conform to
unified syntax.
|
|
Recently, mingw-w64 has got updated <msxml.h> from Wine which is included
indirectly by <windows.h> if `WIN32_LEAN_AND_MEAN` is not defined. The
`IXMLDOMDocument` class has a member function named `abort()`, which gets
affected by our `abort()` macro in "system.h".
`WIN32_LEAN_AND_MEAN` should, nevertheless, always be defined. This
can exclude 'APIs such as Cryptography, DDE, RPC, Shell, and Windows
Sockets' [1], and speed up compilation of these files a bit.
[1] https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers
gcc/
PR middle-end/108300
* config/xtensa/xtensa-dynconfig.c: Define `WIN32_LEAN_AND_MEAN`
before <windows.h>.
* diagnostic-color.cc: Likewise.
* plugin.cc: Likewise.
* prefix.cc: Likewise.
gcc/ada/
PR middle-end/108300
* adaint.c: Define `WIN32_LEAN_AND_MEAN` before `#include
<windows.h>`.
* cio.c: Likewise.
* ctrl_c.c: Likewise.
* expect.c: Likewise.
* gsocket.h: Likewise.
* mingw32.h: Likewise.
* mkdir.c: Likewise.
* rtfinal.c: Likewise.
* rtinit.c: Likewise.
* seh_init.c: Likewise.
* sysdep.c: Likewise.
* terminals.c: Likewise.
* tracebak.c: Likewise.
gcc/jit/
PR middle-end/108300
* jit-w32.h: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libatomic/
PR middle-end/108300
* config/mingw/lock.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libffi/
PR middle-end/108300
* src/aarch64/ffi.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgcc/
PR middle-end/108300
* config/i386/enable-execute-stack-mingw32.c: Define
`WIN32_LEAN_AND_MEAN` before <windows.h>.
* libgcc2.c: Likewise.
* unwind-generic.h: Likewise.
libgfortran/
PR middle-end/108300
* intrinsics/sleep.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libgomp/
PR middle-end/108300
* config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
libiberty/
PR middle-end/108300
* make-temp-file.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* pex-win32.c: Likewise.
libssp/
PR middle-end/108300
* ssp.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
libstdc++-v3/
PR middle-end/108300
* src/c++11/system_error.cc: Define `WIN32_LEAN_AND_MEAN` before
<windows.h>.
* src/c++11/thread.cc: Likewise.
* src/c++17/fs_ops.cc: Likewise.
* src/filesystem/ops.cc: Likewise.
libvtv/
PR middle-end/108300
* vtv_malloc.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
* vtv_rts.cc: Likewise.
* vtv_utils.cc: Likewise.
|
|
2022 -> 2023
|
|
Broken by 9149a5b7e0a66b7b94d5b7db3194a975d18dea2f.
CC_NONE is defined by wingdi.h and conflicting with gcc.
Committed as obvious.
libgcc/:
* config/i386/gthr-win32.h: undef CC_NONE
Signed-off-by: Jonathan Yong <10walls@gmail.com>
|
|
On Darwin, GCC now uses a libgcc_s.1.1 for builtins and forwards the system
unwinder. We do, however, build a backwards compatibility libgcc_s.1.dylib.
However, this is not needed by GCC and can cause incorrect operation when
DYLD_LIBRARY_PATH is in use.
Since we do not need or use it during the build, the solution is to skip the
installation into the $build/gcc directory.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config/t-slibgcc-darwin (install-darwin-libgcc-stubs): Skip the
install of libgcc_s.1.dylib when the installation is into the build
gcc directory.
|
|
This reimplements the GNU threads library on native Windows (except for the
Objective-C specific subset) using direct Win32 API calls, in lieu of the
implementation based on semaphores. This base implementations requires
Windows XP/Server 2003, which was the default minimal setting of MinGW-W64
until end of 2020. This also adds the support required for the C++11 threads,
using again direct Win32 API calls; this additional layer requires Windows
Vista/Server 2008 and is enabled only if _WIN32_WINNT >= 0x0600.
This also changes libstdc++ to pass -D_WIN32_WINNT=0x0600 but only when the
switch --enable-libstdcxx-threads is passed, which means that C++11 threads
are still disabled by default *unless* MinGW-W64 itself is configured for
Windows Vista/Server 2008 or later by default (this has been the case in
the development version since end of 2020, for earlier versions you can
configure it --with-default-win32-winnt=0x0600 to get the same effect).
I only manually tested it on i686-w64-mingw32 and x86_64-w64-mingw32 but
AdaCore has used it in their C/C++/Ada compilers for 3 years now and the
30_threads chapter of the libstdc++ testsuite was clean at the time.
2022-10-31 Eric Botcazou <ebotcazou@adacore.com>
libgcc/
* config.host (i[34567]86-*-mingw*): Add thread fragment after EH one
as well as new i386/t-slibgcc-mingw fragment.
(x86_64-*-mingw*): Likewise.
* config/i386/gthr-win32.h: If _WIN32_WINNT is at least 0x0600, define
both __GTHREAD_HAS_COND and __GTHREADS_CXX0X to 1.
Error out if _GTHREAD_USE_MUTEX_TIMEDLOCK is 1.
Include stdlib.h instead of errno.h and do not include _mingw.h.
(CONST_CAST2): Add specific definition for C++.
(ATTRIBUTE_UNUSED): New macro.
(__UNUSED_PARAM): Delete.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_objc_data_tls): Use TLS_OUT_OF_INDEXES instead of (DWORD)-1.
(__gthread_objc_init_thread_system): Likewise.
(__gthread_objc_thread_get_data): Minor tweak.
(__gthread_objc_condition_allocate): Use ATTRIBUTE_UNUSED.
(__gthread_objc_condition_deallocate): Likewise.
(__gthread_objc_condition_wait): Likewise.
(__gthread_objc_condition_broadcast): Likewise.
(__gthread_objc_condition_signal): Likewise.
Include sys/time.h.
(__gthr_win32_DWORD): New typedef.
(__gthr_win32_HANDLE): Likewise.
(__gthr_win32_CRITICAL_SECTION): Likewise.
(__gthr_win32_CONDITION_VARIABLE): Likewise.
(__gthread_t): Adjust.
(__gthread_key_t): Likewise.
(__gthread_mutex_t): Likewise.
(__gthread_recursive_mutex_t): Likewise.
(__gthread_cond_t): New typedef.
(__gthread_time_t): Likewise.
(__GTHREAD_MUTEX_INIT_DEFAULT): Delete.
(__GTHREAD_RECURSIVE_MUTEX_INIT_DEFAULT): Likewise.
(__GTHREAD_COND_INIT_FUNCTION): Define.
(__GTHREAD_TIME_INIT): Likewise.
(__gthr_i486_lock_cmp_xchg): Delete.
(__gthr_win32_create): Declare.
(__gthr_win32_join): Likewise.
(__gthr_win32_self): Likewise.
(__gthr_win32_detach): Likewise.
(__gthr_win32_equal): Likewise.
(__gthr_win32_yield): Likewise.
(__gthr_win32_mutex_destroy): Likewise.
(__gthr_win32_cond_init_function): Likewise if __GTHREADS_HAS_COND is 1.
(__gthr_win32_cond_broadcast): Likewise.
(__gthr_win32_cond_signal): Likewise.
(__gthr_win32_cond_wait): Likewise.
(__gthr_win32_cond_timedwait): Likewise.
(__gthr_win32_recursive_mutex_init_function): Delete.
(__gthr_win32_recursive_mutex_lock): Likewise.
(__gthr_win32_recursive_mutex_unlock): Likewise.
(__gthr_win32_recursive_mutex_destroy): Likewise.
(__gthread_create): New inline function.
(__gthread_join): Likewise.
(__gthread_self): Likewise.
(__gthread_detach): Likewise.
(__gthread_equal): Likewise.
(__gthread_yield): Likewise.
(__gthread_cond_init_function): Likewise if __GTHREADS_HAS_COND is 1.
(__gthread_cond_broadcast): Likewise.
(__gthread_cond_signal): Likewise.
(__gthread_cond_wait): Likewise.
(__gthread_cond_timedwait): Likewise.
(__GTHREAD_WIN32_INLINE): New macro.
(__GTHREAD_WIN32_COND_INLINE): Likewise.
(__GTHREAD_WIN32_ACTIVE_P): Likewise.
Define WIN32_LEAN_AND_MEAN before including windows.h.
(__gthread_once): Minor tweaks.
(__gthread_key_create): Use ATTRIBUTE_UNUSED and TLS_OUT_OF_INDEXES.
(__gthread_key_delete): Minor tweak.
(__gthread_getspecific): Likewise.
(__gthread_setspecific): Likewise.
(__gthread_mutex_init_function): Reimplement.
(__gthread_mutex_destroy): Likewise.
(__gthread_mutex_lock): Likewise.
(__gthread_mutex_trylock): Likewise.
(__gthread_mutex_unlock): Likewise.
(__gthr_win32_abs_to_rel_time): Declare.
(__gthread_recursive_mutex_init_function): Reimplement.
(__gthread_recursive_mutex_destroy): Likewise.
(__gthread_recursive_mutex_lock): Likewise.
(__gthread_recursive_mutex_trylock): Likewise.
(__gthread_recursive_mutex_unlock): Likewise.
(__gthread_cond_destroy): New inline function.
(__gthread_cond_wait_recursive): Likewise.
* config/i386/gthr-win32.c: Delete everything.
Include gthr-win32.h to get the out-of-line version of inline routines.
Add compile-time checks for the local version of the Win32 types.
* config/i386/gthr-win32-cond.c: New file.
* config/i386/gthr-win32-thread.c: Likewise.
* config/i386/t-gthr-win32: Add config/i386/gthr-win32-thread.c to the
EH part, config/i386/gthr-win32-cond.c and config/i386/gthr-win32.c to
the static version of libgcc.
* config/i386/t-slibgcc-mingw: New file.
* config/i386/libgcc-mingw.ver: Likewise.
libstdc++-v3/
* acinclude.m4 (GLIBCXX_EXPORT_FLAGS): Substitute CPPFLAGS.
(GLIBCXX_ENABLE_LIBSTDCXX_TIME): Set ac_has_sched_yield and
ac_has_win32_sleep to yes for MinGW. Change HAVE_WIN32_SLEEP
into _GLIBCXX_USE_WIN32_SLEEP.
(GLIBCXX_CHECK_GTHREADS): Add _WIN32_THREADS to compilation flags for
Win32 threads and force _GTHREAD_USE_MUTEX_TIMEDLOCK to 0 for them.
Add -D_WIN32_WINNT=0x0600 to compilation flags if yes was configured
and add it to CPPFLAGS on success.
* config.h.in: Regenerate.
* configure: Likewise.
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_GET_NPROCS_WIN32):
Define to 1.
* config/os/mingw32/os_defines.h (_GLIBCXX_USE_GET_NPROCS_WIN32): Ditto
* src/c++11/thread.cc (get_nprocs): Provide Win32 implementation if
_GLIBCXX_USE_GET_NPROCS_WIN32 is defined. Replace HAVE_WIN32_SLEEP
with USE_WIN32_SLEEP.
* testsuite/19_diagnostics/headers/system_error/errc_std_c++0x.cc: Add
missing conditional compilation.
* testsuite/lib/libstdc++.exp (check_v3_target_sleep): Add support for
_GLIBCXX_USE_WIN32_SLEEP.
(check_v3_target_nprocs): Likewise for _GLIBCXX_USE_GET_NPROCS_WIN32.
Signed-off-by: Eric Botcazou <ebotcazou@adacore.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
|
|
libgcc/
* config/xtensa/xtensa-config-builtin.h (XCHAL_NUM_AREGS)
(XCHAL_ICACHE_SIZE, XCHAL_DCACHE_SIZE, XCHAL_ICACHE_LINESIZE)
(XCHAL_DCACHE_LINESIZE, XCHAL_MMU_MIN_PTE_PAGE_SIZE)
(XSHAL_ABI): Remove stray symbols from macro definitions.
|
|
Now that gcc provides __XCHAL_* definitions use them instead of XCHAL_*
definitions from the include/xtensa-config.h. That makes libgcc
dynamically configurable for the target xtensa core.
libgcc/
* config/xtensa/crti.S (xtensa-config.h): Replace #inlcude with
xtensa-config-builtin.h.
* config/xtensa/crtn.S: Likewise.
* config/xtensa/lib1funcs.S: Likewise.
* config/xtensa/lib2funcs.S: Likewise.
* config/xtensa/xtensa-config-builtin.h: New File.
|
|
'libobjc/thr.c' includes 'gthr.h'. While all the other gthread headers
have `#ifdef _LIBOBJC` checks, and provide a different set of inline
functions, I think having one header provide two completely unrelated
set of APIs is unsatisfactory, complicates maintenance, and hinders
further development.
This commit references a new header for libobjc, and adds a copyright
notice, as in other headers.
libgcc/ChangeLog:
* config/i386/gthr-mcf.h: Include 'gthr_libobjc.h' when building
libobjc, instead of 'gthr.h'
|
|
This patch adds the new thread model `mcf`, which implements mutexes
and condition variables with the mcfgthread library.
Source code for mcfgthread is available at <https://github.com/lhmouse/mcfgthread>.
config/ChangeLog:
* gthr.m4 (GCC_AC_THREAD_HEADER): Add new case for `mcf` thread
model
gcc/ChangeLog:
* config/i386/mingw-mcfgthread.h: New file
* config/i386/mingw32.h: Add builtin macro and default libraries
for mcfgthread when thread model is `mcf`
* config.gcc: Include 'i386/mingw-mcfgthread.h' when thread model
is `mcf`
* configure.ac: Recognize `mcf` as a valid thread model
* config.in: Regenerate
* configure: Regenerate
libatomic/ChangeLog:
* configure.tgt: Add new case for `mcf` thread model
libgcc/ChangeLog:
* config.host: Add new cases for `mcf` thread model
* config/i386/gthr-mcf.h: New file
* config/i386/t-mingw-mcfgthread: New file
* config/i386/t-slibgcc-cygming: Add mcfgthread for libgcc DLL
* configure: Regenerate
libstdc++-v3/ChangeLog:
* libsupc++/atexit_thread.cc (__cxa_thread_atexit): Use
implementation from mcfgthread if available
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_release,
__cxa_guard_abort): Use implementations from mcfgthread if
available
* configure: Regenerate
|
|
If shadow stack is enabled, when unwinding stack, we count how many stack
frames we pop to reach the landing pad and adjust shadow stack by the same
amount. When counting the stack frame, we compare the return address on
normal stack against the return address on shadow stack. If they don't
match, return _URC_FATAL_PHASE2_ERROR for the corrupted return address on
normal stack. Don't check the return address for
1. Non-catchable exception where exception_class == 0. Process will be
terminated.
2. Zero return address which marks the outermost stack frame.
3. Signal stack frame since kernel puts a restore token on shadow stack.
* unwind-generic.h (_Unwind_Frames_Increment): Add the EXC
argument.
* unwind.inc (_Unwind_RaiseException_Phase2): Pass EXC to
_Unwind_Frames_Increment.
(_Unwind_ForcedUnwind_Phase2): Likewise.
* config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
Take the EXC argument. Return _URC_FATAL_PHASE2_ERROR if the
return address on normal stack doesn't match the return address
on shadow stack.
|
|
Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later). Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).
2022-10-14 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.
|
|
Missed one spot in the r13-3108-g146e45914032 change (my sed script
didn't expect nested []s).
2022-10-07 Jakub Jelinek <jakub@redhat.com>
* config/arc/linux-unwind.h (arc_fallback_frame_state): Use
fs->regs.how[X] instead of fs->regs.reg[X].how.
|
|
area in uw_frame_state_for
The following patch implements something that has Florian found as
low hanging fruit in our unwinder and has been discussed in the
https://gcc.gnu.org/wiki/cauldron2022#cauldron2022talks.inprocess_unwinding_bof
talk.
_Unwind_FrameState type seems to be (unlike the pre-GCC 3 frame_state
which has been part of ABI) private to unwind-dw2.c + unwind.inc it
includes, it is always defined on the stack of some entrypoints, initialized
by static uw_frame_state_for and the address of it is also passed to other
static functions or the static inlines handling machine dependent unwinding,
but it isn't fortunately passed to any callbacks or public functions, so I
think we can safely change it any time we want.
Florian mentioned that the structure is large even on x86_64, 384 bytes
there, starts with 328 bytes long element with frame_state_reg_info type
which then starts with an array with __LIBGCC_DWARF_FRAME_REGISTERS__ + 1
elements, each of them is 16 bytes long, on x86_64
__LIBGCC_DWARF_FRAME_REGISTERS__ is just 17 but even that is big, on say
riscv __LIBGCC_DWARF_FRAME_REGISTERS__ is I think 128, on powerpc 111,
on sh 153 etc. And, we memset to zero the whole fs variable with the
_Unwind_FrameState type at the start of the unwinding.
The reason why each element is 16 byte (on 64-bit arches) is that it
contains some pointer or pointer sized integer and then an enum (with just
7 different enumerators) + padding.
The following patch decreases it by moving the enum into a separate
array and using just one byte for each register in that second array.
We could compress it even more, say 4 bits per register, but I don't
want to uglify the code for it too much and make the accesses slower.
Furthermore, the clearing of the object can clear only thos how array
and members after it, because REG_UNSAVED enumerator (0) doesn't actually
need any pointer or pointer sized integer, it is just the other kinds
that need to have there something.
By doing this, on x86_64 the above numbers change to _Unwind_FrameState
type being now 264 bytes long, frame_state_reg_info 208 bytes and we
don't clear the first 144 bytes of the object, so the memset is 120 bytes,
so ~ 31% of the old clearing size. On riscv 64-bit assuming it has same
structure layout rules for the few types used there that would be
~ 2160 bytes of _Unwind_FrameState type before and ~ 1264 bytes after,
with the memset previously ~ 2160 bytes and after ~ 232 bytes after.
We've also talked about possibly adding a number of initially initialized
regs and initializing the rest lazily, but at least for x86_64 with
18 elements in the array that doesn't seem to be worth it anymore,
especially because return address column is 16 there and that is usually the
first thing to be touched. It might theory help with lots of registers if
they are usually untouched, but would uglify and complicate any stores to
how by having to check there for the not initialized yet cases and lazy
initialization, and similarly for all reads of how to do there if below
last initialized one, use how, otherwise imply REG_UNSAVED.
The disadvantage of the patch is that touching reg[x].loc and how[x]
now means 2 cachelines rather than one as before, and I admit beyond
bootstrap/regtest I haven't benchmarked it in any way.
2022-10-06 Jakub Jelinek <jakub@redhat.com>
* unwind-dw2.h (REG_UNSAVED, REG_SAVED_OFFSET, REG_SAVED_REG,
REG_SAVED_EXP, REG_SAVED_VAL_OFFSET, REG_SAVED_VAL_EXP,
REG_UNDEFINED): New anonymous enum, moved from inside of
struct frame_state_reg_info.
(struct frame_state_reg_info): Remove reg[].how element and the
anonymous enum there. Add how element.
* unwind-dw2.c: Include stddef.h.
(uw_frame_state_for): Don't clear first
offsetof (_Unwind_FrameState, regs.how[0]) bytes of *fs.
(execute_cfa_program, __frame_state_for, uw_update_context_1,
uw_update_context): Use fs->regs.how[X] instead of fs->regs.reg[X].how
or fs.regs.how[X] instead of fs.regs.reg[X].how.
* config/sh/linux-unwind.h (sh_fallback_frame_state): Likewise.
* config/bfin/linux-unwind.h (bfin_fallback_frame_state): Likewise.
* config/pa/linux-unwind.h (pa32_fallback_frame_state): Likewise.
* config/pa/hpux-unwind.h (UPDATE_FS_FOR_SAR, UPDATE_FS_FOR_GR,
UPDATE_FS_FOR_FR, UPDATE_FS_FOR_PC, pa_fallback_frame_state):
Likewise.
* config/alpha/vms-unwind.h (alpha_vms_fallback_frame_state):
Likewise.
* config/alpha/linux-unwind.h (alpha_fallback_frame_state): Likewise.
* config/arc/linux-unwind.h (arc_fallback_frame_state,
arc_frob_update_context): Likewise.
* config/riscv/linux-unwind.h (riscv_fallback_frame_state): Likewise.
* config/nios2/linux-unwind.h (NIOS2_REG): Likewise.
* config/nds32/linux-unwind.h (NDS32_PUT_FS_REG): Likewise.
* config/s390/tpf-unwind.h (s390_fallback_frame_state): Likewise.
* config/s390/linux-unwind.h (s390_fallback_frame_state): Likewise.
* config/sparc/sol2-unwind.h (sparc64_frob_update_context,
MD_FALLBACK_FRAME_STATE_FOR): Likewise.
* config/sparc/linux-unwind.h (sparc64_fallback_frame_state,
sparc64_frob_update_context, sparc_fallback_frame_state): Likewise.
* config/i386/sol2-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/w32-unwind.h (i386_w32_fallback_frame_state): Likewise.
* config/i386/linux-unwind.h (x86_64_fallback_frame_state,
x86_fallback_frame_state): Likewise.
* config/i386/freebsd-unwind.h (x86_64_freebsd_fallback_frame_state):
Likewise.
* config/i386/dragonfly-unwind.h
(x86_64_dragonfly_fallback_frame_state): Likewise.
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Likewise.
* config/csky/linux-unwind.h (csky_fallback_frame_state): Likewise.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Likewise.
* config/aarch64/freebsd-unwind.h
(aarch64_freebsd_fallback_frame_state): Likewise.
* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
Likewise.
* config/or1k/linux-unwind.h (or1k_fallback_frame_state): Likewise.
* config/mips/linux-unwind.h (mips_fallback_frame_state): Likewise.
* config/loongarch/linux-unwind.h (loongarch_fallback_frame_state):
Likewise.
* config/m68k/linux-unwind.h (m68k_fallback_frame_state): Likewise.
* config/xtensa/linux-unwind.h (xtensa_fallback_frame_state):
Likewise.
* config/rs6000/darwin-fallback.c (set_offset): Likewise.
* config/rs6000/aix-unwind.h (MD_FROB_UPDATE_CONTEXT): Likewise.
* config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Likewise.
* config/rs6000/freebsd-unwind.h (frob_update_context): Likewise.
|
|
Investigating the reasons for libgcc build failures in a canadian
context, orthogonally to the recent update of vxcrtstuff, exposed
interesting differences in the way include search paths are managed
between a regular Linux->VxWorks cross build and a canadian setup
building a Windows->VxWorks toolchain in a Linux environment.
This change augments the comment attached to LIBGCC2_INCLUDE in
libgcc/config/t-vxworks to better describe the parameters at play.
It also adjusts the addition of options for gcc/include and
gcc/include-fixed to minimize the actual differences for libgcc
in the two kinds of configurations.
2022-03-06 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/t-vxworks (LIBGCC2_INCLUDE): Augment comment. Move
-I options for gcc/include and gcc/include-fixed at the end
and make them -isystem.
|
|
Within gthr-vxworks.h, we prevent C++ errors from missing
declarations in some system headers by prepending their inclusion
with a
#pragma GCC diagnostic ignored "-Wstrict-prototypes"
But Wstrict-prototypes is internally registered as valid for
C/ObjC only, not C++, and this trick in turn triggers a Wpragma
warning with -Wsystem-headers.
This change just arranges to ignore the secondary warning locally.
2021-02-03 Olivier Hainque <hainque@adacore.com>
* config/gthr-vxworks.h: Prevent Wpragma warning for the
pragma diagnostics on Wstrict-prototypes.
|
|
This change augments the comment attached to the use of auto-host.h
in vxcrtstuff.c to better describe the reason for including it and
for the associated series of #undef directives.
It also augments the comment on dso_handle and removes a redundant
guard on HAVE_INITFINI_ARRAY_SUPPORT for the shared version of the
objects, nested within a section guarded on USE_INITFINI_ARRAY.
2022-09-29 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/vxcrtstuff.c: Improve the comment attached to the use
of auto-host.h and of __dso_handle. Remove redundant guard on
HAVE_INITFINI_ARRAY_SUPPORT within a USE_INITFINI_ARRAY section.
|
|
|
|
this patch fixed PR target/99184 which incorrectly rounded during 64-bit
(long) double to 16-bit and 32-bit integers.
The patch just removes the respective roundings from
libf7-asm.sx::to_integer and ::to_unsigned. Luckily, LibF7 does nowhere
use respective functions internally, the only user is in libf7.c::f7_exp
which reads
f7_round (qq, qq);
int16_t q = f7_get_s16 (qq);
so that f7_get_s16() operates on an already rounded value, and therefore
this code works unaltered with or without rounding in to_integer.
PR target/99184
libgcc/config/avr/libf7/
* libf7-asm.sx (to_integer, to_unsigned): Don't round 16-bit
and 32-bit integers.
|
|
contrib/ChangeLog:
* config-list.mk: Remove cr16.
gcc/ChangeLog:
* doc/extend.texi: Remove cr16 related stuff.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/md.texi: Likewise.
* function-tests.cc (test_expansion_to_rtl): Likewise.
* common/config/cr16/cr16-common.cc: Removed.
* config/cr16/constraints.md: Removed.
* config/cr16/cr16-protos.h: Removed.
* config/cr16/cr16.cc: Removed.
* config/cr16/cr16.h: Removed.
* config/cr16/cr16.md: Removed.
* config/cr16/cr16.opt: Removed.
* config/cr16/predicates.md: Removed.
* config/cr16/t-cr16: Removed.
libgcc/ChangeLog:
* config.host: Remove cr16 related stuff.
* config/cr16/crti.S: Removed.
* config/cr16/crtlibid.S: Removed.
* config/cr16/crtn.S: Removed.
* config/cr16/divmodhi3.c: Removed.
* config/cr16/lib1funcs.S: Removed.
* config/cr16/t-cr16: Removed.
* config/cr16/t-crtlibid: Removed.
* config/cr16/unwind-cr16.c: Removed.
* config/cr16/unwind-dw2.h: Removed.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Remove cr16 related stuff.
|
|
RISC-V decide use _Float16 as primary IEEE half precision type, and this
already become part of psABI, this patch has added folloing support for
_Float16:
- Soft-float support for _Float16.
- Make sure _Float16 available on C++ mode.
- Name mangling for _Float16 on C++ mode.
gcc/ChangeLog
* config/riscv/riscv-builtins.cc: include stringpool.h
(riscv_float16_type_node): New.
(riscv_init_builtin_types): Ditto.
(riscv_init_builtins): Call riscv_init_builtin_types.
* config/riscv/riscv-modes.def (HF): New.
* config/riscv/riscv.cc (riscv_output_move): Handle HFmode.
(riscv_mangle_type): New.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_excess_precision): Ditto.
(riscv_floatn_mode): Ditto.
(riscv_init_libfuncs): Ditto.
(TARGET_MANGLE_TYPE): Ditto.
(TARGET_SCALAR_MODE_SUPPORTED_P): Ditto.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Ditto.
(TARGET_INIT_LIBFUNCS): Ditto.
(TARGET_C_EXCESS_PRECISION): Ditto.
(TARGET_FLOATN_MODE): Ditto.
* config/riscv/riscv.md (mode): Add HF.
(softload): Add HF.
(softstore): Ditto.
(fmt): Ditto.
(UNITMODE): Ditto.
(movhf): New.
(*movhf_softfloat): New.
libgcc/ChangeLog:
* config/riscv/sfp-machine.h (_FP_NANFRAC_H): New.
(_FP_NANFRAC_H): Ditto.
(_FP_NANSIGN_H): Ditto.
* config/riscv/t-softfp32 (softfp_extensions): Add HF related
routines.
(softfp_truncations): Ditto.
(softfp_extras): Ditto.
* config/riscv/t-softfp64 (softfp_extras): Add HF related routines.
gcc/testsuite/ChangeLog:
* g++.target/riscv/_Float16.C: New.
* gcc.target/riscv/_Float16-soft-1.c: Ditto.
* gcc.target/riscv/_Float16-soft-2.c: Ditto.
* gcc.target/riscv/_Float16-soft-3.c: Ditto.
* gcc.target/riscv/_Float16-soft-4.c: Ditto.
* gcc.target/riscv/_Float16.c: Ditto.
|
|
The ARC soft udivmodsi4 algorithm and as well as using umodsi3
for reduced register set configurations are wrong.
libgcc/
* config/arc/lib2funcs.c (udivmodsi4): Update AND mask.
* config/arc/lib1funcs.S (umodsi3): Don't use it for RF16
configurations.
|
|
/
* MAINTAINERS: Remove tilegx and tilepro entries.
* configure.ac: Remove tilegx and tilepro stanzas.
* configure: Rebuilt.
contrib/
* config-list.mk: Remove tilegx and tilepro entries.
* gcc_update: Remove tilegx and tilepro entries.
gcc/
* common/config/tilegx/tilegx-common.cc: Removed.
* common/config/tilepro/tilepro-common.cc: Removed.
* config.gcc: Remove tilegx and tilepro entries.
* config/tilegx/constraints.md: Removed.
* config/tilegx/feedback.h: Removed.
* config/tilegx/linux.h: Removed.
* config/tilegx/mul-tables.cc: Removed.
* config/tilegx/predicates.md: Removed.
* config/tilegx/sync.md: Removed.
* config/tilegx/t-tilegx: Removed.
* config/tilegx/tilegx-builtins.h: Removed.
* config/tilegx/tilegx-c.cc: Removed.
* config/tilegx/tilegx-generic.md: Removed.
* config/tilegx/tilegx-modes.def: Removed.
* config/tilegx/tilegx-multiply.h: Removed.
* config/tilegx/tilegx-opts.h: Removed.
* config/tilegx/tilegx-protos.h: Removed.
* config/tilegx/tilegx.cc: Removed.
* config/tilegx/tilegx.h: Removed.
* config/tilegx/tilegx.md: Removed.
* config/tilegx/tilegx.opt: Removed.
* config/tilepro/constraints.md: Removed.
* config/tilepro/feedback.h: Removed.
* config/tilepro/gen-mul-tables.cc: Removed.
* config/tilepro/linux.h: Removed.
* config/tilepro/mul-tables.cc: Removed.
* config/tilepro/predicates.md: Removed.
* config/tilepro/t-tilepro: Removed.
* config/tilepro/tilepro-builtins.h: Removed.
* config/tilepro/tilepro-c.cc: Removed.
* config/tilepro/tilepro-generic.md: Removed.
* config/tilepro/tilepro-modes.def: Removed.
* config/tilepro/tilepro-multiply.h: Removed.
* config/tilepro/tilepro-protos.h: Removed.
* config/tilepro/tilepro.cc: Removed.
* config/tilepro/tilepro.h: Removed.
* config/tilepro/tilepro.md: Removed.
* config/tilepro/tilepro.opt: Removed.
* configure.ac: Remove tilegx and tilepro entries.
* configure: Rebuilt.
* doc/extend.texi: Remove tilegx and tilepro entries.
* doc/install.texi: Remove tilegx and tilepro entries.
* doc/invoke.texi: Remove tilegx and tilepro entries.
* doc/md.texi: Remove tilegx and tilepro entries.
gcc/testsuite/
* gcc.dg/lower-subreg-1.c: Remove tilegx and tilepro entries.
* gcc.misc-tests/linkage.exp: Remove tilegx and
tilepro entries.
libgcc/
* config.host: Removed tilegx and tilepro entries.
* config/tilegx/sfp-machine.h: Removed.
* config/tilegx/sfp-machine32.h: Removed.
* config/tilegx/sfp-machine64.h: Removed.
* config/tilegx/t-crtstuff: Removed.
* config/tilegx/t-softfp: Removed.
* config/tilegx/t-tilegx: Removed.
* config/tilepro/atomic.c: Removed.
* config/tilepro/atomic.h: Removed.
* config/tilepro/linux-unwind.h: Removed.
* config/tilepro/sfp-machine.h: Removed.
* config/tilepro/softdivide.c: Removed.
* config/tilepro/softmpy.S: Removed.
* config/tilepro/t-crtstuff: Removed.
* config/tilepro/t-tilepro: Removed.
|
|
> (clrsb:m x)
> Represents the number of redundant leading sign bits in x, represented
> as an integer of mode m, starting at the most significant bit position.
This explanation is just what the NSA instruction (not ever emitted before)
calculates in Xtensa ISA.
gcc/ChangeLog:
* config/xtensa/xtensa.md (clrsbsi2): New insn pattern.
libgcc/ChangeLog:
* config/xtensa/lib1funcs.S (__clrsbsi2): New function.
* config/xtensa/t-xtensa (LIB1ASMFUNCS): Add _clrsbsi2.
|