Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
This resolves GCN:
ld: error: undefined symbol: _Unwind_RaiseException
>>> referenced by eh_throw.cc:93 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:93)
>>> eh_throw.o:(__cxa_throw) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and/or:
ld: error: undefined symbol: _Unwind_Resume_or_Rethrow
>>> referenced by eh_throw.cc:129 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:129)
>>> eh_throw.o:(__cxa_rethrow) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_RaiseException
collect2: error: ld returned 1 exit status
..., or:
unresolved symbol _Unwind_Resume_or_Rethrow
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~25 'check-gcc-c++',
and ~10 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or some 'FAIL: [...] execution test' where these test cases now FAIL when
attempting to use these interfaces, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): New.
* config/nvptx/unwind-nvptx.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): Likewise.
|
|
This resolves GCN:
ld: error: undefined symbol: _Unwind_DeleteException
>>> referenced by eh_catch.cc:109 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_catch.cc:109)
>>> eh_catch.o:(__cxa_end_catch) in archive [...]/build-gcc/amdgcn-amdhsa/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~100 'check-gcc-c++',
and ~500 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases now
FAIL for unrelated reasons, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_DeleteException): New.
* config/nvptx/unwind-nvptx.c (_Unwind_DeleteException): Likewise.
|
|
|
|
Follow-up to commit 1146410c0feb0e82c689b1333fdf530a2b34dc2b
"nvptx: Support '-mfake-ptx-alloca'". '-mfake-ptx-alloca' is applicable only
for configurations where PTX 'alloca' is not supported, where target libraries
are built with it enabled (that is, libstdc++, libgfortran).
This change progresses:
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C -std=gnu++98: exception handling not supported
..., and "enables" a few test cases:
FAIL: g++.old-deja/g++.other/sibcall1.C -std=gnu++17 (test for excess errors)
[Etc.]
FAIL: g++.old-deja/g++.other/unchanging1.C -std=gnu++17 (test for excess errors)
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAIL due to:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
Most importantly, it progresses ~830 libstdc++ test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases also
FAIL in configurations where PTX 'alloca' is supported, or ~120 instances of
'FAIL: [...] execution test' due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
This change also resolves the cases noted in
commit bac2d8a246892334e24dfa7d62be0cd0648c5606
"nvptx: Build libgfortran with '-mfake-ptx-alloca' [PR107635]":
| With '-mfake-ptx-alloca', libgfortran again succeeds to build, and compared
| to before, we've got only a small number of regressions due to nvptx 'ld'
| complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':
|
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
..., and further progresses:
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAILs due to:
error : Prototype doesn't match for '_gfortran_caf_transfer_between_remotes' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
error : Prototype doesn't match for '_gfortran_caf_stop_numeric' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
gcc/
* config/nvptx/nvptx.opt (-mfake-ptx-alloca): Update.
gcc/testsuite/
* gcc.target/nvptx/alloca-2-O0_-mfake-ptx-alloca.c: Adjust.
libgcc/
* config/nvptx/alloca.c: New.
* config/nvptx/t-nvptx (LIB2ADD): Add it.
|
|
When MUL is not available, then the __umulhisi3 and __mulhisi3
functions can use __mulhisi3_helper. This improves code size,
stack footprint and runtime on AVRrc.
libgcc/
* config/avr/lib1funcs.S (__mulhisi3, __umulhisi3): Use
__mulhisi3_helper for better performance on AVRrc.
|
|
|
|
With some minor changes, 8-bit and 16-bit fixed-point operations
can be supported on the reduced core.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add (and remove from
FUNCS_notiny): _mulhisi3, _umulhisi3, _mulqq3, _mulhq3, _muluhq3,
_mulha3, _muluha3 _muluha3_round, _usmuluha3, _ssmulha3,
_divqq3, _udivuqq3, _divqq_helper, _divhq3, _udivuhq3.
_divha3 _udivuha3, _ssneg_2, _ssabs_1, _ssabs_2,
_mask1, _ret, _roundqq3 _rounduqq3,
_round_s2, _round_u2, _round_2_const, _addmask_2.
* config/avr/lib1funcs.S (__umulhisi3, __mulhisi3): Make
work on AVRrc.
* config/avr/lib1funcs-fixed.S: Build 8-bit and 16-bit functions
on AVRrc, too.
|
|
|
|
__umulhisi3 had an "rcall 1f" to save 6 bytes, which is an unreasonable
size gain vs. cycle cost. Just use the same code on all devices with MUL,
irrespective of program memory size.
libgcc/
* config/avr/lib1funcs.S (__umulhisi3) [Have MUL]: Reduce call
depth by 1.
|
|
|
|
There are many objects / functions that are not available on AVRrc,
the reduced core. The old way to exclude some objects for AVRrc
did not work properly since it tested for MULTIFLAGS.
This does not work for, say MULTIFLAGS = "-mmcu=avrtiny -mdouble=64".
This patch uses $(findstring avrtiny,$(MULTIDIR)) in the condition.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS, LIB2FUNCS_EXCLUDE):
Properly handle avrtiny.
libgcc/config/avr/libf7/
* t-libf7 (libgcc-objects): Only add objects when building
for non-AVRrc.
|
|
|
|
static local variables support"
GCN, nvptx now has libstdc++-v3/libsupc++ proper.
This reverts commit c0bf7ea189ecf252152fe15134f70f576bcd20b2.
|
|
|
|
Change AArch64 cpuinfo to follow the latest updates to the FMV spec [1]:
Remove FEAT_PREDRES and FEAT_LS64*. Preserve the ordering in enum CPUFeatures.
[1] https://github.com/ARM-software/acle/pull/382
gcc:
* common/config/aarch64/cpuinfo.h: Remove FEAT_PREDRES and FEAT_LS64*.
* config/aarch64/aarch64-option-extensions.def: Remove FMV support
for PREDRES.
libgcc:
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Remove FEAT_PREDRES and FEAT_LS64* support.
|
|
|
|
The following testcase shows a bug in unwind-dw2-btree.h.
In short, the header provides lock-free btree data structure (so no parent
link on nodes, both insertion and deletion are done in top-down walks
with some locking of just a few nodes at a time so that lookups can notice
concurrent modifications and retry, non-leaf (inner) nodes contain keys
which are initially the base address of the left-most leaf entry of the
following child (or all ones if there is none) minus one, insertion ensures
balancing of the tree to ensure [d/2, d] entries filled through aggressive
splitting if it sees a full tree while walking, deletion performs various
operations like merging neighbour trees, merging into parent or moving some
nodes from neighbour to the current one).
What differs from the textbook implementations is mostly that the leaf nodes
don't include just address as a key, but address range, address + size
(where we don't insert any ranges with zero size) and the lookups can be
performed for any address in the [address, address + size) range. The keys
on inner nodes are still just address-1, so the child covers all nodes
where addr <= key unless it is covered already in children to the left.
The user (static executables or JIT) should always ensure there is no
overlap in between any of the ranges.
In the testcase a bunch of insertions are done, always followed by one
removal, followed by one insertion of a range slightly different from the
removed one. E.g. in the first case [&code[0x50], &code[0x59]] range
is removed and then we insert [&code[0x4c], &code[0x53]] range instead.
This is valid, it doesn't overlap anything. But the problem is that some
non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions
completely correctly). On removal, nothing adjusts the keys on the parent
nodes (it really can't in the top-down only walk, the keys could be many nodes
above it and unlike insertion, removal only knows the start address, doesn't
know the removed size and so will discover it only when reaching the leaf
node which contains it; plus even if it knew the address and size, it still
doesn't know what the second left-most leaf node will be (i.e. the one after
removal)). And on insertion, if nodes aren't split at a level, nothing
adjusts the inner keys either. If a range is inserted and is either fully
bellow key (keys are - 1, so having address + size - 1 being equal to key is
fine) or fully after key (i.e. address > key), it works just fine, but if
the key is in a middle of the range like in this case, &code[0x4f] is in the
middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine
(we only use size on the leaf nodes), and lookup of the addresses below
the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed).
The problem is with lookups after the key (i.e. [&code[0x50, &code[0x53]]),
the lookup looks for them in different children of the btree and doesn't
find an entry and returns NULL.
As users need to ensure non-overlapping entries at any time, the following
patch fixes it by adjusting keys during insertion where we know not just
the address but also size; if we find during the top-down walk a key
which is in the middle of the range being inserted, we simply increase the
key to be equal to address + size - 1 of the range being inserted.
There can't be any existing leaf nodes overlapping the range in correct
programs and the btree rebalancing done on deletion ensures we don't have
any empty nodes which would also cause problems.
The patch adjusts the keys in two spots, once for the current node being
walked (the last hunk in the header, with large comment trying to explain
it) and once during inner node splitting in a parent node if we'd otherwise
try to add that key in the middle of the range being inserted into the
parent node (in that case it would be missed in the last hunk).
The testcase covers both of those spots, so succeeds with GCC 12 (which
didn't have btrees) and fails with vanilla GCC trunk and also fails if
either the
if (fence < base + size - 1)
fence = iter->content.children[slot].separator = base + size - 1;
or
if (left_fence >= target && left_fence < target + size - 1)
left_fence = target + size - 1;
hunk is removed (of course, only with the current node sizes, i.e. up to
15 children of inner nodes and up to 10 entries in leaf nodes).
2025-03-10 Jakub Jelinek <jakub@redhat.com>
Michael Leuchtenburg <michael@slashhome.org>
PR libgcc/119151
* unwind-dw2-btree.h (btree_split_inner): Add size argument. If
left_fence is in the middle of [target,target + size - 1] range,
increase it to target + size - 1.
(btree_insert): Adjust btree_split_inner caller. If fence is smaller
than base + size - 1, increase it and separator of the slot to
base + size - 1.
* gcc.dg/pr119151.c: New test.
|
|
Studying unwind-dw2-btree.h was really hard for me because
the formatting is wrong or weird in many ways all around the code
and that kept distracting my attention.
That includes all kinds of things, including wrong indentation, using
{} around single statement substatements, excessive use of ()s around
some parts of expressions which don't increase code clarity, no space
after dot in comments, some comments not starting with capital letters,
some not ending with dot, adding {} around some parts of code without
any obvious reason (and when it isn't done in a similar neighboring
function) or ( at the end of line without any reason.
The following patch fixes the formatting issues I found, no functional
changes.
2025-03-10 Jakub Jelinek <jakub@redhat.com>
* unwind-dw2-btree.h: Formatting fixes.
|
|
|
|
implementation instead of an external one
When INT_TYPE_SIZE < BITS_PER_WORD gcc emits a call to an external ffs()
implementation instead of a call to "__builtin_ffs()" – see function
init_optabs() in <SRCROOT>/gcc/optabs-libfuncs.cc. External ffs()
(which is usually the one from newlib) in turn calls __builtin_ffs()
what causes infinite recursion and stack overflow. This patch overrides
default gcc bahaviour for H8/300H (and newer) and provides a generic
ffs() implementation for HImode.
PR target/114222
gcc/ChangeLog:
* config/h8300/h8300.cc (h8300_init_libfuncs): For HImode override
calls to external ffs() (from newlib) with calls to __ffshi2() from
libgcc. The implementation of ffs() in newlib calls __builtin_ffs()
what causes infinite recursion and finally a stack overflow.
libgcc/ChangeLog:
* config/h8300/t-h8300: Add __ffshi2().
* config/h8300/ffshi2.c: New file.
|
|
|
|
When gcc is built for x86_64-linux-musl target, stack unwinding from
within signal handler stops at the innermost signal frame. The reason
for this behaviro is that the signal trampoline is not accompanied with
appropiate CFI directives, and the fallback path in libgcc to recognize
it by the code sequence is only enabled for glibc except 2.0. The
latter is motivated by the lack of sys/ucontext.h in that glibc version.
Given that all relevant libc-s ship sys/ucontext.h for over a decade,
and that other arches aren't shy of unconditionally using it, follow
suit and remove the preprocessor condition, too.
libgcc/ChangeLog:
* config/i386/linux-unwind.h: Remove preprocessor
condition to enable fallback path for all libc-s.
Signed-off-by: Roman Kagan <rkagan@amazon.de>
|
|
|
|
[PR118844].
Due to the presence of R_LARCH_B26 in
/usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing
range is [PC-128MiB, PC+128MiB-4]. This means that when the code
segment size exceeds 128MB, linking with lld will definitely fail
(ld will not fail because the order of the two is different).
The linking order:
lld: crtbeginS.o + .text + .plt
ld : .plt + crtbeginS.o + .text
To solve this issue, add '-mcmodel=extreme' when compiling crtbeginS.o.
PR target/118844
libgcc/ChangeLog:
* config/loongarch/t-crtstuff: Add '-mcmodel=extreme'
to CRTSTUFF_T_CFLAGS_S.
|
|
|
|
As discussed from RISC-V C-API PR #101 [1], As discussed in #96, current
interface is insufficient to support some cases, like a vendor buying a
CPU IP from the upstream vendor but using their own mvendorid and custom
features from the upstream vendor. In this case, we might need to add
these extensions for each downstream vendor many times. Thus, making
__riscv_vendor_feature_bits guarded by mvendorid is not a good idea. So,
drop __riscv_vendor_feature_bits for now, and we should have time to
discuss a better solution.
[1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/101
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv-feature-bits.h (RISCV_VENDOR_FEATURE_BITS_LENGTH): Drop.
(struct riscv_vendor_feature_bits): Drop.
libgcc/ChangeLog:
* config/riscv/feature_bits.c (RISCV_VENDOR_FEATURE_BITS_LENGTH): Drop.
(__init_riscv_features_bits_linux): Drop.
|
|
|
|
Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise it could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.
libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
|
|
|
|
2025-02-07 Peter Bergner <bergner@linux.ibm.com>
libgcc/
PR target/117674
* config/rs6000/linux-unwind.h (ppc_backchain_fallback): Add cast to
avoid comparison between pointer and integer warning.
|
|
|
|
This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.
gcc/
* config/avr/builtins.def (STRLEN_FLASH, STRLEN_FLASHX)
(STRLEN_MEMX): New DEF_BUILTIN's.
* config/avr/avr.cc (avr_ftype_strlen): New static function.
(avr_builtin_supported_p): New built-ins are not for AVR_TINY.
(avr_init_builtins) <strlen_flash_node, strlen_flashx_node,
strlen_memx_node>: Provide new fntypes.
(avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH]
[AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if
possible.
* doc/extend.texi (AVR Built-in Functions): Document
__builtin_avr_strlen_flash, __builtin_avr_strlen_flashx,
__builtin_avr_strlen_memx.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx.
* config/avr/lib1funcs.S <L_strlen_memx, __strlen_memx>: Implement.
|
|
|
|
The arm-none-eabi port provides some alternative implementations of
__sync_synchronize for different implementations of the architecture.
These can be selected using one of -specs=sync-{none,dmb,cp15dmb}.specs.
These specs fragments fail, however, when LTO is used because they
unconditionally add a --defsym=__sync_synchronize=<implementation> to
the linker arguments and that fails if libgcc is not added to the list
of libraries.
Fix this by only adding the defsym if libgcc will be passed to the
linker.
libgcc/
PR target/118642
* config/arm/sync-none.specs (link): Only add the defsym if
libgcc will be used.
* config/arm/sync-dmb.specs: Likewise.
* config/arm/sync-cp15dmb.specs: Likewise.
|
|
|
|
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_file_end): Add .note.gnu.property.
libgcc/ChangeLog:
* config/riscv/crti.S: Add lpad instructions.
* config/riscv/crtn.S: Likewise.
* config/riscv/save-restore.S: Likewise.
* config/riscv/riscv-asm.h: Add GNU_PROPERTY for ZICFILP,
ZICFISS.
Co-Developed-by: Jesse Huang <jesse.huang@sifive.com>
|
|
This patch is implemented according to the RISC-V CFI specification.
It supports the generation of shadow stack instructions in the prologue,
epilogue, non-local gotos, and unwinding.
RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFISS ISA string.
* config/riscv/predicates.md: New predicate x1x5_operand.
* config/riscv/riscv.cc
(riscv_expand_prologue): Insert shadow stack instructions.
(riscv_expand_epilogue): Likewise.
(riscv_for_each_saved_reg): Assign t0 or ra register for
sspopchk instruction.
(need_shadow_stack_push_pop_p): New function. Omit shadow
stack operation on leaf function.
* config/riscv/riscv.h
(need_shadow_stack_push_pop_p): Define.
* config/riscv/riscv.md: Add shadow stack patterns.
(save_stack_nonlocal): Add shadow stack instructions for setjump.
(restore_stack_nonlocal): Add shadow stack instructions for longjump.
* config/riscv/riscv.opt (TARGET_ZICFISS): Define.
libgcc/ChangeLog:
* config/riscv/linux-unwind.h: Include shadow-stack-unwind.h.
* config/riscv/shadow-stack-unwind.h
(_Unwind_Frames_Extra): Define.
(_Unwind_Frames_Increment): Define.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/ssp-1.c: New test.
* gcc.target/riscv/ssp-2.c: New test.
Co-Developed-by: Greg McGary <gkm@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>
|
|
|
|
Fix __extenddfxf2:
* Remove bogus denorm handling block which would never execute --
the converted exp value is always positive as EXCESSX > EXCESSD.
* Compute the whole significand in dl instead of doing part of it in
ldl.
* Mask off exponent from dl.l.upper so the denorm shift test
works.
* Insert the hidden one bit into dl.l.upper as needed.
Fix __truncxfdf2 denorm handling. All that is required is to shift the
significand right by the correct amount; it already has all of the
necessary bits set including the explicit one. Compute the shift
amount, then perform the wide shift across both elements of the
significand.
Fix __fixxfsi:
* The value was off by a factor of two as the significand contains
32 bits, not 31 so we need to shift by one more than the equivalent
code in __fixdfsi.
* Simplify the code having realized that the lower 32 bits of the
significand can never appear in the results.
Return positive qNaN instead of negative. For floats, qNaN is 0x7fff_ffff. For
doubles, qNaN is 0x7fff_ffff_ffff_ffff.
Return correctly signed zero on float and double divide underflow. This means
that Ld$underflow now expects d7 to contain the sign bit, just like the other
return paths.
libgcc/
* config/m68k/fpgnulib.c (extenddfxf2): Simplify code by removing code
that should never execute. Fix denorm shift test and insert hidden bit
as needed.
(__truncxfdf2): Properly compue and shift the significant right.
* config/m68k/lb1sf68.S (__fixxfsi): Correct shift counts and simplify.
(QUIET_NAN): Make it a positive quiet NaN and fix return values to inject
sign properly.
|
|
|
|
In the OpenRISC build we get the following warning:
ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
Fix this by adding a .note.GNU-stack to indicate the stack does not need to be
executable for the lib1funcs.
Note, this is also needed for the upcoming glibc 2.41.
libgcc/
* config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.
|
|
|
|
2024 -> 2025
|
|
|
|
This patch adds __flashx as a new named address space that allocates
objects in .progmemx.data. The handling is mostly the same or similar
to that of 24-bit space __memx, except that the asm routines are
simpler and more efficient. Loads are emit inline when ELPMX or
LPMX is available. The address space uses a 24-bit addresses even
on devices with a program memory size of 64 KiB or less.
PR target/118001
gcc/
* doc/extend.texi (AVR Named Address Spaces): Document __flashx.
* config/avr/avr.h (ADDR_SPACE_FLASHX): New enum value.
* config/avr/avr-protos.h (avr_out_fload, avr_mem_flashx_p)
(avr_fload_libgcc_p, avr_load_libgcc_mem_p)
(avr_load_libgcc_insn_p): New.
* config/avr/avr.cc (avr_addrspace): Add ADDR_SPACE_FLASHX.
(avr_decl_flashx_p, avr_mem_flashx_p, avr_fload_libgcc_p)
(avr_load_libgcc_mem_p, avr_load_libgcc_insn_p, avr_out_fload):
New functions.
(avr_adjust_insn_length) [ADJUST_LEN_FLOAD]: Handle case.
(avr_progmem_p) [avr_decl_flashx_p]: return 2.
(avr_addr_space_legitimate_address_p) [ADDR_SPACE_FLASHX]:
Has same behavior like ADDR_SPACE_MEMX.
(avr_addr_space_convert): Use pointer sizes rather then ASes.
(avr_addr_space_contains): New function.
(avr_convert_to_type): Use it.
(avr_emit_cpymemhi): Handle ADDR_SPACE_FLASHX.
* config/avr/avr.md (adjust_len) <fload>: New attr value.
(gen_load<mode>_libgcc): Renamed from load<mode>_libgcc.
(xload8<mode>_A): Iterate over MOVMODE rather than over ALL1.
(fxmov<mode>_A): New from xloadv<mode>_A.
(xmov<mode>_8): New from xload<mode>_A.
(fmov<mode>): New insns.
(fxload<mode>_A): New from xload<mode>_A.
(fxload_<mode>_libgcc): New from xload_<mode>_libgcc.
(*fxload_<mode>_libgcc): New from *xload_<mode>_libgcc.
(mov<mode>) [avr_mem_flashx_p]: Hande ADDR_SPACE_FLASHX.
(cpymemx_<mode>): Make sure the address space is not lost
when splitting.
(*cpymemx_<mode>) [ADDR_SPACE_FLASHX]: Use __movmemf_<mode> for asm.
(*ashlqi.1.zextpsi_split): New combine pattern.
* config/avr/predicates.md (nox_general_operand): Don't match
when avr_mem_flashx_p is true.
* config/avr/avr-passes.cc (AVR_LdSt_Props):
ADDR_SPACE_FLASHX has no post_inc.
gcc/testsuite/
* gcc.target/avr/torture/addr-space-1.h [AVR_HAVE_ELPM]:
Use a function to bump .progmemx.data to a high address.
* gcc.target/avr/torture/addr-space-2.h: Same.
* gcc.target/avr/torture/addr-space-1-fx.c: New test.
* gcc.target/avr/torture/addr-space-2-fx.c: New test.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _fload_1, _fload_2,
_fload_3, _fload_4, _movmemf.
* config/avr/lib1funcs.S (.branch_plus): New .macro.
(__xload_1, __xload_2, __xload_3, __xload_4): When the address is
located in flash, then forward to...
(__fload_1, __fload_2, __fload_3, __fload_4): ...these new
functions, respectively.
(__movmemx_hi): When the address is located in flash, forward to...
(__movmemf_hi): ...this new function.
|
|
|
|
Unlike crtoffload{begin,end}.o which just define some symbols at the start/end
of the various .gnu.offload* sections, crtoffloadtable.o contains
const void *const __OFFLOAD_TABLE__[]
__attribute__ ((__visibility__ ("hidden"))) =
{
&__offload_func_table, &__offload_funcs_end,
&__offload_var_table, &__offload_vars_end,
&__offload_ind_func_table, &__offload_ind_funcs_end,
};
The problem is that linking this into PIEs or shared libraries doesn't
work when it is compiled without -fpic/-fpie - __OFFLOAD_TABLE__ for non-PIC
code is put into .rodata section, but it really needs relocations, so for
PIC it should go into .data.rel.ro/.data.rel.ro.local.
As I think we don't want .data.rel.ro section in non-PIE binaries, this patch
follows the path of e.g. crtbegin.o vs. crtbeginS.o and adds crtoffloadtableS.o
next to crtoffloadtable.o, where crtoffloadtableS.o is compiled with -fpic.
2024-11-30 Jakub Jelinek <jakub@redhat.com>
PR libgomp/117851
gcc/
* lto-wrapper.cc (find_crtoffloadtable): Add PIE_OR_SHARED argument,
search for crtoffloadtableS.o rather than crtoffloadtable.o if
true.
(run_gcc): Add pie_or_shared variable. If OPT_pie or OPT_shared or
OPT_static_pie is seen, set pie_or_shared to true, if OPT_no_pie is
seen, set pie_or_shared to false. Pass it to find_crtoffloadtable.
libgcc/
* configure.ac (extra_parts): Add crtoffloadtableS.o.
* Makefile.in (crtoffloadtableS$(objext)): New goal.
* configure: Regenerated.
|
|
|
|
Including the "arm_acle.h" header in aarch64-unwind.h requires
stdint.h to be present and it may not be available during the
first stage of cross-compilation of GCC.
When cross-building GCC for the aarch64-none-linux-gnu target
(on any supporting host) using the 3-stage bootstrap build
process when we build native compiler from source, libgcc fails
to compile due to missing header that has not been installed yet.
This could be worked around but it's better to fix the issue.
libgcc/ChangeLog:
* config/aarch64/aarch64-unwind.h (_CHKFEAT_GCS): Add.
|