Add wrapper headers that prevent vendor vector headers from including
system stdint.h, ensuring tests work correctly when multilib is disabled.
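A minimal sketch of what such a wrapper could look like (hypothetical contents
using the GCC predefined __INTn_TYPE__ macros and an illustrative include-guard
name; the actual files may differ):

/* andes_vector.h wrapper: provide the fixed-width types the vendor header
   needs so it never pulls in the (possibly missing) multilib copy of
   stdint.h, then chain to the real header.  */
#ifndef ANDES_VECTOR_WRAPPER_H
#define ANDES_VECTOR_WRAPPER_H
typedef __INT8_TYPE__ int8_t;
typedef __INT16_TYPE__ int16_t;
typedef __INT32_TYPE__ int32_t;
typedef __INT64_TYPE__ int64_t;
typedef __UINT8_TYPE__ uint8_t;
typedef __UINT16_TYPE__ uint16_t;
typedef __UINT32_TYPE__ uint32_t;
typedef __UINT64_TYPE__ uint64_t;
#define _GCC_STDINT_H /* hypothetical guard so a later #include <stdint.h> is a no-op */
#include_next <andes_vector.h>
#endif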
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xsfvector/sifive_vector.h: New file.
* gcc.target/riscv/rvv/xtheadvector/riscv_th_vector.h: New file.
* gcc.target/riscv/rvv/xtheadvector/riscv_vector.h: New file.
|
|
In case an asm operand is an error node, constraints etc. are still
validated. Furthermore, all other operands are gimplified, although an
error is returned in the end anyway. For hard register constraints an
operand is required in order to determine the mode from which the number
of registers follows. Therefore, instead of adding extra guards, bail
out early.
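A hedged sketch of the kind of input meant (not the actual gcc.dg/pr121391-*.c
tests; the register name is illustrative): the undeclared operand becomes an
error node, while the "{eax}" hard register constraint would otherwise still
need the operand's mode.

void
f (void)
{
  __asm__ ("" : "={eax}" (no_such_var)); /* { dg-error "undeclared" } */
}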
gcc/ChangeLog:
PR middle-end/121391
* gimplify.cc (gimplify_asm_expr): In case an asm operand is an
error node, bail out early.
gcc/testsuite/ChangeLog:
* gcc.dg/pr121391-1.c: New test.
* gcc.dg/pr121391-2.c: New test.
|
|
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
gcc/cp/ChangeLog:
* mangle.cc (write_real_cst): Replace 8 spaces with Tab.
|
|
2025-09-15 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83763
* trans-decl.cc (gfc_trans_deferred_vars): Ensure that the
parameterized components of PDTs that do not have allocatable
components are deallocated on leaving scope.
* trans-expr.cc (gfc_trans_assignment_1): Do a dependency check
on PDT assignments. If there is a dependency between lhs and
rhs, deallocate the lhs parameterized components after the rhs
has been evaluated.
gcc/testsuite/
PR fortran/83763
* gfortran.dg/pdt_46.f03: New test.
|
|
|
|
Since we no longer visit TREE_CHAIN for decls, we have to visit
the DECL_ARGUMENTS chain manually.
PR lto/121935
* ipa-free-lang-data.cc (find_decls_types_r): Visit DECL_ARGUMENTS
chain manually.
* g++.dg/lto/pr121935_0.C: New testcase.
|
|
This patch adds support for conditional expressions in Fortran 2023 for a
limited set of types (logical, numerical), and also includes limited support
for conditional arguments without `.nil.` support.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_expr): Add support for EXPR_CONDITIONAL.
* expr.cc (gfc_get_conditional_expr): Add cond-expr constructor.
(gfc_copy_expr, free_expr0, gfc_is_constant_expr,
simplify_conditional, gfc_simplify_expr, gfc_check_init_expr,
check_restricted, gfc_traverse_expr): Add support for EXPR_CONDITIONAL.
* frontend-passes.cc (gfc_expr_walker): Ditto.
* gfortran.h (enum expr_t): Add EXPR_CONDITIONAL.
(gfc_get_operator_expr): Format fix.
(gfc_get_conditional_expr): New decl.
* matchexp.cc
(match_conditional, match_primary): Parsing for EXPR_CONDITIONAL.
* module.cc (mio_expr): Add support for EXPR_CONDITIONAL.
* resolve.cc (resolve_conditional, gfc_resolve_expr): Ditto.
* trans-array.cc (gfc_walk_conditional_expr, gfc_walk_subexpr): Ditto.
* trans-expr.cc
(gfc_conv_conditional_expr): Codegen for EXPR_CONDITIONAL.
(gfc_apply_interface_mapping_to_expr, gfc_conv_expr,
gfc_conv_expr_reference): Add support for EXPR_CONDITIONAL.
gcc/testsuite/ChangeLog:
* gfortran.dg/conditional_1.f90: New test.
* gfortran.dg/conditional_2.f90: New test.
* gfortran.dg/conditional_3.f90: New test.
* gfortran.dg/conditional_4.f90: New test.
* gfortran.dg/conditional_5.f90: New test.
* gfortran.dg/conditional_6.f90: New test.
* gfortran.dg/conditional_7.f90: New test.
* gfortran.dg/conditional_8.f90: New test.
* gfortran.dg/conditional_9.f90: New test.
|
|
This adds permute_info_type and removes the duplication from
vect_schedule_slp_node.
* tree-vectorizer.h (stmt_vec_info_type::permute_info_type): Add.
(vectorizable_slp_permutation): Declare.
* tree-vect-slp.cc (vectorizable_slp_permutation): Export.
(vect_slp_analyze_node_operations_1): Set permute_info_type
on permute nodes successfully analyzed.
(vect_schedule_slp_node): Dispatch to vect_transform_stmt
for all nodes.
* tree-vect-stmts.cc (vect_transform_stmt): Remove redundant
dump, handle permute_info_type.
* gcc.dg/vect/vect-reduc-chain-2.c: Adjust.
* gcc.dg/vect/vect-reduc-chain-3.c: Likewise.
|
|
The following makes us always use VMAT_STRIDED_SLP for negative
stride multi-element accesses. That handles falling back to
single element accesses transparently.
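As a hedged example (an assumed shape, not a testcase from this patch), a
negative-stride multi-element access looks roughly like this, where b is read
in groups of two elements with a stride of -2:

void
f (double *restrict a, double *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[-2 * i] + b[-2 * i + 1];
}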
* tree-vect-stmts.cc (get_load_store_type): Use VMAT_STRIDED_SLP
for negative stride accesses when VMAT_CONTIGUOUS_REVERSE
isn't applicable.
|
|
The following tries to do vect_transform_slp_perm_load exactly
once during analysis and once during transform. There's a 2nd
case left during analysis in get_load_store_type. Temporarily
this records n_perms in the load-store info and verifies that
against the value computed at transform stage.
* tree-vectorizer.h (vect_load_store_data::n_perms): New.
* tree-vect-stmts.cc (vectorizable_load): Analyze
SLP_TREE_LOAD_PERMUTATION only once and remember n_perms.
Verify the transform-time n_perms against the value stored
during analysis.
|
|
|
|
gcc:
* target.def (dtors_from_cxa_atexit): Properly mark up
__cxa_atexit as code.
* doc/tm.texi: Regenerate.
|
|
It looks like we didn't have a test reaching this point so far, which
changed with the new hard register constraint tests. Bootstrap and
regtest are still running on x86_64. If they succeed, ok for mainline?
-- >8 --
As noted by Sam in the PR, with checking enabled tests
gcc.target/i386/asm-hard-reg-{1,2}.c fail with an ICE. If an error is
detected in curr_insn_transform(), lra_asm_insn_error() is called and
deletes the current insn. However, afterwards processing continues with
the deleted insn and via lra_process_new_insns() we finally call recog()
for NOTE_INSN_DELETED which ICEs in case of a checking build. Thus, in
case of an error during curr_insn_transform() bail out and stop
processing.
gcc/ChangeLog:
PR rtl-optimization/121205
* lra-constraints.cc (curr_insn_transform): Stop processing on
error.
|
|
gcc:
* doc/invoke.texi (Optimize Options): Editorial changes around
-fprofile-partial-training.
|
|
Add the necessary register definitions for PRU, so that asm-hard-reg
tests can pass for PRU.
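A hedged sketch of the kind of per-target macros such a test uses (the macro
and register names are illustrative, not the actual asm-hard-reg-error-1.c
contents):

#if defined (__PRU__)
# define REG1 "r14"
# define REG2 "r15"
#endif

int
test (int x)
{
  /* Hard register constraint: force the operand into the named register.  */
  __asm__ ("" : "+{" REG1 "}" (x));
  return x;
}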
gcc/testsuite/ChangeLog:
* gcc.dg/asm-hard-reg-error-1.c: Enable test for PRU, and define
registers for PRU.
* gcc.dg/asm-hard-reg-error-4.c: Define hard regs for PRU.
* gcc.dg/asm-hard-reg-error-5.c: Ditto.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
N3517 (array subscripting without decay) has been added to C2y (via a
remote vote in May, not at a meeting). Implement this in GCC.
The conceptual change, that the array subscripting operator [] no
longer involves an array operand decaying to a pointer, is something
GCC has done for a very long time. The main effect in terms of what
is made possible in the language, subscripting a register array
(undefined behavior in C23 and before), was available as a GNU
extension, but only with constant indices. There is also a new
constraint that array indices must not be negative when they are
integer constant expressions and the array operand has array type
(negative indices are fine with pointers) - an access out of bounds of
an array (even when contained within a larger object) has undefined
behavior at runtime when not a constraint violation.
Thus, the previous GCC extension is adapted to allow the cases of
register arrays not previously allowed, clearing DECL_REGISTER on them
as needed (similar to what is done with register declarations of
structures with volatile members) and restricting the pedwarn to
pedwarn_c23. That pedwarn_c23 is also extended to cover the C23 case
of register compound literals (although not strictly needed since it
was undefined behavior rather than a constraint violation in C23).
The new error is added (only for flag_isoc2y) for negative array
indices with an operand of array type.
N3517 has some specific wording about the type of the result of
non-lvalue array element access. It's unclear what's actually desired
there in the case where the array element is itself of array type; see
C23 issue 1001 regarding types of qualified members of rvalue
structures and unions more generally. Rather than implementing the
specific wording about this in N3517, that is deferred until there's
an accepted resolution to issue 1001 and can be dealt with as part of
implementing such a resolution.
Nothing specific is done about the obsolescence in that paper of
writing index[array] or index[pointer] as opposed to array[index] or
pointer[index], although that seems like a reasonable enough thing to
warn about.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
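A hedged illustration of the rules described above (not one of the new tests;
compile with -std=c2y):

int
f (int i, int *p)
{
  register int a[4] = { 1, 2, 3, 4 };

  /* Valid in C2y even with a non-constant index: [] no longer decays the
     register array (previously a GNU extension, constant indices only).  */
  int v = a[i];

  /* a[-1] would be a constraint violation in C2y: a negative integer
     constant expression index with an operand of array type.
     p[-1] would still be accepted: the operand is a pointer.  */
  (void) p;
  return v;
}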
gcc/c/
* c-typeck.cc (c_mark_addressable): New parameter
override_register.
(build_array_ref): Update calls to c_mark_addressable. Give error
in C2Y mode for negative array indices when array expression is an
array not a pointer. Use pedwarn_c23 for subscripting register
array; diagnose that also for register compound literal.
* c-tree.h (c_mark_addressable): Update prototype.
gcc/testsuite/
* gcc.dg/c23-array-negative-1.c, gcc.dg/c23-register-array-1.c,
gcc.dg/c23-register-array-2.c, gcc.dg/c23-register-array-3.c,
gcc.dg/c23-register-array-4.c, gcc.dg/c2y-array-negative-1.c,
gcc.dg/c2y-register-array-2.c, gcc.dg/c2y-register-array-3.c: New
tests.
|
|
|
|
Shreya's work to add the addptr pattern on the RISC-V port exposed a latent bug
in LRA.
We lazily allocate/reallocate the ira_reg_equiv structure and when we do
(re)allocation we'll over-allocate and zero-fill so that we don't have to
actually allocate and relocate the data so often.
In the case exposed by Shreya's work we had N requested entries at the last
reallocation step. We actually allocate N+M entries. During LRA we allocate
enough new pseudos and thus have N+M+1 pseudos.
In get_equiv we read ira_reg_equiv[regno] without bounds checking so we read
past the allocated part of the array and get back junk which we use and
depending on the precise contents we fault in various fun and interesting ways.
We could either arrange to re-allocate ira_reg_equiv again on some path through
LRA (possibly in get_equiv itself), or just insert the bounds check in
get_equiv as is done elsewhere in LRA. Vlad indicated no strong preference in
an email last week.
So this just adds the bounds check in a manner similar to what's done elsewhere
in LRA. Bootstrapped and regression tested on x86_64 as well as RISC-V with
Shreya's work enabled and regtested across the various embedded targets.
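A generic, self-contained sketch of the pattern involved (illustrative only,
not the actual LRA code): the table is lazily over-allocated, and a reader must
bounds-check against the allocated length before indexing, because new pseudos
can be created after the last (re)allocation:

#include <stdlib.h>
#include <string.h>

struct equiv { void *memory_loc; };

static struct equiv *equiv_table;
static size_t equiv_alloc;          /* entries actually allocated */

static void
grow_equiv (size_t requested)
{
  if (requested <= equiv_alloc)
    return;
  size_t new_alloc = requested * 3 / 2 + 16;   /* over-allocate */
  struct equiv *p = realloc (equiv_table, new_alloc * sizeof *p);
  if (!p)
    abort ();
  memset (p + equiv_alloc, 0, (new_alloc - equiv_alloc) * sizeof *p);
  equiv_table = p;
  equiv_alloc = new_alloc;
}

static struct equiv *
get_equiv_checked (size_t regno)
{
  /* The bounds check: a regno created after the last grow_equiv call
     simply has no equivalence recorded yet.  */
  if (regno >= equiv_alloc)
    return NULL;
  return &equiv_table[regno];
}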
gcc/
* lra-constraints.cc (get_equiv): Bounds check before accessing
data in ira_reg_equiv.
|
|
This tentatively applies the same tweak to twin testcases.
gcc/testsuite/
PR ada/121532
* ada/acats-4/tests/cxa/cxai034.a: Use Long_Switch_To_New_Task
constant instead of Switch_To_New_Task in delay statements.
* ada/acats-4/tests/cxa/cxai035.a: Likewise.
* ada/acats-4/tests/cxa/cxai036.a: Likewise.
|
|
We weren't explicitly treating a pack index specifier as a non-deduced
context (as per [temp.deduct.type]/5), leading to an ICE for the first
testcase below.
PR c++/121795
gcc/cp/ChangeLog:
* pt.cc (unify) <case PACK_INDEX_TYPE>: New non-deduced context
case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp26/pack-indexing17.C: New test.
* g++.dg/cpp26/pack-indexing17a.C: New test.
Reviewed-by: Marek Polacek <polacek@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This patch contains testcases for PR120378 after the change made to
support the vnclipu variant of the SAT_TRUNC pattern.
PR target/120378
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr120378-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-4.c: New test.
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
This patch tries to add support for a variant of SAT_TRUNC where
negative numbers are clipped to 0 instead of NARROW_TYPE_MAX_VALUE.
This form is seen in x264, e.g.

UT clip (T a)
{
  return a & ~(UT)(-1) ? (-a) >> 31 : a;
}

where sizeof (UT) < sizeof (T).
I'm unable to get the SAT_TRUNC pattern to appear on x86_64; however, it
does appear when building for riscv, as seen below:
Before this patch:
<bb 3> [local count: 764504183]:
# i_21 = PHI <i_14(8), 0(15)>
# vectp_x.10_54 = PHI <vectp_x.10_55(8), x_10(D)(15)>
# vectp_res.20_66 = PHI <vectp_res.20_67(8), res_11(D)(15)>
# ivtmp_70 = PHI <ivtmp_71(8), _69(15)>
_72 = .SELECT_VL (ivtmp_70, POLY_INT_CST [4, 4]);
_1 = (long unsigned int) i_21;
_2 = _1 * 4;
_3 = x_10(D) + _2;
ivtmp_53 = _72 * 4;
vect__4.12_57 = .MASK_LEN_LOAD (vectp_x.10_54, 32B, { -1, ... }, _56(D), _72, 0);
vect_x.13_58 = VIEW_CONVERT_EXPR<vector([4,4]) unsigned int>(vect__4.12_57);
vect__38.15_60 = -vect_x.13_58;
vect__15.16_61 = VIEW_CONVERT_EXPR<vector([4,4]) int>(vect__38.15_60);
vect__16.17_62 = vect__15.16_61 >> 31;
mask__29.14_59 = vect_x.13_58 > { 255, ... };
vect__17.18_63 = VEC_COND_EXPR <mask__29.14_59, vect__16.17_62, vect__4.12_57>;
vect__18.19_64 = (vector([4,4]) unsigned char) vect__17.18_63;
_4 = *_3;
_5 = res_11(D) + _1;
x.0_12 = (unsigned int) _4;
_38 = -x.0_12;
_15 = (int) _38;
_16 = _15 >> 31;
_29 = x.0_12 > 255;
_17 = _29 ? _16 : _4;
_18 = (unsigned char) _17;
.MASK_LEN_STORE (vectp_res.20_66, 8B, { -1, ... }, _72, 0, vect__18.19_64);
i_14 = i_21 + 1;
vectp_x.10_55 = vectp_x.10_54 + ivtmp_53;
vectp_res.20_67 = vectp_res.20_66 + _72;
ivtmp_71 = ivtmp_70 - _72;
if (ivtmp_71 != 0)
goto <bb 8>; [89.00%]
else
goto <bb 17>; [11.00%]
After this patch:
<bb 3> [local count: 764504183]:
# i_21 = PHI <i_14(8), 0(15)>
# vectp_x.10_68 = PHI <vectp_x.10_69(8), x_10(D)(15)>
# vectp_res.15_75 = PHI <vectp_res.15_76(8), res_11(D)(15)>
# ivtmp_79 = PHI <ivtmp_80(8), _78(15)>
_81 = .SELECT_VL (ivtmp_79, POLY_INT_CST [4, 4]);
_1 = (long unsigned int) i_21;
_2 = _1 * 4;
_3 = x_10(D) + _2;
ivtmp_67 = _81 * 4;
vect__4.12_71 = .MASK_LEN_LOAD (vectp_x.10_68, 32B, { -1, ... }, _70(D), _81, 0);
vect_patt_37.13_72 = MAX_EXPR <{ 0, ... }, vect__4.12_71>;
vect_patt_39.14_73 = .SAT_TRUNC (vect_patt_37.13_72);
_4 = *_3;
_5 = res_11(D) + _1;
x.0_12 = (unsigned int) _4;
_38 = -x.0_12;
_15 = (int) _38;
_16 = _15 >> 31;
_29 = x.0_12 > 255;
_17 = _29 ? _16 : _4;
_18 = (unsigned char) _17;
.MASK_LEN_STORE (vectp_res.15_75, 8B, { -1, ... }, _81, 0, vect_patt_39.14_73);
i_14 = i_21 + 1;
vectp_x.10_69 = vectp_x.10_68 + ivtmp_67;
vectp_res.15_76 = vectp_res.15_75 + _81;
ivtmp_80 = ivtmp_79 - _81;
if (ivtmp_80 != 0)
goto <bb 8>; [89.00%]
else
goto <bb 17>; [11.00%]
gcc/ChangeLog:
* match.pd: New NARROW_CLIP variant for SAT_TRUNC.
* tree-vect-patterns.cc (gimple_unsigned_integer_narrow_clip):
Add new decl for NARROW_CLIP.
(vect_recog_sat_trunc_pattern): Add NARROW_CLIP check.
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
After
commit 8cad8f94b450be9b73d07bdeef7fa1778d3f2b96
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Sep 5 15:40:51 2025 -0700
c: Update TLS model after processing a TLS variable
GCC will upgrade local-dynamic TLS model to local-exec without -fPIC.
Compile TLS LD tests with -fPIC to keep local-dynamic TLS model.
PR testsuite/121888
* gcc.target/sparc/tls-ld-int16.c: Compile with -fPIC.
* gcc.target/sparc/tls-ld-int32.c: Likewise.
* gcc.target/sparc/tls-ld-int64.c: Likewise.
* gcc.target/sparc/tls-ld-int8.c: Likewise.
* gcc.target/sparc/tls-ld-uint16.c: Likewise.
* gcc.target/sparc/tls-ld-uint32.c: Likewise.
* gcc.target/sparc/tls-ld-uint8.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
gcc/ChangeLog:
PR diagnostics/120063
* diagnostics/context.cc (context::execution_failed_p): Also treat
any kind::fatal errors as leading to failed execution.
* diagnostics/sarif-sink.cc (maybe_get_sarif_level): Handle
kind::fatal as SARIF level "error".
gcc/testsuite/ChangeLog:
PR diagnostics/120063
* gcc.dg/fatal-error.c: New test.
* gcc.dg/fatal-error-html.py: New test.
* gcc.dg/fatal-error-sarif.py: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
PR diagnostics/121876 tracks an issue inside our crash-handling, where
if an ICE happens when we're within a nested diagnostic, an assertion
fails inside diagnostic::context::set_diagnostic_buffer, leading to
a 2nd ICE. Happily, this does not infinitely recurse, but it obscures
the original ICE and the useful part of the backtrace, and any SARIF or
HTML sinks we were writing to are left as empty files.
This patch tweaks the above so that the assertion doesn't fail, and adds
test coverage (via a plugin) to ensure that such ICEs/crashes are
gracefully handled and e.g. captured in SARIF/HTML output.
gcc/ChangeLog:
PR diagnostics/121876
* diagnostics/buffering.cc (context::set_diagnostic_buffer): Add
early reject of the no-op case.
gcc/testsuite/ChangeLog:
PR diagnostics/121876
* gcc.dg/plugin/crash-test-nested-ice-html.py: New test.
* gcc.dg/plugin/crash-test-nested-ice-sarif.py: New test.
* gcc.dg/plugin/crash-test-nested-ice.c: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null-html.py: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null-sarif.py: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null.c: New test.
* gcc.dg/plugin/crash_test_plugin.cc: Add "nested" argument, and when
set, inject the problem within a nested diagnostic.
* gcc.dg/plugin/plugin.exp: Add crash-test-nested-ice.c and
crash-test-nested-write-through-null.c.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/crash-test-write-though-null-sarif.c: Rename to...
* gcc.dg/plugin/crash-test-write-through-null-sarif.c: ...this.
* gcc.dg/plugin/crash-test-write-though-null-stderr.c: Rename to...
* gcc.dg/plugin/crash-test-write-through-null-stderr.c: ...this.
* gcc.dg/plugin/plugin.exp: Update for above renamings. Sort the
test files for crash_test_plugin.cc alphabetically.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Another lp64 vs lp64d issue. This time adjusting a #include in the test isn't
sufficient. So instead this sets the ABI to lp64d instead of lp64. I don't
think that'll impact the test materially.
Tested on the BPI and Pioneer systems where it fixes the failures with the
Andes tests. Pushing to the trunk.
gcc/testsuite
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dots.c:
Adjust ABI specification.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vln8.c:
Likewise.
|
|
My r16-3559-gc2e567a6edb563 reworked ADL for modules, including a change
to allow seeing module-linkage declarations if they only exist on the
instantiation path. This caused a crash however as I neglected to
unwrap the stat hack wrapper when we were happy to see all declarations,
allowing search_adl to add non-functions to the overload set.
PR c++/121893
gcc/cp/ChangeLog:
* name-lookup.cc (name_lookup::adl_namespace_fns): Unwrap the
STAT_HACK also when on_inst_path.
gcc/testsuite/ChangeLog:
* g++.dg/modules/adl-10_a.C: New test.
* g++.dg/modules/adl-10_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
[PR121865]
On a DECL, TREE_CHAIN will find any other declarations in the same
binding level. This caused an ICE in PR121865 because the next entity
in the binding level was the uninstantiated unique friend 'foo', for
which after being found the compiler tries to generate a mangled name
for it and crashes.
This didn't happen in non-modules testcases only because normally the
unique friend function would have been chained after its template_decl,
and find_decls_types_r bails on lang-specific nodes so it never saw the
uninstantiated decl. With modules however the order of chaining
changed, causing the error.
I don't think it's ever necessary to walk into the DECL_CHAIN, from what
I can see; other cases where it might be useful (block vars or type
fields) are already handled explicitly elsewhere, and only one test
fails because of the change, due to accidentally relying on this "walk
into the next in-scope declaration" behaviour.
PR c++/121865
gcc/ChangeLog:
* ipa-free-lang-data.cc (find_decls_types_r): Don't walk into
DECL_CHAIN for any DECL.
gcc/testsuite/ChangeLog:
* g++.dg/lto/pr101396_0.C: Ensure A will be walked into (and
isn't constant-folded out of the GIMPLE for the function).
* g++.dg/lto/pr101396_1.C: Add message.
* g++.dg/modules/lto-4_a.C: New test.
* g++.dg/modules/lto-4_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Richard Biener <rguenther@suse.de>
|
|
gcc/
* ipa-pure-const.cc (check_stmt): Minor formatting tweaks.
(pass_data_nothrow): Fix pasto in description.
|
|
comparison values
Given a sequence such as
int foo ()
{
#pragma GCC unroll 4
  for (int i = 0; i < N; i++)
    if (a[i] == 124)
      return 1;
  return 0;
}
where a[i] is long long, we will unroll the loop and use an OR reduction for
the early break on Adv. SIMD. The OR is then followed by a compression
sequence that narrows the 128-bit vectors to 64 bits for use by the branch.
However, if we have support for add halving and narrowing, then instead of
using an OR we can use an ADDHN, which does the combining and narrowing in one
step.
Note that for now I only replace the last OR; if we have more than one level
of unrolling we could technically chain them. I will revisit this in an
upcoming early break series, but an unroll of 2 is fairly common.
gcc/ChangeLog:
* internal-fn.def (VEC_TRUNC_ADD_HIGH): New.
* doc/generic.texi: Document it.
* optabs.def (vec_trunc_add_high): New.
* doc/md.texi: Document it.
* tree-vect-stmts.cc (vectorizable_early_exit): Use addhn if supported.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-early-break-addhn_1.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_2.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_3.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_4.c: New test.
|
|
This implements the new vector optabs vec_<su>addh_narrow<mode>,
adding support for their in-vectorizer use for early break.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (vec_addh_narrow<mode>): New.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-addhn_1.c: New test.
|
|
If the user has requested loop unrolling through pragma GCC unroll, then at the
moment we only set LOOP_VINFO_USER_UNROLL if the vectorizer has not overridden
the unroll factor (through backend costing) or if the VF made the requested
unroll factor 1.
When we have a loop of, say, int and a pragma unroll of 4, and the vectorizer
picks V4SI as the mode, the requested unroll ends up exactly matching the VF.
As such the remaining requested unroll is 1 and we don't clear the pragma.
So the vectorizer did honor the requested unroll factor. However, since we
didn't set the unroll amount back and left it at 4, the RTL unroller won't use
the RTL cost model at all and will just unroll the vector loop 4 times.
But both of these events are costing related, and so it stands to reason that
we should mark LOOP_VINFO_USER_UNROLL as handled so that the RTL unroller
returns to using the backend costing for any further unrolling.
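A hedged example of the scenario described (assumed shape, not a testcase from
the patch): with int elements and V4SI the vector loop already covers the
requested unroll of 4, so no extra unrolling should be forced on the RTL
unroller afterwards.

#define N 1024
int a[N], b[N];

void
f (void)
{
#pragma GCC unroll 4
  for (int i = 0; i < N; i++)
    a[i] += b[i];
}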
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_1): If the unroll pragma was set
mark it as handled.
* doc/extend.texi (pragma GCC unroll): Update documentation.
|
|
|
|
Documentation for `__cmpsf2` and similar functions currently indicates a
return type of `int`. This is not correct, however; the `libgcc`
functions return `CMPtype`, the size of which is determined by the
`libgcc_cmp_return` mode.
Update documentation to use `CMPtype` and indicate that this is
target-dependent, also mentioning the usual modes.
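A rough sketch of the corrected signatures (the typedef is only a placeholder:
CMPtype's real width follows the target's libgcc_cmp_return mode and must not
be assumed to be int):

typedef int CMPtype;  /* placeholder; actual type is target-dependent */

CMPtype __cmpsf2 (float a, float b);
CMPtype __cmpdf2 (double a, double b);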
Reported-by: beetrees <b@beetr.ee>
Fixes: https://github.com/rust-lang/compiler-builtins/issues/919#issuecomment-2905347318
Signed-off-by: Trevor Gross <tmgross@umich.edu>
* doc/libgcc.texi (Comparison functions): Document functions as
returning CMPtype.
|
|
PR fortran/121616
gcc/fortran/ChangeLog:
* primary.cc (gfc_variable_attr): Properly set dimension attribute
from a component ref.
gcc/testsuite/ChangeLog:
* gfortran.dg/alloc_comp_assign_17.f90: New test.
|
|
-mno-direct-extern-access is used to disable direct access to external
symbols from executables with and without PIE for x86. Require PIE and
pass -fPIE to disable direct access to external symbols for other targets.
PR fortran/107421
PR testsuite/121848
* gfortran.dg/gomp/pr107421.f90: Require PIE and pass -fPIE for
non-x86 targets.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
Both the C and C++ frontends should set a tentative TLS model in grokvardecl
and update the TLS model with the default TLS access model after a TLS variable
has been fully processed, if the default TLS access model is stronger.
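For reference, a minimal generic shape of such a variable (illustrative, not
the actual c-c++-common/tls-attr-*.c contents):

__attribute__ ((tls_model ("local-dynamic"))) __thread int counter;

int
get_counter (void)
{
  return counter;
}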
PR c/107419
PR c++/107393
* c-c++-common/tls-attr-common.c: New test.
* c-c++-common/tls-attr-le-pic.c: Likewise.
* c-c++-common/tls-attr-le-pie.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
Don't upgrade TLS model when cplus_decl_attributes is called on a thread
local variable whose TLS model isn't set yet.
gcc/cp/
PR c++/121889
* decl2.cc (cplus_decl_attributes): Don't upgrade TLS model if
TLS model isn't set yet.
gcc/testsuite/
PR c++/121889
* g++.dg/tls/pr121889.C: New test.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
Add an expander for isfinite using integer arithmetic. This is
typically faster and avoids generating spurious exceptions on
signaling NaNs. This fixes part of PR66462.
int isfinite1 (float x) { return __builtin_isfinite (x); }
Before:
fabs s0, s0
mov w0, 2139095039
fmov s31, w0
fcmp s0, s31
cset w0, hi
eor w0, w0, 1
ret
After:
fmov w1, s0
mov w0, -16777216
cmp w0, w1, lsl 1
cset w0, hi
ret
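A hedged C-level sketch of the integer test behind the "After" sequence:
-16777216 is 0xff000000, and a float is finite iff its exponent field is not
all ones, i.e. (bits << 1) < 0xff000000 as an unsigned comparison.

#include <stdint.h>
#include <string.h>

int
isfinite_bits (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);   /* type-pun without aliasing UB */
  return (bits << 1) < 0xff000000u;  /* shift out the sign bit first */
}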
gcc:
PR middle-end/66462
* config/aarch64/aarch64.md (isfinite<mode>2): Add new expander.
gcc/testsuite:
PR middle-end/66462
* gcc.target/aarch64/pr66462.c: Add tests for isfinite.
|
|
With -fno-trapping-math it is safe to optimize fabs(a + 0.0) as
fabs (a).
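A minimal example of the affected form (hedged; not one of the new testcases):

double
f (double a)
{
  return __builtin_fabs (a + 0.0);  /* folded to __builtin_fabs (a)
                                       under -fno-trapping-math */
}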
PR tree-optimization/121595
* match.pd (fabs(a + 0.0) -> fabs (a)): Optimization pattern limited to
the -fno-trapping-math case.
* gcc.dg/fabs-plus-zero-1.c: New testcase.
* gcc.dg/fabs-plus-zero-2.c: Likewise.
Signed-off-by: Matteo Nicoli <matteo.nicoli001@gmail.com>
Reviewed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
|
|
LSX and SCQ
Enable those tests so we won't make overly stupid mistakes in the 16B atomic
implementation anymore.
All these tests passed on a Loongson 3C6000/S except
atomic-other-int128.c. With GDB patched to support sc.q
(https://sourceware.org/pipermail/gdb-patches/2025-August/220034.html)
this test also XPASSes.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_loongarch_scq_hw): New.
(check_effective_target_sync_int_128_runtime): Return 1 on
loongarch64-*-* if hardware supports both LSX and SCQ.
* gcc.dg/atomic-compare-exchange-5.c: Pass -mlsx -mscq for
loongarch64-*-*.
* gcc.dg/atomic-exchange-5.c: Likewise.
* gcc.dg/atomic-load-5.c: Likewise.
* gcc.dg/atomic-op-5.c: Likewise.
* gcc.dg/atomic-store-5.c: Likewise.
* gcc.dg/atomic-store-6.c: Likewise.
* gcc.dg/simulate-thread/atomic-load-int128.c: Likewise.
* gcc.dg/simulate-thread/atomic-other-int128.c: Likewise.
(dg-final): xfail on loongarch64-*-* because gdb does not
handle sc.q properly yet.
|
|
In a CAS operation, even if expected != *memory we still need to do an
atomic load of *memory into output. But I made a mistake in the initial
implementation, causing the output to contain junk in this situation.
Like a normal atomic load, the atomic load embedded in the CAS semantic
is required to work on read-only page. Thus we cannot rely on sc.q to
ensure the atomicity of the load. Use LSX to perform the load instead,
and also use LSX to compare the 16B values to keep the ll-sc loop body
short.
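A generic illustration of the semantic being fixed (not LoongArch-specific
code): even when the compare fails, "expected" must receive the value that was
actually in memory, via an atomic 16-byte load.

#include <stdatomic.h>
#include <stdbool.h>

bool
try_update (_Atomic unsigned __int128 *mem,
            unsigned __int128 *expected, unsigned __int128 desired)
{
  /* On failure this must still atomically load *mem into *expected;
     that embedded load is what the patch reimplements with LSX.  */
  return atomic_compare_exchange_strong (mem, expected, desired);
}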
gcc/ChangeLog:
* config/loongarch/sync.md (atomic_compare_and_swapti_scq):
Require LSX. Change the operands for the output, the memory,
and the expected value to LSX vector modes. Add a FCCmode
output to indicate if CAS has written the desired value into
memory. Use LSX to atomically load both words of the 16B value
in memory.
(atomic_compare_and_swapti): Pun the modes to satisfy
the new atomic_compare_and_swapti_scq implementation. Read the
bool return value from the FCC instead of performing a
comparison.
|
|
This modifier is intended to output $r0 for (const_int 0), but the
logic:
GET_MODE (op) != TImode || (op != CONST0_RTX (TImode) && code != REG)
will reject (const_int 0) because (const_int 0) actually does not have
a mode and GET_MODE will return VOIDmode for it.
Use reg_or_0_operand instead to fix the issue.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_print_operand): Call
reg_or_0_operand for checking the sanity of %t.
|
|
The PR reports
vectorizer.h:276:3: runtime error: load of value 32695, which is not a valid value for type 'internal_fn'
which I believe is from
slp_node->data = new vect_load_store_data (std::move (ls));
where 'ls' can be partly uninitialized (and that data will not be
used, but of course the move CTOR doesn't know this). The following
tries to fix that by using value-initialization of 'ls'.
PR tree-optimization/121703
* tree-vect-stmts.cc (vectorizable_store): Value-initialize ls.
(vectorizable_load): Likewise.
|
|
In general, tail call optimization requires that the callee's saved
registers are a superset of the caller's.
The Standard Vector Calling Convention Variant (assembler: .variant_cc)
requires that a function with this calling convention preserves vector
registers v1-v7 and v24-v31 across calls (i.e. callee-saved). However,
the same set of registers are (function-local) temporary registers
(i.e. caller-saved) on the normal (non-vector) calling convention.
Even if a function with this calling convention variant calls another
function with a non-vector calling convention, those vector registers
are correctly clobbered -- except when the sibling (tail) call
optimization occurs as it violates the general rule mentioned above.
If this happens, the following function body:
1. Save v1-v7 and v24-v31 for clobbering
2. Call another function with a non-vector calling convention
(which may destroy v1-v7 and/or v24-v31)
3. Restore v1-v7 and v24-v31
4. Return.
may be incorrectly optimized into the following sequence:
1. Save v1-v7 and v24-v31 for clobbering
2. Restore v1-v7 and v24-v31 (?!)
3. Jump to another function with a non-vector calling convention
(which may destroy v1-v7 and/or v24-v31).
This commit suppresses cross CC sibling call optimization from
the vector calling convention variant.
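A hedged sketch of the situation (assumed shapes, not the actual testcases):
vec_fn uses the vector calling convention variant, so its call to a normal-CC
function must stay a real call instead of becoming a tail jump.

#include <stddef.h>
#include <riscv_vector.h>

void callee (void);

__attribute__ ((riscv_vector_cc)) void
vec_fn (vint32m1_t v, size_t vl)
{
  callee ();  /* sibcall suppressed: callee may clobber v1-v7/v24-v31,
                 which vec_fn must preserve for its own caller */
}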
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_function_ok_for_sibcall):
Suppress cross calling convention sibcall optimization from
the vector calling convention variant.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-call-variant_cc-sibcall.c: New test.
* gcc.target/riscv/rvv/base/abi-call-variant_cc-sibcall-indirect-1.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-variant_cc-sibcall-indirect-2.c: Ditto.
|
|
When the vectorizer removes a forwarder created earlier by split_edge
it uses redirect_edge_pred for convenience and efficiency. That breaks
down when the edge being split originates from an asm goto, as that is
a jump that needs adjustments from redirect_edge_and_branch. The
following factors out a simple vect_remove_forwarder handling this
situation appropriately.
PR tree-optimization/121829
* cfgloopmanip.cc (create_preheader): Ensure we can insert
at the end of a preheader.
* gcc.dg/torture/pr121829.c: New testcase.
|
|
When a dead EH or abnormal edge makes a call queued for noreturn fixup
unreachable, just skip processing it.
PR tree-optimization/121870
* tree-ssa-propagate.cc
(substitute_and_fold_engine::substitute_and_fold): Skip
removed stmts from noreturn fixup.
* g++.dg/torture/pr121870.C: New testcase.
|
|
BACKLOG_MAX represents the maximum number of outstanding connections in the
socket's listen queue.
gcc/ada/ChangeLog:
* libgnat/g-socket.adb (Listen_Socket): Change default value.
* libgnat/g-socket.ads (Listen_Socket): Likewise.
* s-oscons-tmplt.c (BACKLOG_MAX): New.
|
|
gcc/ada/ChangeLog:
* env.c (__gnat_clearenv): Adjust comment.
* libgnarl/a-intnam__bsd.ads: Fix copyright date.
|