|
gcc/ada/ChangeLog:
* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Move
recommendation of Toolchain_Name up.
* gnat_ugn.texi: Regenerate.
|
|
`adafinal` is not available on targets without a standard library.
gcc/ada/ChangeLog:
* bindgen.adb (Gen_Adafinal): Don't generate call of adafinal
when use of standard library suppressed.
|
|
The check of whether the library is elaborated is not generated when there
is no standard library available on the target.
gcc/ada/ChangeLog:
* bindgen.adb (Gen_Adafinal): Don't generate code when
use of standard library suppressed.
|
|
gcc/ada/ChangeLog:
* libgnat/a-swunau.ads (Set_Wide_String): New subprogram.
* libgnat/a-swunau.adb (Set_Wide_String): Likewise.
* libgnat/a-swunau__shared.adb (Set_Wide_String): Likewise.
* libgnat/a-szunau.ads (Set_Wide_Wide_String): Likewise.
* libgnat/a-szunau.adb (Set_Wide_Wide_String): Likewise.
* libgnat/a-szunau__shared.adb (Set_Wide_Wide_String): Likewise.
|
|
There are cases where we need to analyze the argument of the pragma
in order to determine the ghostliness of the pragma. However during
that analysis the ghost region of the pragma is not set yet so we
cannot perform the ghost context checks at that moment.
This patch provides the mechanism for disabling ghost context checks
and disables them for pragma arguments that determine the ghostliness
of the pragma.
gcc/ada/ChangeLog:
* ghost.adb (Check_Ghost_Context): Avoid context checks when they
are globally disabled.
* sem.ads (Ghost_Context_Checks_Disabled): New flag to control
whether ghost context checks are activated or not.
* sem_prag.adb (Analyze_Pragma): Disable ghost context checks for
pragmas that determine their ghostliness based on one of their arguments.
|
|
"Is_Ancestor_Package (E, E)" returns True and this patch fixes a comment
that claimed otherwise. This patch also renames an object local to
Is_Ancestor_Package that was misleadingly named "Par", a common
abbreviation of "Parent".
gcc/ada/ChangeLog:
* sem_util.ads (Is_Ancestor_Package): Fix documentation comment.
* sem_util.adb (Is_Ancestor_Package): Rename local object.
|
|
In this PR we have a reduction of a vector constructor, where the
type of the constructor is int16x8_t and the elements are int16x4_t;
i.e. it is representing a concatenation of two vectors.
This triggers a match.pd pattern which looks like it was written to
handle reductions of vector constructors where the elements of the ctor
are scalars, not vectors. There is no type check to enforce this
property, which leads to the pattern replacing a reduction to scalar
with an int16x4_t vector in this case. That is of course a type error,
leading to an ICE on invalid GIMPLE.
This patch adds a type check to the pattern, only going ahead with the
transformation if the element type of the ctor matches that of the
reduction.
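A hedged sketch of the kind of aarch64 source that can produce such a
reduction of a vector constructor (hypothetical; the actual pr121772.c
testcase may differ):
#include <arm_neon.h>
int16_t
sum_concat (int16x4_t a, int16x4_t b)
{
  /* The concatenation becomes a constructor whose elements are the two
     int16x4_t vectors; the reduction then operates on the int16x8_t.  */
  int16x8_t v = vcombine_s16 (a, b);
  return vaddvq_s16 (v);
}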
gcc/ChangeLog:
PR tree-optimization/121772
* match.pd: Add type check to reduc(ctor) pattern.
gcc/testsuite/ChangeLog:
PR tree-optimization/121772
* gcc.target/aarch64/torture/pr121772.c: New test.
|
|
Add support for some recent AVR devices.
gcc/
* config/avr/avr-mcus.def: Add avr32eb14, avr32eb20,
avr32eb28, avr32eb32.
* doc/avr-mmcu.texi: Rebuild.
|
|
If a single instruction can store or move the whole block of memory, use a
vector instruction and don't align the destination.
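For illustration, a hypothetical example (not one of the new tests): the
whole block below fits in a single vector register, so it can be expanded
as one vector load plus one vector store without aligning the destination:
#include <string.h>
void
copy16 (void *dst, const void *src)
{
  /* 16 bytes fit in one vector register; no alignment prologue needed.  */
  memcpy (dst, src, 16);
}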
gcc/
PR target/121934
* config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): If a
single instruction can store or move the whole block of memory,
use vector instruction and don't align destination.
gcc/testsuite/
PR target/121934
* gcc.target/i386/pr121934-1a.c: New test.
* gcc.target/i386/pr121934-1b.c: Likewise.
* gcc.target/i386/pr121934-2a.c: Likewise.
* gcc.target/i386/pr121934-2b.c: Likewise.
* gcc.target/i386/pr121934-3a.c: Likewise.
* gcc.target/i386/pr121934-3b.c: Likewise.
* gcc.target/i386/pr121934-4a.c: Likewise.
* gcc.target/i386/pr121934-4b.c: Likewise.
* gcc.target/i386/pr121934-5a.c: Likewise.
* gcc.target/i386/pr121934-5b.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
After late-combine is added, split1 can see an input like
(insn 56 55 169 5
  (set (reg/v:DI 87 [ n ])
       (ior:DI (and:DI (reg/v:DI 87 [ n ])
                       (const_int 281474976710655 [0xffffffffffff]))
               (and:DI (reg:DI 131 [ _45 ])
                       (const_int -281474976710656 [0xffff000000000000]))))
  "pr121906.c":22:8 108 {*bstrins_di_for_ior_mask}
  (nil))
And the splitter ends up emitting
(insn 184 55 185 5
  (set (reg/v:DI 87 [ n ])
       (reg:DI 131 [ _45 ]))
  "pr121906.c":22:8 -1
  (nil))
(insn 185 184 169 5
  (set (zero_extract:DI (reg/v:DI 87 [ n ])
                        (const_int 48 [0x30])
                        (const_int 0 [0]))
       (reg/v:DI 87 [ n ]))
  "pr121906.c":22:8 -1
  (nil))
which obviously loses everything in r87 instead of retaining its lower
bits as we expect. That's because the splitter didn't anticipate that the
output register may be one of the input registers.
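For reference, a hypothetical C reduction producing this kind of masked
IOR (not the actual pr121906.c testcase):
unsigned long long
insert_high16 (unsigned long long n, unsigned long long v)
{
  /* Keep the low 48 bits of n and take the high 16 bits from v; this is
     the masked IOR matched by *bstrins_di_for_ior_mask.  */
  return (n & 0x0000ffffffffffffULL) | (v & 0xffff000000000000ULL);
}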
PR target/121906
gcc/
* config/loongarch/loongarch.md (*bstrins_<mode>_for_ior_mask):
Always create a new pseudo for the input register of the bstrins
instruction.
gcc/testsuite/
* gcc.target/loongarch/pr121906.c: New test.
|
|
[PR121904]
The gcc.c-torture/compile/20111209-1.c testcase, which uses
typedef char* char_ptr32 __attribute__ ((mode(SI)));
ICEs on s390x since my change to optimize extensions by the cheaper of
signed or unsigned extension if the sign bit is known from VRP not to be set.
The problem is that get_range_pos_neg queries ranger into an int_range_max
and so ICEs on pointers. All the other current callers call it from places
where only scalar integral types can appear (scalar division/modulo,
overflow ifns, etc.), I think; this spot was just testing SCALAR_INT_MODE_P.
The following patch adds a check for INTEGRAL_TYPE_P; I think ranger will not
do anything useful for pointers here anyway, and what a negative pointer is
is also fuzzy. I've changed both get_range_pos_neg, to punt on that, and
the caller; either of those changes is sufficient to fix the ICE, but I
think it doesn't hurt to do it in both places.
2025-09-15 Jakub Jelinek <jakub@redhat.com>
PR middle-end/121904
* tree.cc (get_range_pos_neg): Return 3 if arg doesn't have
scalar integral type.
* expr.cc (expand_expr_real_2) <CASE_CONVERT>: Only choose between
sign and zero extension based on costs for scalar integral inner
types.
|
|
Add wrapper headers that prevent vendor vector headers from including
system stdint.h, ensuring tests work correctly when multilib is disabled.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/andes_vector.h: New file.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfncvtbf16s.c
(#include): Use local andes_vector.h instead of system header.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfwcvtsbf16.c
(#include): Likewise.
* gcc.target/riscv/rvv/xsfvector/sifive_vector.h: New file.
* gcc.target/riscv/rvv/xtheadvector/riscv_th_vector.h: New file.
* gcc.target/riscv/rvv/xtheadvector/riscv_vector.h: New file.
|
|
In case an asm operand is an error node, constraints etc. are still
validated. Furthermore, all other operands are gimplified, although an
error is returned in the end anyway. For hard register constraints an
operand is required in order to determine the mode from which the number
of registers follows. Therefore, instead of adding extra guards, bail
out early.
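A minimal sketch of the kind of input affected (hypothetical; the actual
pr121391 tests may differ), where the second output operand is an error
node because the variable is undeclared:
void
f (void)
{
  int x;
  asm ("" : "=r" (x), "=r" (undeclared));
}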
gcc/ChangeLog:
PR middle-end/121391
* gimplify.cc (gimplify_asm_expr): In case an asm operand is an
error node, bail out early.
gcc/testsuite/ChangeLog:
* gcc.dg/pr121391-1.c: New test.
* gcc.dg/pr121391-2.c: New test.
|
|
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
gcc/cp/ChangeLog:
* mangle.cc (write_real_cst): Replace 8 spaces with Tab.
|
|
2025-09-15 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83763
* trans-decl.cc (gfc_trans_deferred_vars): Ensure that the
parameterized components of PDTs that do not have allocatable
components are deallocated on leaving scope.
* trans-expr.cc (gfc_trans_assignment_1): Do a dependency check
on PDT assignments. If there is a dependency between lhs and
rhs, deallocate the lhs parameterized components after the rhs
has been evaluated.
gcc/testsuite/
PR fortran/83763
* gfortran.dg/pdt_46.f03: New test.
|
|
|
|
Since we no longer visit TREE_CHAIN for decls, we have to visit
the DECL_ARGUMENTS chain manually.
PR lto/121935
* ipa-free-lang-data.cc (find_decls_types_r): Visit DECL_ARGUMENTS
chain manually.
* g++.dg/lto/pr121935_0.C: New testcase.
|
|
This patch adds support for conditional expressions in Fortran 2023 for a
limited set of types (logical, numerical), and also includes limited support
for conditional arguments without `.nil.` support.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_expr): Add support for EXPR_CONDITIONAL.
* expr.cc (gfc_get_conditional_expr): Add cond-expr constructor.
(gfc_copy_expr, free_expr0, gfc_is_constant_expr,
simplify_conditional, gfc_simplify_expr, gfc_check_init_expr,
check_restricted, gfc_traverse_expr): Add support for EXPR_CONDITIONAL.
* frontend-passes.cc (gfc_expr_walker): Ditto.
* gfortran.h (enum expr_t): Add EXPR_CONDITIONAL.
(gfc_get_operator_expr): Format fix.
(gfc_get_conditional_expr): New decl.
* matchexp.cc
(match_conditional, match_primary): Parsing for EXPR_CONDITIONAL.
* module.cc (mio_expr): Add support for EXPR_CONDITIONAL.
* resolve.cc (resolve_conditional, gfc_resolve_expr): Ditto.
* trans-array.cc (gfc_walk_conditional_expr, gfc_walk_subexpr): Ditto.
* trans-expr.cc
(gfc_conv_conditional_expr): Codegen for EXPR_CONDITIONAL.
(gfc_apply_interface_mapping_to_expr, gfc_conv_expr,
gfc_conv_expr_reference): Add support for EXPR_CONDITIONAL.
gcc/testsuite/ChangeLog:
* gfortran.dg/conditional_1.f90: New test.
* gfortran.dg/conditional_2.f90: New test.
* gfortran.dg/conditional_3.f90: New test.
* gfortran.dg/conditional_4.f90: New test.
* gfortran.dg/conditional_5.f90: New test.
* gfortran.dg/conditional_6.f90: New test.
* gfortran.dg/conditional_7.f90: New test.
* gfortran.dg/conditional_8.f90: New test.
* gfortran.dg/conditional_9.f90: New test.
|
|
This adds permute_info_type and removes the duplication from
vect_schedule_slp_node.
* tree-vectorizer.h (stmt_vec_info_type::permute_info_type): Add.
(vectorizable_slp_permutation): Declare.
* tree-vect-slp.cc (vectorizable_slp_permutation): Export.
(vect_slp_analyze_node_operations_1): Set permute_info_type
on permute nodes successfully analyzed.
(vect_schedule_slp_node): Dispatch to vect_transform_stmt
for all nodes.
* tree-vect-stmts.cc (vect_transform_stmt): Remove redundant
dump, handle permute_info_type.
* gcc.dg/vect/vect-reduc-chain-2.c: Adjust.
* gcc.dg/vect/vect-reduc-chain-3.c: Likewise.
|
|
The following makes us always use VMAT_STRIDED_SLP for negative
stride multi-element accesses. That handles falling back to
single element accesses transparently.
* tree-vect-stmts.cc (get_load_store_type): Use VMAT_STRIDED_SLP
for negative stride accesses when VMAT_CONTIGUOUS_REVERSE
isn't applicable.
|
|
The following tries to do vect_transform_slp_perm_load exactly
once during analysis and once during transform. There's a 2nd
case left during analysis in get_load_store_type. Temporarily
this records n_perms in the load-store info and verifies that
against the value computed at transform stage.
* tree-vectorizer.h (vect_load_store_data::n_perms): New.
* tree-vect-stmts.cc (vectorizable_load): Analyze
SLP_TREE_LOAD_PERMUTATION only once and remember n_perms.
Verify the transform-time n_perms against the value stored
during analysis.
|
|
|
|
gcc:
* target.def (dtors_from_cxa_atexit): Properly mark up
__cxa_atexit as code.
* doc/tm.texi: Regenerate.
|
|
ranges::shuffle has a two-at-a-time PRNG optimization (copied from
std::shuffle) that considers the PRNG width vs the size of the range.
But in C++20 a random access sentinel isn't always sized so we can't
unconditionally do __last - __first to obtain the size in constant
time.
We could instead use ranges::distance, but that would take linear time for a
non-sized sentinel, which makes the optimization a less clear win. So
this patch instead makes us only consider this optimization for sized
ranges.
PR libstdc++/121917
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__shuffle_fn::operator()): Only
consider the two-at-a-time PRNG optimization if the range is
sized.
* testsuite/25_algorithms/shuffle/constrained.cc (test03): New
test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
As noted by Sam in the PR, with checking enabled tests
gcc.target/i386/asm-hard-reg-{1,2}.c fail with an ICE. If an error is
detected in curr_insn_transform(), lra_asm_insn_error() is called and
deletes the current insn. However, afterwards processing continues with
the deleted insn and via lra_process_new_insns() we finally call recog()
for NOTE_INSN_DELETED which ICEs in case of a checking build. Thus, in
case of an error during curr_insn_transform() bail out and stop
processing.
gcc/ChangeLog:
PR rtl-optimization/121205
* lra-constraints.cc (curr_insn_transform): Stop processing on
error.
|
|
gcc:
* doc/invoke.texi (Optimize Options): Editorial changes around
-fprofile-partial-training.
|
|
Add the necessary register definitions for PRU, so that asm-hard-reg
tests can pass for PRU.
gcc/testsuite/ChangeLog:
* gcc.dg/asm-hard-reg-error-1.c: Enable test for PRU, and define
registers for PRU.
* gcc.dg/asm-hard-reg-error-4.c: Define hard regs for PRU.
* gcc.dg/asm-hard-reg-error-5.c: Ditto.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
N3517 (array subscripting without decay) has been added to C2y (via a
remote vote in May, not at a meeting). Implement this in GCC.
The conceptual change, that the array subscripting operator [] no
longer involves an array operand decaying to a pointer, is something
GCC has done for a very long time. The main effect in terms of what
is made possible in the language, subscripting a register array
(undefined behavior in C23 and before), was available as a GNU
extension, but only with constant indices. There is also a new
constraint that array indices must not be negative when they are
integer constant expressions and the array operand has array type
(negative indices are fine with pointers) - an access out of bounds of
an array (even when contained within a larger object) has undefined
behavior at runtime when not a constraint violation.
Thus, the previous GCC extension is adapted to allow the cases of
register arrays not previously allowed, clearing DECL_REGISTER on them
as needed (similar to what is done with register declarations of
structures with volatile members) and restricting the pedwarn to
pedwarn_c23. That pedwarn_c23 is also extended to cover the C23 case
of register compound literals (although not strictly needed since it
was undefined behavior rather than a constraint violation in C23).
The new error is added (only for flag_isoc2y) for negative array
indices with an operand of array type.
N3517 has some specific wording about the type of the result of
non-lvalue array element access. It's unclear what's actually desired
there in the case where the array element is itself of array type; see
C23 issue 1001 regarding types of qualified members of rvalue
structures and unions more generally. Rather than implementing the
specific wording about this in N3517, that is deferred until there's
an accepted resolution to issue 1001 and can be dealt with as part of
implementing such a resolution.
Nothing specific is done about the obsolescence in that paper of
writing index[array] or index[pointer] as opposed to array[index] or
pointer[index], although that seems like a reasonable enough thing to
warn about.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
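A short hypothetical illustration of the user-visible changes (not the
new tests themselves):
void
f (int i, int *p)
{
  register int a[4] = { 1, 2, 3, 4 };
  int x = a[i];  /* Now accepted even with a non-constant index;
                    pedwarn_c23 in earlier language modes.  */
  int y = a[-1]; /* New C2Y error: negative constant index with an
                    operand of array type.  */
  int z = p[-1]; /* Still fine: the operand has pointer type.  */
}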
gcc/c/
* c-typeck.cc (c_mark_addressable): New parameter
override_register.
(build_array_ref): Update calls to c_mark_addressable. Give error
in C2Y mode for negative array indices when array expression is an
array not a pointer. Use pedwarn_c23 for subscripting register
array; diagnose that also for register compound literal.
* c-tree.h (c_mark_addressable): Update prototype.
gcc/testsuite/
* gcc.dg/c23-array-negative-1.c, gcc.dg/c23-register-array-1.c,
gcc.dg/c23-register-array-2.c, gcc.dg/c23-register-array-3.c,
gcc.dg/c23-register-array-4.c, gcc.dg/c2y-array-negative-1.c,
gcc.dg/c2y-register-array-2.c, gcc.dg/c2y-register-array-3.c: New
tests.
|
|
|
|
Shreya's work to add the addptr pattern on the RISC-V port exposed a latent bug
in LRA.
We lazily allocate/reallocate the ira_reg_equiv structure and when we do
(re)allocation we'll over-allocate and zero-fill so that we don't have to
actually allocate and relocate the data so often.
In the case exposed by Shreya's work we had N requested entries at the last
rellocation step. We actually allocate N+M entries. During LRA we allocate
enough new pseudos and thus have N+M+1 pseudos.
In get_equiv we read ira_reg_equiv[regno] without bounds checking so we read
past the allocated part of the array and get back junk which we use and
depending on the precise contents we fault in various fun and interesting ways.
We could either arrange to re-allocate ira_reg_equiv again on some path through
LRA (possibly in get_equiv itself). We could also just insert the bounds check
in get_equiv, as is done elsewhere in LRA. Vlad indicated no strong
preference in an email last week.
So this just adds the bounds check in a manner similar to what's done elsewhere
in LRA. Bootstrapped and regression tested on x86_64 as well as RISC-V with
Shreya's work enabled and regtested across the various embedded targets.
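A hedged sketch of the shape of the added check (names follow existing
IRA/LRA practice; the actual code may differ):
  /* Pseudos created during LRA may not have an ira_reg_equiv entry yet;
     treat an out-of-range regno as having no known equivalence.  */
  if (regno >= ira_reg_equiv_len)
    return x;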
gcc/
* lra-constraints.cc (get_equiv): Bounds check before accessing
data in ira_reg_equiv.
|
|
Using std::move(*it) is incorrect for iterators that use proxy refs, we
should use ranges::iter_move(it) instead.
libstdc++-v3/ChangeLog:
PR libstdc++/121913
* include/bits/ranges_algo.h (__rotate_fn::operator()): Use
ranges::iter_move(it) instead of std::move(*it).
* testsuite/25_algorithms/rotate/121913.cc: New test.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
|
|
[PR121890]
Whenever we use operator+ or similar operators on random access
iterators we need to be careful to use the iterator's difference_type
rather than some other integer type. It's not guaranteed that an
expression with an arbitrary integer type, such as `it + 1u`, has the
same effects as `it + iter_difference_t<It>(1)`.
Some of our algorithms need changes to cast values to the correct type,
or to use std::next or ranges::next instead of `it + n`. Several tests
also need fixes where the arithmetic occurs directly in the test.
The __gnu_test::random_access_iterator_wrapper class template is
adjusted to have deleted operators that make programs ill-formed if the
argument to relevant operators is not the difference_type. This will
make it easier to avoid regressing in future.
libstdc++-v3/ChangeLog:
PR libstdc++/121890
* include/bits/ranges_algo.h (ranges::rotate, ranges::shuffle)
(__insertion_sort, __unguarded_partition_pivot, __introselect):
Use ranges::next to advance iterators. Use local variables in
rotate to avoid duplicate expressions.
(ranges::push_heap, ranges::pop_heap, ranges::partial_sort)
(ranges::partial_sort_copy): Use ranges::prev.
(__final_insertion_sort): Use iter_difference_t<Iter>
for operand of operator+ on iterator.
* include/bits/ranges_base.h (ranges::advance): Use iterator's
difference_type for all iterator arithmetic.
* include/bits/stl_algo.h (__search_n_aux, __rotate)
(__insertion_sort, __unguarded_partition_pivot, __introselect)
(__final_insertion_sort, for_each_n, random_shuffle): Likewise.
Use local variables in __rotate to avoid duplicate expressions.
* include/bits/stl_algobase.h (__fill_n_a, __lc_rai::__newlast1):
Likewise.
* include/bits/stl_heap.h (push_heap): Likewise.
(__is_heap_until): Add static_assert.
(__is_heap): Convert distance to difference_type.
* include/std/functional (boyer_moore_searcher::operator()): Use
iterator's difference_type for iterator arithmetic.
* testsuite/util/testsuite_iterators.h
(random_access_iterator_wrapper): Add deleted overloads of
operators that should be called with difference_type.
* testsuite/24_iterators/range_operations/advance.cc: Use
ranges::next.
* testsuite/25_algorithms/heap/constrained.cc: Use ranges::next
and ranges::prev.
* testsuite/25_algorithms/nth_element/58800.cc: Use std::next.
* testsuite/25_algorithms/nth_element/constrained.cc: Use
ptrdiff_t for loop variable.
* testsuite/25_algorithms/nth_element/random_test.cc: Use
iterator's difference_type instead of int.
* testsuite/25_algorithms/partial_sort/check_compare_by_value.cc:
Use std::next.
* testsuite/25_algorithms/partial_sort/constrained.cc: Use
ptrdiff_t for loop variable.
* testsuite/25_algorithms/partial_sort/random_test.cc: Use
iterator's difference_type instead of int.
* testsuite/25_algorithms/partial_sort_copy/constrained.cc:
Use ptrdiff_t for loop variable.
* testsuite/25_algorithms/partial_sort_copy/random_test.cc:
Use iterator's difference_type instead of int.
* testsuite/std/ranges/adaptors/drop.cc: Use ranges::next.
* testsuite/25_algorithms/fill_n/diff_type.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/diff_type.cc:
New test.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
This tentatively applies the same tweak to twin testcases.
gcc/testsuite/
PR ada/121532
* ada/acats-4/tests/cxa/cxai034.a: Use Long_Switch_To_New_Task
constant instead of Switch_To_New_Task in delay statements.
* ada/acats-4/tests/cxa/cxai035.a: Likewise.
* ada/acats-4/tests/cxa/cxai036.a: Likewise.
|
|
We weren't explicitly treating a pack index specifier as a non-deduced
context (as per [temp.deduct.type]/5), leading to an ICE for the first
testcase below.
PR c++/121795
gcc/cp/ChangeLog:
* pt.cc (unify) <case PACK_INDEX_TYPE>: New non-deduced context
case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp26/pack-indexing17.C: New test.
* g++.dg/cpp26/pack-indexing17a.C: New test.
Reviewed-by: Marek Polacek <polacek@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This patch contains testcases for PR120378 after the change made to
support the vnclipu variant of the SAT_TRUNC pattern.
PR target/120378
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr120378-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr120378-4.c: New test.
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
This patch tries to add support for a variant of SAT_TRUNC where
negative numbers are clipped to 0 instead of NARROW_TYPE_MAX_VALUE.
This form is seen in x264, aka
UT clip (T a)
{
return a & (UT)(-1) ? (-a) >> 31 : a;
}
Where sizeof(UT) < sizeof(T)
I'm unable to get the SAT_TRUNC pattern to appear on x86_64, however it
does appear when building for riscv as seen below:
Before this patch:
<bb 3> [local count: 764504183]:
# i_21 = PHI <i_14(8), 0(15)>
# vectp_x.10_54 = PHI <vectp_x.10_55(8), x_10(D)(15)>
# vectp_res.20_66 = PHI <vectp_res.20_67(8), res_11(D)(15)>
# ivtmp_70 = PHI <ivtmp_71(8), _69(15)>
_72 = .SELECT_VL (ivtmp_70, POLY_INT_CST [4, 4]);
_1 = (long unsigned int) i_21;
_2 = _1 * 4;
_3 = x_10(D) + _2;
ivtmp_53 = _72 * 4;
vect__4.12_57 = .MASK_LEN_LOAD (vectp_x.10_54, 32B, { -1, ... }, _56(D), _72, 0);
vect_x.13_58 = VIEW_CONVERT_EXPR<vector([4,4]) unsigned int>(vect__4.12_57);
vect__38.15_60 = -vect_x.13_58;
vect__15.16_61 = VIEW_CONVERT_EXPR<vector([4,4]) int>(vect__38.15_60);
vect__16.17_62 = vect__15.16_61 >> 31;
mask__29.14_59 = vect_x.13_58 > { 255, ... };
vect__17.18_63 = VEC_COND_EXPR <mask__29.14_59, vect__16.17_62, vect__4.12_57>;
vect__18.19_64 = (vector([4,4]) unsigned char) vect__17.18_63;
_4 = *_3;
_5 = res_11(D) + _1;
x.0_12 = (unsigned int) _4;
_38 = -x.0_12;
_15 = (int) _38;
_16 = _15 >> 31;
_29 = x.0_12 > 255;
_17 = _29 ? _16 : _4;
_18 = (unsigned char) _17;
.MASK_LEN_STORE (vectp_res.20_66, 8B, { -1, ... }, _72, 0, vect__18.19_64);
i_14 = i_21 + 1;
vectp_x.10_55 = vectp_x.10_54 + ivtmp_53;
vectp_res.20_67 = vectp_res.20_66 + _72;
ivtmp_71 = ivtmp_70 - _72;
if (ivtmp_71 != 0)
goto <bb 8>; [89.00%]
else
goto <bb 17>; [11.00%]
After this patch:
<bb 3> [local count: 764504183]:
# i_21 = PHI <i_14(8), 0(15)>
# vectp_x.10_68 = PHI <vectp_x.10_69(8), x_10(D)(15)>
# vectp_res.15_75 = PHI <vectp_res.15_76(8), res_11(D)(15)>
# ivtmp_79 = PHI <ivtmp_80(8), _78(15)>
_81 = .SELECT_VL (ivtmp_79, POLY_INT_CST [4, 4]);
_1 = (long unsigned int) i_21;
_2 = _1 * 4;
_3 = x_10(D) + _2;
ivtmp_67 = _81 * 4;
vect__4.12_71 = .MASK_LEN_LOAD (vectp_x.10_68, 32B, { -1, ... }, _70(D), _81, 0);
vect_patt_37.13_72 = MAX_EXPR <{ 0, ... }, vect__4.12_71>;
vect_patt_39.14_73 = .SAT_TRUNC (vect_patt_37.13_72);
_4 = *_3;
_5 = res_11(D) + _1;
x.0_12 = (unsigned int) _4;
_38 = -x.0_12;
_15 = (int) _38;
_16 = _15 >> 31;
_29 = x.0_12 > 255;
_17 = _29 ? _16 : _4;
_18 = (unsigned char) _17;
.MASK_LEN_STORE (vectp_res.15_75, 8B, { -1, ... }, _81, 0, vect_patt_39.14_73);
i_14 = i_21 + 1;
vectp_x.10_69 = vectp_x.10_68 + ivtmp_67;
vectp_res.15_76 = vectp_res.15_75 + _81;
ivtmp_80 = ivtmp_79 - _81;
if (ivtmp_80 != 0)
goto <bb 8>; [89.00%]
else
goto <bb 17>; [11.00%]
gcc/ChangeLog:
* match.pd: New NARROW_CLIP variant for SAT_TRUNC.
* tree-vect-patterns.cc (gimple_unsigned_integer_narrow_clip):
Add new decl for NARROW_CLIP.
(vect_recog_sat_trunc_pattern): Add NARROW_CLIP check.
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
After
commit 8cad8f94b450be9b73d07bdeef7fa1778d3f2b96
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Sep 5 15:40:51 2025 -0700
c: Update TLS model after processing a TLS variable
GCC will upgrade local-dynamic TLS model to local-exec without -fPIC.
Compile TLS LD tests with -fPIC to keep local-dynamic TLS model.
PR testsuite/121888
* gcc.target/sparc/tls-ld-int16.c: Compile with -fPIC.
* gcc.target/sparc/tls-ld-int32.c: Likewise.
* gcc.target/sparc/tls-ld-int64.c: Likewise.
* gcc.target/sparc/tls-ld-int8.c: Likewise.
* gcc.target/sparc/tls-ld-uint16.c: Likewise.
* gcc.target/sparc/tls-ld-uint32.c: Likewise.
* gcc.target/sparc/tls-ld-uint8.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
gcc/ChangeLog:
PR diagnostics/120063
* diagnostics/context.cc (context::execution_failed_p): Also treat
any kind::fatal errors as leading to failed execution.
* diagnostics/sarif-sink.cc (maybe_get_sarif_level): Handle
kind::fatal as SARIF level "error".
gcc/testsuite/ChangeLog:
PR diagnostics/120063
* gcc.dg/fatal-error.c: New test.
* gcc.dg/fatal-error-html.py: New test.
* gcc.dg/fatal-error-sarif.py: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
PR diagnostics/121876 tracks an issue inside our crash-handling, where
if an ICE happens when we're within a nested diagnostic, an assertion
fails inside diagnostic::context::set_diagnostic_buffer, leading to
a 2nd ICE. Happily, this does not infinitely recurse, but it obscures
the original ICE and the useful part of the backtrace, and any SARIF or
HTML sinks we were writing to are left as empty files.
This patch tweaks the above so that the assertion doesn't fail, and adds
test coverage (via a plugin) to ensure that such ICEs/crashes are
gracefully handled and e.g. captured in SARIF/HTML output.
gcc/ChangeLog:
PR diagnostics/121876
* diagnostics/buffering.cc (context::set_diagnostic_buffer): Add
early reject of the no-op case.
gcc/testsuite/ChangeLog:
PR diagnostics/121876
* gcc.dg/plugin/crash-test-nested-ice-html.py: New test.
* gcc.dg/plugin/crash-test-nested-ice-sarif.py: New test.
* gcc.dg/plugin/crash-test-nested-ice.c: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null-html.py: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null-sarif.py: New test.
* gcc.dg/plugin/crash-test-nested-write-through-null.c: New test.
* gcc.dg/plugin/crash_test_plugin.cc: Add "nested" argument, and when
set, inject the problem within a nested diagnostic.
* gcc.dg/plugin/plugin.exp: Add crash-test-nested-ice.c and
crash-test-nested-write-through-null.c.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/crash-test-write-though-null-sarif.c: Rename to...
* gcc.dg/plugin/crash-test-write-through-null-sarif.c: ...this.
* gcc.dg/plugin/crash-test-write-though-null-stderr.c: Rename to...
* gcc.dg/plugin/crash-test-write-through-null-stderr.c: ...this.
* gcc.dg/plugin/plugin.exp: Update for above renamings. Sort the
test files for crash_test_plugin.cc alphabetically.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Another lp64 vs lp64d issue. This time adjusting a #include in the test isn't
sufficient. So instead this sets the ABI to lp64d instead of lp64. I don't
think that'll impact the test materially.
Tested on the BPI and Pioneer systems where it fixes the failures with the
Andes tests. Pushing to the trunk.
gcc/testsuite
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dots.c:
Adjust ABI specification.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/non-overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/non-policy/overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/non-overloaded/nds_vln8.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dots.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dotsu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vd4dotu.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfncvtbf16s.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfpmadb.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfpmadt.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vfwcvtsbf16.c:
Likewise.
* gcc.target/riscv/rvv/xandesvector/policy/overloaded/nds_vln8.c:
Likewise.
|
|
Backport of upstream patch:
https://github.com/uxlfoundation/oneDPL/pull/1589
libstdc++-v3/ChangeLog:
PR libstdc++/117276
* include/pstl/parallel_backend_tbb.h (__func_task::finalize):
Make deallocation unconditional.
|
|
The r16-3435-gbbc0e70b610f19 change (for LWG 4294) needs to be applied
to the debug mode __gnu_debug::bitset as well as the normal one.
libstdc++-v3/ChangeLog:
PR libstdc++/121046
* include/debug/bitset (bitset(const CharT*, ...)): Add
constraints on CharT type.
|
|
My r16-3559-gc2e567a6edb563 reworked ADL for modules, including a change
to allow seeing module-linkage declarations if they only exist on the
instantiation path. This caused a crash however as I neglected to
unwrap the stat hack wrapper when we were happy to see all declarations,
allowing search_adl to add non-functions to the overload set.
PR c++/121893
gcc/cp/ChangeLog:
* name-lookup.cc (name_lookup::adl_namespace_fns): Unwrap the
STAT_HACK also when on_inst_path.
gcc/testsuite/ChangeLog:
* g++.dg/modules/adl-10_a.C: New test.
* g++.dg/modules/adl-10_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
[PR121865]
On a DECL, TREE_CHAIN will find any other declarations in the same
binding level. This caused an ICE in PR121865 because the next entity
in the binding level was the uninstantiated unique friend 'foo', for
which after being found the compiler tries to generate a mangled name
for it and crashes.
This didn't happen in non-modules testcases only because normally the
unique friend function would have been chained after its template_decl,
and find_decl_types_r bails on lang-specific nodes so it never saw the
uninstantiated decl. With modules however the order of chaining
changed, causing the error.
I don't think it's ever necessary to walk into the DECL_CHAIN, from what
I can see; other cases where it might be useful (block vars or type
fields) are already handled explicitly elsewhere, and only one test
fails because of the change, due to accidentally relying on this "walk
into the next in-scope declaration" behaviour.
PR c++/121865
gcc/ChangeLog:
* ipa-free-lang-data.cc (find_decls_types_r): Don't walk into
DECL_CHAIN for any DECL.
gcc/testsuite/ChangeLog:
* g++.dg/lto/pr101396_0.C: Ensure A will be walked into (and
isn't constant-folded out of the GIMPLE for the function).
* g++.dg/lto/pr101396_1.C: Add message.
* g++.dg/modules/lto-4_a.C: New test.
* g++.dg/modules/lto-4_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Richard Biener <rguenther@suse.de>
|
|
My r16-3810-g6456da6bab8a2c changes broke bootstrap for targets that use
the mutex-based atomic helpers. This fixes it by casting away the
unnecessary volatile-qualification on the _Atomic_word* before passing
it to __exchange_and_add_single.
libstdc++-v3/ChangeLog:
* config/cpu/generic/atomicity_mutex/atomicity.h
(__exchange_and_add): Use const_cast to remove volatile.
|
|
gcc/
* ipa-pure-const.cc (check_stmt): Minor formatting tweaks.
(pass_data_nothrow): Fix pasto in description.
|
|
comparison values
Given a sequence such as
int foo ()
{
#pragma GCC unroll 4
  for (int i = 0; i < N; i++)
    if (a[i] == 124)
      return 1;
  return 0;
}
where a[i] is long long, we will unroll the loop and use an OR reduction for
early break on Adv. SIMD. Afterwards the sequence is followed by a compression
step to compress the 128-bit vectors into 64 bits for use by the branch.
However, if we have support for add halving and narrowing, then instead of
using an OR we can use an ADDHN, which will do the combining and narrowing.
Note that for now I only handle the last OR; however, if we have more than
one level of unrolling we could technically chain them. I will revisit this
in an upcoming early break series, but an unroll of 2 is fairly common.
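As a hedged illustration with Adv. SIMD intrinsics (not code from the
patch): assuming the two inputs are per-lane masks (each lane all-ones or
zero), vaddhn_u16 adds the lanes and keeps the high byte, so the narrowed
result is nonzero exactly when either mask has a lane set, combining and
narrowing in one step instead of an ORR plus a separate compression:
#include <arm_neon.h>
#include <stdint.h>
static inline uint64_t
any_lane_set (uint16x8_t mask0, uint16x8_t mask1)
{
  uint8x8_t narrowed = vaddhn_u16 (mask0, mask1);
  return vget_lane_u64 (vreinterpret_u64_u8 (narrowed), 0);
}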
gcc/ChangeLog:
* internal-fn.def (VEC_TRUNC_ADD_HIGH): New.
* doc/generic.texi: Document it.
* optabs.def (vec_trunc_add_high): New.
* doc/md.texi: Document it.
* tree-vect-stmts.cc (vectorizable_early_exit): Use addhn if supported.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-early-break-addhn_1.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_2.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_3.c: New test.
* gcc.target/aarch64/vect-early-break-addhn_4.c: New test.
|
|
This implements the new vector optabs vec_<su>addh_narrow<mode>
adding support for in-vectorizer use for early break.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (vec_addh_narrow<mode>): New.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-addhn_1.c: New test.
|
|
If the user has requested loop unrolling through pragma GCC unroll then at the
moment we only set LOOP_VINFO_USER_UNROLL if the vectorizer has not overridden
the unroll factor (through backend costing) or if the VF made the requested
unroll factor be 1.
When we have a loop of, say, int with a pragma unroll of 4 and the vectorizer
picks V4SI as the mode, the requested unroll ends up exactly matching the VF.
As such the requested unroll is 1 and we don't clear the pragma.
So the vectorizer did honor the requested unroll factor. However, since we
didn't set the unroll amount back and left it at 4, the RTL unroller won't use
the RTL cost model at all and will just unroll the vector loop 4 times.
Both of these events are costing related, and so it stands to reason that we
should set LOOP_VINFO_USER_UNROLL so that we return the RTL unroller to using
the backend costing for any further unrolling.
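A hypothetical example of the scenario (not from the patch): with V4SI the
VF is 4, so the requested unroll of 4 is fully consumed by vectorization;
marking the pragma as handled lets the RTL unroller apply its own cost
model instead of unconditionally unrolling the vector loop another 4 times:
void
scale (int *restrict a, int n)
{
#pragma GCC unroll 4
  for (int i = 0; i < n; i++)
    a[i] = a[i] * 3;
}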
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_1): If the unroll pragma was set
mark it as handled.
* doc/extend.texi (pragma GCC unroll): Update documentation.
|