2024-06-30 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/115691
* config/pa/pa.md: Remove incorrect xmpyu patterns.
|
|
The following restricts copying of points-to info from defs that
might be in regions invoking UB and are never executed.
PR tree-optimization/115701
* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy):
Only copy info from within the same BB.
* gcc.dg/torture/pr115701.c: New testcase.
|
|
The following factors out the code that preserves SSA info of the LHS
of an SSA copy LHS = RHS when LHS is about to be eliminated to RHS.
PR tree-optimization/115701
* tree-ssanames.h (maybe_duplicate_ssa_info_at_copy): Declare.
* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy): New
function, split out from ...
* tree-ssa-copy.cc (fini_copy_prop): ... here.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): ...
and here.
|
|
The following makes sure that for an SLP reduction all lanes have
the same STMT_VINFO_REDUC_IDX. Once we move that info and can adjust
it we can implement swapping. It also makes the existing protection
against operand swapping trigger for all stmts participating in a
reduction, not just the final one marked as reduction-def.
* tree-vect-slp.cc (vect_build_slp_tree_1): Compare
STMT_VINFO_REDUC_IDX.
(vect_build_slp_tree_2): Prevent operand swapping for
all stmts participating in a reduction.
|
|
The input vectype of a reduction PHI statement must be determined before
vect cost computation for the reduction. Since a lane-reducing operation has
a different input vectype from a normal one, we need to traverse all reduction
statements to find the input vectype with the fewest lanes, and set that on
the PHI statement.
2024-06-16 Feng Xue <fxue@os.amperecomputing.com>
gcc/
* tree-vect-loop.cc (vectorizable_reduction): Determine input vectype
during traversal of reduction statements.
|
|
Allow shift-by-induction for slp node, when it is single lane, which is
aligned with the original loop-based handling.
2024-06-26 Feng Xue <fxue@os.amperecomputing.com>
gcc/
* tree-vect-stmts.cc (vectorizable_shift): Allow shift-by-induction
for single-lane slp node.
gcc/testsuite/
* gcc.dg/vect/vect-shift-6.c
* gcc.dg/vect/vect-shift-7.c
|
|
|
|
Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.
Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.
This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.
The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX"),
<https://gcc.gnu.org/ml/gcc-patches/2004-10/msg02027.html>, and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"),
<https://gcc.gnu.org/ml/gcc-patches/2006-12/msg00431.html>, where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore. However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.
Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.
gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.
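The sentinel collision described in this commit message can be modelled in a few lines. This is a minimal sketch, not code from cse.cc: `unassigned_reg_qty`, `kOldNoRegister`, `kNewNoRegister` and `qty_matches` are hypothetical names, and the `-regno - 1` encoding is an assumption chosen to agree with the statement that register 0 without a quantity holds -1.

```cpp
#include <cassert>
#include <climits>

// Assumed encoding (hypothetical helper): a register that has not been
// assigned a quantity stores a negated register number, so register 0 -> -1.
constexpr int unassigned_reg_qty(int regno) { return -regno - 1; }

// Sentinels meaning "the comparison is not with a register".
constexpr int kOldNoRegister = -1;       // same value as unassigned register 0
constexpr int kNewNoRegister = INT_MIN;  // still negative, out of reach for realistic regnos

// Same shape as the equality test quoted from fold_rtx.
constexpr bool qty_matches(int reg_qty, int comparison_qty) {
  return reg_qty == comparison_qty;
}

// With the old sentinel, unassigned register 0 looks like "the compared operand".
static_assert(qty_matches(unassigned_reg_qty(0), kOldNoRegister),
              "old sentinel produces a false match");
// With INT_MIN the false match disappears.
static_assert(!qty_matches(unassigned_reg_qty(0), kNewNoRegister),
              "new sentinel does not match register 0");
```

Keeping the sentinel negative means any check of the form "quantity is non-negative" continues to reject it, which is the consistency argument the message makes.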
|
|
I hadn't updated my repo on the host where I handle email, so it picked
up the older version of this patch without the testsuite fix. So, V4
with the testsuite option for lmul fixed.
--
And Sergei's movmem patch. Just trivial testsuite adjustment for an
option name change and a whitespace fix from me.
I've spun this in my tester for rv32 and rv64. I'll wait for pre-commit
CI before taking further action.
Just a reminder, this patch is designed to handle the case where we can
issue a single vector load/store which avoids all the complexities of
determining which direction to copy.
--
gcc/ChangeLog
* config/riscv/riscv.md (movmem<mode>): New expander.
gcc/testsuite/ChangeLog
PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test
|
|
gcc/fortran/ChangeLog:
PR fortran/114019
* trans-stmt.cc (gfc_trans_allocate): Fix handling of case of
scalar character expression being used for SOURCE.
gcc/testsuite/ChangeLog:
PR fortran/114019
* gfortran.dg/allocate_with_source_33.f90: New test.
|
|
This patch adds support for the form of unsigned scalar .SAT_ADD
where one of the operands is an immediate (IMM). For example:
Form IMM:
#define DEF_SAT_U_ADD_IMM_FMT_1(T) \
T __attribute__((noinline)) \
sat_u_add_imm_##T##_fmt_1 (T x) \
{ \
return (T)(x + 9) >= x ? (x + 9) : -1; \
}
DEF_SAT_U_ADD_IMM_FMT_1(uint64_t)
Before this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
long unsigned int _1;
uint64_t _3;
;; basic block 2, loop depth 0
;; pred: ENTRY
_1 = MIN_EXPR <x_2(D), 18446744073709551606>;
_3 = _1 + 9;
return _3;
;; succ: EXIT
}
After this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
uint64_t _3;
;; basic block 2, loop depth 0
;; pred: ENTRY
_3 = .SAT_ADD (x_2(D), 9); [tail call]
return _3;
;; succ: EXIT
}
The following test suites pass with this patch:
1. The rv64gcv full regression test with newlib.
2. The x86 bootstrap test.
3. The x86 full regression test.
gcc/ChangeLog:
* match.pd: Add imm form for .SAT_ADD matching.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add .SAT_ADD matching under PLUS_EXPR.
Signed-off-by: Pan Li <pan2.li@intel.com>
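As a cross-check of the two GIMPLE dumps above, the following sketch writes the source form and the MIN_EXPR form as plain C++ functions and shows that both saturate at the maximum value. The function names are hypothetical and the immediate is fixed at 9 as in the example; this only illustrates the arithmetic, not how match.pd recognizes the pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Branchy source form from the commit message, with the immediate fixed at 9.
constexpr std::uint64_t sat_u_add_imm9(std::uint64_t x) {
  return static_cast<std::uint64_t>(x + 9) >= x ? x + 9
                                                : static_cast<std::uint64_t>(-1);
}

// Shape of the "before" GIMPLE: clamp x to UINT64_MAX - 9
// (18446744073709551606), then add 9.
constexpr std::uint64_t sat_u_add_imm9_min(std::uint64_t x) {
  return std::min<std::uint64_t>(
             x, std::numeric_limits<std::uint64_t>::max() - 9) + 9;
}
```

Both functions return `x + 9` when the sum fits and the all-ones value otherwise, which is the behaviour a single .SAT_ADD call is meant to express.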
|
|
r15-1699-g445c62ee492 contains changes that trigger two maybe-uninitialized
warnings on Darwin, which result in a bootstrap failure.
Note that the warnings are false positives: the variables are in fact always
initialized by a switch (all values of the switch condition are
covered).
Fixed here by providing default initializations for the relevant variables.
gcc/jit/ChangeLog:
* jit-recording.cc
(recording::memento_of_typeinfo::make_debug_string): Default the value
of ident.
(recording::memento_of_typeinfo::write_reproducer): Default the value
of type.
Signed-off-by: Iain Sandoe <iains@gcc.gnu.org>
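The pattern behind this fix can be sketched as follows. The enum and function here are hypothetical stand-ins, not the jit-recording.cc code; the point is only that a variable assigned in every case of a fully covered switch may still be reported as maybe-uninitialized, and that giving it a default value avoids the warning.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the enum that drives the switch.
enum class type_info_kind { size_of, align_of };

const char *describe(type_info_kind kind) {
  // Default value: every enumerator is handled below, but the compiler
  // cannot always prove that, so start from a defined value.
  const char *ident = "";
  switch (kind) {
    case type_info_kind::size_of:
      ident = "sizeof";
      break;
    case type_info_kind::align_of:
      ident = "alignof";
      break;
  }
  return ident;
}
```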
|
|
So the recent IRA change exposed a bug in the mcore backend.
The mcore has a special instruction (xtrb3) which can zero extend a GPR into
R1. It's useful because zextb requires a matching source/destination.
Unfortunately xtrb3 modifies CC.
The IRA changes twiddle register allocation such that we want to use xtrb3.
Unfortunately CC is live at the point where we want to use xtrb3 and clobbering
CC causes the test to fail.
Exposing the clobber in the expander and insn seems like the best path forward.
We could also drop the xtrb3 alternative, but that seems like it would hurt
codegen more than exposing the clobber.
The bitfield extraction patterns using xtrb look problematic as well, but I
didn't try to fix those.
This fixes the builtin-arith-overflow regressions and appears to fix
20010122-1.c as a side effect.
gcc/
* config/mcore/mcore.md (zero_extendqihi2): Clobber CC in expander
and matching insn.
(zero_extendqisi2): Likewise.
|
|
|
|
Here we notice the 'this' conversion for the call f<void>() is bad, so
we correctly defer deduction for the template candidate, but we end up
never adding it to 'bad_cands' since missing_conversion_p for it returns
false (its only argument is 'this' which has already been determined to
be bad). This is not a huge deal, but it causes us to no longer accept the
call with -fpermissive in release builds, and a tree check ICE in checking
builds.
So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_cands' even if no argument
conversion is missing.
PR c++/106760
gcc/cp/ChangeLog:
* call.cc (add_candidates): Relax test for adding a candidate
to 'bad_cands' to also accept an uninstantiated template candidate
that has no missing conversions.
gcc/testsuite/ChangeLog:
* g++.dg/ext/conv3.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
When the library is configured with --disable-libstdcxx-verbose the
assertions just abort instead of calling __glibcxx_assert_fail, and so I
didn't export that function for the non-verbose build. However, that
option is documented to not change the library ABI, so we still need to
export the symbol from the library. It could be needed by programs
compiled against the headers from a verbose build.
The non-verbose definition can just call abort so that it doesn't pull
in I/O symbols, which are unwanted in a non-verbose build.
libstdc++-v3/ChangeLog:
PR libstdc++/115585
* src/c++11/assert_fail.cc (__glibcxx_assert_fail): Add
definition for non-verbose builds.
|
|
We optimize std::equal to memcmp for integers and pointers, which means
that std::byte comparisons generate bigger code than char comparisons.
We can't use memcmp for arbitrary enum types, because they could have an
overloaded operator== that has custom semantics, but we know that
std::byte doesn't do that.
libstdc++-v3/ChangeLog:
PR libstdc++/101485
* include/bits/stl_algobase.h (__equal_aux1): Check for
std::byte as well.
* testsuite/25_algorithms/equal/101485.cc: New test.
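To illustrate why memcmp is not a valid substitute for arbitrary enum types, here is a small sketch with a hypothetical enum whose operator== ignores letter case. The names `Letter`, `equal_by_operator` and `equal_by_memcmp` are made up for this example; std::byte has no such overload, which is what makes the optimization safe for it.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical enum with custom equality semantics.
enum class Letter : unsigned char { upper_a = 'A', lower_a = 'a', lower_b = 'b' };

// Case-insensitive comparison: 'A' and 'a' compare equal.
inline bool operator==(Letter l, Letter r) {
  return std::tolower(static_cast<unsigned char>(l)) ==
         std::tolower(static_cast<unsigned char>(r));
}

// Element-wise comparison through operator==, as std::equal must do.
inline bool equal_by_operator(Letter a, Letter b) {
  return std::equal(&a, &a + 1, &b);
}

// Byte-wise comparison, which ignores the overloaded operator.
inline bool equal_by_memcmp(Letter a, Letter b) {
  return std::memcmp(&a, &b, sizeof(Letter)) == 0;
}
```

For `Letter::upper_a` and `Letter::lower_a` the two helpers disagree, so a library cannot switch to memcmp without knowing that the type has no user-provided operator==.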
|
|
When -faligned-new (or Clang's -faligned-allocation) is used our
allocators try to support extended alignments, gated on the
__cpp_aligned_new macro. However, because they use alignof(_Tp) which is
not a keyword in C++98 mode, using -std=c++98 -faligned-new results in
errors from <memory> and other headers.
We could change them to use __alignof__ instead of alignof, but that
would potentially alter the result of the conditions, because e.g.
alignof(long long) != __alignof__(long long) on some targets. That's
probably not an issue for any types with extended alignment, so maybe it
would be a safe change.
For now, it seems acceptable to just disable the extended alignment
support in C++98 mode, so that -faligned-new enables std::align_val_t
and the corresponding operator new overloads, but doesn't affect
std::allocator, __gnu_cxx::__bitmap_allocator etc.
libstdc++-v3/ChangeLog:
PR libstdc++/104395
* include/bits/new_allocator.h: Disable extended alignment
support in C++98 mode.
* include/bits/stl_tempbuf.h: Likewise.
* include/ext/bitmap_allocator.h: Likewise.
* include/ext/malloc_allocator.h: Likewise.
* include/ext/mt_allocator.h: Likewise.
* include/ext/pool_allocator.h: Likewise.
* testsuite/ext/104395.cc: New test.
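A minimal sketch of the kind of gating described above, assuming the condition is a combination of the __cpp_aligned_new feature macro and a C++11 check. The macro `SKETCH_EXTENDED_ALIGNMENT` and the function `sketch_uses_aligned_new` are hypothetical; the actual headers may structure the test differently.

```cpp
#include <cassert>

// Gate the extended-alignment path on both the feature macro and C++11,
// because alignof is not a keyword in C++98.
#if defined(__cpp_aligned_new) && __cplusplus >= 201103L
# define SKETCH_EXTENDED_ALIGNMENT 1
#else
# define SKETCH_EXTENDED_ALIGNMENT 0
#endif

// Returns true when an allocator for T would need the aligned operator new.
template<typename T>
bool sketch_uses_aligned_new() {
#if SKETCH_EXTENDED_ALIGNMENT
  return alignof(T) > __STDCPP_DEFAULT_NEW_ALIGNMENT__;
#else
  return false;  // no extended alignment support: always plain operator new
#endif
}
```

With this shape, -std=c++98 -faligned-new never reaches the alignof expression, so the headers still parse, while later standards keep the extended alignment path.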
|
|
As noted in a comment, the __gnu_cxx::__aligned_membuf class template
can be simplified, because alignof(T) and alignas(T) use the correct
alignment for a data member. That's true since GCC 8 and Clang 8. The
EDG front end (as used by Intel icc, aka "Intel C++ Compiler Classic")
does not implement the PR c++/69560 change, so keep using the old
implementation when __EDG__ is defined, to avoid an ABI change for icc.
For __gnu_cxx::__aligned_buffer<T> all supported compilers agree on the
value of __alignof__(T), but we can still simplify it by removing the
dependency on std::aligned_storage<sizeof(T), __alignof__(T)>.
Add a test that checks that the aligned buffer types have the expected
alignment, so that we can tell if changes like this affect their ABI
properties.
libstdc++-v3/ChangeLog:
* include/ext/aligned_buffer.h (__aligned_membuf): Use
alignas(T) directly instead of defining a struct and using its
alignment.
(__aligned_buffer): Remove use of std::aligned_storage.
* testsuite/abi/aligned_buffers.cc: New test.
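The simplified approach can be sketched as a buffer that applies alignas(T) directly to its storage. `membuf_sketch` and `holder_sketch` are hypothetical names for illustration and are not the library's __aligned_membuf; the static assertions show the kind of property the new ABI test is meant to pin down.

```cpp
#include <cassert>
#include <cstddef>

// Raw storage for a T, aligned as a T data member would be.
template<typename T>
struct membuf_sketch {
  alignas(T) unsigned char storage[sizeof(T)];

  void *addr() noexcept { return static_cast<void *>(storage); }
};

// A holder that places the buffer after a char, to observe member alignment.
struct holder_sketch {
  char tag;
  membuf_sketch<double> buf;
};

static_assert(alignof(membuf_sketch<double>) == alignof(double),
              "buffer has the alignment of the stored type");
static_assert(sizeof(membuf_sketch<double>) == sizeof(double),
              "buffer has the size of the stored type");
```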
|
|
Allow ssa_lazy cache to allocate bitmaps from a client provided obstack
if so desired.
* gimple-range-cache.cc (ssa_lazy_cache::ssa_lazy_cache): Relocate here.
Check for provided obstack.
(ssa_lazy_cache::~ssa_lazy_cache): Relocate here. Free bitmap or obstack.
* gimple-range-cache.h (ssa_lazy_cache::ssa_lazy_cache): Move.
(ssa_lazy_cache::~ssa_lazy_cache): Move.
(ssa_lazy_cache::m_ob): New.
* gimple-range.cc (dom_ranger::dom_ranger): Initialize obstack.
(dom_ranger::~dom_ranger): Release obstack.
(dom_ranger::pre_bb): Create ssa_lazy_cache using obstack.
* gimple-range.h (m_bitmaps): New.
|
|
Remove extra assignment, extra temp variable and variable shadowing.
No functional changes intended.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_move): Remove extra
assignment to tmp variable, reuse tmp variable instead of
declaring new temporary variable and remove tmp variable shadowing.
|
|
Using auto_vec rather than vec means the vectors are released
automatically upon return, which stops the leak. The problem seems to be
that auto_vec<T, N> is not really move-aware; only the <T, 0>
specialization is.
gcc/ChangeLog:
* tree-profile.cc (find_conditions): Use auto_vec without
embedded storage.
|
|
The following addresses the corner case of an outer loop with an empty
header where we end up asking for the BB of a NULL stmt by
special-casing this case.
PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Handle the case
where the outer loop header block is empty.
|
|
ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL [PR115635]
This patch fixes 3 bugs reported after merging the "Add DLL
import/export implementation to AArch64" series.
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653955.html The
series refactors the i386 codebase to reuse it in AArch64, which
triggers some bugs.
Bug 115661 - [15 Regression] wrong code at -O{2,3} on x86_64-linux-gnu
since r15-1599-g63512c72df09b4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115661
Bug 115635 - [15 regression] Bootstrap fails with failed self-test
with the rust fe (diagnostic-path.cc:1153: test_empty_path: FAIL:
ASSERT_FALSE ((path.interprocedural_p ()))) since
r15-1599-g63512c72df09b4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115635
Issue 1. In some code, i386 has been relying on the
legitimize_pe_coff_symbol call on all platforms, which should return
NULL_RTX if it is not supported.
Fix: NULL_RTX handling has been added when the target does not support
PECOFF.
Issue 2. ix86_GOT_alias_set is used on all platforms and cannot be
extracted to mingw.
Fix: ix86_GOT_alias_set has been restored as it was and is used on all
platforms for i386.
Bug 115643 - [15 regression] aarch64-w64-mingw32 support today breaks
x86_64-w64-mingw32 build cannot represent relocation type BFD_RELOC_64
since r15-1602-ged20feebd9ea31
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115643
Issue 3. PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED has been added and
used with a negative operator for a complex expression without braces.
Fix: Braces have been added, and
PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED has been renamed to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
2024-06-28 Evgeny Karpov <Evgeny.Karpov@microsoft.com>
gcc/ChangeLog:
PR bootstrap/115635
PR target/115643
PR target/115661
* config/aarch64/cygming.h
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
(PE_COFF_LEGITIMIZE_EXTERN_DECL): Likewise.
* config/i386/cygming.h (GOT_ALIAS_SET): Remove the definition to
reuse it from i386.h.
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
(PE_COFF_LEGITIMIZE_EXTERN_DECL): Likewise.
* config/i386/i386-expand.cc (ix86_expand_move): Return
ix86_GOT_alias_set.
* config/i386/i386-expand.h (ix86_GOT_alias_set): Likewise.
* config/i386/i386.cc (ix86_GOT_alias_set): Likewise.
* config/i386/i386.h (GOT_ALIAS_SET): Likewise.
* config/mingw/winnt-dll.cc (get_dllimport_decl): Use
GOT_ALIAS_SET.
(legitimize_pe_coff_symbol): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
* config/mingw/winnt-dll.h (ix86_GOT_alias_set): Declare
ix86_GOT_alias_set.
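Issue 3 in this commit is the classic hazard of negating a macro whose body is not parenthesized. The sketch below uses hypothetical macros and parameter names, not the real PE_COFF_LEGITIMIZE_EXTERN_DECL definition, to show how the negation ends up applying to the first operand only.

```cpp
#include <cassert>

// Hypothetical macros; only the missing or present parentheses matter.
#define SKETCH_SHOULD_LEGITIMIZE_BAD(is_extern, is_dllimport) \
  is_extern || is_dllimport
#define SKETCH_SHOULD_LEGITIMIZE_GOOD(is_extern, is_dllimport) \
  ((is_extern) || (is_dllimport))

// Expands to !is_extern || is_dllimport: the negation binds to the
// first operand only.
inline bool skip_bad(bool is_extern, bool is_dllimport) {
  return !SKETCH_SHOULD_LEGITIMIZE_BAD(is_extern, is_dllimport);
}

// Expands to !((is_extern) || (is_dllimport)): the whole condition is negated.
inline bool skip_good(bool is_extern, bool is_dllimport) {
  return !SKETCH_SHOULD_LEGITIMIZE_GOOD(is_extern, is_dllimport);
}
```

The two helpers disagree whenever the second argument is true, which is the kind of silent logic change that can surface as wrong code rather than as a build error.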
|
|
gcc/ChangeLog:
* range-op-ptr.cc (class hybrid_and_operator): Remove.
(class hybrid_or_operator): Same.
(class hybrid_min_operator): Same.
(class hybrid_max_operator): Same.
|
|
The following fixes wrong-code when using outer loop vectorization
and an inner loop SLP access with permutation. A wrong adjustment
to the IV increment is then applied on GCN.
PR tree-optimization/115640
* tree-vect-stmts.cc (vectorizable_load): With an inner
loop SLP access do not apply a gap adjustment.
|
|
There was an off-by-one error in the RDNA validation check, plus I forgot to
allow for two-to-one permute-and-merge operations.
PR target/115640
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Modify RDNA checks.
|
|
First step towards adding a general routine that assigns all of a class
type's data members. Having a general routine prevents forgetting to
tackle the edge cases, e.g. setting _len.
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_class_set_vptr): Add setting of _vptr
member.
* trans-intrinsic.cc (conv_intrinsic_move_alloc): First use
of gfc_class_set_vptr and refactor very similar code.
* trans.h (gfc_class_set_vptr): Declare the new function.
gcc/testsuite/ChangeLog:
* gfortran.dg/unlimited_polymorphic_11.f90: Remove unnecessary
casts in dg-final expression.
|
|
The vptr for a class type is set in various ways in different
locations. Refactor the use and simplify code.
gcc/fortran/ChangeLog:
* trans-array.cc (structure_alloc_comps): Use reset_vptr.
* trans-decl.cc (gfc_trans_deferred_vars): Same.
(gfc_generate_function_code): Same.
* trans-expr.cc (gfc_reset_vptr): Allow supplying the class
type.
(gfc_conv_procedure_call): Use reset_vptr.
* trans-intrinsic.cc (gfc_conv_intrinsic_transfer): Same.
|
|
This patch generalizes some of the patterns in i386.md that recognize
double word concatenation, so they handle sign_extend the same way that
they handle zero_extend in appropriate contexts.
As a motivating example consider the following function:
__int128 foo(long long x, unsigned long long y)
{
return ((__int128)x<<64) | y;
}
when compiled with -O2, x86_64 currently generates:
foo: movq %rdi, %rdx
xorl %eax, %eax
xorl %edi, %edi
orq %rsi, %rax
orq %rdi, %rdx
ret
with this patch we now generate (the same as if x is unsigned):
foo: movq %rsi, %rax
movq %rdi, %rdx
ret
Treating both extensions the same way using any_extend is valid as
the top (extended) bits are "unused" after the shift by 64 (or more).
In theory, the RTL optimizers might consider canonicalizing the form
of extension used in these cases, but zero_extend is faster on some
machines, whereas sign extension is supported via addressing modes on
others, so handling both in the machine description is probably best.
2024-06-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*concat<mode><dwi>3_3): Change zero_extend
to any_extend in first operand to left shift by mode precision.
(*concat<mode><dwi>3_4): Likewise.
(*concat<mode><dwi>3_6): Likewise.
gcc/testsuite/ChangeLog
* gcc.target/i386/concatditi-1.c: New test case.
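The claim that the extended bits are unused after the shift can be checked on a scaled-down model. This sketch uses 32-bit halves in a 64-bit result instead of the 64-bit halves and __int128 of the example above, so that it stays portable; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// High half widened with sign extension before the shift by the half width.
constexpr std::uint64_t concat_sign_extend(std::int32_t hi, std::uint32_t lo) {
  return (static_cast<std::uint64_t>(static_cast<std::int64_t>(hi)) << 32) | lo;
}

// High half widened with zero extension before the shift by the half width.
constexpr std::uint64_t concat_zero_extend(std::int32_t hi, std::uint32_t lo) {
  return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi)) << 32) | lo;
}

// The bits that differ between the two extensions are shifted out entirely.
static_assert(concat_sign_extend(-1, 5) == concat_zero_extend(-1, 5),
              "extension kind does not matter after a full-width shift");
```

Because the shift amount equals the width of the original value, every bit produced by the extension falls off the top, which is why one pattern can serve both extension kinds.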
|
|
This patch is another round of refinements to fine tune the new ternlog
infrastructure in i386's sse.md. This patch tweaks ix86_ternlog_idx
to allow multiple MEM/CONST_VECTOR/VEC_DUPLICATE operands prior to
splitting (before reload), when force_register is called on all but
one of these operands. Conceptually during the dynamic programming,
registers fill the args slots in the order 0, 1, 2, and mem-like
operands fill the slots in the order 2, 0, 1 [preferring the memory
operand to come last].
This patch allows us to remove some of the legacy ternlog patterns
in sse.md without regressions [which is left to the next and final
patch in this series]. An indication that these patterns are no
longer required is shown by the necessary testsuite tweaks below,
where the output assembler for the legacy instructions used hexadecimal,
but the new ternlog infrastructure now consistently uses decimal.
2024-06-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_ternlog_idx) <case VEC_DUPLICATE>:
Add a "goto do_mem_operand" as this need not match memory_operand.
<case CONST_VECTOR>: Only args[2] may be volatile memory operand.
Allow MEM/VEC_DUPLICATE/CONST_VECTOR as args[0] and args[1].
gcc/testsuite/ChangeLog
* gcc.target/i386/avx512f-andn-di-zmm-2.c: Match decimal instead
of hexadecimal immediate operand to ternlog.
* gcc.target/i386/avx512f-andn-si-zmm-2.c: Likewise.
* gcc.target/i386/avx512f-orn-si-zmm-1.c: Likewise.
* gcc.target/i386/avx512f-orn-si-zmm-2.c: Likewise.
* gcc.target/i386/pr100711-3.c: Likewise.
* gcc.target/i386/pr100711-4.c: Likewise.
* gcc.target/i386/pr100711-5.c: Likewise.
|
|
|
|
gcc/jit/ChangeLog:
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_28): New ABI tag.
* docs/topics/expressions.rst: Document gcc_jit_context_new_alignof.
* jit-playback.cc (new_alignof): New method.
* jit-playback.h: New method.
* jit-recording.cc (recording::context::new_alignof): New
method.
(recording::memento_of_sizeof::replay_into,
recording::memento_of_typeinfo::replay_into,
recording::memento_of_sizeof::make_debug_string,
recording::memento_of_typeinfo::make_debug_string,
recording::memento_of_sizeof::write_reproducer,
recording::memento_of_typeinfo::write_reproducer): Rename.
* jit-recording.h (enum type_info_type): New enum.
(class memento_of_sizeof class memento_of_typeinfo): Rename.
* libgccjit.cc (gcc_jit_context_new_alignof): New function.
* libgccjit.h (gcc_jit_context_new_alignof): New function.
* libgccjit.map: New function.
gcc/testsuite/ChangeLog:
* jit.dg/all-non-failing-tests.h: New test.
* jit.dg/test-alignof.c: New test.
|
|
Add an explicit error message when C99's static is
used without a size expression in an array declarator.
gcc/c:
* c-parser.cc (c_parser_direct_declarator_inner): Add
error message.
gcc/testsuite:
* gcc.dg/c99-arraydecl-4.c: New test.
|
|
fixincludes/ChangeLog:
* fixincl.x: Regenerate.
* inclhack.def (apple_local_stdio_fn_deprecation): Also apply to
_stdio.h.
|
|
late-combine relies on df, which for -O0 is only initialised late
(pass_df_initialize_no_opt, after split1). Other df-based passes
cope with this by requiring optimize > 0, so this patch does the
same for late-combine.
gcc/
PR rtl-optimization/115677
* late-combine.cc (pass_late_combine::gate): New function.
|
|
An explicit check for address registers was not required so far since
during register allocation the processing of address constraints was
sufficient. However, address constraints themselves do not check for
REGNO_OK_FOR_{BASE,INDEX}_P. Thus, with the newly introduced
late-combine pass in r15-1579-g792f97b44ffc5e we generate new insns with
invalid address registers which aren't fixed up afterwards.
Fixed by explicitly checking for address registers in
s390_decompose_addrstyle_without_index such that those new insns are
rejected.
gcc/ChangeLog:
PR target/115634
* config/s390/s390.cc (s390_decompose_addrstyle_without_index):
Check for ADDR_REGS in s390_decompose_addrstyle_without_index.
|
|
The following avoids associating a reduction path as that might
get STMT_VINFO_REDUC_IDX out-of-sync with the SLP operand order.
This is a latent issue with SLP reductions but now easily exposed
as we're doing single-lane SLP reductions.
Once we have achieved SLP only we can move and update this meta-data.
PR tree-optimization/115669
* tree-vect-slp.cc (vect_build_slp_tree_2): Do not reassociate
chains that participate in a reduction.
* gcc.dg/vect/pr115669.c: New testcase.
|
|
For the GNU locale model, codecvt::do_out and codecvt::do_in incorrectly
return 'ok' when the destination range is empty. That happens because
detecting incomplete output is done in the loop body, and the loop is
never even entered if to == to_end.
By restructuring the loop condition so that we check the output range
separately, we can ensure that for a non-empty source range, we always
enter the loop at least once, and detect if the destination range is too
small.
The loops also seem easier to reason about if we return immediately on
any error, instead of checking the result twice on every iteration. We
can use an RAII type to restore the locale before returning, which also
simplifies all the other member functions.
libstdc++-v3/ChangeLog:
PR libstdc++/37475
* config/locale/gnu/codecvt_members.cc (Guard): New RAII type.
(do_out, do_in): Return partial if the destination is empty but
the source is not. Use Guard to restore locale on scope exit.
Return immediately on any conversion error.
(do_encoding, do_max_length, do_length): Use Guard.
* testsuite/22_locale/codecvt/in/char/37475.cc: New test.
* testsuite/22_locale/codecvt/in/wchar_t/37475.cc: New test.
* testsuite/22_locale/codecvt/out/char/37475.cc: New test.
* testsuite/22_locale/codecvt/out/wchar_t/37475.cc: New test.
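The loop restructuring can be sketched with a toy converter in which every source character needs two destination characters. This is an illustrative model only: `convert_old`, `convert_new` and the `result` enum are hypothetical and do not reproduce the codecvt_members.cc code or its use of the C library conversion functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class result { ok, partial };

// Old shape: the output check is part of the loop condition, so an empty
// destination skips the body where "partial" would be detected.
inline result convert_old(const std::string &src, std::size_t cap) {
  std::string dst(cap, '\0');
  std::size_t from = 0, to = 0;
  result ret = result::ok;
  while (from < src.size() && to < cap && ret == result::ok) {
    if (cap - to < 2) {
      ret = result::partial;
    } else {
      dst[to++] = src[from];
      dst[to++] = src[from];
      ++from;
    }
  }
  return ret;
}

// New shape: loop on the source only, check the destination inside the
// body, and return immediately when it is too small.
inline result convert_new(const std::string &src, std::size_t cap) {
  std::string dst(cap, '\0');
  std::size_t from = 0, to = 0;
  while (from < src.size()) {
    if (cap - to < 2)
      return result::partial;
    dst[to++] = src[from];
    dst[to++] = src[from];
    ++from;
  }
  return result::ok;
}
```

With a non-empty source and a destination capacity of zero, the old shape reports ok and the new shape reports partial, matching the behaviour change described above.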
|
|
The newly-added testcase overrides the default dg-do action set by
check_vect_support_and_set_flags (in libstdc++-dg/conformance.exp), so
it attempts to run the test even if runtime vector support is not
available.
Remove the explicit dg-do directive, so that the default is honored,
and the test is run if vector support is found, and only compiled
otherwise.
for libstdc++-v3/ChangeLog
PR libstdc++/115454
* testsuite/experimental/simd/pr115454_find_last_set.cc: Defer
to check_vect_support_and_set_flags's default dg-do action.
|
|
gcc/ChangeLog:
* gimple-range-cache.cc (update_list::update_list): Add m_bitmaps.
(update_list::~update_list): Initialize m_bitmaps.
* gimple-range-cache.h (ssa_lazy_cache): Add m_bitmaps.
* gimple-range.cc (enable_ranger): Remove global bitmap
initialization.
(disable_ranger): Remove global bitmap release.
|
|
Using std::chrono::abs is only valid if numeric_limits<rep>::is_signed
is true, so using it unconditionally made it ill-formed to format a
duration with an unsigned rep.
The duration formatter might as well negate the duration itself instead of
using chrono::abs, because it already needs to check for a negative
value.
libstdc++-v3/ChangeLog:
PR libstdc++/115668
* include/bits/chrono_io.h (formatter<duration<R,P>, C>::format):
Do not use chrono::abs.
* testsuite/20_util/duration/io.cc: Check formatting a duration
with unsigned rep.
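The idea of negating the duration directly, rather than calling chrono::abs, can be sketched as below. `magnitude` is a hypothetical helper written for this example, not the formatter's code; it compiles for both signed and unsigned representation types because it only relies on comparison and unary minus.

```cpp
#include <cassert>
#include <chrono>

// Returns the duration with its sign removed. For an unsigned rep the
// comparison is never true, so the value is returned unchanged.
template<typename Rep, typename Period>
std::chrono::duration<Rep, Period>
magnitude(std::chrono::duration<Rep, Period> d) {
  if (d < std::chrono::duration<Rep, Period>::zero())
    return -d;
  return d;
}
```

For example, `magnitude(std::chrono::seconds(-5))` yields five seconds, and `magnitude(std::chrono::duration<unsigned>(7))` is well-formed and yields seven.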
|
|
This adds debug assertions for std::vector<bool> element access.
libstdc++-v3/ChangeLog:
PR libstdc++/103191
* include/bits/stl_bvector.h (vector<bool>::operator[])
(vector<bool>::front, vector<bool>::back): Add debug assertions.
* testsuite/23_containers/vector/bool/element_access/constexpr.cc:
Remove dg-error that no longer triggers.
|
|
Some of our debug assertions expand to nothing unless
_GLIBCXX_ASSERTIONS is defined, which means they are not checked during
constant evaluation. By making them unconditionally expand to a
__glibcxx_assert expression they will be checked during constant
evaluation. This allows us to diagnose more instances of undefined
behaviour at compile-time, such as accessing a vector past-the-end.
libstdc++-v3/ChangeLog:
PR libstdc++/111250
* include/debug/assertions.h (__glibcxx_requires_non_empty_range)
(__glibcxx_requires_nonempty, __glibcxx_requires_subscript):
Define to __glibcxx_assert expressions or to debug mode
__glibcxx_check_xxx expressions.
* testsuite/23_containers/array/element_access/constexpr_c++17.cc:
Add checks for out-of-bounds accesses in constant expressions.
* testsuite/23_containers/vector/element_access/constexpr.cc:
Likewise.
|
|
This completes the switch from using System.Address_Operations to using only
System.Storage_Elements in the runtime library. The remaining uses were for
simple optimizations that can be done by the optimizer alone.
gcc/ada/
* libgnat/s-carsi8.adb: Remove clauses for System.Address_Operations
and use only operations of System.Storage_Elements for addresses.
* libgnat/s-casi16.adb: Likewise.
* libgnat/s-casi32.adb: Likewise.
* libgnat/s-casi64.adb: Likewise.
* libgnat/s-casi128.adb: Likewise.
* libgnat/s-carun8.adb: Likewise.
* libgnat/s-caun16.adb: Likewise.
* libgnat/s-caun32.adb: Likewise.
* libgnat/s-caun64.adb: Likewise.
* libgnat/s-caun128.adb: Likewise.
* libgnat/s-geveop.adb: Likewise.
|
|
gcc/ada/
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Report
interpretations of ambiguous parameterless function calls.
|
|
It is computed from the Etype of N_Target_Name nodes.
gcc/ada/
* sem_ch5.adb (Analyze_Target_Name): Call Analyze_Dimension on the
node once the Etype is set.
* sem_dim.adb (OK_For_Dimension): Set to True for N_Target_Name.
(Analyze_Dimension): Call Analyze_Dimension_Has_Etype for it.
|
|
This patch fixes a pair of array assignments in Mdll that were bound
to fail.
gcc/ada/
* mdll.adb (Build_Non_Reloc_DLL): Fix incorrect assignment
to array object.
(Ada_Build_Non_Reloc_DLL): Likewise.
|
|
The frontend rejects the use of user defined string literals
using interpolated strings.
gcc/ada/
* sem_res.adb (Has_Applicable_User_Defined_Literal): Add missing
support for interpolated strings.
|
|
wrappers
Implicit wrapper overridings generated for functions with
controlling result when deriving with null extension may
have field Overridden_Operation incorrectly set, when making
several such derivations in succession. This happens because
overridings were assumed to come from source, and entities
generated by Derive_Subprograms were also assumed to be
derived from source subprograms. Overridden_Operation could
be set to the entity generated by Derive_Subprograms for the
same type, resulting in a cycle between Overridden_Operation
and Alias fields, causing non-termination in GNATprove.
gcc/ada/
* sem_ch6.adb (Check_Overriding_Indicator): Remove Comes_From_Source filter.
(New_Overloaded_Entity): Move up special case of LSP_Subprogram,
and remove Comes_From_Source filter.
|