Age | Commit message (Collapse) | Author | Files | Lines |
|
The test installed by "x86: make VPTERNLOG* usable on less than 512-bit
operands with just AVX512F" won't succeed on 32-bit, for floating point
operations being done there (by default) without using SIMD insns.
gcc/testsuite/
* gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit.
|
|
Following two-operand bitwise operations, add another splitter to also
deal with not followed by broadcast all on its own, which can be
expressed as simple embedded broadcast instead once a broadcast operand
is actually permitted in the respective insn. While there also permit
a broadcast operand in the corresponding expander.
gcc/
PR target/100711
* config/i386/sse.md: New splitters to simplify
not;vec_duplicate as a singular vpternlog.
(one_cmpl<mode>2): Allow broadcast for operand 1.
(<mask_codefor>one_cmpl<mode>2<mask_name>): Likewise.
gcc/testsuite/
PR target/100711
* gcc.target/i386/pr100711-6.c: New test.
|
|
With respective two-operand bitwise operations now expressable by a
single VPTERNLOG, add splitters to also deal with ior and xor
counterparts of the original and-only case. Note that the splitters need
to be separate, as the placement of "not" differs in the final insns
(*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of
the result.
gcc/
PR target/100711
* config/i386/sse.md: New splitters to simplify
not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.
gcc/testsuite/
PR target/100711
* gcc.target/i386/pr100711-4.c: New test.
* gcc.target/i386/pr100711-5.c: New test.
|
|
The intended broadcast (with AVX512) can very well be done right from
memory.
gcc/
PR target/100711
* config/i386/sse.md: Permit non-immediate operand 1 in AVX2
form of splitter for PR target/100711.
|
|
The following adjusts the tree.def documentation about VEC_PERM_EXPR
which wasn't adjusted when the restrictions of permutes with constant
mask were relaxed.
PR middle-end/110541
* tree.def (VEC_PERM_EXPR): Adjust documentation to reflect
reality.
|
|
When it's the memory operand which is to be inverted, using VPANDN*
requires a further load instruction. The same can be achieved by a
single VPTERNLOG*. Add two new alternatives (for plain memory and
embedded broadcast), adjusting the predicate for the first operand
accordingly.
Two pre-existing testcases actually end up being affected (improved) by
the change, which is reflected in updated expectations there.
gcc/
PR target/93768
* config/i386/sse.md (*andnot<mode>3): Add new alternatives
for memory form operand 1.
gcc/testsuite/
PR target/93768
* gcc.target/i386/avx512f-andn-di-zmm-2.c: New test.
* gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expecations
towards generated code.
* gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit
code.
|
|
All combinations of and, ior, xor, and not involving two operands can be
expressed that way in a single insn.
gcc/
PR target/93768
* config/i386/i386.cc (ix86_rtx_costs): Further special-case
bitwise vector operations.
* config/i386/sse.md (*iornot<mode>3): New insn.
(*xnor<mode>3): Likewise.
(*<nlogic><mode>3): Likewise.
(andor): New code iterator.
(nlogic): New code attribute.
(ternlog_nlogic): Likewise.
gcc/testsuite/
PR target/93768
* gcc.target/i386/avx512-binop-not-1.h: New.
* gcc.target/i386/avx512-binop-not-2.h: New.
* gcc.target/i386/avx512f-orn-si-zmm-1.c: New test.
* gcc.target/i386/avx512f-orn-si-zmm-2.c: New test.
|
|
* tree-vect-stmts.cc (vect_mark_relevant): Fix typo.
|
|
These tests fail with -std=gnu++98/-D_GLIBCXX_DEBUG in the runtest
flags. They should require the c++11 effective target.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/forward_list/debug/iterator1_neg.cc:
Skip as UNSUPPORTED for C++98 mode.
* testsuite/23_containers/forward_list/debug/iterator3_neg.cc:
Likewise.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/110542
* include/bits/stl_uninitialized.h (__uninitialized_default_n):
Do not use std::fill_n during constant evaluation.
|
|
Similar to r14-2052-gdd2eb972a5b063, replace the try-block with RAII
types for deallocating storage and destroying elements.
libstdc++-v3/ChangeLog:
* include/bits/vector.tcc (_M_default_append): Replace try-block
with RAII types.
|
|
This is needed by Clang 15.
libstdc++-v3/ChangeLog:
* include/bits/iterator_concepts.h (projected): Add typename.
|
|
gcc/ChangeLog:
* config/riscv/vector.md: Add float16 attr at sew、vlmul and ratio.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/abi-10.c: Add float16 tuple type case.
* gcc.target/riscv/rvv/base/abi-11.c: Ditto.
* gcc.target/riscv/rvv/base/abi-12.c: Ditto.
* gcc.target/riscv/rvv/base/abi-15.c: Ditto.
* gcc.target/riscv/rvv/base/abi-8.c: Ditto.
* gcc.target/riscv/rvv/base/abi-9.c: Ditto.
* gcc.target/riscv/rvv/base/abi-17.c: New test.
* gcc.target/riscv/rvv/base/abi-18.c: New test.
|
|
This patch adds support for the float16 tuple type.
gcc/ChangeLog:
* config/riscv/genrvv-type-indexer.cc (valid_type): Enable FP16 tuple.
* config/riscv/riscv-modes.def (RVV_TUPLE_MODES): New macro.
(ADJUST_ALIGNMENT): Ditto.
(RVV_TUPLE_PARTIAL_MODES): Ditto.
(ADJUST_NUNITS): Ditto.
* config/riscv/riscv-vector-builtins-types.def (vfloat16mf4x2_t):
New types.
(vfloat16mf4x3_t): Ditto.
(vfloat16mf4x4_t): Ditto.
(vfloat16mf4x5_t): Ditto.
(vfloat16mf4x6_t): Ditto.
(vfloat16mf4x7_t): Ditto.
(vfloat16mf4x8_t): Ditto.
(vfloat16mf2x2_t): Ditto.
(vfloat16mf2x3_t): Ditto.
(vfloat16mf2x4_t): Ditto.
(vfloat16mf2x5_t): Ditto.
(vfloat16mf2x6_t): Ditto.
(vfloat16mf2x7_t): Ditto.
(vfloat16mf2x8_t): Ditto.
(vfloat16m1x2_t): Ditto.
(vfloat16m1x3_t): Ditto.
(vfloat16m1x4_t): Ditto.
(vfloat16m1x5_t): Ditto.
(vfloat16m1x6_t): Ditto.
(vfloat16m1x7_t): Ditto.
(vfloat16m1x8_t): Ditto.
(vfloat16m2x2_t): Ditto.
(vfloat16m2x3_t): Ditto.
(vfloat16m2x4_t): Ditto.
(vfloat16m4x2_t): Ditto.
* config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): New macro.
(vfloat16mf4x3_t): Ditto.
(vfloat16mf4x4_t): Ditto.
(vfloat16mf4x5_t): Ditto.
(vfloat16mf4x6_t): Ditto.
(vfloat16mf4x7_t): Ditto.
(vfloat16mf4x8_t): Ditto.
(vfloat16mf2x2_t): Ditto.
(vfloat16mf2x3_t): Ditto.
(vfloat16mf2x4_t): Ditto.
(vfloat16mf2x5_t): Ditto.
(vfloat16mf2x6_t): Ditto.
(vfloat16mf2x7_t): Ditto.
(vfloat16mf2x8_t): Ditto.
(vfloat16m1x2_t): Ditto.
(vfloat16m1x3_t): Ditto.
(vfloat16m1x4_t): Ditto.
(vfloat16m1x5_t): Ditto.
(vfloat16m1x6_t): Ditto.
(vfloat16m1x7_t): Ditto.
(vfloat16m1x8_t): Ditto.
(vfloat16m2x2_t): Ditto.
(vfloat16m2x3_t): Ditto.
(vfloat16m2x4_t): Ditto.
(vfloat16m4x2_t): Ditto.
* config/riscv/riscv-vector-switch.def (TUPLE_ENTRY): New.
* config/riscv/riscv.md: New.
* config/riscv/vector-iterators.md: New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/tuple-28.c: New test.
* gcc.target/riscv/rvv/base/tuple-29.c: New test.
* gcc.target/riscv/rvv/base/tuple-30.c: New test.
* gcc.target/riscv/rvv/base/tuple-31.c: New test.
* gcc.target/riscv/rvv/base/tuple-32.c: New test.
|
|
A mips16e2 related test fails after the ifcvt change. The mips16e2
addition also causes a test for unrelated module to fail.
This patch adjusts branch costs when running the two affected tests.
These tests should not require the -mbranch-cost option, and
this issue needs to be addressed.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips16e2-cmov.c: Adjust branch cost to
encourage if-conversion.
* gcc.target/mips/movcc-3.c: Same as above.
|
|
|
|
The problem here is we might produce some values out of the type's
min/max (and/or valid values, e.g. signed booleans). The fix is to
use an integer type which has the same precision and signedness
as the original type.
Note two_value_replacement in phiopt had the same issue in previous
versions; though I don't know if a problem will show up there.
OK? Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR tree-optimization/110487
* match.pd (a !=/== CST1 ? CST2 : CST3): Always
build a nonstandard integer and use that.
|
|
This fixes the first part of this bug where `a ? -1 : 0`
would cause a value of 1 into the signed boolean value.
It fixes the problem by casting to an integer type of
the same size/signedness before doing the negative and
then casting to the type of expression.
OK? Bootstrapped and tested on x86_64.
gcc/ChangeLog:
* match.pd (a?-1:0): Cast type an integer type
rather the type before the negative.
(a?0:-1): Likewise.
|
|
gcc/ChangeLog:
* config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue):
Change to use HARD_REG_BIT and its macros.
* config/xtensa/xtensa.md
(peephole2: regmove elimination during DFmode input reload):
Likewise.
|
|
The following makes sure to not make conditional undefs in PHI arguments
unconditional by folding cond ? arg1 : arg2.
PR tree-optimization/110491
* tree-ssa-phiopt.cc (match_simplify_replacement): Check
whether the PHI args are possibly undefined before folding
the COND_EXPR.
* gcc.dg/torture/pr110491.c: New testcase.
|
|
We extend the machine mode from 8 to 16 bits already. But there still
one placing missing from the streamer. It has one hard coded array
for the machine code like size 256.
In the lto pass, we memset the array by MAX_MACHINE_MODE count but the
value of the MAX_MACHINE_MODE will grow as more and more modes are
added. While the machine mode array in tree-streamer still leave 256 as is.
Then, when the MAX_MACHINE_MODE is greater than 256, the memset of
lto_output_init_mode_table will touch the memory out of range unexpected.
This patch would like to take the MAX_MACHINE_MODE as the size of the
array in streamer, to make sure there is no potential unexpected
memory access in future. Meanwhile, this patch also adjust some place
which has MAX_MACHINE_MODE <= 256 assumption.
Care is taken that for offload compilation, we interpret the stream-in
data in terms of the host 'MAX_MACHINE_MODE' ('file_data->mode_bits'),
which very likely is different from the offload device
'MAX_MACHINE_MODE'.
gcc/
* lto-streamer-in.cc (lto_input_mode_table): Stream in the mode
bits for machine mode table.
* lto-streamer-out.cc (lto_write_mode_table): Stream out the
HOST machine mode bits.
* lto-streamer.h (struct lto_file_decl_data): New fields mode_bits.
* tree-streamer.cc (streamer_mode_table): Take MAX_MACHINE_MODE
as the table size.
* tree-streamer.h (streamer_mode_table): Ditto.
(bp_pack_machine_mode): Take 1 << ceil_log2 (MAX_MACHINE_MODE)
as the packing limit.
(bp_unpack_machine_mode): Ditto with 'file_data->mode_bits'.
gcc/lto/
* lto-common.cc (lto_file_finalize) [!ACCEL_COMPILER]: Initialize
'file_data->mode_bits'.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
... instead of just 'unsigned char *mode_table'. Preparation for a forthcoming
change, where we need to capture an additional 'file_data' item, so it seems
easier to just capture that one proper.
gcc/
* lto-streamer.h (class lto_input_block): Capture
'lto_file_decl_data *file_data' instead of just
'unsigned char *mode_table'.
* ipa-devirt.cc (ipa_odr_read_section): Adjust.
* ipa-fnsummary.cc (inline_read_section): Likewise.
* ipa-icf.cc (sem_item_optimizer::read_section): Likewise.
* ipa-modref.cc (read_section): Likewise.
* ipa-prop.cc (ipa_prop_read_section, read_replacements_section):
Likewise.
* ipa-sra.cc (isra_read_summary_section): Likewise.
* lto-cgraph.cc (input_cgraph_opt_section): Likewise.
* lto-section-in.cc (lto_create_simple_input_block): Likewise.
* lto-streamer-in.cc (lto_read_body_or_constructor)
(lto_input_toplevel_asms): Likewise.
* tree-streamer.h (bp_unpack_machine_mode): Likewise.
gcc/lto/
* lto-common.cc (lto_read_decls): Adjust.
|
|
The following removes gimple_uses_undefined_value_p and instead
uses the conservative mark_ssa_maybe_undefs in PHI-OPT, the last
user of the other API.
* tree-ssa-phiopt.cc (pass_phiopt::execute): Mark SSA undefs.
(empty_bb_or_one_feeding_into_p): Check for them.
* tree-ssa.h (gimple_uses_undefined_value_p): Remove.
* tree-ssa.cc (gimple_uses_undefined_value_p): Likewise.
|
|
The following removes an unnecessary check.
* tree-vect-loop.cc (vect_analyze_loop_costing): Remove
check guarding scalar_niter underflow.
|
|
This is a new testcase for the fixed bug.
PR tree-optimization/110376
* gcc.dg/torture/pr110376.c: New testcase.
|
|
slp_done_for_suggested_uf is used directly in vect_analyze_loop_2
without initialization, which is undefined behavior. Initialize it to false
according to the discussion.
gcc/ChangeLog:
PR tree-optimization/110531
* tree-vect-loop.cc (vect_analyze_loop_1): initialize
slp_done_for_suggested_uf to false.
|
|
The following replaces the simplistic gimple_uses_undefined_value_p
with the conservative mark_ssa_maybe_undefs approach as already
used by LIM and IVOPTs. This is to avoid exposing an unconditional
uninitialized read on a path from entry by if-combine.
PR tree-optimization/110228
* tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute):
Mark SSA may-undefs.
(bb_no_side_effects_p): Check stmt uses for undefs.
* gcc.dg/torture/pr110228.c: New testcase.
* gcc.dg/uninit-pr101912.c: Un-XFAIL.
|
|
When we compute liveness and relevantness we have to make sure to
handle live but not relevant stmts in a way we can later vectorize
them. When the stmt uses only operands that do not need vectorization
we can just leave such stmts in place - but not in the case they
are recognized as patterns. Since we don't have a way to cancel
pattern recognition we have to force mark such stmts as relevant.
PR tree-optimization/110436
* tree-vect-stmts.cc (vect_mark_relevant): Expand dumping,
force live but not relevant pattern stmts relevant.
* gcc.dg/pr110436.c: New testcase.
|
|
Enable ENQCMD and UINTR for march=sierraforest according to Intel ISE
https://cdrdv2.intel.com/v1/dl/getContent/671368
gcc/ChangeLog
* config/i386/i386.h: Add PTA_ENQCMD and PTA_UINTR to PTA_SIERRAFOREST.
* doc/invoke.texi: Update new isa to march=sierraforest and grandridge.
|
|
This relaxes the condition under which Expand_Assign_Array leaves the
assignment to or from an array slice untouched. The main prerequisite
for the code generator is that everything be aligned on byte boundaries
and Is_Possibly_Unaligned_Slice is too strong a predicate for this, so
it is replaced by the combination of Possible_Bit_Aligned_Component and
Is_Bit_Packed_Array, modulo a change to Possible_Bit_Aligned_Component
to take into account the specific case of slices.
gcc/ada/
* exp_ch5.adb (Expand_Assign_Array): Adjust comment above the
calls to Possible_Bit_Aligned_Component on the LHS and RHS. Do not
call Is_Possibly_Unaligned_Slice in the slice case.
* exp_util.ads (Component_May_Be_Bit_Aligned): Add For_Slice
boolean parameter.
(Possible_Bit_Aligned_Component): Likewise.
* exp_util.adb (Component_May_Be_Bit_Aligned): Do not return False
for the slice of a small record or bit-packed array component.
(Possible_Bit_Aligned_Component): Pass For_Slice in recursive
calls, except in the slice case where True is passed, as well as
in call to Component_May_Be_Bit_Aligned.
|
|
The procedure is not stable under repeated invocation. Now it may be called
twice on the same node, for example during the expansion of the renaming of
the predefined equality operator after the unchecked union type is frozen.
gcc/ada/
* exp_ch4.ads (Expand_Unchecked_Union_Equality): Only take a
single parameter.
* exp_ch4.adb (Expand_Unchecked_Union_Equality): Add guard against
repeated invocation on the same node.
* exp_ch6.adb (Expand_Call): Only pass a single actual parameter
in the call to Expand_Unchecked_Union_Equality.
|
|
gcc/ada/
* doc/gnat_rm/standard_and_implementation_defined_restrictions.rst:
add No_Use_Of_Attribute & No_Use_Of_Pragma restrictions.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
|
|
The query Inherited_Subprograms was returning a list containing
some subprograms whose overridding was also in the list, when
interfaces was present. This was an issue for GNATprove. Now propose
a mode for this function to filter out overridden primitives.
gcc/ada/
* sem_disp.adb (Inherited_Subprograms): Add parameter to filter
out results.
* sem_disp.ads: Likewise.
|
|
When trying to associate (v + INT_MAX) + INT_MAX we are using
the TREE_OVERFLOW bit to check for correctness. That isn't
working for VECTOR_CSTs and it can't in general when one considers
VL vectors. It looks like it should work for COMPLEX_CSTs but
I didn't try to single out _Complex int in this change.
The following makes sure that for vectors we use the fallback of
using unsigned arithmetic when associating the above to
v + (INT_MAX + INT_MAX).
PR middle-end/110495
* tree.h (TREE_OVERFLOW): Do not mention VECTOR_CSTs
since we do not set TREE_OVERFLOW on those since the
introduction of VL vectors.
* match.pd (x +- CST +- CST): For VECTOR_CST do not look
at TREE_OVERFLOW to determine validity of association.
* gcc.dg/tree-ssa/addadd-2.c: Amend.
* gcc.dg/tree-ssa/forwprop-27.c: Adjust.
|
|
The following removes late deciding to elide vectorized epilogues to
the analysis phase and also avoids altering the epilogues niter.
The costing part from vect_determine_partial_vectors_and_peeling is
moved to vect_analyze_loop_costing where we use the main loop
analysis to constrain the epilogue scalar iterations.
I have not tried to integrate this with vect_known_niters_smaller_than_vf.
It seems the for_epilogue_p parameter in
vect_determine_partial_vectors_and_peeling is largely useless and
we could compute that in the function itself.
PR tree-optimization/110310
* tree-vect-loop.cc (vect_determine_partial_vectors_and_peeling):
Move costing part ...
(vect_analyze_loop_costing): ... here. Integrate better
estimate for epilogues from ...
(vect_analyze_loop_2): Call vect_determine_partial_vectors_and_peeling
with actual epilogue status.
* tree-vect-loop-manip.cc (vect_do_peeling): ... here and
avoid cancelling epilogue vectorization.
(vect_update_epilogue_niters): Remove. No longer update
epilogue LOOP_VINFO_NITERS.
* gcc.target/i386/pr110310.c: New testcase.
* gcc.dg/vect/slp-perm-12.c: Disable epilogue vectorization.
|
|
This reverts commit 3d95a524d4746ceb3065f92f30a5679afb88d16a.
gcc/ChangeLog:
* config/riscv/vector.md: Revert changes.
|
|
Hi, Richi and Richard.
Base one the review comments from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html
I change len_mask_gather_load/len_mask_scatter_store order into:
{len,bias,mask}
We adjust adding len and mask using using add_len_and_mask_args
which is same as partial_load/parial_store.
Now, the codes become more reasonable and easier maintain.
This patch is adding LEN_MASK_{GATHER_LOAD,SCATTER_STORE} to allow targets
handle flow control by mask and loop control by length on gather/scatter memory
operations. Consider this following case:
void
f (uint8_t *restrict a,
uint8_t *restrict b, int n,
int base, int step,
int *restrict cond)
{
for (int i = 0; i < n; ++i)
{
if (cond[i])
a[i * step + base] = b[i * step + base];
}
}
We hope RVV can vectorize such case into following IR:
loop_len = SELECT_VL
control_mask = comparison
v = LEN_MASK_GATHER_LOAD (.., loop_len, bias, control_mask)
LEN_SCATTER_STORE (... v, ..., loop_len, bias, control_mask)
This patch doesn't apply such patterns into vectorizer, just add patterns
and update the documents.
Will send patch which apply such patterns into vectorizer soon after this
patch is approved.
Ok for trunk?
gcc/ChangeLog:
* doc/md.texi: Add len_mask_gather_load/len_mask_scatter_store.
* internal-fn.cc (expand_scatter_store_optab_fn): Ditto.
(expand_gather_load_optab_fn): Ditto.
(internal_load_fn_p): Ditto.
(internal_store_fn_p): Ditto.
(internal_gather_scatter_fn_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
* internal-fn.def (LEN_MASK_GATHER_LOAD): Ditto.
(LEN_MASK_SCATTER_STORE): Ditto.
* optabs.def (OPTAB_CD): Ditto.
|
|
I recently noticed that current VSETVL pass has a unnecessary restriction on local
AVL propgation.
Consider this following case:
+ insn 1: vsetvli a5,a3,e8,mf4,ta,mu
+ insn 2: vsetvli zero,a5,e32,m1,ta,ma
+ ...
+ vle32.v v1,0(a1)
+ vsetvli a2,zero,e32,m1,ta,ma
+ vadd.vv v1,v1,v1
+ vsetvli zero,a5,e32,m1,ta,ma
+ vse32.v v1,0(a0)
+ ...
+ insn 3: sub a3,a3,a5
+ ...
We failed to elide insn 2 (vsetvl insn) since insn 3 is modifying "a3" AVL.
Actually, we don't really care about insn 3 since we should only check and make sure
there is no insn between insn 1 and insn 2 that modifies "a3" AVL. Then, we can propgate
AVL "a3" from insn 1 to insn 2. Finally, insn 2 is eliminated.
After this patch:
+ insn 1: vsetvli a5,a3,e8,mf4,ta,ma
+ ...
+ vle32.v v1,0(a1)
+ vsetvli a2,zero,e32,m1,ta,ma
+ vadd.vv v1,v1,v1
+ vsetvli zero,a5,e32,m1,ta,ma
+ vse32.v v1,0(a0)
+ ...
+ insn 3: sub a3,a3,a5
+ ...
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc
(vector_insn_info::parse_insn): Add early break.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/avl_prop-1.c: New test.
|
|
This is just expected to be a change in representation.
No code is expected to change; no new tests are added.
* config/cris/cris.md (CRIS_UNSPEC_SWAP_BITS): Remove.
("cris_swap_bits", "ctzsi2"): Use bitreverse instead.
|
|
This seems to have just been overlooked when introducing
BITREVERSE. Note that the function name mem_loc_descriptor
is a misnomer; it'd better be called rtx_loc_descriptor or
any_loc_descriptor, because "anything" RTX can end up here.
To wit, when introducing new RTL that ends up as code or for
other reasons appear in debug expressions, don't forget to
update this function. This was observed by building
libstdc+++ for cris-elf with a patch replacing the
CRIS_UNSPEC_SWAP_BITS by bitreverse, as hitting the
internal-error-generating default case.
Looking at the BSWAP, POPCOUNT and ROTATE cases, BITREVERSE
can probably be fully expressed as DWARF code if need be,
but let's start with not throwing an internal error.
gcc:
* dwarf2out.cc (mem_loc_descriptor): Handle BITREVERSE.
|
|
|
|
The <syncstream> header is only supported for the cxx11 ABI. The
declarations of basic_syncbuf, basic_osyncstream, syncbuf and
osyncstream were already correctly guarded by a check for
_GLIBCXX_USE_CXX11_ABI, but the wsyncbuf and wosyncstream declarations
were not.
libstdc++-v3/ChangeLog:
* testsuite/27_io/headers/iosfwd/synopsis.cc: Make wsyncbuf and
wosyncstream depend on _GLIBCXX_USE_CXX11_ABI.
|
|
This reapplies r10-1314-g32bab8b6ad0a90 which was lost in the recent
PSTL rebase from upstream.
* include/pstl/pstl_config.h (_PSTL_PRAGMA_SIMD_SCAN,
_PSTL_PRAGMA_SIMD_INCLUSIVE_SCAN, _PSTL_PRAGMA_SIMD_EXCLUSIVE_SCAN):
Define to OpenMP 5.0 pragmas even for GCC 10.0+.
(_PSTL_UDS_PRESENT): Define to 1 for GCC 10.0+.
|
|
These calls should be qualified to prevent ADL, which can cause errors
for incomplete types that are associated classes.
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h (_Destroy): Qualify call.
* include/bits/stl_construct.h (_Destroy, _Destroy_n): Likewise.
* testsuite/23_containers/vector/cons/destroy-adl.cc: New test.
|
|
This series adds basic support for the vector crypto extensions:
* Zvbb
* Zvbc
* Zvkg
* Zvkned
* Zvkhn[a,b]
* Zvksed
* Zvksh
* Zvkn
* Zvknc
* Zvkng
* Zvks
* Zvksc
* Zvksg
* Zvkt
This patch is based on the v20230620 version of the Vector Cryptography
specification. The specification is frozen and can be found here:
https://github.com/riscv/riscv-crypto/releases/tag/v20230620
Binutils support is merged as 9fdc1b157b6e72f7dd98851a240c5fdb386a558e.
All extensions come with (passing) tests for the feature test macros.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add support for zvbb,
zvbc, zvkg, zvkned, zvknha, zvknhb, zvksed, zvksh, zvkn,
zvknc, zvkng, zvks, zvksc, zvksg, zvkt and the implied subsets.
* config/riscv/arch-canonicalize: Add canonicalization info for
zvkn, zvknc, zvkng, zvks, zvksc, zvksg.
* config/riscv/riscv-opts.h (MASK_ZVBB): New macro.
(MASK_ZVBC): Likewise.
(TARGET_ZVBB): Likewise.
(TARGET_ZVBC): Likewise.
(MASK_ZVKG): Likewise.
(MASK_ZVKNED): Likewise.
(MASK_ZVKNHA): Likewise.
(MASK_ZVKNHB): Likewise.
(MASK_ZVKSED): Likewise.
(MASK_ZVKSH): Likewise.
(MASK_ZVKN): Likewise.
(MASK_ZVKNC): Likewise.
(MASK_ZVKNG): Likewise.
(MASK_ZVKS): Likewise.
(MASK_ZVKSC): Likewise.
(MASK_ZVKSG): Likewise.
(MASK_ZVKT): Likewise.
(TARGET_ZVKG): Likewise.
(TARGET_ZVKNED): Likewise.
(TARGET_ZVKNHA): Likewise.
(TARGET_ZVKNHB): Likewise.
(TARGET_ZVKSED): Likewise.
(TARGET_ZVKSH): Likewise.
(TARGET_ZVKN): Likewise.
(TARGET_ZVKNC): Likewise.
(TARGET_ZVKNG): Likewise.
(TARGET_ZVKS): Likewise.
(TARGET_ZVKSC): Likewise.
(TARGET_ZVKSG): Likewise.
(TARGET_ZVKT): Likewise.
* config/riscv/riscv.opt: Introduction of riscv_zv{b,k}_subext.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zvbb.c: New test.
* gcc.target/riscv/zvbc.c: New test.
* gcc.target/riscv/zvkg.c: New test.
* gcc.target/riscv/zvkn-1.c: New test.
* gcc.target/riscv/zvkn.c: New test.
* gcc.target/riscv/zvknc-1.c: New test.
* gcc.target/riscv/zvknc-2.c: New test.
* gcc.target/riscv/zvknc.c: New test.
* gcc.target/riscv/zvkned.c: New test.
* gcc.target/riscv/zvkng-1.c: New test.
* gcc.target/riscv/zvkng-2.c: New test.
* gcc.target/riscv/zvkng.c: New test.
* gcc.target/riscv/zvknha.c: New test.
* gcc.target/riscv/zvknhb.c: New test.
* gcc.target/riscv/zvks-1.c: New test.
* gcc.target/riscv/zvks.c: New test.
* gcc.target/riscv/zvksc-1.c: New test.
* gcc.target/riscv/zvksc-2.c: New test.
* gcc.target/riscv/zvksc.c: New test.
* gcc.target/riscv/zvksed.c: New test.
* gcc.target/riscv/zvksg-1.c: New test.
* gcc.target/riscv/zvksg-2.c: New test.
* gcc.target/riscv/zvksg.c: New test.
* gcc.target/riscv/zvksh.c: New test.
* gcc.target/riscv/zvkt.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
The backtrace in the bug report suggest there is a running out of
stack during GC collection, because of a long chain of eh_landing_pad_d.
This might fix that by adding chain_next onto eh_landing_pad_d's GTY marker.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR middle-end/110510
* except.h (struct eh_landing_pad_d): Add chain_next GTY.
|
|
The addition of the multiply_defined suppress flag has been handled for some
considerable time now in the Darwin specs; remove it from the testsuite libs.
Avoid duplicates in the specs.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.h: Avoid duplicate multiply_defined specs on
earlier Darwin versions with shared libgcc.
libstdc++-v3/ChangeLog:
* testsuite/lib/libstdc++.exp: Remove additional flag handled
by Darwin specs.
gcc/testsuite/ChangeLog:
* lib/g++.exp: Remove additional flag handled by Darwin specs.
* lib/obj-c++.exp: Likewise.
|
|
Also change internal variable from int to bool.
gcc/ChangeLog:
* tree.h (tree_int_cst_equal): Change return type from int to bool.
(operand_equal_for_phi_arg_p): Ditto.
(tree_map_base_marked_p): Ditto.
* tree.cc (contains_placeholder_p): Update function body
for bool return type.
(type_cache_hasher::equal): Ditto.
(tree_map_base_hash): Change return type
from int to void and adjust function body accordingly.
(tree_int_cst_equal): Ditto.
(operand_equal_for_phi_arg_p): Ditto.
(get_narrower): Change "first" variable to bool.
(cl_option_hasher::equal): Update function body for bool return type.
* ggc.h (ggc_set_mark): Change return type from int to bool.
(ggc_marked_p): Ditto.
* ggc-page.cc (gt_ggc_mx): Change return type
from int to void and adjust function body accordingly.
(ggc_set_mark): Ditto.
|
|
Hi, Richard. I fix the order as you suggeted.
Before this patch, the order is {len,mask,bias}.
Now, after this patch, the order becomes {len,bias,mask}.
Since you said we should not need 'internal_fn_bias_index', the bias index should always be the len index + 1.
I notice LEN_STORE order is {len,vector,bias}, to make them consistent, I reorder into LEN_STORE {len,bias,vector}.
Just like MASK_STORE {mask,vector}.
Ok for trunk ?
gcc/ChangeLog:
* config/riscv/autovec.md: Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
* config/riscv/riscv-v.cc (expand_load_store): Ditto.
* doc/md.texi: Ditto.
* gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto.
* internal-fn.cc (len_maskload_direct): Ditto.
(len_maskstore_direct): Ditto.
(add_len_and_mask_args): New function.
(expand_partial_load_optab_fn): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
(expand_partial_store_optab_fn): Ditto.
(internal_fn_len_index): New function.
(internal_fn_mask_index): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
(internal_fn_stored_value_index): Ditto.
(internal_len_load_store_bias): Ditto.
* internal-fn.h (internal_fn_len_index): New function.
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
|
|
The problem is that the predefined equality operator for unchecked union
types is implemented out of line by invoking a function that takes more
parameters than the two operands, which means that the renaming is not
seen as type conforming with this function and, therefore, is rejected.
The way out is to implement these additional parameters as "extra" formal
parameters, since this kind of parameters is not taken into account for
semantic checks. The change also factors out the duplicated generation
of actuals for these additional parameters into a single procedure.
gcc/ada/
* exp_ch3.ads (Build_Variant_Record_Equality): Add Spec_Id as second
parameter.
* exp_ch3.adb (Build_Variant_Record_Equality): For unchecked union
types, build the additional parameters as extra formal parameters.
(Expand_Freeze_Record_Type.Build_Variant_Record_Equality): Pass
Empty as Spec_Id in call to Build_Variant_Record_Equality.
* exp_ch4.ads (Expand_Unchecked_Union_Equality): New procedure.
* exp_ch4.adb (Expand_Composite_Equality): In the presence of a
function implementing composite equality, do not special case the
unchecked union types, and only convert the operands if the base
types are not the same like in Build_Equality_Call.
(Build_Equality_Call): Do not special case the unchecked union types
and relocate the operands only once.
(Expand_N_Op_Eq): Do not special case the unchecked union types.
(Expand_Unchecked_Union_Equality): New procedure implementing the
specific expansion of calls to the predefined equality function.
* exp_ch6.adb (Is_Unchecked_Union_Equality): New predicate.
(Expand_Call): Call Is_Unchecked_Union_Equality to determine whether
to call Expand_Unchecked_Union_Equality or Expand_Call_Helper.
* exp_ch8.adb (Build_Body_For_Renaming): Set Has_Delayed_Freeze flag
earlier on Id and pass Id in call to Build_Variant_Record_Equality.
|