AgeCommit message (Collapse)AuthorFilesLines
2023-07-05x86: suppress avx512f-copysign.c testcase for 32-bitJan Beulich1-1/+1
The test installed by "x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F" won't succeed on 32-bit, for floating point operations being done there (by default) without using SIMD insns. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit.
2023-07-05x86: yet more PR target/100711-like splittingJan Beulich2-2/+33
Following two-operand bitwise operations, add another splitter to also deal with not followed by broadcast all on its own, which can be expressed as simple embedded broadcast instead once a broadcast operand is actually permitted in the respective insn. While there also permit a broadcast operand in the corresponding expander. gcc/ PR target/100711 * config/i386/sse.md: New splitters to simplify not;vec_duplicate as a singular vpternlog. (one_cmpl<mode>2): Allow broadcast for operand 1. (<mask_codefor>one_cmpl<mode>2<mask_name>): Likewise. gcc/testsuite/ PR target/100711 * gcc.target/i386/pr100711-6.c: New test.
2023-07-05x86: further PR target/100711-like splittingJan Beulich3-0/+112
With respective two-operand bitwise operations now expressible by a single VPTERNLOG, add splitters to also deal with ior and xor counterparts of the original and-only case. Note that the splitters need to be separate, as the placement of "not" differs in the final insns (*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of the result. gcc/ PR target/100711 * config/i386/sse.md: New splitters to simplify not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}. gcc/testsuite/ PR target/100711 * gcc.target/i386/pr100711-4.c: New test. * gcc.target/i386/pr100711-5.c: New test.
2023-07-05x86: allow memory operand for AVX2 splitter for PR target/100711Jan Beulich1-1/+1
The intended broadcast (with AVX512) can very well be done right from memory. gcc/ PR target/100711 * config/i386/sse.md: Permit non-immediate operand 1 in AVX2 form of splitter for PR target/100711.
2023-07-05middle-end/110541 - VEC_PERM_EXPR documentation is offRichard Biener1-6/+11
The following adjusts the tree.def documentation about VEC_PERM_EXPR which wasn't adjusted when the restrictions of permutes with constant mask were relaxed. PR middle-end/110541 * tree.def (VEC_PERM_EXPR): Adjust documentation to reflect reality.
2023-07-05x86: use VPTERNLOG also for certain andnot formsJan Beulich4-11/+47
When it's the memory operand which is to be inverted, using VPANDN* requires a further load instruction. The same can be achieved by a single VPTERNLOG*. Add two new alternatives (for plain memory and embedded broadcast), adjusting the predicate for the first operand accordingly. Two pre-existing testcases actually end up being affected (improved) by the change, which is reflected in updated expectations there. gcc/ PR target/93768 * config/i386/sse.md (*andnot<mode>3): Add new alternatives for memory form operand 1. gcc/testsuite/ PR target/93768 * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test. * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expectations towards generated code. * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit code.
2023-07-05x86: use VPTERNLOG for further bitwise two-vector operationsJan Beulich6-4/+198
All combinations of and, ior, xor, and not involving two operands can be expressed that way in a single insn. gcc/ PR target/93768 * config/i386/i386.cc (ix86_rtx_costs): Further special-case bitwise vector operations. * config/i386/sse.md (*iornot<mode>3): New insn. (*xnor<mode>3): Likewise. (*<nlogic><mode>3): Likewise. (andor): New code iterator. (nlogic): New code attribute. (ternlog_nlogic): Likewise. gcc/testsuite/ PR target/93768 * gcc.target/i386/avx512-binop-not-1.h: New. * gcc.target/i386/avx512-binop-not-2.h: New. * gcc.target/i386/avx512f-orn-si-zmm-1.c: New test. * gcc.target/i386/avx512f-orn-si-zmm-2.c: New test.
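Note on the claim above: VPTERNLOG takes an 8-bit immediate that acts as a truth table over three input bits, so any two-operand bitwise function is just a table that ignores the third input. The following is a minimal sketch that emulates this on one 32-bit lane. The names `ternlog`, `TT_A`, `IMM_IORNOT` and so on are hypothetical and are not GCC or intrinsic identifiers, and which operand the `*iornot` pattern inverts is an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise emulation of VPTERNLOG on one 32-bit lane: at every bit position
// the three input bits (a, b, c) form an index 0..7 into the immediate.
inline std::uint32_t ternlog(std::uint32_t a, std::uint32_t b, std::uint32_t c,
                             std::uint8_t imm)
{
    std::uint32_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        unsigned idx = (((a >> bit) & 1u) << 2) | (((b >> bit) & 1u) << 1)
                       | ((c >> bit) & 1u);
        result |= static_cast<std::uint32_t>((imm >> idx) & 1u) << bit;
    }
    return result;
}

// Truth-table columns of the three inputs. Combining them with ordinary
// operators yields the immediate for the same expression.
constexpr std::uint8_t TT_A = 0xF0, TT_B = 0xCC, TT_C = 0xAA;

constexpr std::uint8_t IMM_AND    = TT_A & TT_B;                                // 0xC0
constexpr std::uint8_t IMM_IORNOT = static_cast<std::uint8_t>(~TT_A | TT_B);   // 0xCF
constexpr std::uint8_t IMM_XNOR   = static_cast<std::uint8_t>(~(TT_A ^ TT_B)); // 0xC3
```

Because none of these immediates depends on `TT_C`, the third operand can be anything, which is why one instruction covers every and/ior/xor/not combination of two inputs.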
2023-07-05Fix typo in vectorizer debug messageRichard Biener1-1/+1
* tree-vect-stmts.cc (vect_mark_relevant): Fix typo.
2023-07-05libstdc++: Disable std::forward_list tests for C++98 modeJonathan Wakely2-2/+2
These tests fail with -std=gnu++98/-D_GLIBCXX_DEBUG in the runtest flags. They should require the c++11 effective target. libstdc++-v3/ChangeLog: * testsuite/23_containers/forward_list/debug/iterator1_neg.cc: Skip as UNSUPPORTED for C++98 mode. * testsuite/23_containers/forward_list/debug/iterator3_neg.cc: Likewise.
2023-07-05libstdc++: Fix std::__uninitialized_default_n for constant evaluation [PR110542]Jonathan Wakely1-0/+6
libstdc++-v3/ChangeLog: PR libstdc++/110542 * include/bits/stl_uninitialized.h (__uninitialized_default_n): Do not use std::fill_n during constant evaluation.
2023-07-05libstdc++: Use RAII in std::vector::_M_default_appendJonathan Wakely1-32/+59
Similar to r14-2052-gdd2eb972a5b063, replace the try-block with RAII types for deallocating storage and destroying elements. libstdc++-v3/ChangeLog: * include/bits/vector.tcc (_M_default_append): Replace try-block with RAII types.
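Note on the technique: replacing a try-block with an RAII type means cleanup runs from a destructor unless ownership is explicitly released. The sketch below shows only the general idea and is not the libstdc++ implementation. `BlockGuard`, `grow` and `live_blocks` are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

static int live_blocks = 0;  // hypothetical bookkeeping, only for the demo

// Owns freshly allocated storage until release() hands it to the caller.
struct BlockGuard {
    void* ptr;
    explicit BlockGuard(std::size_t bytes) : ptr(::operator new(bytes)) { ++live_blocks; }
    ~BlockGuard() { if (ptr) { ::operator delete(ptr); --live_blocks; } }
    void* release() { void* p = ptr; ptr = nullptr; return p; }
    BlockGuard(const BlockGuard&) = delete;
    BlockGuard& operator=(const BlockGuard&) = delete;
};

// Allocates storage and then "constructs elements". If construction throws,
// the guard's destructor frees the storage. No try/catch is needed here.
inline void* grow(std::size_t bytes, bool construct_fails)
{
    assert(bytes > 0);
    BlockGuard guard(bytes);
    if (construct_fails)
        throw std::runtime_error("element construction failed");
    return guard.release();  // success: ownership passes to the caller
}
```

The success path and the failure path share one exit mechanism, so adding a second guarded resource (for example, one for already constructed elements) does not require nested try-blocks.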
2023-07-05libstdc++: Add redundant 'typename' to std::projectedJonathan Wakely1-1/+1
This is needed by Clang 15. libstdc++-v3/ChangeLog: * include/bits/iterator_concepts.h (projected): Add typename.
2023-07-05RISC-V:Add float16 tuple type abiyulong9-17/+630
gcc/ChangeLog: * config/riscv/vector.md: Add float16 attr at sew, vlmul and ratio. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-10.c: Add float16 tuple type case. * gcc.target/riscv/rvv/base/abi-11.c: Ditto. * gcc.target/riscv/rvv/base/abi-12.c: Ditto. * gcc.target/riscv/rvv/base/abi-15.c: Ditto. * gcc.target/riscv/rvv/base/abi-8.c: Ditto. * gcc.target/riscv/rvv/base/abi-9.c: Ditto. * gcc.target/riscv/rvv/base/abi-17.c: New test. * gcc.target/riscv/rvv/base/abi-18.c: New test.
2023-07-05RISC-V:Add float16 tuple type supportyulong12-3/+366
This patch adds support for the float16 tuple type. gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (valid_type): Enable FP16 tuple. * config/riscv/riscv-modes.def (RVV_TUPLE_MODES): New macro. (ADJUST_ALIGNMENT): Ditto. (RVV_TUPLE_PARTIAL_MODES): Ditto. (ADJUST_NUNITS): Ditto. * config/riscv/riscv-vector-builtins-types.def (vfloat16mf4x2_t): New types. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Ditto. (vfloat16m2x4_t): Ditto. (vfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): New macro. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Ditto. (vfloat16m2x4_t): Ditto. (vfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-switch.def (TUPLE_ENTRY): New. * config/riscv/riscv.md: New. * config/riscv/vector-iterators.md: New. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/tuple-28.c: New test. * gcc.target/riscv/rvv/base/tuple-29.c: New test. * gcc.target/riscv/rvv/base/tuple-30.c: New test. 
* gcc.target/riscv/rvv/base/tuple-31.c: New test. * gcc.target/riscv/rvv/base/tuple-32.c: New test.
2023-07-05MIPS: Adjust mips16e2 related tests for ifcvt costing changesJie Mei2-2/+2
A mips16e2 related test fails after the ifcvt change. The mips16e2 addition also causes a test for unrelated module to fail. This patch adjusts branch costs when running the two affected tests. These tests should not require the -mbranch-cost option, and this issue needs to be addressed. gcc/testsuite/ChangeLog: * gcc.target/mips/mips16e2-cmov.c: Adjust branch cost to encourage if-conversion. * gcc.target/mips/movcc-3.c: Same as above.
2023-07-05Daily bump.GCC Administrator5-1/+239
2023-07-04PR 110487: `(a !=/== CST1 ? CST2 : CST3)` pattern for type safetyAndrew Pinski1-16/+8
The problem here is we might produce some values out of the type's min/max (and/or valid values, e.g. signed booleans). The fix is to use an integer type which has the same precision and signedness as the original type. Note two_value_replacement in phiopt had the same issue in previous versions; though I don't know if a problem will show up there. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: PR tree-optimization/110487 * match.pd (a !=/== CST1 ? CST2 : CST3): Always build a nonstandard integer and use that.
2023-07-04Fix PR 110487: invalid signed boolean valueAndrew Pinski1-2/+20
This fixes the first part of this bug, where `a ? -1 : 0` would put a value of 1 into the signed boolean. It fixes the problem by casting to an integer type of the same size/signedness before doing the negation and then casting to the type of the expression. OK? Bootstrapped and tested on x86_64. gcc/ChangeLog: * match.pd (a?-1:0): Cast to an integer type rather than the type before the negation. (a?0:-1): Likewise.
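Note on the idea: a 1-bit signed type can only hold 0 and -1, so the intermediate value 1 has to live in a wider integer type before it is negated. The sketch below models that with a 1-bit signed bit-field. `SignedBool` and `select_all_ones` are hypothetical stand-ins and do not correspond to anything in match.pd.

```cpp
#include <cassert>

// A 1-bit signed field can represent only 0 and -1. It stands in here for a
// signed boolean type.
struct SignedBool { signed int v : 1; };

// Computes a ? -1 : 0 by widening first, negating in the wide type, and only
// then narrowing to the 1-bit type.
inline SignedBool select_all_ones(bool a)
{
    int wide = static_cast<int>(a);  // 0 or 1, both valid in int
    int negated = -wide;             // 0 or -1
    SignedBool r;
    r.v = negated;                   // both values fit the 1-bit signed field
    return r;
}
```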
2023-07-04xtensa: Use HARD_REG_SET instead of bare integerTakayuki 'January June' Suwa2-11/+11
gcc/ChangeLog: * config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue): Change to use HARD_REG_BIT and its macros. * config/xtensa/xtensa.md (peephole2: regmove elimination during DFmode input reload): Likewise.
2023-07-04tree-optimization/110491 - PHI-OPT and undefsRichard Biener2-0/+36
The following makes sure to not make conditional undefs in PHI arguments unconditional by folding cond ? arg1 : arg2. PR tree-optimization/110491 * tree-ssa-phiopt.cc (match_simplify_replacement): Check whether the PHI args are possibly undefined before folding the COND_EXPR. * gcc.dg/torture/pr110491.c: New testcase.
2023-07-04Streamer: Fix out of range memory access of machine modePan Li6-11/+25
The machine mode has already been extended from 8 to 16 bits, but one place in the streamer was missed: it has a hard-coded machine mode array of size 256. In the LTO pass, we memset the array with a count of MAX_MACHINE_MODE, and the value of MAX_MACHINE_MODE grows as more and more modes are added, while the machine mode array in tree-streamer stays at 256. Then, when MAX_MACHINE_MODE is greater than 256, the memset in lto_output_init_mode_table writes out of range. This patch takes MAX_MACHINE_MODE as the size of the array in the streamer, to make sure there is no such out-of-range memory access in future. This patch also adjusts the places that assume MAX_MACHINE_MODE <= 256. Care is taken that for offload compilation, we interpret the stream-in data in terms of the host 'MAX_MACHINE_MODE' ('file_data->mode_bits'), which very likely is different from the offload device 'MAX_MACHINE_MODE'. gcc/ * lto-streamer-in.cc (lto_input_mode_table): Stream in the mode bits for machine mode table. * lto-streamer-out.cc (lto_write_mode_table): Stream out the HOST machine mode bits. * lto-streamer.h (struct lto_file_decl_data): New fields mode_bits. * tree-streamer.cc (streamer_mode_table): Take MAX_MACHINE_MODE as the table size. * tree-streamer.h (streamer_mode_table): Ditto. (bp_pack_machine_mode): Take 1 << ceil_log2 (MAX_MACHINE_MODE) as the packing limit. (bp_unpack_machine_mode): Ditto with 'file_data->mode_bits'. gcc/lto/ * lto-common.cc (lto_file_finalize) [!ACCEL_COMPILER]: Initialize 'file_data->mode_bits'. Signed-off-by: Pan Li <pan2.li@intel.com> Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
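Note on the sizing rule: the ChangeLog derives the packing limit from `1 << ceil_log2 (MAX_MACHINE_MODE)` and sizes the table by the mode count instead of a literal 256. The sketch below shows the arithmetic only. `ceil_log2_sketch` and `make_mode_table` are hypothetical helpers, not the GCC functions.

```cpp
#include <cassert>
#include <vector>

// Smallest k such that (1 << k) >= n. Valid for 1 <= n <= 2^31.
inline unsigned ceil_log2_sketch(unsigned n)
{
    assert(n >= 1 && n <= (1u << 31));
    unsigned k = 0;
    while ((1u << k) < n)
        ++k;
    return k;
}

// A per-mode table sized by the actual number of modes, so clearing
// max_machine_mode entries can never run past the end.
inline std::vector<unsigned char> make_mode_table(unsigned max_machine_mode)
{
    return std::vector<unsigned char>(max_machine_mode, 0);
}
```

With 256 modes the packing needs 8 bits. With 257 modes it needs 9, and a table fixed at 256 entries would be one entry short.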
2023-07-04LTO: Capture 'lto_file_decl_data *file_data' in 'class lto_input_block'Thomas Schwinge12-21/+21
... instead of just 'unsigned char *mode_table'. Preparation for a forthcoming change, where we need to capture an additional 'file_data' item, so it seems easier to just capture that one proper. gcc/ * lto-streamer.h (class lto_input_block): Capture 'lto_file_decl_data *file_data' instead of just 'unsigned char *mode_table'. * ipa-devirt.cc (ipa_odr_read_section): Adjust. * ipa-fnsummary.cc (inline_read_section): Likewise. * ipa-icf.cc (sem_item_optimizer::read_section): Likewise. * ipa-modref.cc (read_section): Likewise. * ipa-prop.cc (ipa_prop_read_section, read_replacements_section): Likewise. * ipa-sra.cc (isra_read_summary_section): Likewise. * lto-cgraph.cc (input_cgraph_opt_section): Likewise. * lto-section-in.cc (lto_create_simple_input_block): Likewise. * lto-streamer-in.cc (lto_read_body_or_constructor) (lto_input_toplevel_asms): Likewise. * tree-streamer.h (bp_unpack_machine_mode): Likewise. gcc/lto/ * lto-common.cc (lto_read_decls): Adjust.
2023-07-04Use mark_ssa_maybe_undefs in PHI-OPTRichard Biener3-20/+6
The following removes gimple_uses_undefined_value_p and instead uses the conservative mark_ssa_maybe_undefs in PHI-OPT, the last user of the other API. * tree-ssa-phiopt.cc (pass_phiopt::execute): Mark SSA undefs. (empty_bb_or_one_feeding_into_p): Check for them. * tree-ssa.h (gimple_uses_undefined_value_p): Remove. * tree-ssa.cc (gimple_uses_undefined_value_p): Likewise.
2023-07-04Remove unnecessary check on scalar_niter == 0Richard Biener1-7/+0
The following removes an unnecessary check. * tree-vect-loop.cc (vect_analyze_loop_costing): Remove check guarding scalar_niter underflow.
2023-07-04tree-optimization/110376 - testcase for fixed bugRichard Biener1-0/+39
This is a new testcase for the fixed bug. PR tree-optimization/110376 * gcc.dg/torture/pr110376.c: New testcase.
2023-07-04PR tree-optimization/110531 - Vect: avoid using uninitialized variableHao Liu1-1/+1
slp_done_for_suggested_uf is used directly in vect_analyze_loop_2 without initialization, which is undefined behavior. Initialize it to false according to the discussion. gcc/ChangeLog: PR tree-optimization/110531 * tree-vect-loop.cc (vect_analyze_loop_1): initialize slp_done_for_suggested_uf to false.
2023-07-04tree-optimization/110228 - avoid undefs in ifcombine more thoroughlyRichard Biener3-2/+42
The following replaces the simplistic gimple_uses_undefined_value_p with the conservative mark_ssa_maybe_undefs approach as already used by LIM and IVOPTs. This is to avoid exposing an unconditional uninitialized read on a path from entry by if-combine. PR tree-optimization/110228 * tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute): Mark SSA may-undefs. (bb_no_side_effects_p): Check stmt uses for undefs. * gcc.dg/torture/pr110228.c: New testcase. * gcc.dg/uninit-pr101912.c: Un-XFAIL.
2023-07-04tree-optimization/110436 - bogus live/relevant for unused patternRichard Biener2-0/+19
When we compute liveness and relevantness we have to make sure to handle live but not relevant stmts in a way we can later vectorize them. When the stmt uses only operands that do not need vectorization we can just leave such stmts in place - but not in the case they are recognized as patterns. Since we don't have a way to cancel pattern recognition we have to force mark such stmts as relevant. PR tree-optimization/110436 * tree-vect-stmts.cc (vect_mark_relevant): Expand dumping, force live but not relevant pattern stmts relevant. * gcc.dg/pr110436.c: New testcase.
2023-07-04x86: Enable ENQCMD and UINTR for march=sierraforest.Lili Cui2-4/+5
Enable ENQCMD and UINTR for march=sierraforest according to Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368 gcc/ChangeLog * config/i386/i386.h: Add PTA_ENQCMD and PTA_UINTR to PTA_SIERRAFOREST. * doc/invoke.texi: Update new isa to march=sierraforest and grandridge.
2023-07-04ada: Do not unnecessarily use component-wise loop for slice assignmentEric Botcazou3-27/+31
This relaxes the condition under which Expand_Assign_Array leaves the assignment to or from an array slice untouched. The main prerequisite for the code generator is that everything be aligned on byte boundaries and Is_Possibly_Unaligned_Slice is too strong a predicate for this, so it is replaced by the combination of Possible_Bit_Aligned_Component and Is_Bit_Packed_Array, modulo a change to Possible_Bit_Aligned_Component to take into account the specific case of slices. gcc/ada/ * exp_ch5.adb (Expand_Assign_Array): Adjust comment above the calls to Possible_Bit_Aligned_Component on the LHS and RHS. Do not call Is_Possibly_Unaligned_Slice in the slice case. * exp_util.ads (Component_May_Be_Bit_Aligned): Add For_Slice boolean parameter. (Possible_Bit_Aligned_Component): Likewise. * exp_util.adb (Component_May_Be_Bit_Aligned): Do not return False for the slice of a small record or bit-packed array component. (Possible_Bit_Aligned_Component): Pass For_Slice in recursive calls, except in the slice case where True is passed, as well as in call to Component_May_Be_Bit_Aligned.
2023-07-04ada: Small adjustments to new procedure Expand_Unchecked_Union_EqualityEric Botcazou3-16/+14
The procedure is not stable under repeated invocation. Now it may be called twice on the same node, for example during the expansion of the renaming of the predefined equality operator after the unchecked union type is frozen. gcc/ada/ * exp_ch4.ads (Expand_Unchecked_Union_Equality): Only take a single parameter. * exp_ch4.adb (Expand_Unchecked_Union_Equality): Add guard against repeated invocation on the same node. * exp_ch6.adb (Expand_Call): Only pass a single actual parameter in the call to Expand_Unchecked_Union_Equality.
2023-07-04ada: Add No_Use_Of_Attribute & No_Use_Of_Pragma to gnat_rmViljar Indus3-377/+415
gcc/ada/ * doc/gnat_rm/standard_and_implementation_defined_restrictions.rst: add No_Use_Of_Attribute & No_Use_Of_Pragma restrictions. * gnat_rm.texi: Regenerate. * gnat_ugn.texi: Regenerate.
2023-07-04ada: Fix list of inherited subprograms in query for GNATproveYannick Moy2-0/+35
The query Inherited_Subprograms was returning a list containing some subprograms whose overriding was also in the list, when interfaces were present. This was an issue for GNATprove. Add a mode to this function that filters out overridden primitives. gcc/ada/ * sem_disp.adb (Inherited_Subprograms): Add parameter to filter out results. * sem_disp.ads: Likewise.
2023-07-04middle-end/110495 - avoid associating constants with (VL) vectorsRichard Biener4-15/+20
When trying to associate (v + INT_MAX) + INT_MAX we are using the TREE_OVERFLOW bit to check for correctness. That isn't working for VECTOR_CSTs and it can't in general when one considers VL vectors. It looks like it should work for COMPLEX_CSTs but I didn't try to single out _Complex int in this change. The following makes sure that for vectors we use the fallback of using unsigned arithmetic when associating the above to v + (INT_MAX + INT_MAX). PR middle-end/110495 * tree.h (TREE_OVERFLOW): Do not mention VECTOR_CSTs since we do not set TREE_OVERFLOW on those since the introduction of VL vectors. * match.pd (x +- CST +- CST): For VECTOR_CST do not look at TREE_OVERFLOW to determine validity of association. * gcc.dg/tree-ssa/addadd-2.c: Amend. * gcc.dg/tree-ssa/forwprop-27.c: Adjust.
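Note on the fallback: signed overflow is undefined, so folding `(v + INT_MAX) + INT_MAX` into `v + (INT_MAX + INT_MAX)` is only safe if the constant sum is computed in unsigned arithmetic, where wraparound is well defined. A minimal scalar sketch follows. The function name is hypothetical and the real transform operates on GIMPLE, not on C++ values.

```cpp
#include <cassert>
#include <climits>

// Reassociates (v + c1) + c2 as v + (c1 + c2) in unsigned arithmetic.
// The result is returned as unsigned. Converting it back to int is safe
// whenever the mathematically exact result fits in int.
inline unsigned add_two_constants_unsigned(int v, int c1, int c2)
{
    unsigned folded = static_cast<unsigned>(c1) + static_cast<unsigned>(c2);
    return static_cast<unsigned>(v) + folded;
}
```

For `v = -INT_MAX` the original expression evaluates to `INT_MAX` without overflow, and the unsigned form produces the same bit pattern even though `INT_MAX + INT_MAX` on its own is not representable as `int`.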
2023-07-04tree-optimization/110310 - move vector epilogue disabling to analysis phaseRichard Biener4-114/+102
The following moves the late decision to elide vectorized epilogues to the analysis phase and also avoids altering the epilogue's niter. The costing part from vect_determine_partial_vectors_and_peeling is moved to vect_analyze_loop_costing where we use the main loop analysis to constrain the epilogue scalar iterations. I have not tried to integrate this with vect_known_niters_smaller_than_vf. It seems the for_epilogue_p parameter in vect_determine_partial_vectors_and_peeling is largely useless and we could compute that in the function itself. PR tree-optimization/110310 * tree-vect-loop.cc (vect_determine_partial_vectors_and_peeling): Move costing part ... (vect_analyze_loop_costing): ... here. Integrate better estimate for epilogues from ... (vect_analyze_loop_2): Call vect_determine_partial_vectors_and_peeling with actual epilogue status. * tree-vect-loop-manip.cc (vect_do_peeling): ... here and avoid cancelling epilogue vectorization. (vect_update_epilogue_niters): Remove. No longer update epilogue LOOP_VINFO_NITERS. * gcc.target/i386/pr110310.c: New testcase. * gcc.dg/vect/slp-perm-12.c: Disable epilogue vectorization.
2023-07-04Revert "RISC-V: Fix one typo of FRM dynamic definition"Pan Li1-2/+2
This reverts commit 3d95a524d4746ceb3065f92f30a5679afb88d16a. gcc/ChangeLog: * config/riscv/vector.md: Revert changes.
2023-07-04Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} patternJu-Zhe Zhong4-17/+42
Hi, Richi and Richard. Based on the review comments from Richard: https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html I changed the len_mask_gather_load/len_mask_scatter_store argument order to {len,bias,mask}. We now add len and mask using add_len_and_mask_args, which is the same as for partial_load/partial_store. The code is now more reasonable and easier to maintain.

This patch adds LEN_MASK_{GATHER_LOAD,SCATTER_STORE} to allow targets to handle flow control by mask and loop control by length on gather/scatter memory operations. Consider the following case:

    void f (uint8_t *restrict a, uint8_t *restrict b, int n,
            int base, int step, int *restrict cond)
    {
      for (int i = 0; i < n; ++i)
        {
          if (cond[i])
            a[i * step + base] = b[i * step + base];
        }
    }

We hope RVV can vectorize such a case into the following IR:

    loop_len = SELECT_VL
    control_mask = comparison
    v = LEN_MASK_GATHER_LOAD (.., loop_len, bias, control_mask)
    LEN_MASK_SCATTER_STORE (... v, ..., loop_len, bias, control_mask)

This patch doesn't apply such patterns in the vectorizer; it just adds the patterns and updates the documentation. A patch that applies such patterns in the vectorizer will follow soon after this patch is approved. Ok for trunk?

gcc/ChangeLog: * doc/md.texi: Add len_mask_gather_load/len_mask_scatter_store. * internal-fn.cc (expand_scatter_store_optab_fn): Ditto. (expand_gather_load_optab_fn): Ditto. (internal_load_fn_p): Ditto. (internal_store_fn_p): Ditto. (internal_gather_scatter_fn_p): Ditto. (internal_fn_len_index): Ditto. (internal_fn_mask_index): Ditto. (internal_fn_stored_value_index): Ditto. * internal-fn.def (LEN_MASK_GATHER_LOAD): Ditto. (LEN_MASK_SCATTER_STORE): Ditto. * optabs.def (OPTAB_CD): Ditto.
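Note on the semantics: a scalar model may help when reading the IR in the message above. The sketch assumes that lane `i` is active when `i < len + bias` and `mask[i]` is set. The function names, the parameter layout and the choice to leave inactive load lanes unchanged are all assumptions for illustration and are not the internal function signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a length- and mask-controlled gather load.
// Inactive lanes are left unchanged in this model.
inline void len_mask_gather_load(std::vector<std::uint8_t>& lanes,
                                 const std::uint8_t* base,
                                 const std::vector<std::ptrdiff_t>& offsets,
                                 std::ptrdiff_t len, std::ptrdiff_t bias,
                                 const std::vector<std::uint8_t>& mask)
{
    for (std::size_t i = 0; i < offsets.size(); ++i)
        if (static_cast<std::ptrdiff_t>(i) < len + bias && mask[i])
            lanes[i] = base[offsets[i]];
}

// Scalar model of the matching scatter store: only active lanes write memory.
inline void len_mask_scatter_store(std::uint8_t* base,
                                   const std::vector<std::ptrdiff_t>& offsets,
                                   const std::vector<std::uint8_t>& lanes,
                                   std::ptrdiff_t len, std::ptrdiff_t bias,
                                   const std::vector<std::uint8_t>& mask)
{
    for (std::size_t i = 0; i < offsets.size(); ++i)
        if (static_cast<std::ptrdiff_t>(i) < len + bias && mask[i])
            base[offsets[i]] = lanes[i];
}
```

In terms of the loop in the message, `offsets[i]` plays the role of `i * step + base`, `mask` comes from `cond[i]`, and `len` is the value produced by SELECT_VL for the current iteration.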
2023-07-04RISC-V: Optimize local AVL propagationJuzhe-Zhong2-0/+43
I recently noticed that the current VSETVL pass has an unnecessary restriction on local AVL propagation. Consider the following case:

+ insn 1: vsetvli a5,a3,e8,mf4,ta,mu
+ insn 2: vsetvli zero,a5,e32,m1,ta,ma
+ ...
+ vle32.v v1,0(a1)
+ vsetvli a2,zero,e32,m1,ta,ma
+ vadd.vv v1,v1,v1
+ vsetvli zero,a5,e32,m1,ta,ma
+ vse32.v v1,0(a0)
+ ...
+ insn 3: sub a3,a3,a5
+ ...

We failed to elide insn 2 (vsetvl insn) since insn 3 modifies the "a3" AVL. Actually, we don't really care about insn 3, since we only need to check that there is no insn between insn 1 and insn 2 that modifies the "a3" AVL. Then we can propagate the AVL "a3" from insn 1 to insn 2. Finally, insn 2 is eliminated. After this patch:

+ insn 1: vsetvli a5,a3,e8,mf4,ta,ma
+ ...
+ vle32.v v1,0(a1)
+ vsetvli a2,zero,e32,m1,ta,ma
+ vadd.vv v1,v1,v1
+ vsetvli zero,a5,e32,m1,ta,ma
+ vse32.v v1,0(a0)
+ ...
+ insn 3: sub a3,a3,a5
+ ...

gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vector_insn_info::parse_insn): Add early break. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/avl_prop-1.c: New test.
2023-07-04CRIS: Replace unspec CRIS_UNSPEC_SWAP_BITS with rtx bitreverseHans-Peter Nilsson1-7/+2
This is just expected to be a change in representation. No code is expected to change; no new tests are added. * config/cris/cris.md (CRIS_UNSPEC_SWAP_BITS): Remove. ("cris_swap_bits", "ctzsi2"): Use bitreverse instead.
2023-07-04dwarf2out.cc (mem_loc_descriptor): Handle BITREVERSEHans-Peter Nilsson1-0/+1
This seems to have just been overlooked when introducing BITREVERSE. Note that the function name mem_loc_descriptor is a misnomer; it'd better be called rtx_loc_descriptor or any_loc_descriptor, because "anything" RTX can end up here. To wit, when introducing new RTL that ends up as code or for other reasons appear in debug expressions, don't forget to update this function. This was observed by building libstdc++ for cris-elf with a patch replacing the CRIS_UNSPEC_SWAP_BITS by bitreverse, as hitting the internal-error-generating default case. Looking at the BSWAP, POPCOUNT and ROTATE cases, BITREVERSE can probably be fully expressed as DWARF code if need be, but let's start with not throwing an internal error. gcc: * dwarf2out.cc (mem_loc_descriptor): Handle BITREVERSE.
2023-07-04Daily bump.GCC Administrator6-1/+547
2023-07-04libstdc++: Fix <iosfwd> synopsis testJonathan Wakely1-1/+1
The <syncstream> header is only supported for the cxx11 ABI. The declarations of basic_syncbuf, basic_osyncstream, syncbuf and osyncstream were already correctly guarded by a check for _GLIBCXX_USE_CXX11_ABI, but the wsyncbuf and wosyncstream declarations were not. libstdc++-v3/ChangeLog: * testsuite/27_io/headers/iosfwd/synopsis.cc: Make wsyncbuf and wosyncstream depend on _GLIBCXX_USE_CXX11_ABI.
2023-07-04libstdc++: Enable OpenMP 5.0 pragmas in PSTL headersJonathan Wakely1-2/+4
This reapplies r10-1314-g32bab8b6ad0a90 which was lost in the recent PSTL rebase from upstream. * include/pstl/pstl_config.h (_PSTL_PRAGMA_SIMD_SCAN, _PSTL_PRAGMA_SIMD_INCLUSIVE_SCAN, _PSTL_PRAGMA_SIMD_EXCLUSIVE_SCAN): Define to OpenMP 5.0 pragmas even for GCC 10.0+. (_PSTL_UDS_PRESENT): Define to 1 for GCC 10.0+.
2023-07-04libstdc++: Qualify calls to std::_Destroy and _Destroy_auxJonathan Wakely3-3/+14
These calls should be qualified to prevent ADL, which can cause errors for incomplete types that are associated classes. libstdc++-v3/ChangeLog: * include/bits/alloc_traits.h (_Destroy): Qualify call. * include/bits/stl_construct.h (_Destroy, _Destroy_n): Likewise. * testsuite/23_containers/vector/cons/destroy-adl.cc: New test.
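Note on the failure mode: an unqualified call triggers argument-dependent lookup, which has to inspect the classes associated with the argument types and can therefore force instantiation of a template whose argument is still incomplete. The sketch below shows the qualified form compiling with such a type. The `lib` namespace, `destroy_range`, `clear_range`, `Holder` and `Incomplete` are all hypothetical names in the spirit of the new test, not libstdc++ code.

```cpp
#include <cassert>

namespace lib {
    // Stand-in for a destroy helper: walks the range and counts elements.
    template <typename It>
    int destroy_range(It first, It last)
    {
        int n = 0;
        for (; first != last; ++first)
            ++n;
        return n;
    }

    template <typename It>
    int clear_range(It first, It last)
    {
        // Qualified call: no ADL, so the associated classes of It are never
        // examined. An unqualified destroy_range(first, last) would make the
        // compiler instantiate Holder<Incomplete> and fail.
        return lib::destroy_range(first, last);
    }
}

struct Incomplete;                        // never defined
template <typename T> struct Holder { T t; };

inline int demo()
{
    // Pointers to Holder<Incomplete> are fine as long as nothing forces
    // Holder<Incomplete> itself to be instantiated.
    Holder<Incomplete>* ptrs[2] = { nullptr, nullptr };
    return lib::clear_range(ptrs, ptrs + 2);
}
```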
2023-07-04RISC-V: Add support for vector crypto extensionsChristoph Müllner29-0/+779
This series adds basic support for the vector crypto extensions: * Zvbb * Zvbc * Zvkg * Zvkned * Zvkhn[a,b] * Zvksed * Zvksh * Zvkn * Zvknc * Zvkng * Zvks * Zvksc * Zvksg * Zvkt This patch is based on the v20230620 version of the Vector Cryptography specification. The specification is frozen and can be found here: https://github.com/riscv/riscv-crypto/releases/tag/v20230620 Binutils support is merged as 9fdc1b157b6e72f7dd98851a240c5fdb386a558e. All extensions come with (passing) tests for the feature test macros. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Add support for zvbb, zvbc, zvkg, zvkned, zvknha, zvknhb, zvksed, zvksh, zvkn, zvknc, zvkng, zvks, zvksc, zvksg, zvkt and the implied subsets. * config/riscv/arch-canonicalize: Add canonicalization info for zvkn, zvknc, zvkng, zvks, zvksc, zvksg. * config/riscv/riscv-opts.h (MASK_ZVBB): New macro. (MASK_ZVBC): Likewise. (TARGET_ZVBB): Likewise. (TARGET_ZVBC): Likewise. (MASK_ZVKG): Likewise. (MASK_ZVKNED): Likewise. (MASK_ZVKNHA): Likewise. (MASK_ZVKNHB): Likewise. (MASK_ZVKSED): Likewise. (MASK_ZVKSH): Likewise. (MASK_ZVKN): Likewise. (MASK_ZVKNC): Likewise. (MASK_ZVKNG): Likewise. (MASK_ZVKS): Likewise. (MASK_ZVKSC): Likewise. (MASK_ZVKSG): Likewise. (MASK_ZVKT): Likewise. (TARGET_ZVKG): Likewise. (TARGET_ZVKNED): Likewise. (TARGET_ZVKNHA): Likewise. (TARGET_ZVKNHB): Likewise. (TARGET_ZVKSED): Likewise. (TARGET_ZVKSH): Likewise. (TARGET_ZVKN): Likewise. (TARGET_ZVKNC): Likewise. (TARGET_ZVKNG): Likewise. (TARGET_ZVKS): Likewise. (TARGET_ZVKSC): Likewise. (TARGET_ZVKSG): Likewise. (TARGET_ZVKT): Likewise. * config/riscv/riscv.opt: Introduction of riscv_zv{b,k}_subext. gcc/testsuite/ChangeLog: * gcc.target/riscv/zvbb.c: New test. * gcc.target/riscv/zvbc.c: New test. * gcc.target/riscv/zvkg.c: New test. * gcc.target/riscv/zvkn-1.c: New test. * gcc.target/riscv/zvkn.c: New test. * gcc.target/riscv/zvknc-1.c: New test. * gcc.target/riscv/zvknc-2.c: New test. * gcc.target/riscv/zvknc.c: New test. 
* gcc.target/riscv/zvkned.c: New test. * gcc.target/riscv/zvkng-1.c: New test. * gcc.target/riscv/zvkng-2.c: New test. * gcc.target/riscv/zvkng.c: New test. * gcc.target/riscv/zvknha.c: New test. * gcc.target/riscv/zvknhb.c: New test. * gcc.target/riscv/zvks-1.c: New test. * gcc.target/riscv/zvks.c: New test. * gcc.target/riscv/zvksc-1.c: New test. * gcc.target/riscv/zvksc-2.c: New test. * gcc.target/riscv/zvksc.c: New test. * gcc.target/riscv/zvksed.c: New test. * gcc.target/riscv/zvksg-1.c: New test. * gcc.target/riscv/zvksg-2.c: New test. * gcc.target/riscv/zvksg.c: New test. * gcc.target/riscv/zvksh.c: New test. * gcc.target/riscv/zvkt.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2023-07-03Use chain_next on eh_landing_pad_d for GTY (PR middle-end/110510)Andrew Pinski1-1/+1
The backtrace in the bug report suggests that the stack runs out during GC collection because of a long chain of eh_landing_pad_d. This might fix that by adding chain_next to eh_landing_pad_d's GTY marker. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR middle-end/110510 * except.h (struct eh_landing_pad_d): Add chain_next GTY.
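Note on why this helps: without a chain hint, a marking routine that recurses through each `next` pointer uses one stack frame per list element. A chain hint lets the generated marker iterate along the list instead. The sketch below shows the iterative shape only. `PadSketch`, `mark_chain` and `make_chain` are hypothetical and are not the code gengtype emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a chained, garbage-collected node.
struct PadSketch {
    PadSketch* next = nullptr;
    bool marked = false;
};

// Marks every node reachable through 'next' with constant stack usage and
// returns how many nodes were newly marked.
inline std::size_t mark_chain(PadSketch* p)
{
    std::size_t newly_marked = 0;
    for (; p != nullptr && !p->marked; p = p->next) {
        p->marked = true;
        ++newly_marked;
    }
    return newly_marked;
}

// Builds a singly linked chain of n nodes stored contiguously.
inline std::vector<PadSketch> make_chain(std::size_t n)
{
    std::vector<PadSketch> pads(n);
    for (std::size_t i = 0; i + 1 < n; ++i)
        pads[i].next = &pads[i + 1];
    return pads;
}
```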
2023-07-03testsuite, Darwin: Remove an unnecessary flags addition.Iain Sandoe4-14/+2
The addition of the multiply_defined suppress flag has been handled for some considerable time now in the Darwin specs; remove it from the testsuite libs. Avoid duplicates in the specs. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config/darwin.h: Avoid duplicate multiply_defined specs on earlier Darwin versions with shared libgcc. libstdc++-v3/ChangeLog: * testsuite/lib/libstdc++.exp: Remove additional flag handled by Darwin specs. gcc/testsuite/ChangeLog: * lib/g++.exp: Remove additional flag handled by Darwin specs. * lib/obj-c++.exp: Likewise.
2023-07-03tree+ggc: Change return type of predicate functions from int to boolUros Bizjak4-72/+72
Also change internal variable from int to bool. gcc/ChangeLog: * tree.h (tree_int_cst_equal): Change return type from int to bool. (operand_equal_for_phi_arg_p): Ditto. (tree_map_base_marked_p): Ditto. * tree.cc (contains_placeholder_p): Update function body for bool return type. (type_cache_hasher::equal): Ditto. (tree_map_base_hash): Change return type from int to void and adjust function body accordingly. (tree_int_cst_equal): Ditto. (operand_equal_for_phi_arg_p): Ditto. (get_narrower): Change "first" variable to bool. (cl_option_hasher::equal): Update function body for bool return type. * ggc.h (ggc_set_mark): Change return type from int to bool. (ggc_marked_p): Ditto. * ggc-page.cc (gt_ggc_mx): Change return type from int to void and adjust function body accordingly. (ggc_set_mark): Ditto.
2023-07-03Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE argumentsJu-Zhe Zhong8-106/+107
Hi, Richard. I fixed the order as you suggested. Before this patch, the order is {len,mask,bias}. Now, after this patch, the order becomes {len,bias,mask}. Since you said we should not need 'internal_fn_bias_index', the bias index should always be the len index + 1. I notice the LEN_STORE order is {len,vector,bias}; to make them consistent, I reorder it into LEN_STORE {len,bias,vector}, just like MASK_STORE {mask,vector}. Ok for trunk? gcc/ChangeLog: * config/riscv/autovec.md: Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments. * config/riscv/riscv-v.cc (expand_load_store): Ditto. * doc/md.texi: Ditto. * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto. * internal-fn.cc (len_maskload_direct): Ditto. (len_maskstore_direct): Ditto. (add_len_and_mask_args): New function. (expand_partial_load_optab_fn): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments. (expand_partial_store_optab_fn): Ditto. (internal_fn_len_index): New function. (internal_fn_mask_index): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments. (internal_fn_stored_value_index): Ditto. (internal_len_load_store_bias): Ditto. * internal-fn.h (internal_fn_len_index): New function. * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments. * tree-vect-stmts.cc (vectorizable_store): Ditto. (vectorizable_load): Ditto.
2023-07-03ada: Fix renaming of predefined equality operator for unchecked union typesEric Botcazou6-427/+390
The problem is that the predefined equality operator for unchecked union types is implemented out of line by invoking a function that takes more parameters than the two operands, which means that the renaming is not seen as type conforming with this function and, therefore, is rejected. The way out is to implement these additional parameters as "extra" formal parameters, since this kind of parameters is not taken into account for semantic checks. The change also factors out the duplicated generation of actuals for these additional parameters into a single procedure. gcc/ada/ * exp_ch3.ads (Build_Variant_Record_Equality): Add Spec_Id as second parameter. * exp_ch3.adb (Build_Variant_Record_Equality): For unchecked union types, build the additional parameters as extra formal parameters. (Expand_Freeze_Record_Type.Build_Variant_Record_Equality): Pass Empty as Spec_Id in call to Build_Variant_Record_Equality. * exp_ch4.ads (Expand_Unchecked_Union_Equality): New procedure. * exp_ch4.adb (Expand_Composite_Equality): In the presence of a function implementing composite equality, do not special case the unchecked union types, and only convert the operands if the base types are not the same like in Build_Equality_Call. (Build_Equality_Call): Do not special case the unchecked union types and relocate the operands only once. (Expand_N_Op_Eq): Do not special case the unchecked union types. (Expand_Unchecked_Union_Equality): New procedure implementing the specific expansion of calls to the predefined equality function. * exp_ch6.adb (Is_Unchecked_Union_Equality): New predicate. (Expand_Call): Call Is_Unchecked_Union_Equality to determine whether to call Expand_Unchecked_Union_Equality or Expand_Call_Helper. * exp_ch8.adb (Build_Body_For_Renaming): Set Has_Delayed_Freeze flag earlier on Id and pass Id in call to Build_Variant_Record_Equality.