Age  Commit message  Author  Files  Lines
2023-04-26Convert compare_nonzero_chars to wide_ints.Aldy Hernandez1-4/+4
gcc/ChangeLog: * tree-ssa-strlen.cc (compare_nonzero_chars): Convert to wide_ints.
2023-04-26Remove some uses of deprecated irange API.Aldy Hernandez15-37/+39
gcc/ChangeLog: * builtins.cc (expand_builtin_strnlen): Rewrite deprecated irange API uses to new API. * gimple-predicate-analysis.cc (find_var_cmp_const): Same. * internal-fn.cc (get_min_precision): Same. * match.pd: Same. * tree-affine.cc (expr_to_aff_combination): Same. * tree-data-ref.cc (dr_step_indicator): Same. * tree-dfa.cc (get_ref_base_and_extent): Same. * tree-scalar-evolution.cc (iv_can_overflow_p): Same. * tree-ssa-phiopt.cc (two_value_replacement): Same. * tree-ssa-pre.cc (insert_into_preds_of_block): Same. * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same. * tree-ssa-strlen.cc (compare_nonzero_chars): Same. * tree-switch-conversion.cc (bit_test_cluster::emit): Same. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Same. * tree.cc (get_range_pos_neg): Same.
2023-04-26Replace ad-hoc value_range dumpers with irange::dump.Aldy Hernandez3-42/+9
This causes a regression in gcc.c-torture/unsorted/dump-noaddr.c. The test is asserting that two dumps are identical, but they are not because irange dumps the type which varies between runs:

    < VR [irange] void (*<T3dc>) (int) [1, +INF]
    > VR [irange] void (*<T3da>) (int) [1, +INF]

I have changed the pretty printer for irange types to pass TDF_NOUID, thus avoiding this problem.

gcc/ChangeLog:
* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Use vrange::dump instead of ad-hoc dumper.
* tree-ssa-strlen.cc (dump_strlen_info): Same.
* value-range-pretty-print.cc (visit): Pass TDF_NOUID to dump_generic_node.
2023-04-26Fix swapping of ranges.Aldy Hernandez2-51/+6
The legacy range code has logic to swap out of order endpoints in the irange constructor. The new irange code expects the caller to fix any inconsistencies, thus speeding up the common case. However, this means that when we remove legacy, any stragglers must be fixed. This patch fixes the 3 culprits found during the conversion. gcc/ChangeLog: * range-op.cc (operator_cast::op1_range): Use create_possibly_reversed_range. (operator_bitwise_and::simple_op1_range_solver): Same. * value-range.cc (swap_out_of_order_endpoints): Delete. (irange::set): Remove call to swap_out_of_order_endpoints.
2023-04-26Convert users of legacy API to get_legacy_range() function.Aldy Hernandez13-119/+199
This patch converts the users of the legacy API to a function called get_legacy_range() which will return the pieces of the soon to be removed API (min, max, and kind). This is a temporary measure while these users are converted.

In upcoming patches I will convert most users, but most of the middle-end warning uses will remain. Naive attempts to remove them showed that a lot of these uses are quite dependent on the anti-range idiom, and converting them to the new API broke the tests, even when the conversion was conceptually correct. Perhaps someone who understands these passes could take a stab at it.

In the meantime, the legacy uses can be trivially found by grepping for get_legacy_range.

gcc/ChangeLog:
* builtins.cc (determine_block_size): Convert use of legacy API to get_legacy_range.
* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same. (array_bounds_checker::check_array_ref): Same.
* gimple-ssa-warn-restrict.cc (builtin_memref::extend_offset_range): Same.
* ipa-cp.cc (ipcp_store_vr_results): Same.
* ipa-fnsummary.cc (set_switch_stmt_execution_predicate): Same.
* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same. (ipa_write_jump_function): Same.
* pointer-query.cc (get_size_range): Same.
* tree-data-ref.cc (split_constant_offset): Same.
* tree-ssa-strlen.cc (get_range): Same. (maybe_diag_stxncpy_trunc): Same. (strlen_pass::get_len_or_size): Same. (strlen_pass::count_nonzero_bytes_addr): Same.
* tree-vect-patterns.cc (vect_get_range_info): Same.
* value-range.cc (irange::maybe_anti_range): Remove. (get_legacy_range): New. (irange::copy_to_legacy): Use get_legacy_range. (ranges_from_anti_range): Same.
* value-range.h (class irange): Remove maybe_anti_range. (get_legacy_range): New.
* vr-values.cc (check_for_binary_op_overflow): Convert use of legacy API to get_legacy_range. (compare_ranges): Same. (compare_range_with_value): Same. (bounds_of_var_in_loop): Same. (find_case_label_ranges): Same. (simplify_using_ranges::simplify_switch_using_ranges): Same.
2023-04-26Remove irange::constant_p.Aldy Hernandez3-29/+6
gcc/ChangeLog: * value-range-pretty-print.cc (vrange_printer::visit): Remove constant_p use. * value-range.cc (irange::constant_p): Remove. (irange::get_nonzero_bits_from_range): Remove constant_p use. * value-range.h (class irange): Remove constant_p. (irange::num_pairs): Remove constant_p use.
2023-04-26Remove symbolics from irange.Aldy Hernandez3-139/+4
gcc/ChangeLog: * value-range.cc (irange::copy_legacy_to_multi_range): Remove symbolics support. (irange::set): Same. (irange::legacy_lower_bound): Same. (irange::legacy_upper_bound): Same. (irange::contains_p): Same. (range_tests_legacy): Same. (irange::normalize_addresses): Remove. (irange::normalize_symbolics): Remove. (irange::symbolic_p): Remove. * value-range.h (class irange): Remove symbolic_p, normalize_symbolics, and normalize_addresses. * vr-values.cc (simplify_using_ranges::two_valued_val_range_p): Remove symbolics support.
2023-04-26Remove irange::may_contain_p.Aldy Hernandez3-11/+4
The deprecated irange::may_contain_p method differed from contains_p in that it could handle symbolics, which no longer exist in VRP. gcc/ChangeLog: * value-range.cc (irange::may_contain_p): Remove. * value-range.h (range_includes_zero_p): Rewrite may_contain_p usage with contains_p. * vr-values.cc (compare_range_with_value): Same.
2023-04-26Remove range_fold_{unary,binary}_expr.Aldy Hernandez2-91/+0
gcc/ChangeLog: * tree-vrp.cc (supported_types_p): Remove. (defined_ranges_p): Remove. (range_fold_binary_expr): Remove. (range_fold_unary_expr): Remove. * tree-vrp.h (range_fold_unary_expr): Remove. (range_fold_binary_expr): Remove.
2023-04-26Remove deprecated range_fold_{unary,binary}_expr uses from ipa-*.Aldy Hernandez4-27/+57
gcc/ChangeLog: * ipa-cp.cc (ipa_vr_operation_and_type_effects): Convert to ranger API. (ipa_value_range_from_jfunc): Same. (propagate_vr_across_jump_function): Same. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same. * vr-values.cc (bounds_of_var_in_loop): Same.
2023-04-26Remove range_query::get_value_range.Aldy Hernandez5-96/+99
gcc/ChangeLog: * gimple-array-bounds.cc (array_bounds_checker::get_value_range): Add irange argument. (check_out_of_bounds_and_warn): Remove check for vr. (array_bounds_checker::check_array_ref): Remove pointer qualifier for vr and adjust accordingly. * gimple-array-bounds.h (get_value_range): Add irange argument. * value-query.cc (class equiv_allocator): Delete. (range_query::get_value_range): Delete. (range_query::range_query): Remove allocator access. (range_query::~range_query): Same. * value-query.h (get_value_range): Delete. * vr-values.cc (simplify_using_ranges::op_with_boolean_value_range_p): Remove call to get_value_range. (check_for_binary_op_overflow): Same. (simplify_using_ranges::legacy_fold_cond_overflow): Same. (simplify_using_ranges::simplify_abs_using_ranges): Same. (simplify_using_ranges::simplify_cond_using_ranges_1): Same. (simplify_using_ranges::simplify_casted_cond): Same. (simplify_using_ranges::simplify_switch_using_ranges): Same. (simplify_using_ranges::two_valued_val_range_p): Same.
2023-04-26Refactor vrp_evaluate_conditional* and rename it.Aldy Hernandez2-17/+13
gcc/ChangeLog: * vr-values.cc (simplify_using_ranges::vrp_evaluate_conditional_warnv_with_ops): Rename to... (simplify_using_ranges::legacy_fold_cond_overflow): ...this. (simplify_using_ranges::vrp_visit_cond_stmt): Rename to... (simplify_using_ranges::legacy_fold_cond): ...this. (simplify_using_ranges::fold_cond): Rename vrp_evaluate_conditional_warnv_with_ops to legacy_fold_cond_overflow. * vr-values.h (class vr_values): Replace vrp_visit_cond_stmt and vrp_evaluate_conditional_warnv_with_ops with legacy_fold_cond and legacy_fold_cond_overflow respectively.
2023-04-26Remove compare_names* from legacy cond folding.Aldy Hernandez2-59/+0
In a test run I have asserted that the legacy conditional folding only gets overflows, so this removal is safe. gcc/ChangeLog: * vr-values.cc (get_vr_for_comparison): Remove. (compare_name_with_value): Same. (vrp_evaluate_conditional_warnv_with_ops): Remove calls to compare_name_with_value. * vr-values.h: Remove compare_name_with_value. Remove get_vr_for_comparison.
2023-04-26[xstormy16] Add support for byte and word swapping instructions.Roger Sayle6-0/+85
This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap words) instructions. The most obvious application of these is to implement the __builtin_bswap16 and __builtin_bswap32 intrinsics. Currently, __builtin_bswap16 is implemented as:

    foo:
        mov r7,r2
        shl r7,#8
        shr r2,#8
        or r2,r7
        ret

but with this patch becomes:

    foo:
        swpb r2
        ret

Likewise, __builtin_bswap32 now becomes:

    foo:
        swpb r2 | swpb r3 | swpw r2,r3
        ret

Finally, the swpw instruction on its own can be used to exchange two word mode registers without a temporary, so a new pattern and peephole2 have been added to catch this. As described in PR rtl-optimization/106518, register allocation can (in theory) be more efficient on targets that provide a swap/exchange instruction. The slightly unusual swap<mode> naming matches that used in i386.md.

2023-04-26 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/stormy16/stormy16.md (bswaphi2): New define_insn.
(bswapsi2): New define_insn.
(swaphi): New define_insn to exchange two registers (swpw).
(define_peephole2): Recognize exchange of registers as swaphi.

gcc/testsuite/ChangeLog
* gcc.target/xstormy16/bswap16.c: New test case.
* gcc.target/xstormy16/bswap32.c: Likewise.
* gcc.target/xstormy16/swpb.c: Likewise.
* gcc.target/xstormy16/swpw-1.c: Likewise.
* gcc.target/xstormy16/swpw-2.c: Likewise.
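For readers less familiar with the builtins, here is a minimal C sketch of the computation involved: the 16-bit swap is the shift-and-or that the four-instruction sequence performs, and the 32-bit swap decomposes into two byte swaps plus an exchange of the two halves, mirroring the swpb/swpb/swpw sequence. The helper names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value, as the
   shl/shr/or sequence (or a single swpb) does.  */
uint16_t swap_bytes16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Hypothetical helper: byte-swap each 16-bit half, then exchange the
   halves, which is what swpb, swpb, swpw amount to.  */
uint32_t swap_bytes32(uint32_t x)
{
    uint16_t lo = swap_bytes16((uint16_t)(x & 0xFFFFu));
    uint16_t hi = swap_bytes16((uint16_t)(x >> 16));
    return ((uint32_t)lo << 16) | hi;
}
```

For instance, swap_bytes16(0x1234) yields 0x3412 and swap_bytes32(0x11223344) yields 0x44332211.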
2023-04-26MAINTAINERS: fix alphabetic sortingMartin Liska1-1/+1
ChangeLog: * MAINTAINERS: fix sorting
2023-04-26Update gennews for GCC 13.Jakub Jelinek1-0/+1
2023-04-26 Jakub Jelinek <jakub@redhat.com> * gennews (files): Add files for GCC 13.
2023-04-26More last_stmt removalRichard Biener19-150/+100
This adjusts more users of last_stmt where it is clear that debug stmt skipping is unnecessary. In most cases this also allowed significant code simplification. gcc/c/ * gimple-parser.cc (c_parser_parse_gimple_body): Avoid last_stmt. gcc/ * gimple-range-path.cc (path_range_query::compute_outgoing_relations): Avoid last_stmt. * ipa-pure-const.cc (pass_nothrow::execute): Likewise. * predict.cc (apply_return_prediction): Likewise. * sese.cc (set_ifsese_condition): Likewise. Simplify. * tree-cfg.cc (assert_unreachable_fallthru_edge_p): Avoid last_stmt. (make_edges_bb): Likewise. (make_cond_expr_edges): Likewise. (end_recording_case_labels): Likewise. (make_gimple_asm_edges): Likewise. (cleanup_dead_labels): Likewise. (group_case_labels): Likewise. (gimple_can_merge_blocks_p): Likewise. (gimple_merge_blocks): Likewise. (find_taken_edge): Likewise. Also handle empty fallthru blocks. (gimple_duplicate_sese_tail): Avoid last_stmt. (find_loop_dist_alias): Likewise. (gimple_block_ends_with_condjump_p): Likewise. (gimple_purge_dead_eh_edges): Likewise. (gimple_purge_dead_abnormal_call_edges): Likewise. (pass_warn_function_return::execute): Likewise. (execute_fixup_cfg): Likewise. * tree-eh.cc (redirect_eh_edge_1): Likewise. (pass_lower_resx::execute): Likewise. (pass_lower_eh_dispatch::execute): Likewise. (cleanup_empty_eh): Likewise. * tree-if-conv.cc (if_convertible_bb_p): Likewise. (predicate_bbs): Likewise. (ifcvt_split_critical_edges): Likewise. * tree-loop-distribution.cc (create_edge_for_control_dependence): Likewise. (loop_distribution::transform_reduction_loop): Likewise. * tree-parloops.cc (transform_to_exit_first_loop_alt): Likewise. (try_transform_to_exit_first_loop_alt): Likewise. (transform_to_exit_first_loop): Likewise. (create_parallel_loop): Likewise. * tree-scalar-evolution.cc (get_loop_exit_condition): Likewise. * tree-ssa-dce.cc (mark_last_stmt_necessary): Likewise. (eliminate_unnecessary_stmts): Likewise. 
* tree-ssa-dom.cc (dom_opt_dom_walker::set_global_ranges_from_unreachable_edges): Likewise. * tree-ssa-ifcombine.cc (ifcombine_ifandif): Likewise. (pass_tree_ifcombine::execute): Likewise. * tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Likewise. (should_duplicate_loop_header_p): Likewise. * tree-ssa-loop-ivcanon.cc (create_canonical_iv): Likewise. (tree_estimate_loop_size): Likewise. (try_unroll_loop_completely): Likewise. * tree-ssa-loop-ivopts.cc (tree_ssa_iv_optimize_loop): Likewise. * tree-ssa-loop-manip.cc (ip_normal_pos): Likewise. (canonicalize_loop_ivs): Likewise. * tree-ssa-loop-niter.cc (determine_value_range): Likewise. (bound_difference): Likewise. (number_of_iterations_popcount): Likewise. (number_of_iterations_cltz): Likewise. (number_of_iterations_cltz_complement): Likewise. (simplify_using_initial_conditions): Likewise. (number_of_iterations_exit_assumptions): Likewise. (loop_niter_by_eval): Likewise. (estimate_numbers_of_iterations): Likewise.
2023-04-26RISC-V: Fine tune vmadc/vmsbc RA constraintJu-Zhe Zhong5-88/+608
gcc/ChangeLog: * config/riscv/vector.md: Refine vmadc/vmsbc RA constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
2023-04-26rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]Kewen Lin1-13/+13
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used to be guarded with condition TARGET_P9_VECTOR before the new bif framework was introduced (r12-5752-gd08236359eb229). Since r12-5752 they have been placed under stanza ieee128-hw, which checks condition TARGET_FLOAT128_HW. This caused test case float128-cmp2-runnable.c to fail at -m32, as the condition TARGET_FLOAT128_HW isn't satisfied with -m32. Checking the commit history, I didn't see any notes on why this condition change was made, so this patch moves these bifs from stanza ieee128-hw back to stanza power9-vector.

PR target/108758

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt, __builtin_vsx_scalar_cmp_exp_qp_lt, __builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw to power9-vector.
2023-04-26rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]Kewen Lin6-3/+218
As PR109069 shows, commit r12-6537-g080a06fcb076b3 which introduces define_insn_and_split sldoi_to_mov adopts easy_vector_constant for const vector of interest, but it's wrong since predicate easy_vector_constant doesn't guarantee each byte in the const vector is the same. One counter example is the const vector in pr109069-1.c. This patch is to introduce new predicate const_vector_each_byte_same to ensure all bytes in the given const vector are the same by considering both int and float, meanwhile for the constants which don't meet easy_vector_constant we need to gen a move instead of just a set, and uses VECTOR_MEM_ALTIVEC_OR_VSX_P rather than VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support under VSX since vector long long type of vec_sld is guarded under stanza vsx. PR target/109069 gcc/ChangeLog: * config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate easy_vector_constant with const_vector_each_byte_same, add handlings in preparation for !easy_vector_constant, and update VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P. * config/rs6000/predicates.md (const_vector_each_byte_same): New predicate. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr109069-1.c: New test. * gcc.target/powerpc/pr109069-2-run.c: New test. * gcc.target/powerpc/pr109069-2.c: New test. * gcc.target/powerpc/pr109069-2.h: New test.
2023-04-26RISC-V: Optimize comparison patterns for register allocationJuzhe-Zhong17-89/+3817
The current RA constraint for RVV comparison instructions does not allow any overlap between the destination and source operand registers. For example:

    vmseq.vv vd, vs2, vs1

If LMUL = 8, vs2 = v8, vs1 = v16: under the current GCC RA constraint, GCC does not allow vd to be any register in v8 ~ v23. However, this is too conservative and not required by the RVV ISA. Since the destination EEW of a comparison is always EEW = 1, it always follows the overlap rules for Dest EEW < Source EEW. So in this case, we should give GCC RA the chance to allocate v8 or v16 for vd, so that we get better vector register usage in RA.

gcc/ChangeLog:
* config/riscv/vector.md (*pred_cmp<mode>_merge_tie_mask): New pattern.
(*pred_ltge<mode>_merge_tie_mask): Ditto.
(*pred_cmp<mode>_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
(*pred_cmp<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_cmp<mode>_narrow_merge_tie_mask): Ditto.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase.
* gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
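The overlap rule in the comparison-pattern commit above can be sketched as a small predicate. This is only an illustrative model under the assumption that the destination is a single mask register and that an overlap is acceptable only at the lowest-numbered register of a source group; the function names are hypothetical, and the sketch ignores other constraints (such as v0 being reserved for the mask operand) that the real patterns must honour.

```c
#include <stdbool.h>

/* Hypothetical model: does register vd fall inside the source register
   group [base, base + lmul)?  */
static bool overlaps_group(int vd, int base, int lmul)
{
    return vd >= base && vd < base + lmul;
}

/* Sketch of the relaxed rule: a comparison destination (EEW = 1) may
   overlap a source group only at that group's lowest-numbered register.  */
bool cmp_dest_allowed(int vd, int vs2, int vs1, int lmul)
{
    bool ok_vs2 = !overlaps_group(vd, vs2, lmul) || vd == vs2;
    bool ok_vs1 = !overlaps_group(vd, vs1, lmul) || vd == vs1;
    return ok_vs2 && ok_vs1;
}
```

With LMUL = 8, vs2 = v8 and vs1 = v16, this model accepts vd = v8 and vd = v16 (the two cases the commit message wants to permit) and still rejects v9 through v15 and v17 through v23.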
2023-04-26RISC-V: Fix redundant vmv1r.v instruction in vmsge.vx codegenJu-Zhe Zhong2-9/+8
The current expansion of vmsge makes RA produce a redundant vmv1r.v.

Testcase:

    void f1 (void * in, void *out, int32_t x)
    {
      vbool32_t mask = *(vbool32_t*)in;
      asm volatile ("":::"memory");
      vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
      vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in, 4);
      vbool32_t m3 = __riscv_vmsge_vx_i32m1_b32 (v, x, 4);
      vbool32_t m4 = __riscv_vmsge_vx_i32m1_b32_mu (mask, m3, v, x, 4);
      m4 = __riscv_vmsge_vv_i32m1_b32_m (m4, v2, v2, 4);
      __riscv_vsm_v_b32 (out, m4, 4);
    }

Before this patch:

    f1:
        vsetvli a5,zero,e8,mf4,ta,ma
        vlm.v v0,0(a0)
        vsetivli zero,4,e32,m1,ta,mu
        vle32.v v3,0(a0)
        vle32.v v2,0(a0),v0.t
        vmslt.vx v1,v3,a2
        vmnot.m v1,v1
        vmslt.vx v1,v3,a2,v0.t
        vmxor.mm v1,v1,v0
        vmv1r.v v0,v1
        vmsge.vv v2,v2,v2,v0.t
        vsm.v v2,0(a1)
        ret

After this patch:

    f1:
        vsetvli a5,zero,e8,mf4,ta,ma
        vlm.v v0,0(a0)
        vsetivli zero,4,e32,m1,ta,mu
        vle32.v v3,0(a0)
        vle32.v v2,0(a0),v0.t
        vmslt.vx v1,v3,a2
        vmnot.m v1,v1
        vmslt.vx v1,v3,a2,v0.t
        vmxor.mm v0,v1,v0
        vmsge.vv v2,v2,v2,v0.t
        vsm.v v2,0(a1)
        ret

gcc/ChangeLog:
* config/riscv/vector.md: Fix redundant vmv1r.v.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Adapt assembly check.
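To see why the vmslt/vmnot and vmslt/vmxor pairs in the listings above compute "greater than or equal", it may help to model the mask registers as bit vectors. The following C sketch is an illustration only: one bit per element, eight elements, with hypothetical function names; it does not describe how GCC represents these operations internally.

```c
#include <stdint.h>

/* Bit i of every argument models mask element i; lt holds the result
   of "v < x" for each element.  */

/* Unmasked vmsge.vx: vmslt.vx followed by vmnot.m.  */
uint8_t vmsge_unmasked(uint8_t lt)
{
    return (uint8_t)~lt;
}

/* Masked, mask-undisturbed vmsge.vx: a masked vmslt.vx followed by
   vmxor.mm with the mask.  */
uint8_t vmsge_masked_mu(uint8_t mask, uint8_t maskedoff, uint8_t lt)
{
    /* Masked vmslt: active elements take lt, inactive ones keep the
       old destination value.  */
    uint8_t t = (uint8_t)((mask & lt) | (~mask & maskedoff));
    /* XOR with the mask flips only the active elements, turning
       "less than" into "greater than or equal".  */
    return (uint8_t)(t ^ mask);
}
```

In this model the final XOR already produces the mask that the following vmsge.vv consumes, which is why writing it straight into v0 makes the vmv1r.v copy unnecessary.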
2023-04-26RISC-V: Fine tune gather load RA constraintJu-Zhe Zhong2-27/+330
For DEST EEW < SOURCE EEW, we can partially overlap registers according to the RVV ISA.

gcc/ChangeLog:
* config/riscv/vector.md: Fix RA constraint.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.
2023-04-26RISC-V: Bugfix for RVV vbool*_t vn_reference_equalPan Li4-3/+23
On most architectures the precision_size of vbool*_t types is calculated as a multiple of the type size. For example:

    precision_size = type_size * 8 (i.e., the number of bits per byte).

Unfortunately, some architectures like RISC-V adjust the precision_size of vbool*_t in order to align with the ISA. For example:

    type_size      = [1, 1, 1, 1, 2, 4, 8]
    precision_size = [1, 2, 4, 8, 16, 32, 64]

Then the precision_size of the RISC-V vbool*_t types is not a multiple of the type_size. This patch handles this case when comparing vn_references.

Given the code below:

    void test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict out)
    {
      vbool8_t v1 = *(vbool8_t*)in;
      vbool16_t v2 = *(vbool16_t*)in;
      *(vbool8_t*)(out + 100) = v1;
      *(vbool16_t*)(out + 200) = v2;
    }

Before this patch:

    csrr t0,vlenb
    slli t1,t0,1
    csrr a3,vlenb
    sub sp,sp,t1
    slli a4,a3,1
    add a4,a4,sp
    addi a2,a1,100
    vsetvli a5,zero,e8,m1,ta,ma
    sub a3,a4,a3
    vlm.v v24,0(a0)
    vsm.v v24,0(a2)
    vsm.v v24,0(a3)
    addi a1,a1,200
    csrr t0,vlenb
    vsetvli a4,zero,e8,mf2,ta,ma
    slli t1,t0,1
    vlm.v v24,0(a3)
    vsm.v v24,0(a1)
    add sp,sp,t1
    jr ra

After this patch:

    addi a3,a1,100
    vsetvli a4,zero,e8,m1,ta,ma
    addi a1,a1,200
    vlm.v v24,0(a0)
    vsm.v v24,0(a3)
    vsetvli a5,zero,e8,mf2,ta,ma
    vlm.v v24,0(a0)
    vsm.v v24,0(a1)
    ret

PR target/109272

gcc/ChangeLog:
* tree-ssa-sccvn.cc (vn_reference_eq): Add type vector subparts check for vn_reference equal.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr108185-4.c: Update test check condition.
* gcc.target/riscv/rvv/base/pr108185-5.c: Likewise.
* gcc.target/riscv/rvv/base/pr108185-6.c: Likewise.

Signed-off-by: Pan Li <pan2.li@intel.com>
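The mismatch between byte size and precision described in the vbool*_t commit above can be made concrete with a small table. The sketch below pairs the two lists from the commit message element by element, which is an assumption, and uses hypothetical struct and function names; it only illustrates why a size-only comparison cannot tell two such accesses apart and is not the check added to vn_reference_eq.

```c
#include <stdbool.h>

/* Hypothetical record; values are taken from the commit message and
   paired element by element (an assumption).  */
struct mask_type_info {
    unsigned type_size;      /* size in bytes */
    unsigned precision_size; /* precision in bits */
};

static const struct mask_type_info vbool_types[] = {
    {1, 1}, {1, 2}, {1, 4}, {1, 8}, {2, 16}, {4, 32}, {8, 64},
};

/* True when the usual "precision = bytes * 8" assumption holds.  */
bool precision_matches_size(const struct mask_type_info *t)
{
    return t->precision_size == t->type_size * 8u;
}

/* Two accesses with the same byte size may still differ in precision,
   so comparing sizes alone is not enough to treat them as equal.  */
bool same_size_but_different_type(const struct mask_type_info *a,
                                  const struct mask_type_info *b)
{
    return a->type_size == b->type_size
           && a->precision_size != b->precision_size;
}
```

The first three entries all occupy one byte yet differ in precision, which is the kind of pair the extra check has to distinguish.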
2023-04-25RISC-V: Add auto-vectorization compile option for RVVJu-Zhe Zhong2-0/+52
This patch adds 2 compile options for RVV auto-vectorization.

1. -param=riscv-autovec-preference=
   This option specifies the auto-vectorization approach for RVV. Currently, we only support scalable and fixed-vlmax.
   - scalable means VLA auto-vectorization. The vector length is unknown to the compiler and is a runtime invariant. This approach allows us to compile code that runs on an RVV CPU of any vector length.
   - fixed-vlmax means the compiler knows the RVV CPU vector length and compiles in fixed-length VLS auto-vectorization mode. That is, if we specify vector-length=512, the executable can only run on an RVV CPU with vector-length = 512.
   - TODO: we may need to support min-length VLS auto-vectorization, meaning the executable can also run on RVV CPUs with a larger vector length.

2. -param=riscv-autovec-lmul=
   Specifies the LMUL choice for RVV auto-vectorization.

gcc/ChangeLog:
* config/riscv/riscv-opts.h (enum riscv_autovec_preference_enum): Add enum for auto-vectorization preference.
(enum riscv_autovec_lmul_enum): Add enum for choosing LMUL of RVV auto-vectorization.
* config/riscv/riscv.opt: Add compile option for RVV auto-vectorization.
2023-04-25avoid splitting small constants in bclri_nottwobits patternsJivan Hakobyan3-3/+22
I have noticed that when we try to clear two bits through a small constant and ZBS is enabled, GCC splits the operation into two "andi" instructions. For example, for the following C code:

    int foo(int a) { return a & ~ 0x101; }

GCC generates the following:

    foo:
        andi a0,a0,-2
        andi a0,a0,-257
        ret

but should generate this:

    foo:
        andi a0,a0,-258
        ret

This patch solves the mentioned issue.

gcc/ChangeLog
* config/riscv/bitmanip.md: Updated predicates of bclri<mode>_nottwobits and bclridisi_nottwobits patterns.
* config/riscv/predicates.md: (not_uimm_extra_bit_or_nottwobits): Adjust predicate to avoid splitting arith constants.
(const_nottwobits_not_arith_operand): New predicate.

gcc/testsuite
* gcc.target/riscv/zbs-bclri-nottwobits.c: New test.
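The arithmetic behind the single-andi form can be checked with a few lines of C. This is a sketch for illustration, with hypothetical helper names: the mask ~0x101 equals -258, which lies inside the 12-bit signed immediate range of andi, so one instruction suffices.

```c
#include <stdbool.h>
#include <stdint.h>

/* RISC-V I-type instructions such as andi take a 12-bit signed
   immediate.  */
bool fits_simm12(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Clearing bits 0 and 8 one at a time, as the two-andi sequence does.  */
int32_t clear_two_steps(int32_t a)
{
    a &= -2;    /* ~0x001 */
    a &= -257;  /* ~0x100 */
    return a;
}

/* Clearing both bits with the single mask ~0x101 == -258.  */
int32_t clear_one_step(int32_t a)
{
    return a & ~0x101;
}
```

Both helpers return the same value for every input; a mask such as ~0x1001 would fall outside the immediate range, which is the kind of constant the split patterns remain useful for.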
2023-04-26PR modula2/108121 Re-implement overflow detection for constant literalsGaius Mulley12-354/+188
This patch fixes the overflow detection for constant literals. The ZTYPE is changed to int128 (or int64) if int128 is unavailable and constant literals are built from widest_int. The widest_int is converted into the tree type and checked for overflow. m2expr_interpret_integer and append_m2_digit are removed. gcc/m2/ChangeLog: PR modula2/108121 * gm2-compiler/M2ALU.mod (Less): Reformatted. * gm2-compiler/SymbolTable.mod (DetermineSizeOfConstant): Remove from import. (ConstantStringExceedsZType): Import. (GetConstLitType): Re-implement using ConstantStringExceedsZType. * gm2-gcc/m2decl.cc (m2decl_DetermineSizeOfConstant): Remove. (m2decl_ConstantStringExceedsZType): New function. (m2decl_BuildConstLiteralNumber): Re-implement. * gm2-gcc/m2decl.def (DetermineSizeOfConstant): Remove. (ConstantStringExceedsZType): New function. * gm2-gcc/m2decl.h (m2decl_DetermineSizeOfConstant): Remove. (m2decl_ConstantStringExceedsZType): New function. * gm2-gcc/m2expr.cc (append_digit): Remove. (m2expr_interpret_integer): Remove. (append_m2_digit): Remove. (m2expr_StrToWideInt): New function. (m2expr_interpret_m2_integer): Remove. * gm2-gcc/m2expr.def (CheckConstStrZtypeRange): New function. * gm2-gcc/m2expr.h (m2expr_StrToWideInt): New function. * gm2-gcc/m2type.cc (build_m2_word64_type_node): New function. (build_m2_ztype_node): New function. (m2type_InitBaseTypes): Call build_m2_ztype_node. * gm2-lang.cc (gm2_type_for_size): Re-write using early returns. gcc/testsuite/ChangeLog: PR modula2/108121 * gm2/pim/fail/largeconst.mod: Increased constant value test to fail now that cc1gm2 uses widest_int to represent a ZTYPE. * gm2/pim/fail/largeconst2.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
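As a rough analogy for the approach in the Modula-2 commit above (accumulate the literal, then check that it fits the target type), the C sketch below parses a decimal literal and reports overflow before it can occur. It is not the gm2 implementation: the patch uses widest_int and GCC tree types, whereas this sketch assumes a fixed 64-bit target and a hypothetical function name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: parse a decimal literal into *out.  Returns
   false if the string is empty, contains a non-digit, or the value
   does not fit in int64_t.  */
bool parse_decimal_checked(const char *s, int64_t *out)
{
    int64_t value = 0;

    if (*s == '\0')
        return false;
    for (; *s != '\0'; s++) {
        if (*s < '0' || *s > '9')
            return false;
        int digit = *s - '0';
        /* Test before multiplying and adding so that no signed
           overflow ever happens.  */
        if (value > (INT64_MAX - digit) / 10)
            return false;
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}
```

The largest accepted literal is "9223372036854775807"; adding one to it makes the function return false instead of wrapping.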
2023-04-26Daily bump.GCC Administrator10-1/+204
2023-04-26recog.cc: Correct comments referring to parameter match_lenHans-Peter Nilsson1-2/+2
* recog.cc (peep2_attempt, peep2_update_life): Correct head-comment description of parameter match_len.
2023-04-25Regenerate gcc.potJoseph Myers1-4089/+4190
* gcc.pot: Regenerate.
2023-04-25c++: value dependence of by-ref lambda capture [PR108975]Patrick Palka2-3/+25
We are still ICEing on the generic lambda version of the testcase from this PR, even after r13-6743-g6f90de97634d6f, due to the by-ref capture of the constant local variable 'dim' being considered value-dependent when regenerating the lambda (at which point processing_template_decl is set since the lambda is generic), which prevents us from constant folding its uses. Later during prune_lambda_captures we end up not thoroughly walking the body of the lambda and overlook the (non-folded) uses of 'dim' within the array bound and using-decls. We could fix this by making prune_lambda_captures walk the body of the lambda more thoroughly so that it finds these uses of 'dim', but ideally we should be able to constant fold all uses of 'dim' ahead of time and prune the implicit capture after all. To that end this patch makes value_dependent_expression_p return false for such by-ref captures of constant local variables, allowing their uses to get constant folded ahead of time. It seems we just need to disable the predicate's conservative early exit for reference variables (added by r5-5022-g51d72abe5ea04e) when DECL_HAS_VALUE_EXPR_P. This effectively makes us treat by-value and by-ref captures more consistently when it comes to value dependence. PR c++/108975 gcc/cp/ChangeLog: * pt.cc (value_dependent_expression_p) <case VAR_DECL>: Suppress conservative early exit for reference variables when DECL_HAS_VALUE_EXPR_P. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-const11a.C: New test.
2023-04-25riscv: relax splitter restrictions for creating pseudosVineet Gupta3-34/+24
[partial addressing of PR/109279]

RISCV splitters have restrictions to not create pseudos due to a combine limitation. And despite this being a split-during-combine limitation, all split passes take the hit due to the way define*_split are used in gcc.

With the original combine issue fixed by 61bee6aed2 ("combine: Don't record for UNDO_MODE pointers into regno_reg_rtx array [PR104985]"), the RV splitters can now be relaxed. This improves the codegen in general, e.g.:

    long long f(void) { return 0x0101010101010101ull; }

Before:

    li a0,0x01010000
    addi a0,a0,0x0101
    slli a0,a0,16
    addi a0,a0,0x0101
    slli a0,a0,16
    addi a0,a0,0x0101
    ret

With patch:

    li a5,0x01010000
    addi a5,a5,0x0101
    mv a0,a5
    slli a5,a5,32
    add a0,a5,a0
    ret

This reduces the qemu icounts, even if slightly, across SPEC2017.

    benchmark        workload         before          after  reduction
    500.perlbench_r  0         1235310737733  1231742384460  0.29%
                     1          744489708820   743515759958
                     2          714072106766   712875768625  0.17%
    502.gcc_r        0          197365353269   197178223030
                     1          235614445254   235465240341
                     2          226769189971   226604663947
                     3          188315686133   188123584015
                     4          289372107644   289187945424
    503.bwaves_r     0          326291538768   326291539697
                     1          515809487294   515809488863
                     2          401647004144   401647005463
                     3          488750661035   488750662484
    505.mcf_r        0          681926695281   681925418147
    507.cactuBSSN_r  0         3832240965352  3832226068734
    508.namd_r       0         1919838790866  1919832527292
    510.parest_r     0         3515999635520  3515878553435
    511.povray_r     0         3073889223775  3074758622749
    519.lbm_r        0         1194077464296  1194077464041
    520.omnetpp_r    0         1014144252460  1011530791131  0.26%
    521.wrf_r        0         3966715533120  3966265425092
    523.xalancbmk_r  0         1064914296949  1064506711802
    525.x264_r       0          509290028335   509258131632
                     1         2001424246635  2001677767181
                     2         1914660798226  1914869407575
    526.blender_r    0         1726083839515  1725974286174
    527.cam4_r       0         2336526136415  2333656336419
    531.deepsjeng_r  0         1689007489539  1686541299243  0.15%
    538.imagick_r    0         3247960667520  3247942048723
    541.leela_r      0         2072315300365  2070248271250
    544.nab_r        0         1527909091282  1527906483039
    548.exchange2_r  0         2086120304280  2086314757502
    549.fotonik3d_r  0         2261694058444  2261670330720
    554.roms_r       0         2640547903140  2640512733483
    557.xz_r         0          388736881767   386880875636  0.48%
                     1          959356981818   959993132842
                     2          547643353034   546374038310  0.23%
    997.specrand_fr  0             512881578      512599641
    999.specrand_ir  0             512881578      512599641

This is testsuite clean, no regression w/ patch.

    ========= Summary of gcc testsuite =========
                               | # of unexpected case / # of unique unexpected case
                               |     gcc |     g++ | gfortran |
    rv64imafdc/ lp64d/ medlow  |   2 / 2 |   1 / 1 |    6 / 1 |
    rv64imac/ lp64/ medlow     |   3 / 3 |   1 / 1 |   43 / 8 |
    rv32imafdc/ ilp32d/ medlow |   1 / 1 |   3 / 2 |    6 / 1 |
    rv32imac/ ilp32/ medlow    |   1 / 1 |   3 / 2 |   43 / 8 |

This came up as part of IRC chat on PR/109279 and was suggested by Andrew Pinski.

gcc/ChangeLog:
* config/riscv/riscv.md: riscv_move_integer() drop in_splitter arg. riscv_split_symbol() drop in_splitter arg.
* config/riscv/riscv.cc: riscv_move_integer() drop in_splitter arg. riscv_split_symbol() drop in_splitter arg. riscv_force_temporary() drop in_splitter arg.
* config/riscv/riscv-protos.h: riscv_move_integer() drop in_splitter arg. riscv_split_symbol() drop in_splitter arg.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
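The two ways of materialising 0x0101010101010101 discussed in the splitter commit above can be replayed in plain C to confirm that they produce the same constant. The function names are hypothetical and the variables simply mirror the registers in the listings; the point is that once a temporary is available, the 32-bit pattern is built once and combined with a shifted copy of itself.

```c
#include <stdint.h>

/* Old sequence: li, then addi and slli by 16 repeated in one register.  */
uint64_t build_by_shifts(void)
{
    uint64_t a0 = 0x01010000u;  /* li   a0,0x01010000 */
    a0 += 0x0101;               /* addi */
    a0 <<= 16;                  /* slli a0,a0,16 */
    a0 += 0x0101;               /* addi */
    a0 <<= 16;                  /* slli a0,a0,16 */
    a0 += 0x0101;               /* addi */
    return a0;
}

/* New sequence: build the 32-bit half in a temporary, then combine.  */
uint64_t build_with_temporary(void)
{
    uint64_t a5 = 0x01010000u;  /* li   a5,0x01010000 */
    a5 += 0x0101;               /* addi a5,a5,0x0101 */
    uint64_t a0 = a5;           /* mv   a0,a5 */
    a5 <<= 32;                  /* slli a5,a5,32 */
    a0 = a5 + a0;               /* add  a0,a5,a0 */
    return a0;
}
```

Both functions return 0x0101010101010101; the second needs five arithmetic steps instead of six.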
2023-04-25Avoid creating useless debug temporariesEric Botcazou1-11/+3
insert_debug_temp_for_var_def has some strange code whereby it creates debug temporaries for SINGLE_RHS (RHS for gimple_assign_single_p) but not for other RHS in the same situation. gcc/ * tree-ssa.cc (insert_debug_temp_for_var_def): Do not create superfluous debug temporaries for single GIMPLE assignments.
2023-04-25tree-optimization/109609 - correctly interpret arg size in fnspecRichard Biener3-5/+43
By majority vote, and following a hint from the API name, arg_max_access_size_given_by_arg_p, this interprets a memory access size specified as given by another argument (as for strncpy in the testcase, which has "1cO313") as specifying the _maximum_ size read/written rather than the exact size. There are two uses interpreting it that way already and one differing. The following adjusts the differing one and clarifies the documentation.

PR tree-optimization/109609
* attr-fnspec.h (arg_max_access_size_given_by_arg_p): Clarify semantics.
* tree-ssa-alias.cc (check_fnspec): Correctly interpret the size given by arg_max_access_size_given_by_arg_p as maximum, not exact, size.

* gcc.dg/torture/pr109609.c: New testcase.
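A short C example may help show why the size argument of strncpy is best read as an upper bound on the source access: the function stops reading at the terminating NUL, so the source object may legitimately be smaller than the bound. The helper names below are hypothetical, and the sketch says nothing about how the fnspec string itself is decoded.

```c
#include <string.h>

/* Hypothetical helper: number of source bytes strncpy has to read for
   bound n, assuming src is NUL-terminated: up to and including the
   NUL, but never more than n.  */
size_t strncpy_bytes_read(const char *src, size_t n)
{
    size_t len = strlen(src);
    return len + 1 < n ? len + 1 : n;
}

/* Valid call even though the source object is only 3 bytes long and
   the bound is 8: reading stops at the NUL, and the rest of dest is
   padded with zeros.  */
void copy_short_source(char dest[8])
{
    const char src[3] = "ab";
    strncpy(dest, src, 8);
}
```

An analysis that assumed exactly 8 bytes were read from the source would be describing an access that never takes place here.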
2023-04-25'omp scan' struct block seq update for OpenMP 5.xTobias Burnus15-51/+545
While OpenMP 5.0 required a single structured block before and after the 'omp scan' directive, OpenMP 5.1 changed this to a 'structured block sequence', denoting 2 or more executable statements in OpenMP 5.1 (whoops!) and zero or more in OpenMP 5.2. This commit updates C/C++ to accept zero statements (but still requires the '{' ... '}' for the final-loop-body) and updates Fortran to accept zero or more than one statement. If there is no preceding or succeeding executable statement, a warning is shown. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_scan_loop_body): Handle zero exec statements before/after 'omp scan'. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_scan_loop_body): Handle zero exec statements before/after 'omp scan'. gcc/fortran/ChangeLog: * openmp.cc (gfc_resolve_omp_do_blocks): Handle zero or more than one exec statements before/after 'omp scan'. * trans-openmp.cc (gfc_trans_omp_do): Likewise. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/scan-1.c: New test. * testsuite/libgomp.c/scan-23.c: New test. * testsuite/libgomp.fortran/scan-2.f90: New test. gcc/testsuite/ChangeLog: * g++.dg/gomp/attrs-7.C: Update dg-error/dg-warning. * gfortran.dg/gomp/loop-2.f90: Likewise. * gfortran.dg/gomp/reduction5.f90: Likewise. * gfortran.dg/gomp/reduction6.f90: Likewise. * gfortran.dg/gomp/scan-1.f90: Likewise. * gfortran.dg/gomp/taskloop-2.f90: Likewise. * c-c++-common/gomp/scan-6.c: New test. * gfortran.dg/gomp/scan-8.f90: New test.
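For readers unfamiliar with the directive, here is a minimal sketch of a loop body split by 'omp scan'. It is not taken from the new tests; the function name prefix_sum is an assumption. It uses exactly one statement on each side of the directive, the OpenMP 5.0 form that should be accepted before and after this change.

```c
#include <assert.h>

/* Inclusive prefix sum: out[i] = in[0] + ... + in[i].
   Without OpenMP enabled the pragmas are ignored and the loop simply
   runs serially, which gives the same result.  */
void
prefix_sum (const int *in, int *out, int n)
{
  int sum = 0;
  assert (n >= 0);
#pragma omp simd reduction (inscan, +: sum)
  for (int i = 0; i < n; i++)
    {
      sum += in[i];   /* statement before the directive */
#pragma omp scan inclusive (sum)
      out[i] = sum;   /* statement after the directive */
    }
}
```

With the relaxed rules described above, either side of the directive could instead hold several statements or, per OpenMP 5.2, none at all, in which case the commit message says a warning is shown.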
2023-04-25testsuite: Fix up ext-floating2.C on powerpc64-linuxJakub Jelinek1-0/+4
Another testcase that is failing on powerpc64-linux. The test expects a diagnostic when float64 && float128 or in another spot when float32 && float128. Now, float128 effective target is satisfied on powerpc64-linux, despite __STDCPP_FLOAT128_T__ not being defined, because one needs to add some extra options for it. I think 32-bit arm has a similar case for float16. 2023-04-25 Jakub Jelinek <jakub@redhat.com> * g++.dg/cpp23/ext-floating2.C: Add dg-add-options for float16, float32, float64 and float128.
2023-04-25aarch64: PR target/PR99195 Annotate more simple integer binary patterns with vcz subst rulesKyrylo Tkachov2-13/+12
This patch adds more straightforward annotations to some more integer binary ops to eliminate redundant fmovs around 64-bit SIMD results. Bootstrapped and tested on aarch64-none-linux. gcc/ChangeLog: PR target/99195 * config/aarch64/aarch64-simd.md (orn<mode>3): Rename to... (orn<mode>3<vczle><vczbe>): ... This. (bic<mode>3): Rename to... (bic<mode>3<vczle><vczbe>): ... This. (<su><maxmin><mode>3): Rename to... (<su><maxmin><mode>3<vczle><vczbe>): ... This. gcc/testsuite/ChangeLog: PR target/99195 * gcc.target/aarch64/simd/pr99195_1.c: Add tests for orn, bic, max and min.
2023-04-25aarch64: Implement V2DI,V4SI division optabs for TARGET_SVEKyrylo Tkachov3-0/+87
Similar to the mulv2di case, we can use SVE instructions to implement the V4SI and V2DI optabs for signed and unsigned integer division. This allows us to generate much cleaner code for the testcase than the current: food: fmov x1, d1 fmov x0, d0 umov x2, v0.d[1] sdiv x0, x0, x1 umov x1, v1.d[1] sdiv x1, x2, x1 fmov d0, x0 ins v0.d[1], x1 ret which now becomes: food: ptrue p0.b, all sdiv z0.d, p0/m, z0.d, z1.d ret Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (<su_optab>div<mode>3): New define_expand. * config/aarch64/iterators.md (VQDIV): New mode iterator. (vnx2di): New mode attribute. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve-neon-modes_3.c: New test.
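The kind of source that exercises these optabs can be sketched with GNU C vector extensions. This is an assumption about the shape of the testcase, not a copy of sve-neon-modes_3.c; the names div_v2di and div_v4si are hypothetical, and whether the single predicated sdiv is emitted depends on the target options in use.

```c
#include <assert.h>

/* 128-bit vectors of two 64-bit and four 32-bit signed integers.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

static_assert (sizeof (v2di) == 16, "v2di should be a 128-bit vector");
static_assert (sizeof (v4si) == 16, "v4si should be a 128-bit vector");

/* Element-wise signed division; on AArch64 with SVE enabled this is
   the operation the new V2DI expander is meant to handle.  */
v2di
div_v2di (v2di a, v2di b)
{
  return a / b;
}

/* The same for four 32-bit lanes (V4SI).  */
v4si
div_v4si (v4si a, v4si b)
{
  return a / b;
}
```

The code is portable GNU C and computes the same lane-wise quotients on any target; only the generated instructions differ.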
2023-04-25testsuite: Fix up ext-floating15.C tests on powerpc64-linux [PR109278]Jakub Jelinek1-0/+1
I've noticed this test FAILs on powerpc64-linux, with FAIL: g++.dg/cpp23/ext-floating15.C -std=gnu++98 (test for excess errors) Excess errors: /home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target /home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target /home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:1: error: variable or field 'bar' declared void /home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target /home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:6: error: expected primary-expression before '_Float128' and similarly other std versions. powerpc64-linux is a float128 target, but needs to add some options for it. Fixed by adding them. 2023-04-25 Jakub Jelinek <jakub@redhat.com> PR c++/109278 * g++.dg/cpp23/ext-floating15.C: Add dg-add-options float128.
2023-04-25rtl-optimization/109585 - alias analysis typoRichard Biener2-1/+34
When r10-514-gc6b84edb6110dd2b4fb improved access path analysis it introduced a typo that triggers when there's an access to a trailing array in the first access path leading to false disambiguation. PR rtl-optimization/109585 * tree-ssa-alias.cc (aliasing_component_refs_p): Fix typo. * gcc.dg/torture/pr109585.c: New testcase.
2023-04-25powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]Jakub Jelinek2-1/+28
The following testcase reduced from newlib ICEs on powerpc-linux, with -O2 -m32 -mpowerpc64 since r12-6433 PR102239 optimization was added and on the original testcase since some ranger improvements in GCC 13 made it no longer latent on newlib. The problem is that the *branch_anddi3_dot define_insn_and_split relies on the *rotldi3_mask_dot define_insn_and_split being recognized during splitting. The rs6000_is_valid_rotate_dot_mask function checks whether the mask is a CONST_INT which is a valid mask, but *rotl<mode>3_mask_dot in addition to checking that it is a valid mask also has (<MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff) test in the condition. For TARGET_64BIT that doesn't add any further requirements, but for !TARGET_64BIT && TARGET_POWERPC64 if the AND second operand is larger than INT_MAX it will not be recognized. The rs6000_is_valid_rotate_dot_mask function is used solely in one spot, condition of *branch_anddi3_dot, so the following patch adjusts it to check for that as well. 2023-04-25 Jakub Jelinek <jakub@redhat.com> PR target/109566 * config/rs6000/rs6000.cc (rs6000_is_valid_rotate_dot_mask): For !TARGET_64BIT, don't return true if UINTVAL (mask) << (63 - nb) is larger than signed int maximum. * gcc.target/powerpc/pr109566.c: New test.
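The extra condition quoted above from *rotl<mode>3_mask_dot can be modelled in isolation. This is a simplified sketch, not the rs6000 code: the function name rotl_mask_dot_extra_cond and the boolean parameter standing in for <MODE>mode == Pmode are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the additional test in the insn condition:
     <MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff
   MODE_IS_PMODE stands in for the mode comparison; MASK is the AND
   mask operand.  Hypothetical helper, for illustration only.  */
bool
rotl_mask_dot_extra_cond (bool mode_is_pmode, uint64_t mask)
{
  return mode_is_pmode || mask <= UINT64_C (0x7fffffff);
}
```

On a 64-bit target the first operand of the || is true for DImode, so the mask is unrestricted. With -m32 -mpowerpc64 it is false for DImode, so a mask above INT_MAX such as 0x80000000 fails the test, which is the case the patch teaches rs6000_is_valid_rotate_dot_mask to reject as well.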
2023-04-25gcov: add info about "calls" to JSON output formatMartin Liska5-12/+95
gcc/ChangeLog: * doc/gcov.texi: Document the new "calls" field and document the API bump. Mention also "block_ids" for lines. * gcov.cc (output_intermediate_json_line): Output info about calls and extend branches as well. (generate_results): Bump version to 2. (output_line_details): Use block ID instead of a nonsensical index. gcc/testsuite/ChangeLog: * g++.dg/gcov/gcov-17.C: Add call to a noreturn function. * g++.dg/gcov/test-gcov-17.py: Cover new format. * lib/gcov.exp: Add options for gcov that emit the extra info.
2023-04-25[Committed] Correct zeroextendqihi2 insn length regression on xstormy16.Roger Sayle1-1/+1
My recent tweak to the zeroextendqihi2 pattern on xstormy16 incorrectly handled the case where the operand was a MEM. MEM operands use a longer encoding than REG operands, and the incorrect instruction length resulted in assembler errors (as reported by Jeff Law). This patch restores the original length, resolving this regression. Sorry for the inconvenience. Committed as obvious, after testing that a cross-compiler to xstormy16-elf builds from x86_64-pc-linux-gnu, and that gcc.c-torture/execute/memset-2.c no longer causes "operand out of range" issues in gas. 2023-04-25 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/stormy16/stormy16.md (zero_extendqihi2): Restore/fix length attribute for the first (memory operand) alternative.
2023-04-25aarch64: Leveraging the use of STP instruction for vec_duplicateVictor Do Nascimento4-1/+71
The backend pattern for storing a pair of identical values in 32 and 64-bit modes with the machine instruction STP was missing, and multiple instructions were needed to reproduce this behavior as a result of a failed RTL pattern match in the combine pass. With this patch, the test case: typedef long long v2di __attribute__((vector_size (16))); typedef int v2si __attribute__((vector_size (8))); void foo (v2di *x, long long a) { v2di tmp = {a, a}; *x = tmp; } void foo2 (v2si *x, int a) { v2si tmp = {a, a}; *x = tmp; } at -O2 on aarch64 gives: foo: stp x1, x1, [x0] ret foo2: stp w1, w1, [x0] ret instead of: foo: dup v0.2d, x1 str q0, [x0] ret foo2: dup v0.2s, w1 str d0, [x0] ret Bootstrapped and regtested on aarch64-none-linux-gnu. gcc/ * config/aarch64/aarch64-simd.md (aarch64_simd_stp<mode>): New. * config/aarch64/constraints.md: Make "Umn" relaxed memory constraint. * config/aarch64/iterators.md (ldpstp_vel_sz): New. gcc/testsuite/ * gcc.target/aarch64/stp_vec_dup_32_64-1.c: New.
2023-04-25Remove default constructor to nan_state.Aldy Hernandez2-8/+8
I think it's best to specify the default behavior of nan_state, since it's not obvious that nan_state() defaults to TRUE. Also, this avoids the ugly nan_state(false, false) idiom. gcc/ChangeLog: * value-range.cc (frange::set): Adjust constructor. * value-range.h (nan_state::nan_state): Replace default constructor with one taking an argument.
2023-04-25MAINTAINERS: add myself to write after approvalVictor Do Nascimento1-0/+1
ChangeLog: * MAINTAINERS (Write After Approval): Add myself.
2023-04-25Remove obsolete configure code in gnattoolsEric Botcazou2-92/+20
It was recently pointed out that we generate symbolic links to ghost files when building the GNAT tools, as the mlib-tgt-specific-*.adb files are gone. gnattools/ * configure.ac (TOOLS_TARGET_PAIRS): Remove obsolete settings. (EXTRA_GNATTOOLS): Likewise. * configure: Regenerate.
2023-04-25Pass correct type to irange::contains_p() in ipa-cp.cc.Aldy Hernandez1-1/+19
There is a call to contains_p() in ipa-cp.cc which passes incompatible types. This currently works because deep in the call chain, the legacy code uses tree_int_cst_lt which performs the operation with widest_int. With the upcoming removal of legacy, contains_p() will be stricter. gcc/ChangeLog: * ipa-cp.cc (ipa_range_contains_p): New. (decide_whether_version_node): Use it.
2023-04-25[PATCH v2] testsuite: Add testcase for sparc ICE [PR105573]Sam James1-0/+15
r11-10018-g33914983cf3734c2f8079963ba49fcc117499ef3 fixed PR105312 and added a test case for target/arm but the duplicate PR105573 has a test case for target/sparc that was uncommitted until now. 2023-04-21 Sam James <sam@gentoo.org> PR tree-optimization/105312 PR target/105573 gcc/testsuite/ * gcc.target/sparc/pr105573.c: New test.
2023-04-24Add alternative testcase of phi-opt-25.c that tests phioptAndrew Pinski1-0/+89
Right now phi-opt-25.c has tests like `a ? func(a) : CST` but if we add the simplifications to match.pd, then phi-opt-25.c will no longer be testing phiopt to make sure these get optimized. So this adds an alternative version which is designed to test phiopt. Committed as obvious after testing the testcase to make sure it does not fail on x86_64-linux-gnu. Thanks, Andrew Pinski gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/phi-opt-25a.c: New test.
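The `a ? func(a) : CST` shape can be illustrated with a sketch. It assumes __builtin_popcount as the function and 0 as the constant, which may or may not match what phi-opt-25.c itself uses; the function names are hypothetical.

```c
/* Guarded form: the zero check is redundant because popcount (0)
   is already 0, so the conditional can be folded away.  */
int
popcount_guarded (unsigned a)
{
  return a ? __builtin_popcount (a) : 0;
}

/* What the guarded form is equivalent to once the conditional has
   been removed, whether by phiopt or by a match.pd simplification.  */
int
popcount_plain (unsigned a)
{
  return __builtin_popcount (a);
}
```

Both functions return the same value for every input, so which pass removes the branch only matters for tests that inspect a specific pass's dump, and that is the reason a separate phiopt-focused variant of the test is useful.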