path: root/gcc
Age | Commit message | Author | Files | Lines
2023-04-27 | Normalize addresses in IPA before calling range_op_handler [PR109639] | Aldy Hernandez | 6 | -7/+66
The old legacy code would allow building ranges containing symbolics, even though the entire ranger ecosystem does not handle them. These were normalized into non-zero ranges by helper functions in VRP (range_fold_*_expr) before calling the ranger. The only users of these functions should have been legacy VRP, which is no more. However, a handful of users crept into IPA, even though these functions should never have been called outside of VRP or vr-values. The issue here is that IPA is building a range of [&foo, &foo] and expecting range_fold_binary to normalize it to non-zero. Fixed by adding a helper function that normalizes such ranges before calling the range_op handler. I think this covers the problematic ranges. If not, I'll come up with something more generalized that does not involve polluting irange::set with the normalization code. After all, this only involves a handful of IPA places. I've also added an assert in irange::set() making it easier to detect any possible fallout without having to drill deep into the setter. gcc/ChangeLog: PR tree-optimization/109639 * ipa-cp.cc (ipa_value_range_from_jfunc): Normalize range. (propagate_vr_across_jump_function): Same. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same. * ipa-prop.h (ipa_range_set_and_normalize): New. * value-range.cc (irange::set): Assert min and max are INTEGER_CST.
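The normalization step described above can be pictured with a toy model. This is only a sketch under stated assumptions: `Bound`, `ToyRange`, `set_range` and `set_and_normalize` are hypothetical names invented for illustration and are not GCC's `tree`, `irange` or `ipa_range_set_and_normalize`. It shows a setter that insists on integer constants, fronted by a helper that turns any address bound into a "non-zero" range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a range bound: either an integer constant
// or the address of a symbol (something like &foo).
struct Bound {
  bool is_address;     // true when the bound is an address
  std::int64_t value;  // meaningful only when !is_address
};

enum class Kind { Interval, NonZero };

// Hypothetical toy range: a closed interval or "anything but zero".
struct ToyRange {
  Kind kind;
  std::int64_t lo;
  std::int64_t hi;
};

// Strict setter: it only ever sees integer constants, in order.
ToyRange set_range(std::int64_t lo, std::int64_t hi) {
  assert(lo <= hi);
  return ToyRange{Kind::Interval, lo, hi};
}

// Helper called instead of the strict setter: an address bound such as
// [&foo, &foo] is normalized to "non-zero" before the setter is reached.
ToyRange set_and_normalize(const Bound& min, const Bound& max) {
  if (min.is_address || max.is_address)
    return ToyRange{Kind::NonZero, 0, 0};
  return set_range(min.value, max.value);
}
```

Keeping the normalization in a separate helper means the strict setter can assert on its inputs, which is the division of labour the message argues for.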
2023-04-27 | wrong GIMPLE from (bit_field_ref CTOR ..) simplification | Richard Biener | 1 | -2/+7
When we simplify a BIT_FIELD_REF of a CTOR like { _1, _2, _3, _4 } and attempt to produce (view converted) { _1, _2 } for a selected subset we fail to realize this cannot be done from match.pd since we have no way to write the resulting CTOR "operation" and the built CTOR { _1, _2 } isn't a GIMPLE value. This kind of simplification has to be done in forwprop (or would need a match.pd syntax extension) where we can split out the CTOR to a separate stmt. The following disables this particular simplification when we are simplifying GIMPLE. With enhanced IL checking this otherwise causes ICEs in the testsuite from vectorized code. * match.pd (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2): Do not create a CTOR operand in the result when simplifying GIMPLE.
2023-04-27 | Properly gimplify handled component chains on registers | Richard Biener | 1 | -1/+15
When, for example, complex lowering wants to extract the imaginary part of a complex variable for lowering a complex move, we can end up with it generating __imag <VIEW_CONVERT_EXPR <_22> >, which is valid GENERIC. It then feeds that to the gimplifier via force_gimple_operand, but that fails to split up this chain of handled components, generating invalid GIMPLE that is caught by verification once PR109644 is fixed. The following rectifies this by noting in gimplify_compound_lval when the base object, which we gimplify first, ends up being a register. * gimplify.cc (gimplify_compound_lval): When the base gimplified to a register make sure to split up chains of operations.
2023-04-27 | ipa/109607 - properly gimplify conversions introduced by IPA param manipulation | Richard Biener | 3 | -3/+22
The following addresses IPA param manipulation (through IPA SRA) replacing BIT_FIELD_REF <*this_8(D), 8, 56> with BIT_FIELD_REF <VIEW_CONVERT_EXPR<const struct profile_count>(ISRA.814), 8, 56> which is supposed to be invalid GIMPLE (ISRA.814 is a register). There's currently insufficient checking in place to catch this in the IL verifier but I am working on that as part of fixing PR109594. The solution for the particular testcase where I am running into this is to split the conversion to a separate stmt. Generally the modification phase is set up for this but the extra_stmts sequence isn't passed around everywhere. The following passes it to modify_expression from modify_assignment when rewriting the RHS. PR ipa/109607 * ipa-param-manipulation.h (ipa_param_body_adjustments::modify_expression): Add extra_stmts argument. * ipa-param-manipulation.cc (ipa_param_body_adjustments::modify_expression): Likewise. When we need a conversion and the replacement is a register split the conversion out. (ipa_param_body_adjustments::modify_assignment): Pass extra_stmts to RHS modify_expression. * g++.dg/torture/pr109607.C: New testcase.
2023-04-27 | c: Fix up error-recovery on non-empty VLA initializers [PR109409] | Jakub Jelinek | 2 | -4/+17
On the following testcase we ICE, because after we emit the "variable-sized object may not be initialized except with an empty initializer" error we don't really reset the initializer to error_mark_node, and then at -Wformat checking time we ICE on seeing a STRING_CST initializer for a VLA. The following patch just arranges for error_mark_node to be returned after the error diagnostics. 2023-04-27 Jakub Jelinek <jakub@redhat.com> PR c/109409 * c-parser.cc (c_parser_initializer): Move diagnostics about initialization of variable sized object with non-empty initializer after c_parser_expr_no_commas call and ret.set_error (); after it. * gcc.dg/pr109409.c: New test.
2023-04-27 | c: Fix up error-recovery on functions initialized as variables [PR109412] | Jakub Jelinek | 2 | -0/+25
The change to allow empty initializers in C broke error-recovery on the following testcase. We are emitting the "function %qD is initialized like a variable" error early; if the initializer is non-empty, we just emit another error that the initializer is invalid. Previously if it was empty, we'd emit another error that a scalar is being initialized by an empty initializer (not really correct), but now we instead just try to build_zero_cst for the FUNCTION_TYPE and ICE on it. The following patch just emits the same diagnostics for the empty initializers as we emit for the non-empty ones. 2023-04-27 Jakub Jelinek <jakub@redhat.com> PR c/107682 PR c/109412 * c-typeck.cc (pop_init_level): If constructor_type is FUNCTION_TYPE, reject empty initializer as invalid. * gcc.dg/pr109412.c: New test.
2023-04-27 | doc: Add explanation of zero-length array example | Jonathan Wakely | 1 | -0/+3
gcc/ChangeLog: * doc/extend.texi (Zero Length): Describe example.
2023-04-27 | tree-optimization/109594 - wrong register promotion | Richard Biener | 1 | -7/+28
We fail to verify the constraints under which we allow handled components to wrap registers. The gcc.dg/pr70022.c testcase shows that we happily end up with _2 = VIEW_CONVERT_EXPR<int[4]>(v_1(D)) as produced by SSA rewrite and update_address_taken. But the intent was that we wrap registers with at most a single level of handled components and specifically only allow __real, __imag, BIT_FIELD_REF and VIEW_CONVERT_EXPR on them, but not ARRAY_REF or COMPONENT_REF. Together with the improved gimple_load predicate taking advantage of the above and ASAN this eventually ICEd. The following fixes update_address_taken as to this constraint. PR tree-optimization/109594 * tree-ssa.cc (non_rewritable_mem_ref_base): Constrain what we rewrite to a register based on the above.
2023-04-27 | testsuite: adjust NOP expectations for RISC-V | Jan Beulich | 3 | -3/+6
RISC-V will emit ".option nopic" when -fno-pie is in effect, which matches the generic pattern. Just like done for Alpha, special-case RISC-V. gcc/testsuite/ * c-c++-common/patchable_function_entry-decl.c: Special-case RISC-V. * c-c++-common/patchable_function_entry-default.c: Likewise. * c-c++-common/patchable_function_entry-definition.c: Likewise.
2023-04-26 | c++: restore instantiate_decl assert | Jason Merrill | 1 | -0/+6
For PR61445 I removed this assert, but PR108242 demonstrated why it's still useful; to avoid regressing the former testcase I check pattern_defined in the assert. This reverts r212524. PR c++/61445 gcc/cp/ChangeLog: * pt.cc (instantiate_decl): Assert !defer_ok for local class members.
2023-04-27 | Daily bump. | GCC Administrator | 6 | -1/+657
2023-04-26 | RISC-V: Fix sync.md and riscv.cc whitespace errors | Patrick O'Neill | 2 | -11/+11
This patch fixes whitespace errors introduced with https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616807.html 2023-04-26 Patrick O'Neill <patrick@rivosinc.com> gcc/ChangeLog: * config/riscv/riscv.cc: Fix whitespace. * config/riscv/sync.md: Fix whitespace. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-04-26 | c++: remove nsdmi_inst hashtable | Jason Merrill | 1 | -6/+4
It occurred to me that we have a perfectly good DECL_INITIAL field to put the instantiated DMI into, we don't need a separate hash table. gcc/cp/ChangeLog: * init.cc (nsdmi_inst): Remove. (maybe_instantiate_nsdmi_init): Use DECL_INITIAL instead.
2023-04-26 | c++: local class in nested generic lambda [PR109241] | Jason Merrill | 1 | -5/+9
The earlier fix for PR109241 avoided the crash by handling a type with no TREE_BINFO. But we want to move toward doing the partial substitution of classes in generic lambdas, so let's take a step in that direction. PR c++/109241 gcc/cp/ChangeLog: * pt.cc (instantiate_class_template): Do partially instantiate. (tsubst_expr): Do call complete_type for partial instantiations.
2023-04-26 | c++: unique friend shenanigans [PR69836] | Jason Merrill | 2 | -0/+42
Normally we re-instantiate a function declaration when we start to instantiate the body in case of multiple declarations. In this wacky testcase, this causes a problem because the type of the w_counter parameter depends on its declaration not being in scope yet, so the name lookup only finds the previous declaration. This isn't a problem for member functions, since they aren't subject to argument-dependent lookup. So let's just skip the regeneration for hidden friends. PR c++/69836 gcc/cp/ChangeLog: * pt.cc (regenerate_decl_from_template): Skip unique friends. gcc/testsuite/ChangeLog: * g++.dg/template/friend76.C: New test.
2023-04-26 | c++: micro-optimize most_specialized_partial_spec | Patrick Palka | 1 | -26/+19
This introduces an early exit test to most_specialized_partial_spec for templates which have no partial specializations, saving some unnecessary work during class template instantiation in the common case. In passing, modernize the code a bit. gcc/cp/ChangeLog: * pt.cc (most_specialized_partial_spec): Exit early when DECL_TEMPLATE_SPECIALIZATIONS is empty. Move local variable declarations closer to their first use. Remove redundant flag_concepts test. Remove redundant forward declaration.
2023-04-26 | Create a lazy ssa_cache. | Andrew MacLeod | 5 | -62/+92
Sparsely used ssa caches can benefit from using a bitmap to determine if a name already has an entry. Utilize it in the path query and remove its private bitmap for tracking the same info. Also use it in the "assume" query class. PR tree-optimization/108697 * gimple-range-cache.cc (ssa_global_cache::clear_range): Do not clear the vector on an out of range query. (ssa_cache::dump): Use dump_range_query instead of get_range. (ssa_cache::dump_range_query): New. (ssa_lazy_cache::dump_range_query): New. (ssa_lazy_cache::set_range): New. * gimple-range-cache.h (ssa_cache::dump_range_query): New. (class ssa_lazy_cache): New. (ssa_lazy_cache::ssa_lazy_cache): New. (ssa_lazy_cache::~ssa_lazy_cache): New. (ssa_lazy_cache::get_range): New. (ssa_lazy_cache::clear_range): New. (ssa_lazy_cache::clear): New. (ssa_lazy_cache::dump): New. * gimple-range-path.cc (path_range_query::path_range_query): Do not allocate a ssa_cache object nor has_cache bitmap. (path_range_query::~path_range_query): Do not free objects. (path_range_query::clear_cache): Remove. (path_range_query::get_cache): Adjust. (path_range_query::set_cache): Remove. (path_range_query::dump): Don't call through a pointer. (path_range_query::internal_range_of_expr): Set cache directly. (path_range_query::reset_path): Clear cache directly. (path_range_query::ssa_range_in_phi): Fold with globals only. (path_range_query::compute_ranges_in_phis): Simply set range. (path_range_query::compute_ranges_in_block): Call cache directly. * gimple-range-path.h (class path_range_query): Replace bitmap and cache pointer with lazy cache object. * gimple-range.h (class assume_query): Use ssa_lazy_cache.
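The bitmap idea above can be sketched as follows. This is a minimal illustration, not GCC's `ssa_lazy_cache`: the `Range` and `LazyCache` types are hypothetical, and `std::vector<bool>` stands in for GCC's bitmap. The point it shows is that the "does this name have an entry" question is answered from the bitmap alone, so clearing the cache never has to touch the stored ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value type standing in for a range.
struct Range {
  long lo;
  long hi;
};

// Sketch of a cache indexed by SSA version number, with a bitmap
// recording which slots currently hold a valid entry.
class LazyCache {
 public:
  bool has_range(std::size_t version) const {
    return version < active_.size() && active_[version];
  }

  // Stores R for VERSION; returns true if an entry already existed.
  bool set_range(std::size_t version, Range r) {
    if (version >= active_.size()) {
      active_.resize(version + 1, false);
      ranges_.resize(version + 1);
    }
    const bool existed = active_[version];
    active_[version] = true;
    ranges_[version] = r;
    return existed;
  }

  // Copies the entry into OUT and returns true, or returns false.
  bool get_range(std::size_t version, Range& out) const {
    if (!has_range(version))
      return false;
    out = ranges_[version];
    return true;
  }

  void clear_range(std::size_t version) {
    if (version < active_.size())
      active_[version] = false;
  }

  // Clearing only resets the bitmap; stale ranges are simply ignored.
  void clear() { active_.assign(active_.size(), false); }

 private:
  std::vector<bool> active_;
  std::vector<Range> ranges_;
};
```

For a sparsely used cache this keeps `clear()` proportional to the bitmap size rather than to the size of the range storage.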
2023-04-26 | Rename ssa_global_cache to ssa_cache and add has_range | Andrew MacLeod | 6 | -37/+49
This renames the ssa_global_cache to be ssa_cache. The original use was to function as a global cache, but its uses have expanded. Remove all mention of "global" from the class and methods. Also add a has_range method. * gimple-range-cache.cc (ssa_cache::ssa_cache): Rename. (ssa_cache::~ssa_cache): Rename. (ssa_cache::has_range): New. (ssa_cache::get_range): Rename. (ssa_cache::set_range): Rename. (ssa_cache::clear_range): Rename. (ssa_cache::clear): Rename. (ssa_cache::dump): Rename and use get_range. (ranger_cache::get_global_range): Use get_range and set_range. (ranger_cache::range_of_def): Use get_range. * gimple-range-cache.h (class ssa_cache): Rename class and methods. (class ranger_cache): Use ssa_cache. * gimple-range-path.cc (path_range_query::path_range_query): Use ssa_cache. (path_range_query::get_cache): Use get_range. (path_range_query::set_cache): Use set_range. * gimple-range-path.h (class path_range_query): Use ssa_cache. * gimple-range.cc (assume_query::assume_range_p): Use get_range. (assume_query::range_of_expr): Use get_range. (assume_query::assume_query): Use set_range. (assume_query::calculate_op): Use get_range and set_range. * gimple-range.h (class assume_query): Use ssa_cache.
2023-04-26 | Add sbr_lazy_vector and adjust (e)vrp sparse cache | Andrew MacLeod | 3 | -18/+78
Add a sparse vector class for cache and use it by default. Rename the evrp_* params to vrp_*, and add a param for small CFGs which use just the original basic vector. * gimple-range-cache.cc (sbr_vector::sbr_vector): Add parameter and local to optionally zero memory. (sbr_vector::grow): Only zero memory if flag is set. (class sbr_lazy_vector): New. (sbr_lazy_vector::sbr_lazy_vector): New. (sbr_lazy_vector::set_bb_range): New. (sbr_lazy_vector::get_bb_range): New. (sbr_lazy_vector::bb_range_p): New. (block_range_cache::set_bb_range): Check flags and use sbr_lazy_vector. * gimple-range-gori.cc (gori_map::calculate_gori): Use param_vrp_switch_limit. (gori_compute::gori_compute): Use param_vrp_switch_limit. * params.opt (vrp_sparse_threshold): Rename from evrp_sparse_threshold. (vrp_switch_limit): Rename from evrp_switch_limit. (vrp_vector_threshold): New.
2023-04-26 | Quicker relation check. | Andrew MacLeod | 2 | -0/+7
If either of the SSA names in a comparison does not have any equivalences or relations, we can short-circuit the check slightly. * value-relation.cc (dom_oracle::query_relation): Check early for lack of any relation. * value-relation.h (equiv_oracle::has_equiv_p): New.
2023-04-26 | Don't save ssa-name pointer in dependency cache. | Andrew MacLeod | 3 | -8/+15
If the direct dependence fields point directly to an ssa-name, it's possible that an optimization frees an ssa-name, and the value pointed to may now be in the free list. Simply maintain the ssa version number instead. PR tree-optimization/109417 * gimple-range-gori.cc (range_def_chain::register_dependency): Save the ssa version number, not the pointer. (gori_compute::may_recompute_p): No need to check if a dependency is in the free list. * gimple-range-gori.h (class range_def_chain): Change ssa1 and ssa2 fields to be unsigned int instead of trees. (range_def_chain::depend1): Adjust. (range_def_chain::depend2): Adjust. * gimple-range.h: Include "ssa.h" to inline ssa_name().
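The hazard and the fix can be illustrated with a small sketch. The names here (`Name`, `NameTable`, `DepRecord`, `depend1` as a free function) are hypothetical and only loosely modelled on the description above; this is not GCC's `range_def_chain`. The dependency record stores a version number and resolves it through the table at use time, so a released name yields a null result instead of a dangling pointer.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical SSA name; only the version number matters here.
struct Name {
  unsigned version;
};

// Hypothetical table of names. Slot 0 is reserved so that version 0
// can mean "no dependency". A released slot becomes null.
class NameTable {
 public:
  NameTable() { names_.emplace_back(nullptr); }

  unsigned create() {
    const unsigned v = static_cast<unsigned>(names_.size());
    names_.push_back(std::make_unique<Name>(Name{v}));
    return v;
  }

  void release(unsigned v) { names_[v].reset(); }

  // Returns the live name for V, or null if V is unknown or released.
  const Name* lookup(unsigned v) const {
    return v < names_.size() ? names_[v].get() : nullptr;
  }

 private:
  std::vector<std::unique_ptr<Name>> names_;
};

// Dependency record holding a version number rather than a pointer.
struct DepRecord {
  unsigned ssa1 = 0;  // 0 means "none"
};

// Resolves the dependency at the point of use.
const Name* depend1(const NameTable& table, const DepRecord& rec) {
  return rec.ssa1 != 0 ? table.lookup(rec.ssa1) : nullptr;
}
```

Had `DepRecord` stored a `const Name*` directly, the pointer would still compare non-null after `release`, which is the kind of stale reference the message describes.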
2023-04-26 | aix: Default AIX 7.2 to POWER7 server and AIX 7.3 to POWER8 server. | David Edelsohn | 2 | -6/+6
AIX 7.2 minimum ISA is POWER7 and AIX 7.3 minimum ISA is POWER8. This patch changes the aix72.h configuration to POWER7 with VSX enabled by default (with the AIX VSX ABI limitations), matching LLVM on AIX, and changes the aix73.h configuration to POWER8. gcc/ChangeLog: * config/rs6000/aix72.h (TARGET_DEFAULT): Use ISA_2_6_MASKS_SERVER. * config/rs6000/aix73.h (TARGET_DEFAULT): Use ISA_2_7_MASKS_SERVER. (PROCESSOR_DEFAULT): Use PROCESSOR_POWER8. Signed-off-by: David Edelsohn <dje.gcc@gmail.com>
2023-04-26 | RISCV: Inline subword atomic ops | Patrick O'Neill | 13 | -1/+1839
RISC-V has no support for subword atomic operations; code currently generates libatomic library calls. This patch changes the default behavior to inline subword atomic calls (using the same logic as the existing library call). Behavior can be specified using the -minline-atomics and -mno-inline-atomics command line flags. gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm. This will need to stay for backwards compatibility and the -mno-inline-atomics flag. 2023-04-18 Patrick O'Neill <patrick@rivosinc.com> gcc/ChangeLog: PR target/104338 * config/riscv/riscv-protos.h: Add helper function stubs. * config/riscv/riscv.cc: Add helper functions for subword masking. * config/riscv/riscv.opt: Add command-line flag. * config/riscv/sync.md: Add masking logic and inline asm for fetch_and_op, fetch_and_nand, CAS, and exchange ops. * doc/invoke.texi: Add blurb regarding command-line flag. libgcc/ChangeLog: PR target/104338 * config/riscv/atomic.c: Add reference to duplicate logic. gcc/testsuite/ChangeLog: PR target/104338 * gcc.target/riscv/inline-atomics-1.c: New test. * gcc.target/riscv/inline-atomics-2.c: New test. * gcc.target/riscv/inline-atomics-3.c: New test. * gcc.target/riscv/inline-atomics-4.c: New test. * gcc.target/riscv/inline-atomics-5.c: New test. * gcc.target/riscv/inline-atomics-6.c: New test. * gcc.target/riscv/inline-atomics-7.c: New test. * gcc.target/riscv/inline-atomics-8.c: New test. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
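The subword masking idea can be sketched in portable C++. This is an illustration only, under stated assumptions: it uses a compare-exchange loop on a 32-bit `std::atomic` rather than the instruction sequence the patch emits, `fetch_add_u8` is a hypothetical name, and byte 0 is taken to be the least-significant byte of the word. It shows how an 8-bit atomic add can be built from a word-sized atomic by masking and shifting.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomically adds VAL to byte BYTE_INDEX (0..3, byte 0 being the
// least-significant bits) of WORD and returns the previous byte value.
std::uint8_t fetch_add_u8(std::atomic<std::uint32_t>& word,
                          unsigned byte_index, std::uint8_t val) {
  assert(byte_index < 4);
  const unsigned shift = byte_index * 8;
  const std::uint32_t mask = std::uint32_t{0xff} << shift;

  std::uint32_t old_word = word.load(std::memory_order_relaxed);
  std::uint32_t new_word;
  do {
    // Extract the target byte, add with 8-bit wraparound, and splice
    // the result back in without disturbing the neighbouring bytes.
    const std::uint32_t old_byte = (old_word & mask) >> shift;
    const std::uint32_t new_byte = (old_byte + val) & 0xff;
    new_word = (old_word & ~mask) | (new_byte << shift);
    // On failure, compare_exchange_weak reloads old_word and we retry.
  } while (!word.compare_exchange_weak(old_word, new_word,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed));
  return static_cast<std::uint8_t>((old_word & mask) >> shift);
}
```

The same mask-and-splice pattern extends to 16-bit subwords and to other read-modify-write operations by changing the shift, mask width, and the line that computes `new_byte`.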
2023-04-26 | aarch64: Reimplement RSHRN2 intrinsic patterns with standard RTL codes | Kyrylo Tkachov | 2 | -11/+25
Similar to the previous patch, we can reimplement the rshrn2 patterns using standard RTL codes for shift, truncate and plus with the appropriate constants. This allows us to get rid of UNSPEC_RSHRN entirely. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_rshrn2<mode>_insn_le): Reimplement using standard RTL codes instead of unspec. (aarch64_rshrn2<mode>_insn_be): Likewise. (aarch64_rshrn2<mode>): Adjust for the above. * config/aarch64/aarch64.md (UNSPEC_RSHRN): Delete.
2023-04-26 | aarch64: Reimplement RSHRN intrinsic patterns with standard RTL codes | Kyrylo Tkachov | 2 | -12/+30
This patch reimplements the backend patterns for the rshrn intrinsics using standard RTL codes rather than UNSPECS. We already represent shrn as truncate of a shift. rshrn can be represented as truncate (src + (1 << (shft - 1)) >> shft), similar to how LLVM treats it. I have a follow-up patch to do the same for the rshrn2 pattern, which will allow us to remove the UNSPEC_RSHRN entirely. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_rshrn<mode>_insn_le): Reimplement with standard RTL codes instead of an UNSPEC. (aarch64_rshrn<mode>_insn_be): Likewise. (aarch64_rshrn<mode>): Adjust for the above. * config/aarch64/predicates.md (aarch64_simd_rshrn_imm_vec): Define.
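The formula truncate (src + (1 << (shft - 1)) >> shft) can be checked against a scalar model. The sketch below is an assumption-laden illustration for a single 16-bit lane narrowed to 8 bits; `rshrn_u16` and `shrn_u16` are hypothetical helper names, not intrinsics. It contrasts the rounding form with the plain truncating shift.

```cpp
#include <cassert>
#include <cstdint>

// Plain narrowing shift: truncate (src >> shft).
std::uint8_t shrn_u16(std::uint16_t src, unsigned shft) {
  assert(shft >= 1 && shft <= 8);
  return static_cast<std::uint8_t>(src >> shft);
}

// Rounding narrowing shift: truncate ((src + (1 << (shft - 1))) >> shft).
// The addition is done in 32 bits so the rounding carry is not lost
// before the shift.
std::uint8_t rshrn_u16(std::uint16_t src, unsigned shft) {
  assert(shft >= 1 && shft <= 8);
  const std::uint32_t rounded =
      std::uint32_t{src} + (std::uint32_t{1} << (shft - 1));
  return static_cast<std::uint8_t>(rounded >> shft);
}
```

For example, with `shft` equal to 8 the input 0x01FF narrows to 1 with the plain shift and to 2 with the rounding shift, because adding 0x80 carries into the next bit before the shift.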
2023-04-26 | RISC-V: Legitimise the const0_rtx for RVV load/store address | Pan Li | 2 | -2/+149
This patch tries to legitimise const0_rtx (aka the zero register) as the base register for the RVV load/store instructions. For example:

  vint32m1_t test_vle32_v_i32m1_shortcut (size_t vl)
  {
    return __riscv_vle32_v_i32m1 ((int32_t *)0, vl);
  }

Before this patch:

  li a5,0
  vsetvli zero,a1,e32,m1,ta,ma
  vle32.v v24,0(a5)  <- can propagate the const 0 to a5 here
  vs1r.v v24,0(a0)

After this patch:

  vsetvli zero,a1,e32,m1,ta,ma
  vle32.v v24,0(zero)
  vs1r.v v24,0(a0)

As above, this patch allows the const 0 (aka the zero register) to be propagated to the base register of the RVV unit-stride load in the combine pass. This may benefit the underlying RVV auto-vectorization. However, the optimization is not yet performed for the indexed load; that will be taken care of in another patch. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_classify_address): Allow const0_rtx for the RVV load/store. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com> Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
2023-04-26 | Remove legacy range support. | Aldy Hernandez | 5 | -1214/+47
This patch removes all the code paths guarded by legacy_mode_p(), thus allowing us to re-use the int_range<1> idiom for a range of one sub-range. This allows us to represent these simple ranges in a more efficient manner. gcc/ChangeLog: * range-op.cc (range_op_cast_tests): Remove legacy support. * value-range-storage.h (vrange_allocator::alloc_irange): Same. * value-range.cc (irange::operator=): Same. (get_legacy_range): Same. (irange::copy_legacy_to_multi_range): Delete. (irange::copy_to_legacy): Delete. (irange::irange_set_anti_range): Delete. (irange::set): Remove legacy support. (irange::verify_range): Same. (irange::legacy_lower_bound): Delete. (irange::legacy_upper_bound): Delete. (irange::legacy_equal_p): Delete. (irange::operator==): Remove legacy support. (irange::singleton_p): Same. (irange::value_inside_range): Same. (irange::contains_p): Same. (intersect_ranges): Delete. (irange::legacy_intersect): Delete. (union_ranges): Delete. (irange::legacy_union): Delete. (irange::legacy_verbose_union_): Delete. (irange::legacy_verbose_intersect): Delete. (irange::irange_union): Remove legacy support. (irange::irange_intersect): Same. (irange::intersect): Same. (irange::invert): Same. (ranges_from_anti_range): Delete. (gt_pch_nx): Adjust for legacy removal. (gt_ggc_mx): Same. (range_tests_legacy): Delete. (range_tests_misc): Adjust for legacy removal. (range_tests): Same. * value-range.h (class irange): Same. (irange::legacy_mode_p): Delete. (ranges_from_anti_range): Delete. (irange::nonzero_p): Adjust for legacy removal. (irange::lower_bound): Same. (irange::upper_bound): Same. (irange::union_): Same. (irange::intersect): Same. (irange::set_nonzero): Same. (irange::set_zero): Same. * vr-values.cc (simplify_using_ranges::legacy_fold_cond_overflow): Same.
2023-04-26 | Remove range_has_numeric_bounds_p. | Aldy Hernandez | 2 | -10/+3
gcc/ChangeLog: * value-range.cc (irange::copy_legacy_to_multi_range): Rewrite use of range_has_numeric_bounds_p with irange API. (range_has_numeric_bounds_p): Delete. * value-range.h (range_has_numeric_bounds_p): Delete.
2023-04-26 | Remove range_int_cst_p. | Aldy Hernandez | 5 | -53/+48
gcc/ChangeLog: * tree-data-ref.cc (compute_distributive_range): Replace uses of range_int_cst_p with irange API. * tree-ssa-strlen.cc (get_range_strlen_dynamic): Same. * tree-vrp.h (range_int_cst_p): Delete. * vr-values.cc (check_for_binary_op_overflow): Replace uses of range_int_cst_p with irange API. (vr_set_zero_nonzero_bits): Same. (range_fits_type_p): Same. (simplify_using_ranges::simplify_casted_cond): Same. * tree-vrp.cc (range_int_cst_p): Remove.
2023-04-26 | Convert compare_nonzero_chars to wide_ints. | Aldy Hernandez | 1 | -4/+4
gcc/ChangeLog: * tree-ssa-strlen.cc (compare_nonzero_chars): Convert to wide_ints.
2023-04-26 | Remove some uses of deprecated irange API. | Aldy Hernandez | 15 | -37/+39
gcc/ChangeLog: * builtins.cc (expand_builtin_strnlen): Rewrite deprecated irange API uses to new API. * gimple-predicate-analysis.cc (find_var_cmp_const): Same. * internal-fn.cc (get_min_precision): Same. * match.pd: Same. * tree-affine.cc (expr_to_aff_combination): Same. * tree-data-ref.cc (dr_step_indicator): Same. * tree-dfa.cc (get_ref_base_and_extent): Same. * tree-scalar-evolution.cc (iv_can_overflow_p): Same. * tree-ssa-phiopt.cc (two_value_replacement): Same. * tree-ssa-pre.cc (insert_into_preds_of_block): Same. * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same. * tree-ssa-strlen.cc (compare_nonzero_chars): Same. * tree-switch-conversion.cc (bit_test_cluster::emit): Same. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Same. * tree.cc (get_range_pos_neg): Same.
2023-04-26 | Replace ad-hoc value_range dumpers with irange::dump. | Aldy Hernandez | 3 | -42/+9
This causes a regression in gcc.c-torture/unsorted/dump-noaddr.c. The test is asserting that two dumps are identical, but they are not because irange dumps the type which varies between runs: < VR [irange] void (*<T3dc>) (int) [1, +INF] > VR [irange] void (*<T3da>) (int) [1, +INF] I have changed the pretty printer for irange types to pass TDF_NOUID, thus avoiding this problem. gcc/ChangeLog: * ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Use vrange::dump instead of ad-hoc dumper. * tree-ssa-strlen.cc (dump_strlen_info): Same. * value-range-pretty-print.cc (visit): Pass TDF_NOUID to dump_generic_node.
2023-04-26 | Fix swapping of ranges. | Aldy Hernandez | 2 | -51/+6
The legacy range code has logic to swap out of order endpoints in the irange constructor. The new irange code expects the caller to fix any inconsistencies, thus speeding up the common case. However, this means that when we remove legacy, any stragglers must be fixed. This patch fixes the 3 culprits found during the conversion. gcc/ChangeLog: * range-op.cc (operator_cast::op1_range): Use create_possibly_reversed_range. (operator_bitwise_and::simple_op1_range_solver): Same. * value-range.cc (swap_out_of_order_endpoints): Delete. (irange::set): Remove call to swap_out_of_order_endpoints.
2023-04-26 | Convert users of legacy API to get_legacy_range() function. | Aldy Hernandez | 13 | -119/+199
This patch converts the users of the legacy API to a function called get_legacy_range() which will return the pieces of the soon to be removed API (min, max, and kind). This is a temporary measure while these users are converted. In upcoming patches I will convert most users, but most of the middle-end warning uses will remain. Naive attempts to remove them showed that a lot of these uses are quite dependant on the anti-range idiom, and converting them to the new API broke the tests, even when the conversion was conceptually correct. Perhaps someone who understands these passes could take a stab at it. In the meantime, the legacy uses can be trivially found by grepping for get_legacy_range. gcc/ChangeLog: * builtins.cc (determine_block_size): Convert use of legacy API to get_legacy_range. * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same. (array_bounds_checker::check_array_ref): Same. * gimple-ssa-warn-restrict.cc (builtin_memref::extend_offset_range): Same. * ipa-cp.cc (ipcp_store_vr_results): Same. * ipa-fnsummary.cc (set_switch_stmt_execution_predicate): Same. * ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same. (ipa_write_jump_function): Same. * pointer-query.cc (get_size_range): Same. * tree-data-ref.cc (split_constant_offset): Same. * tree-ssa-strlen.cc (get_range): Same. (maybe_diag_stxncpy_trunc): Same. (strlen_pass::get_len_or_size): Same. (strlen_pass::count_nonzero_bytes_addr): Same. * tree-vect-patterns.cc (vect_get_range_info): Same. * value-range.cc (irange::maybe_anti_range): Remove. (get_legacy_range): New. (irange::copy_to_legacy): Use get_legacy_range. (ranges_from_anti_range): Same. * value-range.h (class irange): Remove maybe_anti_range. (get_legacy_range): New. * vr-values.cc (check_for_binary_op_overflow): Convert use of legacy API to get_legacy_range. (compare_ranges): Same. (compare_range_with_value): Same. (bounds_of_var_in_loop): Same. (find_case_label_ranges): Same. (simplify_using_ranges::simplify_switch_using_ranges): Same.
2023-04-26 | Remove irange::constant_p. | Aldy Hernandez | 3 | -29/+6
gcc/ChangeLog: * value-range-pretty-print.cc (vrange_printer::visit): Remove constant_p use. * value-range.cc (irange::constant_p): Remove. (irange::get_nonzero_bits_from_range): Remove constant_p use. * value-range.h (class irange): Remove constant_p. (irange::num_pairs): Remove constant_p use.
2023-04-26 | Remove symbolics from irange. | Aldy Hernandez | 3 | -139/+4
gcc/ChangeLog: * value-range.cc (irange::copy_legacy_to_multi_range): Remove symbolics support. (irange::set): Same. (irange::legacy_lower_bound): Same. (irange::legacy_upper_bound): Same. (irange::contains_p): Same. (range_tests_legacy): Same. (irange::normalize_addresses): Remove. (irange::normalize_symbolics): Remove. (irange::symbolic_p): Remove. * value-range.h (class irange): Remove symbolic_p, normalize_symbolics, and normalize_addresses. * vr-values.cc (simplify_using_ranges::two_valued_val_range_p): Remove symbolics support.
2023-04-26 | Remove irange::may_contain_p. | Aldy Hernandez | 3 | -11/+4
The deprecated irange::may_contain_p method differed from contains_p in that it could handle symbolics, which no longer exist in VRP. gcc/ChangeLog: * value-range.cc (irange::may_contain_p): Remove. * value-range.h (range_includes_zero_p): Rewrite may_contain_p usage with contains_p. * vr-values.cc (compare_range_with_value): Same.
2023-04-26 | Remove range_fold_{unary,binary}_expr. | Aldy Hernandez | 2 | -91/+0
gcc/ChangeLog: * tree-vrp.cc (supported_types_p): Remove. (defined_ranges_p): Remove. (range_fold_binary_expr): Remove. (range_fold_unary_expr): Remove. * tree-vrp.h (range_fold_unary_expr): Remove. (range_fold_binary_expr): Remove.
2023-04-26 | Remove deprecated range_fold_{unary,binary}_expr uses from ipa-*. | Aldy Hernandez | 4 | -27/+57
gcc/ChangeLog: * ipa-cp.cc (ipa_vr_operation_and_type_effects): Convert to ranger API. (ipa_value_range_from_jfunc): Same. (propagate_vr_across_jump_function): Same. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same. * vr-values.cc (bounds_of_var_in_loop): Same.
2023-04-26 | Remove range_query::get_value_range. | Aldy Hernandez | 5 | -96/+99
gcc/ChangeLog: * gimple-array-bounds.cc (array_bounds_checker::get_value_range): Add irange argument. (check_out_of_bounds_and_warn): Remove check for vr. (array_bounds_checker::check_array_ref): Remove pointer qualifier for vr and adjust accordingly. * gimple-array-bounds.h (get_value_range): Add irange argument. * value-query.cc (class equiv_allocator): Delete. (range_query::get_value_range): Delete. (range_query::range_query): Remove allocator access. (range_query::~range_query): Same. * value-query.h (get_value_range): Delete. * vr-values.cc (simplify_using_ranges::op_with_boolean_value_range_p): Remove call to get_value_range. (check_for_binary_op_overflow): Same. (simplify_using_ranges::legacy_fold_cond_overflow): Same. (simplify_using_ranges::simplify_abs_using_ranges): Same. (simplify_using_ranges::simplify_cond_using_ranges_1): Same. (simplify_using_ranges::simplify_casted_cond): Same. (simplify_using_ranges::simplify_switch_using_ranges): Same. (simplify_using_ranges::two_valued_val_range_p): Same.
2023-04-26 | Refactor vrp_evaluate_conditional* and rename it. | Aldy Hernandez | 2 | -17/+13
gcc/ChangeLog: * vr-values.cc (simplify_using_ranges::vrp_evaluate_conditional_warnv_with_ops): Rename to... (simplify_using_ranges::legacy_fold_cond_overflow): ...this. (simplify_using_ranges::vrp_visit_cond_stmt): Rename to... (simplify_using_ranges::legacy_fold_cond): ...this. (simplify_using_ranges::fold_cond): Rename vrp_evaluate_conditional_warnv_with_ops to legacy_fold_cond_overflow. * vr-values.h (class vr_values): Replace vrp_visit_cond_stmt and vrp_evaluate_conditional_warnv_with_ops with legacy_fold_cond and legacy_fold_cond_overflow respectively.
2023-04-26 | Remove compare_names* from legacy cond folding. | Aldy Hernandez | 2 | -59/+0
In a test run I have asserted that the legacy conditional folding only gets overflows, so this removal is safe. gcc/ChangeLog: * vr-values.cc (get_vr_for_comparison): Remove. (compare_name_with_value): Same. (vrp_evaluate_conditional_warnv_with_ops): Remove calls to compare_name_with_value. * vr-values.h: Remove compare_name_with_value. Remove get_vr_for_comparison.
2023-04-26 | [xstormy16] Add support for byte and word swapping instructions. | Roger Sayle | 6 | -0/+85
This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap words) instructions. The most obvious application of these is to implement the __builtin_bswap16 and __builtin_bswap32 intrinsics. Currently, __builtin_bswap16 is implemented as:

  foo:  mov r7,r2
        shl r7,#8
        shr r2,#8
        or r2,r7
        ret

but with this patch becomes:

  foo:  swpb r2
        ret

Likewise, __builtin_bswap32 now becomes:

  foo:  swpb r2 | swpb r3 | swpw r2,r3
        ret

Finally, the swpw instruction on its own can be used to exchange two word mode registers without a temporary, so a new pattern and peephole2 have been added to catch this. As described in the PR rtl-optimization/106518, register allocation can (in theory) be more efficient on targets that provide a swap/exchange instruction. The slightly unusual swap<mode> naming matches that used in i386.md. 2023-04-26 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/stormy16/stormy16.md (bswaphi2): New define_insn. (bswapsi2): New define_insn. (swaphi): New define_insn to exchange two registers (swpw). (define_peephole2): Recognize exchange of registers as swaphi. gcc/testsuite/ChangeLog * gcc.target/xstormy16/bswap16.c: New test case. * gcc.target/xstormy16/bswap32.c: Likewise. * gcc.target/xstormy16/swpb.c: Likewise. * gcc.target/xstormy16/swpw-1.c: Likewise. * gcc.target/xstormy16/swpw-2.c: Likewise.
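The decomposition described above, two byte swaps followed by a word swap, can be modelled in plain C++. This is a sketch for illustration only: `swpb` and `bswap32_via_words` are hypothetical helper functions named after the instructions, and the two 16-bit halves stand in for the register pair. It shows that byte-swapping each 16-bit half and then exchanging the halves yields a full 32-bit byte swap.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Models a 16-bit byte swap using the shift/or sequence that the
// message lists as the old expansion of __builtin_bswap16.
std::uint16_t swpb(std::uint16_t x) {
  return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Models a 32-bit byte swap as: swap bytes in each half, then
// exchange the two halves (the role swpw plays in the message).
std::uint32_t bswap32_via_words(std::uint32_t x) {
  std::uint16_t lo = static_cast<std::uint16_t>(x);
  std::uint16_t hi = static_cast<std::uint16_t>(x >> 16);
  lo = swpb(lo);
  hi = swpb(hi);
  std::swap(lo, hi);  // exchange the two word-sized halves
  return (std::uint32_t{hi} << 16) | lo;
}
```

Tracing 0x11223344 through the steps: the halves 0x1122 and 0x3344 become 0x2211 and 0x4433, and exchanging them gives 0x44332211.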
2023-04-26More last_stmt removalRichard Biener19-150/+100
This adjusts more users of last_stmt where it is clear that debug stmt skipping is unnecessary. In most cases this also allowed significant code simplification. gcc/c/ * gimple-parser.cc (c_parser_parse_gimple_body): Avoid last_stmt. gcc/ * gimple-range-path.cc (path_range_query::compute_outgoing_relations): Avoid last_stmt. * ipa-pure-const.cc (pass_nothrow::execute): Likewise. * predict.cc (apply_return_prediction): Likewise. * sese.cc (set_ifsese_condition): Likewise. Simplify. * tree-cfg.cc (assert_unreachable_fallthru_edge_p): Avoid last_stmt. (make_edges_bb): Likewise. (make_cond_expr_edges): Likewise. (end_recording_case_labels): Likewise. (make_gimple_asm_edges): Likewise. (cleanup_dead_labels): Likewise. (group_case_labels): Likewise. (gimple_can_merge_blocks_p): Likewise. (gimple_merge_blocks): Likewise. (find_taken_edge): Likewise. Also handle empty fallthru blocks. (gimple_duplicate_sese_tail): Avoid last_stmt. (find_loop_dist_alias): Likewise. (gimple_block_ends_with_condjump_p): Likewise. (gimple_purge_dead_eh_edges): Likewise. (gimple_purge_dead_abnormal_call_edges): Likewise. (pass_warn_function_return::execute): Likewise. (execute_fixup_cfg): Likewise. * tree-eh.cc (redirect_eh_edge_1): Likewise. (pass_lower_resx::execute): Likewise. (pass_lower_eh_dispatch::execute): Likewise. (cleanup_empty_eh): Likewise. * tree-if-conv.cc (if_convertible_bb_p): Likewise. (predicate_bbs): Likewise. (ifcvt_split_critical_edges): Likewise. * tree-loop-distribution.cc (create_edge_for_control_dependence): Likewise. (loop_distribution::transform_reduction_loop): Likewise. * tree-parloops.cc (transform_to_exit_first_loop_alt): Likewise. (try_transform_to_exit_first_loop_alt): Likewise. (transform_to_exit_first_loop): Likewise. (create_parallel_loop): Likewise. * tree-scalar-evolution.cc (get_loop_exit_condition): Likewise. * tree-ssa-dce.cc (mark_last_stmt_necessary): Likewise. (eliminate_unnecessary_stmts): Likewise. 
* tree-ssa-dom.cc (dom_opt_dom_walker::set_global_ranges_from_unreachable_edges): Likewise. * tree-ssa-ifcombine.cc (ifcombine_ifandif): Likewise. (pass_tree_ifcombine::execute): Likewise. * tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Likewise. (should_duplicate_loop_header_p): Likewise. * tree-ssa-loop-ivcanon.cc (create_canonical_iv): Likewise. (tree_estimate_loop_size): Likewise. (try_unroll_loop_completely): Likewise. * tree-ssa-loop-ivopts.cc (tree_ssa_iv_optimize_loop): Likewise. * tree-ssa-loop-manip.cc (ip_normal_pos): Likewise. (canonicalize_loop_ivs): Likewise. * tree-ssa-loop-niter.cc (determine_value_range): Likewise. (bound_difference): Likewise. (number_of_iterations_popcount): Likewise. (number_of_iterations_cltz): Likewise. (number_of_iterations_cltz_complement): Likewise. (simplify_using_initial_conditions): Likewise. (number_of_iterations_exit_assumptions): Likewise. (loop_niter_by_eval): Likewise. (estimate_numbers_of_iterations): Likewise.
2023-04-26RISC-V: Fine tune vmadc/vmsbc RA constraintJu-Zhe Zhong5-88/+608
gcc/ChangeLog: * config/riscv/vector.md: Refine vmadc/vmsbc RA constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
2023-04-26rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]Kewen Lin1-13/+13
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used to be guarded with condition TARGET_P9_VECTOR before the new bif framework was introduced (r12-5752-gd08236359eb229). Since r12-5752 they have been placed under stanza ieee128-hw, which checks condition TARGET_FLOAT128_HW. This caused test case float128-cmp2-runnable.c to fail at -m32, as the condition TARGET_FLOAT128_HW isn't satisfied with -m32. Checking the commit history, I didn't see any notes on why this condition was changed, so this patch moves these bifs from stanza ieee128-hw back to stanza power9-vector. PR target/108758 gcc/ChangeLog: * config/rs6000/rs6000-builtins.def (__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt, __builtin_vsx_scalar_cmp_exp_qp_lt, __builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw to power9-vector.
2023-04-26rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]Kewen Lin6-3/+218
As PR109069 shows, commit r12-6537-g080a06fcb076b3, which introduced define_insn_and_split sldoi_to_mov, adopts easy_vector_constant for the const vector of interest. This is wrong, since predicate easy_vector_constant doesn't guarantee that each byte in the const vector is the same. One counterexample is the const vector in pr109069-1.c. This patch introduces a new predicate const_vector_each_byte_same to ensure all bytes in the given const vector are the same, considering both int and float. Meanwhile, for constants which don't meet easy_vector_constant we need to generate a move instead of just a set. The patch also uses VECTOR_MEM_ALTIVEC_OR_VSX_P rather than VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support under VSX, since the vector long long type of vec_sld is guarded under stanza vsx. PR target/109069 gcc/ChangeLog: * config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate easy_vector_constant with const_vector_each_byte_same, add handling in preparation for !easy_vector_constant, and replace VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P. * config/rs6000/predicates.md (const_vector_each_byte_same): New predicate. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr109069-1.c: New test. * gcc.target/powerpc/pr109069-2-run.c: New test. * gcc.target/powerpc/pr109069-2.c: New test. * gcc.target/powerpc/pr109069-2.h: New test.
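The property the new predicate is described as checking can be illustrated on raw bytes. This is a minimal sketch under assumptions, not the predicate's actual implementation in predicates.md: the helper name each_byte_same is hypothetical, and it works on a plain memory image of the constant rather than on RTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if every byte of the nbytes-long constant at vec is
   identical.  On success, optionally report that byte via byte_out.
   Working on bytes makes the check independent of the element type
   (int or float).  */
static bool each_byte_same(const void *vec, size_t nbytes, uint8_t *byte_out)
{
    const uint8_t *p = vec;
    assert(nbytes > 0);
    for (size_t i = 1; i < nbytes; i++)
        if (p[i] != p[0])
            return false;
    if (byte_out)
        *byte_out = p[0];
    return true;
}
```

Note the distinction this draws: four 32-bit elements all equal to 1 form a splat at the element level, yet their bytes are 00 00 00 01 and so are not all the same, whereas four elements of 0x01010101 pass the check.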
2023-04-26RISC-V: Optimize comparison patterns for register allocationJuzhe-Zhong17-89/+3817
The current RA constraint for RVV comparison instructions does not allow any overlap between the destination and source operand registers. For example: vmseq.vv vd, vs2, vs1 If LMUL = 8, vs2 = v8, vs1 = v16: With the current RA constraint, GCC does not allow vd to be any register in v8 ~ v23. However, this is too conservative and not required by the RVV ISA. Since the dest EEW of a comparison is always 1, it always follows the overlap rules for Dest EEW < Source EEW. So in this case, we should give GCC RA the chance to allocate v8 or v16 for vd, so that we get better vector register usage in RA. gcc/ChangeLog: * config/riscv/vector.md (*pred_cmp<mode>_merge_tie_mask): New pattern. (*pred_ltge<mode>_merge_tie_mask): Ditto. (*pred_cmp<mode>_scalar_merge_tie_mask): Ditto. (*pred_eqne<mode>_scalar_merge_tie_mask): Ditto. (*pred_cmp<mode>_extended_scalar_merge_tie_mask): Ditto. (*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto. (*pred_cmp<mode>_narrow_merge_tie_mask): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase. * gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test. * gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
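The overlap rule this commit relies on can be written down as a small check. The sketch below is an assumption-laden model, not GCC's constraint code: the function names are hypothetical, it handles only integer LMUL (1, 2, 4, 8), it ignores the separate restriction on v0 for masked instructions, and it encodes the RVV rule that a narrower-EEW destination may overlap a source register group only in the group's lowest-numbered register.

```c
#include <assert.h>
#include <stdbool.h>

/* A single-register mask destination vd against one source group
   occupying registers [vs, vs + lmul).  */
static bool mask_dest_allowed(int vd, int vs, int lmul)
{
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    if (vd < vs || vd >= vs + lmul)
        return true;      /* no overlap with this group at all */
    return vd == vs;      /* overlap only in the lowest-numbered register */
}

/* A comparison such as vmseq.vv vd, vs2, vs1 must satisfy the rule
   for both source groups.  */
static bool cmp_dest_allowed(int vd, int vs2, int vs1, int lmul)
{
    return mask_dest_allowed(vd, vs2, lmul)
        && mask_dest_allowed(vd, vs1, lmul);
}
```

For the example in the commit message (LMUL = 8, vs2 = v8, vs1 = v16), this model accepts vd = v8 and vd = v16, rejects v9 through v15 and v17 through v23, and accepts any register outside v8 ~ v23.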
* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
2023-04-26RISC-V: Fix redundant vmv1r.v instruction in vmsge.vx codegenJu-Zhe Zhong2-9/+8
Current expansion of vmsge will make RA produce redundant vmv1r.v.

testcase:
void f1 (void * in, void *out, int32_t x)
{
  vbool32_t mask = *(vbool32_t*)in;
  asm volatile ("":::"memory");
  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
  vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in, 4);
  vbool32_t m3 = __riscv_vmsge_vx_i32m1_b32 (v, x, 4);
  vbool32_t m4 = __riscv_vmsge_vx_i32m1_b32_mu (mask, m3, v, x, 4);
  m4 = __riscv_vmsge_vv_i32m1_b32_m (m4, v2, v2, 4);
  __riscv_vsm_v_b32 (out, m4, 4);
}

Before this patch:
f1:
  vsetvli a5,zero,e8,mf4,ta,ma
  vlm.v v0,0(a0)
  vsetivli zero,4,e32,m1,ta,mu
  vle32.v v3,0(a0)
  vle32.v v2,0(a0),v0.t
  vmslt.vx v1,v3,a2
  vmnot.m v1,v1
  vmslt.vx v1,v3,a2,v0.t
  vmxor.mm v1,v1,v0
  vmv1r.v v0,v1
  vmsge.vv v2,v2,v2,v0.t
  vsm.v v2,0(a1)
  ret

After this patch:
f1:
  vsetvli a5,zero,e8,mf4,ta,ma
  vlm.v v0,0(a0)
  vsetivli zero,4,e32,m1,ta,mu
  vle32.v v3,0(a0)
  vle32.v v2,0(a0),v0.t
  vmslt.vx v1,v3,a2
  vmnot.m v1,v1
  vmslt.vx v1,v3,a2,v0.t
  vmxor.mm v0,v1,v0
  vmsge.vv v2,v2,v2,v0.t
  vsm.v v2,0(a1)
  ret

gcc/ChangeLog:
* config/riscv/vector.md: Fix redundant vmv1r.v.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Adapt assembly check.
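The assembly above shows the masked vmsge.vx being expanded as a masked vmslt.vx followed by vmxor.mm with the mask. Why that gives "greater than or equal" can be checked with a scalar model. This is only a sketch under assumptions: masks are modelled as bits of a uint32_t, vl is assumed to be below 32, bits at or beyond vl are simply left as in the merge value, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Masked v[i] < x with mask-undisturbed policy: active lanes get the
   comparison result, inactive lanes keep the bit from merge.  */
static uint32_t vmslt_vx_mu(uint32_t mask, uint32_t merge,
                            const int32_t *v, int32_t x, unsigned vl)
{
    assert(vl < 32);
    uint32_t r = merge;
    for (unsigned i = 0; i < vl; i++) {
        if ((mask >> i) & 1u) {
            r &= ~(1u << i);
            if (v[i] < x)
                r |= 1u << i;
        }
    }
    return r;
}

/* v[i] >= x built from the two steps seen in the assembly: XOR with
   the mask inverts only the active lanes (lt becomes ge) and leaves
   the inactive lanes, which hold merge, untouched.  */
static uint32_t vmsge_vx_mu(uint32_t mask, uint32_t merge,
                            const int32_t *v, int32_t x, unsigned vl)
{
    uint32_t lt = vmslt_vx_mu(mask, merge, v, x, vl);
    uint32_t active = mask & ((1u << vl) - 1u);
    return lt ^ active;
}
```

Since the XOR already produces the final mask, writing its result directly to v0 (as in the "after" assembly) makes the extra vmv1r.v unnecessary.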
2023-04-26RISC-V: Fine tune gather load RA constraintJu-Zhe Zhong2-27/+330
For DEST EEW < SOURCE EEW, the destination register may partially overlap the source register group, according to the RVV ISA. gcc/ChangeLog: * config/riscv/vector.md: Fix RA constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.