|
The old legacy code would allow building ranges containing symbolics,
even though the entire ranger ecosystem does not handle them. These
were normalized into non-zero ranges by helper functions in VRP
(range_fold_*_expr) before calling the ranger. The only users of
these functions should have been legacy VRP, which is no more.
However, a handful of users crept into IPA, even though these
functions should never have been called outside of VRP or vr-values.
The issue here is that IPA is building a range of [&foo, &foo] and
expecting range_fold_binary to normalize it to non-zero. Fixed by
adding a helper function before calling the range_op handler.
I think this covers the problematic ranges. If not, I'll come up
with something more generalized that does not involve polluting
irange::set with the normalization code. After all, this only
involves a handful of IPA places.
I've also added an assert in irange::set() making it easier to detect
any possible fallout without having to drill deep into the setter.
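For reference, a minimal sketch of the helper (details of the actual
ipa_range_set_and_normalize may differ):

/* Sketch only: set R from VAL, normalizing an address constant such
   as &foo into a plain non-zero range, since the ranger ecosystem
   does not handle symbolic endpoints.  */
inline void
ipa_range_set_and_normalize (irange &r, tree val)
{
  if (TREE_CODE (val) == ADDR_EXPR)
    r.set_nonzero (TREE_TYPE (val));
  else
    r.set (val, val);
}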
gcc/ChangeLog:
PR tree-optimization/109639
* ipa-cp.cc (ipa_value_range_from_jfunc): Normalize range.
(propagate_vr_across_jump_function): Same.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same.
* ipa-prop.h (ipa_range_set_and_normalize): New.
* value-range.cc (irange::set): Assert min and max are INTEGER_CST.
|
|
When we simplify a BIT_FIELD_REF of a CTOR like { _1, _2, _3, _4 }
and attempt to produce (view converted) { _1, _2 } for a selected
subset we fail to realize this cannot be done from match.pd since
we have no way to write the resulting CTOR "operation" and the
built CTOR { _1, _2 } isn't a GIMPLE value.
This kind of simplification has to be done in forwprop (or would
need a match.pd syntax extension) where we can split out the CTOR
to a separate stmt.
The following disables this particular simplification when we are
simplifying GIMPLE. With enhanced IL checking this otherwise
causes ICEs in the testsuite from vectorized code.
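A rough illustration of the problematic fold (not the actual
vectorized testcase):

  vector(4) int v_5 = { _1, _2, _3, _4 };
  _7 = BIT_FIELD_REF <v_5, 64, 0>;

Selecting the low half would produce the CTOR { _1, _2 }, which is
not a GIMPLE value, so match.pd cannot emit the needed pair
  tmp_8 = { _1, _2 };
  _7 = VIEW_CONVERT_EXPR<long int>(tmp_8);
whereas forwprop can, because it may split the CTOR to a separate stmt.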
* match.pd (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2): Do not
create a CTOR operand in the result when simplifying GIMPLE.
|
|
When for example complex lowering wants to extract the imaginary
part of a complex variable for lowering a complex move we can
end up with it generating __imag <VIEW_CONVERT_EXPR <_22> > which
is valid GENERIC. It then feeds that to the gimplifier via
force_gimple_operand but that fails to split up this chain
of handled components, generating invalid GIMPLE caught by
verification when PR109644 is fixed.
The following rectifies this by noting in gimplify_compound_lval
when the base object which we gimplify first ends up being a
register and splitting up the chain in that case.
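A sketch of the effect (SSA names and the converted type are
illustrative):

Before, the whole chain stayed in one statement:
  _23 = __imag <VIEW_CONVERT_EXPR <complex float> (_22)>;
After, the base is split out once it gimplifies to a register:
  _24 = VIEW_CONVERT_EXPR <complex float> (_22);
  _23 = __imag <_24>;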
* gimplify.cc (gimplify_compound_lval): When the base is
gimplified to a register, make sure to split up chains
of operations.
|
|
The following addresses IPA param manipulation (through IPA SRA)
replacing
BIT_FIELD_REF <*this_8(D), 8, 56>
with
BIT_FIELD_REF <VIEW_CONVERT_EXPR<const struct profile_count>(ISRA.814), 8, 56>
which is supposed to be invalid GIMPLE (ISRA.814 is a register).
There's currently insufficient checking in place to catch this in the
IL verifier but I am working on that as part of fixing PR109594.
The solution for the particular testcase I am running into is
to split the conversion out to a separate stmt. Generally the modification
phase is set up for this but the extra_stmts sequence isn't passed
around everywhere. The following passes it to modify_expression
from modify_assignment when rewriting the RHS.
PR ipa/109607
* ipa-param-manipulation.h
(ipa_param_body_adjustments::modify_expression): Add extra_stmts
argument.
* ipa-param-manipulation.cc
(ipa_param_body_adjustments::modify_expression): Likewise.
When we need a conversion and the replacement is a register
split the conversion out.
(ipa_param_body_adjustments::modify_assignment): Pass
extra_stmts to RHS modify_expression.
* g++.dg/torture/pr109607.C: New testcase.
|
|
On the following testcase we ICE because, after we emit the
"variable-sized object may not be initialized except with an empty
initializer" error, we don't really reset the initializer to
error_mark_node, and then at -Wformat checking time we ICE on seeing a
STRING_CST initializer for a VLA.
The following patch just arranges for error_mark_node to be returned after
the error diagnostics.
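An illustrative reproducer (the committed gcc.dg/pr109409.c may
differ):

/* { dg-options "-Wformat" } */
void
foo (int n)
{
  const char fmt[n] = "%d"; /* error: variable-sized object may not be
			       initialized except with an empty
			       initializer */
  __builtin_printf (fmt, 1); /* -Wformat used to ICE on the STRING_CST
				initializer of the VLA.  */
}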
2023-04-27 Jakub Jelinek <jakub@redhat.com>
PR c/109409
* c-parser.cc (c_parser_initializer): Move diagnostics about
initialization of variable sized object with non-empty initializer
after c_parser_expr_no_commas call and ret.set_error (); after it.
* gcc.dg/pr109409.c: New test.
|
|
The change to allow empty initializers in C broke error-recovery on the
following testcase. We are emitting the "function %qD is initialized
like a variable" error early; if the initializer is non-empty, we just emit
another error that the initializer is invalid. Previously if it was empty,
we'd emit another error that scalar is being initialized by empty
initializer (not really correct), but now we instead just try to
build_zero_cst for the FUNCTION_TYPE and ICE on it.
The following patch just emits the same diagnostics for the empty
initializers as we emit for the non-empty ones.
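An illustrative reproducer (the committed gcc.dg/pr109412.c may
differ):

int foo (void) = { }; /* error: function 'foo' is initialized like a
			 variable; the empty initializer used to ICE in
			 build_zero_cst instead of being rejected.  */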
2023-04-27 Jakub Jelinek <jakub@redhat.com>
PR c/107682
PR c/109412
* c-typeck.cc (pop_init_level): If constructor_type is FUNCTION_TYPE,
reject empty initializer as invalid.
* gcc.dg/pr109412.c: New test.
|
|
gcc/ChangeLog:
* doc/extend.texi (Zero Length): Describe example.
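Presumably this refers to the classic zero-length array idiom; a
hedged reconstruction of the kind of example involved:

#include <stdlib.h>

struct line
{
  int length;
  char contents[0];	/* zero-length array as a variable-length tail */
};

struct line *
make_line (int this_length)
{
  struct line *thisline = malloc (sizeof (struct line) + this_length);
  thisline->length = this_length;
  return thisline;
}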
|
|
We fail to verify the constraints under which we allow handled
components to wrap registers. The gcc.dg/pr70022.c testcase shows
that we happily end up with
_2 = VIEW_CONVERT_EXPR<int[4]>(v_1(D))
as produced by SSA rewrite and update_address_taken. But the intent
was that we wrap registers with at most a single level of handled
components and specifically only allow __real, __imag, BIT_FIELD_REF
and VIEW_CONVERT_EXPR on them, but not ARRAY_REF or COMPONENT_REF.
Together with the improved gimple_load predicate taking advantage
of the above and ASAN this eventually ICEd.
The following fixes update_address_taken with respect to this constraint.
PR tree-optimization/109594
* tree-ssa.cc (non_rewritable_mem_ref_base): Constrain
what we rewrite to a register based on the above.
|
|
RISC-V will emit ".option nopic" when -fno-pie is in effect, which
matches the generic pattern. Just like done for Alpha, special-case
RISC-V.
gcc/testsuite/
* c-c++-common/patchable_function_entry-decl.c: Special-case
RISC-V.
* c-c++-common/patchable_function_entry-default.c: Likewise.
* c-c++-common/patchable_function_entry-definition.c: Likewise.
|
|
For PR61445 I removed this assert, but PR108242 demonstrated why it's still
useful; to avoid regressing the former testcase I check pattern_defined
in the assert.
This reverts r212524.
PR c++/61445
gcc/cp/ChangeLog:
* pt.cc (instantiate_decl): Assert !defer_ok for local
class members.
|
|
|
This patch fixes whitespace errors introduced with
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616807.html
2023-04-26 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
* config/riscv/riscv.cc: Fix whitespace.
* config/riscv/sync.md: Fix whitespace.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
It occurred to me that we have a perfectly good DECL_INITIAL field to put
the instantiated DMI into; we don't need a separate hash table.
gcc/cp/ChangeLog:
* init.cc (nsdmi_inst): Remove.
(maybe_instantiate_nsdmi_init): Use DECL_INITIAL instead.
|
|
The earlier fix for PR109241 avoided the crash by handling a type with no
TREE_BINFO. But we want to move toward doing the partial substitution of
classes in generic lambdas, so let's take a step in that direction.
PR c++/109241
gcc/cp/ChangeLog:
* pt.cc (instantiate_class_template): Do partially instantiate.
(tsubst_expr): Do call complete_type for partial instantiations.
|
|
Normally we re-instantiate a function declaration when we start to
instantiate the body in case of multiple declarations. In this wacky
testcase, this causes a problem because the type of the w_counter parameter
depends on its declaration not being in scope yet, so the name lookup only
finds the previous declaration. This isn't a problem for member functions,
since they aren't subject to argument-dependent lookup. So let's just skip
the regeneration for hidden friends.
PR c++/69836
gcc/cp/ChangeLog:
* pt.cc (regenerate_decl_from_template): Skip unique friends.
gcc/testsuite/ChangeLog:
* g++.dg/template/friend76.C: New test.
|
|
This introduces an early exit test to most_specialized_partial_spec for
templates which have no partial specializations, saving some unnecessary
work during class template instantiation in the common case. In passing,
modernize the code a bit.
gcc/cp/ChangeLog:
* pt.cc (most_specialized_partial_spec): Exit early when
DECL_TEMPLATE_SPECIALIZATIONS is empty. Move local variable
declarations closer to their first use. Remove redundant
flag_concepts test. Remove redundant forward declaration.
|
|
Sparsely used ssa caches can benefit from using a bitmap to
determine if a name already has an entry. Utilize it in the path query
and remove its private bitmap for tracking the same info.
Also use it in the "assume" query class.
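A sketch of the lazy-cache idea (member names are assumptions, not
necessarily the committed code):

bool
ssa_lazy_cache::set_range (tree name, const vrange &r)
{
  unsigned v = SSA_NAME_VERSION (name);
  /* The bitmap records which names have an entry; if the bit was
     already set this is a plain update.  */
  if (!bitmap_set_bit (active_p, v))
    return ssa_cache::set_range (name, r);
  ssa_cache::set_range (name, r);
  return false;   /* No range was previously set.  */
}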
PR tree-optimization/108697
* gimple-range-cache.cc (ssa_global_cache::clear_range): Do
not clear the vector on an out of range query.
(ssa_cache::dump): Use dump_range_query instead of get_range.
(ssa_cache::dump_range_query): New.
(ssa_lazy_cache::dump_range_query): New.
(ssa_lazy_cache::set_range): New.
* gimple-range-cache.h (ssa_cache::dump_range_query): New.
(class ssa_lazy_cache): New.
(ssa_lazy_cache::ssa_lazy_cache): New.
(ssa_lazy_cache::~ssa_lazy_cache): New.
(ssa_lazy_cache::get_range): New.
(ssa_lazy_cache::clear_range): New.
(ssa_lazy_cache::clear): New.
(ssa_lazy_cache::dump): New.
* gimple-range-path.cc (path_range_query::path_range_query): Do
not allocate a ssa_cache object nor has_cache bitmap.
(path_range_query::~path_range_query): Do not free objects.
(path_range_query::clear_cache): Remove.
(path_range_query::get_cache): Adjust.
(path_range_query::set_cache): Remove.
(path_range_query::dump): Don't call through a pointer.
(path_range_query::internal_range_of_expr): Set cache directly.
(path_range_query::reset_path): Clear cache directly.
(path_range_query::ssa_range_in_phi): Fold with globals only.
(path_range_query::compute_ranges_in_phis): Simply set range.
(path_range_query::compute_ranges_in_block): Call cache directly.
* gimple-range-path.h (class path_range_query): Replace bitmap
and cache pointer with lazy cache object.
* gimple-range.h (class assume_query): Use ssa_lazy_cache.
|
|
This renames the ssa_global_cache to be ssa_cache. The original use was
to function as a global cache, but its uses have expanded. Remove all mention
of "global" from the class and methods. Also add a has_range method.
* gimple-range-cache.cc (ssa_cache::ssa_cache): Rename.
(ssa_cache::~ssa_cache): Rename.
(ssa_cache::has_range): New.
(ssa_cache::get_range): Rename.
(ssa_cache::set_range): Rename.
(ssa_cache::clear_range): Rename.
(ssa_cache::clear): Rename.
(ssa_cache::dump): Rename and use get_range.
(ranger_cache::get_global_range): Use get_range and set_range.
(ranger_cache::range_of_def): Use get_range.
* gimple-range-cache.h (class ssa_cache): Rename class and methods.
(class ranger_cache): Use ssa_cache.
* gimple-range-path.cc (path_range_query::path_range_query): Use
ssa_cache.
(path_range_query::get_cache): Use get_range.
(path_range_query::set_cache): Use set_range.
* gimple-range-path.h (class path_range_query): Use ssa_cache.
* gimple-range.cc (assume_query::assume_range_p): Use get_range.
(assume_query::range_of_expr): Use get_range.
(assume_query::assume_query): Use set_range.
(assume_query::calculate_op): Use get_range and set_range.
* gimple-range.h (class assume_query): Use ssa_cache.
|
|
Add a sparse vector class for the cache and use it by default.
Rename the evrp_* params to vrp_*, and add a param for small CFGs which
use just the original basic vector.
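For example, tuning that used to be spelled
--param=evrp-sparse-threshold=N is now --param=vrp-sparse-threshold=N;
the new vrp-vector-threshold param selects the plain vector for small
CFGs (exact semantics and defaults as per params.opt below).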
* gimple-range-cache.cc (sbr_vector::sbr_vector): Add parameter
and local to optionally zero memory.
(sbr_vector::grow): Only zero memory if flag is set.
(class sbr_lazy_vector): New.
(sbr_lazy_vector::sbr_lazy_vector): New.
(sbr_lazy_vector::set_bb_range): New.
(sbr_lazy_vector::get_bb_range): New.
(sbr_lazy_vector::bb_range_p): New.
(block_range_cache::set_bb_range): Check flags and use sbr_lazy_vector.
* gimple-range-gori.cc (gori_map::calculate_gori): Use
param_vrp_switch_limit.
(gori_compute::gori_compute): Use param_vrp_switch_limit.
* params.opt (vrp_sparse_threshold): Rename from evrp_sparse_threshold.
(vrp_switch_limit): Rename from evrp_switch_limit.
(vrp_vector_threshold): New.
|
|
If either of the SSA names in a comparison does not have any equivalences
or relations, we can short-circuit the check slightly.
* value-relation.cc (dom_oracle::query_relation): Check early for lack
of any relation.
* value-relation.h (equiv_oracle::has_equiv_p): New.
|
|
If the direct dependence fields point directly to an ssa-name,
it's possible that an optimization frees an ssa-name, and the value
pointed to may now be in the free list. Simply maintain the ssa
version number instead.
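A sketch of the change (struct and accessor shapes are illustrative):

struct rdc
{
  unsigned int ssa1;	/* SSA_NAME_VERSION of the dependency; was a tree.  */
  unsigned int ssa2;	/* Likewise.  */
};

/* Rehydrate a dependency on demand; a stale version number can no
   longer be dereferenced as freed memory.  */
inline tree
depend1 (const rdc &dep)
{
  return dep.ssa1 ? ssa_name (dep.ssa1) : NULL_TREE;
}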
PR tree-optimization/109417
* gimple-range-gori.cc (range_def_chain::register_dependency):
Save the ssa version number, not the pointer.
(gori_compute::may_recompute_p): No need to check if a dependency
is in the free list.
* gimple-range-gori.h (class range_def_chain): Change ssa1 and ssa2
fields to be unsigned int instead of trees.
(range_def_chain::depend1): Adjust.
(range_def_chain::depend2): Adjust.
* gimple-range.h: Include "ssa.h" to inline ssa_name().
|
|
AIX 7.2 minimum ISA is POWER7 and AIX 7.3 minimum ISA is POWER8.
This patch changes the aix72.h configuration to POWER7 with VSX enabled
by default (with the AIX VSX ABI limitations), matching LLVM on AIX,
and changes the aix73.h configuration to POWER8.
gcc/ChangeLog:
* config/rs6000/aix72.h (TARGET_DEFAULT): Use ISA_2_6_MASKS_SERVER.
* config/rs6000/aix73.h (TARGET_DEFAULT): Use ISA_2_7_MASKS_SERVER.
(PROCESSOR_DEFAULT): Use PROCESSOR_POWER8.
Signed-off-by: David Edelsohn <dje.gcc@gmail.com>
|
|
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.
This patch changes the default behavior to inline subword atomic calls
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.
gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.
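An illustrative subword atomic that previously became a libatomic
call (__atomic_exchange_1) and is now expanded inline under
-minline-atomics:

#include <stdint.h>

uint8_t flag;

uint8_t
set_flag (uint8_t v)
{
  /* 8-bit atomic exchange; inlined as an aligned-word LR/SC masking
     loop when -minline-atomics is in effect.  */
  return __atomic_exchange_n (&flag, v, __ATOMIC_SEQ_CST);
}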
2023-04-18 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
PR target/104338
* config/riscv/riscv-protos.h: Add helper function stubs.
* config/riscv/riscv.cc: Add helper functions for subword masking.
* config/riscv/riscv.opt: Add command-line flag.
* config/riscv/sync.md: Add masking logic and inline asm for fetch_and_op,
fetch_and_nand, CAS, and exchange ops.
* doc/invoke.texi: Add blurb regarding command-line flag.
libgcc/ChangeLog:
PR target/104338
* config/riscv/atomic.c: Add reference to duplicate logic.
gcc/testsuite/ChangeLog:
PR target/104338
* gcc.target/riscv/inline-atomics-1.c: New test.
* gcc.target/riscv/inline-atomics-2.c: New test.
* gcc.target/riscv/inline-atomics-3.c: New test.
* gcc.target/riscv/inline-atomics-4.c: New test.
* gcc.target/riscv/inline-atomics-5.c: New test.
* gcc.target/riscv/inline-atomics-6.c: New test.
* gcc.target/riscv/inline-atomics-7.c: New test.
* gcc.target/riscv/inline-atomics-8.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Similar to the previous patch, we can reimplement the rshrn2 patterns using standard RTL codes
for shift, truncate and plus with the appropriate constants.
This allows us to get rid of UNSPEC_RSHRN entirely.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_rshrn2<mode>_insn_le):
Reimplement using standard RTL codes instead of unspec.
(aarch64_rshrn2<mode>_insn_be): Likewise.
(aarch64_rshrn2<mode>): Adjust for the above.
* config/aarch64/aarch64.md (UNSPEC_RSHRN): Delete.
|
|
This patch reimplements the backend patterns for the rshrn intrinsics using standard RTL codes rather than UNSPECS.
We already represent shrn as a truncate of a shift. rshrn can be represented as truncate ((src + (1 << (shft - 1))) >> shft),
similar to how LLVM treats it.
I have a follow-up patch to do the same for the rshrn2 pattern, which will allow us to remove the UNSPEC_RSHRN entirely.
Bootstrapped and tested on aarch64-none-linux-gnu.
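A scalar sketch of one lane of the rounding shift-right-narrow,
restating the formula above (illustrative only):

#include <stdint.h>

uint8_t
rshrn_lane (uint16_t src, unsigned shft)	/* 1 <= shft <= 8 */
{
  /* Round by adding half of the shifted-out weight, shift, truncate.  */
  return (uint8_t) ((src + (1u << (shft - 1))) >> shft);
}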
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_rshrn<mode>_insn_le): Reimplement
with standard RTL codes instead of an UNSPEC.
(aarch64_rshrn<mode>_insn_be): Likewise.
(aarch64_rshrn<mode>): Adjust for the above.
* config/aarch64/predicates.md (aarch64_simd_rshrn_imm_vec): Define.
|
|
This patch tries to legitimise the const0_rtx (aka zero register)
as the base register for the RVV load/store instructions.
For example:
vint32m1_t test_vle32_v_i32m1_shortcut (size_t vl)
{
return __riscv_vle32_v_i32m1 ((int32_t *)0, vl);
}
Before this patch:
li a5,0
vsetvli zero,a1,e32,m1,ta,ma
vle32.v v24,0(a5) <- can propagate the const 0 to a5 here
vs1r.v v24,0(a0)
After this patch:
vsetvli zero,a1,e32,m1,ta,ma
vle32.v v24,0(zero)
vs1r.v v24,0(a0)
As above, this patch allows you to propagate the const 0 (aka zero
register) to the base register of the RVV Unit-Stride load in the
combine pass. This may benefit the underlying RVV auto-vectorization.
However, the indexed load fails to perform this optimization; it
will be taken care of in another patch.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_classify_address): Allow
const0_rtx for the RVV load/store.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
|
|
This patch removes all the code paths guarded by legacy_mode_p(), thus
allowing us to re-use the int_range<1> idiom for a range of one
sub-range. This allows us to represent these simple ranges in a more
efficient manner.
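For example, a single sub-range can now always use the small
fixed-size form (constructor arguments are illustrative):

  int_range<1> r (build_int_cst (integer_type_node, 10),
		  build_int_cst (integer_type_node, 20));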
gcc/ChangeLog:
* range-op.cc (range_op_cast_tests): Remove legacy support.
* value-range-storage.h (vrange_allocator::alloc_irange): Same.
* value-range.cc (irange::operator=): Same.
(get_legacy_range): Same.
(irange::copy_legacy_to_multi_range): Delete.
(irange::copy_to_legacy): Delete.
(irange::irange_set_anti_range): Delete.
(irange::set): Remove legacy support.
(irange::verify_range): Same.
(irange::legacy_lower_bound): Delete.
(irange::legacy_upper_bound): Delete.
(irange::legacy_equal_p): Delete.
(irange::operator==): Remove legacy support.
(irange::singleton_p): Same.
(irange::value_inside_range): Same.
(irange::contains_p): Same.
(intersect_ranges): Delete.
(irange::legacy_intersect): Delete.
(union_ranges): Delete.
(irange::legacy_union): Delete.
(irange::legacy_verbose_union_): Delete.
(irange::legacy_verbose_intersect): Delete.
(irange::irange_union): Remove legacy support.
(irange::irange_intersect): Same.
(irange::intersect): Same.
(irange::invert): Same.
(ranges_from_anti_range): Delete.
(gt_pch_nx): Adjust for legacy removal.
(gt_ggc_mx): Same.
(range_tests_legacy): Delete.
(range_tests_misc): Adjust for legacy removal.
(range_tests): Same.
* value-range.h (class irange): Same.
(irange::legacy_mode_p): Delete.
(ranges_from_anti_range): Delete.
(irange::nonzero_p): Adjust for legacy removal.
(irange::lower_bound): Same.
(irange::upper_bound): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::set_nonzero): Same.
(irange::set_zero): Same.
* vr-values.cc (simplify_using_ranges::legacy_fold_cond_overflow): Same.
|
|
gcc/ChangeLog:
* value-range.cc (irange::copy_legacy_to_multi_range): Rewrite use
of range_has_numeric_bounds_p with irange API.
(range_has_numeric_bounds_p): Delete.
* value-range.h (range_has_numeric_bounds_p): Delete.
|
|
gcc/ChangeLog:
* tree-data-ref.cc (compute_distributive_range): Replace uses of
range_int_cst_p with irange API.
* tree-ssa-strlen.cc (get_range_strlen_dynamic): Same.
* tree-vrp.h (range_int_cst_p): Delete.
* vr-values.cc (check_for_binary_op_overflow): Replace uses of
range_int_cst_p with irange API.
(vr_set_zero_nonzero_bits): Same.
(range_fits_type_p): Same.
(simplify_using_ranges::simplify_casted_cond): Same.
* tree-vrp.cc (range_int_cst_p): Remove.
|
|
gcc/ChangeLog:
* tree-ssa-strlen.cc (compare_nonzero_chars): Convert to wide_ints.
|
|
gcc/ChangeLog:
* builtins.cc (expand_builtin_strnlen): Rewrite deprecated irange
API uses to new API.
* gimple-predicate-analysis.cc (find_var_cmp_const): Same.
* internal-fn.cc (get_min_precision): Same.
* match.pd: Same.
* tree-affine.cc (expr_to_aff_combination): Same.
* tree-data-ref.cc (dr_step_indicator): Same.
* tree-dfa.cc (get_ref_base_and_extent): Same.
* tree-scalar-evolution.cc (iv_can_overflow_p): Same.
* tree-ssa-phiopt.cc (two_value_replacement): Same.
* tree-ssa-pre.cc (insert_into_preds_of_block): Same.
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same.
* tree-ssa-strlen.cc (compare_nonzero_chars): Same.
* tree-switch-conversion.cc (bit_test_cluster::emit): Same.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Same.
* tree.cc (get_range_pos_neg): Same.
|
|
This causes a regression in gcc.c-torture/unsorted/dump-noaddr.c.
The test is asserting that two dumps are identical, but they are not
because irange dumps the type which varies between runs:
< VR [irange] void (*<T3dc>) (int) [1, +INF]
> VR [irange] void (*<T3da>) (int) [1, +INF]
I have changed the pretty printer for irange types to pass TDF_NOUID,
thus avoiding this problem.
gcc/ChangeLog:
* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Use
vrange::dump instead of ad-hoc dumper.
* tree-ssa-strlen.cc (dump_strlen_info): Same.
* value-range-pretty-print.cc (visit): Pass TDF_NOUID to
dump_generic_node.
|
|
The legacy range code has logic to swap out of order endpoints in the
irange constructor. The new irange code expects the caller to fix any
inconsistencies, thus speeding up the common case. However, this means
that when we remove legacy, any stragglers must be fixed. This patch
fixes the 3 culprits found during the conversion.
gcc/ChangeLog:
* range-op.cc (operator_cast::op1_range): Use
create_possibly_reversed_range.
(operator_bitwise_and::simple_op1_range_solver): Same.
* value-range.cc (swap_out_of_order_endpoints): Delete.
(irange::set): Remove call to swap_out_of_order_endpoints.
|
|
This patch converts the users of the legacy API to a function called
get_legacy_range() which will return the pieces of the soon to be
removed API (min, max, and kind). This is a temporary measure while
these users are converted.
In upcoming patches I will convert most users, but most of the
middle-end warning uses will remain. Naive attempts to remove them
showed that a lot of these uses are quite dependent on the anti-range
idiom, and converting them to the new API broke the tests, even when
the conversion was conceptually correct. Perhaps someone who
understands these passes could take a stab at it. In the meantime,
the legacy uses can be trivially found by grepping for
get_legacy_range.
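Each converted site is mechanical; roughly (the exact signature is an
assumption based on the ChangeLog below):

  tree min, max;
  value_range_kind kind = get_legacy_range (vr, min, max);
  if (kind == VR_ANTI_RANGE)
    {
      /* Keep handling the anti-range idiom as before.  */
    }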
gcc/ChangeLog:
* builtins.cc (determine_block_size): Convert use of legacy API to
get_legacy_range.
* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same.
(array_bounds_checker::check_array_ref): Same.
* gimple-ssa-warn-restrict.cc
(builtin_memref::extend_offset_range): Same.
* ipa-cp.cc (ipcp_store_vr_results): Same.
* ipa-fnsummary.cc (set_switch_stmt_execution_predicate): Same.
* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same.
(ipa_write_jump_function): Same.
* pointer-query.cc (get_size_range): Same.
* tree-data-ref.cc (split_constant_offset): Same.
* tree-ssa-strlen.cc (get_range): Same.
(maybe_diag_stxncpy_trunc): Same.
(strlen_pass::get_len_or_size): Same.
(strlen_pass::count_nonzero_bytes_addr): Same.
* tree-vect-patterns.cc (vect_get_range_info): Same.
* value-range.cc (irange::maybe_anti_range): Remove.
(get_legacy_range): New.
(irange::copy_to_legacy): Use get_legacy_range.
(ranges_from_anti_range): Same.
* value-range.h (class irange): Remove maybe_anti_range.
(get_legacy_range): New.
* vr-values.cc (check_for_binary_op_overflow): Convert use of
legacy API to get_legacy_range.
(compare_ranges): Same.
(compare_range_with_value): Same.
(bounds_of_var_in_loop): Same.
(find_case_label_ranges): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
|
|
gcc/ChangeLog:
* value-range-pretty-print.cc (vrange_printer::visit): Remove
constant_p use.
* value-range.cc (irange::constant_p): Remove.
(irange::get_nonzero_bits_from_range): Remove constant_p use.
* value-range.h (class irange): Remove constant_p.
(irange::num_pairs): Remove constant_p use.
|
|
gcc/ChangeLog:
* value-range.cc (irange::copy_legacy_to_multi_range): Remove
symbolics support.
(irange::set): Same.
(irange::legacy_lower_bound): Same.
(irange::legacy_upper_bound): Same.
(irange::contains_p): Same.
(range_tests_legacy): Same.
(irange::normalize_addresses): Remove.
(irange::normalize_symbolics): Remove.
(irange::symbolic_p): Remove.
* value-range.h (class irange): Remove symbolic_p,
normalize_symbolics, and normalize_addresses.
* vr-values.cc (simplify_using_ranges::two_valued_val_range_p):
Remove symbolics support.
|
|
The deprecated irange::may_contain_p method differed from contains_p
in that it could handle symbolics, which no longer exist in VRP.
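A sketch of the contains_p-based rewrite (illustrative):

inline bool
range_includes_zero_p (const irange *vr)
{
  if (vr->undefined_p ())
    return false;
  if (vr->varying_p ())
    return true;
  return vr->contains_p (build_zero_cst (vr->type ()));
}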
gcc/ChangeLog:
* value-range.cc (irange::may_contain_p): Remove.
* value-range.h (range_includes_zero_p): Rewrite may_contain_p
usage with contains_p.
* vr-values.cc (compare_range_with_value): Same.
|
|
gcc/ChangeLog:
* tree-vrp.cc (supported_types_p): Remove.
(defined_ranges_p): Remove.
(range_fold_binary_expr): Remove.
(range_fold_unary_expr): Remove.
* tree-vrp.h (range_fold_unary_expr): Remove.
(range_fold_binary_expr): Remove.
|
|
gcc/ChangeLog:
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Convert to ranger API.
(ipa_value_range_from_jfunc): Same.
(propagate_vr_across_jump_function): Same.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
* vr-values.cc (bounds_of_var_in_loop): Same.
|
|
gcc/ChangeLog:
* gimple-array-bounds.cc (array_bounds_checker::get_value_range):
Add irange argument.
(check_out_of_bounds_and_warn): Remove check for vr.
(array_bounds_checker::check_array_ref): Remove pointer qualifier
for vr and adjust accordingly.
* gimple-array-bounds.h (get_value_range): Add irange argument.
* value-query.cc (class equiv_allocator): Delete.
(range_query::get_value_range): Delete.
(range_query::range_query): Remove allocator access.
(range_query::~range_query): Same.
* value-query.h (get_value_range): Delete.
* vr-values.cc
(simplify_using_ranges::op_with_boolean_value_range_p): Remove
call to get_value_range.
(check_for_binary_op_overflow): Same.
(simplify_using_ranges::legacy_fold_cond_overflow): Same.
(simplify_using_ranges::simplify_abs_using_ranges): Same.
(simplify_using_ranges::simplify_cond_using_ranges_1): Same.
(simplify_using_ranges::simplify_casted_cond): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
(simplify_using_ranges::two_valued_val_range_p): Same.
|
|
gcc/ChangeLog:
* vr-values.cc
(simplify_using_ranges::vrp_evaluate_conditional_warnv_with_ops):
Rename to...
(simplify_using_ranges::legacy_fold_cond_overflow): ...this.
(simplify_using_ranges::vrp_visit_cond_stmt): Rename to...
(simplify_using_ranges::legacy_fold_cond): ...this.
(simplify_using_ranges::fold_cond): Rename
vrp_evaluate_conditional_warnv_with_ops to
legacy_fold_cond_overflow.
* vr-values.h (class vr_values): Replace vrp_visit_cond_stmt and
vrp_evaluate_conditional_warnv_with_ops with legacy_fold_cond and
legacy_fold_cond_overflow respectively.
|
|
In a test run I have asserted that the legacy conditional folding only
gets overflows, so this removal is safe.
gcc/ChangeLog:
* vr-values.cc (get_vr_for_comparison): Remove.
(compare_name_with_value): Same.
(vrp_evaluate_conditional_warnv_with_ops): Remove calls to
compare_name_with_value.
* vr-values.h: Remove compare_name_with_value.
Remove get_vr_for_comparison.
|
|
This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
words) instructions. The most obvious application of these is to implement
the __builtin_bswap16 and __builtin_bswap32 intrinsics.
Currently, __builtin_bswap16 is implemented as:
foo: mov r7,r2
shl r7,#8
shr r2,#8
or r2,r7
ret
but with this patch becomes:
foo: swpb r2
ret
Likewise, __builtin_bswap32 now becomes:
foo: swpb r2 | swpb r3 | swpw r2,r3
ret
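For reference, the C sources behind the two examples would be along
the lines of (illustrative):

unsigned short foo16 (unsigned short x) { return __builtin_bswap16 (x); }
unsigned long  foo32 (unsigned long  x) { return __builtin_bswap32 (x); }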
Finally, the swpw instruction on its own can be used to exchange
two word mode registers without a temporary, so a new pattern and
peephole2 have been added to catch this. As described in
PR rtl-optimization/106518, register allocation can (in theory)
be more efficient on targets that provide a swap/exchange instruction.
The slightly unusual swap<mode> naming matches that used in i386.md.
2023-04-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/stormy16/stormy16.md (bswaphi2): New define_insn.
(bswapsi2): New define_insn.
(swaphi): New define_insn to exchange two registers (swpw).
(define_peephole2): Recognize exchange of registers as swaphi.
gcc/testsuite/ChangeLog
* gcc.target/xstormy16/bswap16.c: New test case.
* gcc.target/xstormy16/bswap32.c: Likewise.
* gcc.target/xstormy16/swpb.c: Likewise.
* gcc.target/xstormy16/swpw-1.c: Likewise.
* gcc.target/xstormy16/swpw-2.c: Likewise.
|
|
This adjusts more users of last_stmt where it is clear that debug
stmt skipping is unnecessary. In most cases this also allowed
significant code simplification.
gcc/c/
* gimple-parser.cc (c_parser_parse_gimple_body): Avoid
last_stmt.
gcc/
* gimple-range-path.cc (path_range_query::compute_outgoing_relations):
Avoid last_stmt.
* ipa-pure-const.cc (pass_nothrow::execute): Likewise.
* predict.cc (apply_return_prediction): Likewise.
* sese.cc (set_ifsese_condition): Likewise. Simplify.
* tree-cfg.cc (assert_unreachable_fallthru_edge_p): Avoid last_stmt.
(make_edges_bb): Likewise.
(make_cond_expr_edges): Likewise.
(end_recording_case_labels): Likewise.
(make_gimple_asm_edges): Likewise.
(cleanup_dead_labels): Likewise.
(group_case_labels): Likewise.
(gimple_can_merge_blocks_p): Likewise.
(gimple_merge_blocks): Likewise.
(find_taken_edge): Likewise. Also handle empty fallthru blocks.
(gimple_duplicate_sese_tail): Avoid last_stmt.
(find_loop_dist_alias): Likewise.
(gimple_block_ends_with_condjump_p): Likewise.
(gimple_purge_dead_eh_edges): Likewise.
(gimple_purge_dead_abnormal_call_edges): Likewise.
(pass_warn_function_return::execute): Likewise.
(execute_fixup_cfg): Likewise.
* tree-eh.cc (redirect_eh_edge_1): Likewise.
(pass_lower_resx::execute): Likewise.
(pass_lower_eh_dispatch::execute): Likewise.
(cleanup_empty_eh): Likewise.
* tree-if-conv.cc (if_convertible_bb_p): Likewise.
(predicate_bbs): Likewise.
(ifcvt_split_critical_edges): Likewise.
* tree-loop-distribution.cc (create_edge_for_control_dependence):
Likewise.
(loop_distribution::transform_reduction_loop): Likewise.
* tree-parloops.cc (transform_to_exit_first_loop_alt): Likewise.
(try_transform_to_exit_first_loop_alt): Likewise.
(transform_to_exit_first_loop): Likewise.
(create_parallel_loop): Likewise.
* tree-scalar-evolution.cc (get_loop_exit_condition): Likewise.
* tree-ssa-dce.cc (mark_last_stmt_necessary): Likewise.
(eliminate_unnecessary_stmts): Likewise.
* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
Likewise.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Likewise.
(pass_tree_ifcombine::execute): Likewise.
* tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Likewise.
(should_duplicate_loop_header_p): Likewise.
* tree-ssa-loop-ivcanon.cc (create_canonical_iv): Likewise.
(tree_estimate_loop_size): Likewise.
(try_unroll_loop_completely): Likewise.
* tree-ssa-loop-ivopts.cc (tree_ssa_iv_optimize_loop): Likewise.
* tree-ssa-loop-manip.cc (ip_normal_pos): Likewise.
(canonicalize_loop_ivs): Likewise.
* tree-ssa-loop-niter.cc (determine_value_range): Likewise.
(bound_difference): Likewise.
(number_of_iterations_popcount): Likewise.
(number_of_iterations_cltz): Likewise.
(number_of_iterations_cltz_complement): Likewise.
(simplify_using_initial_conditions): Likewise.
(number_of_iterations_exit_assumptions): Likewise.
(loop_niter_by_eval): Likewise.
(estimate_numbers_of_iterations): Likewise.
|
|
gcc/ChangeLog:
* config/riscv/vector.md: Refine vmadc/vmsbc RA constraint.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
|
|
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used
to be guarded with condition TARGET_P9_VECTOR before the new
bif framework was introduced (r12-5752-gd08236359eb229).
Since r12-5752 they are placed under stanza ieee128-hw,
that is, they check condition TARGET_FLOAT128_HW; this caused
test case float128-cmp2-runnable.c to fail at -m32, as the
condition TARGET_FLOAT128_HW isn't satisfied with -m32.
Checking the commit history, I didn't see any notes on why this
condition change was made for them, so this patch
is to move these bifs from stanza ieee128-hw to stanza
power9-vector as before.
PR target/108758
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt
__builtin_vsx_scalar_cmp_exp_qp_lt,
__builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw
to power9-vector.
|
|
As PR109069 shows, commit r12-6537-g080a06fcb076b3, which
introduces define_insn_and_split sldoi_to_mov, adopts
easy_vector_constant for the const vector of interest, but that's
wrong since predicate easy_vector_constant doesn't guarantee
each byte in the const vector is the same. One counter
example is the const vector in pr109069-1.c. This patch
introduces a new predicate const_vector_each_byte_same to
ensure all bytes in the given const vector are the same,
considering both int and float. Meanwhile, for constants
which don't meet easy_vector_constant we need to gen a move
instead of just a set, and we use VECTOR_MEM_ALTIVEC_OR_VSX_P
rather than VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support
under VSX, since the vector long long type of vec_sld is guarded
under stanza vsx.
PR target/109069
gcc/ChangeLog:
* config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate
easy_vector_constant with const_vector_each_byte_same, add
handlings in preparation for !easy_vector_constant, and update
VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P.
* config/rs6000/predicates.md (const_vector_each_byte_same): New
predicate.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109069-1.c: New test.
* gcc.target/powerpc/pr109069-2-run.c: New test.
* gcc.target/powerpc/pr109069-2.c: New test.
* gcc.target/powerpc/pr109069-2.h: New test.
|
|
The current RA constraint for RVV comparison instructions does not
allow any overlap between the dest register and the source operands.
For example:
vmseq.vv vd, vs2, vs1
If LMUL = 8, vs2 = v8, vs1 = v16:
With the current RA constraint, GCC does not allow vd to be any regno in v8 ~ v23.
However, this is too conservative and not required by the RVV ISA.
Since the dest EEW of a comparison is always EEW = 1, it always follows the
overlap rules for dest EEW < source EEW. So in this case we should give the RA
the chance to allocate v8 or v16 for vd, so that we can have better vector
register usage in RA.
gcc/ChangeLog:
* config/riscv/vector.md (*pred_cmp<mode>_merge_tie_mask): New pattern.
(*pred_ltge<mode>_merge_tie_mask): Ditto.
(*pred_cmp<mode>_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
(*pred_cmp<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_cmp<mode>_narrow_merge_tie_mask): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase.
* gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
|
|
The current expansion of vmsge makes the RA produce a redundant vmv1r.v.
testcase:
void f1 (void * in, void *out, int32_t x)
{
vbool32_t mask = *(vbool32_t*)in;
asm volatile ("":::"memory");
vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in, 4);
vbool32_t m3 = __riscv_vmsge_vx_i32m1_b32 (v, x, 4);
vbool32_t m4 = __riscv_vmsge_vx_i32m1_b32_mu (mask, m3, v, x, 4);
m4 = __riscv_vmsge_vv_i32m1_b32_m (m4, v2, v2, 4);
__riscv_vsm_v_b32 (out, m4, 4);
}
Before this patch:
f1:
vsetvli a5,zero,e8,mf4,ta,ma
vlm.v v0,0(a0)
vsetivli zero,4,e32,m1,ta,mu
vle32.v v3,0(a0)
vle32.v v2,0(a0),v0.t
vmslt.vx v1,v3,a2
vmnot.m v1,v1
vmslt.vx v1,v3,a2,v0.t
vmxor.mm v1,v1,v0
vmv1r.v v0,v1
vmsge.vv v2,v2,v2,v0.t
vsm.v v2,0(a1)
ret
After this patch:
f1:
vsetvli a5,zero,e8,mf4,ta,ma
vlm.v v0,0(a0)
vsetivli zero,4,e32,m1,ta,mu
vle32.v v3,0(a0)
vle32.v v2,0(a0),v0.t
vmslt.vx v1,v3,a2
vmnot.m v1,v1
vmslt.vx v1,v3,a2,v0.t
vmxor.mm v0,v1,v0
vmsge.vv v2,v2,v2,v0.t
vsm.v v2,0(a1)
ret
gcc/ChangeLog:
* config/riscv/vector.md: Fix redundant vmv1r.v.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Adapt assembly
check.
|
|
For DEST EEW < SOURCE EEW, we can partially overlap registers
according to the RVV ISA.
gcc/ChangeLog:
* config/riscv/vector.md: Fix RA constraint.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.
|