|
A previous patch added an optab for builtin isnormal, so builtin isnormal
might not be folded in the front end. The range op for isnormal is therefore
needed for value range analysis. This patch adds the range op for builtin
isnormal.
gcc/
* gimple-range-op.cc (class cfn_isnormal): New.
(op_cfn_isnormal): New variable.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISNORMAL.
* value-range.h (class frange): Declare known_isnormal and
known_isdenormal_or_zero.
(frange::known_isnormal): Define.
(frange::known_isdenormal_or_zero): Define.
gcc/testsuite/
* gcc.dg/tree-ssa/range-isnormal.c: New test.
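As a rough standalone sketch of the property the new range op must
establish (hypothetical helper over plain doubles; the real code works on
frange bounds and REAL_VALUE_TYPEs):

#include <cmath>
#include <limits>

// A range [LB, UB] is known to contain only normal values when it
// excludes NANs, infinities, zero and the denormals around zero.
bool
known_isnormal (double lb, double ub, bool maybe_nan)
{
  const double tiny = std::numeric_limits<double>::min (); // smallest normal
  if (maybe_nan || std::isinf (lb) || std::isinf (ub))
    return false;
  return ub <= -tiny || lb >= tiny;
}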
|
|
This reverts commit r14-6478-gfda8e2f8292a90 "range:
Workaround different type precision between _Float128 and
long double [PR112788]". As the fixes for PR112993 make
all 128-bit scalar floating point types have the same 128-bit
precision, this workaround isn't needed any more.
PR target/112993
gcc/ChangeLog:
* value-range.h (range_compatible_p): Remove the workaround on
different type precision between _Float128 and long double.
|
|
I noticed a warning from clang about int_range's dtor being marked
as final, saying that this means the class cannot be inherited from.
So let's mark the few ranger classes which we know will stay leaf
classes as final.
Bootstrapped and tested on x86_64-linux-gnu.
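For reference, a minimal illustration of what clang is pointing out
(illustrative stub types, not the real classes):

struct vrange_demo
{
  virtual ~vrange_demo () {}
};

// Marking only the dtor final already forbids deriving from the class...
struct irange_demo : vrange_demo
{
  ~irange_demo () final {}
};

// ...so clang suggests saying it on the class itself, which is clearer.
struct frange_demo final : vrange_demo
{
};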
gcc/ChangeLog:
* value-range.h (class int_range): Mark as final.
(class prange): Likewise.
(class frange): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Now that all remaining users of the old value_range have been converted
to int_range<>, we can reclaim the value_range name for the polymorphic
temporary, thus removing the annoying CamelCase.
gcc/ChangeLog:
* data-streamer-in.cc (streamer_read_value_range): Rename
Value_Range to value_range.
* data-streamer.h (streamer_read_value_range): Same.
* gimple-pretty-print.cc (dump_ssaname_info): Same.
* gimple-range-cache.cc (ssa_block_ranges::dump): Same.
(ssa_lazy_cache::merge): Same.
(block_range_cache::dump): Same.
(ssa_cache::merge_range): Same.
(ssa_cache::dump): Same.
(ranger_cache::edge_range): Same.
(ranger_cache::propagate_cache): Same.
(ranger_cache::fill_block_cache): Same.
(ranger_cache::resolve_dom): Same.
(ranger_cache::range_from_dom): Same.
(ranger_cache::register_inferred_value): Same.
* gimple-range-fold.cc (op1_range): Same.
(op2_range): Same.
(fold_relations): Same.
(fold_using_range::range_of_range_op): Same.
(fold_using_range::range_of_phi): Same.
(fold_using_range::range_of_call): Same.
(fold_using_range::condexpr_adjust): Same.
(fold_using_range::range_of_cond_expr): Same.
(fur_source::register_outgoing_edges): Same.
* gimple-range-fold.h (gimple_range_type): Same.
(gimple_range_ssa_p): Same.
* gimple-range-gori.cc (gori_compute::compute_operand_range): Same.
(gori_compute::logical_combine): Same.
(gori_compute::refine_using_relation): Same.
(gori_compute::compute_operand1_range): Same.
(gori_compute::compute_operand2_range): Same.
(gori_compute::compute_operand1_and_operand2_range): Same.
(gori_calc_operands): Same.
(gori_name_helper): Same.
* gimple-range-infer.cc (gimple_infer_range::check_assume_func): Same.
(gimple_infer_range::gimple_infer_range): Same.
(infer_range_manager::maybe_adjust_range): Same.
(infer_range_manager::add_range): Same.
* gimple-range-infer.h: Same.
* gimple-range-op.cc
(gimple_range_op_handler::gimple_range_op_handler): Same.
(gimple_range_op_handler::calc_op1): Same.
(gimple_range_op_handler::calc_op2): Same.
(gimple_range_op_handler::maybe_builtin_call): Same.
* gimple-range-path.cc (path_range_query::internal_range_of_expr): Same.
(path_range_query::ssa_range_in_phi): Same.
(path_range_query::compute_ranges_in_phis): Same.
(path_range_query::compute_ranges_in_block): Same.
(path_range_query::add_to_exit_dependencies): Same.
* gimple-range-trace.cc (debug_seed_ranger): Same.
* gimple-range.cc (gimple_ranger::range_of_expr): Same.
(gimple_ranger::range_on_entry): Same.
(gimple_ranger::range_on_edge): Same.
(gimple_ranger::range_of_stmt): Same.
(gimple_ranger::prefill_stmt_dependencies): Same.
(gimple_ranger::register_inferred_ranges): Same.
(gimple_ranger::register_transitive_inferred_ranges): Same.
(gimple_ranger::export_global_ranges): Same.
(gimple_ranger::dump_bb): Same.
(assume_query::calculate_op): Same.
(assume_query::calculate_phi): Same.
(assume_query::dump): Same.
(dom_ranger::range_of_stmt): Same.
* ipa-cp.cc (ipcp_vr_lattice::meet_with_1): Same.
(ipa_vr_operation_and_type_effects): Same.
(ipa_value_range_from_jfunc): Same.
(propagate_bits_across_jump_function): Same.
(propagate_vr_across_jump_function): Same.
(ipcp_store_vr_results): Same.
* ipa-cp.h: Same.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same.
(evaluate_properties_for_edge): Same.
* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same.
(ipa_vr::get_vrange): Same.
(ipa_vr::streamer_read): Same.
(ipa_vr::streamer_write): Same.
(ipa_vr::dump): Same.
(ipa_set_jfunc_vr): Same.
(ipa_compute_jump_functions_for_edge): Same.
(ipcp_get_parm_bits): Same.
(ipcp_update_vr): Same.
(ipa_record_return_value_range): Same.
(ipa_return_value_range): Same.
* ipa-prop.h (ipa_return_value_range): Same.
(ipa_record_return_value_range): Same.
* range-op.h (range_cast): Same.
* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges): Same.
(cprop_operand): Same.
* tree-ssa-loop-ch.cc (loop_static_stmt_p): Same.
* tree-ssa-loop-niter.cc (record_nonwrapping_iv): Same.
* tree-ssa-loop-split.cc (split_at_bb_p): Same.
* tree-ssa-phiopt.cc (value_replacement): Same.
* tree-ssa-strlen.cc (get_range): Same.
* tree-ssa-threadedge.cc (hybrid_jt_simplifier::simplify): Same.
(hybrid_jt_simplifier::compute_exit_dependencies): Same.
* tree-ssanames.cc (set_range_info): Same.
(duplicate_ssa_name_range_info): Same.
* tree-vrp.cc (remove_unreachable::handle_early): Same.
(remove_unreachable::remove_and_update_globals): Same.
(execute_ranger_vrp): Same.
* value-query.cc (range_query::value_of_expr): Same.
(range_query::value_on_edge): Same.
(range_query::value_of_stmt): Same.
(range_query::value_on_entry): Same.
(range_query::value_on_exit): Same.
(range_query::get_tree_range): Same.
* value-range-storage.cc (vrange_storage::set_vrange): Same.
* value-range.cc (Value_Range::dump): Same.
(value_range::dump): Same.
(debug): Same.
* value-range.h (enum value_range_discriminator): Same.
(class vrange): Same.
(class Value_Range): Same.
(class value_range): Same.
(Value_Range::Value_Range): Same.
(value_range::value_range): Same.
(Value_Range::~Value_Range): Same.
(value_range::~value_range): Same.
(Value_Range::set_type): Same.
(value_range::set_type): Same.
(Value_Range::init): Same.
(value_range::init): Same.
(Value_Range::operator=): Same.
(value_range::operator=): Same.
(Value_Range::operator==): Same.
(value_range::operator==): Same.
(Value_Range::operator!=): Same.
(value_range::operator!=): Same.
(Value_Range::supports_type_p): Same.
(value_range::supports_type_p): Same.
* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Same.
(simplify_using_ranges::legacy_fold_cond): Same.
|
|
Now that pointers and integers have been disambiguated from irange,
and all the pointer range temporaries use prange, we can reclaim
value_range as a general purpose range container.
This patch removes the typedef, in favor of int_range_max, thus
providing slightly better ranges in places. I have also used
int_range<1> or <2> when it's known ahead of time how big the range will
be, thus saving a few words.
In a follow-up patch I will rename the Value_Range temporary to
value_range.
No change in performance.
gcc/ChangeLog:
* builtins.cc (expand_builtin_strnlen): Replace value_range use
with int_range_max or irange when appropriate.
(determine_block_size): Same.
* fold-const.cc (minmax_from_comparison): Same.
* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same.
(array_bounds_checker::check_array_ref): Same.
* gimple-fold.cc (size_must_be_zero_p): Same.
* gimple-predicate-analysis.cc (find_var_cmp_const): Same.
* gimple-ssa-sprintf.cc (get_int_range): Same.
(format_integer): Same.
(try_substitute_return_value): Same.
(handle_printf_call): Same.
* gimple-ssa-warn-restrict.cc
(builtin_memref::extend_offset_range): Same.
* graphite-sese-to-poly.cc (add_param_constraints): Same.
* internal-fn.cc (get_min_precision): Same.
* match.pd: Same.
* pointer-query.cc (get_size_range): Same.
* range-op.cc (get_shift_range): Same.
(operator_trunc_mod::op1_range): Same.
(operator_trunc_mod::op2_range): Same.
* range.cc (range_negatives): Same.
* range.h (range_positives): Same.
(range_negatives): Same.
* tree-affine.cc (expr_to_aff_combination): Same.
* tree-data-ref.cc (compute_distributive_range): Same.
(nop_conversion_for_offset_p): Same.
(split_constant_offset): Same.
(split_constant_offset_1): Same.
(dr_step_indicator): Same.
* tree-dfa.cc (get_ref_base_and_extent): Same.
* tree-scalar-evolution.cc (iv_can_overflow_p): Same.
* tree-ssa-math-opts.cc (optimize_spaceship): Same.
* tree-ssa-pre.cc (insert_into_preds_of_block): Same.
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same.
* tree-ssa-strlen.cc (compare_nonzero_chars): Same.
(dump_strlen_info): Same.
(get_range_strlen_dynamic): Same.
(set_strlen_range): Same.
(maybe_diag_stxncpy_trunc): Same.
(strlen_pass::get_len_or_size): Same.
(strlen_pass::handle_builtin_string_cmp): Same.
(strlen_pass::count_nonzero_bytes_addr): Same.
(strlen_pass::handle_integral_assign): Same.
* tree-switch-conversion.cc (bit_test_cluster::emit): Same.
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Same.
(vect_do_peeling): Same.
* tree-vect-patterns.cc (vect_get_range_info): Same.
(vect_recog_divmod_pattern): Same.
* tree.cc (get_range_pos_neg): Same.
* value-range.cc (debug): Remove value_range variants.
* value-range.h (value_range): Remove typedef.
* vr-values.cc
(simplify_using_ranges::op_with_boolean_value_range_p): Replace
value_range use with int_range_max or irange when appropriate.
(check_for_binary_op_overflow): Same.
(simplify_using_ranges::legacy_fold_cond_overflow): Same.
(find_case_label_ranges): Same.
(simplify_using_ranges::simplify_abs_using_ranges): Same.
(test_for_singularity): Same.
(simplify_using_ranges::simplify_compare_using_ranges_1): Same.
(simplify_using_ranges::simplify_casted_compare): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
(simplify_conversion_using_ranges): Same.
(simplify_using_ranges::two_valued_val_range_p): Same.
|
|
This reverts commit d7bb8eaade3cd3aa70715c8567b4d7b08098e699 and enables prange
support again.
|
|
This reverts commit 36e877996936abd8bd08f8b1d983c8d1023a5842 until the IPA
pass is fixed with regards to POINTER = POINTER <RELOP> POINTER.
|
|
gcc/ChangeLog:
PR tree-optimization/114912
* value-range.h (class Value_Range): Use a union.
|
|
This throws the switch on prange. After this patch, it is no longer
valid to store a pointer in an irange (or vice versa). Instead, they
must go in prange, which is faster and more memory efficient.
I will push this now, so I have time to do any follow-up bugfixing
before going on paternity leave.
There are various cleanups we plan on doing after this patch (faster
intersect/union, remove range-op-mixed.h, remove value_range in favor
of int_range_max, reclaim the name for the Value_Range temporary,
clean up range-ops, etc etc). But we will hold off on those for now
to make it easier to revert this patch, if for some reason we need to
do so while I'm away.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-cache.cc (sbr_sparse_bitmap::sbr_sparse_bitmap):
Change irange to prange.
* gimple-range-fold.cc (fold_using_range::fold_stmt): Same.
(fold_using_range::range_of_address): Same.
* gimple-range-fold.h (range_of_address): Same.
* gimple-range-infer.cc (gimple_infer_range::add_nonzero): Same.
* gimple-range-op.cc (class cfn_strlen): Same.
* gimple-range-path.cc
(path_range_query::adjust_for_non_null_uses): Same.
* gimple-ssa-warn-access.cc (pass_waccess::check_pointer_uses): Same.
* tree-ssa-structalias.cc (find_what_p_points_to): Same.
* range-op-ptr.cc (range_op_table::initialize_pointer_ops): Remove
hybrid entries in table.
* range-op.cc (range_op_table::range_op_table): Add pointer
entries for bitwise and/or and min/max.
* value-range.cc (irange::verify_range): Add assert.
* value-range.h (irange::varying_compatible_p): Remove check for
error_mark_node.
(irange::supports_p): Remove pointer support.
* ipa-cp.h (ipa_supports_p): Add prange support.
|
|
This provides a bare prange class with bounds and bitmasks. It will
be a drop-in replacement for pointer ranges, so we can pull their
support from irange. The range-op code will be contributed as a
follow-up.
The code is disabled by default, as irange::supports_p still accepts
pointers:
inline bool
irange::supports_p (const_tree type)
{
return INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type);
}
Once the prange operators are implemented in range-ops, pointer
support will be removed from irange to activate pranges.
gcc/ChangeLog:
* value-range-pretty-print.cc (vrange_printer::visit): New.
* value-range-pretty-print.h: Declare prange visit() method.
* value-range.cc (vrange::operator=): Add prange support.
(vrange::operator==): Same.
(prange::accept): New.
(prange::set_nonnegative): New.
(prange::set): New.
(prange::contains_p): New.
(prange::singleton_p): New.
(prange::lbound): New.
(prange::ubound): New.
(prange::union_): New.
(prange::intersect): New.
(prange::operator=): New.
(prange::operator==): New.
(prange::invert): New.
(prange::verify_range): New.
(prange::update_bitmask): New.
(range_tests_misc): Use prange.
* value-range.h (enum value_range_discriminator): Add VR_PRANGE.
(class prange): New.
(Value_Range::init): Add prange support.
(Value_Range::operator=): Same.
(Value_Range::supports_type_p): Same.
(prange::prange): New.
(prange::supports_p): New.
(prange::supports_type_p): New.
(prange::set_undefined): New.
(prange::set_varying): New.
(prange::set_nonzero): New.
(prange::set_zero): New.
(prange::contains_p): New.
(prange::zero_p): New.
(prange::nonzero_p): New.
(prange::type): New.
(prange::lower_bound): New.
(prange::upper_bound): New.
(prange::varying_compatible_p): New.
(prange::get_bitmask): New.
(prange::fits_p): New.
|
|
There is a 2% slowdown to VRP unrelated to the work at hand. This patch
is a skeleton implementation of prange that exhibits this degradation. It
is meant as a place in the commit history we can return to in order to revisit
the issue.
The relevant discussion is here:
https://gcc.gnu.org/pipermail/gcc/2024-May/243898.html
gcc/ChangeLog:
* value-range.h (class prange): New.
|
|
Single-argument static_assert is C++17-only.
gcc/ChangeLog:
* value-range.h: Fix static_assert to use 2 arguments.
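A minimal illustration of the fix (hypothetical condition):

// C++17-only form:
//   static_assert (sizeof (some_type) <= 16);
// C++11/14-compatible form, with an explicit message:
static_assert (sizeof (long long) <= 16, "unexpected size");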
|
|
Value_Range is our polymorphic temporary that can hold any range. It
is used for type-agnostic code where it isn't known ahead of time what
the type of the range will be (irange, frange, etc). Currently, there
is a temporary of each type in the object, which means we need to
construct each range for every temporary. This doesn't scale well
now that prange is about to add yet another range type.
This patch removes the per-type members, opting to use placement new
into a byte
buffer sufficiently large to hold ranges of any type. It reduces the
memory footprint by 14% for every Value_Range temporary (from 792 to
680 bytes), and we are guaranteed it will never again grow as we add
more range types (strings, complex numbers, etc).
Surprisingly, it improves VRP performance by 6.61% and overall
compilation by 0.44%, which is a lot more than we bargained for
when we started working on prange performance.
There is a slight change in semantics for Value_Range. The default
constructor does not initialize the object at all. It must be
manually initialized with either Value_Range::set_type(), or by
assigning a range to it. This means that IPA's m_known_value_ranges
must be initialized at allocation, instead of depending on the empty
constructor to initialize it to VR_UNDEFINED for unsupported_range.
I have taken the time to properly document both the class, and each
method. If anything isn't clear, please let me know so I can adjust it
accordingly.
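Here is the mechanism as a standalone sketch with hypothetical stub types
(the real class juggles irange, frange and friends, and supports
assignment of any vrange):

#include <new>
#include <cstddef>
#include <algorithm>

struct vrange_stub { virtual ~vrange_stub () {} };
struct irange_stub : vrange_stub { long lb, ub; };
struct frange_stub : vrange_stub { double lb, ub; };

class value_range_sketch
{
public:
  // Deliberately uninitialized; callers must pick a type first.
  value_range_sketch () : m_active (nullptr) {}
  value_range_sketch (const value_range_sketch &) = delete;            // elided
  value_range_sketch &operator= (const value_range_sketch &) = delete; // elided
  ~value_range_sketch () { destroy (); }

  // Construct the right range type in place inside the shared buffer.
  void set_type_int () { destroy (); m_active = new (m_buffer) irange_stub (); }
  void set_type_float () { destroy (); m_active = new (m_buffer) frange_stub (); }

private:
  void destroy ()
  {
    if (m_active)
      m_active->~vrange_stub ();
    m_active = nullptr;
  }

  vrange_stub *m_active;
  // One buffer big (and aligned) enough for any of the range types.
  alignas (std::max_align_t)
    char m_buffer[std::max (sizeof (irange_stub), sizeof (frange_stub))];
};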
gcc/ChangeLog:
* ipa-fnsummary.cc (evaluate_properties_for_edge): Initialize Value_Range's.
* value-range.h (class Value_Range): Add a buffer and remove
m_irange and m_frange.
(Value_Range::Value_Range): Call init.
(Value_Range::set_type): Same.
(Value_Range::init): Use in place new to initialize buffer.
(Value_Range::operator=): Tidy.
|
|
Here are some cleanups to unsupported_range so the assignment operator
takes an unsupported_range and behaves like the other ranges. This
makes subsequent cleanups easier.
gcc/ChangeLog:
* value-range.cc (unsupported_range::union_): Cast vrange to
unsupported_range.
(unsupported_range::intersect): Same.
(unsupported_range::operator=): Make argument an unsupported_range.
* value-range.h: New constructor.
|
|
As per the documentation, irange_bitmask must have the unknown bits
(those set in the mask) set to 0 in the value field. Even though we
say we must have normalized value/mask bits, we don't enforce it,
opting to normalize on the fly in union and intersect. Avoiding this
lazy enforcement, as well as the extra saving/restoring involved in
returning the changed status, gives us a performance increase of 1.25%
for VRP and 1.51% for ipa-CP.
gcc/ChangeLog:
* tree-ssa-ccp.cc (ccp_finalize): Normalize before calling
set_bitmask.
* value-range.cc (irange::intersect_bitmask): Calculate changed
irange_bitmask bits on our own.
(irange::union_bitmask): Same.
(irange_bitmask::verify_mask): Verify that bits are normalized.
* value-range.h (irange_bitmask::union_): Do not normalize.
Remove return value.
(irange_bitmask::intersect): Same.
|
|
Accept a vrange, as this will be used for either integers or pointers.
gcc/ChangeLog:
* value-range.h (range_includes_zero_p): Accept vrange.
|
|
prange will also have bitmasks, so it will need to use get_bitmask_from_range.
gcc/ChangeLog:
* value-range.cc (get_bitmask_from_range): Move out of irange class.
(irange::get_bitmask): Call function instead of internal method.
* value-range.h (class irange): Remove get_bitmask_from_range.
|
|
In preparation for prange, make get_legacy_range take a generic
vrange, not just an irange.
gcc/ChangeLog:
* value-range.cc (get_legacy_range): Make static and add another
version of get_legacy_range that takes a vrange.
* value-range.h (class irange): Remove unnecessary friendship with
get_legacy_range.
(get_legacy_range): Accept a vrange.
|
|
Make range_includes_zero_p take a reference instead of a pointer, for
consistency in the range-op code.
gcc/ChangeLog:
* gimple-range-op.cc (cfn_clz::fold_range): Change
range_includes_zero_p argument to a reference.
(cfn_ctz::fold_range): Same.
* range-op.cc (operator_plus::lhs_op1_relation): Same.
* value-range.h (range_includes_zero_p): Same.
|
|
Now that we have a vrange storage class to save ranges in long-term
memory, there is no need for GTY markers for any of the vrange
classes, since they should never live in GC.
gcc/ChangeLog:
* value-range-storage.h: Remove friends.
* value-range.cc (gt_ggc_mx): Remove.
(gt_pch_nx): Remove.
* value-range.h (class vrange): Remove GTY markers.
(class irange): Same.
(class int_range): Same.
(class frange): Same.
(gt_ggc_mx): Remove.
(gt_pch_nx): Remove.
|
|
Any range can theoretically have a bitmask of set bits. This patch
moves the bitmask accessors to the base class. This cleans up some
users in IPA*, and will provide a cleaner interface when prange is in
place.
gcc/ChangeLog:
* ipa-cp.cc (propagate_bits_across_jump_function): Access bitmask
through base class.
(ipcp_store_vr_results): Same.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
(ipcp_get_parm_bits): Same.
(ipcp_update_vr): Same.
* range-op-mixed.h (update_known_bitmask): Change argument to vrange.
* range-op.cc (update_known_bitmask): Same.
* value-range.cc (vrange::update_bitmask): New.
(irange::set_nonzero_bits): Move to vrange class.
(irange::get_nonzero_bits): Same.
* value-range.h (class vrange): Add update_bitmask, get_bitmask,
get_nonzero_bits, and set_nonzero_bits.
(class irange): Make bitmask methods virtual overrides.
(class Value_Range): Add get_bitmask and update_bitmask.
|
|
This patch adds vrange::lbound() and vrange::ubound() that return
trees. These can be used in generic code that is type agnostic, and
avoids special casing for pointers and integers in places where we
handle both. It also cleans up a wart in the Value_Range class.
gcc/ChangeLog:
* tree-ssa-loop-niter.cc (refine_value_range_using_guard): Convert
bound to wide_int.
* value-range.cc (Value_Range::lower_bound): Remove.
(Value_Range::upper_bound): Remove.
(unsupported_range::lbound): New.
(unsupported_range::ubound): New.
(frange::lbound): New.
(frange::ubound): New.
(irange::lbound): New.
(irange::ubound): New.
* value-range.h (class vrange): Add lbound() and ubound().
(class irange): Same.
(class frange): Same.
(class unsupported_range): Same.
(class Value_Range): Rename lower_bound and upper_bound to lbound
and ubound respectively.
|
|
Richi mentioned in PR113476 that it would be cleaner to move the
destructor from int_range to the base class. Although this isn't
strictly necessary, as there are no users, it is good to future-proof
things, and the overall impact is minuscule.
gcc/ChangeLog:
* value-range.h (vrange::~vrange): New.
(int_range::~int_range): Make final override.
|
|
Explicitly make vrange an abstract class. This involves fleshing out
the unsupported_range overrides which we were inheriting by default
from vrange.
gcc/ChangeLog:
* value-range.cc (unsupported_range::accept): Move down.
(vrange::contains_p): Rename to...
(unsupported_range::contains_p): ...this.
(vrange::singleton_p): Rename to...
(unsupported_range::singleton_p): ...this.
(vrange::set): Rename to...
(unsupported_range::set): ...this.
(vrange::type): Rename to...
(unsupported_range::type): ...this.
(vrange::supports_type_p): Rename to...
(unsupported_range::supports_type_p): ...this.
(vrange::set_undefined): Rename to...
(unsupported_range::set_undefined): ...this.
(vrange::set_varying): Rename to...
(unsupported_range::set_varying): ...this.
(vrange::union_): Rename to...
(unsupported_range::union_): ...this.
(vrange::intersect): Rename to...
(unsupported_range::intersect): ...this.
(vrange::zero_p): Rename to...
(unsupported_range::zero_p): ...this.
(vrange::nonzero_p): Rename to...
(unsupported_range::nonzero_p): ...this.
(vrange::set_nonzero): Rename to...
(unsupported_range::set_nonzero): ...this.
(vrange::set_zero): Rename to...
(unsupported_range::set_zero): ...this.
(vrange::set_nonnegative): Rename to...
(unsupported_range::set_nonnegative): ...this.
(vrange::fits_p): Rename to...
(unsupported_range::fits_p): ...this.
(unsupported_range::operator=): New.
(frange::fits_p): New.
* value-range.h (class vrange): Make an abstract class.
(class unsupported_range): Declare override methods.
|
|
[PR112788]
As PR112788 shows, on rs6000 with -mabi=ieeelongdouble, type _Float128
has a different type precision (128) from that (127) of type long
double, but they actually have the same underlying mode, so they should
have the same precision, as the mode indicates the same real type
format ieee_quad_format.
It's not sensible to have two such types which have the same mode but
different type precisions; a fix attempt was posted at [1].
As the discussion there shows, there are some historical reasons and
practical issues. Considering we are past stage 1 and this also affects
the build as reported, this patch works around the issue temporarily.
I thought about introducing a hook, but that seems a bit overkill;
assuming that scalar float types with the same mode have the same
precision looks sensible.
[1] https://inbox.sourceware.org/gcc-patches/718677e7-614d-7977-312d-05a75e1fd5b4@linux.ibm.com/
PR tree-optimization/112788
gcc/ChangeLog:
* value-range.h (range_compatible_p): Workaround same type mode but
different type precision issue for rs6000 scalar float types
_Float128 and long double.
|
|
Instead of directly checking type precision, check_operands_p should
invoke range_compatible_p to keep the range checking centralized.
* gimple-range-fold.h (range_compatible_p): Relocate.
* value-range.h (range_compatible_p): Here.
* range-op-mixed.h (operator_equal::operand_check_p): Call
range_compatible_p rather than comparing precision.
(operator_not_equal::operand_check_p): Ditto.
(operator_lt::operand_check_p): Ditto.
(operator_le::operand_check_p): Ditto.
(operator_gt::operand_check_p): Ditto.
(operator_ge::operand_check_p): Ditto.
(operator_plus::operand_check_p): Ditto.
(operator_abs::operand_check_p): Ditto.
(operator_minus::operand_check_p): Ditto.
(operator_negate::operand_check_p): Ditto.
(operator_mult::operand_check_p): Ditto.
(operator_bitwise_not::operand_check_p): Ditto.
(operator_bitwise_xor::operand_check_p): Ditto.
(operator_bitwise_and::operand_check_p): Ditto.
(operator_bitwise_or::operand_check_p): Ditto.
(operator_min::operand_check_p): Ditto.
(operator_max::operand_check_p): Ditto.
* range-op.cc (operator_lshift::operand_check_p): Ditto.
(operator_rshift::operand_check_p): Ditto.
(operator_logical_and::operand_check_p): Ditto.
(operator_logical_or::operand_check_p): Ditto.
(operator_logical_not::operand_check_p): Ditto.
|
|
Check to see if a comparison to a constant can be determined to always
be not-equal based on the bitmask.
PR tree-optimization/111766
gcc/
* range-op.cc (operator_equal::fold_range): Check constants
against the bitmask.
(operator_not_equal::fold_range): Ditto.
* value-range.h (irange_bitmask::member_p): New.
gcc/testsuite/
* gcc.dg/pr111766.c: New.
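The underlying test, sketched on plain integers (hypothetical helper; the
real irange_bitmask::member_p works on wide_ints):

// MASK has 1s for unknown bits; VALUE holds the known bits and is
// normalized so that unknown bits are 0. A constant CST can only be a
// member of the range if its known bits agree with VALUE.
bool
member_p (unsigned long long cst,
          unsigned long long value, unsigned long long mask)
{
  return (cst & ~mask) == value;
}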
|
|
During the intersection operation, it can be helpful to remove any
low-end ranges when the bitmask has trailing zeros. This prevents
obviously incorrect ranges from appearing without requiring a bitmask
check.
* value-range.cc (irange_bitmask::adjust_range): New.
(irange::intersect_bitmask): Call adjust_range.
* value-range.h (irange_bitmask::adjust_range): New prototype.
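For instance, if the low N bits are known zeros, every member of the range
must be a multiple of 1 << N, so the low end can be tightened. A sketch of
that rounding (simplified; the real adjust_range works on the range's
wide_int bounds):

// Round LB up to the next multiple of 1 << TRAILING_ZEROS; e.g. a range
// starting at 1 with three known trailing zero bits really starts at 8.
unsigned long long
adjust_lbound (unsigned long long lb, unsigned trailing_zeros)
{
  unsigned long long align = 1ULL << trailing_zeros;
  return (lb + align - 1) & -align;
}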
|
|
A common pattern is to append a range to an existing range via union.
This patch optimizes that process (sketched below).
* value-range.cc (irange::union_append): New.
(irange::union_): Call union_append when appropriate.
* value-range.h (irange::union_append): New prototype.
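A sketch of the fast path (hypothetical flat representation; the real code
appends to irange's internal pair array):

#include <vector>
#include <utility>

typedef std::pair<long, long> subrange;  // [lower, upper]

// Append R when it lies entirely above the existing sub-ranges,
// extending the last pair when R adjoins it; return false to fall
// back to the full union algorithm.
bool
union_append (std::vector<subrange> &ranges, subrange r)
{
  if (!ranges.empty ())
    {
      long last_ub = ranges.back ().second;
      if (r.first <= last_ub)
        return false;                       // overlaps: needs full union
      if (r.first == last_ub + 1)
        {
          ranges.back ().second = r.second; // adjoining: extend in place
          return true;
        }
    }
  ranges.push_back (r);                     // strictly above: just append
  return true;
}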
|
|
32640 bits [PR102989]
As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
by the wide_int/widest_int maximum precision limitation, which is depending
on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
That is fairly low limit for _BitInt, especially on the targets with the 191
bit limitation.
The following patch bumps that limit to 16319 bits on all arches (which support
_BitInt at all), which is the limit imposed by INTEGER_CST representation
(unsigned char members holding number of HOST_WIDE_INT limbs).
In order to achieve that, wide_int is changed from a trivially copyable type
which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or
11 limbs depending on target) limbs into a non-trivially copy constructible,
copy assignable and destructible type which for the usual small cases (up
to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses
an inline array of limbs, but for larger precisions uses heap allocated
limb array. This makes wide_int unusable in GC structures, so for dwarf2out
which was the only place which needed it there is a new rwide_int type
(restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs
inline and is trivially copyable (dwarf2out should never deal with large
_BitInt constants, those should have been lowered earlier).
Similarly, widest_int has been changed from a trivially copyable type which
contained also an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
wide_int didn't contain precision and assumed that to be
WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
assignable and destructible type which has always WIDEST_INT_MAX_PRECISION
precision (32640 bits currently, twice as much as INTEGER_CST limitation
allows) and unlike wide_int decides depending on get_len () value whether
it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or heap
allocated one. In wide-int.h this means we need to estimate an upper
bound on how many limbs wide-int.cc (usually; sometimes wide-int.h) will
need to write, heap allocate if needed based on that estimate, and upon
set_len, which is done at the end, copy and deallocate if we guessed
over WIDE_INT_MAX_INL_ELTS and allocated dynamically while actually
needing less than that. The inexact guesses are needed because the
exact computation of the length in wide-int.cc is sometimes quite
complex, and especially the final canonicalization can decrease it.
widest_int is, again because of this, not usable in GC structures, so
cfgloop.h has been changed
to use fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if
we'd have larger _BitInt based iterators, programs having more than 128-bit
iterators will be hopefully rare and I think it is fine to treat loops with
more than 2^127 iterations as effectively possibly infinite, omp-general.cc
is changed to use fixed_wide_int_storage <1024>, as it better should support
scores with the same precision on all arches.
Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
larger lengths.
On x86_64, the patch in --enable-checking=yes,rtl,extra configured
bootstrapped cc1plus enlarges the .text section by 1.01% - from
0x25725a5 to 0x25e5555 and similarly at least when compiling insn-recog.cc
with the usual bootstrap option slows compilation down by 1.01%,
user 4m22.046s and 4m22.384s on vanilla trunk vs.
4m25.947s and 4m25.581s on patched trunk. I'm afraid some code size growth
and compile time slowdown is unavoidable in this case, we use wide_int and
widest_int everywhere, and while the rare cases are marked with UNLIKELY
macros, it still means extra checks for it.
The patch also regresses
+FAIL: gm2/pim/fail/largeconst.mod, -O
+FAIL: gm2/pim/fail/largeconst.mod, -O -g
+FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst.mod, -Os
+FAIL: gm2/pim/fail/largeconst.mod, -g
+FAIL: gm2/pim/fail/largeconst2.mod, -O
+FAIL: gm2/pim/fail/largeconst2.mod, -O -g
+FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst2.mod, -Os
+FAIL: gm2/pim/fail/largeconst2.mod, -g
tests, which previously were rejected with
error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range
kind of errors, but are now accepted. It seems the FE tries to parse
constants into widest_int in that case and only diagnoses if widest_int
overflows. That seems wrong; it should at least punt if stuff doesn't
fit into WIDE_INT_MAX_PRECISION, but perhaps far less than that. If it
wants middle-end support for precisions above 128 bits, it should be
using BITINT_TYPE. Will file a PR and defer to the Modula2 maintainer.
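The storage scheme described above, as a standalone sketch with stand-in
names (the real type also tracks precision and implements the full
arithmetic API):

#include <cstring>

typedef long hwi;                        // stand-in for HOST_WIDE_INT
static const unsigned max_inl_elts = 9;  // stand-in for WIDE_INT_MAX_INL_ELTS

struct wide_int_sketch
{
  unsigned len;
  union { hwi val[max_inl_elts]; hwi *valp; } u;

  explicit wide_int_sketch (unsigned l) : len (l)
  {
    if (len > max_inl_elts)
      u.valp = new hwi[len] ();          // large: heap-allocated limbs
  }
  // Non-trivial copying is the price of the heap fallback.
  wide_int_sketch (const wide_int_sketch &o) : len (o.len)
  {
    if (len > max_inl_elts)
      {
        u.valp = new hwi[len];
        memcpy (u.valp, o.u.valp, len * sizeof (hwi));
      }
    else
      memcpy (u.val, o.u.val, len * sizeof (hwi));
  }
  wide_int_sketch &operator= (const wide_int_sketch &) = delete; // elided
  ~wide_int_sketch ()
  {
    if (len > max_inl_elts)
      delete[] u.valp;
  }
  hwi *get_val () { return len > max_inl_elts ? u.valp : u.val; }
};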
2023-10-12 Jakub Jelinek <jakub@redhat.com>
PR c/102989
* wide-int.h: Adjust file comment.
(WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS.
(WIDE_INT_MAX_INL_PRECISION): Define.
(WIDE_INT_MAX_ELTS): Change to 255. Assert that WIDE_INT_MAX_INL_ELTS
is smaller than WIDE_INT_MAX_ELTS.
(RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS,
WIDEST_INT_MAX_PRECISION): Define.
(WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers
to pass 0 as a new argument.
(class widest_int_storage): Likewise.
(widest_int, widest2_int): Change typedefs to use widest_int_storage
rather than fixed_wide_int_storage.
(enum wi::precision_type): Add INL_CONST_PRECISION enumerator.
(struct binary_traits): Add partial specializations for
INL_CONST_PRECISION.
(generic_wide_int): Add needs_write_val_arg static data member.
(int_traits): Likewise.
(wide_int_storage): Replace val non-static data member with a union
u of it and HOST_WIDE_INT *valp. Declare copy constructor, copy
assignment operator and destructor. Add unsigned int argument to
write_val.
(wide_int_storage::wide_int_storage): Initialize precision to 0
in the default ctor. Remove unnecessary {}s around STATIC_ASSERTs.
Assert in non-default ctor T's precision_type is not
INL_CONST_PRECISION and allocate u.valp for large precision. Add
copy constructor.
(wide_int_storage::~wide_int_storage): New.
(wide_int_storage::operator=): Add copy assignment operator. In
assignment operator remove unnecessary {}s around STATIC_ASSERTs,
assert ctor T's precision_type is not INL_CONST_PRECISION and
if precision changes, deallocate and/or allocate u.valp.
(wide_int_storage::get_val): Return u.valp rather than u.val for
large precision.
(wide_int_storage::write_val): Likewise. Add an unused unsigned int
argument.
(wide_int_storage::set_len): Use write_val instead of writing val
directly.
(wide_int_storage::from, wide_int_storage::from_array): Adjust
write_val callers.
(wide_int_storage::create): Allocate u.valp for large precisions.
(wi::int_traits <wide_int_storage>::get_binary_precision): New.
(fixed_wide_int_storage::fixed_wide_int_storage): Make default
ctor defaulted.
(fixed_wide_int_storage::write_val): Add unused unsigned int argument.
(fixed_wide_int_storage::from, fixed_wide_int_storage::from_array):
Adjust write_val callers.
(wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New.
(WIDEST_INT): Define.
(widest_int_storage): New template class.
(wi::int_traits <widest_int_storage>): New.
(trailing_wide_int_storage::write_val): Add unused unsigned int
argument.
(wi::get_binary_precision): Use
wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision
rather than get_precision on get_binary_result.
(wi::copy): Adjust write_val callers. Don't call set_len if
needs_write_val_arg.
(wi::bit_not): If result.needs_write_val_arg, call write_val
again with upper bound estimate of len.
(wi::sext, wi::zext, wi::set_bit): Likewise.
(wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not,
wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc,
wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc,
wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round,
wi::lshift, wi::lrshift, wi::arshift): Likewise.
(wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg
is false.
(gt_ggc_mx, gt_pch_nx): Remove generic template for all
generic_wide_int, instead add functions and templates for each
storage of generic_wide_int. Make functions for
generic_wide_int <wide_int_storage> and templates for
generic_wide_int <widest_int_storage <N>> deleted.
(wi::mask, wi::shifted_mask): Adjust write_val calls.
* wide-int.cc (zeros): Decrease array size to 1.
(BLOCKS_NEEDED): Use CEIL.
(canonize): Use HOST_WIDE_INT_M1.
(wi::from_buffer): Pass 0 to write_val.
(wi::to_mpz): Use CEIL.
(wi::from_mpz): Likewise. Pass 0 to write_val. Use
WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS.
(wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of
MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec
above WIDE_INT_MAX_INL_PRECISION estimate precision from
lengths of operands. Use XALLOCAVEC allocated buffers for
prec above WIDE_INT_MAX_INL_PRECISION.
(wi::divmod_internal): Likewise.
(wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate
it from xlen and skip.
(rshift_large_common): Remove xprecision argument, add len
argument with len computed in caller. Don't return anything.
(wi::lrshift_large, wi::arshift_large): Compute len here
and pass it to rshift_large_common, for lengths above
WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible.
(assert_deceq, assert_hexeq): For lengths above
WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
(test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of
WIDE_INT_MAX_PRECISION.
* wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION.
* wide-int-print.cc (print_decs, print_decu, print_hex): For
lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
* tree.h (wi::int_traits<extended_tree <N>>): Change precision_type
to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION.
(widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::ints_for): Use int_traits <extended_tree <N> >::precision_type
instead of hard coded CONST_PRECISION.
(widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of
WIDE_INT_MAX_PRECISION.
(wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather
than WIDE_INT_MAX_PRECISION.
(wi::ints_for::zero): Use
wi::int_traits <wi::extended_tree <N> >::precision_type instead of
wi::CONST_PRECISION.
* tree.cc (build_replicated_int_cst): Formatting fix. Use
WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS.
* print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on
INTEGER_CSTs, TREE_VECs or SSA_NAMEs.
* double-int.h (wi::int_traits <double_int>::precision_type): Change
to INL_CONST_PRECISION from CONST_PRECISION.
* poly-int.h (struct poly_coeff_traits): Add partial specialization
for wi::INL_CONST_PRECISION.
* cfgloop.h (bound_wide_int): New typedef.
(struct nb_iter_bound): Change bound type from widest_int to
bound_wide_int.
(struct loop): Change nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate type from
widest_int to bound_wide_int.
* cfgloop.cc (record_niter_bound): Return early if wi::min_precision
of i_bound is too large for bound_wide_int. Adjustments for the
widest_int to bound_wide_int type change in non-static data members.
(get_estimated_loop_iterations, get_max_loop_iterations,
get_likely_max_loop_iterations): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* tree-vect-loop.cc (vect_transform_loop): Likewise.
* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
XALLOCAVEC allocated buffer for i_bound len above
WIDE_INT_MAX_INL_ELTS.
(record_estimate): Return early if wi::min_precision of i_bound is too
large for bound_wide_int. Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
(wide_int_cmp): Use bound_wide_int instead of widest_int.
(bound_index): Use bound_wide_int instead of widest_int.
(discover_iteration_bound_by_body_walk): Likewise. Use
widest_int::from to convert it to widest_int when passed to
record_niter_bound.
(maybe_lower_iteration_bound): Use widest_int::from to convert it to
widest_int when passed to record_niter_bound.
(estimate_numbers_of_iteration): Don't record upper bound if
loop->nb_iterations has too large precision for bound_wide_int.
(n_of_executions_at_most): Use widest_int::from.
* tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for
the widest_int to bound_wide_int changes.
* match.pd (fold_sign_changed_comparison simplification): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* value-range.h (irange::maybe_resize): Avoid using memcpy on
non-trivially copyable elements.
* value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated
buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE.
* fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc):
Use wide_int::from on wi::to_wide instead of wi::to_widest.
* tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width
before calling wi::udiv_trunc.
* lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to
bound_wide_int type change in non-static data members.
* lto-streamer-in.cc (input_cfg): Likewise.
(lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than
WIDE_INT_MAX_ELTS. For length above WIDE_INT_MAX_INL_ELTS use
XALLOCAVEC allocated buffer. Formatting fix.
* data-streamer-in.cc (streamer_read_wide_int,
streamer_read_widest_int): Likewise.
* tree-affine.cc (aff_combination_expand): Use placement new to
construct name_expansion.
(free_name_expansion): Destruct name_expansion.
* gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change
index type from widest_int to offset_int.
(class incr_info_d): Change incr type from widest_int to offset_int.
(alloc_cand_and_find_basis, backtrace_base_for_ref,
restructure_reference, slsr_process_ref, create_mul_ssa_cand,
create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand,
slsr_process_add, cand_abs_increment, replace_mult_candidate,
replace_unconditional_candidate, incr_vec_index,
create_add_on_incoming_edge, create_phi_basis_1,
replace_conditional_candidate, record_increment,
record_phi_increments_1, phi_incr_cost_1, phi_incr_cost,
lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis,
nearest_common_dominator_for_cands, insert_initializers,
all_phi_incrs_profitable_1, replace_one_candidate,
replace_profitable_candidates): Use offset_int rather than widest_int
and wi::to_offset rather than wi::to_widest.
* real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than
2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC
allocated buffer.
* tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new
to construct tree_niter_desc and destruct it on failure.
(free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL.
* gengtype.cc (main): Remove widest_int handling.
* graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use
WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS.
* gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use
WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and
assert get_len () fits into it.
* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC
allocated buffer.
* gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use
wide_int::from on wi::to_wide instead of wi::to_widest.
* omp-general.cc (score_wide_int): New typedef.
(omp_context_compute_score): Use score_wide_int instead of widest_int
and adjust for those changes.
(struct omp_declare_variant_entry): Change score and
score_in_declare_simd_clone non-static data member type from widest_int
to score_wide_int.
(omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use
score_wide_int instead of widest_int and adjust for those changes.
(omp_lto_output_declare_variant_alt): Likewise.
(omp_lto_input_declare_variant_alt): Likewise.
* godump.cc (go_output_typedef): Assert get_len () is smaller than
WIDE_INT_MAX_INL_ELTS.
gcc/c-family/
* c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead
of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS.
gcc/testsuite/
* gcc.dg/bitint-38.c: New test.
|
|
We can set_nan() with a nan_state, so it's good form to have an
analogous overload for update_nan().
gcc/ChangeLog:
* value-range.h (frange::update_nan): New.
|
|
In the conversion of iranges to wide_int (commit cb779afeff204f), I
mistakenly made contains_zero_p() return TRUE for undefined ranges.
This means the rest of the patch was adjusted for this stupidity.
For example, we ended up doing the following, to make up for the fact
that contains_zero_p was broken:
- if (!lhs.contains_p (build_zero_cst (lhs.type ())))
+ if (lhs.undefined_p () || !contains_zero_p (lhs))
This patch fixes the thinko and adjusts all callers.
In places where a caller is not checking undefined_p(), it is because
either the caller has already handled undefined ranges in the
preceding code, or the check is superfluous.
gcc/ChangeLog:
* value-range.h (contains_zero_p): Return false for undefined ranges.
* range-op-float.cc (operator_gt::op1_op2_relation): Adjust for
contains_zero_p change above.
(operator_ge::op1_op2_relation): Same.
(operator_equal::op1_op2_relation): Same.
(operator_not_equal::op1_op2_relation): Same.
(operator_lt::op1_op2_relation): Same.
(operator_le::op1_op2_relation): Same.
(operator_ge::op1_op2_relation): Same.
* range-op.cc (operator_equal::op1_op2_relation): Same.
(operator_not_equal::op1_op2_relation): Same.
(operator_lt::op1_op2_relation): Same.
(operator_le::op1_op2_relation): Same.
(operator_cast::op1_range): Same.
(set_nonzero_range_from_mask): Same.
(operator_bitwise_xor::op1_range): Same.
(operator_addr_expr::fold_range): Same.
(operator_addr_expr::op1_range): Same.
|
|
As suggested in previous reviews, adding overflow APIs to range-op
would be useful. Those APIs could help to check whether overflow
happens when operating on two 'range's, like: plus, minus, and mult.
Previous discussions are here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624067.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624701.html
gcc/ChangeLog:
* range-op-mixed.h (operator_plus::overflow_free_p): Declare.
(operator_minus::overflow_free_p): Declare.
(operator_mult::overflow_free_p): Declare.
* range-op.cc (range_op_handler::overflow_free_p): New function.
(range_operator::overflow_free_p): New default function.
(operator_plus::overflow_free_p): New function.
(operator_minus::overflow_free_p): New function.
(operator_mult::overflow_free_p): New function.
* range-op.h (range_op_handler::overflow_free_p): Declare.
(range_operator::overflow_free_p): Declare.
* value-range.cc (irange::nonnegative_p): New function.
(irange::nonpositive_p): New function.
* value-range.h (irange::nonnegative_p): Declare.
(irange::nonpositive_p): Declare.
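The flavor of such a check, sketched for range plus on 64-bit signed
bounds (the real implementations work on wide_ints in the range's type):

#include <cstdint>
#include <limits>

// [LB1, UB1] + [LB2, UB2] is overflow-free iff the largest possible
// sum cannot exceed MAX and the smallest cannot fall below MIN.
bool
plus_overflow_free_p (int64_t lb1, int64_t ub1, int64_t lb2, int64_t ub2)
{
  const int64_t max = std::numeric_limits<int64_t>::max ();
  const int64_t min = std::numeric_limits<int64_t>::min ();
  if (ub2 > 0 && ub1 > max - ub2)   // ub1 + ub2 would overflow
    return false;
  if (lb2 < 0 && lb1 < min - lb2)   // lb1 + lb2 would underflow
    return false;
  return true;
}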
|
|
Set routines which take a type shouldn't require the caller to pre-set
the type of the underlying range, as the type is already specified as a
parameter.
* value-range.h (Value_Range::set_varying): Set the type.
(Value_Range::set_zero): Ditto.
(Value_Range::set_nonzero): Ditto.
|
|
The bit twiddling in union/intersect for the value/mask pair must be
normalized to have the unknown bits with a value of 0 in order to make
the math simpler. Normalizing at construction slowed VRP by 1.5% so I
opted to normalize before updating the bitmask in range-ops, since it
was the only user. However, with upcoming changes there will be
multiple setters of the mask (IPA and CCP), so we need something more
general.
I played with various alternatives, and settled on normalizing before
union/intersect which were the ones needing the bits cleared. With
this patch, there's no noticeable difference in performance either in
VRP or in overall compilation.
gcc/ChangeLog:
* value-range.cc (irange_bitmask::verify_mask): Mask need not be
normalized.
* value-range.h (irange_bitmask::union_): Normalize beforehand.
(irange_bitmask::intersect): Same.
|
|
Integer ranges (irange) currently track known 0 bits. We've wanted to
track known 1 bits for some time, and instead of tracking known 0 and
known 1's separately, it has been suggested we track a value/mask pair
similarly to what we do for CCP and RTL. This patch implements such a
thing.
With this we now track a VALUE integer holding the known bit values,
and a MASK telling us which bits contain meaningful information. This
allows us to fix a handful of enhancement requests, such as PR107043
and PR107053.
There is a 4.48% performance penalty for VRP and 0.42% in overall
compilation for this entire patchset. It is expected and in line
with the loss incurred when we started tracking known 0 bits.
This patch just provides the value/mask tracking support. All the
nonzero users (range-op, IPA, CCP, etc), are still using the nonzero
nomenclature. For that matter, this patch reimplements the nonzero
accessors with the value/mask functionality. In follow-up patches I
will enhance these passes to use the value/mask information, and
fix the aforementioned PRs.
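The classic CCP-style combination rule this builds on, sketched on plain
integers (the real irange_bitmask works on wide_ints):

struct bitmask { unsigned long long value, mask; };  // mask: 1 = unknown bit

// Union: a bit stays known only if it is known on both sides and agrees.
bitmask
union_ (bitmask a, bitmask b)
{
  bitmask r;
  r.mask = a.mask | b.mask | (a.value ^ b.value);
  r.value = (a.value & b.value) & ~r.mask;  // keep unknown bits at 0
  return r;
}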
gcc/ChangeLog:
* data-streamer-in.cc (streamer_read_value_range): Adjust for
value/mask.
* data-streamer-out.cc (streamer_write_vrange): Same.
* range-op.cc (operator_cast::fold_range): Same.
* value-range-pretty-print.cc
(vrange_printer::print_irange_bitmasks): Same.
* value-range-storage.cc (irange_storage::write_lengths_address):
Same.
(irange_storage::set_irange): Same.
(irange_storage::get_irange): Same.
(irange_storage::size): Same.
(irange_storage::dump): Same.
* value-range-storage.h: Same.
* value-range.cc (debug): New.
(irange_bitmask::dump): New.
(add_vrange): Adjust for value/mask.
(irange::operator=): Same.
(irange::set): Same.
(irange::verify_range): Same.
(irange::operator==): Same.
(irange::contains_p): Same.
(irange::irange_single_pair_union): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::invert): Same.
(irange::get_nonzero_bits_from_range): Rename to...
(irange::get_bitmask_from_range): ...this.
(irange::set_range_from_nonzero_bits): Rename to...
(irange::set_range_from_bitmask): ...this.
(irange::set_nonzero_bits): Rename to...
(irange::update_bitmask): ...this.
(irange::get_nonzero_bits): Rename to...
(irange::get_bitmask): ...this.
(irange::intersect_nonzero_bits): Rename to...
(irange::intersect_bitmask): ...this.
(irange::union_nonzero_bits): Rename to...
(irange::union_bitmask): ...this.
(irange_bitmask::verify_mask): New.
* value-range.h (class irange_bitmask): New.
(irange_bitmask::set_unknown): New.
(irange_bitmask::unknown_p): New.
(irange_bitmask::irange_bitmask): New.
(irange_bitmask::get_precision): New.
(irange_bitmask::get_nonzero_bits): New.
(irange_bitmask::set_nonzero_bits): New.
(irange_bitmask::operator==): New.
(irange_bitmask::union_): New.
(irange_bitmask::intersect): New.
(class irange): Friend vrange_printer.
(irange::varying_compatible_p): Adjust for bitmask.
(irange::set_varying): Same.
(irange::set_nonzero): Same.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr107009.c: Adjust irange dumping for
value/mask changes.
* gcc.dg/tree-ssa/vrp-unreachable.c: Same.
* gcc.dg/tree-ssa/vrp122.c: Same.
|
|
There are a few spots where a range is being altered in place, but we
fail to normalize the range. This patch makes sure we always
call normalize_kind(), and that normalize_kind in turn calls
verify_range to make sure everything is canonical.
gcc/ChangeLog:
* value-range.cc (frange::set): Do not call verify_range.
(frange::normalize_kind): Verify range.
(frange::union_nans): Do not call verify_range.
(frange::union_): Same.
(frange::intersect): Same.
(irange::irange_single_pair_union): Call normalize_kind if
necessary.
(irange::union_): Same.
(irange::intersect): Same.
(irange::set_range_from_nonzero_bits): Verify range.
(irange::set_nonzero_bits): Call normalize_kind if necessary.
(irange::get_nonzero_bits): Tweak comment.
(irange::intersect_nonzero_bits): Call normalize_kind if
necessary.
(irange::union_nonzero_bits): Same.
* value-range.h (irange::normalize_kind): Verify range.
|
|
Simplify range_op_handler to have a single range_operator pointer and
provide a more flexible dispatch mechanism for calls via generic vrange
classes. This is more extensible for adding new classes of range support.
Any unsupported dispatch patterns will simply return FALSE now rather
than generating compile-time exceptions, alleviating the need to
constantly check for supported types.
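One way to picture the dispatch, as a standalone sketch (hypothetical
encoding; the real dispatch_kind packs the vrange discriminators of the
LHS and operands):

enum rkind { RK_INT = 1, RK_FLOAT = 2 };  // stand-ins for the discriminators

// Pack the three operand kinds into one value...
constexpr unsigned
dispatch_code (rkind lhs, rkind op1, rkind op2)
{
  return (lhs << 8) | (op1 << 4) | op2;
}

// ...and switch on the supported combinations, quietly failing otherwise.
bool
fold_range_dispatch (rkind lhs, rkind op1, rkind op2)
{
  switch (dispatch_code (lhs, op1, op2))
    {
    case dispatch_code (RK_INT, RK_INT, RK_INT):        // e.g. RO_III
    case dispatch_code (RK_FLOAT, RK_FLOAT, RK_FLOAT):  // e.g. RO_FFF
    case dispatch_code (RK_INT, RK_FLOAT, RK_FLOAT):    // e.g. RO_IFF
      return true;
    default:
      return false;  // unsupported pattern: no exception, just FALSE
    }
}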
* gimple-range-op.cc
(gimple_range_op_handler::gimple_range_op_handler): Adjust.
(gimple_range_op_handler::maybe_builtin_call): Adjust.
* gimple-range-op.h (operand1, operand2): Use m_operator.
* range-op.cc (integral_table, pointer_table): Relocate.
(get_op_handler): Rename from get_handler and handle all types.
(range_op_handler::range_op_handler): Relocate.
(range_op_handler::set_op_handler): Relocate and adjust.
(range_op_handler::range_op_handler): Relocate.
(dispatch_trio): New.
(RO_III, RO_IFI, RO_IFF, RO_FFF, RO_FIF, RO_FII): New consts.
(range_op_handler::dispatch_kind): New.
(range_op_handler::fold_range): Relocate and Use new dispatch value.
(range_op_handler::op1_range): Ditto.
(range_op_handler::op2_range): Ditto.
(range_op_handler::lhs_op1_relation): Ditto.
(range_op_handler::lhs_op2_relation): Ditto.
(range_op_handler::op1_op2_relation): Ditto.
(range_op_handler::set_op_handler): Use m_operator member.
* range-op.h (range_op_handler::operator bool): Use m_operator.
(range_op_handler::dispatch_kind): New.
(range_op_handler::m_valid): Delete.
(range_op_handler::m_int): Delete
(range_op_handler::m_float): Delete
(range_op_handler::m_operator): New.
(range_op_table::operator[]): Relocate from .cc file.
(range_op_table::set): Ditto.
* value-range.h (class vrange): Make range_op_handler a friend.
|
|
NANs don't have bounds, so there's no need to stream them out.
gcc/ChangeLog:
* data-streamer-in.cc (streamer_read_value_range): Handle NANs.
* data-streamer-out.cc (streamer_write_vrange): Same.
* value-range.h (class vrange): Make streamer_write_vrange a friend.
|
|
Generalize frange::set_nan() to take a nan_state and make current
set_nan() methods syntactic sugar.
This is in preparation for better streaming of NANs for LTO/IPA.
gcc/ChangeLog:
* value-range.h (frange::set_nan): New.
|
|
gcc/ChangeLog:
* value-range.h (vrange::kind): Remove.
|
|
gcc/ChangeLog:
PR tree-optimization/109920
* value-range.h (int_range<N, RESIZABLE>::~int_range): Use delete[].
|
|
This adds some missing accessors to the type agnostic Value_Range
class. They'll be used in the upcoming IPA work.
gcc/ChangeLog:
* value-range.h (class Value_Range): Implement set_zero,
set_nonzero, and nonzero_p.
|
|
gcc/ChangeLog:
* value-range.h (Value_Range::operator=): New.
|
|
The unsupported_range class is provided for completeness' sake. It is
a way to set VARYING/UNDEFINED ranges for unsupported ranges (currently
anything not float, integer, or pointer). You can't do anything with
them except set_varying and set_undefined; we will trap on any
other operation.
This patch provides a way to copy them, just in case they creep in.
This could happen in IPA under certain circumstances.
gcc/ChangeLog:
* value-range.cc (vrange::operator=): Add a stub to copy
unsupported ranges.
* value-range.h (is_a <unsupported_range>): New.
(Value_Range::operator=): Support copying unsupported ranges.
|
|
<tldr>
We can now have int_range<N, RESIZABLE=false> for automatically
resizable ranges. int_range_max is now int_range<3, true>
for a 69X reduction in size from current trunk, and 6.9X reduction from
GCC12. This incurs a 5% performance penalty for VRP that is more than
covered by our > 13% improvements recently.
</tldr>
int_range_max is the temporary range object we use in the ranger for
integers. With the conversion to wide_int, this structure bloated up
significantly because wide_ints are huge (80 bytes a piece) and are
about 10 times as big as a plain tree. Since the temporary object
requires 255 sub-ranges, that's 255 * 80 * 2, plus the control word.
This means the structure grew from 4112 bytes to 40912 bytes.
This patch adds the ability to resize ranges as needed, defaulting to
no resizing, while int_range_max now defaults to 3 sub-ranges (instead
of 255) and grows to 255 when the range being calculated does not fit.
For example:
int_range<1> foo; // 1 sub-range with no resizing.
int_range<5> foo; // 5 sub-ranges with no resizing.
int_range<5, true> foo; // 5 sub-ranges with resizing.
I ran some tests and found that 3 sub-ranges cover 99% of cases, so
I've set the int_range_max default to that:
typedef int_range<3, /*RESIZABLE=*/true> int_range_max;
We don't bother growing incrementally, since the default covers most
cases and we have a 255 hard-limit. This hard limit could be reduced
to 128, since my tests never saw a range needing more than 124, but we
could do that as a follow-up if needed.
With 3-subranges, int_range_max is now 592 bytes versus 40912 for
trunk, and versus 4112 bytes for GCC12! The penalty is 5.04% for VRP
and 3.02% for threading, with no noticeable change in overall
compilation (0.27%). This is more than covered by our 13.26%
improvements for the legacy removal + wide_int conversion.
I think this approach is a good alternative, while providing us with
flexibility going forward. For example, we could try defaulting to a
8 sub-ranges for a noticeable improvement in VRP. We could also use
large sub-ranges for switch analysis to avoid resizing.
Another approach I tried was always resizing. With this, we could
drop the whole int_range<N> nonsense, and have irange just hold a
resizable range. This simplified things, but incurred a 7% penalty on
ipa_cp. This was hard to pinpoint, and I'm not entirely convinced
this wasn't some artifact of valgrind. However, until we're sure,
let's avoid massive changes, especially since IPA changes are coming
up.
For the curious, a particular hot spot for IPA in this area was:
ipcp_vr_lattice::meet_with_1 (const value_range *other_vr)
{
...
...
value_range save (m_vr);
m_vr.union_ (*other_vr);
return m_vr != save;
}
The problem isn't the resizing (since we do that at most once) but the
fact that for some functions with lots of callers we end up with a huge
range that gets copied and compared for every meet operation. Maybe
the IPA algorithm could be adjusted somehow?
Anywhooo... for now there is nothing to worry about, since value_range
still has 2 subranges and is not resizable. But we should probably
think about what, if anything, we want to do here, as I envision IPA
using infinite ranges here (well, int_range_max) and handling franges,
etc.
gcc/ChangeLog:
PR tree-optimization/109695
* value-range.cc (irange::operator=): Resize range.
(irange::union_): Same.
(irange::intersect): Same.
(irange::invert): Same.
(int_range_max): Default to 3 sub-ranges and resize as needed.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.
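The shape of the resizing trick, as a standalone sketch (hypothetical
layout; the real logic lives in irange::maybe_resize and stores wide_int
bounds):

#include <cstring>

static const unsigned hard_max_pairs = 255;  // hard sub-range limit

struct range_sketch
{
  unsigned m_max_pairs;    // current capacity, in sub-range pairs
  long *m_base;            // points at m_inline or at a heap block
  long m_inline[2 * 3];    // default capacity: 3 sub-ranges

  range_sketch () : m_max_pairs (3), m_base (m_inline) {}
  range_sketch (const range_sketch &) = delete;             // elided
  range_sketch &operator= (const range_sketch &) = delete;  // elided
  ~range_sketch ()
  {
    if (m_base != m_inline)
      delete[] m_base;     // only the grown case owns heap memory
  }

  // Grow straight to the hard maximum the first time NEEDED exceeds
  // capacity; no incremental growth, since 3 pairs cover ~99% of cases.
  void maybe_resize (unsigned needed)
  {
    if (needed <= m_max_pairs || m_max_pairs == hard_max_pairs)
      return;
    long *heap = new long[2 * hard_max_pairs];
    memcpy (heap, m_base, 2 * m_max_pairs * sizeof (long));
    m_base = heap;
    m_max_pairs = hard_max_pairs;
  }
};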
|
|
The previous patch just added basic intrinsic ranges for sqrt
([-0.0, +Inf] +-NAN being the general result range of the function
and [-0.0, +Inf] the general operand range if the result isn't NAN etc.).
The following patch intersects those ranges with a particular range
computed from the argument's or result's exact range, with the expected
error in ulps taken into account, and adds a helper function (a
frange_arithmetic variant) which can be used by other functions as well.
2023-05-06 Jakub Jelinek <jakub@redhat.com>
* value-range.h (frange_arithmetic): Declare.
* range-op-float.cc (frange_arithmetic): No longer static.
* gimple-range-op.cc (frange_mpfr_arg1): New function.
(cfn_sqrt::fold_range): Intersect the generic boundaries range
with range computed from sqrt of the particular bounds.
(cfn_sqrt::op1_range): Intersect the generic boundaries range
with range computed from squared particular bounds.
gcc/testsuite/
* gcc.dg/tree-ssa/range-sqrt-2.c: New test.
|
|
gcc/ChangeLog:
* value-range.h (class int_range): Remove gt_ggc_mx and gt_pch_nx
friends.
|
|
irange::set_nonzero is used everywhere and benefits immensely from
inlining.
gcc/ChangeLog:
* value-range.h (irange::set_nonzero): Inline.
|