wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

As mentioned in the _BitInt support thread, _BitInt(N) is currently limited by the wide_int/widest_int maximum precision limitation, which is depending on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION). That is fairly low limit for _BitInt, especially on the targets with the 191 bit limitation. The following patch bumps that limit to 16319 bits on all arches (which support _BitInt at all), which is the limit imposed by INTEGER_CST representation (unsigned char members holding number of HOST_WIDE_INT limbs). In order to achieve that, wide_int is changed from a trivially copyable type which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or 11 limbs depending on target) limbs into a non-trivially copy constructible, copy assignable and destructible type which for the usual small cases (up to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses an inline array of limbs, but for larger precisions uses heap allocated limb array. This makes wide_int unusable in GC structures, so for dwarf2out which was the only place which needed it there is a new rwide_int type (restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs inline and is trivially copyable (dwarf2out should never deal with large _BitInt constants, those should have been lowered earlier). Similarly, widest_int has been changed from a trivially copyable type which contained also an inline array of WIDE_INT_MAX_ELTS limbs (but unlike wide_int didn't contain precision and assumed that to be WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy assignable and destructible type which has always WIDEST_INT_MAX_PRECISION precision (32640 bits currently, twice as much as INTEGER_CST limitation allows) and unlike wide_int decides depending on get_len () value whether it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or heap allocated one. In wide-int.h this means we need to estimate an upper bound on how many limbs will wide-int.cc (usually, sometimes wide-int.h) need to write, heap allocate if needed based on that estimation and upon set_len which is done at the end if we guessed over WIDE_INT_MAX_INL_ELTS and allocated dynamically, while we actually need less than that copy/deallocate. The unexact guesses are needed because the exact computation of the length in wide-int.cc is sometimes quite complex and especially canonicalize at the end can decrease it. widest_int is again because of this not usable in GC structures, so cfgloop.h has been changed to use fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if we'd have larger _BitInt based iterators, programs having more than 128-bit iterators will be hopefully rare and I think it is fine to treat loops with more than 2^127 iterations as effectively possibly infinite, omp-general.cc is changed to use fixed_wide_int_storage <1024>, as it better should support scores with the same precision on all arches. Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for larger lengths. On x86_64, the patch in --enable-checking=yes,rtl,extra configured bootstrapped cc1plus enlarges the .text section by 1.01% - from 0x25725a5 to 0x25e5555 and similarly at least when compiling insn-recog.cc with the usual bootstrap option slows compilation down by 1.01%, user 4m22.046s and 4m22.384s on vanilla trunk vs. 4m25.947s and 4m25.581s on patched trunk. I'm afraid some code size growth and compile time slowdown is unavoidable in this case, we use wide_int and widest_int everywhere, and while the rare cases are marked with UNLIKELY macros, it still means extra checks for it. The patch also regresses +FAIL: gm2/pim/fail/largeconst.mod, -O +FAIL: gm2/pim/fail/largeconst.mod, -O -g +FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer +FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer -finline-functions +FAIL: gm2/pim/fail/largeconst.mod, -Os +FAIL: gm2/pim/fail/largeconst.mod, -g +FAIL: gm2/pim/fail/largeconst2.mod, -O +FAIL: gm2/pim/fail/largeconst2.mod, -O -g +FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer +FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer -finline-functions +FAIL: gm2/pim/fail/largeconst2.mod, -Os +FAIL: gm2/pim/fail/largeconst2.mod, -g tests, which previously were rejected with error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range kind of errors, but now are accepted. Seems the FE tries to parse constants into widest_int in that case and only diagnoses if widest_int overflows, that seems wrong, it should at least punt if stuff doesn't fit into WIDE_INT_MAX_PRECISION, but perhaps far less than that, if it wants support for middle-end for precisions above 128-bit, it better should be using BITINT_TYPE. Will file a PR and defer to Modula2 maintainer. 2023-10-12 Jakub Jelinek <jakub@redhat.com> PR c/102989 * wide-int.h: Adjust file comment. (WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS. (WIDE_INT_MAX_INL_PRECISION): Define. (WIDE_INT_MAX_ELTS): Change to 255. Assert that WIDE_INT_MAX_INL_ELTS is smaller than WIDE_INT_MAX_ELTS. (RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS, WIDEST_INT_MAX_PRECISION): Define. (WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers to pass 0 as a new argument. (class widest_int_storage): Likewise. (widest_int, widest2_int): Change typedefs to use widest_int_storage rather than fixed_wide_int_storage. (enum wi::precision_type): Add INL_CONST_PRECISION enumerator. (struct binary_traits): Add partial specializations for INL_CONST_PRECISION. (generic_wide_int): Add needs_write_val_arg static data member. (int_traits): Likewise. (wide_int_storage): Replace val non-static data member with a union u of it and HOST_WIDE_INT *valp. Declare copy constructor, copy assignment operator and destructor. Add unsigned int argument to write_val. (wide_int_storage::wide_int_storage): Initialize precision to 0 in the default ctor. Remove unnecessary {}s around STATIC_ASSERTs. Assert in non-default ctor T's precision_type is not INL_CONST_PRECISION and allocate u.valp for large precision. Add copy constructor. (wide_int_storage::~wide_int_storage): New. (wide_int_storage::operator=): Add copy assignment operator. In assignment operator remove unnecessary {}s around STATIC_ASSERTs, assert ctor T's precision_type is not INL_CONST_PRECISION and if precision changes, deallocate and/or allocate u.valp. (wide_int_storage::get_val): Return u.valp rather than u.val for large precision. (wide_int_storage::write_val): Likewise. Add an unused unsigned int argument. (wide_int_storage::set_len): Use write_val instead of writing val directly. (wide_int_storage::from, wide_int_storage::from_array): Adjust write_val callers. (wide_int_storage::create): Allocate u.valp for large precisions. (wi::int_traits <wide_int_storage>::get_binary_precision): New. (fixed_wide_int_storage::fixed_wide_int_storage): Make default ctor defaulted. (fixed_wide_int_storage::write_val): Add unused unsigned int argument. (fixed_wide_int_storage::from, fixed_wide_int_storage::from_array): Adjust write_val callers. (wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New. (WIDEST_INT): Define. (widest_int_storage): New template class. (wi::int_traits <widest_int_storage>): New. (trailing_wide_int_storage::write_val): Add unused unsigned int argument. (wi::get_binary_precision): Use wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision rather than get_precision on get_binary_result. (wi::copy): Adjust write_val callers. Don't call set_len if needs_write_val_arg. (wi::bit_not): If result.needs_write_val_arg, call write_val again with upper bound estimate of len. (wi::sext, wi::zext, wi::set_bit): Likewise. (wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not, wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc, wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc, wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round, wi::lshift, wi::lrshift, wi::arshift): Likewise. (wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg is false. (gt_ggc_mx, gt_pch_nx): Remove generic template for all generic_wide_int, instead add functions and templates for each storage of generic_wide_int. Make functions for generic_wide_int <wide_int_storage> and templates for generic_wide_int <widest_int_storage <N>> deleted. (wi::mask, wi::shifted_mask): Adjust write_val calls. * wide-int.cc (zeros): Decrease array size to 1. (BLOCKS_NEEDED): Use CEIL. (canonize): Use HOST_WIDE_INT_M1. (wi::from_buffer): Pass 0 to write_val. (wi::to_mpz): Use CEIL. (wi::from_mpz): Likewise. Pass 0 to write_val. Use WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS. (wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec above WIDE_INT_MAX_INL_PRECISION estimate precision from lengths of operands. Use XALLOCAVEC allocated buffers for prec above WIDE_INT_MAX_INL_PRECISION. (wi::divmod_internal): Likewise. (wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate it from xlen and skip. (rshift_large_common): Remove xprecision argument, add len argument with len computed in caller. Don't return anything. (wi::lrshift_large, wi::arshift_large): Compute len here and pass it to rshift_large_common, for lengths above WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible. (assert_deceq, assert_hexeq): For lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. (test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION. * wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION. * wide-int-print.cc (print_decs, print_decu, print_hex): For lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. * tree.h (wi::int_traits<extended_tree <N>>): Change precision_type to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION. (widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of WIDE_INT_MAX_PRECISION. (wi::ints_for): Use int_traits <extended_tree <N> >::precision_type instead of hard coded CONST_PRECISION. (widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of WIDE_INT_MAX_PRECISION. (wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather than WIDE_INT_MAX_PRECISION. (wi::ints_for::zero): Use wi::int_traits <wi::extended_tree <N> >::precision_type instead of wi::CONST_PRECISION. * tree.cc (build_replicated_int_cst): Formatting fix. Use WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS. * print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on INTEGER_CSTs, TREE_VECs or SSA_NAMEs. * double-int.h (wi::int_traits <double_int>::precision_type): Change to INL_CONST_PRECISION from CONST_PRECISION. * poly-int.h (struct poly_coeff_traits): Add partial specialization for wi::INL_CONST_PRECISION. * cfgloop.h (bound_wide_int): New typedef. (struct nb_iter_bound): Change bound type from widest_int to bound_wide_int. (struct loop): Change nb_iterations_upper_bound, nb_iterations_likely_upper_bound and nb_iterations_estimate type from widest_int to bound_wide_int. * cfgloop.cc (record_niter_bound): Return early if wi::min_precision of i_bound is too large for bound_wide_int. Adjustments for the widest_int to bound_wide_int type change in non-static data members. (get_estimated_loop_iterations, get_max_loop_iterations, get_likely_max_loop_iterations): Adjustments for the widest_int to bound_wide_int type change in non-static data members. * tree-vect-loop.cc (vect_transform_loop): Likewise. * tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use XALLOCAVEC allocated buffer for i_bound len above WIDE_INT_MAX_INL_ELTS. (record_estimate): Return early if wi::min_precision of i_bound is too large for bound_wide_int. Adjustments for the widest_int to bound_wide_int type change in non-static data members. (wide_int_cmp): Use bound_wide_int instead of widest_int. (bound_index): Use bound_wide_int instead of widest_int. (discover_iteration_bound_by_body_walk): Likewise. Use widest_int::from to convert it to widest_int when passed to record_niter_bound. (maybe_lower_iteration_bound): Use widest_int::from to convert it to widest_int when passed to record_niter_bound. (estimate_numbers_of_iteration): Don't record upper bound if loop->nb_iterations has too large precision for bound_wide_int. (n_of_executions_at_most): Use widest_int::from. * tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for the widest_int to bound_wide_int changes. * match.pd (fold_sign_changed_comparison simplification): Use wide_int::from on wi::to_wide instead of wi::to_widest. * value-range.h (irange::maybe_resize): Avoid using memcpy on non-trivially copyable elements. * value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE. * fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc): Use wide_int::from on wi::to_wide instead of wi::to_widest. * tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width before calling wi::udiv_trunc. * lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to bound_wide_int type change in non-static data members. * lto-streamer-in.cc (input_cfg): Likewise. (lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS. For length above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. Formatting fix. * data-streamer-in.cc (streamer_read_wide_int, streamer_read_widest_int): Likewise. * tree-affine.cc (aff_combination_expand): Use placement new to construct name_expansion. (free_name_expansion): Destruct name_expansion. * gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change index type from widest_int to offset_int. (class incr_info_d): Change incr type from widest_int to offset_int. (alloc_cand_and_find_basis, backtrace_base_for_ref, restructure_reference, slsr_process_ref, create_mul_ssa_cand, create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand, slsr_process_add, cand_abs_increment, replace_mult_candidate, replace_unconditional_candidate, incr_vec_index, create_add_on_incoming_edge, create_phi_basis_1, replace_conditional_candidate, record_increment, record_phi_increments_1, phi_incr_cost_1, phi_incr_cost, lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis, nearest_common_dominator_for_cands, insert_initializers, all_phi_incrs_profitable_1, replace_one_candidate, replace_profitable_candidates): Use offset_int rather than widest_int and wi::to_offset rather than wi::to_widest. * real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than 2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC allocated buffer. * tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new to construct tree_niter_desc and destruct it on failure. (free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL. * gengtype.cc (main): Remove widest_int handling. * graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS. * gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and assert get_len () fits into it. * value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks): For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. * gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use wide_int::from on wi::to_wide instead of wi::to_widest. * omp-general.cc (score_wide_int): New typedef. (omp_context_compute_score): Use score_wide_int instead of widest_int and adjust for those changes. (struct omp_declare_variant_entry): Change score and score_in_declare_simd_clone non-static data member type from widest_int to score_wide_int. (omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use score_wide_int instead of widest_int and adjust for those changes. (omp_lto_output_declare_variant_alt): Likewise. (omp_lto_input_declare_variant_alt): Likewise. * godump.cc (go_output_typedef): Assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS. gcc/c-family/ * c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS. gcc/testsuite/ * gcc.dg/bitint-38.c: New test.
author: Jakub Jelinek <jakub@redhat.com> 2023-10-12 16:01:12 +0200
committer: Jakub Jelinek <jakub@redhat.com> 2023-10-12 16:01:12 +0200
commit: 0d00385eaf72ccacff17935b0d214a26773e095f (patch)
tree: 8dd6bddfc6220b8e8140085bb0d8b805b4ca1207 /gcc/tree.cc
parent: cd0185b8d155887a7d3a78d0c186289140b44e47 (diff)
download: gcc-0d00385eaf72ccacff17935b0d214a26773e095f.zip
gcc-0d00385eaf72ccacff17935b0d214a26773e095f.tar.gz
gcc-0d00385eaf72ccacff17935b0d214a26773e095f.tar.bz2
1 files changed, 6 insertions, 6 deletions
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 850a4de..6fa85cc 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -2676,13 +2676,13 @@ build_zero_cst (tree type)
 tree
 build_replicated_int_cst (tree type, unsigned int width, HOST_WIDE_INT value)
 {
-  int n = (TYPE_PRECISION (type) + HOST_BITS_PER_WIDE_INT - 1)
-    / HOST_BITS_PER_WIDE_INT;
+  int n = ((TYPE_PRECISION (type) + HOST_BITS_PER_WIDE_INT - 1)
+	   / HOST_BITS_PER_WIDE_INT);
   unsigned HOST_WIDE_INT low, mask;
-  HOST_WIDE_INT a[WIDE_INT_MAX_ELTS];
+  HOST_WIDE_INT a[WIDE_INT_MAX_INL_ELTS];
   int i;
 
-  gcc_assert (n && n <= WIDE_INT_MAX_ELTS);
+  gcc_assert (n && n <= WIDE_INT_MAX_INL_ELTS);
 
   if (width == HOST_BITS_PER_WIDE_INT)
     low = value;
@@ -2696,8 +2696,8 @@ build_replicated_int_cst (tree type, unsigned int width, HOST_WIDE_INT value)
     a[i] = low;
 
   gcc_assert (TYPE_PRECISION (type) <= MAX_BITSIZE_MODE_ANY_INT);
-  return wide_int_to_tree
-    (type, wide_int::from_array (a, n, TYPE_PRECISION (type)));
+  return wide_int_to_tree (type, wide_int::from_array (a, n,
+						       TYPE_PRECISION (type)));
 }
 
 /* If floating-point type TYPE has an IEEE-style sign bit, return an
author	Jakub Jelinek <jakub@redhat.com>	2023-10-12 16:01:12 +0200
committer	Jakub Jelinek <jakub@redhat.com>	2023-10-12 16:01:12 +0200
commit	0d00385eaf72ccacff17935b0d214a26773e095f (patch)
tree	8dd6bddfc6220b8e8140085bb0d8b805b4ca1207 /gcc/tree.cc
parent	cd0185b8d155887a7d3a78d0c186289140b44e47 (diff)
download	gcc-0d00385eaf72ccacff17935b0d214a26773e095f.zip gcc-0d00385eaf72ccacff17935b0d214a26773e095f.tar.gz gcc-0d00385eaf72ccacff17935b0d214a26773e095f.tar.bz2