path: root/gcc/fold-const.cc
Age  Commit message  Author  Files  Lines
2024-02-26  fold-const: Avoid infinite recursion in +-*&|^minmax reassociation [PR114084]  Jakub Jelinek  1  -10/+41
In the following testcase we infinitely recurse during BIT_IOR_EXPR reassociation. One operand is (unsigned _BitInt(31)) a << 4 and the other operand is 2147483647 >> 1 | 80, where both the right shift and the | 80 trees have TREE_CONSTANT set but weren't folded because of delayed folding (some foldings are unfortunately still done even in that case).

Now, the fold_binary_loc reassociation code splits both operands into a variable part, minus variable part, constant part, minus constant part, literal part and minus literal part. To prevent infinite recursion it punts if there are just 2 parts altogether from the 2 operands; otherwise it goes on with reassociation, first merging the corresponding parts from both operands and then doing some further merges.

The problem with the above expressions is that we get 3 different objects, var0 (the left shift), con1 (the right shift) and lit1 (80), so the infinite recursion prevention doesn't trigger. We eventually merge con1 with lit1, which effectively reconstructs the original op1, and then associate that with var0, which is the original op0, and associate_trees for that case calls fold_binary. There are some casts involved there too (the T typedef type and the underlying _BitInt type, which are stripped with STRIP_NOPS).

The following patch attempts to prevent this infinite recursion by tracking the origin of each part (whether a certain var comes from nothing - 0, op0 - 1, op1 - 2 or both - 3) and propagating it through all the associate_trees calls which merge the vars. If near the end we'd try to merge what comes solely from op0 with what comes solely from op1 (or vice versa), the patch punts, because then it isn't any kind of reassociation between the two operands; if anything, it should be handled when folding the suboperands.
2024-02-26 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114084
* fold-const.cc (fold_binary_loc): Avoid the final associate_trees if all subtrees of var0 come from one of the op0 or op1 operands and all subtrees of con0 come from the other one. Don't clear variables which are never used afterwards.
* gcc.dg/bitint-94.c: New test.
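The origin-tracking idea described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Part`, `merge` and `should_punt` are hypothetical names invented for illustration and do not appear in fold-const.cc; only the 0/1/2/3 origin encoding is taken from the commit message.

```cpp
#include <cassert>

// Origin bits as described in the commit message:
// 0 = nothing, 1 = op0 only, 2 = op1 only, 3 = both.
enum Origin : unsigned { FROM_NONE = 0, FROM_OP0 = 1, FROM_OP1 = 2, FROM_BOTH = 3 };

// Hypothetical stand-in for one reassociation part (var, con, lit, ...).
struct Part {
  bool present;     // false models a missing (NULL_TREE) part
  unsigned origin;  // bitwise OR of Origin values
};

// Merging two parts unions their origins, which models propagating the
// origin through each merge of parts.
Part merge(Part a, Part b) {
  if (!a.present) return b;
  if (!b.present) return a;
  return Part{true, a.origin | b.origin};
}

// The final merge is not a real reassociation when one side comes solely
// from op0 and the other solely from op1: it would just rebuild the
// original expression, so the caller should give up instead.
bool should_punt(Part var, Part con) {
  if (!var.present || !con.present) return false;
  return (var.origin == FROM_OP0 && con.origin == FROM_OP1)
      || (var.origin == FROM_OP1 && con.origin == FROM_OP0);
}
```

With var0 from op0 and con1 and lit1 both from op1, merging con1 with lit1 yields origin 2, so the final merge with var0 is rejected, which is the PR114084 situation.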
2024-01-25  fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].  Robin Dapp  1  -0/+31
Found in PR112971 this patch adds folding support for bitwise operations of const duplicate zero/one vectors with stepped vectors. On riscv we have the situation that a folding would perpetually continue without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would not be folded to {0, 0, 0, ...}. gcc/ChangeLog: PR middle-end/112971 * fold-const.cc (simplify_const_binop): New function for binop simplification of two constant vectors when element-wise handling is not necessary. (const_binop): Call new function. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr112971.c: New test.
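As a rough illustration of why such a fold needs no element-wise work, here is a toy model of a constant vector as `base + i * step`. Everything in it is an assumption made for the sketch: `ConstVec`, `BitOp` and `fold_with_dup` are hypothetical, the all-ones case is the ordinary bitwise identity rather than something confirmed from the patch, and the real `simplify_const_binop` operates on GCC trees.

```cpp
#include <cassert>
#include <cstdint>

enum class BitOp { And, Ior, Xor };

// Toy constant vector: element i is base + i * step.
// step == 0 models a duplicate vector, step != 0 a stepped one.
struct ConstVec {
  std::int64_t base;
  std::int64_t step;
  bool is_dup_of(std::int64_t v) const { return step == 0 && base == v; }
};

// Try to fold `dup OP other` using only bitwise identities, without
// looking at individual elements. Returns false if no identity applies.
bool fold_with_dup(BitOp op, ConstVec dup, ConstVec other, ConstVec *result) {
  if (dup.is_dup_of(0)) {
    // 0 & x == 0, while 0 | x == x and 0 ^ x == x.
    *result = (op == BitOp::And) ? dup : other;
    return true;
  }
  if (dup.is_dup_of(-1)) {
    // ~0 & x == x and ~0 | x == ~0; XOR with all-ones is left alone here.
    if (op == BitOp::And) { *result = other; return true; }
    if (op == BitOp::Ior) { *result = dup; return true; }
  }
  return false;
}
```

In this model `{0, 0, 0, ...} & {7, 6, 5, ...}` is `fold_with_dup(BitOp::And, {0, 0}, {7, -1}, &r)`, which yields the zero duplicate again.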
2024-01-23  fold-const: Fold larger VIEW_CONVERT_EXPRs [PR113462]  Jakub Jelinek  1  -5/+17
On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote:
> We run into
>
> static tree
> native_interpret_int (tree type, const unsigned char *ptr, int len)
> {
> ...
>   if (total_bytes > len
>       || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
>     return NULL_TREE;
>
> OTOH using a V_C_E to "truncate" a _BitInt looks wrong? OTOH the
> check doesn't really handle native_encode_expr using the "proper"
> wide_int encoding however that's exactly handled. So it might be
> a pre-existing issue that's only uncovered by large _BitInts
> (__int128 might show similar issues?)

I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT conditions make no sense, all we care is whether it fits in the buffer or not. But then there is fold_view_convert_expr (and other spots) which use

  /* We support up to 1024-bit values (for GCN/RISC-V V128QImode). */
  unsigned char buffer[128];

or something similar. This patch fixes even that by using a XALLOCAVEC allocated buffer if the type size is 129 .. 8192 bytes.

2024-01-22 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113462
* fold-const.cc (native_interpret_int): Don't punt if total_bytes is larger than HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT.
(fold_view_convert_expr): Use XALLOCAVEC buffers for types with sizes between 129 and 8192 bytes.
2024-01-12  middle-end/113344 - is_truth_type_for vs GENERIC tcc_comparison  Richard Biener  1  -1/+1
On GENERIC tcc_comparison can have int type so restrict the PR113126 fix to vector types. PR middle-end/113344 * match.pd ((double)float CMP (double)float -> float CMP float): Perform result type check only for vectors. * fold-const.cc (fold_binary_loc): Likewise.
2024-01-11  tree-optimization/113126 - vector extension compare optimization  Richard Biener  1  -1/+2
The following makes sure the resulting boolean type is the same when eliding a float extension. PR tree-optimization/113126 * match.pd ((double)float CMP (double)float -> float CMP float): Make sure the boolean type is the same. * fold-const.cc (fold_binary_loc): Likewise. * gcc.dg/torture/pr113126.c: New testcase.
2024-01-03  Update copyright years.  Jakub Jelinek  1  -1/+1
2023-12-11  MATCH: (convert)(zero_one !=/== 0/1) for outer type and zero_one type are the same  Andrew Pinski  1  -27/+0
When I moved two_value to match.pd, I removed the check for the {0,+-1} as I had placed it after the {0,+-1} case for cond in match.pd. In the case of {0,+-1} and non boolean, before we would optimize those cases to just `(convert)a` but after we would get `(convert)(a != 0)` which was not handled anyways to just `(convert)a`. So this adds a pattern to match `(convert)(zeroone != 0)` and simplify to `(convert)zeroone`. Also this optimizes (convert)(zeroone == 0) into (zeroone^1) if the types match. Removing the opposite transformation from fold. The opposite transformation was added with https://gcc.gnu.org/pipermail/gcc-patches/2006-February/190514.html It is no longer considered the canonicalization either, even VRP will transform it back into `(~a) & 1` so removing it is a good idea. Note the testcase pr69270.c needed a slight update due to not matching exactly a scan pattern, this update makes it more robust and will match before and afterwards and if there are other changes in this area too. Note the testcase gcc.target/i386/pr110790-2.c needs a slight update for better code generation in LP64 bit mode. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/111972 PR tree-optimization/110637 * match.pd (`(convert)(zeroone !=/== CST)`): Match and simplify to ((convert)zeroone){,^1}. * fold-const.cc (fold_binary_loc): Remove transformation of `(~a) & 1` and `(a ^ 1) & 1` into `(convert)(a == 0)`. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr110637-1.c: New test. * gcc.dg/tree-ssa/pr110637-2.c: New test. * gcc.dg/tree-ssa/pr110637-3.c: New test. * gcc.dg/tree-ssa/pr111972-1.c: New test. * gcc.dg/tree-ssa/pr69270.c: Update testcase. * gcc.target/i386/pr110790-2.c: Update testcase. * gcc.dg/fold-even-1.c: Removed. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
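The equivalences this commit relies on are easy to check exhaustively for a value known to be 0 or 1. The following is a minimal standalone sketch; the helper names are hypothetical and the code only demonstrates the arithmetic identities, not the match.pd patterns themselves.

```cpp
#include <cassert>

// `a` is assumed to be a zero_one value, i.e. either 0 or 1.
int via_compare_ne(int a) { return static_cast<int>(a != 0); }  // (convert)(a != 0)
int via_compare_eq(int a) { return static_cast<int>(a == 0); }  // (convert)(a == 0)
int via_xor(int a)        { return a ^ 1; }                     // a ^ 1
int via_not_and(int a)    { return (~a) & 1; }                  // (~a) & 1

// Check every identity for both possible inputs.
bool identities_hold() {
  for (int a = 0; a <= 1; ++a) {
    if (via_compare_ne(a) != a) return false;                // (a != 0) is just a
    if (via_compare_eq(a) != via_xor(a)) return false;       // (a == 0) is a ^ 1
    if (via_compare_eq(a) != via_not_and(a)) return false;   // and also (~a) & 1
  }
  return true;
}
```

None of these identities holds for arbitrary integers, only for inputs restricted to 0 and 1, which is why the pattern is stated for zero_one operands.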
2023-11-29  fold-const: Fix up multiple_of_p [PR112733]  Jakub Jelinek  1  -1/+1
We ICE on the following testcase when wi::multiple_of_p is called on widest_int 1 and -128 with UNSIGNED. I still need to work on the actual wide-int.cc issue, the latest patch attached to the PR regressed bitint-{38,39}.c, so will need to debug that, but there is a clear bug on the fold-const.cc side as well - widest_int is a signed representation by definition, using UNSIGNED with it certainly doesn't match what was intended, because -128 as the second operand effectively means unsigned 131072 bit 0xfffff............ffff80 integer, not the signed char -128 that appeared in the source. In the INTEGER_CST case a few lines above this we already use case INTEGER_CST: if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom)) return false; return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom), SIGNED); so I think using SIGNED with widest_int is best there (compared to the other choices in the PR). 2023-11-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/112733 * fold-const.cc (multiple_of_p): Pass SIGNED rather than UNSIGNED for wi::multiple_of_p on widest_int arguments. * gcc.dg/pr112733.c: New test.
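The effect of picking the wrong signedness can be seen on a much smaller scale with 64-bit integers standing in for widest_int. This is a sketch under that assumption; the two helper functions are hypothetical and are not GCC's `wi::multiple_of_p`, and both require a nonzero `bottom`.

```cpp
#include <cassert>
#include <cstdint>

// Signed interpretation: -128 really is -128.
// Precondition (assumed): bottom != 0 and the pair is not (INT64_MIN, -1).
bool multiple_of_signed(std::int64_t top, std::int64_t bottom) {
  return top % bottom == 0;
}

// Unsigned interpretation of the same bit patterns: -128 becomes
// 0xffffffffffffff80, a huge positive number.
// Precondition (assumed): bottom != 0.
bool multiple_of_unsigned(std::int64_t top, std::int64_t bottom) {
  return static_cast<std::uint64_t>(top) % static_cast<std::uint64_t>(bottom) == 0;
}
```

For example, 256 is a multiple of -128 under the signed reading, but not under the unsigned one, where the divisor is larger than the dividend.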
2023-11-27  PR111754: Rework encoding of result for VEC_PERM_EXPR with constant input vectors.  Prathamesh Kulkarni  1  -19/+91
gcc/ChangeLog: PR middle-end/111754 * fold-const.cc (fold_vec_perm_cst): Set result's encoding to sel's encoding, and set res_nelts_per_pattern to 2 if sel contains stepped sequence but input vectors do not. (test_nunits_min_2): New test Case 8. (test_nunits_min_4): New tests Case 8 and Case 9. gcc/testsuite/ChangeLog: PR middle-end/111754 * gcc.target/aarch64/sve/slp_3.c: Adjust code-gen. * gcc.target/aarch64/sve/slp_4.c: Likewise. * gcc.dg/vect/pr111754.c: New test. Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
2023-11-10  Handle constant CONSTRUCTORs in operand_compare  Eric Botcazou  1  -7/+67
This teaches operand_compare to compare constant CONSTRUCTORs, which is quite helpful for so-called fat pointers in Ada, i.e. objects that are semantically pointers but are represented by structures made up of two pointers. This is modeled on the implementation present in the ICF pass. gcc/ * fold-const.cc (operand_compare::operand_equal_p) <CONSTRUCTOR>: Deal with nonempty constant CONSTRUCTORs. (operand_compare::hash_operand) <CONSTRUCTOR>: Hash DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET for FIELD_DECLs. gcc/testsuite/ * gnat.dg/opt103.ads, gnat.dg/opt103.adb: New test.
2023-10-19  PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding.  Prathamesh Kulkarni  1  -11/+103
gcc/ChangeLog: PR tree-optimization/111648 * fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): If a1 chooses base element from arg, ensure that it's a natural stepped sequence. (build_vec_cst_rand): New param natural_stepped and use it to construct a naturally stepped sequence. (test_nunits_min_2): Add new unit tests Case 6 and Case 7.
2023-10-16  use more get_range_query  Jiufu Guo  1  -5/+1
With "get_global_range_query", SSA_NAME_RANGE_INFO can be queried. "get_range_query" can provide more context-aware range info, and looking at its implementation, it returns the global range query if there is no local function info. So, when not querying for an SSA_NAME and not changing the IL, it is ok to use get_range_query in place of get_global_range_query. gcc/ChangeLog: * fold-const.cc (expr_not_equal_to): Replace get_global_range_query by get_range_query. * gimple-fold.cc (size_must_be_zero_p): Likewise. * gimple-range-fold.cc (fur_source::fur_source): Likewise. * gimple-ssa-warn-access.cc (check_nul_terminated_array): Likewise. * tree-dfa.cc (get_ref_base_and_extent): Likewise.
2023-10-12  wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]  Jakub Jelinek  1  -3/+11
As mentioned in the _BitInt support thread, _BitInt(N) is currently limited by the wide_int/widest_int maximum precision limitation, which, depending on the target, is 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION). That is a fairly low limit for _BitInt, especially on the targets with the 191 bit limitation. The following patch bumps that limit to 16319 bits on all arches (which support _BitInt at all), which is the limit imposed by the INTEGER_CST representation (unsigned char members holding the number of HOST_WIDE_INT limbs).

In order to achieve that, wide_int is changed from a trivially copyable type which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or 11 limbs depending on target) limbs into a non-trivially copy constructible, copy assignable and destructible type which for the usual small cases (up to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses an inline array of limbs, but for larger precisions uses a heap allocated limb array. This makes wide_int unusable in GC structures, so for dwarf2out, which was the only place which needed it, there is a new rwide_int type (restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs inline and is trivially copyable (dwarf2out should never deal with large _BitInt constants, those should have been lowered earlier).

Similarly, widest_int has been changed from a trivially copyable type which also contained an inline array of WIDE_INT_MAX_ELTS limbs (but unlike wide_int didn't contain precision and assumed that to be WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy assignable and destructible type which always has WIDEST_INT_MAX_PRECISION precision (32640 bits currently, twice as much as the INTEGER_CST limitation allows) and unlike wide_int decides depending on the get_len () value whether it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or a heap allocated one.
In wide-int.h this means we need to estimate an upper bound on how many limbs will wide-int.cc (usually, sometimes wide-int.h) need to write, heap allocate if needed based on that estimation and upon set_len which is done at the end if we guessed over WIDE_INT_MAX_INL_ELTS and allocated dynamically, while we actually need less than that copy/deallocate. The unexact guesses are needed because the exact computation of the length in wide-int.cc is sometimes quite complex and especially canonicalize at the end can decrease it. widest_int is again because of this not usable in GC structures, so cfgloop.h has been changed to use fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if we'd have larger _BitInt based iterators, programs having more than 128-bit iterators will be hopefully rare and I think it is fine to treat loops with more than 2^127 iterations as effectively possibly infinite, omp-general.cc is changed to use fixed_wide_int_storage <1024>, as it better should support scores with the same precision on all arches. Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for larger lengths. On x86_64, the patch in --enable-checking=yes,rtl,extra configured bootstrapped cc1plus enlarges the .text section by 1.01% - from 0x25725a5 to 0x25e5555 and similarly at least when compiling insn-recog.cc with the usual bootstrap option slows compilation down by 1.01%, user 4m22.046s and 4m22.384s on vanilla trunk vs. 4m25.947s and 4m25.581s on patched trunk. I'm afraid some code size growth and compile time slowdown is unavoidable in this case, we use wide_int and widest_int everywhere, and while the rare cases are marked with UNLIKELY macros, it still means extra checks for it. 
The patch also regresses +FAIL: gm2/pim/fail/largeconst.mod, -O +FAIL: gm2/pim/fail/largeconst.mod, -O -g +FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer +FAIL: gm2/pim/fail/largeconst.mod, -O3 -fomit-frame-pointer -finline-functions +FAIL: gm2/pim/fail/largeconst.mod, -Os +FAIL: gm2/pim/fail/largeconst.mod, -g +FAIL: gm2/pim/fail/largeconst2.mod, -O +FAIL: gm2/pim/fail/largeconst2.mod, -O -g +FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer +FAIL: gm2/pim/fail/largeconst2.mod, -O3 -fomit-frame-pointer -finline-functions +FAIL: gm2/pim/fail/largeconst2.mod, -Os +FAIL: gm2/pim/fail/largeconst2.mod, -g tests, which previously were rejected with error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range kind of errors, but now are accepted. Seems the FE tries to parse constants into widest_int in that case and only diagnoses if widest_int overflows, that seems wrong, it should at least punt if stuff doesn't fit into WIDE_INT_MAX_PRECISION, but perhaps far less than that, if it wants support for middle-end for precisions above 128-bit, it better should be using BITINT_TYPE. Will file a PR and defer to Modula2 maintainer. 2023-10-12 Jakub Jelinek <jakub@redhat.com> PR c/102989 * wide-int.h: Adjust file comment. (WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS. (WIDE_INT_MAX_INL_PRECISION): Define. (WIDE_INT_MAX_ELTS): Change to 255. Assert that WIDE_INT_MAX_INL_ELTS is smaller than WIDE_INT_MAX_ELTS. (RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS, WIDEST_INT_MAX_PRECISION): Define. (WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers to pass 0 as a new argument. (class widest_int_storage): Likewise. (widest_int, widest2_int): Change typedefs to use widest_int_storage rather than fixed_wide_int_storage. 
(enum wi::precision_type): Add INL_CONST_PRECISION enumerator. (struct binary_traits): Add partial specializations for INL_CONST_PRECISION. (generic_wide_int): Add needs_write_val_arg static data member. (int_traits): Likewise. (wide_int_storage): Replace val non-static data member with a union u of it and HOST_WIDE_INT *valp. Declare copy constructor, copy assignment operator and destructor. Add unsigned int argument to write_val. (wide_int_storage::wide_int_storage): Initialize precision to 0 in the default ctor. Remove unnecessary {}s around STATIC_ASSERTs. Assert in non-default ctor T's precision_type is not INL_CONST_PRECISION and allocate u.valp for large precision. Add copy constructor. (wide_int_storage::~wide_int_storage): New. (wide_int_storage::operator=): Add copy assignment operator. In assignment operator remove unnecessary {}s around STATIC_ASSERTs, assert ctor T's precision_type is not INL_CONST_PRECISION and if precision changes, deallocate and/or allocate u.valp. (wide_int_storage::get_val): Return u.valp rather than u.val for large precision. (wide_int_storage::write_val): Likewise. Add an unused unsigned int argument. (wide_int_storage::set_len): Use write_val instead of writing val directly. (wide_int_storage::from, wide_int_storage::from_array): Adjust write_val callers. (wide_int_storage::create): Allocate u.valp for large precisions. (wi::int_traits <wide_int_storage>::get_binary_precision): New. (fixed_wide_int_storage::fixed_wide_int_storage): Make default ctor defaulted. (fixed_wide_int_storage::write_val): Add unused unsigned int argument. (fixed_wide_int_storage::from, fixed_wide_int_storage::from_array): Adjust write_val callers. (wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New. (WIDEST_INT): Define. (widest_int_storage): New template class. (wi::int_traits <widest_int_storage>): New. (trailing_wide_int_storage::write_val): Add unused unsigned int argument. 
(wi::get_binary_precision): Use wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision rather than get_precision on get_binary_result. (wi::copy): Adjust write_val callers. Don't call set_len if needs_write_val_arg. (wi::bit_not): If result.needs_write_val_arg, call write_val again with upper bound estimate of len. (wi::sext, wi::zext, wi::set_bit): Likewise. (wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not, wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc, wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc, wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round, wi::lshift, wi::lrshift, wi::arshift): Likewise. (wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg is false. (gt_ggc_mx, gt_pch_nx): Remove generic template for all generic_wide_int, instead add functions and templates for each storage of generic_wide_int. Make functions for generic_wide_int <wide_int_storage> and templates for generic_wide_int <widest_int_storage <N>> deleted. (wi::mask, wi::shifted_mask): Adjust write_val calls. * wide-int.cc (zeros): Decrease array size to 1. (BLOCKS_NEEDED): Use CEIL. (canonize): Use HOST_WIDE_INT_M1. (wi::from_buffer): Pass 0 to write_val. (wi::to_mpz): Use CEIL. (wi::from_mpz): Likewise. Pass 0 to write_val. Use WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS. (wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec above WIDE_INT_MAX_INL_PRECISION estimate precision from lengths of operands. Use XALLOCAVEC allocated buffers for prec above WIDE_INT_MAX_INL_PRECISION. (wi::divmod_internal): Likewise. (wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate it from xlen and skip. (rshift_large_common): Remove xprecision argument, add len argument with len computed in caller. Don't return anything. 
(wi::lrshift_large, wi::arshift_large): Compute len here and pass it to rshift_large_common, for lengths above WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible. (assert_deceq, assert_hexeq): For lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. (test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION. * wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION. * wide-int-print.cc (print_decs, print_decu, print_hex): For lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. * tree.h (wi::int_traits<extended_tree <N>>): Change precision_type to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION. (widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of WIDE_INT_MAX_PRECISION. (wi::ints_for): Use int_traits <extended_tree <N> >::precision_type instead of hard coded CONST_PRECISION. (widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of WIDE_INT_MAX_PRECISION. (wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather than WIDE_INT_MAX_PRECISION. (wi::ints_for::zero): Use wi::int_traits <wi::extended_tree <N> >::precision_type instead of wi::CONST_PRECISION. * tree.cc (build_replicated_int_cst): Formatting fix. Use WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS. * print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on INTEGER_CSTs, TREE_VECs or SSA_NAMEs. * double-int.h (wi::int_traits <double_int>::precision_type): Change to INL_CONST_PRECISION from CONST_PRECISION. * poly-int.h (struct poly_coeff_traits): Add partial specialization for wi::INL_CONST_PRECISION. * cfgloop.h (bound_wide_int): New typedef. (struct nb_iter_bound): Change bound type from widest_int to bound_wide_int. (struct loop): Change nb_iterations_upper_bound, nb_iterations_likely_upper_bound and nb_iterations_estimate type from widest_int to bound_wide_int. 
* cfgloop.cc (record_niter_bound): Return early if wi::min_precision of i_bound is too large for bound_wide_int. Adjustments for the widest_int to bound_wide_int type change in non-static data members. (get_estimated_loop_iterations, get_max_loop_iterations, get_likely_max_loop_iterations): Adjustments for the widest_int to bound_wide_int type change in non-static data members. * tree-vect-loop.cc (vect_transform_loop): Likewise. * tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use XALLOCAVEC allocated buffer for i_bound len above WIDE_INT_MAX_INL_ELTS. (record_estimate): Return early if wi::min_precision of i_bound is too large for bound_wide_int. Adjustments for the widest_int to bound_wide_int type change in non-static data members. (wide_int_cmp): Use bound_wide_int instead of widest_int. (bound_index): Use bound_wide_int instead of widest_int. (discover_iteration_bound_by_body_walk): Likewise. Use widest_int::from to convert it to widest_int when passed to record_niter_bound. (maybe_lower_iteration_bound): Use widest_int::from to convert it to widest_int when passed to record_niter_bound. (estimate_numbers_of_iteration): Don't record upper bound if loop->nb_iterations has too large precision for bound_wide_int. (n_of_executions_at_most): Use widest_int::from. * tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for the widest_int to bound_wide_int changes. * match.pd (fold_sign_changed_comparison simplification): Use wide_int::from on wi::to_wide instead of wi::to_widest. * value-range.h (irange::maybe_resize): Avoid using memcpy on non-trivially copyable elements. * value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE. * fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc): Use wide_int::from on wi::to_wide instead of wi::to_widest. * tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width before calling wi::udiv_trunc. 
* lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to bound_wide_int type change in non-static data members. * lto-streamer-in.cc (input_cfg): Likewise. (lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS. For length above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. Formatting fix. * data-streamer-in.cc (streamer_read_wide_int, streamer_read_widest_int): Likewise. * tree-affine.cc (aff_combination_expand): Use placement new to construct name_expansion. (free_name_expansion): Destruct name_expansion. * gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change index type from widest_int to offset_int. (class incr_info_d): Change incr type from widest_int to offset_int. (alloc_cand_and_find_basis, backtrace_base_for_ref, restructure_reference, slsr_process_ref, create_mul_ssa_cand, create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand, slsr_process_add, cand_abs_increment, replace_mult_candidate, replace_unconditional_candidate, incr_vec_index, create_add_on_incoming_edge, create_phi_basis_1, replace_conditional_candidate, record_increment, record_phi_increments_1, phi_incr_cost_1, phi_incr_cost, lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis, nearest_common_dominator_for_cands, insert_initializers, all_phi_incrs_profitable_1, replace_one_candidate, replace_profitable_candidates): Use offset_int rather than widest_int and wi::to_offset rather than wi::to_widest. * real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than 2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC allocated buffer. * tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new to construct tree_niter_desc and destruct it on failure. (free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL. * gengtype.cc (main): Remove widest_int handling. * graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS. 
* gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and assert get_len () fits into it. * value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks): For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer. * gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use wide_int::from on wi::to_wide instead of wi::to_widest. * omp-general.cc (score_wide_int): New typedef. (omp_context_compute_score): Use score_wide_int instead of widest_int and adjust for those changes. (struct omp_declare_variant_entry): Change score and score_in_declare_simd_clone non-static data member type from widest_int to score_wide_int. (omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use score_wide_int instead of widest_int and adjust for those changes. (omp_lto_output_declare_variant_alt): Likewise. (omp_lto_input_declare_variant_alt): Likewise. * godump.cc (go_output_typedef): Assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS. gcc/c-family/ * c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS. gcc/testsuite/ * gcc.dg/bitint-38.c: New test.
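The storage scheme described in this commit (an inline limb array for small values, a heap array for large ones) and the arithmetic behind the 16319 and 32640 bit figures can be sketched as follows. This is an illustrative model only: `LimbStorage` and the `k...` constants are hypothetical names, the inline capacity of 9 is just one of the per-target values mentioned above, and `std::unique_ptr` is used for brevity where the commit describes a union of an inline array and a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

constexpr unsigned kLimbBits = 64;   // HOST_WIDE_INT is 64 bits wide
constexpr unsigned kMaxLimbs = 255;  // largest count an unsigned char can hold
static_assert(kMaxLimbs * kLimbBits - 1 == 16319, "_BitInt limit from the commit");
static_assert(2 * kMaxLimbs * kLimbBits == 32640, "widest_int precision from the commit");

// Assumed inline capacity; the commit lists 3, 5, 9 or 11 depending on target.
constexpr std::size_t kInlineLimbs = 9;

// Small values live in the inline array; larger ones get a heap array.
// Owning a heap pointer makes the type non-trivially copyable (here it is
// move-only), which is the property that rules out use in GC structures.
class LimbStorage {
 public:
  explicit LimbStorage(std::size_t limbs) : len_(limbs), inline_() {
    if (len_ > kInlineLimbs)
      heap_.reset(new std::int64_t[len_]());  // zero-initialised limbs
  }
  std::int64_t *data() { return heap_ ? heap_.get() : inline_; }
  bool on_heap() const { return heap_ != nullptr; }
  std::size_t size() const { return len_; }

 private:
  std::size_t len_;
  std::int64_t inline_[kInlineLimbs];
  std::unique_ptr<std::int64_t[]> heap_;
};
```

A three-limb value stays inline while a hundred-limb value is heap allocated, so the common small case pays only for the extra length check.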
2023-10-10  tree-optimization/111751 - support 1024 bit vector constant reinterpretation  Richard Biener  1  -2/+2
The following ups the limit in fold_view_convert_expr to handle 1024bit vectors as used by GCN and RVV. It also robustifies the handling in visit_reference_op_load to properly give up when constants cannot be re-interpreted. PR tree-optimization/111751 * fold-const.cc (fold_view_convert_expr): Up the buffer size to 128 bytes. * tree-ssa-sccvn.cc (visit_reference_op_load): Special case constants, giving up when re-interpretation to the target type fails.
2023-09-29  Remove poly_int_pod  Richard Sandiford  1  -2/+2
poly_int was written before the switch to C++11 and so couldn't use explicit default constructors. This led to an awkward split between poly_int_pod and poly_int. poly_int simply inherited from poly_int_pod and added constructors, with the argumentless constructor having an empty body. But inheritance meant that poly_int had to repeat the assignment operators from poly_int_pod (again, no C++11, so no "using" to inherit base-class implementations). All that goes away if we switch to using default constructors. The main complication is ensuring that braced initialisation still gives a constexpr, so that static variables can be initialised without runtime code. The two problems here are: (1) When initialising a poly_int<N, wide_int> with fewer than N coefficients, the other coefficients need to be a zero of the same precision as the explicit coefficients. This was previously done in a for loop using wi::ints_for<...>::zero, but C++11 constexpr constructors can't have function bodies. The patch instead uses a series of delegated initialisers to fill in the implicit coefficients. (2) The initialisation in: void f(int x) { unsigned int foo {x}; } produces the warning: warning: narrowing conversion of 'x' from 'int' to 'unsigned int' [-Wnarrowing] whereas: void f(int x) { unsigned int foo = x; } does not. So switching to direct initialisation of the coeffs array would mean that: poly_uin64_t x = 0; would trigger a warning for using 0 rather than 0u. That seemed overly pedantic, so the patch adds explicit casts to the constructor. The complication is to do that without adding extra code to wide-int versions. The patch uses a new init_cast type for that. gcc/ * poly-int.h (poly_int_pod): Delete. (poly_coeff_traits::init_cast): New type. (poly_int_full, poly_int_hungry, poly_int_fullness): New structures. (poly_int): Replace constructors that take 1 and 2 coefficients with a general one that takes an arbitrary number of coefficients. 
Delegate initialization to two new private constructors, one of which uses the coefficients as-is and one of which adds an extra zero of the appropriate type (and precision, where applicable). (gt_ggc_mx, gt_pch_nx): Operate on poly_ints rather than poly_int_pods. * poly-int-types.h (poly_uint16_pod, poly_int64_pod, poly_uint64_pod) (poly_offset_int_pod, poly_wide_int_pod, poly_widest_int_pod): Delete. * gengtype.cc (main): Don't register poly_int64_pod. * calls.cc (initialize_argument_information): Use poly_int rather than poly_int_pod. (combine_pending_stack_adjustment_and_call): Likewise. * config/aarch64/aarch64.cc (pure_scalable_type_info): Likewise. * data-streamer.h (bp_unpack_poly_value): Likewise. * dwarf2cfi.cc (struct dw_trace_info): Likewise. (struct queued_reg_save): Likewise. * dwarf2out.h (struct dw_cfa_location): Likewise. * emit-rtl.h (struct incoming_args): Likewise. (struct rtl_data): Likewise. * expr.cc (get_bit_range): Likewise. (get_inner_reference): Likewise. * expr.h (get_bit_range): Likewise. * fold-const.cc (split_address_to_core_and_offset): Likewise. (ptr_difference_const): Likewise. * fold-const.h (ptr_difference_const): Likewise. * function.cc (try_fit_stack_local): Likewise. (instantiate_new_reg): Likewise. * function.h (struct expr_status): Likewise. (struct args_size): Likewise. * genmodes.cc (ZERO_COEFFS): Likewise. (mode_size_inline): Likewise. (mode_nunits_inline): Likewise. (emit_mode_precision): Likewise. (emit_mode_size): Likewise. (emit_mode_nunits): Likewise. * gimple-fold.cc (get_base_constructor): Likewise. * gimple-ssa-store-merging.cc (struct symbolic_number): Likewise. * inchash.h (class hash): Likewise. * ipa-modref-tree.cc (modref_access_node::dump): Likewise. * ipa-modref.cc (modref_access_analysis::merge_call_side_effects): Likewise. * ira-int.h (ira_spilled_reg_stack_slot): Likewise. * lra-eliminations.cc (self_elim_offsets): Likewise. * machmode.h (mode_size, mode_precision, mode_nunits): Likewise. 
* omp-low.cc (omplow_simd_context): Likewise. * pretty-print.cc (pp_wide_integer): Likewise. * pretty-print.h (pp_wide_integer): Likewise. * reload.cc (struct decomposition): Likewise. * reload.h (struct reload): Likewise. * reload1.cc (spill_stack_slot_width): Likewise. (struct elim_table): Likewise. (offsets_at): Likewise. (init_eliminable_invariants): Likewise. * rtl.h (union rtunion): Likewise. (poly_int_rtx_p): Likewise. (strip_offset): Likewise. (strip_offset_and_add): Likewise. * rtlanal.cc (strip_offset): Likewise. * tree-dfa.cc (get_ref_base_and_extent): Likewise. (get_addr_base_and_unit_offset_1): Likewise. (get_addr_base_and_unit_offset): Likewise. * tree-dfa.h (get_ref_base_and_extent): Likewise. (get_addr_base_and_unit_offset_1): Likewise. (get_addr_base_and_unit_offset): Likewise. * tree-ssa-loop-ivopts.cc (struct iv_use): Likewise. (strip_offset): Likewise. * tree-ssa-sccvn.h (struct vn_reference_op_struct): Likewise. * tree.cc (ptrdiff_tree_p): Likewise. * tree.h (poly_int_tree_p): Likewise. (ptrdiff_tree_p): Likewise. (get_inner_reference): Likewise. gcc/testsuite/ * gcc.dg/plugin/poly-int-tests.h (test_num_coeffs_extra): Use poly_int rather than poly_int_pod.
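The two constexpr issues in the poly_int_pod removal message can be illustrated with a reduced sketch. `mini_poly` is a hypothetical two-coefficient stand-in invented for this example, not GCC's `poly_int`, and it does not model `init_cast` or wide-int precision. It only shows a one-argument constructor delegating to a full one to supply the implicit zero, and explicit casts that keep initialisation from a plain `int` free of narrowing diagnostics.

```cpp
#include <cassert>

// Hypothetical miniature of a two-coefficient polynomial integer.
// It is not GCC's poly_int; it only mirrors the two ideas in the message.
template <typename C>
struct mini_poly
{
  C coeffs[2];

  // Defaulted, so no user-written empty constructor body is needed.
  mini_poly () = default;

  // One explicit coefficient: delegate and append a zero of type C.
  template <typename A>
  constexpr mini_poly (const A &a) : mini_poly (a, C (0)) {}

  // All coefficients given: the explicit casts to C avoid the narrowing
  // diagnostic that braced initialisation from an int would trigger.
  template <typename A, typename B>
  constexpr mini_poly (const A &a, const B &b) : coeffs{ C (a), C (b) } {}
};

// Constant-initialised: no runtime constructor code is required.
constexpr mini_poly<unsigned long> one = 1;
static_assert (one.coeffs[0] == 1 && one.coeffs[1] == 0, "implicit zero");
```

The empty constructor bodies are what make this valid as C++11 constexpr; a loop that filled in the zeros would not be.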
2023-09-12fold-const: Handle BITINT_TYPE in range_check_typeJakub Jelinek1-1/+6
When discussing PR111369 with Andrew Pinski, I've realized that I haven't added BITINT_TYPE handling to range_check_type. Right now (unsigned) max + 1 == (unsigned) min for signed _BitInt, so I think we don't need to do the extra hops for BITINT_TYPE (though possibly we don't need them for INTEGER_TYPE either in the two's complement world and we don't support anything else, though I really don't know if Ada or some other FEs don't create weird INTEGER_TYPEs). 2023-09-12 Jakub Jelinek <jakub@redhat.com> * fold-const.cc (range_check_type): Handle BITINT_TYPE like OFFSET_TYPE.
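The identity this message relies on can be checked with an ordinary integer type. The sketch below uses plain `int` as a stand-in, since `_BitInt` is a C23 feature; the function name is made up for the example.

```cpp
#include <cassert>
#include <climits>

// Stand-in using plain int. Converting a signed value to unsigned is
// modular, so in two's complement the minimum maps to max + 1.
constexpr bool max_plus_one_is_min ()
{
  return static_cast<unsigned> (INT_MAX) + 1u
         == static_cast<unsigned> (INT_MIN);
}
static_assert (max_plus_one_is_min (), "holds for two's complement int");
```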
2023-09-09Support folding min(poly,poly) to constLehua Ding1-0/+24
This patch adds support that tries to fold `MIN (poly, poly)` to a constant. Consider the following C Code: ``` void foo2 (int* restrict a, int* restrict b, int n) { for (int i = 0; i < 3; i += 1) a[i] += b[i]; } ``` Before this patch: ``` void foo2 (int * restrict a, int * restrict b, int n) { vector([4,4]) int vect__7.27; vector([4,4]) int vect__6.26; vector([4,4]) int vect__4.23; unsigned long _32; <bb 2> [local count: 268435456]: _32 = MIN_EXPR <3, POLY_INT_CST [4, 4]>; vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, _32, 0); vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, _32, 0); vect__7.27_9 = vect__6.26_15 + vect__4.23_20; .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, _32, 0, vect__7.27_9); [tail call] return; } ``` After this patch: ``` void foo2 (int * restrict a, int * restrict b, int n) { vector([4,4]) int vect__7.27; vector([4,4]) int vect__6.26; vector([4,4]) int vect__4.23; <bb 2> [local count: 268435456]: vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, 3, 0); vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, 3, 0); vect__7.27_9 = vect__6.26_15 + vect__4.23_20; .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, 3, 0, vect__7.27_9); [tail call] return; } ``` For RISC-V RVV, csrr and branch instructions can be reduced: Before this patch: ``` foo2: csrr a4,vlenb srli a4,a4,2 li a5,3 bleu a5,a4,.L5 mv a5,a4 .L5: vsetvli zero,a5,e32,m1,ta,ma ... ``` After this patch. ``` foo2: vsetivli zero,3,e32,m1,ta,ma ... ``` gcc/ChangeLog: * fold-const.cc (can_min_p): New function. (poly_int_binop): Try fold MIN_EXPR. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/div-1.c: Adjust. * gcc.target/riscv/rvv/autovec/vls/shift-3.c: Adjust. * gcc.target/riscv/rvv/autovec/fold-min-poly.c: New test.
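The condition under which `MIN (poly, poly)` has a known answer can be sketched as follows. This is an illustration under stated assumptions, not the body of `can_min_p`: `poly2`, `known_le_sketch` and `can_min_sketch` are hypothetical names, and a value is modelled as `c0 + c1 * x` with a runtime `x >= 0`.

```cpp
#include <cassert>

// Hypothetical two-coefficient value c0 + c1 * x, where x >= 0 is only
// known at run time (for example, derived from the vector length).
struct poly2 { long c0, c1; };

// True if a <= b for every x >= 0: each coefficient must be <=.
bool known_le_sketch (poly2 a, poly2 b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Try to fold MIN (a, b) to one operand; fail if the order depends on x.
bool can_min_sketch (poly2 a, poly2 b, poly2 &res)
{
  if (known_le_sketch (a, b)) { res = a; return true; }
  if (known_le_sketch (b, a)) { res = b; return true; }
  return false;
}
```

With `a = {3, 0}` and `b = {4, 4}` this picks `a`, matching the `MIN_EXPR <3, POLY_INT_CST [4, 4]>` case in the dump. With `a = {5, 0}` the order depends on `x`, so nothing is folded.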
2023-09-07middle-end: Avoid calling targetm.c.bitint_type_info inside of gcc_assert [PR102989]Jakub Jelinek1-4/+4
On Thu, Sep 07, 2023 at 10:36:02AM +0200, Thomas Schwinge wrote: > Minor comment/question: are we doing away with the property that > 'assert'-like "calls" must not have side effects? Per 'gcc/system.h', > this is "OK" for 'gcc_assert' for '#if ENABLE_ASSERT_CHECKING' or > '#elif (GCC_VERSION >= 4005)' -- that is, GCC 4.5, which is always-true, > thus the "offending" '#else' is never active. However, it's different > for standard 'assert' and 'gcc_checking_assert', so I'm not sure if > that's a good property for 'gcc_assert' only? For example, see also > <https://gcc.gnu.org/PR6906> "warn about asserts with side effects", or > recent <https://gcc.gnu.org/PR111144> > "RFE: could -fanalyzer warn about assertions that have side effects?". You're right, the #define gcc_assert(EXPR) ((void)(0 && (EXPR))) fallback definition is incompatible with the way I've used it, so for --disable-checking built by non-GCC it would not work properly. 2023-09-07 Jakub Jelinek <jakub@redhat.com> PR c/102989 * expr.cc (expand_expr_real_1): Don't call targetm.c.bitint_type_info inside gcc_assert, as later code relies on it filling info variable. * gimple-fold.cc (clear_padding_bitint_needs_padding_p, clear_padding_type): Likewise. * varasm.cc (output_constant): Likewise. * fold-const.cc (native_encode_int, native_interpret_int): Likewise. * stor-layout.cc (finish_bitfield_representative, layout_type): Likewise. * gimple-lower-bitint.cc (bitint_precision_kind): Likewise.
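The hazard with the fallback definition can be reproduced in isolation. In this sketch `fallback_assert` copies the quoted fallback macro, while `fill_info` is a hypothetical stand-in for a hook that fills an output variable. It is not the real `targetm.c.bitint_type_info`.

```cpp
#include <cassert>

// Same shape as the fallback definition quoted in the message:
// EXPR is never evaluated because of the short-circuiting 0 &&.
#define fallback_assert(EXPR) ((void) (0 && (EXPR)))

// Hypothetical hook-like function: reports success and fills *info.
static bool fill_info (int *info) { *info = 42; return true; }

// Problematic: with the fallback macro the call is skipped, so info
// keeps its initial value.
int info_via_assert ()
{
  int info = -1;
  fallback_assert (fill_info (&info));
  return info;
}

// Safe: call unconditionally, then assert on the saved result.
int info_via_separate_call ()
{
  int info = -1;
  bool ok = fill_info (&info);
  fallback_assert (ok);
  return info;
}
```

The second form is the pattern the ChangeLog describes: the call is moved out of the assertion so that later code can rely on the filled-in variable.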
2023-09-06Middle-end _BitInt support [PR102989]Jakub Jelinek1-9/+66
The following patch introduces the middle-end part of the _BitInt support, a new BITINT_TYPE, handling it where needed, except the lowering pass and sanitizer support. 2023-09-06 Jakub Jelinek <jakub@redhat.com> PR c/102989 * tree.def (BITINT_TYPE): New type. * tree.h (TREE_CHECK6, TREE_NOT_CHECK6): Define. (NUMERICAL_TYPE_CHECK, INTEGRAL_TYPE_P): Include BITINT_TYPE. (BITINT_TYPE_P): Define. (CONSTRUCTOR_BITFIELD_P): Return true even for BLKmode bit-fields if they have BITINT_TYPE type. (tree_check6, tree_not_check6): New inline functions. (any_integral_type_check): Include BITINT_TYPE. (build_bitint_type): Declare. * tree.cc (tree_code_size, wide_int_to_tree_1, cache_integer_cst, build_zero_cst, type_hash_canon_hash, type_cache_hasher::equal, type_hash_canon): Handle BITINT_TYPE. (bitint_type_cache): New variable. (build_bitint_type): New function. (signed_or_unsigned_type_for, verify_type_variant, verify_type): Handle BITINT_TYPE. (tree_cc_finalize): Free bitint_type_cache. * builtins.cc (type_to_class): Handle BITINT_TYPE. (fold_builtin_unordered_cmp): Handle BITINT_TYPE like INTEGER_TYPE. * cfgexpand.cc (expand_debug_expr): Punt on BLKmode BITINT_TYPE INTEGER_CSTs. * convert.cc (convert_to_pointer_1, convert_to_real_1, convert_to_complex_1): Handle BITINT_TYPE like INTEGER_TYPE. (convert_to_integer_1): Likewise. For BITINT_TYPE don't check GET_MODE_PRECISION (TYPE_MODE (type)). * doc/generic.texi (BITINT_TYPE): Document. * doc/tm.texi.in (TARGET_C_BITINT_TYPE_INFO): New. * doc/tm.texi: Regenerated. * dwarf2out.cc (base_type_die, is_base_type, modified_type_die, gen_type_die_with_usage): Handle BITINT_TYPE. (rtl_for_decl_init): Punt on BLKmode BITINT_TYPE INTEGER_CSTs or handle those which fit into shwi. * expr.cc (expand_expr_real_1): Define EXTEND_BITINT macro, reduce to bitfield precision reads from BITINT_TYPE vars, parameters or memory locations. Expand large/huge BITINT_TYPE INTEGER_CSTs into memory. 
* fold-const.cc (fold_convert_loc, make_range_step): Handle BITINT_TYPE. (extract_muldiv_1): For BITINT_TYPE use TYPE_PRECISION rather than GET_MODE_SIZE (SCALAR_INT_TYPE_MODE). (native_encode_int, native_interpret_int, native_interpret_expr): Handle BITINT_TYPE. * gimple-expr.cc (useless_type_conversion_p): Make BITINT_TYPE to some other integral type or vice versa conversions non-useless. * gimple-fold.cc (gimple_fold_builtin_memset): Punt for BITINT_TYPE. (clear_padding_unit): Mention in comment that _BitInt types don't need to fit either. (clear_padding_bitint_needs_padding_p): New function. (clear_padding_type_may_have_padding_p): Handle BITINT_TYPE. (clear_padding_type): Likewise. * internal-fn.cc (expand_mul_overflow): For unsigned non-mode precision operands force pos_neg? to 1. (expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT, expand_BITINTTOFLOAT): New functions. * internal-fn.def (MULBITINT, DIVMODBITINT, FLOATTOBITINT, BITINTTOFLOAT): New internal functions. * internal-fn.h (expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT, expand_BITINTTOFLOAT): Declare. * match.pd (non-equality compare simplifications from fold_binary): Punt if TYPE_MODE (arg1_type) is BLKmode. * pretty-print.h (pp_wide_int): Handle printing of large precision wide_ints which would buffer overflow digit_buffer. * stor-layout.cc (finish_bitfield_representative): For bit-fields with BITINT_TYPE, prefer representatives with precisions in multiple of limb precision. (layout_type): Handle BITINT_TYPE. Handle COMPLEX_TYPE with BLKmode element type and assert it is BITINT_TYPE. * target.def (bitint_type_info): New C target hook. * target.h (struct bitint_info): New type. * targhooks.cc (default_bitint_type_info): New function. * targhooks.h (default_bitint_type_info): Declare. * tree-pretty-print.cc (dump_generic_node): Handle BITINT_TYPE. Handle printing large wide_ints which would buffer overflow digit_buffer. * tree-ssa-sccvn.cc: Include target.h. 
(eliminate_dom_walker::eliminate_stmt): Punt for large/huge BITINT_TYPE. * tree-switch-conversion.cc (jump_table_cluster::emit): For more than 64-bit BITINT_TYPE subtract low bound from expression and cast to 64-bit integer type both the controlling expression and case labels. * typeclass.h (enum type_class): Add bitint_type_class enumerator. * varasm.cc (output_constant): Handle BITINT_TYPE INTEGER_CSTs. * vr-values.cc (check_for_binary_op_overflow): Use widest2_int rather than widest_int. (simplify_using_ranges::simplify_internal_call_using_ranges): Use unsigned_type_for rather than build_nonstandard_integer_type.
2023-08-21PR111048: Set arg_npatterns correctly.Prathamesh Kulkarni1-7/+37
In valid_mask_for_fold_vec_perm_cst_p we always set arg_npatterns to VECTOR_CST_NPATTERNS (arg0), because (q1 & 0) == 0 is always true: /* Ensure that the stepped sequence always selects from the same input pattern. */ unsigned arg_npatterns = ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0) : VECTOR_CST_NPATTERNS (arg1); resulting in wrong code generation. The patch fixes this by changing the condition to (q1 & 1) == 0. gcc/ChangeLog: PR tree-optimization/111048 * fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): Set arg_npatterns correctly. (fold_vec_perm_cst): Remove workaround and again call valid_mask_for_fold_vec_perm_cst_p for both VLS and VLA vectors. (test_fold_vec_perm_cst::test_nunits_min_4): Add test-case.
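The difference between the two conditions is easy to see once the selection is pulled out into a function. In this sketch the parameters `npatterns0` and `npatterns1` stand in for `VECTOR_CST_NPATTERNS (arg0)` and `VECTOR_CST_NPATTERNS (arg1)`; the function names are made up for the example.

```cpp
#include <cassert>

// Buggy form: (q1 & 0) is always 0, so the arg1 count is never chosen.
unsigned pick_buggy (unsigned q1, unsigned npatterns0, unsigned npatterns1)
{
  return ((q1 & 0) == 0) ? npatterns0 : npatterns1;
}

// Fixed form: q1 == 0 picks the arg0 count, q1 == 1 picks the arg1 count.
unsigned pick_fixed (unsigned q1, unsigned npatterns0, unsigned npatterns1)
{
  return ((q1 & 1) == 0) ? npatterns0 : npatterns1;
}
```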
2023-08-18tree-optimization/111048 - avoid flawed logic in fold_vec_permRichard Biener1-6/+6
The following avoids running into somewhat flawed logic in fold_vec_perm for non-VLA vectors. PR tree-optimization/111048 * fold-const.cc (fold_vec_perm_cst): Check for non-VLA vectors first. * gcc.dg/torture/pr111048.c: New testcase.
2023-08-16Extend fold_vec_perm to handle VLA vector_cst.Prathamesh Kulkarni1-21/+778
The patch extends fold_vec_perm to fold VLA vector_csts. For example: arg0 = {...}, npatterns = 1, nelts_per_pattern = 3, len = 4 + 4x arg1 = {...}, npatterns = 1, nelts_per_pattern = 3, len = 4 + 4x sel = { 0, len, ...} npatterns = 2, nelts_per_pattern = 1, len = 4 + 4x res = VEC_PERM_EXPR<arg0, arg1, sel> --> { arg0[0], arg1[0], ... }, npatterns = 2, nelts_per_pattern = 1 Eg 2: arg0 = {...}, npatterns = 1, nelts_per_pattern = 3, len = 2 + 2x arg1 = {...}, npatterns = 1, nelts_per_pattern = 3, len = 2 + 2x sel = {0, 1, 2, ...}, npatterns = 1, nelts_per_pattern = 3, len = 2 + 2x For this case the index 2 in sel is ambiguous for len 2 + 2x: if x = 0, runtime vector length = 2 and sel[i] will choose arg1[0]; if x > 0, runtime vector length > 2 and sel[i] will choose arg0[2]. So we return NULL_TREE for this case. This leads us to defining a constraint that a stepped sequence in sel should only select a particular pattern from a particular input vector. Eg 3: arg0 = {...} npatterns = 1, nelts_per_pattern = 3, len = 4 + 4x arg1 = {...} npatterns = 1, nelts_per_pattern = 3, len = 4 + 4x sel = { len, 0, 2, ... } npatterns = 1, nelts_per_pattern = 3, len = 4 + 4x sel contains a single pattern with stepped sequence: {0, 2, ...}. Let a1 = the first element of the stepped part of the sequence, which is 0. Let esel = number of total elements in the stepped sequence. Thus, esel = len / sel_npatterns = (4 + 4x) / 1 = 4 + 4x Let S = step of the sequence, which is 2 in this case. Let ae = last element of the stepped sequence. Thus, ae = a1 + (esel - 2) * S = 0 + (4 + 4x - 2) * 2 = 4 + 8x To ensure that we select elements from the same input vector, we require a1 /trunc len = ae /trunc len. Let q1 = a1 /trunc len = 0 / (4 + 4x) = 0 Let qe = ae /trunc len = (4 + 8x) / (4 + 4x) = 1 Since q1 != qe, we cross input vectors, and return NULL_TREE for this case. However, if sel was: sel = {len, 0, 1, ...} The only change in this case is S = 1. 
So, ae = a1 + (esel - 2) * S = 0 + (4 + 4x - 2) * 1 = 2 + 4x In this case, a1/len == ae/len == 0, and the stepped sequence chooses all elements from arg0. Thus, res = {arg1[0], arg0[0], arg0[1], ...} For VLA folding, sel has to conform to constraints imposed in valid_mask_for_fold_vec_perm_cst_p. test_fold_vec_perm_cst defines several unit-tests for VLA folding. gcc/ChangeLog: * fold-const.cc (INCLUDE_ALGORITHM): Add Include. (valid_mask_for_fold_vec_perm_cst_p): New function. (fold_vec_perm_cst): Likewise. (fold_vec_perm): Adjust assert and call fold_vec_perm_cst. (test_fold_vec_perm_cst): New namespace. (test_fold_vec_perm_cst::build_vec_cst_rand): New function. (test_fold_vec_perm_cst::validate_res): Likewise. (test_fold_vec_perm_cst::validate_res_vls): Likewise. (test_fold_vec_perm_cst::builder_push_elems): Likewise. (test_fold_vec_perm_cst::test_vnx4si_v4si): Likewise. (test_fold_vec_perm_cst::test_v4si_vnx4si): Likewise. (test_fold_vec_perm_cst::test_all_nunits): Likewise. (test_fold_vec_perm_cst::test_nunits_min_2): Likewise. (test_fold_vec_perm_cst::test_nunits_min_4): Likewise. (test_fold_vec_perm_cst::test_nunits_min_8): Likewise. (test_fold_vec_perm_cst::test_nunits_max_4): Likewise. (test_fold_vec_perm_cst::is_simple_vla_size): Likewise. (test_fold_vec_perm_cst::test): Likewise. (fold_const_cc_tests): Call test_fold_vec_perm_cst::test. Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
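The q1/qe test from Eg 3 can be replayed for a fixed runtime length. The sketch below is a concrete-length simplification with a made-up function name; the real check in `valid_mask_for_fold_vec_perm_cst_p` works on `poly_int` lengths and enforces further constraints that are not modelled here.

```cpp
#include <cassert>

// a1 is the first element of the stepped part, step is S, len is the
// runtime vector length and sel_npatterns the number of patterns in sel.
bool stays_in_one_input (long a1, long step, long len, long sel_npatterns)
{
  long esel = len / sel_npatterns;   // elements in the stepped sequence
  long ae = a1 + (esel - 2) * step;  // last element of the sequence
  long q1 = a1 / len;                // input vector selected first
  long qe = ae / len;                // input vector selected last
  return q1 == qe;                   // equal quotients: no crossing
}
```

For `len = 4` (x = 0) and `len = 8` (x = 1), a step of 2 gives `ae = 4` and `ae = 12`, so the sequence crosses into the second input, while a step of 1 gives `ae = 2` and `ae = 6` and stays within arg0.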
2023-07-18c++: constexpr bit_cast with empty fieldJason Merrill1-1/+2
The change to only cache constexpr calls that are reduced_constant_expression_p tripped on bit-cast3.C, which failed that predicate due to the presence of an empty field in the result of native_interpret_aggregate, which reduced_constant_expression_p rejects to avoid confusing output_constructor. This patch proposes to skip such fields in native_interpret_aggregate, since they aren't actually involved in the value representation. gcc/ChangeLog: * fold-const.cc (native_interpret_aggregate): Skip empty fields. gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_bit_cast): Check that the result of native_interpret_aggregate doesn't need more evaluation.
2023-06-30fold-const+optabs: Change return type of predicate functions from int to boolUros Bizjak1-34/+34
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * fold-const.h (multiple_of_p): Change return type from int to bool. * fold-const.cc (split_tree): Change negl_p, neg_litp_p, neg_conp_p and neg_var_p variables to bool. (const_binop): Change sat_p variable to bool. (merge_ranges): Change no_overlap variable to bool. (extract_muldiv_1): Change same_p variable to bool. (tree_swap_operands_p): Update function body for bool return type. (fold_truth_andor): Change commutative variable to bool. (multiple_of_p): Change return type from int to bool and adjust function body accordingly. * optabs.h (expand_twoval_unop): Change return type from int to bool. (expand_twoval_binop): Ditto. (can_compare_p): Ditto. (have_add2_insn): Ditto. (have_addptr3_insn): Ditto. (have_sub2_insn): Ditto. (have_insn_for): Ditto. * optabs.cc (add_equal_note): Ditto. (widen_operand): Change no_extend argument from int to bool. (expand_binop): Ditto. (expand_twoval_unop): Change return type from int to bool and adjust function body accordingly. (expand_twoval_binop): Ditto. (can_compare_p): Ditto. (have_add2_insn): Ditto. (have_addptr3_insn): Ditto. (have_sub2_insn): Ditto. (have_insn_for): Ditto.
2023-06-23Fix tree_simple_nonnegative_warnv_p for VECTOR_TYPEsRichard Biener1-1/+2
tree_simple_nonnegative_warnv_p ends up being called on VECTOR_TYPEs which I think even gets the wrong answer here for tcc_comparison since vector bools are signed. The following properly guards that with !VECTOR_TYPE_P. * fold-const.cc (tree_simple_nonnegative_warnv_p): Guard the truth_value_p case with !VECTOR_TYPE_P.
2023-06-23Bogus and missed folding on vector comparesRichard Biener1-2/+2
fold_binary tries to transform (double)float1 CMP (double)float2 into float1 CMP float2 but ends up using TYPE_PRECISION on the argument types. For vector types, TYPE_PRECISION compares the number of lanes, which should always be equal, so the check is harmless in that it does not generate wrong code, but the folding is missed. The following instead properly uses element_precision. The same happens in the corresponding match.pd pattern. * fold-const.cc (fold_binary_loc): Use element_precision when trying (double)float1 CMP (double)float2 to float1 CMP float2 simplification. * match.pd: Likewise.
2023-06-16tree-optimization/110269 - restore missed condition foldingRichard Biener1-7/+0
The following makes sure we optimize x != 0 using range info via tree_expr_nonzero_p via match.pd. PR tree-optimization/110269 * fold-const.cc (fold_binary_loc): Merge x != 0 folding with tree_expr_nonzero_p ... * match.pd (cmp (convert? addr@0) integer_zerop): With this pattern. * gcc.dg/tree-ssa/pr110269.c: New testcase.
2023-06-13middle-end/110232 - fix native interpret of vector <signed-boolean:1>Richard Biener1-7/+4
The following fixes native interpretation of a buffer as boolean vector with bit-precision elements such as AVX512 vectors. The check whether the buffer covers the whole vector was broken for bit-precision elements and the following instead implements it based on the vector type size. PR middle-end/110232 * fold-const.cc (native_interpret_vector): Use TYPE_SIZE_UNIT to check whether the buffer covers the whole vector. * gcc.target/i386/pr110232.c: New testcase.
2023-05-30Add a != MIN/MAX_VALUE_CST ? CST-+1 : a to minmax_from_comparisonAndrew Pinski1-0/+26
This patch adds the support for match that was implemented for PR 87913 in phiopt. It implements it by adding support to minmax_from_comparison for the check. It uses the range information if available, which allows producing a MIN/MAX expression when comparing against the lower/upper bound of the range instead of the lower/upper bound of the type. minmax-22.c is the new testcase which tests the ranges part. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * fold-const.cc (minmax_from_comparison): Add support for NE_EXPR. * match.pd ((cond (cmp (convert1? x) c1) (convert2? x) c2) pattern): Add ne as a possible cmp. ((a CMP b) ? minmax<a, c> : minmax<b, c> pattern): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/minmax-22.c: New test.
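The arithmetic identity behind this kind of folding can be checked directly. The sketch below covers only the type-minimum case with `int`; which operand arrangements `minmax_from_comparison` actually recognises is defined by the patch itself, and the function names here are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Conditional form: replace the type minimum by minimum + 1.
int cond_form (int a) { return a != INT_MIN ? a : INT_MIN + 1; }

// Equivalent MAX form: any a above the minimum is already >= minimum + 1.
int max_form (int a) { return std::max (a, INT_MIN + 1); }
```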
2023-05-23Fix handling of non-integral bit-fields in native_encode_initializerEric Botcazou1-10/+17
The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD) have integral types, but that's not the case in Ada where they may have pretty much any type, resulting in a wrong encoding for them. gcc/ * fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the specific treatment for bit-fields only if they have an integral type and filter out non-integral bit-fields that do not start and end on a byte boundary. gcc/testsuite/ * gnat.dg/opt101.adb: New test. * gnat.dg/opt101_pkg.ads: New helper.
2023-05-20Move fold_single_bit_test to expr.cc from fold-const.ccAndrew Pinski1-112/+0
This is part 1 of an N-patch set that will change the expansion of `(A & C) != 0` from using trees to directly expanding so later on we can do some cost analysis. Since the only user of fold_single_bit_test is now expand, move it to there. gcc/ChangeLog: * fold-const.cc (fold_single_bit_test_into_sign_test): Move to expr.cc. (fold_single_bit_test): Likewise. * expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc. (fold_single_bit_test): Likewise and make static. * fold-const.h (fold_single_bit_test): Remove declaration.
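For readers unfamiliar with the single-bit test that this series expands, the equivalences involved can be sketched as below. This only illustrates the arithmetic, with `unsigned` and `int` operands and made-up function names. It is not the code that moved to expr.cc, and `bit` is assumed to be smaller than the width of `unsigned`.

```cpp
#include <cassert>

// (a & c) != 0 for a single-bit mask c = 1u << bit ...
bool test_with_mask (unsigned a, unsigned bit)
{
  return (a & (1u << bit)) != 0;
}

// ... is equivalent to shifting the bit down and masking with 1.
bool test_with_shift (unsigned a, unsigned bit)
{
  return ((a >> bit) & 1u) != 0;
}

// When the single bit is the sign bit, the test is a sign test.
bool sign_test_with_mask (int a)
{
  const unsigned sign_bit = ~(~0u >> 1);   // only the top bit set
  return (static_cast<unsigned> (a) & sign_bit) != 0;
}

bool sign_test_direct (int a) { return a < 0; }
```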
2023-05-18gcc: use _P() defines from tree.hBernhard Reutner-Fischer1-24/+22
gcc/ChangeLog: * alias.cc (ref_all_alias_ptr_type_p): Use _P() defines from tree.h. * attribs.cc (diag_attr_exclusions): Ditto. (decl_attributes): Ditto. (build_type_attribute_qual_variant): Ditto. * builtins.cc (fold_builtin_carg): Ditto. (fold_builtin_next_arg): Ditto. (do_mpc_arg2): Ditto. * cfgexpand.cc (expand_return): Ditto. * cgraph.h (decl_in_symtab_p): Ditto. (symtab_node::get_create): Ditto. * dwarf2out.cc (base_type_die): Ditto. (implicit_ptr_descriptor): Ditto. (gen_array_type_die): Ditto. (gen_type_die_with_usage): Ditto. (optimize_location_into_implicit_ptr): Ditto. * expr.cc (do_store_flag): Ditto. * fold-const.cc (negate_expr_p): Ditto. (fold_negate_expr_1): Ditto. (fold_convert_const): Ditto. (fold_convert_loc): Ditto. (constant_boolean_node): Ditto. (fold_binary_op_with_conditional_arg): Ditto. (build_fold_addr_expr_with_type_loc): Ditto. (fold_comparison): Ditto. (fold_checksum_tree): Ditto. (tree_unary_nonnegative_warnv_p): Ditto. (integer_valued_real_unary_p): Ditto. (fold_read_from_constant_string): Ditto. * gcc-rich-location.cc (maybe_range_label_for_tree_type_mismatch::get_text): Ditto. * gimple-expr.cc (useless_type_conversion_p): Ditto. (is_gimple_reg): Ditto. (is_gimple_asm_val): Ditto. (mark_addressable): Ditto. * gimple-expr.h (is_gimple_variable): Ditto. (virtual_operand_p): Ditto. * gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores): Ditto. * gimplify.cc (gimplify_bind_expr): Ditto. (gimplify_return_expr): Ditto. (gimple_add_padding_init_for_auto_var): Ditto. (gimplify_addr_expr): Ditto. (omp_add_variable): Ditto. (omp_notice_variable): Ditto. (omp_get_base_pointer): Ditto. (omp_strip_components_and_deref): Ditto. (omp_strip_indirections): Ditto. (omp_accumulate_sibling_list): Ditto. (omp_build_struct_sibling_lists): Ditto. (gimplify_adjust_omp_clauses_1): Ditto. (gimplify_adjust_omp_clauses): Ditto. (gimplify_omp_for): Ditto. (goa_lhs_expr_p): Ditto. (gimplify_one_sizepos): Ditto. 
* graphite-scop-detection.cc (scop_detection::graphite_can_represent_scev): Ditto. * ipa-devirt.cc (odr_types_equivalent_p): Ditto. * ipa-prop.cc (ipa_set_jf_constant): Ditto. (propagate_controlled_uses): Ditto. * ipa-sra.cc (type_prevails_p): Ditto. (scan_expr_access): Ditto. * optabs-tree.cc (optab_for_tree_code): Ditto. * toplev.cc (wrapup_global_declaration_1): Ditto. * trans-mem.cc (transaction_invariant_address_p): Ditto. * tree-cfg.cc (verify_types_in_gimple_reference): Ditto. (verify_gimple_comparison): Ditto. (verify_gimple_assign_binary): Ditto. (verify_gimple_assign_single): Ditto. * tree-complex.cc (get_component_ssa_name): Ditto. * tree-emutls.cc (lower_emutls_2): Ditto. * tree-inline.cc (copy_tree_body_r): Ditto. (estimate_move_cost): Ditto. (copy_decl_for_dup_finish): Ditto. * tree-nested.cc (convert_nonlocal_omp_clauses): Ditto. (note_nonlocal_vla_type): Ditto. (convert_local_omp_clauses): Ditto. (remap_vla_decls): Ditto. (fixup_vla_decls): Ditto. * tree-parloops.cc (loop_has_vector_phi_nodes): Ditto. * tree-pretty-print.cc (print_declaration): Ditto. (print_call_name): Ditto. * tree-sra.cc (compare_access_positions): Ditto. * tree-ssa-alias.cc (compare_type_sizes): Ditto. * tree-ssa-ccp.cc (get_default_value): Ditto. * tree-ssa-coalesce.cc (populate_coalesce_list_for_outofssa): Ditto. * tree-ssa-dom.cc (reduce_vector_comparison_to_scalar_comparison): Ditto. * tree-ssa-forwprop.cc (can_propagate_from): Ditto. * tree-ssa-propagate.cc (may_propagate_copy): Ditto. * tree-ssa-sccvn.cc (fully_constant_vn_reference_p): Ditto. * tree-ssa-sink.cc (statement_sink_location): Ditto. * tree-ssa-structalias.cc (type_must_have_pointers): Ditto. * tree-ssa-ter.cc (find_replaceable_in_bb): Ditto. * tree-ssa-uninit.cc (warn_uninit): Ditto. * tree-ssa.cc (maybe_rewrite_mem_ref_base): Ditto. (non_rewritable_mem_ref_base): Ditto. * tree-streamer-in.cc (lto_input_ts_type_non_common_tree_pointers): Ditto. 
* tree-streamer-out.cc (write_ts_type_non_common_tree_pointers): Ditto. * tree-vect-generic.cc (do_binop): Ditto. (do_cond): Ditto. * tree-vect-stmts.cc (vect_init_vector): Ditto. * tree-vector-builder.h (tree_vector_builder::note_representative): Ditto. * tree.cc (sign_mask_for): Ditto. (verify_type_variant): Ditto. (gimple_canonical_types_compatible_p): Ditto. (verify_type): Ditto. * ubsan.cc (get_ubsan_type_info_for_type): Ditto. * var-tracking.cc (prepare_call_arguments): Ditto. (vt_add_function_parameters): Ditto. * varasm.cc (decode_addr_const): Ditto.
2023-05-01Conversion to irange wide_int API.Aldy Hernandez1-2/+1
This converts the irange API to use wide_ints exclusively, along with its users. This patch will slow down VRP, as there will be more useless wide_int to tree conversions. However, this slowdown is only temporary, as a follow-up patch will convert the internal representation of iranges to wide_ints for a net overall gain in performance. gcc/ChangeLog: * fold-const.cc (expr_not_equal_to): Convert to irange wide_int API. * gimple-fold.cc (size_must_be_zero_p): Same. * gimple-loop-versioning.cc (loop_versioning::prune_loop_conditions): Same. * gimple-range-edge.cc (gcond_edge_range): Same. (gimple_outgoing_range::calc_switch_ranges): Same. * gimple-range-fold.cc (adjust_imagpart_expr): Same. (adjust_realpart_expr): Same. (fold_using_range::range_of_address): Same. (fold_using_range::relation_fold_and_or): Same. * gimple-range-gori.cc (gori_compute::gori_compute): Same. (range_is_either_true_or_false): Same. * gimple-range-op.cc (cfn_toupper_tolower::get_letter_range): Same. (cfn_clz::fold_range): Same. (cfn_ctz::fold_range): Same. * gimple-range-tests.cc (class test_expr_eval): Same. * gimple-ssa-warn-alloca.cc (alloca_call_type): Same. * ipa-cp.cc (ipa_value_range_from_jfunc): Same. (propagate_vr_across_jump_function): Same. (decide_whether_version_node): Same. * ipa-prop.cc (ipa_get_value_range): Same. * ipa-prop.h (ipa_range_set_and_normalize): Same. * range-op.cc (get_shift_range): Same. (value_range_from_overflowed_bounds): Same. (value_range_with_overflow): Same. (create_possibly_reversed_range): Same. (equal_op1_op2_relation): Same. (not_equal_op1_op2_relation): Same. (lt_op1_op2_relation): Same. (le_op1_op2_relation): Same. (gt_op1_op2_relation): Same. (ge_op1_op2_relation): Same. (operator_mult::op1_range): Same. (operator_exact_divide::op1_range): Same. (operator_lshift::op1_range): Same. (operator_rshift::op1_range): Same. (operator_cast::op1_range): Same. (operator_logical_and::fold_range): Same. (set_nonzero_range_from_mask): Same. 
(operator_bitwise_or::op1_range): Same. (operator_bitwise_xor::op1_range): Same. (operator_addr_expr::fold_range): Same. (pointer_plus_operator::wi_fold): Same. (pointer_or_operator::op1_range): Same. (INT): Same. (UINT): Same. (INT16): Same. (UINT16): Same. (SCHAR): Same. (UCHAR): Same. (range_op_cast_tests): Same. (range_op_lshift_tests): Same. (range_op_rshift_tests): Same. (range_op_bitwise_and_tests): Same. (range_relational_tests): Same. * range.cc (range_zero): Same. (range_nonzero): Same. * range.h (range_true): Same. (range_false): Same. (range_true_and_false): Same. * tree-data-ref.cc (split_constant_offset_1): Same. * tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Same. * tree-ssa-loop-unswitch.cc (struct unswitch_predicate): Same. (find_unswitching_predicates_for_bb): Same. * tree-ssa-phiopt.cc (value_replacement): Same. * tree-ssa-threadbackward.cc (back_threader::find_taken_edge_cond): Same. * tree-ssanames.cc (ssa_name_has_boolean_range): Same. * tree-vrp.cc (find_case_label_range): Same. * value-query.cc (range_query::get_tree_range): Same. * value-range.cc (irange::set_nonnegative): Same. (frange::contains_p): Same. (frange::singleton_p): Same. (frange::internal_singleton_p): Same. (irange::irange_set): Same. (irange::irange_set_1bit_anti_range): Same. (irange::irange_set_anti_range): Same. (irange::set): Same. (irange::operator==): Same. (irange::singleton_p): Same. (irange::contains_p): Same. (irange::set_range_from_nonzero_bits): Same. (DEFINE_INT_RANGE_INSTANCE): Same. (INT): Same. (UINT): Same. (SCHAR): Same. (UINT128): Same. (UCHAR): Same. (range): New. (tree_range): New. (range_int): New. (range_uint): New. (range_uint128): New. (range_uchar): New. (range_char): New. (build_range3): Convert to irange wide_int API. (range_tests_irange3): Same. (range_tests_int_range_max): Same. (range_tests_strict_enum): Same. (range_tests_misc): Same. (range_tests_nonzero_bits): Same. (range_tests_nan): Same. (range_tests_signed_zeros): Same. 
* value-range.h (Value_Range::Value_Range): Same. (irange::set): Same. (irange::nonzero_p): Same. (irange::contains_p): Same. (range_includes_zero_p): Same. (irange::set_nonzero): Same. (irange::set_zero): Same. (contains_zero_p): Same. (frange::contains_p): Same. * vr-values.cc (simplify_using_ranges::op_with_boolean_value_range_p): Same. (bounds_of_var_in_loop): Same. (simplify_using_ranges::legacy_fold_cond_overflow): Same.
2023-04-28MATCH: Factor out code that for min max detection with constantsAndrew Pinski1-0/+44
This factors out some of the code from the min/max detection from match.pd into a function so it can be reused in other places. This is mainly used to detect the conversions of >= to > which causes the integer values to be changed by one. Changes since v1: * factor out the checks for INTEGER_CSTs so it is more obvious. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * match.pd: Factor out the deciding the min/max from the "(cond (cmp (convert1? x) c1) (convert2? x) c2)" pattern to ... * fold-const.cc (minmax_from_comparison): this new function. * fold-const.h (minmax_from_comparison): New prototype.
2023-04-04sanitizer: missing signed integer overflow errors [PR109107]Marek Polacek1-1/+2
Here we're failing to detect a signed overflow with -O because match.pd, since r8-1516, transforms c = (a + 1) - (int) (short int) b; into c = (int) ((unsigned int) a + 4294946117); wrongly eliding the overflow. This kind of problem is usually avoided by using TYPE_OVERFLOW_SANITIZED in the appropriate place. The first match.pd hunk in the patch fixes it. I've constructed a testcase for each of the surrounding cases as well. Then I noticed that fold_binary_loc/associate has the same problem, so I've added a TYPE_OVERFLOW_SANITIZED there as well (it may be too coarse, sorry). Then I found yet another problem, but instead of fixing it now I've opened 109134. I could probably go on and find a dozen more. PR sanitizer/109107 gcc/ChangeLog: * fold-const.cc (fold_binary_loc): Use TYPE_OVERFLOW_SANITIZED when associating. * match.pd: Use TYPE_OVERFLOW_SANITIZED. gcc/testsuite/ChangeLog: * c-c++-common/ubsan/pr109107-1.c: New test. * c-c++-common/ubsan/pr109107-2.c: New test. * c-c++-common/ubsan/pr109107-3.c: New test. * c-c++-common/ubsan/pr109107-4.c: New test.
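Why the reassociated form hides the overflow can be seen with small numbers. In this sketch `k` is a hypothetical stand-in for the converted `(int) (short int) b` operand, the function names are made up, and `long long` is assumed to be wider than `int` so that the intermediate steps can be inspected without overflowing.

```cpp
#include <cassert>
#include <climits>

// Does (a + 1) - k overflow int when evaluated step by step as written?
bool overflows_as_written (int a, int k)
{
  long long step1 = static_cast<long long> (a) + 1;
  if (step1 > INT_MAX)
    return true;                       // a + 1 already overflows
  long long step2 = step1 - k;
  return step2 > INT_MAX || step2 < INT_MIN;
}

// Reassociated form: a + (1 - k) in unsigned arithmetic, which wraps
// silently and so never reports the intermediate overflow.
unsigned reassociated (int a, int k)
{
  return static_cast<unsigned> (a) + (1u - static_cast<unsigned> (k));
}
```

For `a = INT_MAX` and `k = 2` the expression as written overflows at `a + 1`, yet the reassociated form yields `INT_MAX - 1` with no trace of it, which is the diagnostic a sanitized build would lose.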
2023-03-23c: [PR84900] cast of compound literal does not cause the code to become a non-lvalueAndrew Pinski1-0/+1
The problem here is that after r0-92187-g2ec5deb5c3146c, maybe_lvalue_p would return false for compound literals, which causes non_lvalue_loc not to wrap the expression with a NON_LVALUE_EXPR, unlike before, when it returned true because it returns true for all language-specific tree codes. This fixes that oversight and fixes the testcase to have the cast as a non-lvalue. Committed to the trunk as obvious after a bootstrap/test on x86_64-linux-gnu. PR c/84900 gcc/ChangeLog: * fold-const.cc (maybe_lvalue_p): Treat COMPOUND_LITERAL_EXPR as a lvalue. gcc/testsuite/ChangeLog: * gcc.dg/compound-literal-cast-lvalue-1.c: New test.
2023-03-09middle-end/108995 - avoid folding when sanitizing overflowRichard Biener1-4/+3
The following plugs one place in extract_muldiv where it should avoid folding when sanitizing overflow. PR middle-end/108995 * fold-const.cc (extract_muldiv_1): Avoid folding (CST * b) / CST2 when sanitizing overflow and we rely on overflow being undefined. * gcc.dg/ubsan/pr108995.c: New testcase.
2023-03-02fold-const: Ignore padding bits in native_interpret_expr REAL_CST reverse verification [PR108934]Jakub Jelinek1-2/+4
In the following testcase we try to std::bit_cast a (pair of) integral value(s) which has some non-zero bits in the place of x86 long double (for 64-bit 16 byte type with 10 bytes actually loaded/stored by hw, for 32-bit 12 byte) and starting with my PR104522 change we reject that as native_interpret_expr fails on it. The PR104522 change extends what has been done before for MODE_COMPOSITE_P (but those don't have any padding bits) to all floating point types, because e.g. the exact x86 long double has various bit combinations we don't support, like pseudo-(denormals,infinities,NaNs) or unnormals. The HW handles some of those as exceptional cases and others similarly to the non-pseudo ones. But for the padding bits it actually doesn't load/store those bits at all, it loads/stores 10 bytes. So, I think we should exempt the padding bits from the reverse comparison (the native_encode_expr bits for the padding will be all zeros), which the following patch does. For bit_cast it is similar to e.g. ignoring padding bits if the destination is a structure which has padding bits in there. The change changed auto-init-4.c to how it has been behaving before the PR105259 change, where some more VCEs can be now done. 2023-03-02 Jakub Jelinek <jakub@redhat.com> PR c++/108934 * fold-const.cc (native_interpret_expr) <case REAL_CST>: Before memcmp comparison copy the bytes from ptr to a temporary buffer and clearing padding bits in there. * gcc.target/i386/auto-init-4.c: Revert PR105259 change. * g++.target/i386/pr108934.C: New test.
2023-01-04ubsan: Avoid narrowing of multiply for -fsanitize=signed-integer-overflow [PR108256]Jakub Jelinek1-1/+3
We shouldn't narrow multiplications originally done in signed types, because the original multiplication might overflow but the narrowed one will be done in unsigned arithmetic and will never overflow. 2023-01-04 Jakub Jelinek <jakub@redhat.com> PR sanitizer/108256 * convert.cc (do_narrow): Punt for MULT_EXPR if original type doesn't wrap around and -fsanitize=signed-integer-overflow is on. * fold-const.cc (fold_unary_loc) <CASE_CONVERT>: Likewise. * c-c++-common/ubsan/pr108256.c: New test.
2023-01-02Update copyright years.Jakub Jelinek1-1/+1
2022-12-20fold-const: Treat fp conversion to a type with same mode as copyKewen Lin1-0/+9
In function fold_convert_const_real_from_real, when the modes of two types involved in fp conversion are the same, we can simply take it as copy, rebuild with the exactly same TREE_REAL_CST and the target type. It is more efficient and helps to avoid possible unexpected signalling bit clearing in [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608533.html gcc/ChangeLog: * fold-const.cc (fold_convert_const_real_from_real): Treat floating point conversion to a type with same mode as copy instead of normal convertFormat.
2022-12-20fold: fix use of protected_set_expr_location_unshareJason Merrill1-1/+1
Unlike protected_set_expr_location, this variant can return a different tree. gcc/ChangeLog: * fold-const.cc (fold_convert_loc): Check return value of protected_set_expr_location_unshare.
2022-12-20c++: source position of lambda captures [PR84471]Jason Merrill1-1/+1
If the DECL_VALUE_EXPR of a VAR_DECL has EXPR_LOCATION set, then any use of that variable looks like it has that location, which leads to the debugger jumping back and forth for both lambdas and structured bindings. Rather than fix all the uses, it seems simplest to remove any EXPR_LOCATION when setting DECL_VALUE_EXPR. So the cp/ hunks aren't necessary, but they avoid the need to unshare to remove the location. PR c++/84471 PR c++/107504 gcc/cp/ChangeLog: * coroutines.cc (transform_local_var_uses): Don't specify a location for DECL_VALUE_EXPR. * decl.cc (cp_finish_decomp): Likewise. gcc/ChangeLog: * fold-const.cc (protected_set_expr_location_unshare): Not static. * tree.h: Declare it. * tree.cc (decl_value_expr_insert): Use it. include/ChangeLog: * ansidecl.h (ATTRIBUTE_WARN_UNUSED_RESULT): Add __. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/value-expr1.C: New test. * g++.dg/tree-ssa/value-expr2.C: New test. * g++.dg/analyzer/pr93212.C: Move warning.
2022-12-12Revert parts of ADDR_EXPR/CONSTRUCTOR treatment change in match.pdRichard Biener1-0/+9
This reverts the part that substitutes from the definition of an SSA name to the capture, thus ADDR_EXPR@0 eventually yielding &y_1->a[i_2] instead of _3. That's because I didn't think of how to deal with substituting @0 in the result pattern. So the following re-instantiates the SSA def CONSTRUCTOR handling and in the ADDR_EXPR helpers used by match.pd handles SSA names defined to ADDR_EXPRs transparently. * genmatch.cc (dt_simplify::gen): Revert last change. * match.pd: Revert simplification of CONSTRUCTOR leaf handling. (&x cmp SSA_NAME): Handle ADDR_EXPR in SSA defs. * fold-const.cc (split_address_to_core_and_offset): Handle ADDR_EXPRs in SSA defs. (address_compare): Likewise.
2022-12-02Fix a few incorrect accesses.Andrew MacLeod1-3/+3
This consists of 3 changes which stronger type checking has indicated are incorrect. gcc/ * fold-const.cc (fold_unary_loc): Check TREE_TYPE of node. (tree_invalid_nonnegative_warnv_p): Likewise. gcc/c-family/ * c-attribs.cc (handle_deprecated_attribute): Use type when using TYPE_NAME.
2022-11-24Remove ASSERT_EXPR.Aldy Hernandez1-6/+0
This removes all uses of ASSERT_EXPR except the internal one in ipa-*. gcc/ChangeLog: * doc/gimple.texi: Remove ASSERT_EXPR references. * fold-const.cc (tree_expr_nonzero_warnv_p): Same. (fold_binary_loc): Same. (tree_expr_nonnegative_warnv_p): Same. * gimple-array-bounds.cc (get_base_decl): Same. * gimple-pretty-print.cc (dump_unary_rhs): Same. * gimple.cc (get_gimple_rhs_num_ops): Same. * pointer-query.cc (handle_ssa_name): Same. * tree-cfg.cc (verify_gimple_assign_single): Same. * tree-pretty-print.cc (dump_generic_node): Same. * tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr): Same. (interpret_rhs_expr): Same. * tree-ssa-operands.cc (operands_scanner::get_expr_operands): Same. * tree-ssa-propagate.cc (substitute_and_fold_dom_walker::before_dom_children): Same. * tree-ssa-threadedge.cc: Same. * tree-vrp.cc (overflow_comparison_p): Same. * tree.def (ASSERT_EXPR): Add note. * tree.h (ASSERT_EXPR_VAR): Remove. (ASSERT_EXPR_COND): Remove. * vr-values.cc (simplify_using_ranges::vrp_visit_cond_stmt): Remove comment.
2022-11-04Fix recent thinko in operand_equal_pEric Botcazou1-14/+4
There is a thinko in a recent improvement made to operand_equal_p where the code just looks at operand 2 of COMPONENT_REF, if it is present, to compare addresses. That's wrong because operand 2 contains the number of DECL_OFFSET_ALIGN-bit-sized words so, when DECL_OFFSET_ALIGN > 8, not all the bytes are included and some of them are in DECL_FIELD_BIT_OFFSET, see get_inner_reference for the model computation. In other words, you would need to compare operand 2 and DECL_OFFSET_ALIGN and DECL_FIELD_BIT_OFFSET in this situation, but I'm not sure this is worth the hassle in practice so the fix just removes this alternate handling. gcc/ * fold-const.cc (operand_compare::operand_equal_p) <COMPONENT_REF>: Do not take into account operand 2. (operand_compare::hash_operand) <COMPONENT_REF>: Likewise. gcc/testsuite/ * gnat.dg/opt99.adb: New test. * gnat.dg/opt99_pkg1.ads, gnat.dg/opt99_pkg1.adb: New helper. * gnat.dg/opt99_pkg2.ads: Likewise.
2022-10-31builtins: Add various complex builtins for _Float{16,32,64,128,32x,64x,128x}Jakub Jelinek1-0/+37
The following patch adds some complex builtins which have libm implementation in glibc 2.26 and later on various arches. It is needed for libstdc++ _Float128 support when long double is not IEEE quad. 2022-10-31 Jakub Jelinek <jakub@redhat.com> * builtin-types.def (BT_COMPLEX_FLOAT16, BT_COMPLEX_FLOAT32, BT_COMPLEX_FLOAT64, BT_COMPLEX_FLOAT128, BT_COMPLEX_FLOAT32X, BT_COMPLEX_FLOAT64X, BT_COMPLEX_FLOAT128X, BT_FN_COMPLEX_FLOAT16_COMPLEX_FLOAT16, BT_FN_COMPLEX_FLOAT32_COMPLEX_FLOAT32, BT_FN_COMPLEX_FLOAT64_COMPLEX_FLOAT64, BT_FN_COMPLEX_FLOAT128_COMPLEX_FLOAT128, BT_FN_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X, BT_FN_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X, BT_FN_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X, BT_FN_FLOAT16_COMPLEX_FLOAT16, BT_FN_FLOAT32_COMPLEX_FLOAT32, BT_FN_FLOAT64_COMPLEX_FLOAT64, BT_FN_FLOAT128_COMPLEX_FLOAT128, BT_FN_FLOAT32X_COMPLEX_FLOAT32X, BT_FN_FLOAT64X_COMPLEX_FLOAT64X, BT_FN_FLOAT128X_COMPLEX_FLOAT128X, BT_FN_COMPLEX_FLOAT16_COMPLEX_FLOAT16_COMPLEX_FLOAT16, BT_FN_COMPLEX_FLOAT32_COMPLEX_FLOAT32_COMPLEX_FLOAT32, BT_FN_COMPLEX_FLOAT64_COMPLEX_FLOAT64_COMPLEX_FLOAT64, BT_FN_COMPLEX_FLOAT128_COMPLEX_FLOAT128_COMPLEX_FLOAT128, BT_FN_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X, BT_FN_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X, BT_FN_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X): New. * builtins.def (CABS_TYPE, CACOSH_TYPE, CARG_TYPE, CASINH_TYPE, CPOW_TYPE, CPROJ_TYPE): Define and undefine later. (BUILT_IN_CABS, BUILT_IN_CACOSH, BUILT_IN_CACOS, BUILT_IN_CARG, BUILT_IN_CASINH, BUILT_IN_CASIN, BUILT_IN_CATANH, BUILT_IN_CATAN, BUILT_IN_CCOSH, BUILT_IN_CCOS, BUILT_IN_CEXP, BUILT_IN_CLOG, BUILT_IN_CPOW, BUILT_IN_CPROJ, BUILT_IN_CSINH, BUILT_IN_CSIN, BUILT_IN_CSQRT, BUILT_IN_CTANH, BUILT_IN_CTAN): Add DEF_EXT_LIB_FLOATN_NX_BUILTINS. * fold-const-call.cc (fold_const_call_sc, fold_const_call_cc, fold_const_call_ccc): Add various CASE_CFN_*_FN: cases when CASE_CFN_* is present. * gimple-ssa-backprop.cc (backprop::process_builtin_call_use): Likewise. 
* builtins.cc (expand_builtin, fold_builtin_1): Likewise. * fold-const.cc (negate_mathfn_p, tree_expr_finite_p, tree_expr_maybe_signaling_nan_p, tree_expr_maybe_nan_p, tree_expr_maybe_real_minus_zero_p, tree_call_nonnegative_warnv_p): Likewise.
2022-10-31builtins: Add various __builtin_*f{16,32,64,128,32x,64x,128x} builtinsJakub Jelinek1-0/+27
When working on libstdc++ extended float support in <cmath>, I found that we need various builtins for the _Float{16,32,64,128,32x,64x,128x} types. Glibc 2.26 and later provides the underlying libm routines (except for _Float16 and _Float128x for the time being) and in libstdc++ I think we need at least the _Float128 builtins on x86_64, i?86, powerpc64le and ia64 (when long double is IEEE quad, we can handle it by using __builtin_*l instead), because without the builtins the overloads couldn't be constexpr (say when it would declare the *f128 extern "C" routines itself and call them). The testcase covers just types of those builtins and their constant folding, so doesn't need actual libm support. 2022-10-31 Jakub Jelinek <jakub@redhat.com> * builtin-types.def (BT_FLOAT16_PTR, BT_FLOAT32_PTR, BT_FLOAT64_PTR, BT_FLOAT128_PTR, BT_FLOAT32X_PTR, BT_FLOAT64X_PTR, BT_FLOAT128X_PTR): New DEF_PRIMITIVE_TYPE. (BT_FN_INT_FLOAT16, BT_FN_INT_FLOAT32, BT_FN_INT_FLOAT64, BT_FN_INT_FLOAT128, BT_FN_INT_FLOAT32X, BT_FN_INT_FLOAT64X, BT_FN_INT_FLOAT128X, BT_FN_LONG_FLOAT16, BT_FN_LONG_FLOAT32, BT_FN_LONG_FLOAT64, BT_FN_LONG_FLOAT128, BT_FN_LONG_FLOAT32X, BT_FN_LONG_FLOAT64X, BT_FN_LONG_FLOAT128X, BT_FN_LONGLONG_FLOAT16, BT_FN_LONGLONG_FLOAT32, BT_FN_LONGLONG_FLOAT64, BT_FN_LONGLONG_FLOAT128, BT_FN_LONGLONG_FLOAT32X, BT_FN_LONGLONG_FLOAT64X, BT_FN_LONGLONG_FLOAT128X): New DEF_FUNCTION_TYPE_1. 
(BT_FN_FLOAT16_FLOAT16_FLOAT16PTR, BT_FN_FLOAT32_FLOAT32_FLOAT32PTR, BT_FN_FLOAT64_FLOAT64_FLOAT64PTR, BT_FN_FLOAT128_FLOAT128_FLOAT128PTR, BT_FN_FLOAT32X_FLOAT32X_FLOAT32XPTR, BT_FN_FLOAT64X_FLOAT64X_FLOAT64XPTR, BT_FN_FLOAT128X_FLOAT128X_FLOAT128XPTR, BT_FN_FLOAT16_FLOAT16_INT, BT_FN_FLOAT32_FLOAT32_INT, BT_FN_FLOAT64_FLOAT64_INT, BT_FN_FLOAT128_FLOAT128_INT, BT_FN_FLOAT32X_FLOAT32X_INT, BT_FN_FLOAT64X_FLOAT64X_INT, BT_FN_FLOAT128X_FLOAT128X_INT, BT_FN_FLOAT16_FLOAT16_INTPTR, BT_FN_FLOAT32_FLOAT32_INTPTR, BT_FN_FLOAT64_FLOAT64_INTPTR, BT_FN_FLOAT128_FLOAT128_INTPTR, BT_FN_FLOAT32X_FLOAT32X_INTPTR, BT_FN_FLOAT64X_FLOAT64X_INTPTR, BT_FN_FLOAT128X_FLOAT128X_INTPTR, BT_FN_FLOAT16_FLOAT16_LONG, BT_FN_FLOAT32_FLOAT32_LONG, BT_FN_FLOAT64_FLOAT64_LONG, BT_FN_FLOAT128_FLOAT128_LONG, BT_FN_FLOAT32X_FLOAT32X_LONG, BT_FN_FLOAT64X_FLOAT64X_LONG, BT_FN_FLOAT128X_FLOAT128X_LONG): New DEF_FUNCTION_TYPE_2. (BT_FN_FLOAT16_FLOAT16_FLOAT16_INTPTR, BT_FN_FLOAT32_FLOAT32_FLOAT32_INTPTR, BT_FN_FLOAT64_FLOAT64_FLOAT64_INTPTR, BT_FN_FLOAT128_FLOAT128_FLOAT128_INTPTR, BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_INTPTR, BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_INTPTR, BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_INTPTR): New DEF_FUNCTION_TYPE_3. * builtins.def (ACOSH_TYPE, ATAN2_TYPE, ATANH_TYPE, COSH_TYPE, FDIM_TYPE, HUGE_VAL_TYPE, HYPOT_TYPE, ILOGB_TYPE, LDEXP_TYPE, LGAMMA_TYPE, LLRINT_TYPE, LOG10_TYPE, LRINT_TYPE, MODF_TYPE, NEXTAFTER_TYPE, REMQUO_TYPE, SCALBLN_TYPE, SCALBN_TYPE, SINH_TYPE): Define and undefine later. (FMIN_TYPE, SQRT_TYPE): Undefine at a later line. (INF_TYPE): Define at a later line. 
(BUILT_IN_ACOSH, BUILT_IN_ACOS, BUILT_IN_ASINH, BUILT_IN_ASIN, BUILT_IN_ATAN2, BUILT_IN_ATANH, BUILT_IN_ATAN, BUILT_IN_CBRT, BUILT_IN_COSH, BUILT_IN_COS, BUILT_IN_ERFC, BUILT_IN_ERF, BUILT_IN_EXP2, BUILT_IN_EXP, BUILT_IN_EXPM1, BUILT_IN_FDIM, BUILT_IN_FMOD, BUILT_IN_FREXP, BUILT_IN_HYPOT, BUILT_IN_ILOGB, BUILT_IN_LDEXP, BUILT_IN_LGAMMA, BUILT_IN_LLRINT, BUILT_IN_LLROUND, BUILT_IN_LOG10, BUILT_IN_LOG1P, BUILT_IN_LOG2, BUILT_IN_LOGB, BUILT_IN_LOG, BUILT_IN_LRINT, BUILT_IN_LROUND, BUILT_IN_MODF, BUILT_IN_NEXTAFTER, BUILT_IN_POW, BUILT_IN_REMAINDER, BUILT_IN_REMQUO, BUILT_IN_SCALBLN, BUILT_IN_SCALBN, BUILT_IN_SINH, BUILT_IN_SIN, BUILT_IN_TANH, BUILT_IN_TAN, BUILT_IN_TGAMMA): Add DEF_EXT_LIB_FLOATN_NX_BUILTINS. (BUILT_IN_HUGE_VAL): Use HUGE_VAL_TYPE instead of INF_TYPE in DEF_GCC_FLOATN_NX_BUILTINS. * fold-const-call.cc (fold_const_call_ss): Add various CASE_CFN_*_FN: cases when CASE_CFN_* is present. (fold_const_call_sss): Likewise. * builtins.cc (mathfn_built_in_2): Use CASE_MATHFN_FLOATN instead of CASE_MATHFN for various builtins in SEQ_OF_CASE_MATHFN macro. (builtin_with_linkage_p): Add CASE_FLT_FN_FLOATN_NX for various builtins next to CASE_FLT_FN. * fold-const.cc (tree_call_nonnegative_warnv_p): Add CASE_CFN_*_FN: next to CASE_CFN_*: for various builtins. * tree-call-cdce.cc (can_test_argument_range): Add CASE_FLT_FN_FLOATN_NX next to CASE_FLT_FN for various builtins. (edom_only_function): Likewise. * gcc.dg/torture/floatn-builtin.h: Add tests for newly added builtins.
2022-10-06c++, c: Implement C++23 P1774R8 - Portable assumptions [PR106654]Jakub Jelinek1-15/+13
The following patch implements C++23 P1774R8 - Portable assumptions paper, by introducing support for [[assume (cond)]]; attribute for C++. In addition to that the patch adds [[gnu::assume (cond)]]; and __attribute__((assume (cond))); support to both C and C++. As described in C++23, the attribute argument is conditional-expression rather than the usual assignment-expression for attribute arguments, the condition is contextually converted to bool (for C truthvalue conversion is done on it) and is never evaluated at runtime. For C++ constant expression evaluation, I only check the simplest conditions for undefined behavior, because otherwise I'd need to undo changes to *ctx->global which happened during the evaluation (but I believe the spec allows that and we can further improve later). The patch uses a new internal function, .ASSUME, to hold the condition in the FEs. At gimplification time, if the condition is simple/without side-effects, it is gimplified as if (cond) ; else __builtin_unreachable (); and otherwise for now dropped on the floor. The intent is to incrementally outline the conditions into separate artificial functions and use .ASSUME further to tell the ranger and perhaps other optimization passes about the assumptions, as detailed in the PR. When implementing it, I found that assume entry hasn't been added to https://eel.is/c++draft/cpp.cond#6 Jonathan said he'll file a NB comment about it, this patch assumes it has been added into the table as 202207L when the paper has been voted in. With the attributes for both C/C++, I'd say we don't need to add __builtin_assume with similar purpose, especially when __builtin_assume in LLVM is just weird. 
It is strange for side-effects in function call's argument not to be evaluated, and LLVM in that case (annoyingly) warns and ignores the side-effects (but doesn't do then anything with it), if there are no side-effects, it will work like our if (!cond) __builtin_unreachable (); 2022-10-06 Jakub Jelinek <jakub@redhat.com> PR c++/106654 gcc/ * internal-fn.def (ASSUME): New internal function. * internal-fn.h (expand_ASSUME): Declare. * internal-fn.cc (expand_ASSUME): Define. * gimplify.cc (gimplify_call_expr): Gimplify IFN_ASSUME. * fold-const.h (simple_condition_p): Declare. * fold-const.cc (simple_operand_p_2): Rename to ... (simple_condition_p): ... this. Remove forward declaration. No longer static. Adjust function comment and fix a typo in it. Adjust recursive call. (simple_operand_p): Adjust function comment. (fold_truth_andor): Adjust simple_operand_p_2 callers to call simple_condition_p. * doc/extend.texi: Document assume attribute. Move fallthrough attribute example to its section. gcc/c-family/ * c-attribs.cc (handle_assume_attribute): New function. (c_common_attribute_table): Add entry for assume attribute. * c-lex.cc (c_common_has_attribute): Handle __have_cpp_attribute (assume). gcc/c/ * c-parser.cc (handle_assume_attribute): New function. (c_parser_declaration_or_fndef): Handle assume attribute. (c_parser_attribute_arguments): Add assume_attr argument, if true, parse first argument as conditional expression. (c_parser_gnu_attribute, c_parser_std_attribute): Adjust c_parser_attribute_arguments callers. (c_parser_statement_after_labels) <case RID_ATTRIBUTE>: Handle assume attribute. gcc/cp/ * cp-tree.h (process_stmt_assume_attribute): Implement C++23 P1774R8 - Portable assumptions. Declare. (diagnose_failing_condition): Declare. (find_failing_clause): Likewise. * parser.cc (assume_attr): New enumerator. (cp_parser_parenthesized_expression_list): Handle assume_attr. 
Remove identifier variable, for id_attr push the identifier into expression_list right away instead of inserting it before all the others at the end. (cp_parser_conditional_expression): New function. (cp_parser_constant_expression): Use it. (cp_parser_statement): Handle assume attribute. (cp_parser_expression_statement): Likewise. (cp_parser_gnu_attribute_list): Use assume_attr for assume attribute. (cp_parser_std_attribute): Likewise. Handle standard assume attribute like gnu::assume. * cp-gimplify.cc (process_stmt_assume_attribute): New function. * constexpr.cc: Include fold-const.h. (find_failing_clause_r, find_failing_clause): New functions, moved from semantics.cc with ctx argument added and if non-NULL, call cxx_eval_constant_expression rather than fold_non_dependent_expr. (cxx_eval_internal_function): Handle IFN_ASSUME. (potential_constant_expression_1): Likewise. * pt.cc (tsubst_copy_and_build): Likewise. * semantics.cc (diagnose_failing_condition): New function. (find_failing_clause_r, find_failing_clause): Moved to constexpr.cc. (finish_static_assert): Use it. Add auto_diagnostic_group. gcc/testsuite/ * gcc.dg/attr-assume-1.c: New test. * gcc.dg/attr-assume-2.c: New test. * gcc.dg/attr-assume-3.c: New test. * g++.dg/cpp2a/feat-cxx2a.C: Add colon to C++20 features comment, add C++20 attributes comment and move C++20 new features after the attributes before them. * g++.dg/cpp23/feat-cxx2b.C: Likewise. Test __has_cpp_attribute(assume). * g++.dg/cpp23/attr-assume1.C: New test. * g++.dg/cpp23/attr-assume2.C: New test. * g++.dg/cpp23/attr-assume3.C: New test. * g++.dg/cpp23/attr-assume4.C: New test.