path: root/gcc
Age | Commit message | Author | Files | Lines
2023-05-03 | c++: Avoid incorrect shortening of divisions [PR108365] | Jakub Jelinek | 3 | -2/+28
The following testcase is miscompiled, because we shorten the division in a case where it should not be shortened. Divisions (and modulos) can be shortened if it is unsigned division/modulo, or if it is signed division/modulo where we can prove the dividend will not be the minimum signed value or divisor will not be -1, because e.g. on sizeof(long long)==sizeof(int)*2 && __INT_MAX__ == 0x7fffffff targets (-2147483647 - 1) / -1 is UB but (int) (-2147483648LL / -1LL) is not, it is -2147483648. The primary aim of both the C and C++ FE division/modulo shortening I assume was for the implicit integral promotions of {,signed,unsigned} {char,short} and because at this point we have no VRP information etc., the shortening is done if the integral promotion is from unsigned type for the dividend or if the divisor is an integer constant other than -1. This works fine for char/short -> int promotions when char/short have smaller precision than int - unsigned char -> int or unsigned short -> int will always be a positive int, so never the most negative. Now, the C FE checks whether orig_op0 is TYPE_UNSIGNED where op0 is either the same as orig_op0 or that promoted to int, I think that works fine, if it isn't promoted, either the division/modulo common type will have the same precision as op0 but then the division/modulo is unsigned and so without UB, or it will be done in wider precision (e.g. because op1 has wider precision), but then op0 can't be minimum signed value. Or it has been promoted to int, but in that case it was again from narrower type and so never minimum signed int. But the C++ FE was checking if op0 is a NOP_EXPR from TYPE_UNSIGNED. First of all, not sure if the operand of NOP_EXPR couldn't be non-integral type where TYPE_UNSIGNED wouldn't be meaningful, but more importantly, even if it is a cast from unsigned integral type, we only know it can't be minimum signed value if it is a widening cast, if it is same precision or narrowing cast, we know nothing.
So, the following patch for the NOP_EXPR cases checks just in case that it is from integral type and more importantly checks it is a widening conversion. 2023-01-14 Jakub Jelinek <jakub@redhat.com> PR c++/108365 * typeck.c (cp_build_binary_op): For integral division or modulo, shorten if type0 is unsigned, or op0 is cast from narrower unsigned integral type or stripped_op1 is INTEGER_CST other than -1. * g++.dg/opt/pr108365.C: New test. * g++.dg/warn/pr108365.C: New test. (cherry picked from commit 5b3a88640f962d4ffca31ae651bed2d8672f1a8c)
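The hazard described in this commit can be sketched in plain C. This is only an illustration, assuming a target where long long is wider than int; the function names are hypothetical and are not part of the GCC sources.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Divide in the wider type.  For a nonzero divisor this is always
   defined, because the quotient of two int values fits in long long
   (assumption: long long is wider than int).  */
static long long divide_wide(int dividend, int divisor)
{
  assert(divisor != 0);
  return (long long) dividend / (long long) divisor;
}

/* True if the same division may be carried out directly in int,
   i.e. the single overflowing case INT_MIN / -1 is excluded.  */
static bool narrow_division_is_safe(int dividend, int divisor)
{
  return divisor != 0 && !(dividend == INT_MIN && divisor == -1);
}

/* A dividend widened from unsigned char is never negative, so it can
   never be INT_MIN and the narrow division is safe for any nonzero
   divisor.  */
static bool widened_unsigned_char_is_safe(unsigned char narrow, int divisor)
{
  int dividend = narrow;
  return narrow_division_is_safe(dividend, divisor);
}
```

The widening case is the one the front end can rely on without range information; a same-precision or narrowing cast from an unsigned type gives no such guarantee, because the converted value may be INT_MIN.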
2023-05-03 | match.pd: When simplifying BFR of an insert, require a mode precision integral type [PR108688] | Andrew Pinski | 2 | -1/+18
The same problem as PR 88739 has crept in, but this time in match.pd when simplifying bit_field_ref of a bit_insert. That is, we are generating a BIT_FIELD_REF of a non-mode-precision integral type. PR tree-optimization/108688 * match.pd (bit_field_ref [bit_insert]): Avoid generating BIT_FIELD_REFs of non-mode-precision integral operands. * gcc.c-torture/compile/pr108688-1.c: New test. (cherry picked from commit 44f308e59bfa0f93ae05b17e257d8563c12399fd)
2023-05-03 | fortran: Fix up hash table usage in gfc_trans_use_stmts [PR108451] | Jakub Jelinek | 1 | -1/+5
The first testcase in the PR (which I haven't included in the patch because it is unclear to me if it is supposed to be valid or not) ICEs since extra hash table checking has been added recently. The problem is that gfc_trans_use_stmts does tree *slot = entry->decls->find_slot_with_hash (rent->use_name, hash, INSERT); if (*slot == NULL) and later on doesn't store anything into *slot and continues. Another spot a few lines later correctly clears the slot if it decides not to use the slot, so the following patch does the same. 2023-02-03 Jakub Jelinek <jakub@redhat.com> PR fortran/108451 * trans-decl.c (gfc_trans_use_stmts): Call clear_slot before doing continue. (cherry picked from commit 76f7f0eddcb7c418d1ec3dea3e2341ca99097301)
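The reserve-then-skip problem from this commit can be modelled with a toy table. The sketch below is not GCC's hash_table; the struct, the function names and the simplified clear_slot (a real open-addressing table would use a deleted marker) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8

struct name_table {
  const char *slots[TABLE_SIZE];
  size_t n_elements;            /* slots the table believes are in use */
};

/* Look up KEY; if it is absent, reserve an empty slot for it
   (INSERT-style lookup).  Returns NULL only when the table is full.  */
static const char **find_slot_insert(struct name_table *table,
                                     const char *key, unsigned hash)
{
  for (size_t probe = 0; probe < TABLE_SIZE; probe++) {
    const char **slot = &table->slots[(hash + probe) % TABLE_SIZE];
    if (*slot == NULL) {
      table->n_elements++;      /* the slot is now counted as used */
      return slot;
    }
    if (strcmp(*slot, key) == 0)
      return slot;
  }
  return NULL;
}

/* Give back a slot that was reserved but never filled.  */
static void clear_slot(struct name_table *table, const char **slot)
{
  *slot = NULL;
  table->n_elements--;
}

/* The kind of check that extra hash table checking performs: the
   counter must match the number of non-empty slots.  */
static int table_is_consistent(const struct name_table *table)
{
  size_t filled = 0;
  for (size_t i = 0; i < TABLE_SIZE; i++)
    filled += table->slots[i] != NULL;
  return filled == table->n_elements;
}

/* Record NAME unless SKIP is set; the skip path must release the slot.  */
static void record_name(struct name_table *table, const char *name,
                        unsigned hash, int skip)
{
  const char **slot = find_slot_insert(table, name, hash);
  assert(slot != NULL);
  if (*slot == NULL) {
    if (skip) {
      clear_slot(table, slot);  /* omitting this leaves a stale reservation */
      return;
    }
    *slot = name;
  }
}
```

Dropping the clear_slot call in the skip path leaves n_elements one higher than the number of filled slots, which is the inconsistency a checking build would flag.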
2023-05-03 | nested, openmp: Wrap OMP_CLAUSE_*_GIMPLE_SEQ into GIMPLE_BIND for ↵ | Jakub Jelinek | 2 | -16/+34
declare_vars [PR108435] When gimplifying OMP_CLAUSE_{LASTPRIVATE,LINEAR}_STMT, we wrap it always into a GIMPLE_BIND, but when putting statements directly into OMP_CLAUSE_{LASTPRIVATE,LINEAR}_GIMPLE_SEQ, we do it only if needed (there are any temporaries that need to be declared in the sequence). convert_nonlocal_omp_clauses was relying on the GIMPLE_BIND to be there always because it called declare_vars on it. The following patch wraps it into GIMPLE_BIND in tree-nested if we need to declare_vars on it on demand. 2023-02-02 Jakub Jelinek <jakub@redhat.com> PR middle-end/108435 * tree-nested.c (convert_nonlocal_omp_clauses) <case OMP_CLAUSE_LASTPRIVATE>: If info->new_local_var_chain and *seq is not a GIMPLE_BIND, wrap the sequence into a new GIMPLE_BIND before calling declare_vars. (convert_nonlocal_omp_clauses) <case OMP_CLAUSE_LINEAR>: Merge with the OMP_CLAUSE_LASTPRIVATE handling except for whether seq is initialized to &OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (clause) or &OMP_CLAUSE_LINEAR_GIMPLE_SEQ (clause). * gcc.dg/gomp/pr108435.c: New test. (cherry picked from commit 0f349928e16fdc7dba52561e8d40347909f9f0ff)
2023-05-03 | ree: Fix -fcompare-debug issues in combine_reaching_defs [PR108573] | Jakub Jelinek | 2 | -2/+22
The PR78437 r7-4871 changes made combine_reaching_defs punt on WORD_REGISTER_OPERATIONS targets if a setter of smaller than word register has wider uses. This unfortunately breaks -fcompare-debug, because if such a use appears only in DEBUG_INSN(s), while all other uses aren't wider than the setter, we can REE optimize it without -g and not with -g. Such decisions shouldn't be based on debug instructions. We could try to reset them or adjust in some other way after we decide to perform the change, but at least on the testcase which used to fail on riscv64-linux the (debug_insn 8 7 9 2 (var_location:HI s (minus:HI (subreg:HI (and:DI (reg:DI 10 a0 [160]) (const_int 1 [0x1])) 0) (subreg:HI (ashiftrt:DI (reg/v:DI 9 s1 [orig:151 l ] [151]) (debug_expr:SI D#1)) 0))) "pr108573.c":12:5 -1 (nil)) clearly doesn't care about the upper bits and I have a hard time imagining how one could end up with a DEBUG_INSN which actually cares about those upper bits. So, the following patch just ignores uses on DEBUG_INSNs in this case, if we run into something where we'd need to do something further later on, let's deal with it when we have a testcase for it. 2023-02-01 Jakub Jelinek <jakub@redhat.com> PR debug/108573 * ree.c (combine_reaching_defs): Don't return false for paradoxical subregs in DEBUG_INSNs. * gcc.dg/pr108573.c: New test. (cherry picked from commit e4473d7cf871c8ddf8f22d105c5af6375ebe37bf)
2023-05-03 | c++, openmp: Handle some OMP_*/OACC_* constructs during constant expression ↵ | Jakub Jelinek | 2 | -0/+77
evaluation [PR108607] While potential_constant_expression_1 handled most of OMP_* codes (by saying that they aren't potential constant expressions), OMP_SCOPE was missing in that list. I've also added OMP_SCAN, though that is less important (similarly to OMP_SECTION it ought to appear solely inside of OMP_{FOR,SIMD} resp. OMP_SECTIONS). As the testcase shows, it isn't enough, potential_constant_expression_1 can catch only some cases, as soon as one uses switch or ifs where at least one of the possible paths could be constant expression, we can run into the same codes during cxx_eval_constant_expression, so this patch handles those there as well. 2023-02-01 Jakub Jelinek <jakub@redhat.com> PR c++/108607 * constexpr.c (cxx_eval_constant_expression): Handle OMP_* and OACC_* constructs as non-constant. (potential_constant_expression_1): Handle OMP_SCAN. * g++.dg/gomp/pr108607.C: New test. (cherry picked from commit bfc070595bfb00abef88a002eee5d9117f5b86a7)
2023-05-03 | bbpart: Fix up ICE on asm goto [PR108596] | Jakub Jelinek | 2 | -1/+46
On the following testcase we have asm goto in hot block with 2 successors, one cold to which it both falls through and has one of the labels pointing to it and another hot successor with another label. Now, during bbpart we want to ensure that no blocks from one partition fall through into a block in a different partition. fix_up_fall_thru_edges does that by temporarily clearing the EDGE_CROSSING on the fallthrough edge, calling force_nonfallthru and then depending on whether it created a new bb either setting EDGE_CROSSING on the single successor edge from the new bb (the new bb is kept in the same partition as the predecessor block), or if no new bb has been created setting EDGE_CROSSING back on the fallthru edge which has been forced non-EDGE_FALLTHRU. For asm goto this doesn't always work, force_nonfallthru can create a new bb and change the fallthrough edge to point to that, but if the original fallthru destination block has its label referenced among the asm goto labels, it will create a new non-fallthru edge for the label(s). But because we've temporarily cheated and cleared EDGE_CROSSING on the edge, it is cleared on the new edge as well, then the caller sees we've created a new bb and just sets EDGE_CROSSING on the single fallthru edge from the new bb. But the direct edge from cur_bb to fallthru edge's destination isn't handled and fails consistency checks afterwards, because it crosses partitions. The following patch notes the case and sets EDGE_CROSSING on that edge too. 2023-01-31 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/108596 * bb-reorder.c (fix_up_fall_thru_edges): Handle the case where cur_bb ends with asm goto and has a crossing fallthrough edge to the same bb that contains at least one of its labels by restoring EDGE_CROSSING flag even on possible edge from cur_bb to new_bb successor. * gcc.c-torture/compile/pr108596.c: New test. (cherry picked from commit 603a6fbcaac1e80aa90d1d26318c881a53473066)
2023-05-03 | doc: Fix up return type of __builtin_va_arg_pack_len [PR108560] | Jakub Jelinek | 1 | -1/+1
__builtin_va_arg_pack_len as implemented returned int since its introduction in 2007. The initial documentation didn't mention any return type, which changed in 2010 in r0-103077-gab940b73bfabe2cec4 during some documentation formatting cleanups https://gcc.gnu.org/legacy-ml/gcc-patches/2010-09/msg01632.html I can understand that for formatting some type was needed there but what exactly hasn't been really discussed. So, I think we should change documentation to match the implementation, rather than change implementation to match the documentation. Most people don't use more than 2147483647 arguments to inline functions, and on poor targets with 16-bit ints I bet even having more than 65535 arguments to inline functions would be highly unexpected. 2023-01-27 Jakub Jelinek <jakub@redhat.com> PR other/108560 * doc/extend.texi: Fix up return type of __builtin_va_arg_pack_len from size_t to int. (cherry picked from commit 16f30680f403891556da2ad6329fcef9dc9b47db)
2023-05-03 | options: fix cl_target_option_print_diff() with strings | Eric Biggers | 1 | -1/+1
Fix an obvious copy-and-paste error where ptr1 was used instead of ptr2. This bug caused the dump file produced by -fdump-ipa-inline-details to not correctly show the difference in target options when a function could not be inlined due to a target option mismatch. gcc/ChangeLog: PR bootstrap/90543 * optc-save-gen.awk: Fix copy-and-paste error. Signed-off-by: Eric Biggers <ebiggers@google.com> (cherry picked from commit 9f0cb3368af735e95776769c4f28fa9cbb60eaf8)
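The class of bug fixed here is easy to reproduce in miniature. The following sketch is not the code generated by optc-save-gen.awk; the struct, field and function names and the output format are hypothetical, chosen only to show why the second value must be read through ptr2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a saved set of target options.  */
struct target_options {
  const char *arch;
};

/* Write "NAME (old/new)" into BUFFER when the string option differs
   between PTR1 and PTR2.  Returns 0 and leaves BUFFER empty when
   there is no difference.  */
static int format_string_diff(char *buffer, size_t size, const char *name,
                              const struct target_options *ptr1,
                              const struct target_options *ptr2)
{
  assert(buffer != NULL && size > 0);
  buffer[0] = '\0';
  const char *old_value = ptr1->arch ? ptr1->arch : "(null)";
  /* The copy-and-paste variant would read ptr1 here again and print
     the same string on both sides.  */
  const char *new_value = ptr2->arch ? ptr2->arch : "(null)";
  if (strcmp(old_value, new_value) == 0)
    return 0;
  return snprintf(buffer, size, "%s (%s/%s)", name, old_value, new_value);
}
```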
2023-05-03 | c++: Fix up handling of references to anon union members in initializers ↵ | Jakub Jelinek | 2 | -10/+60
[PR53932] For anonymous union members we create artificial VAR_DECLs which have DECL_VALUE_EXPR for the actual COMPONENT_REF. That works just fine inside of functions (including global dynamic constructors), because during gimplification such VAR_DECLs are gimplified as their DECL_VALUE_EXPR. This is also done during regimplification. But references to these artificial vars in DECL_INITIAL expressions aren't ever replaced by the DECL_VALUE_EXPRs, so we end up either with link failures like on the testcase below, or worse ICEs with LTO. The following patch fixes those during cp_fully_fold_init where we already walk all the trees (!data->genericize means that function rather than cp_fold_function). 2023-01-19 Jakub Jelinek <jakub@redhat.com> PR c++/53932 * cp-gimplify.c (cp_fold_r): During cp_fully_fold_init replace DECL_ANON_UNION_VAR_P VAR_DECLs with their corresponding DECL_VALUE_EXPR. * g++.dg/init/pr53932.C: New test. (cherry picked from commit 9b9a989adc042b304572fd6d4ade46b47be6ccb8)
2023-05-03 | fortran: Fix up function types for realloc and sincos{,f,l} builtins [PR108349] | Jakub Jelinek | 1 | -18/+20
As reported in the PR, the FUNCTION_TYPE for __builtin_realloc in the Fortran FE is wrong since r0-100026-gb64fca63690ad which changed

-  tmp = tree_cons (NULL_TREE, pvoid_type_node, void_list_node);
-  tmp = tree_cons (NULL_TREE, size_type_node, tmp);
-  ftype = build_function_type (pvoid_type_node, tmp);
+  ftype = build_function_type_list (pvoid_type_node,
+                                    size_type_node, pvoid_type_node,
+                                    NULL_TREE);
   gfc_define_builtin ("__builtin_realloc", ftype, BUILT_IN_REALLOC,
                       "realloc", false);

The return type is correct, void *, but the first argument should be void * too and only second one size_t, while the above change changed realloc to be

  void *__builtin_realloc (size_t, void *);

I went through all other changes from that commit and found that __builtin_sincos{,f,l} got broken as well, instead of the former

  void __builtin_sincos{,f,l} (ftype, ftype *, ftype *);

where ftype is {double,float,long double} it is now incorrectly

  void __builtin_sincos{,f,l} (ftype *, ftype *);

The following patch fixes that, plus some formatting issues around the spots I've changed.

2023-01-11 Jakub Jelinek <jakub@redhat.com>

PR fortran/108349
* f95-lang.c (gfc_init_builtin_function): Fix up function types for BUILT_IN_REALLOC and BUILT_IN_SINCOS{F,,L}. Formatting fixes.

(cherry picked from commit 0986c351aa8a9f08b3cb614baec13564dd62c114)
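For reference, the argument order the commit restores matches the C library prototypes. The sketch below spells them out as function pointer types; the typedef and helper names are hypothetical, and sincos itself is a GNU extension that is only named here, not called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* realloc takes the pointer first and the size second.  */
typedef void *(*realloc_fn)(void *, size_t);

/* sincos takes the angle by value, followed by two result pointers.  */
typedef void (*sincos_fn)(double, double *, double *);

/* Resize ARRAY to NEW_COUNT ints through a correctly typed pointer.
   With the arguments swapped in the function type, the size would be
   passed where the pointer belongs.  */
static int *grow_int_array(int *array, size_t new_count,
                           realloc_fn reallocate)
{
  assert(new_count > 0);
  return reallocate(array, new_count * sizeof *array);
}
```

A call such as grow_int_array(NULL, 2, realloc) behaves like malloc, and a later call with a larger count preserves the existing elements.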
2023-05-03 | generic-match-head: Don't assume GENERIC folding is done only early [PR108237] | Jakub Jelinek | 2 | -1/+17
We ICE on the following testcase, because a valid V2DImode != comparison is folded into an unsupported V2DImode > comparison. The match.pd pattern which does this looks like:

/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
   where ~Y + 1 == pow2 and Z = ~Y.  */
(for cst (VECTOR_CST INTEGER_CST)
 (for cmp (eq ne)
      icmp (le gt)
  (simplify
   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
      (with { auto optab = VECTOR_TYPE_P (TREE_TYPE (@1))
                         ? optab_vector : optab_default;
              tree utype = unsigned_type_for (TREE_TYPE (@1)); }
       (if (target_supports_op_p (utype, icmp, optab)
            || (optimize_vectors_before_lowering_p ()
                && (!target_supports_op_p (type, cmp, optab)
                    || !target_supports_op_p (type, BIT_AND_EXPR, optab))))
        (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
         (icmp @0 { csts; })
         (icmp (view_convert:utype @0) { csts; })))))))))

and that optimize_vectors_before_lowering_p () guarded stuff there already deals with this problem, not trying to fold a supported comparison into a non-supported one. The reason it doesn't work in this case is that it isn't GIMPLE folding which does this, but GENERIC folding done during forwprop4 - forward_propagate_into_comparison -> forward_propagate_into_comparison_1 -> combine_cond_expr_cond -> fold_binary_loc -> generic_simplify and we simply assumed that GENERIC folding happens only before gimplification. The following patch fixes that by checking cfun properties instead of always returning true in those cases.

2023-01-04 Jakub Jelinek <jakub@redhat.com>

PR middle-end/108237
* generic-match-head.c: Include tree-pass.h. (canonicalize_math_p): Define to false if cfun and cfun->curr_properties has PROP_gimple_opt_math resp. PROP_gimple_lvec property set.
* gcc.c-torture/compile/pr108237.c: New test.

(cherry picked from commit 345dffd0d4ebff7e705dfff1a8a72017a167120a)
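The scalar form of the transformation named in the match.pd comment, (X & Y) != 0 becoming X > ~Y when ~Y + 1 is a power of two, can be checked with a small C sketch. The function names are hypothetical and the code works on 32-bit unsigned scalars only; it says nothing about the vector case that caused the ICE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if ~y + 1 is a power of two, i.e. y consists of high one bits
   followed by low zero bits (for example 0xFFFFFFF0).  */
static bool mask_is_foldable(uint32_t y)
{
  uint32_t p = (uint32_t) ~y + 1u;
  return p != 0 && (p & (p - 1u)) == 0;
}

/* The original comparison.  */
static bool masked_ne_zero(uint32_t x, uint32_t y)
{
  return (x & y) != 0;
}

/* The folded comparison: unsigned greater-than against Z = ~Y.  */
static bool folded_gt(uint32_t x, uint32_t y)
{
  assert(mask_is_foldable(y));
  return x > (uint32_t) ~y;
}

/* Count the inputs below LIMIT on which the two forms disagree.  */
static unsigned count_mismatches(uint32_t y, uint32_t limit)
{
  unsigned mismatches = 0;
  for (uint32_t x = 0; x < limit; x++)
    mismatches += masked_ne_zero(x, y) != folded_gt(x, y);
  return mismatches;
}
```

For a foldable mask the two forms agree on every input, which is why the fold is valid whenever the target actually supports the unsigned comparison.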
2023-05-03 | tree-ssa-dom: can_infer_simple_equiv fixes [PR108068] | Jakub Jelinek | 4 | -6/+50
As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find if a floating point constant is zero and it shouldn't try to infer equivalences from comparison against it if signed zeros are honored. This doesn't work at all for decimal types, because real_zerop always returns false for them (one can have different representations of decimal zero beyond -0/+0), and it doesn't work for vector compares either, as real_zerop checks if all elements are zero, while we need to avoid inferring equivalences from comparison against vector constants which have at least one zero element in it (if signed zeros are honored). Furthermore, as mentioned by Joseph, for decimal types many other values aren't singleton. So, this patch stops inferring anything if element mode is decimal, and otherwise uses instead of real_zerop a new function, real_maybe_zerop, which will work even for decimal types and for complex or vector will return true if any element is or might be zero (so it returns true for anything but constants for now). 2022-12-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/108068 * tree.h (real_maybe_zerop): Declare. * tree.c (real_maybe_zerop): Define. * tree-ssa-dom.c (record_edge_info): Use it instead of real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop. Always set can_infer_simple_equiv to false for decimal floating point types. * gcc.dg/dfp/pr108068.c: New test. (cherry picked from commit fd1b0aefda5b65f3f841ca6e61ccea6a72daa060)
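Why a comparison against zero cannot be turned into an equivalence when signed zeros are honored can be shown with binary floating point in C. This is a minimal sketch assuming IEEE 754 doubles; the function names are hypothetical and it does not cover the decimal or vector cases discussed in the commit.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* What the conditional branch tests.  */
static bool compares_equal(double a, double b)
{
  return a == b;
}

/* What substituting one value for the other would require.  */
static bool same_value_and_sign(double a, double b)
{
  return a == b && !signbit(a) == !signbit(b);
}

/* A simplified model of the decision: knowing x == CONSTANT only
   lets us replace x by CONSTANT if CONSTANT is not a zero, or if
   signed zeros are not honored.  */
static bool can_infer_equivalence(double constant, bool honor_signed_zeros)
{
  assert(!isnan(constant));
  return !(honor_signed_zeros && constant == 0.0);
}
```

Here -0.0 == 0.0 holds, yet the two values differ in sign, so replacing x with 0.0 on the true edge of x == 0.0 could change later results.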
2023-05-03 | cse: Fix up CSE const_anchor handling [PR108193] | Jakub Jelinek | 2 | -5/+29
The following testcase ICEs on aarch64, because insert_const_anchor inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode. The second hunk of the patch fixes that, the first one is to avoid triggering undefined behavior at compile time during compute_const_anchors computations - performing those additions and subtractions in HOST_WIDE_INT means it can overflow for certain constants. 2022-12-22 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/108193 * cse.c (compute_const_anchors): Change n type to unsigned HOST_WIDE_INT, adjust comparison against it to avoid warnings. Formatting fix. (insert_const_anchor): Use gen_int_mode instead of GEN_INT. * gfortran.dg/pr108193.f90: New test. (cherry picked from commit 0cb5d7cdbab8e5f8359764ef5f62d93c2bc88552)
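Both halves of this fix can be illustrated outside of GCC. The sketch below is an assumption-laden model, not the cse.c code: sign_extend_to_precision plays the role that gen_int_mode plays for a 32-bit mode, and anchor_above uses unsigned arithmetic so that the addition cannot overflow a signed type. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Canonicalize VALUE for a mode of PRECISION bits: keep the low
   PRECISION bits and sign-extend them into a 64-bit host integer.  */
static int64_t sign_extend_to_precision(uint64_t value, unsigned precision)
{
  assert(precision >= 1 && precision <= 63);
  uint64_t mask = (UINT64_C(1) << precision) - 1;
  uint64_t sign_bit = UINT64_C(1) << (precision - 1);
  uint64_t low = value & mask;
  return (int64_t) (low ^ sign_bit) - (int64_t) sign_bit;
}

/* Round VALUE down to a multiple of STEP (a power of two), add STEP,
   and canonicalize the result.  The arithmetic is done in an unsigned
   type, where wraparound is well defined.  */
static int64_t anchor_above(int64_t value, unsigned precision, uint64_t step)
{
  uint64_t n = ((uint64_t) value & ~(step - 1)) + step;
  return sign_extend_to_precision(n, precision);
}
```

For a 32-bit mode, the anchor above 0x7fffff01 with a step of 0x100 is 0x80000000 as a bit pattern, but its canonical host representation is the sign-extended value -2147483648.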
2023-05-03 | openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180] | Jakub Jelinek | 1 | -0/+5
DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR of this->field used just during gimplification and omp lowering/expansion to privatize individual fields in methods when needed. As the following testcase shows, when not in templates, they were handled right, but in templates we actually called cp_finish_decl on them and that can result in their destruction, which is obviously undesirable, we should only destruct the privatized copies of them created in omp lowering. Fixed thusly. 2022-12-21 Jakub Jelinek <jakub@redhat.com> PR c++/108180 * pt.c (tsubst_expr): Don't call cp_finish_decl on DECL_OMP_PRIVATIZED_MEMBER vars. * testsuite/libgomp.c++/pr108180.C: New test. (cherry picked from commit 1119902b6c7c1c50123ed85ec1def8be4772d68c)
2023-05-03 | testsuite: Fix up pr64536.c for LLP64 targets [PR108151] | Jakub Jelinek | 1 | -2/+2
Apparently llp64 had 2 further warnings, fixed thusly. 2022-12-19 Jakub Jelinek <jakub@redhat.com> PR testsuite/108151 * gcc.dg/pr64536.c (bar): Cast long to __INTPTR_TYPE__ before casting to long *. (cherry picked from commit 6e85f89a7d59a99a3395b6e153b99262a58b2f6c)
2023-05-03 | testsuite: Fix up pr64536.c for LLP64 targets [PR108151] | Jakub Jelinek | 1 | -2/+2
The test casts a pointer to long, which is ok for ilp32 and lp64 targets but not for llp64 targets. Nothing reads the values later, it is a link test, so all we care about is that it is the same cast on s390x-linux where it used to fail before the PR64536 fix, and that we don't warn about it. 2022-12-19 Jakub Jelinek <jakub@redhat.com> PR testsuite/108151 * gcc.dg/pr64536.c (bar): Use casts to __INTPTR_TYPE__ rather than long when casting pointer to integral type. (cherry picked from commit ea37e96a37b50dad17b91d46edc518bbb9132d8e)
2023-05-03 | loop-invariant: Split preheader edge if the preheader bb ends with jump ↵ | Jakub Jelinek | 2 | -0/+19
[PR106751] The RTL loop passes only request simple preheaders, but don't require fallthru preheaders, while move_invariant_reg apparently assumes the latter, that it can just append instruction(s) to the end of the preheader basic block. The following patch fixes that by splitting the preheader edge if the preheader bb ends with a JUMP_INSN (asm goto in this case). Without that we get control flow in the middle of a bb. 2022-12-16 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/106751 * loop-invariant.c (move_invariant_reg): If preheader bb ends with a JUMP_INSN, split the preheader edge and emit invariants into the new preheader basic block. * gcc.c-torture/compile/pr106751.c: New test. (cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)
2023-05-03 | c++: Ensure !!var is not an lvalue [PR107065] | Jakub Jelinek | 3 | -3/+24
The TRUTH_NOT_EXPR case in cp_build_unary_op is one of the spots where we somewhat fold immediately using invert_truthvalue_loc. I've tried using return build1_loc (location, TRUTH_NOT_EXPR, boolean_type_node, arg); in there instead, but unfortunately that regressed Wlogical-not-parentheses-*.c pr49706.c pr62199.c pr65120.c sequence-pt-1.C tests, so at least for backporting that doesn't seem to be a way to go. So, this patch instead wraps it into NON_LVALUE_EXPR if needed (which also needs a tweak for some tests in the pr47906.c test, but nothing major), with the intent to make it backportable, and later I'll try to do further steps to avoid folding here prematurely. Most of the problems with build1 TRUTH_NOT_EXPR are that it doesn't even invert comparisons as the most common case and lots of warning code isn't able to deal with ! around comparisons; so perhaps one way to do this would be to fold by hand only invertible comparisons and for the rest create TRUTH_NOT_EXPR. 2022-12-15 Jakub Jelinek <jakub@redhat.com> PR c++/107065 gcc/cp/ * typeck.c (cp_build_unary_op) <case TRUTH_NOT_EXPR>: If invert_truthvalue_loc returns obvalue_p, wrap it into NON_LVALUE_EXPR. * parser.c (cp_parser_binary_expression): Don't call warn_logical_not_parentheses if current.lhs is a NON_LVALUE_EXPR of a decl with boolean type. gcc/testsuite/ * g++.dg/cpp0x/pr107065.C: New test. (cherry picked from commit 8b775b4c48a3cc4ef5c50e56144aea02da2e9cc6)
2023-05-03 | ivopts: Fix IP_END handling for asm goto [PR107997] | Jakub Jelinek | 2 | -0/+30
The following testcase ICEs, because the latch bb ends with asm goto which has both fallthrough to the header and one or more labels in the header too. In that case there is just a single edge out of the latch block, but still the asm goto is stmt_ends_bb_p statement, yet ivopts decides to emit an IV bump at the IP_END position and inserts it into the same bb as the asm goto after it, which then fails verification (control flow in the middle of bb). The following patch fixes it by splitting the latch -> header edge in that case and inserting into the newly created bb, where split_edge -> redirect_edge_and_branch is able to deal with this case correctly. 2022-12-10 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/107997 * tree-ssa-loop-ivopts.c: Include cfganal.h. (create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends with a stmt which ends bb, instead of adding iv update after it split the latch edge and insert iterator into the new latch bb. * gcc.c-torture/compile/pr107997.c: New test. (cherry picked from commit 7676235f690e624b7ed41a22b22ce8ccfac1492f)
2023-05-03 | cfgbuild: Fix DEBUG_INSN handling in find_bb_boundaries [PR106719] | Jakub Jelinek | 2 | -2/+60
The following testcase FAILs on aarch64-linux. We have some atomic instruction followed by 2 DEBUG_INSNs (if -g only of course) followed by NOTE_INSN_EPILOGUE_BEG followed by some USE insn. Now, split3 pass replaces the atomic instruction with a code sequence which ends with a conditional jump and the split3 pass calls find_many_sub_basic_blocks. For -g0, find_bb_boundaries sees the flow_transfer_insn (the new conditional jump), then NOTE_INSN_EPILOGUE_BEG which can live in between basic blocks and then the USE insn, so splits block after the NOTE_INSN_EPILOGUE_BEG and puts the NOTE in between the blocks. For -g, it sees a DEBUG_INSN after the flow_transfer_insn, so sets debug_insn to it, then walks over another DEBUG_INSN, NOTE_INSN_EPILOGUE_BEG until it finally sees the USE insn, and triggers the:

      rtx_insn *prev = PREV_INSN (insn);

      /* If the first non-debug inside_basic_block_p insn after a control
         flow transfer is not a label, split the block before the debug
         insn instead of before the non-debug insn, so that the debug
         insns are not lost.  */
      if (debug_insn && code != CODE_LABEL && code != BARRIER)
        prev = PREV_INSN (debug_insn);

code I've added for PR81325. If there are only DEBUG_INSNs, that is the right thing to do, but if in between debug_insn and insn there are notes which can stay in between basic blocks or similarly JUMP_TABLE_DATA or their associated CODE_LABELs, it causes -fcompare-debug differences. The following patch fixes it by clearing debug_insn if JUMP_TABLE_DATA or associated CODE_LABEL is seen (I'm afraid there is no good answer what to do with DEBUG_INSNs before those; the code then removes them:

      /* Clean up the bb field for the insns between the blocks.  */
      for (x = NEXT_INSN (flow_transfer_insn);
           x != BB_HEAD (fallthru->dest);
           x = next)
        {
          next = NEXT_INSN (x);
          /* Debug insns should not be in between basic blocks,
             drop them on the floor.  */
          if (DEBUG_INSN_P (x))
            delete_insn (x);
          else if (!BARRIER_P (x))
            set_block_for_insn (x, NULL);
        }

), but if there are NOTEs, the patch just reorders the NOTEs and DEBUG_INSNs, such that the NOTEs come first (so that they stay in between basic blocks like with -g0) and DEBUG_INSNs after those (so that bb is split before them, so they will be in the basic block after NOTE_INSN_BASIC_BLOCK).

2022-12-08 Jakub Jelinek <jakub@redhat.com>

PR debug/106719
* cfgbuild.c (find_bb_boundaries): If there are NOTEs in between debug_insn (seen after flow_transfer_insn) and insn, move NOTEs before all the DEBUG_INSNs and split after NOTEs. If there are other insns like jump table data, clear debug_insn.
* gcc.dg/pr106719.c: New test.

(cherry picked from commit d9f9d5d30feb33c359955d7030cc6be50ef6dc0a)
2023-05-03 | asan: Fix up error recovery for too large frames [PR107317] | Jakub Jelinek | 2 | -0/+19
asan_emit_stack_protection and functions it calls have various asserts that verify sanity of the stack protection instrumentation. But, that verification can easily fail if we've diagnosed a frame offset overflow. asan_emit_stack_protection just emits some extra code in the prologue, if we've reported errors, we aren't producing assembly, so it doesn't really matter if we don't include the protection code, compilation is going to fail anyway. 2022-11-24 Jakub Jelinek <jakub@redhat.com> PR middle-end/107317 * asan.c: Include diagnostic-core.h. (asan_emit_stack_protection): Return NULL early if seen_error (). * gcc.dg/asan/pr107317.c: New test. (cherry picked from commit b6330a7685476fc30b8ae9bbf3fca1a9b0d4be95)
2023-05-03 | i386: Uglify some local identifiers in *intrin.h [PR107748] | Jakub Jelinek | 1 | -6/+7
While reporting PR107748 (where there is a problem with non-uglified names, but I've left it out because it needs fixing anyway), I've noticed various spots where identifiers in *intrin.h headers weren't uglified. The following patch fixes those that are related to unions (I've grepped for [a-zA-Z]\.[a-zA-Z] spots). The reason we need those to be uglified is the same as why the arguments of the inlines are __ prefixed and most of automatic vars in the inlines - say a, v or u aren't part of implementation namespace and so users could #define u whatever->something #include <x86intrin.h> and it should still work, as long as u is not e.g. one of the names of the functions/macros the header provides (_mm* etc.). 2022-11-21 Jakub Jelinek <jakub@redhat.com> PR target/107748 * config/i386/smmintrin.h (_mm_extract_ps): Uglify names of local variables and union members. (cherry picked from commit ec8ec09f9414be871e322fecf4ebf53e3687bd22)
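The effect of uglified names can be demonstrated with a header-style inline in a single file. This is a sketch, not the smmintrin.h code: demo_float_bits is a hypothetical function, the double-underscore names imitate what an implementation header may use (ordinary user code should not define such names), and the expected bit patterns assume IEEE 754 binary32 floats with a 32-bit int.

```c
#include <assert.h>

/* Legal user macros, defined before the "header" part below.  */
#define u whatever->something
#define f 1 +

/* Compile-time check of the size assumption (negative array size on
   failure).  */
typedef char demo_int_matches_float[sizeof(int) == sizeof(float) ? 1 : -1];

/* Header-style inline that uses only reserved names for its parameter,
   its local variable and its union members, so the macros above cannot
   interfere with it.  With a union { int i; float f; } u; instead, the
   macro expansions would break the build.  */
static inline int demo_float_bits(float __x)
{
  union { int __i; float __f; } __u;
  __u.__f = __x;
  return __u.__i;
}

#undef u
#undef f

/* Small self-check that can be called from a test driver.  */
static void demo_check(void)
{
  assert(demo_float_bits(0.0f) == 0);
}
```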
2023-05-03 | reg-stack: Fix a -fcompare-debug bug in reg-stack [PR107183] | Jakub Jelinek | 2 | -21/+77
As the following testcase shows, the swap_rtx_condition function in reg-stack can result in different code generation between -g and -g0. The function is doing the changes as it goes, so does analysis and changes together, which makes it harder to deal with DEBUG_INSNs, where normally analysis phase ignores them and the later phase doesn't. swap_rtx_condition walks instructions two different ways, one is using next_flags_user function which stops on non-call instructions that mention the flags register, and the other is a loop on fnstsw where it stops on instructions mentioning it and tries to find sahf instruction that uses it (in both cases calls stop it and so does end of basic block). Now both of these currently stop on DEBUG_INSNs that mention the flags register resp. the fnstsw result register. On success the function recurses on next flags user instruction if still live and if the recursion failed, reverts the changes it did too and fails. If it were just for the next_flags_user case, the fix could be just not doing INSN_CODE (insn) = -1; if (recog_memoized (insn) == -1) fail = 1; on DEBUG_INSNs (assuming all changes to those are fine), swap_rtx_condition_1 just changes one comparison to a different one. But due to the possibility of fnstsw result being used in theory before sahf in some DEBUG_INSNs, this patch takes a different approach. swap_rtx_condition has now a new argument and two modes. The first mode is when debug_seen is >= 0, in this case both next_flags_user and the loop for fnstsw -> sahf will ignore but note DEBUG_INSNs (that mention flags register or fnstsw result). If no such DEBUG_INSN is found during the whole call including recursive invocations (so e.g. for -g0 but probably most often for -g as well), it behaves as before, if it returns true all the changes are done and nothing further needs to be done later. 
If any DEBUG_INSNs are seen along the way, even when returning success all the changes are reverted, so it just reports that the function would be successful if DEBUG_INSNs were ignored. In this case, compare_for_stack_reg needs to call it again in debug_seen = -1 mode, which tells the function to update everything including DEBUG_INSNs. For the fnstsw -> sahf case which I hope will be very rare I just reset the DEBUG_INSNs, I don't really know how to express it easily otherwise. For the rest swap_rtx_condition_1 is done even on the DEBUG_INSNs. 2022-11-20 Jakub Jelinek <jakub@redhat.com> PR target/107183 * reg-stack.c (next_flags_user): Add DEBUG_SEEN argument. If >= 0 and a DEBUG_INSN would be otherwise returned, set DEBUG_SEEN to 1 and ignore it. (swap_rtx_condition): Add DEBUG_SEEN argument. In >= 0 mode only set DEBUG_SEEN to 1 if problematic DEBUG_INSNs were seen and revert all changes on success in that case. Don't try to recog_memoized DEBUG_INSNs. (compare_for_stack_reg): Adjust swap_rtx_condition caller. If it returns true and debug_seen is 1, call swap_rtx_condition again with debug_seen -1. * gcc.dg/ubsan/pr107183.c: New test. (cherry picked from commit 6b5c98c1c0003bd470a4428bede6c862637a94b8)
2023-05-03 | c, c++: Fix up excess precision handling of scalar_to_vector conversion ↵ | Jakub Jelinek | 2 | -2/+32
[PR107358] As mentioned earlier in the C++ excess precision support mail, the following testcase is broken with excess precision both in C and C++ (though just in C++ it was triggered in real-world code). scalar_to_vector is called in both FEs after the excess precision promotions (or stripping of EXCESS_PRECISION_EXPR), so we can then get invalid diagnostics that say float vector + float involves truncation (on ia32 from long double to float). The following patch fixes that by calling scalar_to_vector on the operands before the excess precision promotions, let scalar_to_vector just do the diagnostics (it does e.g. fold_for_warn so it will fold EXCESS_PRECISION_EXPR around REAL_CST to constants etc.) but will then do the actual conversions using the excess precision promoted operands (so say if we have vector double + (float + float) we don't actually do vector double + (float) ((long double) float + (long double) float) but vector double + (double) ((long double) float + (long double) float) 2022-10-24 Jakub Jelinek <jakub@redhat.com> PR c++/107358 gcc/c/ * c-typeck.c (build_binary_op): Pass operands before excess precision promotions to scalar_to_vector call. gcc/testsuite/ * c-c++-common/pr107358.c: New test. (cherry picked from commit 65e3274e363cb2c6bfe6b5e648916eb7696f7e2f)
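The "vector double + (float + float)" shape mentioned above can be sketched with GCC's generic vector extension. This is only an illustration and not the committed testcase: the type name `v2df` and the functions `add_scalar_sum` and `check_add_scalar_sum` are assumptions made for this example. On targets without excess precision the scalar sum is simply a float; on targets such as ia32 with x87 math it may be evaluated in long double before being converted to the vector element type.

```cpp
#include <cassert>

// GCC generic vector extension: two doubles in a 16-byte vector.
// The name v2df is chosen for this sketch only.
typedef double v2df __attribute__ ((vector_size (16)));

// The scalar operand (a + b) has type float.  It is converted to the
// vector element type (double) and broadcast to every lane, so no
// "involves truncation" diagnostic is expected here.
v2df add_scalar_sum (v2df v, float a, float b)
{
  return v + (a + b);
}

// Small self-check with values that are exact in float and double.
void check_add_scalar_sum ()
{
  v2df v = { 1.0, 2.0 };
  v2df r = add_scalar_sum (v, 0.5f, 0.25f);
  assert (r[0] == 1.75);
  assert (r[1] == 2.75);
}
```

The inputs are deliberately exact binary fractions, so the result does not depend on whether the scalar addition is carried out in float, double or long double.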
2023-05-03c++: Fix up constexpr handling of char/signed char/short pre/post inc/decrement [PR105774]Jakub Jelinek2-0/+27
signed char, char or short int pre/post inc/decrement are represented by normal {PRE,POST}_{INC,DEC}REMENT_EXPRs in the FE and only gimplification ensures that the {PLUS,MINUS}_EXPR is done in unsigned version of those types:
    case PREINCREMENT_EXPR:
    case PREDECREMENT_EXPR:
    case POSTINCREMENT_EXPR:
    case POSTDECREMENT_EXPR:
      {
        tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
        if (INTEGRAL_TYPE_P (type) && c_promoting_integer_type_p (type))
          {
            if (!TYPE_OVERFLOW_WRAPS (type))
              type = unsigned_type_for (type);
            return gimplify_self_mod_expr (expr_p, pre_p, post_p, 1, type);
          }
        break;
      }
This means during constant evaluation we need to do it similarly (either using unsigned_type_for or using widening to integer_type_node). The following patch does the latter. 2022-10-24 Jakub Jelinek <jakub@redhat.com> PR c++/105774 * constexpr.c (cxx_eval_increment_expression): For signed types that promote to int, evaluate PLUS_EXPR or MINUS_EXPR in int type. * g++.dg/cpp1y/constexpr-105774.C: New test. (cherry picked from commit da8c362c4c18cff2f2dfd5c4706bdda7576899a4)
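The "widen to int, add, convert back" evaluation described above can be modelled in plain C++. This is a hedged sketch and not the constexpr.c change itself: `inc_via_int` and `check_inc_via_int` are hypothetical names, and the wrap-around result assumes the conversion back to signed char is modular (guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: models ++c for a signed char by doing the
// addition in int and only then converting back to signed char.
// The int addition cannot overflow, so no undefined behaviour is
// involved even when c is SCHAR_MAX.
constexpr signed char inc_via_int (signed char c)
{
  return static_cast<signed char> (static_cast<int> (c) + 1);
}

// The ordinary case is usable in a constant expression.
static_assert (inc_via_int (5) == 6, "plain increment");

// Run-time self-check, including the wrap-around case.
void check_inc_via_int ()
{
  assert (inc_via_int (-1) == 0);
  assert (inc_via_int (SCHAR_MAX) == SCHAR_MIN);
}
```

Doing the arithmetic in the narrow signed type instead would make the SCHAR_MAX case look like signed overflow to a constant evaluator, which is the situation the commit message describes.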
2023-05-03openmp: Fix ICE with taskgroup at -O0 -fexceptions [PR107001]Jakub Jelinek3-3/+29
The following testcase ICEs because with -O0 -fexceptions GOMP_taskgroup_end call isn't directly followed by GOMP_RETURN statement, but there are some conditionals to handle exceptions and we fail to find the correct GOMP_RETURN. The fix is to treat taskgroup similarly to target data, both of these constructs emit a try { body } finally { end_call } around the construct's body during gimplification and we need to see proper construct nesting during gimplification and omp lowering (including nesting of regions checks), but during omp expansion we don't really need their nesting anymore, all we need is emit something at the start of the region and the end of the region is the end API call we've already emitted during gimplification. For target data, we weren't adding GOMP_RETURN statement during omp lowering, so after that pass it is treated merely like stand-alone omp directives. This patch does the same for taskgroup too. 2022-09-24 Jakub Jelinek <jakub@redhat.com> PR c/107001 * omp-low.c (lower_omp_taskgroup): Don't add GOMP_RETURN statement at the end. * omp-expand.c (build_omp_regions_1): Clarify GF_OMP_TARGET_KIND_DATA is not stand-alone directive. For GIMPLE_OMP_TASKGROUP, also don't update parent. (omp_make_gimple_edges) <case GIMPLE_OMP_TASKGROUP>: Reset cur_region back after new_omp_region. * c-c++-common/gomp/pr107001.c: New test. (cherry picked from commit ad2aab5c816a6fd56b46210c0a4a4c6243da1de9)
2023-05-03openmp, c: Tighten up c_tree_equal [PR106981]Jakub Jelinek2-6/+19
This patch changes c_tree_equal to work more like cp_tree_equal, be more strict in what it accepts. The ICE on the first testcase was due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which ICEs if the two constants have different precision, but as the second testcase shows, being too lenient in it can also lead to miscompilation of valid OpenMP programs where we think certain expression is the same even when it isn't and can be guaranteed at runtime to represent different memory location. So, the patch looks through only NON_LVALUE_EXPRs and for constants as well as casts requires that the types match before actually comparing the constant values or recursing on the cast operands. 2022-09-24 Jakub Jelinek <jakub@redhat.com> PR c/106981 gcc/c/ * c-typeck.c (c_tree_equal): Only strip NON_LVALUE_EXPRs at the start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and t2 have different types. gcc/testsuite/ * c-c++-common/gomp/pr106981.c: New test. libgomp/ * testsuite/libgomp.c-c++-common/pr106981.c: New test. (cherry picked from commit 3c5bccb608c665ac3f62adb1817c42c845812428)
2023-05-03c++: Implement P2327R1 - De-deprecating volatile compound operationsJakub Jelinek5-14/+28
From what I can see, this has been voted in as a DR and as it means we warn less often than before in -std={gnu,c}++2{0,3} modes or with -Wvolatile, I wonder if it shouldn't be backported to affected release branches as well. 2022-08-16 Jakub Jelinek <jakub@redhat.com> * typeck.c (cp_build_modify_expr): Implement P2327R1 - De-deprecating volatile compound operations. Don't warn for |=, &= or ^= with volatile lhs. * expr.c (mark_use) <case MODIFY_EXPR>: Adjust warning wording, leave out simple. * g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile compound |=, &= and ^= operations. * g++.dg/cpp2a/volatile3.C: Likewise. * g++.dg/cpp2a/volatile5.C: Likewise. (cherry picked from commit 6e790ca4615443fa395ac5cdba1ab6c87810985c)
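As a rough illustration of the operations this change stops warning about, the sketch below applies |=, &= and ^= to a volatile object. The variable `status_reg` and the helper names are hypothetical and stand in for something like a memory-mapped register; the results of the compound assignments are deliberately not used.

```cpp
#include <cassert>

// Hypothetical stand-in for a device register; illustration only.
volatile unsigned status_reg = 0;

// Bitwise compound assignments with a volatile left-hand side, the
// three forms named in the ChangeLog entry above.
void set_bits (unsigned mask)    { status_reg |= mask; }
void clear_bits (unsigned mask)  { status_reg &= ~mask; }
void toggle_bits (unsigned mask) { status_reg ^= mask; }

// Small self-check of the three helpers.
void check_volatile_bits ()
{
  status_reg = 0;
  set_bits (0x5u);
  assert (status_reg == 0x5u);
  clear_bits (0x1u);
  assert (status_reg == 0x4u);
  toggle_bits (0x6u);
  assert (status_reg == 0x2u);
}
```

Compilers that predate the change may still emit a deprecation warning for these lines in C++20 mode or with -Wvolatile; the code itself is valid either way.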
2023-05-03cgraphunit: Don't emit asm thunks for -dx [PR106261]Jakub Jelinek2-1/+37
When the -dx option is used (I didn't know we had it and have no idea what it is useful for), we just expand functions to RTL and then omit all further RTL passes, so the normal functions aren't actually emitted into assembly, just variables. The following testcase ICEs, because we don't emit the methods, but do emit thunks pointing to them, and those thunks have unwind info and rely on at least some real functions to be emitted (which is normally the case, as thunks are only emitted for locally defined functions), because otherwise there are no CIEs, only FDEs, and dwarf2out is upset about it. The following patch fixes that by not emitting assembly thunks for -dx either. 2022-07-27 Jakub Jelinek <jakub@redhat.com> PR debug/106261 * cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Don't output asm thunks for -dx. * g++.dg/debug/pr106261.C: New test. (cherry picked from commit f9671b60f9395cb1dca128b92f5dd215f5aeaae1)
2023-05-03wide-int: Fix up wi::shifted_mask [PR106144]Jakub Jelinek1-1/+12
As the following self-test testcase shows, wi::shifted_mask sometimes doesn't create canonicalized wide_ints, which then fail to compare equal to canonicalized wide_ints with the same value. In particular, wi::mask (128, false, 128) gives { -1 } with len 1 and prec 128, while wi::shifted_mask (0, 128, false, 128) gives { -1, -1 } with len 2 and prec 128. The problem is that the code is written with the assumption that there are 3 bit blocks (or 2 if start is 0), but doesn't consider the possibility where there are 2 bit blocks (or 1 if start is 0) where the highest block isn't present. In that case, there is the optional block of negate ? 0 : -1 elts, followed by just one elt (either one from the if (shift) or just negate ? -1 : 0) and the rest is implicit sign-extension. Only if end < prec there is 1 or more bits above it that have different bit value and so we need to emit all the elts till end and then one more elt. if (end == prec) would work too, because we have: if (width > prec - start) width = prec - start; unsigned int end = start + width; so end is guaranteed to be end <= prec, dunno what is preferred. 2022-07-01 Jakub Jelinek <jakub@redhat.com> PR middle-end/106144 * wide-int.cc (wi::shifted_mask): If end >= prec, return right after emitting element for shift or if shift is 0 first element after start. (wide_int_cc_tests): Add tests for equivalency of wi::mask and wi::shifted_mask with 0 start. (cherry picked from commit e52592073f6df3d7a3acd9f0436dcc32a8b7493d)
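Why { -1, -1 } and { -1 } must not both be allowed to represent the same 128-bit value can be seen in a deliberately simplified model. This is not the wide-int.cc code: `Blocks`, `canonicalize` and `check_canonicalize` are hypothetical names, and the model only assumes what the message states, namely that elements above the stored length are an implicit sign extension of the highest stored element.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified model: little-endian 64-bit elements; everything above
// the last stored element is its implicit sign extension.
using Blocks = std::vector<std::int64_t>;

// Drop top elements that only repeat the sign extension of the
// element below them, so each value has exactly one representation.
Blocks canonicalize (Blocks b)
{
  while (b.size () > 1)
    {
      std::int64_t below = b[b.size () - 2];
      std::int64_t extension = below < 0 ? -1 : 0;
      if (b.back () != extension)
        break;
      b.pop_back ();
    }
  return b;
}

// Self-check: the two all-ones forms from the message become equal.
void check_canonicalize ()
{
  assert ((canonicalize ({ -1, -1 }) == Blocks { -1 }));
  // Here the top element carries real information and must stay.
  assert (canonicalize ({ 0, -1 }).size () == 2);
  assert (canonicalize ({ -1, 0 }).size () == 2);
}
```

In this model an element-wise comparison is only meaningful once both operands are in canonical form, which is why a non-canonical result fails to compare equal to a canonical one with the same value.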
2023-05-03ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032]Jakub Jelinek2-7/+29
noce_try_sign_mask as documented will optimize if (c < 0) x = t; else x = 0; into x = (c >> bitsm1) & t; The optimization is done if either t is unconditional (e.g. for x = t; if (c >= 0) x = 0; ) or if it is cheap. We already check that t doesn't have side-effects, but if t is conditional, we need to punt also if it may trap or fault, as we make it unconditional. I've briefly skimmed other noce_try* optimizations and didn't find one that would suffer from the same problem. 2022-06-21 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/106032 * ifcvt.c (noce_try_sign_mask): Punt if !t_unconditional, and t may_trap_or_fault_p, even if it is cheap. * gcc.c-torture/execute/pr106032.c: New test. (cherry picked from commit a0c30fe3b888f20215f3e040d21b62b603804ca9)
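The transformation and the hazard can be sketched at the source level. This is an illustration under stated assumptions, not the RTL code in ifcvt.c: the function names are hypothetical, the type is fixed at 32 bits so that bitsm1 is 31, and the right shift of a negative value is assumed to be arithmetic (guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cassert>
#include <cstdint>

// Branchy form: t is only needed when c is negative.
std::int32_t select_branchy (std::int32_t c, std::int32_t t)
{
  return c < 0 ? t : 0;
}

// Sign-mask form: (c >> 31) is all ones for negative c and zero
// otherwise, so the AND yields either t or 0 without a branch.
std::int32_t select_masked (std::int32_t c, std::int32_t t)
{
  return (c >> 31) & t;
}

// When t is a memory read, the branchy form never touches *p for
// c >= 0.  A sign-mask rewrite would read *p unconditionally, which
// may fault if p is invalid in that case; this is why the
// optimization has to punt for a conditional t that may trap.
std::int32_t load_if_negative (std::int32_t c, const std::int32_t *p)
{
  return c < 0 ? *p : 0;
}

// Self-check: both forms agree for plain values.
void check_sign_mask ()
{
  const std::int32_t samples[] = { INT32_MIN, -5, -1, 0, 1, 7, INT32_MAX };
  for (std::int32_t c : samples)
    assert (select_branchy (c, 1234) == select_masked (c, 1234));
  // Safe only because the load stays conditional.
  assert (load_if_negative (1, nullptr) == 0);
}
```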
2023-05-03expand: Fix up expand_cond_expr_using_cmove [PR106030]Jakub Jelinek2-1/+18
If expand_cond_expr_using_cmove can't find a cmove optab for a particular mode, it tries to promote the mode and perform the cmove in the promoted mode. The testcase in the patch ICEs on arm because in that case we pass temp which has the promoted mode (SImode) as target to expand_operands where the operands have the non-promoted mode (QImode). Later on the function uses paradoxical subregs: if (GET_MODE (op1) != mode) op1 = gen_lowpart (mode, op1); if (GET_MODE (op2) != mode) op2 = gen_lowpart (mode, op2); to change the operand modes. The following patch fixes it by passing NULL_RTX as target if it has promoted mode. 2022-06-21 Jakub Jelinek <jakub@redhat.com> PR middle-end/106030 * expr.c (expand_cond_expr_using_cmove): Pass NULL_RTX instead of temp to expand_operands if mode has been promoted. * gcc.c-torture/compile/pr106030.c: New test. (cherry picked from commit 2df1df945fac85d7b3d084001414a66a2709d8fe)
2023-05-03Daily bump.GCC Administrator1-1/+1
2023-05-02Daily bump.GCC Administrator1-1/+1
2023-05-01Daily bump.GCC Administrator1-1/+1
2023-04-30Daily bump.GCC Administrator1-1/+1
2023-04-29Daily bump.GCC Administrator1-1/+1
2023-04-28Daily bump.GCC Administrator1-1/+1
2023-04-27Daily bump.GCC Administrator1-1/+1
2023-04-26Daily bump.GCC Administrator2-1/+6
2023-04-25testsuite: remove stray ';' [PR109608]Jason Merrill1-1/+1
GCC 10 is still pedantic about empty declarations. PR testsuite/109608 gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-pmf3.C: Remove stray ';'.
2023-04-25Daily bump.GCC Administrator1-1/+1
2023-04-24Daily bump.GCC Administrator1-1/+1
2023-04-23Daily bump.GCC Administrator1-1/+1
2023-04-22Daily bump.GCC Administrator4-1/+55
2023-04-21c-family: -Wsequence-point and COMPONENT_REF [PR107163]Jason Merrill2-1/+43
The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF as sequenced in C++17, and COMPONENT_REF as well. But this is unnecessary for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual evaluation, and in this testcase handling COMPONENT_REF as sequenced blows up fast in a deep inheritance tree. Instead, look through it. PR c++/107163 gcc/c-family/ChangeLog: * c-common.c (verify_tree): Don't use sequenced handling for COMPONENT_REF. gcc/testsuite/ChangeLog: * g++.dg/warn/Wsequence-point-5.C: New test.
2023-04-21c++: constexpr PMF conversion [PR105996]Jason Merrill2-8/+18
Here, we were calling build_reinterpret_cast regardless of whether there was actually a cast, and that now sets REINTERPRET_CAST_P. But that optimization seems dodgy anyway, as it involves NOP_EXPR from one RECORD_TYPE to another and we try to reserve NOP_EXPR for fundamental types. And the generated code seems the same, so let's drop it. And also strip location wrappers. PR c++/105996 gcc/cp/ChangeLog: * typeck.c (build_ptrmemfunc): Drop 0-offset optimization and location wrappers. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-pmf3.C: New test.
2023-04-21c++: constant, array, lambda, template [PR108975]Jason Merrill2-0/+17
When a lambda refers to a constant local variable in the enclosing scope, we tentatively capture it, but if we end up pulling out its constant value, we go back at the end of the lambda and prune any unneeded captures. Here while parsing the template we decided that the dim capture was unneeded, because we folded it away, but then we brought back the use in the template trees that try to preserve the source representation with added type info. So then when we tried to instantiate that use, we couldn't find what it was trying to use, and crashed. Fixed by not trying to prune when parsing a template; we'll prune at instantiation time. PR c++/108975 gcc/cp/ChangeLog: * lambda.c (prune_lambda_captures): Don't bother in a template. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-const11.C: New test.
2023-04-21c++: namespace-scoped friend in local class [PR69410]Jason Merrill3-5/+28
do_friend was only considering class-qualified identifiers for the qualified-id case, but we also need to skip local scope when there's an explicit namespace scope. PR c++/69410 gcc/cp/ChangeLog: * friend.c (do_friend): Handle namespace as scope argument. * decl.c (grokdeclarator): Pass down in_namespace. gcc/testsuite/ChangeLog: * g++.dg/lookup/friend24.C: New test.