path: root/gcc/testsuite/gcc.dg
Age | Commit message | Author | Files | Lines
2025-07-22 | Fix gcc.dg/vect/slp-28.c | Richard Biener | 1 | -5/+4
gcc.dg/vect/slp-28.c is now vectorized as expected even on targets without vect32. * gcc.dg/vect/slp-28.c: Adjust.
2025-07-21 | match: Add `cmp - 1` simplification to `-icmp` [PR110949] | Andrew Pinski | 2 | -0/+33
I have seen this in a few places, though the testcase from PR 95906 is an obvious place where this shows up. This converts `cmp - 1` into `-icmp`, as that form is more useful in many cases.

Changes since v1:
* v2: Add check for outer type's precision being greater than 1.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/110949
PR tree-optimization/95906

gcc/ChangeLog:
* match.pd (cmp - 1): New pattern.

gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/cmp-2.c: New test.
* gcc.dg/tree-ssa/max-bitcmp-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
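The identity behind the `cmp - 1` pattern can be checked in plain C. The following is only an illustrative sketch, not the match.pd pattern itself: the helper names `cmp_minus_one` and `neg_icmp` are made up here, and `icmp` is assumed to mean the inverted comparison.

```c
/* Hypothetical helpers, for illustration only; they are not part of GCC. */

/* Original form: a comparison result (0 or 1) minus one, giving 0 or -1. */
static int cmp_minus_one(int a, int b)
{
    return (a < b) - 1;
}

/* Rewritten form: the negated inverse comparison, also giving 0 or -1. */
static int neg_icmp(int a, int b)
{
    return -(a >= b);
}
```

Both helpers return 0 when `a < b` holds and -1 otherwise, so a caller can substitute one for the other.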
2025-07-21 | tree-optimization/121194 - check LC PHIs can be vectorized | Richard Biener | 1 | -0/+17
With bools we can have the usual mismatch between mask and data use. Catch that, like we do elsewhere. PR tree-optimization/121194 * tree-vect-loop.cc (vectorizable_lc_phi): Verify vector types are compatible. * gcc.dg/torture/pr121194.c: New testcase.
2025-07-21 | Error handling for hard register constraints | Stefan Schulze Frielinghaus | 7 | -19/+196
This implements error handling for hard register constraints including potential conflicts with register asm operands. In contrast to register asm operands, hard register constraints allow more than just one register per operand. Even more than just one register per alternative. For example, a valid constraint for an operand is "{r0}{r1}m,{r2}". However, this also means that we have to make sure that each register is used at most once in each alternative over all outputs and likewise over all inputs. For asm statements this is done by this patch during gimplification. For hard register constraints used in machine description, error handling is still a todo and I haven't investigated this so far and consider this rather a low priority. gcc/ada/ChangeLog: * gcc-interface/trans.cc (gnat_to_gnu): Pass null pointer to parse_{input,output}_constraint(). gcc/analyzer/ChangeLog: * region-model-asm.cc (region_model::on_asm_stmt): Pass null pointer to parse_{input,output}_constraint(). gcc/c/ChangeLog: * c-typeck.cc (build_asm_expr): Pass null pointer to parse_{input,output}_constraint(). gcc/ChangeLog: * cfgexpand.cc (n_occurrences): Move this ... (check_operand_nalternatives): and this ... (expand_asm_stmt): and the call to gimplify.cc. * config/s390/s390.cc (s390_md_asm_adjust): Pass null pointer to parse_{input,output}_constraint(). * gimple-walk.cc (walk_gimple_asm): Pass null pointer to parse_{input,output}_constraint(). (walk_stmt_load_store_addr_ops): Ditto. * gimplify-me.cc (gimple_regimplify_operands): Ditto. * gimplify.cc (num_occurrences): Moved from cfgexpand.cc. (num_alternatives): Ditto. (gimplify_asm_expr): Deal with hard register constraints. * stmt.cc (eliminable_regno_p): New helper. (hardreg_ok_p): Perform a similar check as done in make_decl_rtl(). (parse_output_constraint): Add parameter for gimplify_reg_info and validate hard register constrained operands. (parse_input_constraint): Ditto. * stmt.h (class gimplify_reg_info): Forward declaration. 
(parse_output_constraint): Add parameter. (parse_input_constraint): Ditto. * tree-ssa-operands.cc (operands_scanner::get_asm_stmt_operands): Pass null pointer to parse_{input,output}_constraint(). * tree-ssa-structalias.cc (find_func_aliases): Pass null pointer to parse_{input,output}_constraint(). * varasm.cc (assemble_asm): Pass null pointer to parse_{input,output}_constraint(). * gimplify_reg_info.h: New file. gcc/cp/ChangeLog: * semantics.cc (finish_asm_stmt): Pass null pointer to parse_{input,output}_constraint(). gcc/d/ChangeLog: * toir.cc: Pass null pointer to parse_{input,output}_constraint(). gcc/testsuite/ChangeLog: * gcc.dg/pr87600-2.c: Split test into two files since errors for functions test{0,1} are thrown during expand, and for test{2,3} during gimplification. * lib/scanasm.exp: On s390, skip lines beginning with #. * gcc.dg/asm-hard-reg-error-1.c: New test. * gcc.dg/asm-hard-reg-error-2.c: New test. * gcc.dg/asm-hard-reg-error-3.c: New test. * gcc.dg/asm-hard-reg-error-4.c: New test. * gcc.dg/asm-hard-reg-error-5.c: New test. * gcc.dg/pr87600-3.c: New test. * gcc.target/aarch64/asm-hard-reg-2.c: New test. * gcc.target/s390/asm-hard-reg-7.c: New test.
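The "at most once in each alternative" rule from the description of this patch can be modelled with a small string check. This is a simplified sketch under stated assumptions, not the code added to gimplify.cc or stmt.cc: the function names `alt_has_reg` and `hard_regs_conflict` are hypothetical, constraint modifiers are ignored, and only operands of one kind (all outputs or all inputs) are passed in.

```c
#include <string.h>

/* Hypothetical helper: return nonzero if alternative ALT of constraint
   string S contains the hard register constraint {REG}.  */
static int alt_has_reg(const char *s, int alt, const char *reg)
{
    size_t len = strlen(reg);
    int cur = 0;
    for (; *s; s++) {
        if (*s == ',')
            cur++;                      /* next alternative */
        else if (*s == '{' && cur == alt
                 && strncmp(s + 1, reg, len) == 0 && s[1 + len] == '}')
            return 1;
    }
    return 0;
}

/* Hypothetical helper: return nonzero if two different operands name the
   same hard register in the same alternative.  */
static int hard_regs_conflict(const char *const *ops, int n)
{
    for (int i = 0; i < n; i++) {
        int alt = 0;
        for (const char *s = ops[i]; *s; s++) {
            if (*s == ',') {
                alt++;
                continue;
            }
            if (*s != '{')
                continue;
            const char *end = strchr(s, '}');
            if (end == NULL)
                break;                  /* malformed; a real parser would diagnose */
            char reg[32];
            size_t len = (size_t)(end - s - 1);
            if (len >= sizeof reg)
                len = sizeof reg - 1;   /* truncate overly long names */
            memcpy(reg, s + 1, len);
            reg[len] = '\0';
            for (int j = i + 1; j < n; j++)
                if (alt_has_reg(ops[j], alt, reg))
                    return 1;
            s = end;                    /* continue after the closing brace */
        }
    }
    return 0;
}
```

With the constraint from the commit message, `"{r0}{r1}m,{r2}"`, a second operand constrained by `"{r3},{r4}"` is accepted by this model, while `"{r1},{r4}"` is flagged because `r1` appears in the first alternative of both operands.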
2025-07-21 | Hard register constraints | Stefan Schulze Frielinghaus | 8 | -0/+379
Implement hard register constraints of the form {regname} where regname must be a valid register name for the target. Such constraints may be used in asm statements as a replacement for register asm and in machine descriptions. A more verbose description is given in extend.texi.

It is expected and desired that optimizations coalesce multiple pseudos into one whenever possible. However, in case of hard register constraints we may have to undo this and introduce copies, since otherwise we would constrain a single pseudo to multiple hard registers. This is done prior to RA during asmcons in match_asm_constraints_2(). While IRA tries to reduce live ranges, it also replaces some register-register moves. That in turn might undo those copies of a pseudo which we just introduced during asmcons. Thus, check in decrease_live_ranges_number() via valid_replacement_for_asm_input_p() whether it is valid to perform a replacement.

The remainder of the patch mostly deals with parsing and decoding hard register constraints. The actual work is done by LRA in process_alt_operands() where a register filter, according to the constraint, is installed.

For the sake of "reviewability" and in order to show the beauty of LRA, error handling (which gets pretty involved) is spread out into a subsequent patch.

Limitation
----------

Currently, a fixed register cannot be used as hard register constraint. For example, loading the stack pointer on x86_64 via

    void *
    foo (void)
    {
      void *y;
      __asm__ ("" : "={rsp}" (y));
      return y;
    }

leads to an error.

Asm Adjust Hook
---------------

The following targets implement TARGET_MD_ASM_ADJUST:

- aarch64
- arm
- avr
- cris
- i386
- mn10300
- nds32
- pdp11
- rs6000
- s390
- vax

Most of them only add the CC register to the list of clobbered registers. However, cris, i386, and s390 need some minor adjustment.

gcc/ChangeLog:
* config/cris/cris.cc (cris_md_asm_adjust): Deal with hard register constraint.
* config/i386/i386.cc (map_egpr_constraints): Ditto.
* config/s390/s390.cc (f_constraint_p): Ditto. * doc/extend.texi: Document hard register constraints. * doc/md.texi: Ditto. * function.cc (match_asm_constraints_2): Have a unique pseudo for each operand with a hard register constraint. (pass_match_asm_constraints::execute): Calling into new helper match_asm_constraints_2(). * genoutput.cc (mdep_constraint_len): Return the length of a hard register constraint. * genpreds.cc (write_insn_constraint_len): Support hard register constraints for insn_constraint_len(). * ira.cc (valid_replacement_for_asm_input_p_1): New helper. (valid_replacement_for_asm_input_p): New helper. (decrease_live_ranges_number): Similar to match_asm_constraints_2() ensure that each operand has a unique pseudo if constrained by a hard register. * lra-constraints.cc (process_alt_operands): Install hard register filter according to constraint. * recog.cc (asm_operand_ok): Accept register type for hard register constrained asm operands. (constrain_operands): Validate hard register constraints. * stmt.cc (decode_hard_reg_constraint): Parse a hard register constraint into the corresponding register number or bail out. (parse_output_constraint): Parse hard register constraint and set *ALLOWS_REG. (parse_input_constraint): Ditto. * stmt.h (decode_hard_reg_constraint): Declaration of new function. gcc/testsuite/ChangeLog: * gcc.dg/asm-hard-reg-1.c: New test. * gcc.dg/asm-hard-reg-2.c: New test. * gcc.dg/asm-hard-reg-3.c: New test. * gcc.dg/asm-hard-reg-4.c: New test. * gcc.dg/asm-hard-reg-5.c: New test. * gcc.dg/asm-hard-reg-6.c: New test. * gcc.dg/asm-hard-reg-7.c: New test. * gcc.dg/asm-hard-reg-8.c: New test. * gcc.target/aarch64/asm-hard-reg-1.c: New test. * gcc.target/i386/asm-hard-reg-1.c: New test. * gcc.target/i386/asm-hard-reg-2.c: New test. * gcc.target/s390/asm-hard-reg-1.c: New test. * gcc.target/s390/asm-hard-reg-2.c: New test. * gcc.target/s390/asm-hard-reg-3.c: New test. * gcc.target/s390/asm-hard-reg-4.c: New test. 
* gcc.target/s390/asm-hard-reg-5.c: New test. * gcc.target/s390/asm-hard-reg-6.c: New test. * gcc.target/s390/asm-hard-reg-longdouble.h: New test.
2025-07-21 | Remove bogus minimum VF compute | Richard Biener | 1 | -0/+15
The following removes the minimum VF compute from dataref analysis which does not take into account SLP at all, leaving the testcase vectorized with V2SImode instead of V4SImode on x86. With SLP the only minimum VF we can compute this early is 1. * tree-vectorizer.h (vect_analyze_data_refs): Remove min_vf output. * tree-vect-data-refs.cc (vect_analyze_data_refs): Likewise. * tree-vect-loop.cc (vect_analyze_loop_2): Remove early out based on bogus min_vf. * tree-vect-slp.cc (vect_slp_analyze_bb_1): Adjust. * gcc.dg/vect/vect-127.c: New testcase.
2025-07-19 | testsuite: Fix afdo-crossmodule-1b.c [PR120859] | Andrew Pinski | 1 | -0/+5
The problem here is that the testcase is part of another testcase, but dg-final does not work across source files, so it needs its own dg-* headers that match up with afdo-crossmodule-1.c.

Pushed as preapproved in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120859#c4 .

PR testsuite/120859

gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/afdo-crossmodule-1b.c: Add some dg-* commands like what is in afdo-crossmodule-1.c

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-18 | testsuite/vec: Fix vect-reduc-cond-[12].c for non vect_condition targets [PR121153] | Andrew Pinski | 2 | -0/+2
I missed this when I added the two testcases vect-reduc-cond-[12].c. These testcases require support for vectorization of `a ? b : c`, which some targets (e.g. sparc) do not support.

Pushed as obvious after a quick test.

PR testsuite/121153

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-reduc-cond-1.c: Require vect_condition.
* gcc.dg/vect/vect-reduc-cond-2.c: Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-18 | tree-sra: Fix grp_covered flag computation when totally scalarizing (PR117423) | Martin Jambor | 1 | -0/+49
Testcase of PR 117423 shows a flaw in the fancy way we do "total scalarization" in SRA now. We use the types encountered in the function body and not in type declaration (allowing us to totally scalarize when only one union field is ever used, since we effectively "skip" the union then) and can accommodate pre-existing accesses that happen to fall into padding. In this case, we skipped the union (bypassing the totally_scalarizable_type_p check) and the access falling into the "padding" is an aggregate and so not a candidate for SRA but actually containing data. Arguably total scalarization should just bail out when it encounters this situation (but I decided not to depend on this mainly because we'd need to detect all cases when we eventually cannot scalarize, such as when a scalar access has children accesses) but the actual bug is that the detection if all data in an aggregate is indeed covered by replacements just assumes that is always the case if total scalarization triggers which however may not be the case in cases like this - and perhaps more. This patch fixes the bug by just assuming that all padding is taken care of when total scalarization triggered, not that every access was actually scalarized. gcc/ChangeLog: 2025-07-17 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117423 * tree-sra.cc (analyze_access_subtree): Fix computation of grp_covered flag. gcc/testsuite/ChangeLog: 2025-07-17 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117423 * gcc.dg/tree-ssa/pr117423.c: New test.
2025-07-18 | tree-optimization/121126 - properly verify live LC PHIs | Richard Biener | 1 | -0/+30
The following makes sure we analyze live LC PHIs not part of a double reduction. PR tree-optimization/121126 * tree-vect-stmts.cc (vect_analyze_stmt): Analyze the live lane extract for LC PHIs that are vect_internal_def. * gcc.dg/vect/pr121126.c: New testcase.
2025-07-18 | tree-optimization/120924 - up --param uninit-max-chain-len | Richard Biener | 1 | -0/+34
The PR shows that the uninit analysis limits are set too low in cases we lower switches to ifs as happens on s390x for a linux kernel TU. This causes false positive uninit diagnostics as we abort the attempt to prove that a value is initialized on all paths. The new testcase only would require upping to 9. PR tree-optimization/120924 * params.opt (uninit-max-chain-len): Up from 8 to 12. * gcc.dg/uninit-pr120924.c: New testcase.
2025-07-18 | gimple-fold: Fix up big endian _BitInt adjustment [PR121131] | Jakub Jelinek | 1 | -0/+30
The following testcase ICEs because SCALAR_INT_TYPE_MODE of course doesn't work for large BITINT_TYPE types which have BLKmode. native_encode* as well as e.g. r14-8276 use in cases like these GET_MODE_SIZE (SCALAR_INT_TYPE_MODE ()) and TREE_INT_CST_LOW (TYPE_SIZE_UNIT ()) for the BLKmode ones. In this case, it wants bits rather than bytes, so I've used GET_MODE_BITSIZE like before and TYPE_SIZE otherwise. Furthermore, the patch only computes encoding_size for big endian targets, for little endian we don't really adjust anything, so there is no point computing it. 2025-07-18 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/121131 * gimple-fold.cc (fold_nonarray_ctor_reference): Use TREE_INT_CST_LOW (TYPE_SIZE ()) instead of GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE ()) for BLKmode BITINT_TYPEs. Don't compute encoding_size at all for little endian targets. * gcc.dg/bitint-124.c: New test.
2025-07-17 | Reject single lane vector types for SLP build | Richard Biener | 1 | -2/+1
The following makes us never consider vector(1) T types for vectorization and ensures this during SLP build. This is a long-standing issue for BB vectorization and when we remove early loop vector type setting we lose the single place we have that rejects this for loops. Once we implement partial loop vectorization we should revisit this, but then use the original scalar types for the unvectorized parts. * tree-vect-slp.cc (vect_build_slp_tree_1): Reject single-lane vector types. * gcc.dg/vect/bb-slp-39.c: Adjust.
2025-07-17 | tree-optimization/121035 - handle stray VN values without expression | Richard Biener | 1 | -0/+94
When VN iterates we can end up with unreachable inserted expressions in the expression tables, which in turn will not be added to their value by PRE's compute_avail. This will later ICE when we pick them up and want to generate them. Deal with this by giving up.

PR tree-optimization/121035
* tree-ssa-pre.cc (find_or_generate_expression): Handle values without expression.

* gcc.dg/pr121035.c: New testcase.
2025-07-16 | x86: Warn -pg without -mfentry only on glibc targets | H.J. Lu | 5 | -5/+6
Since only glibc targets support -mfentry, warn -pg without -mfentry only on glibc targets.

gcc/
PR target/120881
PR testsuite/121078
* config/i386/i386-options.cc (ix86_option_override_internal): Warn -pg without -mfentry only on glibc targets.

gcc/testsuite/
PR target/120881
PR testsuite/121078
* gcc.dg/20021014-1.c (dg-additional-options): Add -mfentry -fno-pic only on gnu/x86 targets.
* gcc.dg/aru-2.c (dg-additional-options): Likewise.
* gcc.dg/nest.c (dg-additional-options): Likewise.
* gcc.dg/pr32450.c (dg-additional-options): Likewise.
* gcc.dg/pr43643.c (dg-additional-options): Likewise.
* gcc.target/i386/pr104447.c (dg-additional-options): Likewise.
* gcc.target/i386/pr113122-3.c (dg-additional-options): Likewise.
* gcc.target/i386/pr119386-1.c (dg-additional-options): Add -mfentry only on gnu targets.
* gcc.target/i386/pr119386-2.c (dg-additional-options): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-07-16 | tree-optimization/121049 - avoid loop masking with even/odd reduction | Richard Biener | 1 | -0/+25
The following disables loop masking when we are using an even/odd widening operation in a reduction because the loop mask then aligns to the wrong elements. PR tree-optimization/121049 * internal-fn.h (widening_evenodd_fn_p): Declare. * internal-fn.cc (widening_evenodd_fn_p): New function. * tree-vect-stmts.cc (vectorizable_conversion): When using an even/odd widening function disable loop masking. * gcc.dg/vect/pr121049.c: New testcase.
2025-07-16 | ifconv: simple factor out operators while doing ifcvt [PR119920] | Andrew Pinski | 3 | -0/+176
For possible reductions, ifconv currently handles the case where the addition is on one side of the if. But in the case of PR 119920, the reduction addition is on both sides of the if. E.g.

```
if (_27 == 0)
  goto <bb 14>; [50.00%]
else
  goto <bb 13>; [50.00%]

<bb 14>
a_29 = b_14(D) + a_17;
goto <bb 15>; [100.00%]

<bb 13>
a_28 = c_12(D) + a_17;

<bb 15>
# a_30 = PHI <a_28(13), a_29(14)>
```

Which ifcvt converts into:

```
_34 = _32 + _33;
a_15 = (int) _34;
_23 = _4 == 0;
_37 = _33 + _35;
a_13 = (int) _37;
a_5 = _23 ? a_15 : a_13;
```

But the vectorizer does not recognize this as a reduction. To fix this, we should factor out the addition from the `if`. This allows us to get:

```
iftmp.0_7 = _22 ? b_13(D) : c_12(D);
a_14 = iftmp.0_7 + a_18;
```

Which then the vectorizer recognizes as a reduction. In the case of PR 112324 and PR 110015, it is similar but with MAX_EXPR reduction instead of an addition.

Note while this should be done in phiopt, there are regressions due to other passes not being able to handle the factored out cases (see linked bug to PR 64700). I have not had time to fix all of the passes that could handle the addition being in the if/then/else rather than being outside yet. So I thought it would be useful just to have a localized version in ifconv which is then only used for the vectorizer.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/119920
PR tree-optimization/112324
PR tree-optimization/110015

gcc/ChangeLog:
* tree-if-conv.cc (find_different_opnum): New function.
(factor_out_operators): New function.
(predicate_scalar_phi): Call factor_out_operators when there are only 2 elements of a phi.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-reduc-cond-1.c: New test.
* gcc.dg/vect/vect-reduc-cond-2.c: New test.
* gcc.dg/vect/vect-reduc-cond-3.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
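At the source level, the two shapes discussed in this commit message roughly correspond to the following loops. This is a hand-written sketch, not one of the new testcases; the function names `sum_branchy` and `sum_factored` are made up for illustration, and whether a given compiler vectorizes either loop depends on its version and flags.

```c
/* Shape from the PR: the reduction addition appears on both sides of
   the branch.  */
static int sum_branchy(const int *cond, const int *b, const int *c, int n)
{
    int a = 0;
    for (int i = 0; i < n; i++) {
        if (cond[i] == 0)
            a = b[i] + a;
        else
            a = c[i] + a;
    }
    return a;
}

/* Factored shape: select the differing operand first, then perform a
   single addition into the accumulator.  */
static int sum_factored(const int *cond, const int *b, const int *c, int n)
{
    int a = 0;
    for (int i = 0; i < n; i++) {
        int tmp = cond[i] == 0 ? b[i] : c[i];
        a = tmp + a;
    }
    return a;
}
```

Both functions compute the same sum; the second form has the single `tmp + a` update that matches the factored GIMPLE quoted in the commit message.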
2025-07-16 | tree-optimization/121116 - avoid _BitInt for vector element init | Richard Biener | 1 | -0/+21
When having a _BitInt induction we should make sure to not create the step vector elements as _BitInts but as vector element typed. PR tree-optimization/121116 * tree-vect-loop.cc (vectorizable_induction): Use the step vector element type for further processing. * gcc.dg/torture/pr121116.c: New testcase.
2025-07-15 | tree-optimization/121059 - fixup loop mask query | Richard Biener | 1 | -0/+24
When we opportunistically mask an operand of an AND with an already available loop mask we need to query that set with the correct number of masks we expect.

PR tree-optimization/121059
* tree-vect-stmts.cc (vectorizable_operation): Query scalar_cond_masked_set with the correct number of masks.

* gcc.dg/vect/pr121059.c: New testcase.

Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
2025-07-15 | c, c++: Extend -Wunused-but-set-* warnings [PR44677] | Jakub Jelinek | 2 | -7/+5
The -Wunused-but-set-* warnings work by using 2 bits on VAR_DECLs & PARM_DECLs, TREE_USED and DECL_READ_P. If neither is set, we typically emit -Wunused-variable or -Wunused-parameter warning, that is for variables which are just declared (including initializer) and completely unused. If TREE_USED is set and DECL_READ_P is unset, -Wunused-but-set-* warnings are emitted, i.e. for variables which can appear on the lhs of an assignment expression but aren't actually used elsewhere. The DECL_READ_P marking is done through mark_exp_read called from lots of places (e.g. lvalue to rvalue conversions etc.).

LLVM has an extension on top of that in that it doesn't count pre/post inc/decrements as use (i.e. DECL_READ_P for GCC). The following patch does that too, though because we have had the current behavior for 11+ years already and a lot of people are -Wunused-but-set-* warning free with the current GCC behavior but not with the clang one (including GCC sources), it allows users to choose. Furthermore, it implements another level, where also var @= expr uses of var (except when it is also used in expr) aren't counted as DECL_READ_P.

I think it would be nice to also handle var = var @ expr or var = expr @ var but unfortunately mark_exp_read is then done in both FEs during parsing of var @ expr or expr @ var and the code doesn't know it is rhs of an assignment with var as lhs.

The patch works mostly by checking if DECL_READ_P is clear at some point and then clearing it again after some operation which might have set it.

-Wunused or -Wall or -Wunused -Wextra or -Wall -Wextra turn on level 3 of the new warning (i.e. the one which ignores also var++, ++var etc. as well as var @= expr), so does -Wunused-but-set-{variable,parameter}, but users can use explicit -Wunused-but-set-{variable,parameter}={1,2} to select a different level.
2025-07-15 Jakub Jelinek <jakub@redhat.com> Jason Merrill <jason@redhat.com> PR c/44677 gcc/ * common.opt (Wunused-but-set-parameter=, Wunused-but-set-variable=): New options. (Wunused-but-set-parameter, Wunused-but-set-variable): Turn into aliases. * common.opt.urls: Regenerate. * diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Use OPT_Wunused_but_set_variable_ instead of OPT_Wunused_but_set_variable and OPT_Wunused_but_set_parameter_ instead of OPT_Wunused_but_set_parameter. * gimple-ssa-store-merging.cc (find_bswap_or_nop_1): Remove unused but set variable tmp. * ipa-strub.cc (pass_ipa_strub::execute): Cast named_args to (void) if ATTR_FNSPEC_DECONST_WATERMARK is not defined. * doc/invoke.texi (Wunused-but-set-parameter=, Wunused-but-set-variable=): Document new options. (Wunused-but-set-parameter, Wunused-but-set-variable): Adjust documentation now that they are just aliases. gcc/c-family/ * c-opts.cc (c_common_post_options): Change warn_unused_but_set_parameter and warn_unused_but_set_variable from 1 to 3 if they were set only implicitly. * c-attribs.cc (build_attr_access_from_parms): Remove unused but set variable nelts. gcc/c/ * c-parser.cc (c_parser_unary_expression): Clear DECL_READ_P after default_function_array_read_conversion for -Wunused-but-set-{parameter,variable}={2,3} on PRE{IN,DE}CREMENT_EXPR argument. (c_parser_postfix_expression_after_primary): Similarly for POST{IN,DE}CREMENT_EXPR. * c-decl.cc (pop_scope): Use OPT_Wunused_but_set_variable_ instead of OPT_Wunused_but_set_variable. (finish_function): Use OPT_Wunused_but_set_parameter_ instead of OPT_Wunused_but_set_parameter. * c-typeck.cc (mark_exp_read): Handle {PRE,POST}{IN,DE}CREMENT_EXPR and don't handle it when cast to void. (build_modify_expr): Clear DECL_READ_P after build_binary_op for -Wunused-but-set-{parameter,variable}=3. gcc/cp/ * cp-gimplify.cc (cp_fold): Clear DECL_READ_P on lhs of MODIFY_EXPR after cp_fold_rvalue if it wasn't set before. 
* decl.cc (poplevel): Use OPT_Wunused_but_set_variable_ instead of OPT_Wunused_but_set_variable. (finish_function): Use OPT_Wunused_but_set_parameter_ instead of OPT_Wunused_but_set_parameter. * expr.cc (mark_use): Clear read_p for {PRE,POST}{IN,DE}CREMENT_EXPR cast to void on {VAR,PARM}_DECL for -Wunused-but-set-{parameter,variable}={2,3}. (mark_exp_read): Handle {PRE,POST}{IN,DE}CREMENT_EXPR and don't handle it when cast to void. * module.cc (trees_in::fn_parms_fini): Remove unused but set variable ix. * semantics.cc (finish_unary_op_expr): Return early for PRE{IN,DE}CREMENT_EXPR. * typeck.cc (cp_build_unary_op): Clear DECL_READ_P after mark_lvalue_use for -Wunused-but-set-{parameter,variable}={2,3} on PRE{IN,DE}CREMENT_EXPR argument. (cp_build_modify_expr): Clear DECL_READ_P after cp_build_binary_op for -Wunused-but-set-{parameter,variable}=3. gcc/go/ * gofrontend/gogo.cc (Function::export_func_with_type): Remove unused but set variable i. gcc/cobol/ * gcobolspec.cc (lang_specific_driver): Remove unused but set variable n_cobol_files. gcc/testsuite/ * c-c++-common/Wunused-parm-1.c: New test. * c-c++-common/Wunused-parm-2.c: New test. * c-c++-common/Wunused-parm-3.c: New test. * c-c++-common/Wunused-parm-4.c: New test. * c-c++-common/Wunused-parm-5.c: New test. * c-c++-common/Wunused-parm-6.c: New test. * c-c++-common/Wunused-var-7.c (bar, baz): Expect warning on a. * c-c++-common/Wunused-var-19.c: New test. * c-c++-common/Wunused-var-20.c: New test. * c-c++-common/Wunused-var-21.c: New test. * c-c++-common/Wunused-var-22.c: New test. * c-c++-common/Wunused-var-23.c: New test. * c-c++-common/Wunused-var-24.c: New test. * g++.dg/cpp26/name-independent-decl1.C (foo): Expect one set but not used warning. * g++.dg/warn/Wunused-parm-12.C: New test. * g++.dg/warn/Wunused-parm-13.C: New test. * g++.dg/warn/Wunused-var-2.C (f2): Expect set but not used warning on parameter x and variable a. * g++.dg/warn/Wunused-var-40.C: New test. 
* g++.dg/warn/Wunused-var-41.C: New test. * gcc.dg/memchr-3.c (test_find): Change return type from void to int, and add return n; statement. * gcc.dg/unused-9.c (g): Move dg-bogus to the correct line and expect a warning on i.
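The three levels described in the -Wunused-but-set-* commit message can be illustrated with three small functions. This is a sketch written for this log, not one of the new tests; the function names are hypothetical, and which of them is diagnosed depends on the compiler version and on the level selected with -Wunused-but-set-variable=N.

```c
/* Diagnosed at every level: 'a' is only assigned, never read.  */
int only_assigned(int x)
{
    int a;
    a = x;
    return x;
}

/* Expected to be diagnosed at levels 2 and 3, where pre/post increments
   and decrements no longer count as reads.  */
int only_incremented(int x)
{
    int b = 0;
    b++;
    ++b;
    return x;
}

/* Expected to be diagnosed at level 3 only, where var @= expr no longer
   counts as a read of var.  */
int only_compound_assigned(int x)
{
    int c = 0;
    c += x;
    return x;
}
```

The warnings do not change behaviour: each function still returns its argument unchanged.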
2025-07-14 | Revert "tree-optimization/121059 - record loop mask when required" | Richard Biener | 1 | -24/+0
This reverts commit 66346b6d800fc4baae876e0fe4e932401bcc85fa.
2025-07-14 | tree-optimization/121059 - record loop mask when required | Richard Biener | 1 | -0/+24
For loop masking we need to mask a mask AND operation with the loop mask. The following makes sure we have a corresponding mask available. There's no good way to distinguish loop masking from len masking here, so assume we have recorded a mask for the operands mask producers. PR tree-optimization/121059 * tree-vect-stmts.cc (vectorizable_operation): Record a loop mask for mask AND operations. * gcc.dg/vect/pr121059.c: New testcase.
2025-07-14 | x86-64: Add --enable-x86-64-mfentry | H.J. Lu | 5 | -1/+5
When profiling is enabled with shrink wrapping, the mcount call may not be placed at the function entry after

    pushq %rbp
    movq %rsp,%rbp

As a result, the profile data may be skewed which makes PGO less effective.

Add --enable-x86-64-mfentry to enable -mfentry by default to use __fentry__, added to glibc in 2010 by:

    commit d22e4cc9397ed41534c9422d0b0ffef8c77bfa53
    Author: Andi Kleen <ak@linux.intel.com>
    Date: Sat Aug 7 21:24:05 2010 -0700

        x86: Add support for frame pointer less mcount

instead of mcount, which is placed before the prologue so that -pg can be used with -fshrink-wrap-separate enabled at -O1. This option is 64-bit only because __fentry__ doesn't support PIC in 32-bit mode. The default is to enable -mfentry when targeting glibc.

Also warn -pg without -mfentry with shrink wrapping enabled. The warning is disabled for PIC in 32-bit mode.

gcc/
PR target/120881
* config.in: Regenerated.
* configure: Likewise.
* configure.ac: Add --enable-x86-64-mfentry.
* config/i386/i386-options.cc (ix86_option_override_internal): Enable __fentry__ in 64-bit mode if ENABLE_X86_64_MFENTRY is set to 1. Warn -pg without -mfentry with shrink wrapping enabled.
* doc/install.texi: Document --enable-x86-64-mfentry.

gcc/testsuite/
PR target/120881
* gcc.dg/20021014-1.c: Add additional -mfentry -fno-pic options for x86.
* gcc.dg/aru-2.c: Likewise.
* gcc.dg/nest.c: Likewise.
* gcc.dg/pr32450.c: Likewise.
* gcc.dg/pr43643.c: Likewise.
* gcc.target/i386/pr104447.c: Likewise.
* gcc.target/i386/pr113122-3.c: Likewise.
* gcc.target/i386/pr119386-1.c: Add additional -mfentry if not ia32.
* gcc.target/i386/pr119386-2.c: Likewise.
* gcc.target/i386/pr120881-1a.c: New test.
* gcc.target/i386/pr120881-1b.c: Likewise.
* gcc.target/i386/pr120881-1c.c: Likewise.
* gcc.target/i386/pr120881-1d.c: Likewise.
* gcc.target/i386/pr120881-2a.c: Likewise.
* gcc.target/i386/pr120881-2b.c: Likewise.
* gcc.target/i386/pr82699-1.c: Add additional -mfentry.
* lib/target-supports.exp (check_effective_target_fentry): New. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-07-14 | Darwin: account for macOS 26 | Francois-Xavier Coudert | 1 | -0/+1
darwin25 will be named macOS 26 (codename Tahoe). This is a change from darwin24, which was macOS 15. We need to adapt the driver to this new numbering scheme. 2025-07-14 François-Xavier Coudert <fxcoudert@gcc.gnu.org> gcc/ChangeLog: PR target/120645 * config/darwin-driver.cc: Account for latest macOS numbering scheme. gcc/testsuite/ChangeLog: * gcc.dg/darwin-minversion-link.c: Account for macOS 26.
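The numbering change for macOS 26 can be pictured as a mapping from the Darwin kernel major version to the macOS major version. The sketch below is not the code in darwin-driver.cc: the function name `macos_major_from_darwin` is hypothetical, only the darwin24 and darwin25 cases come from the commit message, and the treatment of later Darwin versions is an assumption that the new scheme simply continues.

```c
/* Hypothetical helper: map a Darwin kernel major version to the macOS
   major version.  Not the code in darwin-driver.cc.  */
static int macos_major_from_darwin(int darwin_major)
{
    if (darwin_major >= 25)
        return darwin_major + 1;    /* darwin25 -> macOS 26 (later versions assumed) */
    if (darwin_major >= 20)
        return darwin_major - 9;    /* darwin20 -> macOS 11, darwin24 -> macOS 15 */
    return 10;                      /* older releases are Mac OS X / macOS 10.x */
}
```

For example, passing 24 yields 15 and passing 25 yields 26, which is the jump the driver has to account for.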
2025-07-12 | testsuite: Enable the PR 87600 tests for LoongArch | Xi Ruoyao | 3 | -2/+5
I'm going to refine a part of the PR 87600 fix which seems to trigger PR 120983, from which LoongArch particularly suffers. Enable the PR 87600 tests so that I do not regress PR 87600.

gcc/testsuite/ChangeLog:
PR rtl-optimization/87600
PR rtl-optimization/120983
* gcc.dg/pr87600.h [__loongarch__]: Define REG0 and REG1.
* gcc.dg/pr87600-1.c (dg-do): Add loongarch.
* gcc.dg/pr87600-2.c (dg-do): Likewise.
2025-07-11 | diagnostics: add support for directed graphs; use them for state graphs | David Malcolm | 11 | -47/+523
In r16-1631-g2334d30cd8feac I added support for capturing state information from -fanalyzer in XML form, and adding a way to visualize these states in HTML output. The data was optionally captured in SARIF output (with "xml-state=yes"), stashing the XML in string form in a property bag. This worked, but there was no way to round-trip the stored data back from SARIF without adding an XML parser to GCC, which I don't want to do. SARIF supports capturing directed graphs, so this patch: (a) adds a new namespace diagnostics::digraphs, with classes digraph, node, and edge, representing directed graphs in a form similar to what SARIF can serialize (b) adds support to GCC's diagnostic subsystem for reporting graphs, either "globally" or as part of a diagnostic. An example in a testsuite plugin emits an error that has a couple of dummy graphs associated with it, and captures the optimization passes as a digraph "globally". Graphs are ignored by text sinks, but are captured by sarif sinks, and the "experimental-html" sink gains SVG-based rendering of any graphs using dot. This HTML output is rather crude; an example can be seen here: https://dmalcolm.fedorapeople.org/gcc/2025-07-10/diagnostic-test-graphs-html.c.html (c) adds support to libgdiagnostics for the above (d) adds support to sarif-replay for the above (round-tripping any graph information) (e) replaces the XML representation of state with a representation based on the above directed graphs, using property bags to stash additional information (e.g. "this is an on-stack buffer") (f) implements round-tripping of this information in sarif-replay To summarize: - previously we could generate HTML diagrams for debugging -fanalyzer directly from gcc, but not from stored .sarif output. 
- with this patch, we can generate such HTML diagrams both directly *and* from stored .sarif output (provided the SARIF sink was created with "state-graphs=yes") Examples of HTML output can be seen here: https://dmalcolm.fedorapeople.org/gcc/2025-07-10/ where as before j/k can be used to cycle through the events. which is almost identical to the output from the old XML-based implementation seen at: https://dmalcolm.fedorapeople.org/gcc/2025-06-23/ gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-digraphs.o and diagnostic-state-graphs.o. gcc/ChangeLog: * diagnostic-format-html.cc: Include "diagnostic-format-sarif.h", Replace include of "diagnostic-state.h" with includes of "diagnostic-digraphs.h" and "diagnostic-state-graphs.h". (html_generation_options::html_generation_options): Update for field renaming. (html_builder::m_body_element): New field. (html_builder::html_builder): Initialize m_body_element. (html_builder::maybe_make_state_diagram): Port from XML implementation to state graph implementation. (html_builder::make_element_for_diagnostic): Add any per-diagnostic graphs. (html_builder::add_graph): New. (html_builder::emit_global_graph): New. (html_output_format::report_global_digraph): New. * diagnostic-format-html.h (html_generation_options::m_show_state_diagram_xml): Replace with... (html_generation_options::m_show_state_diagrams_sarif): ...this. (html_generation_options::m_show_state_diagram_dot_src): Rename to... (html_generation_options::m_show_state_diagrams_dot_src): ...this. * diagnostic-format-sarif.cc: Include "diagnostic-digraphs.h" and "diagnostic-state-graphs.h". (sarif_builder::m_run_graphs): New field. (sarif_result::on_nested_diagnostic): Update call to make_location_object to pass arg by pointer. (sarif_builder::sarif_builder): Initialize m_run_graphs. (sarif_builder::report_global_digraph): New. (sarif_builder::make_result_object): Add any graphs to the result object. 
(sarif_builder::make_locations_arr): Update call to make_location_object to pass arg by pointer. (sarif_builder::make_location_object): Pass param "loc_mgr" by pointer rather than by reference so that it can be null, and handle this case. (copy_any_property_bag): New. (make_sarif_graph): New. (make_sarif_node): New. (make_sarif_edge): New. (sarif_property_bag::set_graph): New. (populate_thread_flow_location_object): Port from XML implementation to state graph implementation. (make_run_object): Store any graphs. (sarif_output_format::report_global_digraph): New. (sarif_generation_options::sarif_generation_options): Rename m_xml_state to m_state_graph. (selftest::test_make_location_object): Update for change to make_location_object. * diagnostic-format-sarif.h: (sarif_generation_options::m_xml_state): Replace with... (sarif_generation_options::m_state_graph): ...this. (class sarif_location_manager): Add forward decl. (diagnostics::digraphs::digraph): New forward decl. (diagnostics::digraphs::node): New forward decl. (diagnostics::digraphs::edge): New forward decl. (sarif_property_bag::set_graph): New decl. (class sarif_graph): New. (class sarif_node): New. (class sarif_edge): New. (make_sarif_graph): New decl. (make_sarif_node): New decl. (make_sarif_edge): New decl. * diagnostic-format-text.h (diagnostic_text_output_format::report_global_digraph): New. * diagnostic-format.h (diagnostic_output_format::report_global_digraph): New vfunc. * diagnostic-digraphs.cc: New file. * diagnostic-digraphs.h: New file. * diagnostic-metadata.h (diagnostics::digraphs::lazy_digraphs): New forward decl. (diagnostic_metadata::diagnostic_metadata): Initialize m_lazy_digraphs. (diagnostic_metadata::set_lazy_digraphs): New. (diagnostic_metadata::get_lazy_digraphs): New. (diagnostic_metadata::m_lazy_digraphs): New field. * diagnostic-output-spec.cc (sarif_scheme_handler::make_sink): Update for XML to state graph changes. (sarif_scheme_handler::make_sarif_gen_opts): Likewise. 
(html_scheme_handler::make_sink): Rename "show-state-diagram-xml" to "show-state-diagrams-sarif" and use pluralization consistently. * diagnostic-path.cc: Replace include of "xml.h" with "diagnostic-state-graphs.h". (diagnostic_event::maybe_make_xml_state): Replace with... (diagnostic_event::maybe_make_diagnostic_state_graph): ...this. * diagnostic-path.h (diagnostics::digraphs::digraph): New forward decl. (diagnostic_event::maybe_make_xml_state): Replace with... (diagnostic_event::maybe_make_diagnostic_state_graph): ...this. * diagnostic-state-graphs.cc: New file. * diagnostic-state-graphs.h: New file. * diagnostic-state-to-dot.cc: Port implementation from XML to state graphs. * diagnostic-state.h: Deleted file. * diagnostic.cc (diagnostic_context::report_global_digraph): New. * diagnostic.h (diagnostics::digraphs::lazy_digraph): New forward decl. (diagnostic_context::report_global_digraph): New decl. * doc/analyzer.texi (Debugging the Analyzer): Update to reflect change from XML to state graphs. * doc/invoke.texi ("sarif" diagnostics sink): Replace "xml-state" with "state-graphs". ("experimental-html" diagnostics sink): Replace "show-state-diagrams-xml" with "show-state-diagrams-sarif" * doc/libgdiagnostics/topics/compatibility.rst (LIBGDIAGNOSTICS_ABI_3): New. * doc/libgdiagnostics/topics/graphs.rst: New file. * doc/libgdiagnostics/topics/index.rst: Add graphs.rst. * graphviz.h (node_id::operator=): New. * json.h (json::value::dyn_cast_string): New. (json::object::get_num_keys): New accessor. (json::object::get_key): New accessor. (json::string::dyn_cast_string): New. * libgdiagnostics++.h (class libgdiagnostics::graph): New. (class libgdiagnostics::node): New. (class libgdiagnostics::edge): New. (class libgdiagnostics::diagnostic::take_graph): New. (class libgdiagnostics::manager::take_global_graph): New. (class libgdiagnostics::graph::set_description): New. (class libgdiagnostics::graph::get_node_by_id): New. 
(class libgdiagnostics::graph::get_edge_by_id): New. (class libgdiagnostics::graph::add_edge): New. (class libgdiagnostics::node::set_label): New. (class libgdiagnostics::node::set_location): New. (class libgdiagnostics::node::set_logical_location): New. * libgdiagnostics-private.h: New file. * libgdiagnostics.cc: Define INCLUDE_STRING. Include "diagnostic-digraphs.h", "diagnostic-state-graphs.h", and "libgdiagnostics-private.h". (struct diagnostic_graph): New. (struct diagnostic_node): New. (struct diagnostic_edge): New. (libgdiagnostics_path_event::libgdiagnostics_path_event): Add state_graph param. (libgdiagnostics_path_event::maybe_make_diagnostic_state_graph): New. (libgdiagnostics_path_event::m_state_graph): New field. (diagnostic_execution_path::add_event_va): Add state_graph param. (class prebuilt_digraphs): New. (diagnostic::diagnostic): Use m_graphs in m_metadata. (diagnostic::take_graph): New. (diagnostic::get_graphs): New accessor. (diagnostic::m_graphs): New field. (diagnostic_manager::take_global_graph): New. (diagnostic_execution_path_add_event): Update for new param to add_event_va. (diagnostic_execution_path_add_event_va): Likewise. (diagnostic_graph::add_node_with_id): New public entrypoint. (diagnostic_graph::add_edge_with_label): New public entrypoint. (diagnostic_manager_new_graph): New public entrypoint. (diagnostic_manager_take_global_graph): New public entrypoint. (diagnostic_take_graph): New public entrypoint. (diagnostic_graph_release): New public entrypoint. (diagnostic_graph_set_description): New public entrypoint. (diagnostic_graph_add_node): New public entrypoint. (diagnostic_graph_add_edge): New public entrypoint. (diagnostic_graph_get_node_by_id): New public entrypoint. (diagnostic_graph_get_edge_by_id): New public entrypoint. (diagnostic_node_set_location): New public entrypoint. (diagnostic_node_set_label): New public entrypoint. (diagnostic_node_set_logical_location): New public entrypoint. 
(private_diagnostic_execution_path_add_event_2): New private entrypoint. (private_diagnostic_graph_set_property_bag): New private entrypoint. (private_diagnostic_node_set_property_bag): New private entrypoint. (private_diagnostic_edge_set_property_bag): New private entrypoint. * libgdiagnostics.h (diagnostic_graph): New typedef. (diagnostic_node): New typedef. (diagnostic_edge): New typedef. (diagnostic_manager_new_graph): New decl. (diagnostic_manager_take_global_graph): New decl. (diagnostic_take_graph): New decl. (diagnostic_graph_release): New decl. (diagnostic_graph_set_description): New decl. (diagnostic_graph_add_node): New decl. (diagnostic_graph_add_edge): New decl. (diagnostic_graph_get_node_by_id): New decl. (diagnostic_graph_get_edge_by_id): New decl. (diagnostic_node_set_label): New decl. (diagnostic_node_set_location): New decl. (diagnostic_node_set_logical_location): New decl. * libgdiagnostics.map (LIBGDIAGNOSTICS_ABI_3): New. * libsarifreplay.cc: Include "libgdiagnostics-private.h". (id_map): New "using". (sarif_replayer::report_invalid_sarif): Update for change to report_problem params. (sarif_replayer::report_unhandled_sarif): Likewise. (sarif_replayer::report_note): New. (sarif_replayer::report_problem): Pass param "ref" by pointer rather than reference and handle it being null. (sarif_replayer::maybe_get_property_bag): New. (sarif_replayer::maybe_get_property_bag_value): New. (sarif_replayer::handle_run_obj): Handle run-level "graphs" as per §3.14.20. (sarif_replayer::handle_result_obj): Handle result-level "graphs" as per §3.27.19. (handle_thread_flow_location_object): Optionally handle graphs stored in property "gcc/diagnostic_event/state_graph" as state graphs. (sarif_replayer::handle_graph_object): New. (sarif_replayer::handle_node_object): New. (sarif_replayer::handle_edge_object): New. (sarif_replayer::get_graph_node_by_id_property): New. 
* selftest-run-tests.cc (selftest::run_tests): Call selftest::diagnostic_graph_cc_tests and selftest::diagnostic_state_graph_cc_tests. * selftest.h (selftest::diagnostic_graph_cc_tests): New decl. (selftest::diagnostic_state_graph_cc_tests): New decl. gcc/analyzer/ChangeLog: * ana-state-to-diagnostic-state.cc: Reimplement, replacing XML-based implementation with one based on state graphs. * ana-state-to-diagnostic-state.h: Likewise. * checker-event.cc: Replace include of "xml.h" with include of "diagnostic-state-graphs.h". (checker_event::maybe_make_xml_state): Replace with... (checker_event::maybe_make_diagnostic_state_graph): ...this. * checker-event.h: Add include of "diagnostic-digraphs.h". (checker_event::maybe_make_xml_state): Replace decl with... (checker_event::maybe_make_diagnostic_state_graph): ...this. * engine.cc (exploded_node::on_stmt_pre): Replace "_analyzer_dump_xml" with "__analyzer_dump_sarif". * program-state.cc: Replace include of "diagnostic-state.h" with "diagnostic-state-graphs.h". (program_state::dump_dot): Port from XML to state graphs. * program-state.h: Drop redundant forward decl of xml::document. (program_state::make_xml): Replace decl with... (program_state::make_diagnostic_state_graph): ...this. (program_state::dump_xml_to_pp): Drop decl. (program_state::dump_xml_to_file): Drop decl. (program_state::dump_xml): Drop decl. (program_state::dump_dump_sarif): New decl. * sm-malloc.cc (get_dynalloc_state_for_state): New. (malloc_state_machine::add_state_to_xml): Replace with... (malloc_state_machine::add_state_to_state_graph): ...this. * sm.cc (state_machine::add_state_to_xml): Replace with... (state_machine::add_state_to_state_graph): ...this. (state_machine::add_global_state_to_xml): Replace with... (state_machine::add_global_state_to_state_graph): ...this. * sm.h (class xml_state): Drop forward decl. (class analyzer_state_graph): New forward decl. (state_machine::add_state_to_xml): Replace decl with... 
(state_machine::add_state_to_state_graph): ...this. (state_machine::add_global_state_to_xml): Replace decl with... (state_machine::add_global_state_to_state_graph): ...this. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/state-diagram-1-sarif.py (test_xml_state): Rename to... (test_state_graph): ...this. Port from XML to SARIF graphs. * gcc.dg/analyzer/state-diagram-1.c: Update sink option from "sarif:xml-state=yes" to "sarif:state-graphs=yes". * gcc.dg/analyzer/state-diagram-5-sarif.c: Likewise. * gcc.dg/analyzer/state-diagram-5-sarif.py: Drop import of ET. (test_nested_types_in_xml_state): Rename to... (test_nested_types_in_state_graph): ...this. Port from XML to SARIF graphs. * gcc.dg/plugin/diagnostic-test-graphs-html.c: New test. * gcc.dg/plugin/diagnostic-test-graphs-html.py: New test script. * gcc.dg/plugin/diagnostic-test-graphs-sarif.c: New test. * gcc.dg/plugin/diagnostic-test-graphs-sarif.py: New test script. * gcc.dg/plugin/diagnostic-test-graphs.c: New test. * gcc.dg/plugin/diagnostic_plugin_test_graphs.cc: New test plugin. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above. * lib/sarif.py (get_xml_state): Delete. (get_state_graph): New. (def get_state_node_attr): New. (get_state_node_kind): New. (get_state_node_name): New. (get_state_node_type): New. (get_state_node_value): New. * sarif-replay.dg/2.1.0-invalid/3.40.2-duplicate-node-id.sarif: New test. * sarif-replay.dg/2.1.0-invalid/3.41.4-unrecognized-node-id.sarif: New test. * sarif-replay.dg/2.1.0-valid/graphs-check-html.py: New test script. * sarif-replay.dg/2.1.0-valid/graphs-check-sarif-roundtrip.py: New test script. * sarif-replay.dg/2.1.0-valid/graphs.sarif: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
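To make the node/edge model concrete, here is a minimal C sketch of a directed graph with string node ids. It is not the diagnostics::digraphs or libgdiagnostics API: every name (toy_digraph, toy_add_node, toy_add_edge, toy_find_node) and the fixed capacities are hypothetical, chosen only to illustrate the two invalid inputs that the new sarif-replay tests are named after, a duplicate node id and an edge that refers to an unrecognized node id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NODES 16
#define TOY_MAX_EDGES 32

/* Hypothetical edge: indices into the owning graph's node table.  */
struct toy_edge
{
  int src;
  int dst;
};

/* Hypothetical graph: nodes are identified by unique string ids.  */
struct toy_digraph
{
  const char *node_ids[TOY_MAX_NODES];
  int num_nodes;
  struct toy_edge edges[TOY_MAX_EDGES];
  int num_edges;
};

/* Return the index of the node with id ID, or -1 if there is none.  */
int
toy_find_node (const struct toy_digraph *g, const char *id)
{
  assert (g != NULL && id != NULL);
  for (int i = 0; i < g->num_nodes; i++)
    if (strcmp (g->node_ids[i], id) == 0)
      return i;
  return -1;
}

/* Add a node; fail on a duplicate id or when the table is full.  */
bool
toy_add_node (struct toy_digraph *g, const char *id)
{
  if (g->num_nodes == TOY_MAX_NODES || toy_find_node (g, id) >= 0)
    return false;
  g->node_ids[g->num_nodes++] = id;
  return true;
}

/* Add an edge between two existing nodes; fail if either id is
   unknown or the edge table is full.  */
bool
toy_add_edge (struct toy_digraph *g, const char *src_id, const char *dst_id)
{
  int src = toy_find_node (g, src_id);
  int dst = toy_find_node (g, dst_id);
  if (src < 0 || dst < 0 || g->num_edges == TOY_MAX_EDGES)
    return false;
  g->edges[g->num_edges].src = src;
  g->edges[g->num_edges].dst = dst;
  g->num_edges++;
  return true;
}
```

Looking nodes up by id before wiring edges is what lets a consumer reject malformed input up front instead of producing a dangling edge.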
2025-07-11tree-optimization/121034 - fix reduction vectorizationRichard Biener1-0/+17
The following fixes the loop following the reduction chain to properly visit all SLP nodes involved and makes the stmt info and the SLP node we track match. PR tree-optimization/121034 * tree-vect-loop.cc (vectorizable_reduction): Cleanup reduction chain following code. * gcc.dg/vect/pr121034.c: New testcase.
2025-07-10Passing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000)Qing Zhao1-0/+43
The size of the element of the FAM _cannot_ reliably depend on the original TYPE of the FAM that we passed as the 6th parameter to the .ACCESS_WITH_SIZE: TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (gimple_call_arg (call, 5)))) when the element of the FAM has a variable length type. Since the variable that represents TYPE_SIZE_UNIT has no explicit usage in the original IL, compiler transformations (such as DSE) that are applied before the object_size phase might eliminate the whole definition of the variable that represents the TYPE_SIZE_UNIT of the element of the FAM. In order to resolve this issue, instead of passing the original TYPE of the FAM as the 6th argument to .ACCESS_WITH_SIZE, we should explicitly pass the original TYPE_SIZE_UNIT of the element TYPE of the FAM as the 6th argument to the call to .ACCESS_WITH_SIZE. PR middle-end/121000 gcc/c/ChangeLog: * c-typeck.cc (build_access_with_size_for_counted_by): Update comments. Pass TYPE_SIZE_UNIT of the element as the 6th argument. gcc/ChangeLog: * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments. * internal-fn.def (ACCESS_WITH_SIZE): Update comments. * tree-object-size.cc (access_with_size_object_size): Update comments. Get the element_size from the 6th argument directly. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by-pr121000.c: New test.
2025-07-10Fixes to auto-profile and Gimple matching.Jan Hubicka1-0/+9
This patch fixes several issues I noticed in gimple matching and the -Wauto-profile warning. One problem is that we mismatched symbols with user names, such as "*strlen" instead of "strlen". I added raw_symbol_name to strip the extra '*', which is ok on ELF targets, which are the only targets we support with auto-profile, but eventually we will want to add the user prefix. There is a sorry about this. Also I think dwarf2out is wrong: static void add_linkage_attr (dw_die_ref die, tree decl) { const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); /* Mimic what assemble_name_raw does with a leading '*'. */ if (name[0] == '*') name = &name[1]; The patch also fixes the locations of warnings. I used the location of the problematic statement as the warning_at parameter but also included info about the containing function. This made warning_at ignore the first location, which is fixed now. I also fixed the ICE with -Wno-auto-profile discussed earlier. Bootstrapped/regtested x86_64-linux. Autoprofiled bootstrap now fails for weird reasons for me (it does not build the training stage), so I will try to debug this before committing. gcc/ChangeLog: * auto-profile.cc: Include output.h. (function_instance::set_call_location): Also sanity check that location is known. (raw_symbol_name): Two new static functions. (dump_inline_stack): Use it. (string_table::get_index_by_decl): Likewise. (function_instance::get_cgraph_node): Likewise. (function_instance::get_function_instance_by_decl): Fix typo in warning; use raw names; fix lineno decoding. (match_with_target): Add containing function parameter; correctly output function and call location in warning. (function_instance::lookup_count): Fix warning locations. (function_instance::match): Fix warning locations; avoid crash with mismatched callee; do not warn about broken callsites twice. (autofdo_source_profile::offline_external_functions): Use raw_assembler_name. (walk_block): Use raw_assembler_name. 
gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/afdo-inline.c: Add user symbol names.
2025-07-09aarch64: Some fixes for SVE INDEX constantsRichard Sandiford2-0/+70
When using SVE INDEX to load an Advanced SIMD vector, we need to take account of the different element ordering for big-endian targets. For example, when big-endian targets store the V4SI constant { 0, 1, 2, 3 } in registers, 0 becomes the most significant element, whereas INDEX always operates from the least significant element. A big-endian target would therefore load V4SI { 0, 1, 2, 3 } using: INDEX Z0.S, #3, #-1 rather than little-endian's: INDEX Z0.S, #0, #1 While there, I noticed that we would only check the first vector in a multi-vector SVE constant, which would trigger an ICE if the other vectors turned out to be invalid. This is pretty difficult to trigger at the moment, since we only allow single-register modes to be used as frontend & middle-end vector modes, but it can be seen using the RTL frontend. gcc/ * config/aarch64/aarch64.cc (aarch64_sve_index_series_p): New function, split out from... (aarch64_simd_valid_imm): ...here. Account for the different SVE and Advanced SIMD element orders on big-endian targets. Check each vector in a structure mode. gcc/testsuite/ * gcc.dg/rtl/aarch64/vec-series-1.c: New test. * gcc.dg/rtl/aarch64/vec-series-2.c: Likewise. * gcc.target/aarch64/sve/acle/general/dupq_2.c: Fix expected output for this big-endian test. * gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise. * gcc.target/aarch64/sve/vec_init_3.c: Restrict to little-endian targets and add more tests. * gcc.target/aarch64/sve/vec_init_4.c: New big-endian version of vec_init_3.c.
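The element-order adjustment can be modelled in a few lines of C. This is only a sketch of the arithmetic, assuming a linear series of NELTS elements; the struct and the function index_for_series are hypothetical names and do not correspond to code in aarch64.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the immediates of "INDEX Zd.S, #base, #step".  */
struct index_operands
{
  long base;
  long step;
};

/* Return the INDEX operands that load the Advanced SIMD constant
   { first, first + step, ... } with NELTS elements.  INDEX counts up
   from the least significant element, while on big-endian targets
   element 0 of the constant is the most significant one, so the
   series has to be generated in reverse.  */
struct index_operands
index_for_series (long first, long step, int nelts, bool big_endian)
{
  assert (nelts > 0);
  struct index_operands ops = { first, step };
  if (big_endian)
    {
      ops.base = first + (nelts - 1) * step;
      ops.step = -step;
    }
  return ops;
}
```

For the V4SI constant { 0, 1, 2, 3 } this gives base 3 and step -1 on big-endian and base 0 and step 1 on little-endian, matching the two INDEX instructions quoted in the commit message.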
2025-07-09testsuite/120093 - fix gcc.dg/vect/pr101145.cRichard Biener1-9/+9
The following changes noinline to noipa to avoid having IPA-CP clones confusing the vectorized loop counting. PR testsuite/120093 * gcc.dg/vect/pr101145.c: Use noipa instead of noinline attribute.
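As a hedged illustration of the attribute change: GCC documents noipa as implying noinline, noclone and no_icf, so a function marked with it is neither inlined nor cloned by IPA-CP. The sketch below uses a hypothetical macro TEST_NOIPA and a hypothetical function sum_below; it falls back to noinline on compilers that do not know noipa.

```c
#include <assert.h>

/* Use noipa where the compiler supports it, otherwise only noinline.  */
#if defined (__has_attribute)
# if __has_attribute (noipa)
#  define TEST_NOIPA __attribute__ ((noipa))
# endif
#endif
#ifndef TEST_NOIPA
# define TEST_NOIPA __attribute__ ((noinline))
#endif

/* Sum the integers 0 .. N-1.  With noipa the compiler treats callers
   and callee as opaque to each other, so no constant-propagated clone
   of the loop is created.  */
TEST_NOIPA int
sum_below (int n)
{
  assert (n >= 0);
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += i;
  return sum;
}
```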
2025-07-09Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'Thomas Schwinge1-1/+1
Fix-up for commit 72e85d46472716e670cbe6e967109473b8d12d38 "tree-optimization/120780: Support object size for containing objects". 'size_t sz' is unused here, and GCC/nvptx doesn't accept this: spawn -ignore SIGHUP [...]/nvptx-none-run ./builtin-dynamic-object-size-pr120780.exe error : Prototype doesn't match for 'main' in 'input file 1 at offset 1924', first defined in 'input file 1 at offset 1924' nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999) FAIL: gcc.dg/builtin-dynamic-object-size-pr120780.c execution test gcc/testsuite/ * gcc.dg/builtin-dynamic-object-size-pr120780.c: Fix 'main' function.
2025-07-09middle-end: don't set range on partial vectors [PR120922]Tamar Christina1-0/+18
Before the change in g:309dbcea2cabb31bde1a65cdfd30bb7f87b170a2 we would never set a range for loops that have a constant VF and require partial vectors. I think a range could be set, since I think the number of latch executions is a ceiling division of TYPE_MAX_VALUE / vf, to account for the partial iteration. This would then also deal with the cause of the ICE in the PR, where the chosen VF was much higher than TYPE_MAX_VALUE and a mask is relied upon to make it safe. Since the patch was supposed to not change behavior I've added an additional partial vector check to the const_vf > 0 check to make it explicit that we only set it on non-partial vectors (the alternative would have been to swap the order of the vf.constant(&const_vf) check, but that would have hidden the requirement sneakily). The second patch adds support for ranges for partial masks. gcc/ChangeLog: PR tree-optimization/120922 * tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Don't set range for partial vectors. gcc/testsuite/ChangeLog: PR tree-optimization/120922 * gcc.dg/vect/pr120922.c: New test.
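The ceiling division mentioned above can be written without risking overflow. This is a sketch of the arithmetic only; the name ceil_div is hypothetical, and the message does not say whether the follow-up patch computes the bound this way.

```c
#include <assert.h>
#include <stdint.h>

/* Ceiling of MAX_VALUE / VF.  Adding VF - 1 before dividing could
   wrap around when MAX_VALUE is close to the type's maximum, so the
   remainder is handled separately instead.  */
uint64_t
ceil_div (uint64_t max_value, uint64_t vf)
{
  assert (vf > 0);
  return max_value / vf + (max_value % vf != 0);
}
```

With a maximum value of 255 and a VF of 8 this yields 32, because the last, partial iteration still counts. When the VF is larger than the maximum value, as in the PR, the result is 1.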
2025-07-08Avoid IPA opts around guality plumbingRichard Biener1-2/+3
The following avoids inlining the actual main() (renamed to guality_main) into the guality plumbing. Such inlining can cause jump threading opportunities to appear and generally increases the chance that what we actually test isn't what we think it is. Likewise make guality_check noipa instead of just noinline. gcc/testsuite/ * gcc.dg/guality/guality.h (guality_main): Declare noipa. (guality_check): Likewise.
2025-07-07[committed] Minor fix to gcc.dg/torture/pr120654.cJeff Law1-4/+2
I don't recall which port complained, but pr120654.c was failing on one or more of the embedded targets due to the use of malloc/free. This change just turns them into the __builtin variants which makes everyone happy again. gcc/testsuite * gcc.dg/torture/pr120654.c: Use __builtin variants of malloc and free.
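A sketch of what the switch to the builtin variants looks like in a test. The function heap_roundtrip is hypothetical and is not taken from pr120654.c. The builtins need no declaration from <stdlib.h>, which is presumably what helps on the embedded targets, though the message does not state the exact cause.

```c
/* Allocate one int with the builtin allocator, store VALUE, read it
   back and release the memory.  No header is needed because the
   compiler already knows the __builtin_ functions.  */
int
heap_roundtrip (int value)
{
  int *p = __builtin_malloc (sizeof *p);
  if (!p)
    __builtin_abort ();
  *p = value;
  int result = *p;
  __builtin_free (p);
  return result;
}
```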
2025-07-07Revert "Extend "counted_by" attribute to pointer fields of structures. Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE." due to PR120929Qing Zhao5-283/+1
This reverts commit 687727375769dd41971bad369f3553f1163b3e7a.
2025-07-07Revert "Use the counted_by attribute of pointers in builtinin-object-size." due to PR120929Qing Zhao8-253/+0
This reverts commit 7165ca43caf47007f5ceaa46c034618d397d42ec.
2025-07-07Revert "Use the counted_by attribute of pointers in array bound checker." due to PR120929Qing Zhao5-221/+0
This reverts commit 9d579c522d551eaa807e438206e19a91a3def67f.
2025-07-07testsuite: add sve hw check to testcase [PR120817]Tamar Christina1-1/+2
Drop down from SVE2 to SVE1 as that's the minimum required for the test, and since it's a mid-end test add the aarch64_sve_hw check. gcc/testsuite/ChangeLog: PR tree-optimization/120817 * gcc.dg/vect/pr120817.c: Add SVE HW check.
2025-07-07tree-optimization/120817 - bogus DSE of .MASK_STORERichard Biener1-0/+40
DSE used ao_ref_init_from_ptr_and_size for .MASK_STORE but alias-analysis will use the specified size to disambiguate against smaller objects. For .MASK_STORE we instead have to make the access size unspecified but we can still constrain the access extent based on the maximum size possible. PR tree-optimization/120817 * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Use ao_ref_init_from_ptr_and_range with unknown size for .MASK_STORE and .MASK_LEN_STORE. * gcc.dg/vect/pr120817.c: New testcase.
2025-07-06crc: Error out on non-constant poly arguments for the crc builtins [PR120709]Andrew Pinski1-0/+11
These builtins require a constant integer for the third argument, but currently there is an assert rather than an error. This fixes that and updates the documentation too, using the same terms as are used for the __builtin_prefetch arguments. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/120709 gcc/ChangeLog: * builtins.cc (expand_builtin_crc_table_based): Error out instead of asserting the 3rd argument is an integer constant. * internal-fn.cc (expand_crc_optab_fn): Likewise. * doc/extend.texi (crc): Document requirement of the poly argument being a constant. gcc/testsuite/ChangeLog: * gcc.dg/crc-non-cst-poly-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-06cdce: Fix non-call exceptions with signaling nans [PR120951]Andrew Pinski1-0/+12
The cdce code introduces a test for a NaN using the EQ_EXPR code. The problem is that EQ_EXPR can cause an exception with non-call exceptions and signaling NaNs turned on. This is now correctly rejected by the verifier since r16-241-g4c40e3d7b9152f. The fix is to separate out the comparison into its own statement from the GIMPLE_COND. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/120951 gcc/ChangeLog: * tree-call-cdce.cc (use_internal_fn): If EQ_EXPR can throw with non-call exceptions for floating point types, create the EQ_EXPR separately. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr120951-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-06Add cutoff information to profile_info and use it when forcing non-zero valueJan Hubicka1-1/+1
The main difference between normal profile feedback and auto-fdo is that with profile feedback every basic block with a non-zero profile has an incoming edge with a non-zero profile. With auto-profile it is possible that none of the predecessors was sampled, and the tool also has a cutoff parameter which makes it ignore small counts. This becomes a problem when one tries to specialize code and scale the profile. For example, if an inline function happens to have a hot loop with non-zero counts but its entry count is zero and we want to inline it into a call with a non-zero count X, we want to scale the body by X/0, which we currently turn into X/1. This is a problem since I added logic to scale up the auto-profiles (to get some extra bits of precision), so X is often a large value and multiplying by X is not a right answer at all. The multiply factor should be <= 1. Iterating this a few times will make the counts cap and we will lose any useful info. The original implementation avoided this by doing all inlines before AFDO readback, but this is not possible with LTO (unless we move AFDO readback to WPA or add support for context sensitive profiles). I think I can get the scaling to work reasonably well and then we can look into possible benefits of context sensitive profiling, which can be implemented atop both AFDO and FDO. This patch adds a cutoff value to profile_info which is initialized by profile feedback to 1 and by auto-profile to the scale factor (since we do not know the cutoff create_gcov used; llvm's tool streams it and we probably should too). Then force_nonzero forces every value smaller than cutoff/2 to cutoff/2, which should keep scaling factors in reasonable ranges. gcc/ChangeLog: * auto-profile.cc (autofdo_source_profile::read): Scale cutoff. (read_autofdo_file): Initialize cutoff. * coverage.cc (read_counts_file): Initialize cutoff to 1. * gcov-io.h (struct gcov_summary): Add cutoff field. 
* ipa-inline.cc (inline_small_functions): mac_count can be non-zero also with auto_profile. * lto-cgraph.cc (output_profile_summary): Write cutoff and sum_max. (input_profile_summary): Read cutoff and sum max. (merge_profile_summaries): Initialize and scale global cutoffs and sum max. * profile-count.cc: Include profile.h (profile_count::force_nonzero): move here from ...; use cutoff. * profile-count.h: (profile_count::force_nonzero): ... here. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/clone-merge-1.c:
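The clamping rule described in the message can be modelled as follows. This is a simplified sketch on plain integers, not profile_count::force_nonzero itself: the name force_nonzero_model is hypothetical, and raising the floor to 1 when cutoff/2 rounds down to zero is an assumption made here so that the result can always be used as a divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Raise every COUNT smaller than CUTOFF / 2 to CUTOFF / 2, so that
   the scale factor X / count stays in a reasonable range.  */
uint64_t
force_nonzero_model (uint64_t count, uint64_t cutoff)
{
  assert (cutoff > 0);
  uint64_t floor = cutoff / 2;
  /* Assumption of this sketch: never return zero, even for cutoff 1.  */
  if (floor == 0)
    floor = 1;
  return count < floor ? floor : count;
}
```

With a cutoff of 1000, an unsampled entry count of 0 becomes 500, so inlining into a call with count X scales the body by X/500 rather than X/1.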
2025-07-04fold: Change comparison of error_mark_node to use error_operand_p in tree_expr_nonnegative_warnv_p [PR118948]Andrew Pinski1-0/+12
This is an obvious fix for this small regression. Basically after r15-328-g5726de79e2154a, there is a call to tree_expr_nonnegative_warnv_p where the type of the expression is now error_mark_node. However, there was only a check for whether the expression itself was error_mark_node. Bootstrapped and tested on x86_64-linux-gnu. PR c/118948 gcc/ChangeLog: * fold-const.cc (tree_expr_nonnegative_warnv_p): Use error_operand_p instead of checking for error_mark_node directly. gcc/testsuite/ChangeLog: * gcc.dg/pr118948-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-04tree-optimization/120944 - bogus VN with volatile copiesRichard Biener1-0/+34
The following avoids translating expressions through volatile copies. PR tree-optimization/120944 * tree-ssa-sccvn.cc (vn_reference_lookup_3): Gate optimizations invalid when volatile is involved. * gcc.dg/torture/pr120944.c: New testcase.
2025-07-04tree-optimization/120927 - 510.parest_r segfault with masked epilogRichard Biener2-0/+48
The following fixes bad alignment computation for epilog vectorization when, as in this case for 510.parest_r and masked epilog vectorization with AVX512, we end up choosing AVX to vectorize the main loop and masked AVX512 (sic!) to vectorize the epilog. In that case alignment analysis for the epilog tries to force alignment of the base to 64, but that cannot possibly help the epilog when the main loop had used a vector mode with a smaller alignment requirement. There's another issue, that the check whether the step preserves alignment needs to consider possibly previously involved VFs (here, the main loop's smaller VF) as well. This might not be the only case with problems for such a mode mix but at least there it seems wise to never use DR alignment forcing when analyzing an epilog. We get to choose this mode setup because the iteration over epilog modes doesn't prevent this, the maybe_ge (cached_vf_per_mode[0], first_vinfo_vf) skip is conditional on !supports_partial_vectors and it is also conditional on having a cached VF. Further nothing in vect_analyze_loop_1 rejects this setup - it might be conceivable that a target can do masking only for larger modes. There is a second reason we end up with this mode setup, which is that vect_need_peeling_or_partial_vectors_p says we do not need peeling or partial vectors when analyzing the main loop with AVX512 (if it did say so, we'd have chosen a masked AVX512 epilog-only vectorization). It does that because it looks at LOOP_VINFO_COST_MODEL_THRESHOLD (which is not yet computed, so always zero at this point), and compares max_niter (5) against the VF (8), but not with equality as the comment says but with greater. This also needs looking at, PR120939. PR tree-optimization/120927 * tree-vect-data-refs.cc (vect_compute_data_ref_alignment): Do not force a DRs base alignment when analyzing an epilog loop. Check whether the step preserves alignment for all VFs possibly involved so far. * gcc.dg/vect/vect-pr120927.c: New testcase. 
* gcc.dg/vect/vect-pr120927-2.c: Likewise.
2025-07-04c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]Jakub Jelinek1-0/+32
The following testcase is miscompiled with -fsanitize=undefined but we introduce UB into the IL even without that flag. The optimization ptr +- (expr +- cst) when expr/cst have undefined overflow into (ptr +- cst) +- expr is sometimes simply not valid, without careful analysis on what ptr points to we don't know if it is valid to do (ptr +- cst) pointer arithmetics. E.g. on the testcase, ptr points to start of an array (actually conditionally one or another) and cst is -1, so ptr - 1 is invalid pointer arithmetics, while ptr + (expr - 1) can be valid if expr is at runtime always > 1 and smaller than size of the array ptr points to + 1. Unfortunately, removing this 1992-ish optimization altogether causes FAIL: c-c++-common/restrict-2.c -Wc++-compat scan-tree-dump-times lim2 "Moving statement" 11 FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump ch2 "is now do-while loop" FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump-times ch2 " if " 3 FAIL: gcc.dg/vect/pr57558-2.c scan-tree-dump vect "vectorized 1 loops" FAIL: gcc.dg/vect/pr57558-2.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops" regressions (restrict-2.c also for C++ in all std modes). I've been thinking about some match.pd optimization for signed integer addition/subtraction of constant followed by widening integral conversion followed by multiplication or left shift, but that wouldn't help 32-bit arches. So, instead at least for now, the following patch keeps doing the optimization, just doesn't perform it in pointer arithmetics. 
pointer_int_sum itself actually adds the multiplication by size_exp, so ptr + expr is turned into ptr p+ expr * size_exp, so this patch will try to optimize ptr + (expr +- cst) into ptr p+ ((sizetype)expr * size_exp +- (sizetype)cst * size_exp) and ptr - (expr +- cst) into ptr p+ -((sizetype)expr * size_exp +- (sizetype)cst * size_exp) 2025-07-04 Jakub Jelinek <jakub@redhat.com> PR c/120837 * c-common.cc (pointer_int_sum): Rewrite the intop PLUS_EXPR or MINUS_EXPR optimization into extension of both intop operands, their separate multiplication and then addition/subtraction followed by rest of pointer_int_sum handling after the multiplication. * gcc.dg/ubsan/pr120837.c: New test.
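The validity argument can be seen in a small C function. This is an illustrative sketch, not the new test: the name element_before is hypothetical. The source expression is ARR + (I - 1), which stays inside the array for 1 <= I <= length, whereas regrouping it as (ARR - 1) + I would first form a pointer before the start of the array.

```c
#include <assert.h>

/* Return the element just before index I.  The addition is written as
   ARR + (I - 1): for I >= 1 the intermediate pointer never leaves the
   array.  Computing (ARR - 1) + I instead would be undefined when ARR
   points to the first element, even though the final address is the
   same.  */
int
element_before (const int *arr, int i)
{
  assert (i >= 1);
  return *(arr + (i - 1));
}
```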
2025-07-03testsuite: Fix gcc.dg/ipa/pr120295.c on SolarisRainer Orth1-2/+2
gcc.dg/ipa/pr120295.c FAILs on Solaris: FAIL: gcc.dg/ipa/pr120295.c (test for excess errors) Excess errors: ld: warning: symbol 'glob' has differing types: (file /var/tmp//ccsDR59c.o type=OBJT; file /lib/libc.so type=FUNC); /var/tmp//ccsDR59c.o definition taken Fixed by renaming the glob variable to glob_ to avoid the conflict. Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu. gcc/testsuite: * gcc.dg/ipa/pr120295.c (glob): Rename to glob_.
2025-07-03tree-optimization/120780: Support object size for containing objectsSiddhesh Poyarekar1-0/+233
MEM_REF cast of a subobject to its containing object has negative offsets, which objsz sees as an invalid access. Support this use case by peeking into the structure to validate that the containing object indeed contains a type of the subobject at that offset and if present, adjust the wholesize for the object to allow the negative offset. gcc/ChangeLog: PR tree-optimization/120780 * tree-object-size.cc (inner_at_offset, get_wholesize_for_memref): New functions. (addr_object_size): Call get_wholesize_for_memref. gcc/testsuite/ChangeLog: PR tree-optimization/120780 * gcc.dg/builtin-dynamic-object-size-pr120780.c: New test case. Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
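A sketch of the kind of source that produces such a negative offset is the familiar container-of idiom. The types and function names below are hypothetical and are not copied from the new test; the test exercises __builtin_dynamic_object_size, while this sketch calls the older __builtin_object_size, whose result depends on optimization level and compiler version and may be (size_t) -1 when the size is unknown.

```c
#include <assert.h>
#include <stddef.h>

struct inner
{
  int x;
  int y;
};

struct outer
{
  long tag;
  struct inner in;
  char tail[8];
};

/* Recover the containing object from a pointer to its member.  The
   subtraction is the negative offset, relative to the subobject, that
   the object size pass has to accept.  */
struct outer *
outer_from_inner (struct inner *p)
{
  assert (p != NULL);
  return (struct outer *) ((char *) p - offsetof (struct outer, in));
}

/* Ask the compiler for the size of the recovered object.  The value is
   not asserted anywhere because it is compiler dependent.  */
size_t
recovered_object_size (struct inner *p)
{
  return __builtin_object_size (outer_from_inner (p), 0);
}
```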
2025-07-01Use the counted_by attribute of pointers in array bound checker.Qing Zhao5-0/+221
The current array bound checker only instruments ARRAY_REF, and the INDEX information is the 2nd operand of the ARRAY_REF. When extending the array bound checker to pointer references with counted_by attributes, the hardest part is to get the INDEX of the corresponding array ref from the offset computation expression of the pointer ref. I.e. given an OFFSET expression and the ELEMENT_SIZE, get the index expression from the OFFSET. For example: OFFSET: ((long unsigned int) m * (long unsigned int) SAVE_EXPR <n>) * 4 ELEMENT_SIZE: (sizetype) SAVE_EXPR <n> * 4 get the index as (long unsigned int) m. gcc/c-family/ChangeLog: * c-gimplify.cc (is_address_with_access_with_size): New function. (ubsan_walk_array_refs_r): Instrument an INDIRECT_REF whose base address is .ACCESS_WITH_SIZE or an address computation whose base address is .ACCESS_WITH_SIZE. * c-ubsan.cc (ubsan_instrument_bounds_pointer_address): New function. (struct factor_t): New structure. (get_factors_from_mul_expr): New function. (get_index_from_offset): New function. (get_index_from_pointer_addr_expr): New function. (is_instrumentable_pointer_array_address): New function. (ubsan_array_ref_instrumented_p): Change prototype. Handle MEM_REF in addition to ARRAY_REF. (ubsan_maybe_instrument_array_ref): Handle MEM_REF in addition to ARRAY_REF. gcc/testsuite/ChangeLog: * gcc.dg/ubsan/pointer-counted-by-bounds-2.c: New test. * gcc.dg/ubsan/pointer-counted-by-bounds-3.c: New test. * gcc.dg/ubsan/pointer-counted-by-bounds-4.c: New test. * gcc.dg/ubsan/pointer-counted-by-bounds-5.c: New test. * gcc.dg/ubsan/pointer-counted-by-bounds.c: New test.
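The relation between OFFSET, ELEMENT_SIZE and the index can be checked numerically. The patch works symbolically on the factors of the multiplication (see get_factors_from_mul_expr), so the division below is only a sanity check of the identity, and the name index_from_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Numeric counterpart of the symbolic step: OFFSET is the product of
   the index and ELEMENT_SIZE, so dividing recovers the index.  */
size_t
index_from_offset (size_t offset, size_t element_size)
{
  assert (element_size != 0 && offset % element_size == 0);
  return offset / element_size;
}
```

With m = 5 and n = 3, OFFSET is (5 * 3) * 4 = 60 and ELEMENT_SIZE is 3 * 4 = 12, which gives back the index 5.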