riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2025-01-29	[PR testsuite/116860] Testsuite adjustment for recently added tests	Jeff Law	2	-2/+2
	There are two new tests that are dependent on logical-op-non-short-circuit settings. The BZ is reported against ppc64 and ppc64le, but also applies to a goodly number of the other targets. The "regression" fix is trivial, just add the appropriate param to force the behavior we're expecting. I'm committing that fix momentarily. It's been verified on ppc64, ppc64le and x86_64 as well as the various embedded targets in my tester where many FAILs flip to PASS. I'm leaving the bug open without the regression marker as Jakub has noted a couple of improvements that we can and probably should make. PR target/116860 gcc/testsuite * gcc.dg/tree-ssa/fold-xor-and-or.c: Set logical-op-non-short-circuit. * gcc.dg/tree-ssa/fold-xor-or.c: Similarly.
2025-01-29	tree-ssa-dce: Avoid creating invalid BBs with no outgoing edge (PR117892)	Martin Jambor	2	-0/+28
	Zhendong Su and Michal Jireš found out that our gimple DSE pass can, under fairly specific conditions, remove a noreturn call which then leaves behind a "normal" BB with no successor edges which following passes do not expect. This patch simply tells the pass to leave such calls alone even when they otherwise appear to be dead. Interestingly, our CFG verifier does not report this. I'll put on my todo list to add a test for it in the next stage 1. gcc/ChangeLog: 2025-01-28 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117892 * tree-ssa-dse.cc (dse_optimize_call): Leave control-altering noreturn calls alone. gcc/testsuite/ChangeLog: 2025-01-27 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117892 * gcc.dg/tree-ssa/pr117892.c: New test. * gcc.dg/tree-ssa/pr118517.c: Likewise. co-authored-by: Michal Jireš <mjires@suse.cz>
2025-01-29	middle-end/118684 - fix fallout of wrong stack local alignment fix	Richard Biener	1	-1/+1
	When we expand BIT_FIELD_REF <x_2(D), 8, 8> we can end up creating a stack local, running into the fix. But get_object_alignment will return 8 for any SSA_NAME because that's not an "object" we handle. Deal with handled components on registers by singling out SSA_NAME bases, using their type alignment instead of get_object_alignment (I considered "robustifying" get_object_alignment, but decided not to at this point). This fixes an ICE on gcc.dg/pr41123.c on arm as reported by the CI. PR middle-end/118684 * expr.cc (expand_expr_real_1): When creating a stack local during expansion of a handled component, when the base is a SSA_NAME use its type alignment and avoid calling get_object_alignment. * gcc.dg/pr118684.c: Require automatic_stack_alignment.
2025-01-28	middle-end/118684 - wrongly aligned stack local during expansion	Richard Biener	1	-0/+12
	The following fixes a not properly aligned stack temporary created during RTL expansion of a MEM_REF that we handle as a BIT_FIELD_REF whose base was allocated to a register but which was originally aligned to allow a larger load not trapping. While probably UB in C the vectorizer creates aligned accesses that might overread a (static) allocation because it is then known not to trap. PR middle-end/118684 * expr.cc (expand_expr_real_1): When expanding a reference based on a register and we end up needing a MEM make sure that's aligned as the original reference required. * gcc.dg/pr118684.c: New testcase.
2025-01-28	sarif output: escape braces in messages [PR118675]	David Malcolm	3	-5/+5
	gcc/ChangeLog: PR other/118675 * diagnostic-format-sarif.cc: Define INCLUDE_STRING. (escape_braces): New. (set_string_property_escaping_braces): New. (sarif_builder::make_message_object): Escape braces in the "text" property. (sarif_builder::make_message_object_for_diagram): Likewise, and for the "markdown" property. (sarif_builder::make_multiformat_message_string): Likewise for the "text" property. (selftest::test_message_with_braces): New. (selftest::diagnostic_format_sarif_cc_tests): Call it. gcc/testsuite/ChangeLog: PR other/118675 * gcc.dg/sarif-output/bad-binary-op.py: Update expected output for escaping of braces in message text. * gcc.dg/sarif-output/missing-semicolon.py: Likewise. * gcc.dg/sarif-output/multiple-outputs.py: Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-01-28	tree-optimization/117424 - invalid LIM of trapping ref	Richard Biener	1	-0/+18
	The following addresses a bug in tree_could_trap_p leading to hoisting of a possibly trapping, because of out-of-bound, access. We only ensured the first accessed byte is within a decl there, the patch makes sure the whole base of the reference is within it. This is pessimistic if a handled component would then subset to a sub-object within the decl but upcasting of a decl to larger types should be uncommon, questionable, and wrong without -fno-strict-aliasing. The testcase is a bit fragile, but I could not devise a (portable) way to ensure an out-of-bound access to a decl would fault. PR tree-optimization/117424 * tree-eh.cc (tree_could_trap_p): Verify the base is fully contained within a decl. * gcc.dg/tree-ssa/ssa-lim-25.c: New testcase.
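The difference between checking only the first accessed byte and checking the whole access can be sketched in plain C. This is an illustration of the idea only, not the code in tree_could_trap_p; both helper names and the byte-based interface are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: only checks that the first accessed byte lies
   inside a declaration that is decl_size bytes long.  */
bool first_byte_in_decl(size_t decl_size, size_t offset)
{
    return offset < decl_size;
}

/* Hypothetical helper: checks that the whole access
   [offset, offset + access_size) lies inside the declaration.
   Written without computing offset + access_size, so it cannot wrap.  */
bool whole_access_in_decl(size_t decl_size, size_t offset, size_t access_size)
{
    return offset <= decl_size && access_size <= decl_size - offset;
}
```

A 4-byte access at offset 2 into a 4-byte object passes the first check but fails the second, which is the kind of access that must not be treated as non-trapping.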
2025-01-28	Add tests for implied copy of variables in reduction clause.	Hafiz Abid Qadeer	1	-0/+29
	The OpenACC reduction clause on compute construct implies a copy clause for each reduction variable [1]. This patch adds tests to check if the implied copy is being generated. The check covers various types and operators as described in the specification. [1] OpenACC 2.7 Specification section 2.5.13 gcc/testsuite/ChangeLog: * c-c++-common/goacc/implied-copy-1.c: New test. * c-c++-common/goacc/implied-copy-2.c: New test. * g++.dg/goacc/implied-copy.C: New test. * gcc.dg/goacc/implied-copy.c: New test. * gfortran.dg/goacc/implied-copy-1.f90: New test. * gfortran.dg/goacc/implied-copy-2.f90: New test.
2025-01-28	c: For array element type drop qualifiers but keep other properties of the element type [PR116357]	Jakub Jelinek	1	-0/+10
	In the following testcase we error on the first case because it is trying to construct an array from overaligned type, but if there are qualifiers, we accept it silently (unlike in C++ which diagnoses all 3). The problem is that grokdeclarator if TYPE_QUALS (element_type) is non-zero just uses TYPE_MAIN_VARIANT; that loses not just the qualifiers but also attributes, alignment etc. The following patch uses c_build_qualified_type with TYPE_UNQUALIFIED instead, which will be in the common case the same as TYPE_MAIN_VARIANT if the checks are satisfied for it, but if not, will look up different unqualified type or even create it if there is none. 2025-01-28 Jakub Jelinek <jakub@redhat.com> PR c/116357 * c-decl.cc (grokdeclarator): Use c_build_qualified_type with TYPE_UNQUALIFIED instead of TYPE_MAIN_VARIANT. * gcc.dg/pr116357.c: New test.
2025-01-27	RISC-V: Disable two-source permutes for now [PR117173].	Robin Dapp	2	-0/+2
	After testing on the BPI (4.2% improvement for x264 input 1, 4.4% for input 2) and the discussion in PR117173 I figured it's best to disable the two-source permutes by default for now. The patch adds a parameter "riscv-two-source-permutes" which restores the old behavior. PR target/117173 gcc/ChangeLog: * config/riscv/riscv-v.cc (shuffle_generic_patterns): Only support single-source permutes by default. * config/riscv/riscv.opt: New param "riscv-two-source-permutes". gcc/testsuite/ChangeLog: * gcc.dg/fold-perm-2.c: Run with two-source permutes. * gcc.dg/pr54346.c: Ditto.
2025-01-27	tree-optimization/118653 - ICE in vectorizable_live_operation	Richard Biener	1	-0/+15
	The checking code didn't take into account debug uses. PR tree-optimization/118653 * tree-vect-loop.cc (vectorizable_live_operation): Also allow out-of-loop debug uses. * gcc.dg/vect/pr118653.c: New testcase.
2025-01-27	rtl-optimization/118662 - wrong combination of vector sign-extends	Richard Biener	1	-0/+18
	The following fixes an issue in the RTL combiner where we correctly combine two vector sign-extends with a vector load Trying 7, 9 -> 10: 7: r106:V4QI=[r119:DI] REG_DEAD r119:DI 9: r108:V4HI=sign_extend(vec_select(r106:V4QI#0,parallel)) 10: r109:V4SI=sign_extend(vec_select(r108:V4HI#0,parallel)) REG_DEAD r108:V4HI to modifying insn i2 9: r109:V4SI=sign_extend([r119:DI]) but since r106 is used we wrongly materialize it using a subreg: modifying insn i3 10: r106:V4QI=r109:V4SI#0 which of course does not work for modes with more than one component, specifically vector and complex modes. PR rtl-optimization/118662 * combine.cc (try_combine): When re-materializing a load from an extended reg by a lowpart subreg make sure we're not dealing with vector or complex modes. * gcc.dg/torture/pr118662.c: New testcase.
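Why a lowpart of the extended value cannot stand in for the original narrow vector can be shown with ordinary arrays. The sketch below is an analogy in portable C, not RTL: the "lowpart" is modeled as the first four bytes of the wide array in memory, and the function names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Sign-extend four 8-bit lanes to four 32-bit lanes, then test whether
   four contiguous bytes of the wide result reproduce the narrow vector.  */
int lowpart_recovers_vector(int8_t a, int8_t b, int8_t c, int8_t d)
{
    int8_t src[4] = { a, b, c, d };
    int32_t wide[4];
    for (int i = 0; i < 4; i++)
        wide[i] = src[i];              /* per-lane sign extension */

    int8_t low[4];
    memcpy(low, wide, sizeof low);     /* stand-in for a lowpart subreg */
    return memcmp(low, src, sizeof low) == 0;
}

/* Scalar analogue: truncating a sign-extended value gives it back.  */
int lowpart_recovers_scalar(int8_t x)
{
    int32_t wide = x;
    return (int8_t)wide == x;
}
```

For a scalar the round trip works, but for the vector {1, -2, 3, -4} the four bytes all come from a single widened lane, so the other lanes are lost.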
2025-01-27	middle-end/118643 - ICE with out-of-bound decl access	Richard Biener	1	-0/+11
	When RTL expansion of an out-of-bound access of a register falls back to a BIT_FIELD_REF we have to ensure that's valid. The following avoids negative offsets by expanding through a stack temporary. PR middle-end/118643 * expr.cc (expand_expr_real_1): Avoid falling back to BIT_FIELD_REF expansion for negative offset. * gcc.dg/pr118643.c: New testcase.
2025-01-27	tree-optimization/112859 - bogus loop distribution	Richard Biener	2	-0/+45
	When we get a zero distance vector we still have to check for the situation of a common inner loop with zero distance. But we can still allow a zero distance for the loop we distribute (gcc.dg/tree-ssa/ldist-33.c is such a case). This is because zero distances in non-outermost loops are a misrepresentation of dependence by dependence analysis. Note that test coverage of loop distribution of loop nests is very low. PR tree-optimization/112859 PR tree-optimization/115347 * tree-loop-distribution.cc (loop_distribution::pg_add_dependence_edges): For a zero distance vector still make sure to not have an inner loop with zero distance. * gcc.dg/torture/pr112859.c: New testcase. * gcc.dg/torture/pr115347.c: Likewise.
2025-01-27	match.pd: Canonicalize unsigned division by power of two into right shift [PR118637]	Jakub Jelinek	1	-0/+22
	We already do this canonicalization in simplify_using_ranges::simplify_div_or_mod_using_ranges, but that means that it is not done at -O1 or when vrp is otherwise disabled, and that it can be done too late in some cases when e.g. the r8-2064 "X / C1 op C2 into a simple range test." optimization triggers first. Note, for unsigned modulo we already have (simplify (mod @0 (convert? (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0)) ... optimization which duplicates what simplify_using_ranges::simplify_div_or_mod_using_ranges does in case ranges aren't needed. For GCC 16 I think we should improve the niters pattern recognition and handle even what r8-2064 comes with, after all as I've tried to show in the PR the user could have written it that way. I've guarded this optimization on #if GIMPLE just in case this would stand in any way to the various divmult etc. simplification, guess that can be lifted for GCC 16 too. In the modulo case we also handle unsigned % (power_of_two << n), but not really sure if we could do that for the division, because unsigned / (power_of_two << n) is not simple unsigned >> (log2 (power_of_two) + n), one can shift the bit out and then it becomes just 0. 2025-01-27 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118637 * match.pd: Canonicalize unsigned division by power of two to right shift. * gcc.dg/tree-ssa/pr118637.c: New test.
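The equivalence behind this canonicalization can be checked at the source level. The following is a small sketch, assuming only standard C; the function names are illustrative and it says nothing about which GCC pass performs the rewrite.

```c
#include <limits.h>

/* Unsigned division by 2^k, for 0 <= k < width of unsigned.  */
unsigned div_pow2(unsigned x, unsigned k)
{
    return x / (1u << k);
}

/* The right-shift form the division is canonicalized into.  */
unsigned shift_right(unsigned x, unsigned k)
{
    return x >> k;
}

/* Count disagreements between the two forms over a few sample values
   and every valid shift count.  */
unsigned count_div_shift_mismatches(void)
{
    static const unsigned samples[] = {
        0u, 1u, 7u, 8u, 9u, 1000u, UINT_MAX - 1u, UINT_MAX
    };
    unsigned bad = 0;
    for (unsigned k = 0; k < sizeof(unsigned) * CHAR_BIT; k++)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            if (div_pow2(samples[i], k) != shift_right(samples[i], k))
                bad++;
    return bad;
}
```

The shift count is kept below the width of unsigned, since both `1u << k` and `x >> k` are undefined beyond that.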
2025-01-27	match.pd: Fix indefinite recursion during exp-log transformations [PR118490]	Soumya AR	2	-0/+7
	This patch fixes the ICE caused when comparing log or exp of a constant with another constant. The transform is now restricted to cases where the resultant log/exp (CST) can be constant folded. Signed-off-by: Soumya AR <soumyaa@nvidia.com> gcc/ChangeLog: PR target/118490 * match.pd: Added ! to verify that log/exp (CST) can be constant folded. gcc/testsuite/ChangeLog: PR target/118490 * gcc.dg/pr118490.c: New test.
2025-01-23	[ifcombine] check for more zero-extension cases [PR118572]	Alexandre Oliva	1	-0/+36
	When comparing a signed narrow variable with a wider constant that has the bit corresponding to the variable's sign bit set, we would check that the constant is a sign-extension from that sign bit, and conclude that the compare fails if it isn't. When the signed variable is masked without getting the [lr]l_signbit variable set, or when the sign bit itself is masked out, we know the sign-extension bits from the extended variable are going to be zero, so the constant will only compare equal if it is a zero- rather than sign-extension from the narrow variable's precision, therefore, check that it satisfies this property, and yield a false compare result otherwise. for gcc/ChangeLog PR tree-optimization/118572 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Compare as unsigned the variables whose extension bits are masked out. for gcc/testsuite/ChangeLog PR tree-optimization/118572 * gcc.dg/field-merge-24.c: New.
2025-01-23	[ifcombine] out-of-bounds bitfield refs can trap [PR118514]	Alexandre Oliva	1	-0/+19
	Check that BIT_FIELD_REFs of DECLs are in range before deciding they don't trap. Check that a replacement bitfield load is as trapping as the replaced load. for gcc/ChangeLog PR tree-optimization/118514 * tree-eh.cc (bit_field_ref_in_bounds_p): New. (tree_could_trap_p) <BIT_FIELD_REF>: Call it. * gimple-fold.cc (make_bit_field_load): Check trapping status of replacement load against original load. for gcc/testsuite/ChangeLog PR tree-optimization/118514 * gcc.dg/field-merge-23.c: New.
2025-01-23	rtl-ssa: Avoid dangling phi uses [PR118562]	Richard Sandiford	1	-0/+18
	rtl-ssa uses degenerate phis to maintain an RPO list of accesses in which every use is of the RPO-previous definition. Thus, if it finds that a phi is always equal to a particular value V, it sometimes needs to keep the phi and make V the single input, rather than replace all uses of the phi with V. The code to do that rerouted the phi's first input to the single value V. But as this PR shows, it failed to unlink the uses of the other inputs. The specific problem in the PR was that we had: x = PHI<x(a), V(b)> The code replaced the first input with V and removed the second input from the phi, but it didn't unlink the use of V associated with that second input. gcc/ PR rtl-optimization/118562 * rtl-ssa/blocks.cc (function_info::replace_phi): When converting to a degenerate phi, make sure to remove all uses of the previous inputs. gcc/testsuite/ PR rtl-optimization/118562 * gcc.dg/torture/pr118562.c: New test.
2025-01-23	builtins: Store unspecified value to *exp for inf/nan [PR114877]	Jakub Jelinek	1	-8/+25
	The fold_builtin_frexp folding for NaN/Inf just returned the first argument while evaluating the second argument's side effects, rather than storing something to what the second argument points to. The PR argues that the C standard requires the function to store something there but what exactly is stored is unspecified, so not storing anything there can result in UB if the value isn't initialized and is read later. glibc and newlib store 0 there, musl apparently doesn't store anything. The following patch stores zero there (or would you prefer storing some other value, 42, INT_MAX, INT_MIN, etc.?; zero is cheapest to form in assembly though) and adjusts the test so that it doesn't rely on nothing being stored there but instead checks for the -Wmaybe-uninitialized warning to find out that something has been stored there. Unfortunately I had to disable the NaN tests for -O0, while we can fold __builtin_isnan (__builtin_nan ("")) at compile time, we can't fold __builtin_isnan ((i = 0, __builtin_nan (""))) at compile time. fold_builtin_classify uses just tree_expr_nan_p and if that isn't true (because expr is a COMPOUND_EXPR with tree_expr_nan_p on the second arg), it does arg = builtin_save_expr (arg); return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg); and that isn't folded at -O0 further, as we wrap it into SAVE_EXPR and nothing propagates the NAN to the comparison. I think perhaps tree_expr_nan_p etc. could have case COMPOUND_EXPR: added and recurse on the second argument, but that feels like stage1 material to me if we want to do that at all. 2025-01-23 Jakub Jelinek <jakub@redhat.com> PR middle-end/114877 * builtins.cc (fold_builtin_frexp): Handle rvc_nan and rvc_inf cases like rvc_zero, return passed in arg and set exp = 0. * gcc.dg/torture/builtin-frexp-1.c: Add -Wmaybe-uninitialized as dg-additional-options. (bar): New function. (TESTIT_FREXP2): Rework the macro so that it doesn't test whether nothing has been stored to what the second argument points to, but instead that something has been stored there, whatever it is. (main): Temporarily don't enable the nan tests for -O0.
2025-01-23	testsuite: Only run test if alarm is available	Torbjörn SVENSSON	5	-3/+8
	Most baremetal toolchains will not have an implementation for alarm and sigaction as they are target specific. For arm-none-eabi with newlib, function signatures are exposed, but there is no implementation and thus the test cases cause an undefined symbol link error. gcc/testsuite/ChangeLog: * gcc.dg/pr78185.c: Remove dg-do and replace with dg-require-effective-target of signal and alarm. * gcc.dg/pr116906-1.c: Likewise. * gcc.dg/pr116906-2.c: Likewise. * gcc.dg/vect/pr101145inf.c: Use effective-target alarm. * gcc.dg/vect/pr101145inf_1.c: Likewise. * lib/target-supports.exp (check_effective_target_alarm): New. gcc/ChangeLog: * doc/sourcebuild.texi (Effective-Target Keywords): Document 'alarm'. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2025-01-23	tree-optimization/118558 - fix alignment compute with VMAT_CONTIGUOUS_REVERSE	Richard Biener	1	-0/+15
	There are calls to dr_misalignment left that do not correct for the offset (which is vector type dependent) when the stride is negative. Notably vect_known_alignment_in_bytes does not allow such an offset to be passed through, which the following adds (computing the offset in vect_known_alignment_in_bytes would be possible as well, but the offset can be shared as seen). Eventually this function could go away. This leads to peeling for gaps not being considered and shortening of the access not being applied, which is what fixes the testcase on x86_64. PR tree-optimization/118558 * tree-vectorizer.h (vect_known_alignment_in_bytes): Pass through offset to dr_misalignment. * tree-vect-stmts.cc (get_group_load_store_type): Compute offset applied for negative stride and use it when querying alignment of accesses. (vectorizable_load): Likewise. * gcc.dg/vect/pr118558.c: New testcase.
2025-01-22	d,ada/spec: only sub nostd{inc,lib} rather than nostd{inc,lib}*	Arsen Arsenović	1	-0/+4
	This prevents the gcc driver from erroneously accepting -nostdlib++ when Ada is enabled. Also, similarly, -nostdinc* (where * is nonempty) is unhandled by either the Ada or D compiler, so the spec should not substitute those either (thanks for pointing that out, Jakub). Brought to my attention by Michał Górny <mgorny@gentoo.org>. gcc/ada/ChangeLog: * gcc-interface/lang-specs.h: Replace %{nostdinc*} %{nostdlib*} with %{nostdinc} %{nostdlib}. gcc/d/ChangeLog: * lang-specs.h: Replace %{nostdinc*} with %{nostdinc}. gcc/testsuite/ChangeLog: * gcc.dg/driver-nostdlibstar.c: New test.
2025-01-21	match: Improve the `x ==/!= ~x` pattern [PR118483]	Andrew Pinski	4	-0/+61
	This improves this pattern in two ways: * Allow for an optional convert, similar to how the few other `a OP ~a` patterns also allow for an optional convert. * Use bitwise_inverted_equal_p/maybe_bit_not instead of directly matching bit_not, just like the other patterns do. Note pr118483-2.c used to be optimized for aarch64-linux-gnu with GCC 4.9.4 on the RTL level even though the gimple level was missing it. PR tree-optimization/118483 gcc/ChangeLog: * match.pd (`x ==/!= ~x`): Allow for an optional convert and use bitwise_inverted_equal_p/maybe_bit_not instead of directly matching bit_not. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr118483-1.c: New test. * gcc.dg/tree-ssa/pr118483-2.c: New test. * gcc.dg/tree-ssa/pr118483-3.c: New test. * gcc.dg/tree-ssa/pr118483-4.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
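The property the pattern relies on is that a value and its bitwise complement differ in every bit, so they can never compare equal. A minimal sketch in C follows; the functions are illustrative and are not taken from the pr118483 testcases.

```c
#include <limits.h>

/* Always 0: x and ~x differ in every bit.  */
int eq_inverted(unsigned x)
{
    return x == ~x;
}

/* Always 1, for the same reason.  */
int ne_inverted(unsigned x)
{
    return x != ~x;
}

/* Same comparison with a conversion applied to the operand first;
   the result is still always 0.  */
int eq_inverted_conv(int a)
{
    return (unsigned)a == ~(unsigned)a;
}
```

A compiler that recognizes these shapes can fold each function to a constant return value.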
2025-01-21	testsuite: Require int32plus for test case pr117546.c	Dimitar Dimitrov	1	-1/+1
	Test case is valid even if size of int is more than 32 bits. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr117546.c: Require effective target int32plus. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-01-21	testsuite: Add testcase for already fixed PR [PR118560]	Jakub Jelinek	1	-0/+17
	The fix for this PR has been committed without a testcase. The following testcase would take at least 15 minutes to compile on a fast machine (powerpc64-linux both -m32 or -m64), now it takes 100ms. 2025-01-21 Jakub Jelinek <jakub@redhat.com> PR target/118560 * gcc.dg/dfp/pr118560.c: New test.
2025-01-21	vect: Force alignment peeling to vectorize more early break loops [PR118211]: update 'gcc.dg/vect/vect-switch-search-line-fast.c' for GCN	Thomas Schwinge	1	-2/+2
	PR tree-optimization/118211 PR tree-optimization/116126 gcc/testsuite/ * gcc.dg/vect/vect-switch-search-line-fast.c: Update for GCN.
2025-01-21	tree-optimization/118569 - LC SSA broken after unrolling	Richard Biener	1	-0/+36
	The following amends the previous fix to mark all of the loop's BBs as needing to be scanned for new LC PHI uses when its nesting parents changed, noticing that one caller of fix_loop_placement was already doing that. So the following moves this code into fix_loop_placement, covering both callers now. PR tree-optimization/118569 * cfgloopmanip.cc (fix_loop_placement): When the loop's nesting parents changed, mark all blocks to be scanned for LC PHI uses. (fix_bb_placements): Remove code moved into fix_loop_placement. * gcc.dg/torture/pr118569.c: New testcase.
2025-01-20	arm, testsuite: fix fast-math-bb-slp-complex-mla-float.c dg-add-options	Christophe Lyon	1	-1/+1
	The test uses floats, not fp16 so it should use arm_v8_3a_complex_neon instead of arm_v8_3a_fp16_complex_neon. This makes it PASS on arm-linux-gnueabihf instead of being UNRESOLVED. gcc/testsuite/ChangeLog: * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Use arm_v8_3a_complex_neon.
2025-01-20	arm, testsuite: remove duplicate dg-add-options arm_v8_3a_complex_neon	Christophe Lyon	2	-2/+0
	These two testcases have twice the same dg-add-options arm_v8_3a_complex_neon, the patch removes one of them. gcc/testsuite/ChangeLog: * gcc.dg/vect/complex/complex-operations-run.c: Remove duplicate dg-add-options arm_v8_3a_complex_neon. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: Likewise.
2025-01-20	tree-optimization/118552 - failed LC SSA update after unrolling	Richard Biener	1	-0/+34
	When unrolling changes the nesting relationship of loops we fail to mark blocks as in need of change for the LC SSA update. Specifically the LC SSA PHI on a former inner loop exit might be misplaced if that loop becomes a sibling of its outer loop. PR tree-optimization/118552 * cfgloopmanip.cc (fix_loop_placement): Properly mark exit source blocks as to be scanned for LC SSA update when the loop's nesting relationship changed. (fix_loop_placements): Adjust. (fix_bb_placements): Likewise. * gcc.dg/torture/pr118552.c: New testcase.
2025-01-20	tree-ssa-dce: Fix calloc handling [PR118224]	Jakub Jelinek	1	-0/+2
	As reported by Dimitar, this should have been a multiplication, but wasn't caught because in the test (~(__SIZE_TYPE__) 0) / 2 is the largest accepted size and so adding 3 to it also resulted in "overflow". The following patch adds one subtest to really verify it is a multiplication and fixes the operation. 2025-01-20 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118224 * tree-ssa-dce.cc (is_removable_allocation_p): Multiply a1 by a2 instead of adding it. * gcc.dg/pr118224.c: New test.
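Why the original subtest could not tell addition from multiplication can be reproduced with a small stand-alone check. This is a sketch under assumptions, not the code in is_removable_allocation_p: the limit of SIZE_MAX / 2 is taken from the size mentioned in the message above, and both function names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit, based on the largest accepted size named above.  */
#define MAX_ACCEPTED (SIZE_MAX / 2)

/* A power of two whose square does not fit in size_t.  */
#define HALF_WIDTH_POW2 ((size_t)1 << (sizeof(size_t) * CHAR_BIT / 2))

/* Intended check: the total is nmemb * size, tested without overflow.  */
bool total_fits_mul(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > MAX_ACCEPTED / size)
        return false;   /* product would exceed the limit */
    return true;
}

/* Mistaken variant: treats the total as nmemb + size.  */
bool total_fits_add(size_t nmemb, size_t size)
{
    return nmemb <= MAX_ACCEPTED && size <= MAX_ACCEPTED - nmemb;
}
```

With arguments `SIZE_MAX / 2` and `3` both variants reject the request, which matches the description of the bug going unnoticed. Two equal operands of `HALF_WIDTH_POW2` separate them: the sum is small, but the product does not fit.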
2025-01-19	testsuite: Fixes for test case pr117546.c	Dimitar Dimitrov	1	-1/+3
	This test fails on AVR. Debugging the test on x86 host, I noticed that u in function s sometimes has value 16128. The "t <= 3 * u" expression in the same function results in signed integer overflow for targets with sizeof(int)=2. Fix by requiring int32 effective target. Also add return statement for the main function. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr117546.c: Require effective target int32. (main): Add return statement. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
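The overflow described above can be confirmed on any host by doing the arithmetic in a wider type. A minimal sketch, with int16_t standing in for a 16-bit int and a function name chosen for the example:

```c
#include <stdint.h>

/* Compute 3 * u exactly in 32 bits and report whether the result
   would still fit in a 16-bit signed integer.  */
int triple_fits_int16(int16_t u)
{
    int32_t exact = (int32_t)3 * u;
    return exact >= INT16_MIN && exact <= INT16_MAX;
}
```

For u = 16128 the exact product is 48384, which is above INT16_MAX (32767), so on a target with sizeof(int) == 2 the expression overflows.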
2025-01-18	Fix uniqueness of symtab_node::get_dump_name.	Michal Jires	2	-2/+2
	symtab_node::get_dump_name uses node order to identify nodes. Order is no longer unique because of Incremental LTO patches. This patch moves uid from cgraph_node node to symtab_node, so get_dump_name can use uid instead and get back unique dump names. In inlining passes, uid is replaced with more appropriate (more compact for indexing) summary id. Bootstrapped/regtested on x86_64-linux. Ok for trunk? gcc/ChangeLog: * cgraph.cc (symbol_table::create_empty): Move uid to symtab_node. (test_symbol_table_test): Change expected dump id. * cgraph.h (struct cgraph_node): Move uid to symtab_node. (symbol_table::register_symbol): Likewise. * dumpfile.cc (test_capture_of_dump_calls): Change expected dump id. * ipa-inline.cc (update_caller_keys): Use summary id instead of uid. (update_callee_keys): Likewise. * symtab.cc (symtab_node::get_dump_name): Use uid instead of order. gcc/testsuite/ChangeLog: * gcc.dg/live-patching-1.c: Change expected dump id. * gcc.dg/live-patching-4.c: Likewise.
2025-01-17	match.pd: Fix (FTYPE) N CMP (FTYPE) M optimization for GENERIC [PR118522]	Jakub Jelinek	1	-0/+11
	The last case of this optimization assumes that if 2 integral types have same precision and TYPE_UNSIGNED, then they are uselessly convertible. While that is very likely the case for GIMPLE, it is not the case for GENERIC, so the following patch adds there a convert so that the optimization produces also valid GENERIC. Without it we got (int) p == b where b had _BitInt(32) type, so incompatible types. 2025-01-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118522 * match.pd ((FTYPE) N CMP (FTYPE) M): Add convert, as in GENERIC integral types with the same precision and sign might actually not be compatible types. * gcc.dg/bitint-120.c: New test.
2025-01-17	tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds	Richard Biener	3	-5/+21
	The following makes niter analysis recognize a loop with an exit condition scanning over a STRING_CST. This is done via enhancing the force evaluation code rather than recognizing for example strlen (s) as number of iterations because it allows to handle some more cases. STRING_CSTs are easy to handle since nothing can write to them, also processing those should be cheap. I've refrained from handling anything besides char8_t. Note to avoid the -Warray-bounds diagnostic we have to either early unroll the loop (there's no final value replacement done, there's a PR for doing this as part of CD-DCE when possibly eliding a loop), or create a canonical IV so we can DCE the loads. The latter is what the patch does, also avoiding to repeatedly force-evaluate niters. This also makes final value replacement work again since now ivcanon is after it. There are some testsuite adjustments needed, in particular we now unroll some loops early, causing messages to appear in different passes but also vectorization no longer happening on outer loops. The changes mitigate that. PR tree-optimization/92539 * tree-ssa-loop-ivcanon.cc (tree_unroll_loops_completely_1): Also try force-evaluation if ivcanon did not yet run. (canonicalize_loop_induction_variables): When niter was computed constant by force evaluation add a canonical IV if we didn't unroll. * tree-ssa-loop-niter.cc (loop_niter_by_eval): When we don't find a proper PHI try if the exit condition scans over a STRING_CST and simulate that. * g++.dg/warn/Warray-bounds-pr92539.C: New testcase. * gcc.dg/tree-ssa/sccp-16.c: New testcase. * g++.dg/vect/pr87621.cc: Use larger power to avoid inner loop unrolling. * gcc.dg/vect/pr89440.c: Use larger loop bound to avoid inner loop unrolling. * gcc.dg/pr77975.c: Scan cunrolli dump and adjust.
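The loop shape in question, an exit condition that scans over a string literal, looks like the following. This is an illustrative function, not one of the new testcases, and whether a given GCC version computes the iteration count at compile time depends on the options used.

```c
/* The exit test reads successive characters of a string literal, so the
   number of iterations is fixed by the literal's contents.  */
int count_chars(void)
{
    const char *s = "hello";
    int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Since nothing can write to the literal, simulating the loop over its bytes gives the iteration count directly.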
2025-01-16	[testsuite] drop explicit run overrider in more dfp tests	Alexandre Oliva	2	-2/+0
	A few more dfp tests that recently got backported to gcc-14 override dfp.exp's selection of default action depending on dfprt. Let the default stand. for gcc/testsuite/ChangeLog * gcc.dg/dfp/pr102674.c: Use the default dg-do. * gcc.dg/dfp/pr43374.c: Likewise.
2025-01-16	[testsuite] rearrange requirements for dfp bitint run tests	Alexandre Oliva	12	-12/+16
	dfp.exp sets the default to compile when dfprt is not available, but some dfp bitint tests override the default without that requirement, and try to run even when dfprt is not available. Instead of overriding the default, rewrite the requirements so that they apply even when compiling, since the absence of bitint or of int128 would presumably cause compile failures. for gcc/testsuite/ChangeLog * gcc.dg/dfp/bitint-1.c: Rewrite requirements to retain dfprt. * gcc.dg/dfp/bitint-2.c: Likewise. * gcc.dg/dfp/bitint-3.c: Likewise. * gcc.dg/dfp/bitint-4.c: Likewise. * gcc.dg/dfp/bitint-5.c: Likewise. * gcc.dg/dfp/bitint-6.c: Likewise. * gcc.dg/dfp/bitint-7.c: Likewise. * gcc.dg/dfp/bitint-8.c: Likewise. * gcc.dg/dfp/int128-1.c: Likewise. * gcc.dg/dfp/int128-2.c: Likewise. * gcc.dg/dfp/int128-3.c: Likewise. * gcc.dg/dfp/int128-4.c: Likewise.
2025-01-16	OpenMP: Add C support for metadirectives and dynamic selectors.	Sandra Loosemore	1	-0/+15
	Additional shared C/C++ testcases are included in a subsequent patch in this series. gcc/c-family/ChangeLog PR middle-end/112779 PR middle-end/113904 * c-common.h (enum c_omp_directive_kind): Add C_OMP_DIR_META. (c_omp_expand_variant_construct): Declare. * c-gimplify.cc: Include omp-general.h. (genericize_omp_metadirective_stmt): New. (c_genericize_control_stmt): Add case for OMP_METADIRECTIVE. * c-omp.cc (c_omp_directives): Fix entries for metadirective. (c_omp_expand_variant_construct_r): New. (c_omp_expand_variant_construct): New. * c-pragma.cc (omp_pragmas): Add metadirective. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_METADIRECTIVE. gcc/c/ChangeLog PR middle-end/112779 PR middle-end/113904 * c-parser.cc (struct c_parser): Add omp_metadirective_state field. (c_parser_skip_to_end_of_block_or_statement): Add metadirective_p parameter and handle skipping over the parentheses in a "for" statement. (struct omp_metadirective_parse_data): New. (mangle_metadirective_region_label): New. (c_parser_label): Mangle label names in a metadirective body. (c_parser_statement_after_labels): Likewise. (c_parser_pragma): Handle PRAGMA_OMP_METADIRECTIVE. (c_parser_omp_context_selector): Allow arbitrary expressions in device_num and condition properties. (c_parser_omp_assumption_clauses): Handle C_OMP_DIR_META. (analyze_metadirective_body): New. (c_parser_omp_metadirective): New. gcc/testsuite/ PR middle-end/112779 * c-c++-common/gomp/declare-variant-2.c: Adjust expected output for C. * gcc.dg/gomp/metadirective-1.c: New. Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By: Sandra Loosemore <sandra@codesourcery.com>
2025-01-16	middle-end: Add early break conditions to vect-switch-search-line-fast.c [PR118451]	Tamar Christina	1	-0/+2
	When this test was added initially it didn't add the early break effective target tests. This means that the test was "passing" (as in, it was failing to vectorize) because many targets don't support early break. But the test should not have been run for these targets. When the vectorizer learned PFA the test started passing for 32-bit targets. I had adjusted the testcase but failed to notice the requirements were wrong. Thus this adds the extra guards, and on targets that don't support early break this test will move to UNSUPPORTED, which is what it should have been all along... gcc/testsuite/ChangeLog: PR testsuite/118451 * gcc.dg/vect/vect-switch-search-line-fast.c: Add early_break guards.
2025-01-16	forwprop: Ensure that shuffle masks are VECTOR_CSTs	Christoph Müllner	1	-0/+18
	As reported in PR118487, it is possible that the mask parameter of a __builtin_shuffle() is not a VECTOR_CST. If this is the case and checking is enabled then an ICE is triggered. Let's add a check to fix this issue. PR tree-optimization/118487 gcc/ChangeLog: * tree-ssa-forwprop.cc (recognise_vec_perm_simplify_seq): Ensure that shuffle masks are VECTOR_CSTs. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr118487.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2025-01-16	tree-optimization/115494 - PRE PHI translation and ranges	Richard Biener	1	-0/+24
	When we PHI translate dependent expressions we keep SSA defs in place of the translated expression in case the expression itself did not change even though its context did and thus the validity of ranges associated with it. That eventually leads to simplification errors given we violate the precondition that used SSA defs fed to vn_valueize are valid to use (including their associated ranges). The following makes sure to replace those with new representatives always, not only when the dependent expression translation changed it. The fix was originally discovered by Mikael Morin. PR tree-optimization/115494 * tree-ssa-pre.cc (phi_translate_1): Always generate a representative for translated dependent expressions. * gcc.dg/torture/pr115494.c: New testcase. Co-Authored-By: Mikael Morin <mikael@gcc.gnu.org>
2025-01-15	match: Simplify `1 >> x` into `x == 0` [PR102705]	Andrew Pinski	4	-5/+39
	In this PR we have a missed optimization: we fail to notice that `1 >> x` and `(1 >> x) ^ 1` can't be equal. There are a few ways of optimizing this; the easiest and simplest is to simplify `1 >> x` into just `x == 0`, as those are equivalent (if we ignore out-of-range values for x). We already have an optimization for `(1 >> X) !=/== 0`, so the only difference here is that we don't need the `!=/== 0` part to do the transformation. So this removes the `(1 >> X) !=/== 0` transformation and just adds a simplified `1 >> x` -> `x == 0` one. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/102705 gcc/ChangeLog: * match.pd (`(1 >> X) != 0`): Remove pattern. (`1 >> x`): New pattern. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr105832-2.c: Update testcase. * gcc.dg/tree-ssa/pr96669-1.c: Likewise. * gcc.dg/tree-ssa/pr102705-1.c: New test. * gcc.dg/tree-ssa/pr102705-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
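The equivalence behind the fold above can be checked with a small C sketch. This is only an illustration of the arithmetic, not the match.pd pattern itself; the function names `shift_form`, `compare_form` and `count_mismatches` are hypothetical. Shift counts at or beyond the width of `int` are undefined in C, so the loop only visits in-range values, matching the "ignore out-of-range values" caveat in the message.

```c
#include <assert.h>
#include <limits.h>

/* Left-hand side of the fold: the constant 1 shifted right by x. */
int shift_form(unsigned x)
{
  return 1 >> x;
}

/* Right-hand side of the fold: a plain comparison against zero. */
int compare_form(unsigned x)
{
  return x == 0;
}

/* Count in-range shift amounts for which the two forms disagree, or for
   which `1 >> x` equals `(1 >> x) ^ 1` (which the PR says cannot happen).  */
unsigned count_mismatches(void)
{
  unsigned width = sizeof(int) * CHAR_BIT;
  unsigned bad = 0;
  for (unsigned x = 0; x < width; x++)
    {
      if (shift_form(x) != compare_form(x))
        bad++;
      if (shift_form(x) == (shift_form(x) ^ 1))
        bad++;
    }
  return bad;
}

/* Convenience self-check; aborts if any in-range value disagrees.  */
void self_check(void)
{
  assert(count_mismatches() == 0);
}
```

Calling `self_check()` from a test driver exercises every valid shift count for `int` on the host.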
2025-01-15	middle-end: Fix incorrect type replacement in operands_equals [PR118472]	Tamar Christina	1	-0/+32
	In g:3c32575e5b6370270d38a80a7fa8eaa144e083d0 I made a mistake and incorrectly replaced the type of the arguments of an expression with the type of the expression. This is of course wrong. This reverts that change and I have also double checked the other replacements and they are fine. gcc/ChangeLog: PR middle-end/118472 * fold-const.cc (operand_compare::operand_equal_p): Fix incorrect replacement. gcc/testsuite/ChangeLog: PR middle-end/118472 * gcc.dg/pr118472.c: New test.
2025-01-15	ipa: Initialize/release global obstack in process_new_functions [PR116068]	Jakub Jelinek	1	-0/+26
	Other spots in cgraphunit.cc already call bitmap_obstack_initialize (NULL); before running a pass list and bitmap_obstack_release (NULL); after that, while process_new_functions wasn't doing that and with the new r15-130 bitmap_alloc checking that results in ICE. 2025-01-15 Jakub Jelinek <jakub@redhat.com> PR ipa/116068 * cgraphunit.cc (symbol_table::process_new_functions): Call bitmap_obstack_initialize (NULL); and bitmap_obstack_release (NULL) around processing the functions. * gcc.dg/graphite/pr116068.c: New test.
2025-01-14	[ifcombine] check and extend constants to compare with bitfields	Alexandre Oliva	2	-0/+84
	Add logic to check and extend constants compared with bitfields, so that fields are only compared with constants they could actually equal. This involves making sure the signedness doesn't change between loads and conversions before shifts: we'd need to carry a lot more data to deal with all the possibilities. for gcc/ChangeLog PR tree-optimization/118456 * gimple-fold.cc (decode_field_reference): Punt if shifting after changing signedness. (fold_truth_andor_for_ifcombine): Check extension bits in constants before clipping. for gcc/testsuite/ChangeLog PR tree-optimization/118456 * gcc.dg/field-merge-21.c: New. * gcc.dg/field-merge-22.c: New.
2025-01-14	match: Keep conditional in simplification to constant [PR118140].	Robin Dapp	1	-0/+27
	In PR118140 we simplify _ifc__33 = .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11); to 1: Match-and-simplified .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11) to 1 when _46 == 1. This happens by removing the conditional and applying a \| 1 = 1. Normally we re-introduce the conditional and its else value if needed but that does not happen here as we're not dealing with a vector type. For correctness's sake, we must not remove the conditional even for non-vector types. This patch re-introduces a COND_EXPR in such cases. For PR118140 this results in a non-vectorized loop. PR middle-end/118140 gcc/ChangeLog: * gimple-match-exports.cc (maybe_resimplify_conditional_op): Add COND_EXPR when we simplified to a scalar gimple value but still have an else value. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr118140.c: New test. * gcc.target/riscv/rvv/autovec/pr118140.c: New test.
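Why the conditional must survive can be seen in a scalar model of the operation. The sketch below assumes that `.COND_IOR (cond, a, b, else)` behaves like `cond ? (a | b) : else` on boolean operands; the helper names are hypothetical and this is not GCC's internal code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed scalar model of .COND_IOR: IOR when cond holds, else value otherwise. */
bool cond_ior(bool cond, bool a, bool b, bool else_val)
{
  return cond ? (bool) (a | b) : else_val;
}

/* Unsafe fold for b == true: drops the conditional and applies a | 1 = 1. */
bool folded_without_cond(void)
{
  return true;
}

/* Safe fold for b == true: simplify the IOR but keep the conditional. */
bool folded_with_cond(bool cond, bool else_val)
{
  return cond ? true : else_val;
}

/* Count inputs (with b fixed to true) where a fold differs from the model.
   use_safe selects which fold is compared.  */
int count_wrong(bool use_safe)
{
  int wrong = 0;
  for (int cond = 0; cond <= 1; cond++)
    for (int a = 0; a <= 1; a++)
      for (int e = 0; e <= 1; e++)
        {
          bool expect = cond_ior(cond, a, true, e);
          bool got = use_safe ? folded_with_cond(cond, e)
                              : folded_without_cond();
          if (got != expect)
            wrong++;
        }
  return wrong;
}

/* Convenience self-check; aborts if the safe fold ever disagrees.  */
void self_check(void)
{
  assert(count_wrong(true) == 0);
}
```

In this model the unsafe fold goes wrong exactly when the condition is false and the else value is false, which is the situation the re-introduced COND_EXPR covers.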
2025-01-13	c: improve UX for -Wincompatible-pointer-types (v3) [PR116871]	David Malcolm	3	-0/+114
	PR c/116871 notes that our diagnostics about incompatible function types could be improved. In particular, for the case of migrating to C23 I'm seeing a lot of build failures with signal handlers similar to this (simplified from alsa-tools-1.2.11, envy24control/profiles.c; see rhbz#2336278): typedef void (*__sighandler_t) (int); extern __sighandler_t signal (int __sig, __sighandler_t __handler) __attribute__ ((__nothrow__ , __leaf__)); void new_process(void) { void (*int_stat)(); int_stat = signal(2, ((__sighandler_t) 1)); signal(2, int_stat); } Before this patch, cc1 fails with this message: t.c: In function 'new_process': t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types] 18 \| int_stat = signal(2, ((__sighandler_t) 1)); \| ^ t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 20 \| signal(2, int_stat); \| ^~~~~~~~ \| \| \| void (*)(void) t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'void (*)(void)' 11 \| extern __sighandler_t signal (int __sig, __sighandler_t __handler) \| ~~~~~~~~~~~~~~~^~~~~~~~~ With this patch, cc1 emits: t.c: In function 'new_process': t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types] 18 \| int_stat = signal(2, ((__sighandler_t) 1)); \| ^ t.c:9:16: note: '__sighandler_t' declared here 9 \| typedef void (*__sighandler_t) (int); \| ^~~~~~~~~~~~~~ t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 20 \| signal(2, int_stat); \| ^~~~~~~~ \| \| \| void (*)(void) t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'void (*)(void)' 11 \| extern __sighandler_t signal (int __sig, __sighandler_t __handler) \| ~~~~~~~~~~~~~~~^~~~~~~~~ t.c:9:16: note: '__sighandler_t' 
declared here 9 \| typedef void (*__sighandler_t) (int); \| ^~~~~~~~~~~~~~ showing the location of the pertinent typedef ("__sighandler_t") Another example, simplified from a52dec-0.7.4: src/a52dec.c (rhbz#2336013): typedef void (*__sighandler_t) (int); extern __sighandler_t signal (int __sig, __sighandler_t __handler) __attribute__ ((__nothrow__ , __leaf__)); /* Mismatching return type. */ static RETSIGTYPE signal_handler (int sig) { } static void print_fps (int final) { signal (42, signal_handler); } Before this patch, cc1 emits: t2.c: In function 'print_fps': t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 22 \| signal (42, signal_handler); \| ^~~~~~~~~~~~~~ \| \| \| int (*)(int) t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'int (*)(int)' 11 \| extern __sighandler_t signal (int __sig, __sighandler_t __handler) \| ~~~~~~~~~~~~~~~^~~~~~~~~ With this patch cc1 emits: t2.c: In function 'print_fps': t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 22 \| signal (42, signal_handler); \| ^~~~~~~~~~~~~~ \| \| \| int (*)(int) t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'int (*)(int)' 11 \| extern __sighandler_t signal (int __sig, __sighandler_t __handler) \| ~~~~~~~~~~~~~~~^~~~~~~~~ t2.c:16:19: note: 'signal_handler' declared here 16 \| static RETSIGTYPE signal_handler (int sig) \| ^~~~~~~~~~~~~~ t2.c:9:16: note: '__sighandler_t' declared here 9 \| typedef void (*__sighandler_t) (int); \| ^~~~~~~~~~~~~~ showing the location of the pertinent fndecl ("signal_handler"), and, as before, the pertinent typedef. The patch also updates the colorization in the messages to visually link and contrast the different types and typedefs. My hope is that this makes it easier for users to decipher build failures seen with the new C23 default. 
Further improvements could be made to colorization in convert_for_assignment, and similar improvements to C++, but I'm punting those to GCC 16. gcc/c/ChangeLog: PR c/116871 * c-typeck.cc (pedwarn_permerror_init): Return bool for whether a warning was emitted. Only call print_spelling if we warned. (pedwarn_init): Return bool for whether a warning was emitted. (permerror_init): Likewise. (warning_init): Return bool for whether a warning was emitted. Only call print_spelling if we warned. (class pp_element_quoted_decl): New. (maybe_inform_typedef_location): New. (convert_for_assignment): For OPT_Wincompatible_pointer_types, move auto_diagnostic_group to cover all cases. Use %e and pp_element rather than %qT and tree to colorize the types. Capture whether a warning was emitted, and, if it was, show various notes: for a pointer to a function, show the function decl, for typedef types, and show the decls. gcc/testsuite/ChangeLog: PR c/116871 * gcc.dg/c23-mismatching-fn-ptr-a52dec.c: New test. * gcc.dg/c23-mismatching-fn-ptr-alsatools.c: New test. * gcc.dg/c23-mismatching-fn-ptr.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
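For readers hitting these errors, the usual source-level fix is to spell out the `int` parameter in both the handler and any pointer that stores it. The sketch below is an assumption about typical user code, not part of the patch; `on_signal`, `last_signal` and `install_and_raise` are hypothetical names, and only the standard `<signal.h>` interfaces are used.

```c
#include <assert.h>
#include <signal.h>

/* Records the most recently delivered signal number. */
static volatile sig_atomic_t last_signal = 0;

/* Handler with the exact type signal() expects: void (*)(int). */
void on_signal(int sig)
{
  last_signal = sig;
}

/* Install the handler, deliver SIGINT to ourselves, then restore the
   previous disposition.  Returns the signal number seen, or -1 on error.  */
int install_and_raise(void)
{
  /* Under C23 an empty parameter list means (void), so the pointer is
     declared with an explicit int parameter.  */
  void (*previous)(int) = signal(SIGINT, on_signal);
  if (previous == SIG_ERR)
    return -1;

  raise(SIGINT);
  signal(SIGINT, previous);
  return (int) last_signal;
}

/* Convenience self-check; aborts if the handler did not run.  */
void self_check(void)
{
  assert(install_and_raise() == SIGINT);
}
```

Declaring the saved pointer as `void (*previous)(int)` rather than `void (*previous)()` is the change that keeps the assignment and the later `signal` call type-compatible.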
2025-01-13	[ifcombine] propagate signbit mask to XOR right-hand operand	Alexandre Oliva	1	-0/+64
	If a single-bit bitfield takes up the sign bit of a storage unit, comparing the corresponding bitfield between two objects loads the storage units, XORs them, converts the result to signed char, and compares it with zero: ((signed char)(a.<byte> ^ c.<byte>) >= 0). fold_truth_andor_for_ifcombine recognizes the compare with zero as a sign bit test, then it decomposes the XOR into an equality test. The problem is that, after this decomposition, that figures out the width of the accessed fields, we apply the sign bit mask to the left-hand operand of the compare, but we failed to also apply it to the right-hand operand when both were taken from the same XOR. This patch fixes that. for gcc/ChangeLog PR tree-optimization/118409 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Apply the signbit mask to the right-hand XOR operand too. for gcc/testsuite/ChangeLog PR tree-optimization/118409 * gcc.dg/field-merge-20.c: New.
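The sign-bit trick described above can be modelled on plain bytes. This sketch assumes 8-bit `unsigned char` and GCC's wrapping conversion to `signed char` (implementation-defined in ISO C); the function names are hypothetical, and the "left only" variant merely imitates the shape of the bug rather than reproducing GCC's internal code.

```c
#include <assert.h>

/* Sign-bit form: XOR the storage units, view the result as signed char
   and compare with zero.  True when the top bits of a and c match.  */
int same_top_bit_xor(unsigned char a, unsigned char c)
{
  return (signed char) (a ^ c) >= 0;
}

/* Decomposed form with the sign-bit mask applied to BOTH operands. */
int same_top_bit_masked(unsigned char a, unsigned char c)
{
  return (a & 0x80) == (c & 0x80);
}

/* Illustrative buggy form: the mask is applied to the left operand only. */
int same_top_bit_left_only(unsigned char a, unsigned char c)
{
  return (a & 0x80) == c;
}

/* Count byte pairs where the given decomposition disagrees with the
   sign-bit form.  */
long count_disagreements(int (*decomposed)(unsigned char, unsigned char))
{
  long bad = 0;
  for (int a = 0; a < 256; a++)
    for (int c = 0; c < 256; c++)
      if (same_top_bit_xor((unsigned char) a, (unsigned char) c)
          != decomposed((unsigned char) a, (unsigned char) c))
        bad++;
  return bad;
}

/* Convenience self-check; aborts if the fully masked form ever disagrees. */
void self_check(void)
{
  assert(count_disagreements(same_top_bit_masked) == 0);
}
```

Masking both sides agrees with the XOR form for every byte pair, while masking only one side lets the other bits of the right-hand storage unit leak into the comparison.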
2025-01-13	tree-optimization/117119 - ICE with int128 IV in dataref analysis	Richard Biener	1	-0/+10
	Here's another fix for a missing check that an IV value fits in a HWI. It's originally from Stefan. PR tree-optimization/117119 * tree-data-ref.cc (initialize_matrix_A): Check whether an INTEGER_CST fits in HWI, otherwise return chrec_dont_know. * gcc.dg/torture/pr117119.c: New testcase. Co-Authored-By: Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
2025-01-12	[PATCH] crc: Fix up some crc related wrong code issues [PR117997, PR118415]	Jakub Jelinek	6	-13/+153
	Hi! As mentioned in the second PR, using table names like crc_table_for_crc_8_polynomial_0x12 in the user namespace is wrong, a user could have defined such variables in their code and as can be seen on the last testcase, then it just misbehaves. At minimum such names should start with 2 underscores, moving it into implementation namespace, and if possible have some dot or dollar in the name if target supports it. I think assemble_crc_table right now always emits tables as local variables, I really don't see what would be setting TREE_PUBLIC flag on IDENTIFIER_NODEs. It might be nice to share the tables between TUs in the same binary or shared library, but in that case it should have hidden visibility if possible, so that it isn't exported from the libraries or binaries, we don't want the optimization to affect the set of exported symbols from libraries. And, as can be seen in the first PR, building gen_rtx_SYMBOL_REF by hand is certainly unexpected on some targets, e.g. those which use -fsection-anchors, so we should instead use DECL_RTL of the VAR_DECL. For that we'd need to look it up if we haven't emitted it already, while IDENTIFIER_NODEs can be looked up easily, I guess for the VAR_DECLs we'd need a custom hash table. Now, all of the above (except sharing between multiple TUs) is already implemented in output_constant_def, so I think it is much better to just use that function. And, if we want to share it between multiple TUs, we could extend the SHF_MERGE usage in gcc, currently we only use it for constant pool entries with same size as alignment, from 1 to 32 bytes, using .rodata.cstN sections. We could just use say .rodata.cstM.N sections where M would be alignment and N would be the entity size. We could use that for all constant pool entries say up to 2048 bytes. Though, as the current code doesn't share between multiple TUs, I think it can be done incrementally (either still for GCC 15, or GCC 16+). 
Bootstrapped/regtested on {x86_64,i686,aarch64,powerpc64le,s390x}-linux, on aarch64 it also fixes -FAIL: crypto/rsa -FAIL: hash ok for trunk? gcc/ PR tree-optimization/117997 PR middle-end/118415 * expr.cc (assemble_crc_table): Make static, remove id argument, use output_constant_def. Emit note if -fdump-rtl-expand-details about which table has been emitted. (generate_crc_table): Make static, adjust assemble_crc_table caller, call it always. (calculate_table_based_CRC): Make static. * internal-fn.cc (expand_crc_optab_fn): Emit note if -fdump-rtl-expand-details about using optab for crc. Formatting fix. gcc/testsuite/ * gcc.dg/crc-builtin-target32.c: Add -fdump-rtl-expand-details as dg-additional-options. Scan expand dump rather than assembly, adjust the regexps. * gcc.dg/crc-builtin-target64.c: Likewise. * gcc.dg/crc-builtin-rev-target32.c: Likewise. * gcc.dg/crc-builtin-rev-target64.c: Likewise. * gcc.dg/pr117997.c: New test. * gcc.dg/pr118415.c: New test.
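For context on what such a CRC table holds, here is a minimal user-level sketch of a table-driven CRC-8 next to its bit-by-bit equivalent. It assumes an MSB-first CRC with no reflection or final XOR, and borrows the polynomial 0x12 only because it appears in the table name quoted above; none of the function names come from GCC, and this is not how expr.cc generates or emits its tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance an MSB-first CRC-8 by one input byte, one bit at a time. */
uint8_t crc8_step_bitwise(uint8_t crc, uint8_t byte, uint8_t poly)
{
  crc ^= byte;
  for (int bit = 0; bit < 8; bit++)
    crc = (crc & 0x80) ? (uint8_t) ((crc << 1) ^ poly) : (uint8_t) (crc << 1);
  return crc;
}

/* Precompute the 256-entry table: entry i is the CRC of the single byte i
   starting from a zero CRC.  */
void crc8_make_table(uint8_t table[256], uint8_t poly)
{
  for (int i = 0; i < 256; i++)
    table[i] = crc8_step_bitwise(0, (uint8_t) i, poly);
}

/* Bit-by-bit CRC over a buffer. */
uint8_t crc8_bitwise(const uint8_t *data, size_t len, uint8_t poly)
{
  uint8_t crc = 0;
  for (size_t i = 0; i < len; i++)
    crc = crc8_step_bitwise(crc, data[i], poly);
  return crc;
}

/* Table-driven CRC over a buffer: one lookup per input byte. */
uint8_t crc8_table(const uint8_t *data, size_t len, const uint8_t table[256])
{
  uint8_t crc = 0;
  for (size_t i = 0; i < len; i++)
    crc = table[crc ^ data[i]];
  return crc;
}

/* Returns 1 when both implementations agree on a sample message. */
int crc8_forms_agree(void)
{
  static const uint8_t msg[] = "123456789";
  uint8_t table[256];
  crc8_make_table(table, 0x12);
  return crc8_bitwise(msg, 9, 0x12) == crc8_table(msg, 9, table);
}

/* Convenience self-check; aborts if the two forms disagree. */
void self_check(void)
{
  assert(crc8_forms_agree());
}
```

The table here is a local array, so it cannot collide with a user symbol, which is the naming concern the commit message raises for compiler-generated tables.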