Age | Commit message | Author | Files | Lines
2025-06-22[RISC-V][PR target/119830] Fix RISC-V codegen on 32bit hostsAndrew Pinski2-3/+16
So this is Andrew's patch from the PR. We weren't clean for a 32bit host in some of the arithmetic for constant synthesis. I confirmed the bug on a 32bit linux host, then confirmed that Andrew's patch from the PR fixes the problem, then ran Andrew's patch through my tester successfully. Naturally I'll wait for pre-commit testing, but I'm not expecting problems. PR target/119830 gcc/ * config/riscv/riscv.cc (riscv_build_integer_1): Make arithmetic in bclr case clean for 32 bit hosts. gcc/testsuite/ * gcc.target/riscv/pr119830.c: New test.
2025-06-22[committed][PR rtl-optimization/120550] Drop REG_EQUAL note after ext-dce transformationJeff Law1-0/+5
This bug was found by Edwin's fuzzing efforts on RISC-V, though it likely affects other targets. In simplest terms, when ext-dce converts an extension into a (possibly simplified) subreg copy, it may make an attached REG_EQUAL note invalid. In the case Edwin found, the note was an extension, but I don't think that would necessarily always be the case. The note could have other forms which potentially need invalidation. So the safest thing to do is just remove any attached REG_EQUAL or REG_EQUIV note. Note that adjusting Edwin's testcase in the obvious way to avoid having to interpret printf output for pass/fail status makes the bug go latent. That's why no testcase is included with this patch. Bootstrapped and regression tested on x86_64. Obviously also verified it fixes the testcase Edwin filed. This is a good candidate for cherry-picking to the gcc-15 release branch after simmering on the trunk a bit. PR rtl-optimization/120550 gcc/ * ext-dce.cc (ext_dce_try_optimize_insn): Drop REG_EQUAL/REG_EQUIV notes on modified insns.
2025-06-22xtensa: Make use of DEPBITS instructionTakayuki 'January June' Suwa2-1/+21
This patch implements the bitfield insertion MD pattern using the DEPBITS machine instruction, the counterpart of the EXTUI instruction, if available.

/* example */
struct foo {
  unsigned int b:10;
  unsigned int r:11;
  unsigned int g:11;
};
void test(struct foo *p) {
  p->g >>= 1;
}

;; result (endianness: little)
test:
	entry	sp, 32
	l32i.n	a8, a2, 0
	extui	a9, a8, 1, 10
	depbits	a8, a9, 0, 11
	s32i.n	a8, a2, 0
	retw.n

gcc/ChangeLog: * config/xtensa/xtensa.h (TARGET_DEPBITS): New macro. * config/xtensa/xtensa.md (insvsi): New insn pattern.
2025-06-22xtensa: Implement TARGET_ZERO_CALL_USED_REGSTakayuki 'January June' Suwa1-0/+56
This patch implements the target-specific ZERO_CALL_USED_REGS hook, since with -fzero-call-used-regs=all the default hook tries to assign 0 to B0 (bit 0 of the BR register), which triggers an ICE. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_zero_call_used_regs): New prototype and function. (TARGET_ZERO_CALL_USED_REGS): Define macro.
2025-06-22Fix some problems with afdo propagationJan Hubicka1-9/+38
This patch fixes problems I noticed by exploring profiles of some hot functions in GCC. In particular, the propagation sometimes changed a precise 0 to an afdo 0 for paths calling abort, and sometimes we could propagate more when we accept that some paths have a 0 count. Finally, there was an important bug in computing all_known which resulted in BB probabilities being quite broken after afdo. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * auto-profile.cc (update_count_by_afdo_count): Make static; add variant accepting profile_count. (afdo_find_equiv_class): Use update_count_by_afdo_count. (afdo_propagate_edge): Likewise. (afdo_propagate): Likewise. (afdo_calculate_branch_prob): Fix handling of all_known. (afdo_annotate_cfg): Annotate by 0 where both afdo and static profiles agree.
2025-06-22Handle functions with 0 profile in auto-profileJan Hubicka1-12/+57
This is the last part of the infrastructure to allow functions with local profiles and 0 global autofdo counts. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * auto-profile.cc (afdo_set_bb_count): Dump inline stacks and reasons when lookup failed. (afdo_set_bb_count): Record info about BBs with zero AFDO count. (afdo_annotate_cfg): Set profile to global0_afdo if there are no samples in profile.
2025-06-22[PR modula2/120731] error in Strings.Pos causing sigsegvGaius Mulley3-28/+69
This patch corrects the m2log library procedure function Strings.Pos, which sliced the wrong component of the source string. The incorrect slice could cause a sigsegv if negative slice indices were generated. gcc/m2/ChangeLog: PR modula2/120731 * gm2-libs-log/Strings.def (Delete): Rewrite comment. * gm2-libs-log/Strings.mod (Pos): Rewrite. (PosLower): New procedure function. gcc/testsuite/ChangeLog: PR modula2/120731 * gm2/pimlib/logitech/run/pass/teststrings.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-06-22Prevent possible overflows in ipa-profileJan Hubicka1-8/+16
The bug in scaling the profile of fnsplit-produced clones made some afdo counts during gcc bootstrap very large (2^59). This made computations in ipa-profile silently overflow, which caused the hot count to be identified as 1 instead of a sane value. While fixing the fnsplit bug prevents the overflow, I think the histogram code should be made safe too. sreal is not very fitting here since its mantissa is 32bit and not very good for additions of many numbers which are possibly of very different orders. So I use widest_int, though 128bit arithmetic would be safe (we are summing 60 bit counts multiplied by time estimates). I don't think we have a readily available 128bit type, and the code is not really time critical since the histogram is computed once. gcc/ChangeLog: * ipa-profile.cc (ipa_profile): Use widest_int to avoid possible overflows.
2025-06-22Scale up auto-profile countsJan Hubicka1-8/+38
This patch makes auto-profile counts scale up when the train run has too small a maximal count. This still happens on moderately sized train runs such as SPEC2017 ref datasets and helps to avoid rounding errors in cases where the number of samples is small. gcc/ChangeLog: * auto-profile.cc (autofdo::afdo_count_scale): New. (autofdo_source_profile::update_inlined_ind_target): Scale counts. (autofdo_source_profile::read): Set scale and dump statistics. (afdo_indirect_call): Scale. (afdo_set_bb_count): Scale. (afdo_find_equiv_class): Fix dumps. (afdo_annotate_cfg): Scale.
2025-06-22Add GUESSED_GLOBAL0_AFDOJan Hubicka4-71/+133
This patch adds the GUESSED_GLOBAL0_AFDO profile quality. It can be used to preserve local counts of functions which have a 0 AFDO profile. I originally did not include it as it was not clear it would be useful, and it turns quality from 3 bits to 4 bits, which means that we need to steal another bit from the actual counters. It is likely not a problem for profile_count since counting up to 2^60 still takes a while. However, with profile_probability I ran into a problem: gimple FE testcases encode probability with the current representation, and thus changing profile_probability::n_bits would require updating all testcases. Since probabilities never use GLOBAL0 qualities (they are always local), adding bits is not necessary, but encoding the quality in the data type required adding accessors. While working on this I also noticed that GUESSED_GLOBAL0 and GUESSED_GLOBAL0_ADJUSTED are misordered. Qualities should be in increasing order. This is also fixed. auto-profile will be updated later. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: * cgraph.cc (cgraph_node::make_profile_global0): Support GUESSED_GLOBAL0_AFDO. * ipa-cp.cc (update_profiling_info): Use GUESSED_GLOBAL0_AFDO. * profile-count.cc (profile_probability::dump): Use quality (). (profile_probability::stream_in): Use m_adjusted_quality. (profile_probability::stream_out): Use m_adjusted_quality. (profile_count::combine_with_ipa_count): Use quality (). (profile_probability::sqrt): Likewise. * profile-count.h (enum profile_quality): Add GUESSED_GLOBAL0_AFDO; reorder GUESSED_GLOBAL0_ADJUSTED and GUESSED_GLOBAL0. (profile_probability): Add min_quality; replace m_quality by m_adjusted_quality; add set_quality; update all users of quality. (profile_count): Set n_bits to 60; make m_quality 4 bits; update uses of quality. (profile_count::afdo_zero, profile_count::global0afdo): New.
2025-06-22Daily bump.GCC Administrator4-1/+89
2025-06-21Fix profile after fnsplitJan Hubicka1-4/+6
When splitting functions, tree-inline correctly determined the entry count of the new function part, but then, in case the entry block of the new function part is in a loop, it scaled the body, which is not supposed to happen. * tree-inline.cc (copy_cfg_body): Fix profile of split functions.
2025-06-21[modula2] Comment tidyup in gm2-compiler/M2GCCDeclare.modGaius Mulley1-10/+6
This patch reformats three comments in the GNU GCC style. gcc/m2/ChangeLog: * gm2-compiler/M2GCCDeclare.mod (StartDeclareModuleScopeSeparate): Reformat statement comments. (StartDeclareModuleScopeWholeProgram): Ditto. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-06-21[RISC-V][PR target/118241] Fix data prefetch predicate/constraint for RISC-VJeff Law4-2/+34
The RISC-V prefetch support is broken in a few ways. This addresses the data side prefetch problems. I'd mistakenly thought this BZ was prefetch.i related (which has deeper problems). The basic problem is we were accepting any valid address when in fact there are restrictions. This patch more precisely defines the predicate such that we allow REG or REG+D, where D must have the low 5 bits clear. Note that absolute addresses fall into the REG+D form using x0 for the register operand since it always has the value zero. The test verifies REG, REG+D, ABS addressing modes that are valid as well as REG+D and ABS which must be reloaded into a REG because the displacement has low bits set. An earlier version of this patch has gone through testing in my tester on rv32 and rv64. Obviously I'll wait for pre-commit CI to do its thing before moving forward. This is a good backport candidate after simmering on the trunk for a bit. PR target/118241 gcc/ * config/riscv/predicates.md (prefetch_operand): New predicate. * config/riscv/constraints.md (Q): New constraint. * config/riscv/riscv.md (prefetch): Use new predicate and constraint. (riscv_prefetchi_<mode>): Similarly. gcc/testsuite/ * gcc.target/riscv/pr118241.c: New test.
2025-06-21value-range: Use int instead of uint for wi::ctz result [PR120746]Jakub Jelinek1-1/+1
uint is some compatibility type in glibc sys/types.h enabled in misc/GNU modes, so it doesn't exist on many hosts. Furthermore, wi::ctz returns int rather than unsigned and the var is only used in comparison to zero or as second argument of left shift, so I think just using int instead of unsigned is better. 2025-06-21 Jakub Jelinek <jakub@redhat.com> PR middle-end/120746 * value-range.cc (irange::snap): Use int type instead of uint.
2025-06-21Extend afdo inliner to introduce speculative callsJan Hubicka6-246/+357
This patch makes AFDO's VPT happen during early inlining. This should make the einline pass inside the afdo pass unnecessary, but some inlining still happens there - I will need to debug why that happens and will try to drop afdo's inliner incrementally.

get_inline_stack_in_node can now be used to produce an inline stack out of callgraph nodes which are marked as inline clones, so we do not need to iterate tree-inline and IPA decision phases like the old code did. I also added some debug facilities - dumping of decisions and inline stacks, so one can match them with data in the gcov profile.

The former VPT pass identified all cases where in the train run an indirect call was inlined and the inlined callee collected some samples. In this case it forced the inline without doing any checks, such as whether inlining is possible. The new code simply introduces speculative edges into the callgraph and lets afdo inlining decide.

The old code also marked statements that were introduced during promotion to prevent doing double speculation, i.e.

if (ptr == foo)
  .. inlined foo ...
else
  ptr ();

to

if (ptr == foo)
  .. inlined foo ...
else if (ptr == foo)
  foo (); // for IPA inlining
else
  ptr ();

Since inlining now happens much earlier, tracking the statements would be quite hard. Instead I simply remove the targets from the profile data, which should have the same effect.

I also noticed that there is nothing setting max_count, so all non-0 profile is considered hot, which I fixed too.

Training with the ref run I now get:

500.perlbench_r    1  160   9.93 *   1  162   9.84 *
502.gcc_r          NR                NR
505.mcf_r          1  186   8.68 *   1  194   8.34 *
520.omnetpp_r      1  183   7.15 *   1  208   6.32 *
523.xalancbmk_r    NR                NR
525.x264_r         1  85.2  20.5 *   1  85.8  20.4 *
531.deepsjeng_r    1  165   6.93 *   1  176   6.51 *
541.leela_r        1  268   6.18 *   1  282   5.87 *
548.exchange2_r    1  86.3  30.4 *   1  88.9  29.5 *
557.xz_r           1  224   4.81 *   1  224   4.82 *
Est. SPECrate2017_int_base  9.72
Est. SPECrate2017_int_peak  9.33

503.bwaves_r       NR                NR
507.cactuBSSN_r    1  107   11.9 *   1  105   12.0 *
508.namd_r         1  108   8.79 *   1  116   8.18 *
510.parest_r       1  143   18.3 *   1  156   16.8 *
511.povray_r       1  188   12.4 *   1  163   14.4 *
519.lbm_r          1  72.0  14.6 *   1  75.0  14.1 *
521.wrf_r          1  106   21.1 *   1  106   21.1 *
526.blender_r      1  147   10.3 *   1  147   10.4 *
527.cam4_r         1  110   15.9 *   1  118   14.8 *
538.imagick_r      1  104   23.8 *   1  105   23.7 *
544.nab_r          1  146   11.6 *   1  143   11.8 *
549.fotonik3d_r    1  134   29.0 *   1  169   23.1 *
554.roms_r         1  86.6  18.4 *   1  89.3  17.8 *
Est. SPECrate2017_fp_base  15.4
Est. SPECrate2017_fp_peak  14.9

Base is without profile feedback and peak is AFDO.

gcc/ChangeLog: * auto-profile.cc (dump_inline_stack): New function. (get_inline_stack_in_node): New function. (get_relative_location_for_stmt): Add FN parameter. (has_indirect_call): Remove. (function_instance::find_icall_target_map): Add FN parameter. (function_instance::remove_icall_target): New function. (function_instance::read_function_instance): Set sum_max. (autofdo_source_profile::get_count_info): Add NODE parameter. (autofdo_source_profile::update_inlined_ind_target): Add NODE parameter. (autofdo_source_profile::remove_icall_target): New function. (afdo_indirect_call): Add INDIRECT_EDGE parameter; dump reason for failure; do not check for recursion; do not inline call. (afdo_vpt): Add INDIRECT_EDGE parameter. (afdo_set_bb_count): Do not take PROMOTED set. (afdo_vpt_for_early_inline): Remove. (afdo_annotate_cfg): Do not take PROMOTED set. (auto_profile): Do not call afdo_vpt_for_early_inline. (afdo_callsite_hot_enough_for_early_inline): Dump count. (remove_afdo_speculative_target): New function. * auto-profile.h (afdo_vpt_for_early_inline): Declare. (remove_afdo_speculative_target): Declare. * ipa-inline.cc (inline_functions_by_afdo): Do VPT. (early_inliner): Redirect edges if inlining happened. * tree-inline.cc (expand_call_inline): Add sanity check. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/afdo-vpt-earlyinline.c: Update template. * gcc.dg/tree-prof/indir-call-prof-2.c: Update template.
2025-06-21Implement afdo inlinerJan Hubicka5-20/+128
This patch moves afdo inlining from the early inliner into a specialized one. The reason is that the early inliner is by design non-recursive while the afdo inliner needs to recurse. In the past google handled it by increasing early inliner iterations, but it can be done easily and cheaply without that by simply recursing into inlined functions. I will also look into moving VPT to the early inliner now. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * auto-profile.cc (get_inline_stack): Add fn parameter. * ipa-inline.cc (want_early_inline_function_p): Do not care about AFDO. (inline_functions_by_afdo): New function. (early_inliner): Use it. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/afdo-vpt-earlyinline.c: Update template. * gcc.dg/tree-prof/indir-call-prof-2.c: Likewise. * gcc.dg/tree-prof/afdo-inline.c: New test.
2025-06-21RISC-V: Fix ICE for expand_select_vldi [PR120652]Pan Li5-1/+47
There will be an ICE in the expand pass; the backtrace is similar to the below.

during RTL pass: expand
red.c: In function 'main':
red.c:20:5: internal compiler error: in require, at machmode.h:323
   20 | int main() {
      |     ^~~~
0x2e0b1d6 internal_error(char const*, ...)
        ../../../gcc/gcc/diagnostic-global-context.cc:517
0xd0d3ed fancy_abort(char const*, int, char const*)
        ../../../gcc/gcc/diagnostic.cc:1803
0xc3da74 opt_mode<machine_mode>::require() const
        ../../../gcc/gcc/machmode.h:323
0xc3de2f opt_mode<machine_mode>::require() const
        ../../../gcc/gcc/poly-int.h:1383
0xc3de2f riscv_vector::expand_select_vl(rtx_def**)
        ../../../gcc/gcc/config/riscv/riscv-v.cc:4218
0x21c7d22 gen_select_vldi(rtx_def*, rtx_def*, rtx_def*)
        ../../../gcc/gcc/config/riscv/autovec.md:1344
0x134db6c maybe_expand_insn(insn_code, unsigned int, expand_operand*)
        ../../../gcc/gcc/optabs.cc:8257
0x134db6c expand_insn(insn_code, unsigned int, expand_operand*)
        ../../../gcc/gcc/optabs.cc:8288
0x11b21d3 expand_fn_using_insn
        ../../../gcc/gcc/internal-fn.cc:318
0xef32cf expand_call_stmt
        ../../../gcc/gcc/cfgexpand.cc:3097
0xef32cf expand_gimple_stmt_1
        ../../../gcc/gcc/cfgexpand.cc:4264
0xef32cf expand_gimple_stmt
        ../../../gcc/gcc/cfgexpand.cc:4411
0xef95b6 expand_gimple_basic_block
        ../../../gcc/gcc/cfgexpand.cc:6472
0xefb66f execute
        ../../../gcc/gcc/cfgexpand.cc:7223

The select_vl op_1 and op_2 may be the same const_int, like (const_int 32). And then maybe_legitimize_operands will:

1. First mov the const op_1 to a reg.
2. Reuse the reg of op_1 for op_2 as op_1 and op_2 are equal.

That will break the assumption that the op_2 of select_vl is an immediate, or something like CONST_INT_POLY. The below test suites are passed for this patch series: the rv64gcv full regression test. PR target/120652 gcc/ChangeLog: * config/riscv/autovec.md: Add immediate_operand for select_vl operand 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr120652-1.c: New test. * gcc.target/riscv/rvv/autovec/pr120652-2.c: New test. * gcc.target/riscv/rvv/autovec/pr120652-3.c: New test. * gcc.target/riscv/rvv/autovec/pr120652.h: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-06-21Daily bump.GCC Administrator6-1/+169
2025-06-20cobol: Correct diagnostic strings for 32-bit builds.James K. Lowden4-22/+26
Avoid %z for printf-family. Cast pid_t to long. Avoid use of YYUNDEF for old Bison versions. PR cobol/120621 gcc/cobol/ChangeLog: * genapi.cc (parser_compile_ecs): Cast argument to unsigned long. (parser_compile_dcls): Same. (parser_division): RAII. (inspect_tally): Cast argument to unsigned long. * lexio.cc (cdftext::lex_open): Cast pid_t to long. * parse.y: hard-code values for old versions of Bison, and message format. * scan_ante.h (wait_for_the_child): Cast pid_t to long.
2025-06-20Fix range wrap check and enhance verify_range.Andrew MacLeod2-17/+61
When snapping range bounds to satisfy bitmask constraints, the end bound overflow and underflow checks were not working properly. Also adjust some comments, and enhance verify_range to make sure range pairs are sorted properly. PR tree-optimization/120701 gcc/ * value-range.cc (irange::verify_range): Verify range pairs are sorted properly. (irange::snap): Check for over/underflow properly. gcc/testsuite/ * gcc.dg/pr120701.c: New.
2025-06-20amdgcn: allow SImode in VCC_HI [PR120722]Andrew Stubbs1-2/+1
This patch isn't fully tested yet, but it fixes the build failure, so that will do for now. SImode was not allowed in VCC_HI because there were issues, way back before the port went upstream, so it's possible we'll find out what those issues were again soon. gcc/ChangeLog: PR target/120722 * config/gcn/gcn.cc (gcn_hard_regno_mode_ok): Allow SImode in VCC_HI.
2025-06-20libgcobol: Add license.James K. Lowden1-0/+27
libgcobol/ChangeLog: * LICENSE: New file.
2025-06-20Use auto_vec in prime paths selftests [PR120634]Jørgen Kvalsvik1-25/+23
The selftests had a bunch of memory leaks that showed up in make selftest-valgrind as a result of not using auto_vec or otherwise explicitly calling release. Replacing vec with auto_vec makes the problem go away. The auto_vec_vec helper is made constructible from a vec so that objects returned from functions can be automatically managed too. PR gcov-profile/120634 gcc/ChangeLog: * prime-paths.cc (struct auto_vec_vec): Add constructor from vec. (test_split_components): Use auto_vec_vec. (test_scc_internal_prime_paths): Ditto. (test_scc_entry_exit_paths): Ditto. (test_complete_prime_paths): Ditto. (test_entry_prime_paths): Ditto. (test_singleton_path): Ditto.
2025-06-20Free buffer on function exit [PR120634]Jørgen Kvalsvik1-1/+1
Using auto_vec ensures that the buffer is always free'd when the function returns. PR gcov-profile/120634 gcc/ChangeLog: * prime-paths.cc (trie::paths): Use auto_vec.
2025-06-20tree-optimization/120654 - ICE with range query from IVOPTsRichard Biener2-5/+29
The following ICEs as we hand down an UNDEFINED range to where it isn't expected. Put the guard that's there earlier. PR tree-optimization/120654 * vr-values.cc (range_fits_type_p): Check for undefined_p () before accessing type (). * gcc.dg/torture/pr120654.c: New testcase.
2025-06-20x86: Get the widest vector mode from MOVE_MAXH.J. Lu15-35/+110
Since MOVE_MAX defines the maximum number of bytes that an instruction can move quickly between memory and registers, use it to get the widest vector mode in vector loop when inlining memcpy and memset. gcc/ PR target/120708 * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Use MOVE_MAX to get the widest vector mode in vector loop. gcc/testsuite/ PR target/120708 * gcc.target/i386/memcpy-pr120708-1.c: New test. * gcc.target/i386/memcpy-pr120708-2.c: Likewise. * gcc.target/i386/memcpy-pr120708-3.c: Likewise. * gcc.target/i386/memcpy-pr120708-4.c: Likewise. * gcc.target/i386/memcpy-pr120708-5.c: Likewise. * gcc.target/i386/memcpy-pr120708-6.c: Likewise. * gcc.target/i386/memset-pr120708-1.c: Likewise. * gcc.target/i386/memset-pr120708-2.c: Likewise. * gcc.target/i386/memcpy-strategy-1.c: Drop dg-skip-if. Replace -march=atom with -mno-avx -msse2 -mtune=generic -mtune-ctrl=^sse_typeless_stores. * gcc.target/i386/memcpy-strategy-2.c: Likewise. * gcc.target/i386/memcpy-vector_loop-1.c: Likewise. * gcc.target/i386/memcpy-vector_loop-2.c: Likewise. * gcc.target/i386/memset-vector_loop-1.c: Likewise. * gcc.target/i386/memset-vector_loop-2.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-06-20or1k: Improve If-Conversion by delaying cbranch splitsStafford Horne2-2/+87
When working on PR120587 I found that the ce1 pass was not able to properly optimize branches on OpenRISC. This is because of the early splitting of "compare" and "branch" instructions during the expand pass. Convert the cbranch* instructions from define_expand to define_insn_and_split. This delays the instruction split until after the ce1 pass is done, giving ce1 the best opportunity to perform the optimizations on the original form of cbranch<mode>4 instructions. gcc/ChangeLog: * config/or1k/or1k.cc (or1k_noce_conversion_profitable_p): New function. (or1k_is_cmov_insn): New function. (TARGET_NOCE_CONVERSION_PROFITABLE_P): Define macro. * config/or1k/or1k.md (cbranchsi4): Convert to insn_and_split. (cbranch<mode>4): Convert to insn_and_split. Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-06-20or1k: Implement *extendbisi* to fix ICE in convert_mode_scalar [PR120587]Stafford Horne2-0/+29
After commit 2dcc6dbd8a0 ("emit-rtl: Use simplify_subreg_regno to validate hardware subregs [PR119966]") the OpenRISC port is broken again. Add extend* instruction patterns for the SR_F pseudo registers to avoid having to use the subreg conversions which no longer work. gcc/ChangeLog: PR target/120587 * config/or1k/or1k.md (zero_extendbisi2_sr_f): New expand. (extendbisi2_sr_f): New expand. * config/or1k/predicates.md (sr_f_reg_operand): New predicate. Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-06-19[RISC-V] Force several tests to use rocket tuningJeff Law5-5/+5
My tester has been flagging these regressions since the default cost model was committed, along with several others:

> unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc: gcc.target/riscv/rvv/vsetvl/avl_single-37.c -O2 scan-assembler-times \\.L[0-9]+\\:\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+add\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[a-x0-9]+\\s+\\.L[0-9]+\\: 1
> unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc: gcc.target/riscv/rvv/vsetvl/avl_single-37.c -O2 -flto -fno-use-linker-plugin -flto-partition=none scan-assembler-times \\.L[0-9]+\\:\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+add\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[a-x0-9]+\\s+\\.L[0-9]+\\: 1
> unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc: gcc.target/riscv/rvv/vsetvl/avl_single-37.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects scan-assembler-times \\.L[0-9]+\\:\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-9]00\\s+add\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[a-x0-9]+\\s+\\.L[0-9]+\\: 1

I really question the value of checking the output that precisely in these tests -- they're supposed to be checking vsetvl correctness and optimization, so the ordering and such of scalar ops shouldn't really be important at all. Regardless, since I don't know these tests at all I resisted the temptation to rip out the undesirable aspects of the test. Next up, fix the bogus scan or force the old cost model (rocket). I chose the latter as the path of least resistance and least surprise. Waiting for pre-commit CI to spin. gcc/testsuite * gcc.target/riscv/rvv/vsetvl/avl_single-37.c: Force rocket tuning. * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: Likewise. * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: Likewise. * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: Likewise. * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: Likewise.
2025-06-21[PATCH] RISC-V: Use builtin clz/ctz when count_leading_zeros and count_trailing_zeros are usedSosutha Sethuramapandian1-0/+14
longlong.h for RISC-V should define count_leading_zeros, count_trailing_zeros, and COUNT_LEADING_ZEROS_0 when ZBB is enabled. The following patch fixes the bug reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110181. The divdi3 on riscv32 with the zbb extension generates __clz_tab instead of generating __builtin_clzll/__builtin_clz, which is not efficient since a lookup table is emitted. Updating longlong.h to use __builtin_clzll/__builtin_clz generates optimized code for the instruction. PR target/110181 include/ChangeLog * longlong.h [__riscv] (count_leading_zeros): Define. [__riscv] (count_trailing_zeros): Likewise. [__riscv] (COUNT_LEADING_ZEROS_0): Likewise.
2025-06-20RISC-V: Add test for vec_duplicate + vminu.vv combine case 1 with GR2VR cost 0, 1 and 2Pan Li12-0/+35
Add asm dump check tests for the vec_duplicate + vminu.vv combine to vminu.vx, with GR2VR costs 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check for vminu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-06-20RISC-V: Add test for vec_duplicate + vminu.vv combine case 0 with GR2VR cost 0, 2 and 15Pan Li22-1/+357
Add asm dump check and run tests for the vec_duplicate + vminu.vv combine to vminu.vx, with GR2VR costs 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test data for run test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-1-u16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-1-u32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-1-u64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-1-u8.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-2-u16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-2-u32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-2-u64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vmin-run-2-u8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-06-20RISC-V: Combine vec_duplicate + vminu.vv to vminu.vx on GR2VR costPan Li3-2/+5
This patch would like to combine the vec_duplicate + vminu.vv to the vminu.vx, as the example code below shows. The related pattern will depend on the cost of vec_duplicate from GR2VR. The late-combine will take action if the cost of GR2VR is zero, and reject the combination if the GR2VR cost is greater than zero. Assume we have example code like below, with GR2VR cost 0.

#define DEF_VX_BINARY(T, FUNC)                                        \
void                                                                  \
test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n)   \
{                                                                     \
  for (unsigned i = 0; i < n; i++)                                    \
    out[i] = FUNC (in[i], x);                                         \
}

uint32_t min (uint32_t a, uint32_t b) { return a > b ? b : a; }

DEF_VX_BINARY(uint32_t, min)

Before this patch:

test_vx_binary_or_int32_t_case_0:
	beq	a3,zero,.L8
	vsetvli	a5,zero,e32,m1,ta,ma
	vmv.v.x	v2,a2
	slli	a3,a3,32
	srli	a3,a3,32
.L3:
	vsetvli	a5,a3,e32,m1,ta,ma
	vle32.v	v1,0(a1)
	slli	a4,a5,2
	sub	a3,a3,a5
	add	a1,a1,a4
	vminu.vv	v1,v1,v2
	vse32.v	v1,0(a0)
	add	a0,a0,a4
	bne	a3,zero,.L3

After this patch:

test_vx_binary_or_int32_t_case_0:
	beq	a3,zero,.L8
	slli	a3,a3,32
	srli	a3,a3,32
.L3:
	vsetvli	a5,a3,e32,m1,ta,ma
	vle32.v	v1,0(a1)
	slli	a4,a5,2
	sub	a3,a3,a5
	add	a1,a1,a4
	vminu.vx	v1,v1,a2
	vse32.v	v1,0(a0)
	add	a0,a0,a4
	bne	a3,zero,.L3

gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add new case UMIN. (expand_vx_binary_vec_vec_dup): Ditto. * config/riscv/riscv.cc (riscv_rtx_costs): Ditto. * config/riscv/vector-iterators.md: Add new op umin. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-06-20Daily bump.GCC Administrator5-1/+118
2025-06-19libgomp/target.c: Fix buffer size for 'omp requires' diagnosticTobias Burnus1-6/+7
One of the buffers that print the list of set 'omp requires' requirements missed the 'self' clause addition, being potentially too short when all device-affecting clauses were passed. Solve it by moving the sizeof ("<string of all permitted values>") into a new '#define' just above the associated gomp_requires_to_name function. libgomp/ChangeLog: * target.c (GOMP_REQUIRES_NAME_BUF_LEN): Define. (GOMP_offload_register_ver, gomp_target_init): Use it for the char buffer size.
2025-06-19libgomp.texi: Document omp(x)::allocator::*, restructure memory allocator docTobias Burnus1-68/+113
libgomp/ChangeLog: * libgomp.texi (omp_init_allocator): Refer to 'Memory allocation' for available memory spaces. (OMP_ALLOCATOR): Move list of traits and predefined memspaces and allocators to ... (Memory allocation): ... here. Document omp(x)::allocator::*; minor wording tweaks, be more explicit about memkind, pinned and pool_size. Co-authored-by: waffl3x <waffl3x@baylibre.com>
2025-06-19expand: Align PARM_DECLs again to at least BITS_PER_WORD if possible [PR120689]Jakub Jelinek2-1/+18
The following testcase shows a regression caused by the r10-577 change made for cris. Before that change, the MEM holding the (in this case 3 byte) struct parameter was BITS_PER_WORD aligned; now it is just BITS_PER_UNIT aligned, and that causes significantly worse generated code. So, the MAX (DECL_ALIGN (parm), BITS_PER_WORD) extra alignment clearly doesn't help just STRICT_ALIGNMENT targets, but other targets as well.

Of course, it isn't worth doing stack realignment in the rare case of MAX_SUPPORTED_STACK_ALIGNMENT < BITS_PER_WORD targets like cris, so the patch only bumps the alignment if it won't go the > MAX_SUPPORTED_STACK_ALIGNMENT path because of that optimization.

The patch keeps the gcc 15 behavior for avr, pru, m68k and cris (at least some options for those) and restores the behavior before r10-577 on other targets.

The change on the testcase on x86_64 is:

 bar:
-	movl	%edi, %eax
-	movzbl	%dil, %r8d
-	movl	%esi, %ecx
-	movzbl	%sil, %r10d
-	movl	%edx, %r9d
-	movzbl	%dl, %r11d
-	shrl	$16, %edi
-	andl	$65280, %ecx
-	shrl	$16, %esi
-	shrl	$16, %edx
-	andl	$65280, %r9d
-	orq	%r10, %rcx
-	movzbl	%dl, %edx
-	movzbl	%sil, %esi
-	andl	$65280, %eax
-	movzbl	%dil, %edi
-	salq	$16, %rdx
-	orq	%r11, %r9
-	salq	$16, %rsi
-	orq	%r8, %rax
-	salq	$16, %rdi
-	orq	%r9, %rdx
-	orq	%rcx, %rsi
-	orq	%rax, %rdi
 	jmp	foo

2025-06-19  Jakub Jelinek  <jakub@redhat.com>

	PR target/120689
	* function.cc (assign_parm_setup_block): Align parm to at least
	word alignment even on !STRICT_ALIGNMENT targets, as long as
	BITS_PER_WORD is not larger than MAX_SUPPORTED_STACK_ALIGNMENT.
	* gcc.target/i386/pr120689.c: New test.
2025-06-19fortran: Statically initialize length of SAVEd character arraysMikael Morin2-2/+33
PR fortran/120713

gcc/fortran/ChangeLog:

	* trans-array.cc (gfc_trans_deferred_array): Statically initialize
	deferred length variable for SAVEd character arrays.

gcc/testsuite/ChangeLog:

	* gfortran.dg/save_alloc_character_1.f90: New test.
2025-06-19x86: Enable *mov<mode>_(and|or) only for -OzH.J. Lu6-3/+121
commit ef26c151c14a87177d46fd3d725e7f82e040e89f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Thu Dec 23 12:33:07 2021 +0000

    x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.

added "*mov<mode>_and" and extended "*mov<mode>_or" to transform "mov $0,mem" to the shorter "and $0,mem" and "mov $-1,mem" to the shorter "or $-1,mem" for -Oz.  But the new pattern:

(define_insn "*mov<mode>_and"
  [(set (match_operand:SWI248 0 "memory_operand" "=m")
	(match_operand:SWI248 1 "const0_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "and{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

and the extended pattern:

(define_insn "*mov<mode>_or"
  [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm")
	(match_operand:SWI248 1 "constm1_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "or{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

aren't guarded for -Oz.  As a result, "and $0,mem" and "or $-1,mem" are generated without -Oz.

1. Change "*mov<mode>_and" to define_insn_and_split and split it to "mov $0,mem" if not -Oz.
2. Change "*mov<mode>_or" to define_insn_and_split and split it to "mov $-1,mem" if not -Oz.
3. Don't transform "mov $-1,reg" to "push $-1; pop reg" for -Oz since it should be transformed to "or $-1,reg".

gcc/

	PR target/120427
	* config/i386/i386.md (*mov<mode>_and): Changed to
	define_insn_and_split.  Split it to "mov $0,mem" if not -Oz.
	(*mov<mode>_or): Changed to define_insn_and_split.  Split it
	to "mov $-1,mem" if not -Oz.
	(peephole2): Don't transform "mov $-1,reg" to "push $-1; pop reg"
	for -Oz since it will be transformed to "or $-1,reg".

gcc/testsuite/

	PR target/120427
	* gcc.target/i386/cold-attribute-4.c: Compile with -Oz.
	* gcc.target/i386/pr120427-1.c: New test.
	* gcc.target/i386/pr120427-2.c: Likewise.
	* gcc.target/i386/pr120427-3.c: Likewise.
	* gcc.target/i386/pr120427-4.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-06-19install.texi: Note that Texinfo < v7.1 may throw incorrect warnings.Georg-Johann Lay1-0/+3
PR other/115893

gcc/
	* doc/install.texi (Prerequisites): Note that Texinfo older
	than v7.1 may throw incorrect build warnings, cf.
	https://lists.nongnu.org/archive/html/help-texinfo/2023-11/msg00004.html
2025-06-19RISC-V: Add generic tune as default.Dongyan Chen4-1/+46
According to the discussion in https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686893.html, creating a -mtune=generic option may be a good way to solve the question regarding the branch cost.

Changes for v2:
- Delete the code about -mcpu=generic.

gcc/ChangeLog:

	* config/riscv/riscv-cores.def (RISCV_TUNE): Add "generic" tune.
	* config/riscv/riscv.cc: Add generic_tune_info.
	* config/riscv/riscv.h (RISCV_TUNE_STRING_DEFAULT): Change
	default tune.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_reg_return_reg_reg.c: New test.
2025-06-19dfp: Further decimal_real_to_integer fixes [PR120631]Jakub Jelinek3-8/+112
Unfortunately, the following further testcase shows that the problems aren't limited to very large precisions and large exponents: pretty much anything larger than 64 bits is affected. After all, before _BitInt support dfp didn't even have {,unsigned }__int128 <-> _Decimal{32,64,128,64x} support, and the testcase again shows some of the conversions yielding zeros. The pr120631.c test, by contrast, worked even without the earlier patch.

So, this patch assumes at most 64-bit precision is ok, and for anything larger it just uses exponent 0 and multiplies afterwards.

2025-06-19  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/120631
	* dfp.cc (decimal_real_to_integer): Use result multiplication
	not just when precision > 128 and dn.exponent > 19, but when
	precision > 64 and dn.exponent > 0.

	* gcc.dg/dfp/bitint-10.c: New test.
	* gcc.dg/dfp/pr120631.c: New test.
2025-06-19RISC-V: Use riscv_2x_xlen_mode_p [NFC]Kito Cheng1-8/+4
Use riscv_2x_xlen_mode_p to check whether the mode size is 2x XLEN, instead of using "(GET_MODE_UNIT_SIZE (mode) == (UNITS_PER_WORD * 2))".

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_legitimize_move): Use
	riscv_2x_xlen_mode_p.
	(riscv_binary_cost): Ditto.
	(riscv_hard_regno_mode_ok): Ditto.
2025-06-19RISC-V: Adding cost model for zilsdKito Cheng3-0/+69
The motivation of this patch is that we want to use ld/sd if possible when zilsd is enabled. However, the subreg pass may split such an access into two lw/sw instructions because of the cost model, which only checks the cost of 64-bit reg moves; that is why we need to adjust the cost of 64-bit reg moves as well.

However, even with the adjusted cost model, a 64-bit shift still uses 32-bit loads because it is already split at expand time. This may need a fix on the expander side, which apparently needs some more time to investigate, so for now a testcase with XFAIL documents the current behavior, and we can fix that when we have time.

In the long term, we may add a new field to riscv_tune_param to control the cost model for this.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_cost_model): Add cost model
	for zilsd.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zilsd-code-gen-split-subreg-1.c: New test.
	* gcc.target/riscv/zilsd-code-gen-split-subreg-2.c: New test.
2025-06-19x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]Lili Cui2-13/+20
gcc/ChangeLog:

	PR target/120697
	* config/i386/i386.cc (ix86_expand_prologue): Remove 3 assertions
	and associated code.

gcc/testsuite/ChangeLog:

	PR target/120697
	* gcc.target/i386/stack-clash-protection.c: New test.
2025-06-19Daily bump.GCC Administrator8-1/+389
2025-06-18analyzer: make checker_event::m_kind privateDavid Malcolm3-16/+19
No functional change intended.

gcc/analyzer/ChangeLog:
	* checker-event.h (checker_event::get_kind): New accessor.
	(checker_event::m_kind): Make private.
	* checker-path.cc (checker_path::maybe_log): Use accessor for
	checker_event::m_kind.
	(checker_path::add_event): Likewise.
	(checker_path::debug): Likewise.
	(checker_path::cfg_edge_pair_at_p): Likewise.
	(checker_path::inject_any_inlined_call_events): Likewise.
	* diagnostic-manager.cc
	(diagnostic_manager::prune_for_sm_diagnostic): Likewise.
	(diagnostic_manager::prune_for_sm_diagnostic): Likewise.
	(diagnostic_manager::consolidate_conditions): Likewise.
	(diagnostic_manager::consolidate_unwind_events): Likewise.
	(diagnostic_manager::finish_pruning): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-06-18Add space after foo in testcaseAndrew MacLeod1-1/+1
gcc/testsuite/ * gcc.dg/pr119039-1.c: Add space in search criteria.
2025-06-18emit-rtl: Use simplify_subreg_regno to validate hardware subregs [PR119966]Dimitar Dimitrov3-18/+30
PR119966 showed that combine could generate unfoldable hardware subregs for pru-unknown-elf. To fix, strengthen the checks performed by validate_subreg. simplify_subreg_regno performs more validity checks than the simple info.representable_p. Most importantly, the targetm.hard_regno_mode_ok hook is called to ensure the hardware register is valid in subreg's outer mode. This fixes the root cause of PR119966.

The checks for stack-related registers are bypassed because the i386 backend generates them, in this seemingly valid peephole optimization:

;; Attempt to always use XOR for zeroing registers (including FP modes).
(define_peephole2
  [(set (match_operand 0 "general_reg_operand")
	(match_operand 1 "const0_operand"))]
  "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
   && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
   && peep2_regno_dead_p (0, FLAGS_REG)"
  [(parallel [(set (match_dup 0) (const_int 0))
	      (clobber (reg:CC FLAGS_REG))])]
  "operands[0] = gen_lowpart (word_mode, operands[0]);")

Testing done:
* No regressions were detected for C and C++ on x86_64-pc-linux-gnu.
* "contrib/compare-all-tests i386" showed no difference in code generation.
* No regressions for pru-unknown-elf.
* Reverted r16-809-gf725d6765373f7 to expose the now latent PR119966, then ensured the pru-unknown-elf build is ok.  Only two cases regressed, where the rnreg pass transforms a valid hardware subreg into an invalid one.  But I think that is not related to combine's PR119966:
    gcc.c-torture/execute/20040709-1.c
    gcc.c-torture/execute/20040709-2.c

	PR target/119966

gcc/ChangeLog:

	* emit-rtl.cc (validate_subreg): Call simplify_subreg_regno
	instead of checking info.representable_p.
	* rtl.h (simplify_subreg_regno): Add new argument
	allow_stack_regs.
	* rtlanal.cc (simplify_subreg_regno): Do not reject
	stack-related registers if allow_stack_regs is true.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>