Age  Commit message  Author  Files  Lines
2024-08-12rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]Peter Bergner3-4/+21
Only disable shrink-wrapping when using -mrop-protect when we know we will be emitting the ROP-protect hash instructions (ie, non-leaf functions). 2024-06-17 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/114759 * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move the disabling of shrink-wrapping from here.... * config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-1.c: New test.
2024-08-12RISC-V: Fix missing abi arg in testEdwin Lu1-1/+1
The following test was failing when building on 32 bit targets due to not overwriting the mabi arg. This resulted in dejagnu attempting to run the test with -mabi=ilp32d -march=rv64gcv_zvl256b gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116202-run-1.c: Add mabi arg Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
2024-08-12[rtl-optimization/116244] Don't create bogus regs in alter_subregJeff Law4-2/+281
So this is another nasty latent bug exposed by ext-dce. Similar to the prior m68k failure it's another problem with how we handle paradoxical subregs on big endian targets. In this instance when we remove the hard subregs we take something like (subreg:DI (reg:SI 0) 0) and turn it into (reg:SI -1), which is clearly wrong; (reg:SI 0) is correct. The transformation happens in alter_subreg, but I really wanted to fix this in subreg_regno since we could have similar problems in some of the other callers of subreg_regno. Unfortunately reload depends on the current behavior of subreg_regno; in the cases where the return value is an invalid register, the wrong half of a register pair, etc., the resulting bogus value is detected by reload and triggers reloading of the inner object. So that's the new comment in subreg_regno. The second best place to fix is alter_subreg which is what this patch does. If presented with a paradoxical subreg, then the base register number should always be REGNO (SUBREG_REG (object)). It's just how paradoxicals are designed to work. I haven't tried to fix the other places that call subreg_regno. After being burned by reload, I'm more than a bit worried about unintended fallout. I must admit I'm surprised we haven't stumbled over this before and that it didn't fix any failures on the big endian embedded targets. Bootstrapped & regression tested on x86_64, also went through all the embedded targets in my tester and bootstrapped on m68k & s390x to get some additional big endian testing. Pushing to the trunk. rtl-optimization/116244 gcc/ * rtlanal.cc (subreg_regno): Update comment. * final.cc (alter_subreg): Always use REGNO (SUBREG_REG ()) to get the base register for paradoxical subregs. gcc/testsuite/ * g++.target/m68k/m68k.exp: New test driver. * g++.target/m68k/pr116244.C: New test.
2024-08-12borrowck: Fix debug prints on 32-bits architecturesArthur Cohen2-3/+6
gcc/rust/ChangeLog: * checks/errors/borrowck/rust-bir-builder.h: Cast size_t values to unsigned long before printing. * checks/errors/borrowck/rust-bir-fact-collector.h: Likewise.
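The cast described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `describe_place` and its message format are hypothetical, and only the idea of casting a `size_t` to `unsigned long` before printing it with `%lu` comes from the ChangeLog entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not taken from the borrow checker sources.
std::string
describe_place (std::size_t place_id)
{
  char buf[64];
  // size_t is unsigned int on some 32-bit hosts and unsigned long on most
  // 64-bit ones, so cast explicitly to match the %lu conversion.
  std::snprintf (buf, sizeof buf, "place %lu",
                 static_cast<unsigned long> (place_id));
  return buf;
}

// Small usage check.
void
check_describe_place ()
{
  assert (describe_place (42) == "place 42");
}
```

Without the cast, passing a `size_t` to `%lu` mismatches the format on hosts where `size_t` is `unsigned int`.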
2024-08-12borrowck: Avoid overloading issues on 32bit architecturesArthur Cohen1-2/+2
On architectures where `size_t` is `unsigned int`, such as 32bit x86, we encounter an issue with `PlaceId` and `FreeRegion` being aliases to the same types. This poses an issue for overloading functions for these two types, such as `push_subset` in that case. This commit renames one of these `push_subset` functions to avoid the issue, but this should be fixed with a newtype pattern for these two types. gcc/rust/ChangeLog: * checks/errors/borrowck/rust-bir-fact-collector.h (points): Rename `push_subset(PlaceId, PlaceId)` to `push_subset_place(PlaceId, PlaceId)`
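The newtype pattern mentioned above could look roughly like this. It is a sketch under assumptions: the wrapper structs and their member types are hypothetical, and the real `PlaceId` and `FreeRegion` are plain aliases in the sources. The point is only that distinct wrapper types keep the two overloads apart even when the underlying integer types coincide.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical strong wrappers around the underlying indices.
struct PlaceId { std::size_t value; };
struct FreeRegion { unsigned int value; };

// Because PlaceId and FreeRegion are distinct types, both overloads can
// coexist regardless of how size_t is defined on the host.
inline int push_subset (PlaceId, PlaceId) { return 1; }
inline int push_subset (FreeRegion, FreeRegion) { return 2; }

// Small usage check: each call picks the overload for its wrapper type.
void
check_push_subset ()
{
  assert (push_subset (PlaceId{0}, PlaceId{1}) == 1);
  assert (push_subset (FreeRegion{0}, FreeRegion{1}) == 2);
}
```

With plain aliases, the two declarations would have identical signatures on a 32-bit host and fail to compile as a redefinition, which is the problem the rename works around.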
2024-08-12ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_setsManolis Tsamis3-138/+141
The existing implementation of need_cmov_or_rewire and noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG. This commit enhances them so they can handle/rewire arbitrary set statements. To do that a new helper struct noce_multiple_sets_info is introduced which is used by noce_convert_multiple_sets and its helper functions. This results in cleaner function signatures, improved efficiency (a number of vecs and hash set/map are replaced with a single vec of struct) and simplicity. gcc/ChangeLog: * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets_info. (init_noce_multiple_sets_info): Initialize noce_multiple_sets_info. (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle rewiring of multiple registers. (noce_convert_multiple_sets): Updated to use noce_multiple_sets_info. * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct noce_multiple_sets_info to store info for noce_convert_multiple_sets. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.
2024-08-12ifcvt: Allow more operations in multiple set if conversionManolis Tsamis2-21/+92
Currently the operations allowed for if conversion of a basic block with multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by bb_ok_for_noce_convert_multiple_sets). This commit allows more operations (arithmetic, compare, etc) to participate in if conversion. The target's profitability hook and ifcvt's costing is expected to reject sequences that are unprofitable. This is especially useful for targets which provide a rich selection of conditional instructions (like aarch64 which has cinc, csneg, csinv, ccmp, ...) which are currently not used in basic blocks with more than a single set. For targets that have a rich selection of conditional instructions, like aarch64, we have seen an ~5x increase of profitable if conversions for multiple set blocks in SPEC CPU 2017 benchmarks. gcc/ChangeLog: * ifcvt.cc (try_emit_cmove_seq): Modify comments. (noce_convert_multiple_sets_1): Modify comments. (bb_ok_for_noce_convert_multiple_sets): Allow more operations. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.
2024-08-12ifcvt: handle sequences that clobber flags in noce_convert_multiple_setsManolis Tsamis1-48/+79
This is an extension of what was done in PR106590. Currently if a sequence generated in noce_convert_multiple_sets clobbers the condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards (sequences that emit the comparison itself). Since this applies only from the next iteration it assumes that the sequences generated (in particular seq2) doesn't clobber the condition rtx itself before using it in the if_then_else, which is only true in specific cases (currently only register/subregister moves are allowed). This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in the current iteration. It also checks whether the resulting sequence clobbers the condition attached to the jump. This makes it possible to include arithmetic operations in noce_convert_multiple_sets. It also makes the code that checks whether the condition is used outside of the if_then_else emitted more robust. gcc/ChangeLog: * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead. (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp. Punt if seq clobbers cond. Refactor the code that sets read_comparison.
2024-08-12AVR: target/85624 - Fix non-matching alignment in clrmem* insns.Georg-Johann Lay2-0/+9
The clrmem* patterns don't use the provided alignment information, hence the setmemhi expander can just pass down 0 as alignment to the clrmem* insns. PR target/85624 gcc/ * config/avr/avr.md (setmemhi): Set alignment to 0. gcc/testsuite/ * gcc.target/avr/torture/pr85624.c: New test.
2024-08-1216-bit testsuite fixes - excessive code sizeJoern Rennecke4-0/+5
gcc/testsuite/ * gcc.c-torture/execute/20021120-1.c: Skip if not size20plus or -Os. * gcc.dg/fixed-point/convert-float-4.c: Require size20plus. * gcc.dg/torture/pr112282.c: Skip if -O0 unless size20plus. * g++.dg/lookup/pr21802.C: Require size20plus.
2024-08-12This fixes problems with tests that exceed a data type or the maximum stack frame size on 16 bit targets.Joern Rennecke7-5/+14
Note: GCC has a limitation that a stack frame cannot exceed half the address space. For two tests the decision to modify or skip them seems not so clear-cut; I chose to modify gcc.dg/pr47893.c to use types that fit the numbers, as that seemed to have little impact on the test, and skip gcc.dg/pr115646.c for 16 bit, as layout of structs with bit-field members can have quite subtle rules. gcc/testsuite/ * gcc.dg/pr107523.c: Make sure variables can fit numbers. * gcc.dg/pr47893.c: Add dg-require-effective-target size20plus clause. * c-c++-common/torture/builtin-clear-padding-2.c: dg-require-effective-target size20plus. * gcc.dg/pr115646.c: dg-require-effective-target int32plus. * c-c++-common/analyzer/coreutils-sum-pr108666.c: For c++, expect a warning about exceeding maximum object size if not size20plus. * gcc.dg/torture/inline-mem-cpy-1.c: Like the included file, dg-require-effective-target ptr32plus. * gcc.dg/torture/inline-mem-cmp-1.c: Likewise.
2024-08-12Avoid cfg corruption when using sjlj exceptions where loops are present in the assign_params emitted code.Joern Rennecke1-0/+4
2024-08-06 Joern Rennecke <joern.rennecke@riscy-ip.com> gcc/ * except.cc (sjlj_emit_function_enter): Set fn_begin_outside_block again if encountering a jump instruction.
2024-08-12Use splay-tree-utils.h in tree-ssa-sccvn [PR30920]Richard Sandiford4-81/+131
This patch is an attempt to gauge opinion on one way of fixing PR30920. The PR points out that the libiberty splay tree implementation does not implement the algorithm described by Sleator and Tarjan and has unclear complexity bounds. (It's also somewhat dangerous in that splay_tree_min and splay_tree_max walk the tree without splaying, meaning that they are fully linear in the worst case, rather than amortised logarithmic.) These properties have been carried over to typed-splay-tree.h. We could fix those problems directly in the existing implementations, and probably should for libiberty. But when I added rtl-ssa, I also added a third(!) splay tree implementation: splay-tree-utils.h. In response to Jeff's understandable unease about having three implementations, I was supposed to go back during the next stage 1 and reduce it to no more than two. I never did that. :-( splay-tree-utils.h is so called because rtl-ssa uses splay trees in structures that are relatively small and very size-sensitive. I therefore wanted to be able to embed the splay tree links directly in the structures, rather than pay the penalty of using separate nodes with one-way or two-way links between them. There were also operations for which it was convenient to treat the splay tree root as an explicitly managed cursor, rather than treating the tree as a pure ADT. The interface is therefore a bit more low-level than for the other implementations. I wondered whether the same trade-offs might apply to users of the libiberty splay trees. The first one I looked at in detail was SCC value numbering, which seemed like it would benefit from using splay-tree-utils.h directly. The patch does that. It also adds a couple of new helper routines to splay-tree-utils.h. I don't expect this approach to be the right one for every use of splay trees. E.g. splay tree used for omp gimplification would certainly need separate nodes. 
gcc/ PR other/30920 * splay-tree-utils.h (rooted_splay_tree::insert_relative) (rooted_splay_tree::lookup_le): New functions. (rooted_splay_tree::remove_root_and_splay_next): Likewise. * splay-tree-utils.tcc (rooted_splay_tree::insert_relative): New function, extracted from... (rooted_splay_tree::insert): ...here. (rooted_splay_tree::lookup_le): New function. (rooted_splay_tree::remove_root_and_splay_next): Likewise. * tree-ssa-sccvn.cc (pd_range::m_children): New member variable. (vn_walk_cb_data::vn_walk_cb_data): Initialize first_range. (vn_walk_cb_data::known_ranges): Use a default_splay_tree. (vn_walk_cb_data::~vn_walk_cb_data): Remove freeing of known_ranges. (pd_range_compare, pd_range_alloc, pd_range_dealloc): Delete. (vn_walk_cb_data::push_partial_def): Rewrite splay tree operations to use splay-tree-utils.h. * rtl-ssa/accesses.cc (function_info::add_use): Use insert_relative.
2024-08-12aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for Advanced SIMDKyrylo Tkachov3-5/+77
On many cores, including Neoverse V2, the throughput of vector ADD instructions is higher than vector shifts like SHL. We can lean on that to emit code like: add v0.4s, v0.4s, v0.4s instead of: shl v0.4s, v0.4s, 1 LLVM already does this trick. In RTL the code gets canonicalised from (plus x x) to (ashift x 1) so I opted to instead do this at the final assembly printing stage, similar to how we emit CMLT instead of SSHR elsewhere in the backend. I'd like to also do this for SVE shifts, but those will have to be separate patches. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_imm_shl<mode><vczle><vczbe>): Rewrite to new syntax. Add =w,w,vs1 alternative. * config/aarch64/constraints.md (vs1): New constraint. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd_shl_add.c: New test.
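The equivalence the patch relies on can be checked with a small scalar model. This is only a sketch: `Lanes`, `shl_by_one` and `add_self` are hypothetical names that model a 4 x 32-bit vector in plain C++, not the aarch64 backend code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a v0.4s register: four unsigned 32-bit lanes.
using Lanes = std::array<std::uint32_t, 4>;

// Models "shl v0.4s, v0.4s, 1": shift every lane left by one bit.
Lanes
shl_by_one (Lanes v)
{
  for (auto &x : v)
    x <<= 1;
  return v;
}

// Models "add v0.4s, v0.4s, v0.4s": add every lane to itself.
Lanes
add_self (Lanes v)
{
  for (auto &x : v)
    x += x;
  return v;
}

// Small usage check, including lanes whose top bit is shifted out.
void
check_shl_add ()
{
  Lanes v = {1u, 0x80000000u, 0xffffffffu, 7u};
  assert (shl_by_one (v) == add_self (v));
}
```

Both operations wrap modulo 2^32 per lane, so they agree even when the top bit is lost.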
2024-08-12Fortran: Fix coarray in associate not linking [PR85510]Andre Vehreschild3-5/+26
PR fortran/85510 gcc/fortran/ChangeLog: * resolve.cc (resolve_variable): Mark the variable as host associated only, when it is not in an associate block. * trans-decl.cc (generate_coarray_init): Remove incorrect unused flag on parameter. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/pr85510.f90: New test.
2024-08-12Initial support for AVX10.2Haochen Jiang19-22/+140
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Handle avx10.2. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AVX10_2_256_SET): New. (OPTION_MASK_ISA2_AVX10_2_512_SET): Ditto. (OPTION_MASK_ISA2_AVX10_1_256_UNSET): Add OPTION_MASK_ISA2_AVX10_2_256_UNSET. (OPTION_MASK_ISA2_AVX10_1_512_UNSET): Add OPTION_MASK_ISA2_AVX10_2_512_UNSET. (OPTION_MASK_ISA2_AVX10_2_256_UNSET): New. (OPTION_MASK_ISA2_AVX10_2_512_UNSET): Ditto. (ix86_handle_option): Handle avx10.2-256 and avx10.2-512. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AVX10_2_256 and FEATURE_AVX10_2_512. * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY for avx10.2-256 and avx10.2-512. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __AVX10_2_256__ and __AVX10_2_512__. * config/i386/i386-isa.def (AVX10_2): Add DEF_PTA(AVX10_2_256) and DEF_PTA(AVX10_2_512). * config/i386/i386-options.cc (isa2_opts): Add -mavx10.2-256 and -mavx10.2-512. (ix86_valid_target_attribute_inner_p): Handle avx10.2-256 and avx10.2-512. * config/i386/i386.opt: Add option -mavx10.2, -mavx10.2-256 and -mavx10.2-512. * config/i386/i386.opt.urls: Regenerated. * doc/extend.texi: Document avx10.2, avx10.2-256 and avx10.2-512. * doc/invoke.texi: Document -mavx10.2, -mavx10.2-256 and -mavx10.2-512. * doc/sourcebuild.texi: Document target avx10.2, avx10.2-256, avx10.2-512. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Ditto. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/sse-12.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto.
2024-08-12PR target/116275: Handle STV of *extenddi2_doubleword_highpart on i386.Roger Sayle2-0/+33
This patch resolves PR target/116275, a recent ICE-on-valid regression on -m32 caused by my recent change to enable STV of DImode arithmetic right shift on non-AVX512VL targets. The oversight is that the i386 backend contains an *extenddi2_doubleword_highpart instruction (whose pattern is an arithmetic right shift of a left shift) that optimizes the case where sign-extension need only update the highpart word of a DImode value when generating 32-bit code (!TARGET_64BIT). STV accepts this pattern as a candidate, as there are patterns to handle this form of extension on SSE using AVX512VL instructions (and previously ASHIFTRT was only allowed on AVX512VL). Now that ASHIFTRT is a candidate on non-AVX512VL targets, we either need to check that the first operand is a register, or as done below provide the define_insn_and_split that provides a non-AVX512VL implementation of *extendv2di_highpart_stv. The new testcase only ICEed with -m32, so this test could be limited to target ia32, but there's no harm also running this test on -m64 to provide a little extra test coverage. 2024-08-12 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/116275 * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New define_insn_and_split to handle the STV conversion of the DImode pattern *extendsi2_doubleword_highpart. gcc/testsuite/ChangeLog PR target/116275 * g++.target/i386/pr116275.C: New test case.
2024-08-12LoongArch: Provide ashr lshr and ashl RTL pattern for vectors.Lulu Cheng3-6/+155
We support vashr, vlshr and vashl. However, r15-1638 added support for optimizing x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into (unsigned) x >> 31. To support this optimization, vector ashr, lshr and ashl need to be implemented. gcc/ChangeLog: * config/loongarch/loongarch.md (insn): Added rotatert rotr pairs. * config/loongarch/simd.md (rotr<mode>3): Remove to ... (<optab><mode>3): This. gcc/testsuite/ChangeLog: * g++.target/loongarch/vect-ashr-lshr.C: New test.
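The two identities behind that optimization can be illustrated on scalars. This is a sketch with hypothetical function names. It assumes that right-shifting a negative signed value is an arithmetic shift, which is guaranteed from C++20 on and is what GCC implements in earlier modes.

```cpp
#include <cassert>
#include <cstdint>

// x < 0 ? -1 : 0, written as a select.
std::int32_t neg_mask_select (std::int32_t x) { return x < 0 ? -1 : 0; }

// The same value via (signed) x >> 31; assumes an arithmetic right shift.
std::int32_t neg_mask_shift (std::int32_t x) { return x >> 31; }

// x < 0 ? 1 : 0, written as a select.
std::uint32_t neg_flag_select (std::int32_t x) { return x < 0 ? 1u : 0u; }

// The same value via (unsigned) x >> 31, a logical right shift.
std::uint32_t
neg_flag_shift (std::int32_t x)
{
  return static_cast<std::uint32_t> (x) >> 31;
}

// Small usage check over negative, zero and positive inputs.
void
check_sign_shifts ()
{
  const std::int32_t samples[] = {INT32_MIN, -5, 0, 7, INT32_MAX};
  for (std::int32_t x : samples)
    {
      assert (neg_mask_select (x) == neg_mask_shift (x));
      assert (neg_flag_select (x) == neg_flag_shift (x));
    }
}
```

The vector patterns apply the same shifts lane by lane.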
2024-08-12LoongArch: Drop vcond{,u} expanders.Lulu Cheng2-68/+0
Optabs vcond{,u} will be removed for GCC 15. Since regtest shows no fallout, dropping the expanders, now. gcc/ChangeLog: PR target/114189 * config/loongarch/lasx.md (vcondu<LASX:mode><ILASX:mode>): Delete. (vcond<LASX:mode><LASX_2:mode>): Likewise. * config/loongarch/lsx.md (vcondu<LSX:mode><ILSX:mode>): Likewise. (vcond<LSX:mode><LSX_2:mode>): Likewise.
2024-08-12LoongArch: Use iorn and andn standard pattern names.Lulu Cheng6-18/+59
R15-1890 introduced new optabs iorc and andc, and its corresponding internal functions BIT_{ANDC,IORC}, and if targets defines such optabs for vector modes. And in r15-2258 the iorc and andc were renamed to iorn and andn. So we changed the andn and iorn implementation templates to the standard template names. gcc/ChangeLog: * config/loongarch/lasx.md (xvandn<mode>3): Rename to ... (andn<mode>3): This. (xvorn<mode>3): Rename to ... (iorn<mode>3): This. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vandn_v): Defined as the modified name. (CODE_FOR_lsx_vorn_v): Likewise. (CODE_FOR_lasx_xvandn_v): Likewise. (CODE_FOR_lasx_xvorn_v): Likewise. (loongarch_expand_builtin_insn): When the builtin function to be called is __builtin_lasx_xvandn or __builtin_lsx_vandn, swap the two operands. * config/loongarch/loongarch.md (<optab>n<mode>): Rename to ... (<optab>n<mode>3): This. * config/loongarch/lsx.md (vandn<mode>3): Rename to ... (andn<mode>3): This. (vorn<mode>3): Rename to ... (iorn<mode>3): This. gcc/testsuite/ChangeLog: * gcc.target/loongarch/lasx-andn-iorn.c: New test. * gcc.target/loongarch/lsx-andn-iorn.c: New test.
2024-08-12PR modula2/116181 fix ODR warnings for C/m2 interface library modulesGaius Mulley27-663/+742
This patch fixes many ODR warnings which appear when compiling the interface files found in gcc/m2/*-ch/ and gcc/m2/{pge,mc}-boot directories. gcc/m2/ChangeLog: PR modula2/116181 * gm2-compiler/ppg.mod (FindStr): Initialize j. * gm2-libs-ch/UnixArgs.cc (_M2_UnixArgs_ctor): Replace M2RTS_RegisterModule with M2RTS_RegisterModule_Cstr. * gm2-libs-ch/dtoa.cc (_M2_dtoa_ctor): Ditto. * gm2-libs-ch/ldtoa.cc (ldtoa_strtold): Cast parameter s for strtod. (_M2_ldtoa_ctor): Replace M2RTS_RegisterModule with M2RTS_RegisterModule_Cstr. * gm2-libs-ch/m2rts.h (M2RTS_RegisterModule_Cstr): New define. (M2RTS_RegisterModule): Remove const. * mc-boot-ch/GSelective.c (Selective_FdIsSet): Return bool rather than int. * mc-boot-ch/Gldtoa.cc (ldtoa_strtold): Change const char to void. Cast s before passing as a parameter to strtod. * mc-boot-ch/Glibc.c (tracedb_open): Replace const char with const void. (libc_perror): Replace char with const char. (libc_printf): Replace char with void. (libc_snprintf): Replace char with void. Add const_cast for parameter to index. Add reinterpret_cast for parameter to vsnprintf. (libc_open): Replace first parameter type char with void. Add vararg for the third parameter. * mc-boot-ch/Gm2rtsdummy.cc (M2RTS_RequestDependant): Remove #if 0 code. (m2pim_M2RTS_RegisterModule): Change const char parameters to void. (M2RTS_RegisterModule): Ditto. (_M2_M2RTS_init): Remove #if 0 code. (M2RTS_ConstructModules): Ditto. (M2RTS_Terminate): Ditto. (M2RTS_DeconstructModules): Ditto. (M2RTS_Halt): Ditto. * mc-boot-ch/Gtermios.cc (SetFlag): Return bool. * mc-boot-ch/m2rts.h (M2RTS_RegisterModule_Cstr): New define. (M2RTS_RegisterModule): Change const char parameters to void. * mc-boot/Gdecl.cc: Regenerate. * mc/decl.mod (getNextConstExp): Reimplement. * pge-boot/GDynamicStrings.cc: Regenerate. * pge-boot/GDynamicStrings.h: Ditto. * pge-boot/GM2RTS.h (M2RTS_RegisterModule_Cstr): New function. (M2RTS_RegisterModule): Reformat. * pge-boot/GSymbolKey.cc: Regenerate.
* pge-boot/GSysExceptions.cc (_M2_SysExceptions_init): Add correct parameters. (_M2_SysExceptions_fini): Ditto. * pge-boot/GUnixArgs.cc (_M2_UnixArgs_ctor::_M2_UnixArgs_ctor): Replace call to M2RTS_RegisterModule with M2RTS_RegisterModuleCstr. * pge-boot/Gerrno.cc (_M2_errno_init): Add correct parameters. (_M2_errno_fini): Ditto. * pge-boot/Gldtoa.cc (ldtoa_strtold): Replace const char with void. Use reinterpret_cast when passing s to strtod. Replace true with TRUE. * pge-boot/Gldtoa.h (ldtoa_strtold): Tidy up. * pge-boot/Glibc.cc (libc_read): Use size_t as the return type. (libc_write): Ditto. (libc_strlen): Ditto. (libc_perror): Replace char with const char. (libc_printf): Replace char to const char. Cast parameter to index using const_cast. (libc_snprintf): Replace char with void. Cast parameter to index using const_cast. (libc_malloc): Replace parameter type with size_t. (libc_memcpy): Replace third parameter type with size_t. (libc_open): Use varargs. * pge-boot/Glibc.h (libc_perror): Add _string_high parameter. * pge-boot/Gpge.cc: Regenerate. * pge-boot/Gtermios.cc (SetFlag): Replace return type with bool. (_M2_termios_init): Add correct parameters. (_M2_termios_fini): Ditto. * pge-boot/m2rts.h (M2RTS_RegisterModule_Cstr): New define. (M2RTS_RegisterModule): Replace const char with void. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-08-12Daily bump.GCC Administrator3-1/+17
2024-08-11Fortran: silence Wmaybe-uninitialized warnings for LTO build [PR116221]Harald Anlauf2-2/+2
PR fortran/116221 gcc/fortran/ChangeLog: * intrinsic.cc (gfc_get_intrinsic_sub_symbol): Initialize variable. * symbol.cc (gfc_get_ha_symbol): Likewise.
2024-08-11AVR: -mlra is not documented in TEXI.Georg-Johann Lay1-1/+1
gcc/ * config/avr/avr.opt (mlra): Set Undocumented flag.
2024-08-11AVR: Add function avr.cc::ra_in_progress().Georg-Johann Lay1-6/+13
It returns lra_in_progress resp. reload_in_progress depending on avr_lra_p. Currently, direct use of ra_in_progress() is only made with -mlog=. gcc/ * config/avr/avr.cc (ra_in_progress): New static function. (avr_legitimate_address_p, avr_addr_space_legitimate_address_p) (extra_constraint_Q): Use it with -mlog=.
2024-08-11Daily bump.GCC Administrator5-1/+80
2024-08-11i386: testsuite: Adapt fentryname3.c for r14-811 change [PR70150]Xi Ruoyao1-2/+1
After r14-811 "call *nop@GOTPCREL(%rip)" is only generated with -mno-direct-extern-access even if --enable-default-pie. So the r13-1614 change to this file is not valid anymore. gcc/testsuite/ChangeLog: PR testsuite/70150 * gcc.target/i386/fentryname3.c (dg-final): Revert r13-1614 change.
2024-08-11i386: testsuite: Add -no-pie for pr113689-1.c [PR70150]Xi Ruoyao1-1/+1
For a --enable-default-pie build, using -fno-pic (for compiler) but not -no-pie (for linker) triggers some linker warnings counted as excess errors: /usr/bin/ld: /tmp/cc8MgxiR.o: warning: relocation in read-only section `.text.startup' /usr/bin/ld: warning: creating DT_TEXTREL in a PIE gcc/testsuite/ChangeLog: PR testsuite/70150 * gcc.target/i386/pr113689-1.c (dg-options): Add -no-pie.
2024-08-10Fix reference to the dom walker function in the documentationAndi Kleen1-2/+2
It is using a class now with a different name. gcc/ChangeLog: * doc/cfg.texi: Fix references to dom_walker.
2024-08-10gm2: add missing debug output guardWilken Gottwalt1-1/+4
The Close() procedure in MemStream is missing a guard to prevent it from printing in non-debug mode. gcc/gm2: * gm2-libs-iso/MemStream.mod: Guard debug output. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>
2024-08-10testsuite: Fix up sse3-addsubps.cJakub Jelinek1-1/+1
The testcase uses sizeof (vals) / sizeof (vals) as the number of vals to handle (though, handles 8 vals at a time). That is an obvious typo, all similar testcases use sizeof (vals) / sizeof (vals[0]) properly. 2024-08-10 Jakub Jelinek <jakub@redhat.com> * gcc.target/powerpc/sse3-addsubps.c (TEST): Divide by sizeof (vals[0]) rather than sizeof (vals).
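The difference between the two expressions is easy to see on a small array. The array below is a hypothetical stand-in (its element type and length are assumptions, not copied from the test); only the two `sizeof` expressions come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the test's input array.
static const float vals[24] = {};

// The typo: an array's size divided by itself is always 1.
constexpr std::size_t wrong_count = sizeof (vals) / sizeof (vals);

// The intended idiom: total size divided by the size of one element.
constexpr std::size_t right_count = sizeof (vals) / sizeof (vals[0]);

static_assert (wrong_count == 1, "size of the array divided by itself");
static_assert (right_count == 24, "number of elements in the array");

// Small usage check mirroring the static assertions at run time.
void
check_counts ()
{
  assert (wrong_count == 1);
  assert (right_count == 24);
}
```

With the typo, a loop bounded by the count and stepping 8 elements at a time would process only the first group of values.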
2024-08-10AVR: ad target/113934 - Add option -mlra to enable LRA.Georg-Johann Lay2-2/+17
PR target/113934 gcc/ * config/avr/avr.opt (-mlra): New target option. * config/avr/avr.cc (avr_use_lra_p): New function. (TARGET_LRA_P): Use it. (avr_hard_regno_mode_ok) [lra]: Don't disallow 4-byte modes for X.
2024-08-09c++: inherited CTAD fixes [PR116276]Patrick Palka6-14/+139
This implements the overlooked inherited vs non-inherited guide tiebreaker from P2582R1. This requires tracking inherited-ness of a guide, for which it seems natural to reuse the lang_decl_fn::context field which for a constructor tracks its inherited-ness. This patch also works around CLASSTYPE_CONSTRUCTORS not reliably returning all inherited constructors (due to some using-decl handling quirks in push_class_level_binding) by iterating over TYPE_FIELDS instead. This patch also makes us recognize another written form of inherited constructor, 'using Base<T>::Base::Base' whose USING_DECL_SCOPE is a TYPENAME_TYPE. PR c++/116276 gcc/cp/ChangeLog: * call.cc (joust): Implement P2582R1 inherited vs non-inherited guide tiebreaker. * cp-tree.h (lang_decl_fn::context): Document usage in deduction_guide_p FUNCTION_DECLs. (inherited_guide_p): Declare. * pt.cc (inherited_guide_p): Define. (set_inherited_guide_context): Define. (alias_ctad_tweaks): Use set_inherited_guide_context. (inherited_ctad_tweaks): Recognize some inherited constructors whose scope is a TYPENAME_TYPE. (ctor_deduction_guides_for): For C++23 inherited CTAD, iterate over TYPE_FIELDS instead of CLASSTYPE_CONSTRUCTORS to recognize all inherited constructors. gcc/testsuite/ChangeLog: * g++.dg/cpp23/class-deduction-inherited4.C: Remove an xfail. * g++.dg/cpp23/class-deduction-inherited5.C: New test. * g++.dg/cpp23/class-deduction-inherited6.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-08-09c++: DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P tweaksPatrick Palka1-4/+2
DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P templates can only appear as part of a template friend declaration, and in turn get partially instantiated only from tsubst_friend_function or tsubst_friend_class. So rather than having tsubst_template_decl clear the flag, let's leave it up to the tsubst friend routines to clear it so that template friend handling stays localized (note that tsubst_friend_function was already clearing it). Also the template depth comparison test within tsubst_friend_function is equivalent to DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P since such templates belong to the class context (and so always have more levels than the context), and conversely it isn't possible to directly refer to an existing template that has more levels than the class context. gcc/cp/ChangeLog: * pt.cc (tsubst_friend_class): Simplify depth comparison test in the redeclaration code path to DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P. Clear the flag after partial instantiation here ... (tsubst_template_decl): ... instead of here. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-08-09c++: clean up cp_identifier_kind checksPatrick Palka2-21/+23
The predicates for checking an IDENTIFIER node's cp_identifier_kind currently directly test the three flag bits that encode the kind. This patch instead makes the checks first reconstruct the cp_identifier_kind in its entirety and then compare that. gcc/cp/ChangeLog: * cp-tree.h (get_identifier_kind): Define. (IDENTIFIER_KEYWORD_P): Redefine in terms of get_identifier_kind. (IDENTIFIER_CDTOR_P): Likewise. (IDENTIFIER_CTOR_P): Likewise. (IDENTIFIER_DTOR_P): Likewise. (IDENTIFIER_ANY_OP_P): Likewise. (IDENTIFIER_OVL_OP_P): Likewise. (IDENTIFIER_ASSIGN_OP_P): Likewise. (IDENTIFIER_CONV_OP_P): Likewise. (IDENTIFIER_TRAIT_P): Likewise. * parser.cc (cp_lexer_peek_trait): Mark IDENTIFIER_TRAIT_P check UNLIKELY. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-08-10Daily bump.GCC Administrator7-1/+221
2024-08-09[RISC-V][PR target/116283] Fix split code for recent Zbs improvements with masked bit positionsJeff Law2-6/+33
So Patrick's fuzzer found an interesting little buglet in the Zbs improvements I added a couple months back. Specifically when we have masked bit position for a Zbs instruction. If the mask has extraneous bits set we'll generate an unrecognizable insn due to an invalid constant. More concretely, let's take this pattern: > (define_insn_and_split "" > [(set (match_operand:DI 0 "register_operand" "=r") > (any_extend:DI > (ashift:SI (const_int 1) > (subreg:QI (and:DI (match_operand:DI 1 "register_operand" "r") > (match_operand 2 "const_int_operand")) 0))))] What we need to know to transform this into bset for rv64. After masking the shift count we want to know the low 5 bits aren't 0x1f. If they were 0x1f, then the constant generated would be 0x80000000 which would then need sign extension out to 64bits, which the bset instruction will not do for us. We can ignore anything outside the low 5 bits. The mode of the shift is SI, so shifting by 32+ bits is undefined behavior. It's also worth explicitly mentioning that the hardware is going to mask the count against 0x3f. The net is if (operands[2] & 0x1f) != 0x1f, then this transformation is safe. So onto the generated split code... > [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2))) > (set (match_dup 0) (zero_extend:DI (ashift:SI > (const_int 1) > (subreg:QI (match_dup 0) 0))))] Which would seemingly do exactly what we want. The problem is the first split insn. If the constant does not fit into a simm12, that insn won't be recognized resulting in the ICE. The fix is simple, we just need to mask the constant before generating RTL. We can just mask it against 0x1f since we only care about the low 5 bits. This affects multiple patterns. I've added the appropriate fix to all of them. Tested in my tester. Waiting for the pre-commit bits to run before pushing. PR target/116283 gcc/ * config/riscv/bitmanip.md (Zbs combiner patterns/splitters): Mask the bit position in the split code appropriately.
gcc/testsuite/ * gcc.target/riscv/pr116283.c: New test
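The reasoning in this commit message can be modelled with plain integers. This is a sketch only: all function names are hypothetical and nothing here is taken from bitmanip.md. It shows the safety condition, the masked constant the split code should use, and why a bit position of 31 is the problematic case.

```cpp
#include <cassert>
#include <cstdint>

// The transformation is safe when the low 5 bits of the mask are not 0x1f.
bool
bset_transform_safe (std::uint64_t mask)
{
  return (mask & 0x1f) != 0x1f;
}

// Constant for the AND in the split code: only the low 5 bits matter, and
// the reduced value always fits in a 12-bit signed immediate.
std::int64_t
masked_constant (std::uint64_t mask)
{
  return static_cast<std::int64_t> (mask & 0x1f);
}

// Models the SImode shift followed by sign extension to 64 bits.
std::int64_t
sign_extended_si_bit (unsigned pos)
{
  return static_cast<std::int32_t> (std::uint32_t{1} << (pos & 0x1f));
}

// Models bset: a single bit set, with the count masked against 0x3f.
std::uint64_t
bset_bit (unsigned pos)
{
  return std::uint64_t{1} << (pos & 0x3f);
}

// Small usage check.
void
check_zbs_model ()
{
  assert (bset_transform_safe (0x100f));   // 0x100f does not fit a simm12 ...
  assert (masked_constant (0x100f) == 0xf); // ... but its masked form does.
  assert (!bset_transform_safe (0x1f));
  // For position 31 the sign-extended value and the bset result differ.
  assert (static_cast<std::uint64_t> (sign_extended_si_bit (31))
          != bset_bit (31));
}
```

For every position below 31 the two models agree, which is why excluding 0x1f is sufficient.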
2024-08-09Revert "lra: emit caller-save register spills before call insn [PR116028]"Kyrylo Tkachov3-26/+6
This reverts commit 3c67a0fa1dd39a3378deb854a7fef0ff7fe38004.
2024-08-09Adjust rangers recomputation depth based on the number of BBs.Andrew MacLeod2-4/+9
As the number of blocks increases, recomputations can become more expensive. Adjust the depth limit to avoid excessive compile time. PR tree-optimization/114855 * gimple-range-gori.cc (gori_compute::gori_compute): Adjust ranger_recompute_depth limit based on the number of BBs. (gori_compute::may_recompute_p): Use previously calculated value. * gimple-range-gori.h (gori_compute::m_recompute_depth): New.
2024-08-09Limit equivalency processing in rangers cache.Andrew MacLeod1-0/+8
When the number of blocks exceeds VRP's sparse threshold, do not query all equivalencies during cache filling. This can be expensive for unknown benefit. PR tree-optimization/114855 * gimple-range-cache.cc (ranger_cache::fill_block_cache): Do not process equivalencies if the number of blocks is too high.
2024-08-09btf: Protect BTF_KIND_INFO against invalid kindWill Hawkins1-2/+2
If the user provides a kind value that is more than 5 bits, the BTF_KIND_INFO macro would emit incorrect values for info (by clobbering values of the kind flag). Tested on x86_64-redhat-linux. include/ChangeLog: * btf.h (BTF_TYPE_INFO): Protect against user providing invalid kind. Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
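The clobbering described above can be illustrated with a small C sketch. It assumes the standard BTF info word layout (vlen in bits 0-15, kind in bits 24-28, kind flag in bit 31). The two functions `btf_info_unmasked` and `btf_info_masked` are hypothetical stand-ins written for this example, not the macro text from btf.h.

```c
#include <stdint.h>

/* Hypothetical unprotected encoding: a kind wider than 5 bits spills
   into the bits above bit 28, including the kind flag in bit 31.  */
static inline uint32_t btf_info_unmasked(uint32_t kind, int kflag, uint32_t vlen)
{
    return ((kflag ? 1u : 0u) << 31) | (kind << 24) | (vlen & 0xffff);
}

/* Hypothetical protected encoding: the kind is masked to 5 bits before
   it is shifted into place, so it cannot disturb the kind flag.  */
static inline uint32_t btf_info_masked(uint32_t kind, int kflag, uint32_t vlen)
{
    return ((kflag ? 1u : 0u) << 31) | ((kind & 0x1f) << 24) | (vlen & 0xffff);
}
```

With an invalid kind of 0x80 and a kind flag of 0, the unmasked form sets bit 31 anyway, while the masked form leaves the kind flag clear.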
2024-08-09c++: Don't accept multiple enum definitions within template class [PR115806]Simon Martin2-10/+21
We have been accepting the following invalid code since revision 557831a91df === cut here === template <typename T> struct S { enum E { a }; enum E { b }; }; S<int> s; === cut here === The problem is that start_enum will set OPAQUE_ENUM_P to true even if it retrieves an existing definition for the enum, which causes the redefinition check in cp_parser_enum_specifier to be bypassed. This patch only sets OPAQUE_ENUM_P and ENUM_FIXED_UNDERLYING_TYPE_P when actually pushing a new tag for the enum. PR c++/115806 gcc/cp/ChangeLog: * decl.cc (start_enum): Only set OPAQUE_ENUM_P and ENUM_FIXED_UNDERLYING_TYPE_P when pushing a new tag. gcc/testsuite/ChangeLog: * g++.dg/parse/enum15.C: New test.
2024-08-09RISC-V: Enable stack clash in allocaRaphael Moreira Zinsly15-0/+219
Add the TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE to riscv in order to enable stack clash protection when using alloca. The code and tests are the same as those used by aarch64. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_compute_frame_info): Update outgoing args size. (riscv_stack_clash_protection_alloca_probe_range): New. (TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New. * config/riscv/riscv.h (STACK_CLASH_MIN_BYTES_OUTGOING_ARGS): New. (STACK_DYNAMIC_OFFSET): New. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-14.c: New test. * gcc.target/riscv/stack-check-15.c: New test. * gcc.target/riscv/stack-check-alloca-1.c: New test. * gcc.target/riscv/stack-check-alloca-2.c: New test. * gcc.target/riscv/stack-check-alloca-3.c: New test. * gcc.target/riscv/stack-check-alloca-4.c: New test. * gcc.target/riscv/stack-check-alloca-5.c: New test. * gcc.target/riscv/stack-check-alloca-6.c: New test. * gcc.target/riscv/stack-check-alloca-7.c: New test. * gcc.target/riscv/stack-check-alloca-8.c: New test. * gcc.target/riscv/stack-check-alloca-9.c: New test. * gcc.target/riscv/stack-check-alloca-10.c: New test. * gcc.target/riscv/stack-check-alloca.h: New.
2024-08-09RISC-V: Add support to vector stack-clash protectionRaphael Moreira Zinsly5-21/+173
Adds basic support for vector stack-clash protection using a loop to do the probing and stack adjustments. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_allocate_and_probe_stack_loop): New function. (riscv_v_adjust_scalable_frame): Add stack-clash protection support. (riscv_allocate_and_probe_stack_space): Move the probe loop implementation to riscv_allocate_and_probe_stack_loop. * config/riscv/riscv.h: Define RISCV_STACK_CLASH_VECTOR_CFA_REGNUM. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-cfa-3.c: New test. * gcc.target/riscv/stack-check-prologue-16.c: New test. * gcc.target/riscv/struct_vect_24.c: New test.
2024-08-09RISC-V: Stack-clash protection implementationRaphael Moreira Zinsly27-40/+504
This implements stack-clash protection for riscv, with riscv_allocate_and_probe_stack_space being based on aarch64_allocate_and_probe_stack_space from aarch64's implementation. We enforce the probing interval and the guard size to always be equal; their default value is 4KB, which is the riscv page size. We also probe up by 1024 bytes in the general case when a probe is required. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_option_override): Enforce that interval is the same size as guard size. (riscv_allocate_and_probe_stack_space): New function. (riscv_expand_prologue): Call riscv_allocate_and_probe_stack_space to the final allocation of the stack and add stack-clash dump information. * config/riscv/riscv.h: Define STACK_CLASH_CALLER_GUARD and STACK_CLASH_MAX_UNROLL_PAGES. gcc/testsuite/ChangeLog: * gcc.dg/params/blocksort-part.c: Skip riscv for stack-clash protection intervals. * gcc.dg/pr82788.c: Skip riscv. * gcc.dg/stack-check-6.c: Skip residual check for riscv. * gcc.dg/stack-check-6a.c: Skip riscv. * gcc.target/riscv/stack-check-12.c: New test. * gcc.target/riscv/stack-check-13.c: New test. * gcc.target/riscv/stack-check-cfa-1.c: New test. * gcc.target/riscv/stack-check-cfa-2.c: New test. * gcc.target/riscv/stack-check-prologue-1.c: New test. * gcc.target/riscv/stack-check-prologue-10.c: New test. * gcc.target/riscv/stack-check-prologue-11.c: New test. * gcc.target/riscv/stack-check-prologue-12.c: New test. * gcc.target/riscv/stack-check-prologue-13.c: New test. * gcc.target/riscv/stack-check-prologue-14.c: New test. * gcc.target/riscv/stack-check-prologue-15.c: New test. * gcc.target/riscv/stack-check-prologue-2.c: New test. * gcc.target/riscv/stack-check-prologue-3.c: New test. * gcc.target/riscv/stack-check-prologue-4.c: New test. * gcc.target/riscv/stack-check-prologue-5.c: New test. * gcc.target/riscv/stack-check-prologue-6.c: New test. * gcc.target/riscv/stack-check-prologue-7.c: New test. * gcc.target/riscv/stack-check-prologue-8.c: New test. 
* gcc.target/riscv/stack-check-prologue-9.c: New test. * gcc.target/riscv/stack-check-prologue.h: New file. * lib/target-supports.exp (check_effective_target_supports_stack_clash_protection): Add riscv. (check_effective_target_caller_implicit_probes): Likewise.
2024-08-09RISC-V: Move riscv_v_adjust_scalable_frameRaphael Moreira Zinsly1-31/+31
Move riscv_v_adjust_scalable_frame () in preparation for the stack clash protection support. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Move closer to riscv_expand_prologue.
2024-08-09RISC-V: Small stack tie changesRaphael Moreira Zinsly2-10/+10
Enable the register used by riscv_emit_stack_tie () to be passed as an argument so we can tie the stack with other registers besides hard_frame_pointer_rtx. Also don't allow operand 1 of stack_tie<mode> to be optimized to sp in preparation for the stack clash protection support. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_stack_tie): Pass the register to be tied to the stack pointer as argument. * config/riscv/riscv.md (stack_tie<mode>): Don't match equal operands.
2024-08-09c-family: regenerate c.opt.urlsPatrick Palka1-0/+3
The addition of -Wtemplate-body in r15-2774-g596d1ed9d40b10 means we need to regenerate c.opt.urls. gcc/c-family/ChangeLog: * c.opt.urls: Regenerate.
2024-08-09c++: add fixed testcase [PR116289]Patrick Palka1-0/+16
Fully fixed since r14-6724-gfced59166f95e9. PR c++/116289 PR c++/113063 gcc/testsuite/ChangeLog: * g++.dg/cpp2a/spaceship-synth16a.C: New test.
2024-08-09i386: Fix up __builtin_ia32_b{extr{,i}_u{32,64},zhi_{s,d}i} folding [PR116287]Jakub Jelinek4-4/+89
The GENERIC folding of these builtins has cases where it folds to a constant regardless of the value of the first operand. If so, we need to use omit_one_operand to avoid throwing away side-effects in the first operand if any. The cases which verify the first argument is INTEGER_CST don't need that, INTEGER_CST doesn't have side-effects. 2024-08-09 Jakub Jelinek <jakub@redhat.com> PR target/116287 * config/i386/i386.cc (ix86_fold_builtin) <case IX86_BUILTIN_BEXTR32>: When folding into zero without checking whether first argument is constant, use omit_one_operand. (ix86_fold_builtin) <case IX86_BUILTIN_BZHI32>: Likewise. * gcc.target/i386/bmi-pr116287.c: New test. * gcc.target/i386/bmi2-pr116287.c: New test. * gcc.target/i386/tbm-pr116287.c: New test.
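The side-effect problem described above can be modelled in portable C without the x86 builtins. This is only a source-level analogy of what the folder does on trees: the macros `FOLD_DROPPING_OPERAND` and `FOLD_KEEPING_SIDE_EFFECTS` and the function `next_value` are hypothetical names invented for this sketch.

```c
/* Counts how many times the operand expression was evaluated.  */
static int calls;

/* Hypothetical operand with a visible side effect.  */
static inline unsigned next_value(void)
{
    calls++;
    return 0x1234u;
}

/* Models the buggy fold: the result is the constant and the operand
   expression is discarded without being evaluated.  */
#define FOLD_DROPPING_OPERAND(src) (0u)

/* Models the fixed fold: the operand is still evaluated for its side
   effects, and the constant is the value of the whole expression.  */
#define FOLD_KEEPING_SIDE_EFFECTS(src) ((void)(src), 0u)
```

Both forms yield 0, but only the second one still calls `next_value()`, which is the behaviour the program's author would expect when the first argument is not a constant.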