path: root/gcc
Age  Commit message  Author  Files  Lines
2024-03-10  Merge commit 'ea1cd66f2200839d46a8b4dc140d18c00b849c82^' into HEAD  Thomas Schwinge  344  -4323/+12743
2024-03-10  Merge commit 'bc45e18d433f879a02e369d027829f90f9e85724' into HEAD  Thomas Schwinge  3  -15/+2
2024-03-10  Merge commit 'bc45e18d433f879a02e369d027829f90f9e85724^' into HEAD  Thomas Schwinge  3569  -50796/+116181
2024-03-10  Merge commit '0a85544e1aaeca41133ecfc438cda913dbc0f122' into HEAD  Thomas Schwinge  3  -22/+89
2024-03-10  Merge commit '0a85544e1aaeca41133ecfc438cda913dbc0f122^' into HEAD  Thomas Schwinge  1014  -65079/+93114
2024-03-09  AVR: Fix typos in comment, indentation glitches in avr.md.  Georg-Johann Lay  1  -44/+43
gcc/ * config/avr/avr.md: Fix typos in comment, indentation glitches and some other nits.
2024-03-09  fwprop: Restore previous behavior for forward propagation of RTL with MEMs ↵  Jakub Jelinek  1  -0/+1
[PR114284] Before the recent PR111267 r14-8319 fwprop changes, fwprop would never try to propagate what was not considered PROFITABLE, where the profitable part actually was partly about profitability, partly about very good reasons not to actually propagate and partly for cases where propagation is completely incorrect. In particular, classify_result has: /* Allow (subreg (mem)) -> (mem) simplifications with the following exceptions: 1) Propagating (mem)s into multiple uses is not profitable. 2) Propagating (mem)s across EBBs may not be profitable if the source EBB runs less frequently. 3) Propagating (mem)s into paradoxical (subreg)s is not profitable. 4) Creating new (mem/v)s is not correct, since DCE will not remove the old ones. */ if (single_use_p && single_ebb_p && SUBREG_P (old_rtx) && !paradoxical_subreg_p (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx)) return PROFITABLE; and didn't mark any other MEM_P (new_rtx) or rtxes which contain a MEM in its subrtxes as PROFITABLE. Now, since r14-8319 profitable_p method has been renamed to likely_profitable_p and has just a minor role. Now, rule 4) above is something that isn't about profitability, but about correct behavior, if you propagate mem/v, the code is miscompiled. This particular case has been fixed elsewhere by Haochen in r14-9379. But I think even the 1) and 2) and maybe 3) are a strong don't do it, don't rely solely on rtx costs, increasing the number of loads of the same memory, even when cached, is undesirable, canceling load hoisting can be undesirable as well. So, the following patch restores previous behavior of src contains any MEMs, in that case likely_profitable_p () is taken as the old profitable_p () as a requirement rather than just a hint. For propagation of something which doesn't load from memory this keeps the r14-8319 behavior. 
2024-03-09 Jakub Jelinek <jakub@redhat.com> PR target/114284 * fwprop.cc (try_fwprop_subst_pattern): Don't propagate src containing MEMs unless prop.likely_profitable_p ().
2024-03-09  LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to ↵  Xi Ruoyao  5  -2/+36
allow the IE to LE linker relaxation In Binutils the IE to LE relaxation must only be allowed when there is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12}, so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX to allow the relaxation. The IE to LE relaxation does not require the pcalau12i and the ld instruction to be adjacent, so we don't need to limit ourselves to using the macro. For the distro maintainers backporting changes: this change depends on r14-8721; without r14-8721, R_LARCH_RELAX can be emitted mistakenly in the extreme code model. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Support 'Q' for R_LARCH_RELAX for TLS IE. (loongarch_output_move): Use 'Q' to print R_LARCH_RELAX for TLS IE. * config/loongarch/loongarch.md (ld_from_got<mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/tls-ie-relax.c: New test. * gcc.target/loongarch/tls-ie-norelax.c: New test. * gcc.target/loongarch/tls-ie-extreme.c: New test.
2024-03-09  AVR: Add cost computation for some insn combine patterns.  Georg-Johann Lay  2  -11/+49
gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [PLUS]: Determine cost for usum_widenqihi and add_zero_extend1. [MINUS]: Determine costs for udiff_widenqihi, sub+zero_extend, sub+sign_extend. * config/avr/avr.md (*addhi3.sign_extend1, *subhi3.sign_extend2): Compute exact insn lengths. (*usum_widenqihi3): Allow input operands to commute.
2024-03-09  i386: Regenerate i386.opt.urls  Jakub Jelinek  1  -0/+3
When I've added the -mnoreturn-no-callee-saved-registers option to i386.opt, I forgot to regenerate i386.opt.urls and Mark's CI kindly reminded me of that. Fixed thusly. 2024-03-09 Jakub Jelinek <jakub@redhat.com> * config/i386/i386.opt.urls: Regenerate.
2024-03-09  LoongArch: testsuite: Add compilation options to the regname-fp-s9.c.  Lulu Cheng  1  -0/+1
When the value of the macro DEFAULT_CFLAGS is set to '-ansi -pedantic-errors', the regname-fp-s9.c test fails. To solve this problem, add the compilation options '-Wno-pedantic -std=gnu90' to this test case. gcc/testsuite/ChangeLog: * gcc.target/loongarch/regname-fp-s9.c: Add compilation option '-Wno-pedantic -std=gnu90'.
2024-03-09  LoongArch: Fixed an issue with the implementation of the template ↵  Lulu Cheng  2  -11/+67
atomic_compare_and_swapsi. If the hardware does not support LAMCAS, atomic_compare_and_swapsi needs to be implemented through "ll.w+sc.w". The instruction sequence has to determine whether two registers are equal. Since LoongArch's comparison instructions do not distinguish between 32-bit and 64-bit operands, both operand registers being compared must be sign-extended. One of the operand registers is loaded from memory by the "ll.w" instruction, which guarantees that it is sign-extended. However, the value in the other operand register is not guaranteed to be sign-extended. gcc/ChangeLog: * config/loongarch/sync.md (atomic_cas_value_strong<mode>): In loongarch64, a sign extension operation is added when operands[2] is a register operand and the mode is SImode. gcc/testsuite/ChangeLog: * g++.target/loongarch/atomic-cas-int.C: New test.
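The mismatch described above can be illustrated in plain C. This is only a sketch that models 64-bit registers with uint64_t; the helper names (sign_extend32, zero_extend32, regs_equal) are hypothetical and are not taken from sync.md or from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit register holding a 32-bit value that has been
   sign-extended, which is the form the commit says ll.w produces.  */
static uint64_t sign_extend32(uint32_t v)
{
    uint64_t r = v;
    if (v & UINT32_C(0x80000000))
        r |= UINT64_C(0xFFFFFFFF00000000);
    return r;
}

/* Model of a 64-bit register whose upper 32 bits were left as zero,
   one way the other operand could fail to be sign-extended.  */
static uint64_t zero_extend32(uint32_t v)
{
    return (uint64_t)v;
}

/* A full-width comparison: all 64 bits take part, because the
   comparison does not distinguish 32-bit from 64-bit operands.  */
static int regs_equal(uint64_t a, uint64_t b)
{
    return a == b;
}
```

With these helpers, regs_equal(sign_extend32(5), zero_extend32(5)) holds, but the same comparison for 0x80000000 does not, even though the low 32 bits are identical. That is why the expected value needs an explicit sign extension before it is compared.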
2024-03-09  Daily bump.  GCC Administrator  7  -1/+199
2024-03-09  ipa: Avoid excessive removing of SSAs (PR 113757)  Martin Jambor  2  -12/+16
PR 113757 shows that the code which was meant to debug-reset and remove SSAs defined by LHSs of calls redirected to __builtin_unreachable can trigger also when speculative devirtualization creates a call to a noreturn function (and since it is noreturn, it does not bother dealing with its return value). What is more, it seems that the code handling this case is not really necessary. I feel slightly idiotic about this because I have a feeling that I added it because of a failing test-case but I can neither find the testcase nor a reason why the code in cgraph_edge::redirect_call_stmt_to_callee would not be sufficient (it turns the SSA name into a default-def, a bit like IPA-SRA, but any code dominated by a call to a noreturn is not dangerous when it comes to its side-effects). So this patch just removes the handling. gcc/ChangeLog: 2024-02-07 Martin Jambor <mjambor@suse.cz> PR ipa/113757 * tree-inline.cc (redirect_all_calls): Remove code adding SSAs to id->killed_new_ssa_names. gcc/testsuite/ChangeLog: 2024-02-07 Martin Jambor <mjambor@suse.cz> PR ipa/113757 * g++.dg/ipa/pr113757.C: New test.
2024-03-08  [PR113790][LRA]: Fixing LRA ICE on riscv64  Vladimir N. Makarov  1  -7/+13
LRA failed to consider all insn alternatives when non-reload pseudo did not get a hard register. This resulted in failure to generate code by LRA. The patch fixes this problem. gcc/ChangeLog: PR target/113790 * lra-assigns.cc (assign_by_spills): Set up all_spilled_pseudos for non-reload pseudo too.
2024-03-08  bpf: add size threshold for inlining mem builtins  David Faust  5  -2/+65
BPF cannot fall back on library calls to implement memmove, memcpy and memset, so we attempt to expand these inline always if possible. However, this inline expansion was being attempted even for excessively large operations, which could result in gcc consuming huge amounts of memory and hanging. Add a size threshold in the BPF backend below which to always expand these operations inline, and introduce an option -minline-memops-threshold= to control the threshold. Defaults to 1024 bytes. gcc/ * config/bpf/bpf.cc (bpf_expand_cpymem, bpf_expand_setmem): Do not attempt inline expansion if size is above threshold. * config/bpf/bpf.opt (-minline-memops-threshold): New option. * doc/invoke.texi (eBPF Options) <-minline-memops-threshold>: Document. gcc/testsuite/ * gcc.target/bpf/inline-memops-threshold-1.c: New test. * gcc.target/bpf/inline-memops-threshold-2.c: New test.
2024-03-08  arm: testsuite: tweak bics_3.c [PR113542]  Richard Earnshaw  1  -11/+8
This test was too simple, which meant that the compiler was sometimes able to find a better optimization of the code than using a BICS instruction. Fix this by changing the test slightly to produce a sequence where BICS should always be the preferred solution. gcc/testsuite: PR target/113542 * gcc.target/arm/bics_3.c: Adjust code to something which should always result in BICS.
2024-03-08  bpf: testsuite: fix unresolved test in memset-1.c  David Faust  2  -8/+22
The test was trying to do too much by both checking for an error, and checking the resulting assembly. Of course, due to the error no asm was produced, so the scan-asm went unresolved. Split it into two separate tests to fix the issue. gcc/testsuite/ * gcc.target/bpf/memset-1.c: Move error test case to... * gcc.target/bpf/memset-2.c: ... here. New test.
2024-03-08  ARM: Fix builtin-bswap-1.c test [PR113915]  Wilco Dijkstra  1  -4/+4
On Thumb-2 the use of CBZ blocks conditional execution, so change the test to compare with a non-zero value. gcc/testsuite/ChangeLog: PR target/113915 * gcc.target/arm/builtin-bswap.x: Fix test to avoid emitting CBZ.
2024-03-08  testsuite: Fix up pr113617 test for darwin [PR113617]  Jakub Jelinek  2  -1/+40
The test attempts to link a shared library, and apparently Darwin doesn't allow by default for shared libraries to contain undefined symbols. The following patch just adds dummy definitions for the symbols, so that the library no longer has any undefined symbols at least in my linux testing. Furthermore, for target { !shared } targets (like darwin until it is fixed in target-supports.exp), because we then link a program rather than shared library, the patch also adds a dummy main definition so that it can link. 2024-03-08 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/113617 PR target/114233 * g++.dg/other/pr113617.C: Define -DSHARED when linking with -shared. * g++.dg/other/pr113617-aux.cc: Add definitions for used methods and templates not defined elsewhere.
2024-03-08  tree-optimization/114269 - 434.zeusmp regression after SCEV analysis fix  Richard Biener  1  -0/+48
The following addresses a performance regression caused by the recent SCEV analysis fix with regard to folding multiplications and undefined behavior on overflow. We do not handle (T) { a, +, b } * c but can treat sign-conversions from unsigned by performing the multiplication in the unsigned type. That's what we already do for additions (but that misses one case that turns out important). This fixes the 434.zeusmp regression for me. PR tree-optimization/114269 PR tree-optimization/114074 * tree-chrec.cc (chrec_fold_plus_1): Handle sign-conversions in the third CASE_CONVERT case as well. (chrec_fold_multiply): Handle sign-conversions from unsigned by performing the operation in the unsigned type.
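As a source-level analogue of performing the multiplication in the unsigned type, the sketch below multiplies a sign-converted unsigned value by a constant without ever executing a signed multiplication. The function name mul_via_unsigned is hypothetical; this is not the tree-chrec.cc code, only an illustration of why the unsigned type is safe to fold in.

```c
#include <assert.h>

/* Compute (int) u * c as (int) (u * (unsigned) c).
   Unsigned arithmetic wraps modulo 2^N, so the multiplication itself
   cannot invoke undefined behavior on overflow.  The final conversion
   back to int is implementation-defined for out-of-range values; GCC
   defines it as reduction modulo 2^N (an assumption about the compiler,
   stated here because the C standard does not guarantee it).  */
static int mul_via_unsigned(unsigned u, int c)
{
    unsigned product = u * (unsigned)c;
    return (int)product;
}
```

For small operands the result matches the ordinary signed product, for example mul_via_unsigned(7u, -2) yields -14 under the wrapping conversion noted above.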
2024-03-08  modula2: Rebuild bootstrap tools with faster dynamic arrays  Gaius Mulley  11  -257/+540
This patch configures the larger dynamic arrays to use a larger growth factor and larger initial size. It also rebuilds mc and pge using the improved default array sizes in Indexing.mod. gcc/m2/ChangeLog: * gm2-compiler/M2Quads.mod (Init): Use InitIndexTuned with default size 65K. * gm2-compiler/SymbolConversion.mod (Init): Ditto. * gm2-compiler/SymbolTable.mod (BEGIN): Ditto. * mc-boot/GM2Dependent.cc: Rebuild. * mc-boot/GM2Dependent.h: Rebuild. * mc-boot/GM2RTS.cc: Rebuild. * pge-boot/GIndexing.cc: Rebuild. * pge-boot/GIndexing.h: Rebuild. * pge-boot/GM2Dependent.cc: Rebuild. * pge-boot/GM2Dependent.h: Rebuild. * pge-boot/GM2RTS.cc: Rebuild. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-03-08  AVR: Add an insn combine pattern for offset computation.  Georg-Johann Lay  2  -0/+44
Computing uint16_t += 2 * uint8_t can occur when an offset into a 16-bit array is computed. Without this pattern it costs six instructions: a move (1), a zero-extend (1), a shift (2) and an addition (2). With this pattern it costs four. gcc/ * config/avr/avr.md (*addhi3_zero_extend.ashift1): New pattern. * config/avr/avr.cc (avr_rtx_costs_1) [PLUS]: Compute its cost.
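The kind of source expression that gives rise to this computation might look like the following sketch. The function name element_address is hypothetical; it simply spells out uint16_t += 2 * uint8_t for an array of 16-bit elements.

```c
#include <assert.h>
#include <stdint.h>

/* Address of element `index` in an array of 16-bit elements that starts
   at `base`: the 8-bit index is zero-extended, doubled (shift left by
   one) and added to the 16-bit base, wrapping at 16 bits.  */
static uint16_t element_address(uint16_t base, uint8_t index)
{
    return (uint16_t)(base + 2u * index);
}
```

For example, element_address(0x0100, 3) is 0x0106.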
2024-03-08  bb-reorder: Fix assertion  Jakub Jelinek  1  -1/+2
When touching bb-reorder yesterday, I've noticed the checking assert doesn't actually check what it meant to. Because asm_noperands returns >= 0 for inline asm patterns (in that case number of input+output+label operands, so asm goto has at least one) and -1 if it isn't inline asm. The following patch fixes the assertion to actually check that it is asm goto. 2024-03-08 Jakub Jelinek <jakub@redhat.com> * bb-reorder.cc (fix_up_fall_thru_edges): Fix up checking assert, asm_noperands < 0 means it is not asm goto too.
2024-03-08  i386: Guard noreturn no-callee-saved-registers optimization with ↵  Jakub Jelinek  11  -10/+26
-mnoreturn-no-callee-saved-registers [PR38534] The following patch hides the noreturn no_callee_saved_registers (except bp) optimization with a not enabled by default option. The reason is that most noreturn functions should be called just once in a program (unless they are recursive or invoke longjmp or similar, for exceptions we already punt), so it isn't that essential to save a few instructions in their prologue, but more importantly because it interferes with debugging. And unlike most other optimizations, doesn't actually make it harder to debug the given function, which can be solved by recompiling the given function if it is too hard to debug, but makes it harder to debug the callers of that noreturn function. Those can be from a different translation unit, different binary or even different package, so if e.g. glibc abort needs to use all of the callee saved registers (%rbx, %rbp, %r12, %r13, %r14, %r15), debugging any programs which abort will be harder because any DWARF expressions which use those registers will be optimized out, not just in the immediate caller, but in other callers as well until some frame restores a particular register from some stack slot. 2024-03-08 Jakub Jelinek <jakub@redhat.com> PR target/38534 * config/i386/i386.opt (mnoreturn-no-callee-saved-registers): New option. * config/i386/i386-options.cc (ix86_set_func_type): Don't use TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP unless ix86_noreturn_no_callee_saved_registers is enabled. * doc/invoke.texi (-mnoreturn-no-callee-saved-registers): Document. * gcc.target/i386/pr38534-1.c: Add -mnoreturn-no-callee-saved-registers to dg-options. * gcc.target/i386/pr38534-2.c: Likewise. * gcc.target/i386/pr38534-3.c: Likewise. * gcc.target/i386/pr38534-4.c: Likewise. * gcc.target/i386/pr38534-5.c: Likewise. * gcc.target/i386/pr38534-6.c: Likewise. * gcc.target/i386/pr114097-1.c: Likewise. * gcc.target/i386/stack-check-17.c: Likewise.
2024-03-08  c-family, c++: Fix up handling of types which may have padding in ↵  Jakub Jelinek  3  -6/+66
__atomic_{compare_}exchange On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote: > Ah, although __atomic_compare_exchange only takes pointers, the > compiler replaces that with a call to __atomic_compare_exchange_n > which takes the newval by value, which presumably uses an 80-bit FP > register and so the padding bits become indeterminate again. The problem is that __atomic_{,compare_}exchange lowering if it has a supported atomic 1/2/4/8/16 size emits code like: _3 = *p2; _4 = VIEW_CONVERT_EXPR<I_type> (_3); so if long double or some small struct etc. has some carefully filled padding bits, those bits can be lost on the assignment. The library call for __atomic_{,compare_}exchange would actually work because it would load the value from memory using integral type or memcpy. E.g. on void foo (long double *a, long double *b, long double *c) { __atomic_compare_exchange (a, b, c, false, __ATOMIC_RELAXED, __ATOMIC_RELAXED); } we end up with -O0 with: fldt (%rax); fstpt -48(%rbp); movq -48(%rbp), %rax; movq -40(%rbp), %rdx i.e. load *c from memory into 387 register, store it back to uninitialized stack slot (the padding bits are now random in there) and then load a __uint128_t (pair of GPR regs). The problem is that we first load it using whatever type the pointer points to and then VIEW_CONVERT_EXPR that value: p2 = build_indirect_ref (loc, p2, RO_UNARY_STAR); p2 = build1 (VIEW_CONVERT_EXPR, I_type, p2); The following patch fixes that by creating a MEM_REF instead, with the I_type type, but with the original pointer type on the second argument for aliasing purposes, so we actually preserve the padding bits that way. With this patch instead of the above assembly we emit: movq 8(%rax), %rdx; movq (%rax), %rax I had to add support for MEM_REF in pt.cc, though with the assumption that it has been already originally created with non-dependent types/operands (which is the case here for the __atomic*exchange lowering).
2024-03-08 Jakub Jelinek <jakub@redhat.com> gcc/c-family/ * c-common.cc (resolve_overloaded_atomic_exchange): Instead of setting p1 to VIEW_CONVERT_EXPR<I_type> (*p1), set it to MEM_REF with p1 and (typeof (p1)) 0 operands and I_type type. (resolve_overloaded_atomic_compare_exchange): Similarly for p2. gcc/cp/ * pt.cc (tsubst_expr): Handle MEM_REF. gcc/testsuite/ * g++.dg/ext/atomic-5.C: New test.
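The difference between loading a value through its own type and loading its object representation can be sketched at the source level with memcpy, which the message above names as one way the library call reads the bytes. The struct padded type and the load_repr helper are hypothetical illustrations, not part of the patch, and the sketch does not perform any atomic operation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* On common ABIs this struct has padding bytes between tag and value.  */
struct padded {
    char tag;
    int value;
};

/* Read the object representation as an integer, byte for byte.
   Because the bytes are copied straight from memory, padding bytes
   arrive unchanged, which is the property the integer-typed MEM_REF
   is described as preserving.  At most 8 bytes are copied.  */
static uint64_t load_repr(const struct padded *p)
{
    uint64_t bits = 0;
    size_t n = sizeof *p < sizeof bits ? sizeof *p : sizeof bits;

    assert(p != NULL);
    memcpy(&bits, p, n);
    return bits;
}
```

If every byte of a struct padded object, padding included, is first set to 0xAB with memset, load_repr returns those same bytes, whereas copying the struct by value gives no such guarantee for the padding.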
2024-03-08  dwarf2out: Emit DW_AT_export_symbols on anon unions/structs [PR113918]  Jakub Jelinek  6  -0/+76
DWARF5 added DW_AT_export_symbols both for use on inline namespaces (where we emit it), but also on anonymous unions/structs (and we didn't emit that attribute there). The following patch fixes it. 2024-03-08 Jakub Jelinek <jakub@redhat.com> PR debug/113918 gcc/ * dwarf2out.cc (gen_field_die): Emit DW_AT_export_symbols on anonymous unions or structs for -gdwarf-5 or -gno-strict-dwarf. gcc/c/ * c-tree.h (c_type_dwarf_attribute): Declare. * c-objc-common.h (LANG_HOOKS_TYPE_DWARF_ATTRIBUTE): Redefine. * c-objc-common.cc: Include dwarf2.h. (c_type_dwarf_attribute): New function. gcc/cp/ * cp-objcp-common.cc (cp_type_dwarf_attribute): Return 1 for DW_AT_export_symbols on anonymous structs or unions. gcc/testsuite/ * c-c++-common/dwarf2/pr113918.c: New test.
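For context, an anonymous union is one whose members are named as if they belonged to the enclosing struct, which is the situation the attribute describes to a debugger. The sketch below uses C11 syntax; the type and function names (struct value, make_int, read_int) are hypothetical and do not come from the new test.

```c
#include <assert.h>

enum value_kind { KIND_INT = 0, KIND_FLOAT = 1 };

struct value {
    enum value_kind kind;
    union {              /* anonymous union: no tag, no member name */
        int i;
        float f;
    };                   /* i and f are accessed directly as v.i and v.f */
};

/* Build a value holding an int; note the member is written as v.i,
   with no intermediate union member name.  */
static struct value make_int(int x)
{
    struct value v;
    v.kind = KIND_INT;
    v.i = x;
    return v;
}

/* Read the int member back through the same direct name.  */
static int read_int(struct value v)
{
    return v.i;
}
```

Compiling such a file with -gdwarf-5 is the case in which, per the message above, the anonymous union's DIE now carries DW_AT_export_symbols.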
2024-03-08  c++: Fix up parameter pack diagnostics on xobj vs. varargs functions [PR113802]  Jakub Jelinek  2  -30/+69
The simple presence of ellipsis as next token after the parameter declaration doesn't imply it is a parameter pack, it sometimes is, e.g. if its type is a pack, but sometimes is not and in that case it acts the same as if the next tokens were , ... instead of just ... The xobj param cannot be a function parameter pack though treats both the declarator->parameter_pack_p and token->type == CPP_ELLIPSIS as sufficient conditions for the error. The conditions for CPP_ELLIPSIS are done a little bit later in the same function and complex enough that IMHO shouldn't be repeated, on the other side for the declarator->parameter_pack_p case we clear that flag for xobj params for error recovery reasons. This patch just moves the diagnostics later (after the CPP_ELLIPSIS handling) and changes the error recovery behavior by pretending the this specifier didn't appear if an error is reported. 2024-03-08 Jakub Jelinek <jakub@redhat.com> PR c++/113802 * parser.cc (cp_parser_parameter_declaration): Move the xobj_param_p pack diagnostics after ellipsis handling and if an error is reported, pretend this specifier didn't appear. Formatting fix. * g++.dg/cpp23/explicit-obj-diagnostics3.C (S0, S1, S2, S3, S4): Don't expect any diagnostics on f and fd member function templates, add similar templates with ...Selves instead of Selves as k and kd and expect diagnostics for those. Expect extra diagnostics in error recovery for g and gd member function templates.
2024-03-08  testsuite/108355 - make gcc.dg/tree-ssa/ssa-fre-104.c properly XFAIL  Richard Biener  1  -1/+1
The testcase only XFAILs on targets where int has an alignment of sizeof(int). Align the respective array this way to make it XFAIL consistently. PR testsuite/108355 * gcc.dg/tree-ssa/ssa-fre-104.c: Align e.
2024-03-08  modula2: Add constant aggregate tests  Gaius Mulley  4  -0/+132
This patch adds four constant aggregate tests and assignment of arrays by a constant in two different scopes. gcc/testsuite/ChangeLog: * gm2/iso/pass/arrayconst.mod: New test. * gm2/iso/pass/arrayconst2.mod: New test. * gm2/iso/pass/arrayconst3.mod: New test. * gm2/iso/pass/arrayconst4.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-03-08  RISC-V: Fix ICE in riscv vector costs  demin.han  2  -0/+17
The following code can result in an ICE with -march=rv64gcv --param riscv-autovec-lmul=dynamic -O3: char *jpeg_difference7_input_buf; void jpeg_difference7(int *diff_buf) { unsigned width; int samp, Rb; while (--width) { Rb = samp = *jpeg_difference7_input_buf; *diff_buf++ = -(int)(samp + (long)Rb >> 1); } } A biggest_mode update was missed in one branch, which triggers the assertion failure: gcc_assert (biggest_size >= mode_size); Tested on RV64 with no regressions. PR target/114264 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc: Fix ICE gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114264.c: New test. Signed-off-by: demin.han <demin.han@starfivetech.com>
2024-03-08  fwprop: Avoid volatile rtx to be propagated  Haochen Gui  2  -0/+17
The patch for PR111267 (commit id 86de9b66480b710202a2898cf513db105d8c432f) introduced an exception for propagation into single-set insns: a propagation which might not be profitable (as checked by profitable_p) is still allowed into a single-set insn. This has a potential problem: a volatile operand might be propagated into a single-set insn. If the defining insn is not eliminated after propagation, the volatile operand will be executed multiple times. This patch fixes the problem by skipping volatile set source rtx in propagation. gcc/ * fwprop.cc (forward_propagate_into): Return false for volatile set source rtx. gcc/testsuite/ * gcc.target/powerpc/fwprop-1.c: New.
2024-03-08  Daily bump.  GCC Administrator  7  -1/+277
2024-03-08  c++: Redetermine whether to write vtables on stream-in [PR114229]  Nathaniel Shead  7  -4/+42
We currently always stream DECL_INTERFACE_KNOWN, which is needed since many kinds of declarations already have their interface determined at parse time. But for vtables and type-info declarations we need to re-evaluate on stream-in as whether they need to be emitted or not changes in each TU, so this patch clears DECL_INTERFACE_KNOWN on these kinds of declarations so that they can go through 'import_export_decl' again. Note that the precise details of the virt-2 tests will need to change when we implement the resolution of [1], for now I just updated the test to not fail with the new (current) semantics. [1]: https://github.com/itanium-cxx-abi/cxx-abi/pull/171 PR c++/114229 gcc/cp/ChangeLog: * module.cc (trees_out::core_bools): Redetermine DECL_INTERFACE_KNOWN on stream-in for vtables and tinfo. * decl2.cc (import_export_decl): Add fixme for ABI changes with module vtables and tinfo. gcc/testsuite/ChangeLog: * g++.dg/modules/virt-2_b.C: Update test to acknowledge that we now emit vtables here too. * g++.dg/modules/virt-3_a.C: New test. * g++.dg/modules/virt-3_b.C: New test. * g++.dg/modules/virt-3_c.C: New test. * g++.dg/modules/virt-3_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-03-07  c++/modules: member alias tmpl partial inst [PR103994]  Patrick Palka  7  -124/+77
Alias templates are weird in that their specializations can appear in both decl_specializations and type_specializations. They're always in the decl table, and additionally appear in the type table only at parse time via finish_template_type. There seems to be no good reason for them to appear in both tables, and the code paths end up stepping over each other in particular for a partial instantiation such as A<B>::key_arg<T> in the below modules testcase: the type code path (lookup_template_class) wants to set TI_TEMPLATE to the most general template whereas the decl code path (tsubst_template_decl called during instantiation of A<B>) already set TI_TEMPLATE to the partially instantiated TEMPLATE_DECL. This TI_TEMPLATE change ends up confusing modules which decides to stream the logically equivalent TYPE_DECL and TEMPLATE_DECL for this partial instantiation separately. This patch fixes this by making lookup_template_class dispatch to instantiate_alias_template early for alias template specializations. In turn we now add such specializations only to the decl table. This admits some nice simplification in the modules code which otherwise has to cope with such specializations appearing in both tables. PR c++/103994 gcc/cp/ChangeLog: * cp-tree.h (add_mergeable_specialization): Remove second parameter. * module.cc (depset::disc_bits::DB_ALIAS_TMPL_INST_BIT): Remove. (depset::disc_bits::DB_ALIAS_SPEC_BIT): Remove. (depset::is_alias_tmpl_inst): Remove. (depset::is_alias): Remove. (merge_kind::MK_tmpl_alias_mask): Remove. (merge_kind::MK_alias_spec): Remove. (merge_kind_name): Remove entries for alias specializations. (trees_out::core_vals) <case TEMPLATE_DECL>: Adjust after removing is_alias_tmpl_inst. (trees_in::decl_value): Adjust add_mergeable_specialization calls. (trees_out::get_merge_kind) <case depset::EK_SPECIALIZATION>: Use MK_decl_spec for alias template specializations. (trees_out::key_mergeable): Simplify after MK_tmpl_alias_mask removal. 
(depset::hash::make_dependency): Adjust after removing DB_ALIAS_TMPL_INST_BIT. (specialization_add): Don't allow alias templates when !decl_p. (depset::hash::add_specializations): Remove now-dead code accommodating alias template specializations in the type table. * pt.cc (lookup_template_class): Dispatch early to instantiate_alias_template for alias templates. Simplify accordingly. (add_mergeable_specialization): Remove alias_p parameter and simplify accordingly. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99425-1_b.H: s/alias/decl in dump scan. * g++.dg/modules/tpl-alias-1_a.H: Likewise. * g++.dg/modules/tpl-alias-2_a.H: New test. * g++.dg/modules/tpl-alias-2_b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07  AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]  Wilco Dijkstra  2  -47/+57
The new RTL introduced for LDP/STP results in regressions due to use of UNSPEC. Given the new LDP fusion pass is good at finding LDP opportunities, change the memcpy, memmove and memset expansions to emit single vector loads/stores. This fixes the regression and enables more RTL optimization on the standard memory accesses. Handling of unaligned tail of memcpy/memmove is improved with -mgeneral-regs-only. SPEC2017 performance improves slightly. Codesize is a bit worse due to missed LDP opportunities as discussed in the PR. gcc/ChangeLog: PR target/113618 * config/aarch64/aarch64.cc (aarch64_copy_one_block): Remove. (aarch64_expand_cpymem): Emit single load/store only. (aarch64_set_one_block): Emit single stores only. gcc/testsuite/ChangeLog: PR target/113618 * gcc.target/aarch64/pr113618.c: New test.
2024-03-07  c++/modules: inline namespace abi_tag streaming [PR110730]  Patrick Palka  5  -0/+66
The unreduced testcase from PR110730 crashes at runtime ultimately because we don't stream the abi_tag attribute on inline namespaces and so the filesystem::current_path() call resolves to the non-C++11 ABI version even though the C++11 ABI is active, leading to a crash when destroying the path temporary (which contains an std::string member). Similar story for the PR105512 testcase. While we do stream the DECL_ATTRIBUTES of all decls that go through the generic tree streaming routines, it seems namespaces are streamed separately from other decls and we don't use the generic routines for them. So this patch makes us stream the abi_tag manually for (inline) namespaces. PR c++/110730 PR c++/105512 gcc/cp/ChangeLog: * module.cc (module_state::write_namespaces): Stream the abi_tag attribute of an inline namespace. (module_state::read_namespaces): Likewise. gcc/testsuite/ChangeLog: * g++.dg/modules/hello-2_a.C: New test. * g++.dg/modules/hello-2_b.C: New test. * g++.dg/modules/namespace-6_a.H: New test. * g++.dg/modules/namespace-6_b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07  testsuite, darwin: improve check for -shared support  Francois-Xavier Coudert  1  -1/+1
The undefined symbols are allowed for C checks, but when this is run as C++, the mangled foo() symbol is still seen as undefined, and the testsuite thinks darwin does not support -shared. gcc/testsuite/ChangeLog: PR target/114233 * lib/target-supports.exp: Fix test for C++.
2024-03-07  vect: Do not peel epilogue for partial vectors.  Robin Dapp  3  -23/+45
r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR114196 shows that we also run into the problem without early break. Therefore merge the condition into the topmost vectorization guard. gcc/ChangeLog: PR middle-end/114196 * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p): Merge vectorization guards. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr114196.c: New test. * gcc.target/riscv/rvv/autovec/pr114196.c: New test.
2024-03-07  PR modula2/109969 Linking large project causes an ICE  Gaius Mulley  6  -471/+451
This patch contains a re-write of M2LexBuf.mod which removes the linked list of token buckets and simplifies the implementation using a dynamic array. It contains more checking (for empty source files for example). The patch also contains a fix for an ICE in gcc/m2/gm2-gcc/builtins.cc gcc/m2/ChangeLog: PR modula2/109969 * gm2-compiler/M2LexBuf.def (TokenToLineNo): Rename parameter. (TokenToColumnNo): Rename parameter. (TokenToLocation): Rename parameter. (FindFileNameFromToken): Rename parameter. (DumpTokens): Rewrite comment. * gm2-compiler/M2LexBuf.mod: Rewrite. * gm2-compiler/P0SyntaxCheck.bnf (CheckInsertCandidate): DumpTokens before and after inserting recovery token. * gm2-gcc/m2builtins.cc (do_target_support_exists): Add bf_c99_compl case. * gm2-libs/Indexing.def (InitIndexTuned): New procedure function. (IsEmpty): New procedure function. * gm2-libs/Indexing.mod (InitIndexTuned): New procedure function. (IsEmpty): New procedure function. (Index): New field GrowFactor. (PutIndice): Use GrowFactor to extend dynamic array. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-03-07  c++: ICE with variable template and [[deprecated]] [PR110031]  Marek Polacek  2  -1/+33
lookup_and_finish_template_variable already has and uses the complain parameter but it is not passing it down to mark_used so we got the default tf_warning_or_error, which causes various problems when lookup_and_finish_template_variable gets called with complain=tf_none. PR c++/110031 gcc/cp/ChangeLog: * pt.cc (lookup_and_finish_template_variable): Pass complain to mark_used. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/inline-var11.C: New test.
2024-03-07  doc: Fix docs for -dD regarding predefined macros  Jonathan Wakely  1  -2/+1
The manual has always claimed that -dD differs from -dM by not outputting predefined macros, but that's untrue. It has been untrue since GCC 3.0 (probably with the change to use libcpp as the default preprocessor implementation). gcc/ChangeLog: * doc/cppopts.texi: Remove incorrect claim about -dD not outputting predefined macros.
2024-03-07  rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di [PR113950]  Jeevitha  2  -2/+26

When we expand the __builtin_vsx_splat_2di built-in, we were allowing an immediate value for the second operand, which causes an unrecognizable insn ICE. Even though the immediate value was forced into a register, it wasn't correctly assigned to the second operand. This patch corrects the assignment of op1 to operands[1].

2024-03-07  Jeevitha Palanisamy  <jeevitha@linux.ibm.com>

gcc/
	PR target/113950
	* config/rs6000/vsx.md (vsx_splat_<mode>): Correct assignment to operand1 and simplify else if with else.

gcc/testsuite/
	PR target/113950
	* gcc.target/powerpc/pr113950.c: New testcase.
2024-03-07  Fix bogus error on allocator for array type with Dynamic_Predicate  Eric Botcazou  2  -2/+19

This is a regression present on all active branches: the compiler gives a bogus error on an allocator for an unconstrained array type declared with a Dynamic_Predicate because Apply_Predicate_Check is invoked directly on a subtype reference, which it cannot handle.

This moves the check to the resulting access value (after dereference) like in Expand_Allocator_Expression.

gcc/ada/
	PR ada/113979
	* exp_ch4.adb (Expand_N_Allocator): In the subtype indication case, call Apply_Predicate_Check on the resulting access value if needed.

gcc/testsuite/
	* gnat.dg/predicate15.adb: New test.
2024-03-07  Include safe-ctype.h after C++ standard headers, to avoid over-poisoning  Francois-Xavier Coudert  1  -21/+18

When building gcc's C++ sources against recent libc++, the poisoning of the ctype macros due to including safe-ctype.h before including C++ standard headers such as <list>, <map>, etc, causes many compilation errors, similar to:

    In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
    In file included from /home/dim/src/gcc/master/gcc/system.h:233:
    In file included from /usr/include/c++/v1/vector:321:
    In file included from /usr/include/c++/v1/__format/formatter_bool.h:20:
    In file included from /usr/include/c++/v1/__format/formatter_integral.h:32:
    In file included from /usr/include/c++/v1/locale:202:
    /usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute only applies to structs, variables, functions, and namespaces
    546 | _LIBCPP_INLINE_VISIBILITY
        | ^
    /usr/include/c++/v1/__config:813:37: note: expanded from macro '_LIBCPP_INLINE_VISIBILITY'
    813 | # define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
        | ^
    /usr/include/c++/v1/__config:792:26: note: expanded from macro '_LIBCPP_HIDE_FROM_ABI'
    792 | __attribute__((__abi_tag__(_LIBCPP_TOSTRING( _LIBCPP_VERSIONED_IDENTIFIER))))
        | ^
    In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
    In file included from /home/dim/src/gcc/master/gcc/system.h:233:
    In file included from /usr/include/c++/v1/vector:321:
    In file included from /usr/include/c++/v1/__format/formatter_bool.h:20:
    In file included from /usr/include/c++/v1/__format/formatter_integral.h:32:
    In file included from /usr/include/c++/v1/locale:202:
    /usr/include/c++/v1/__locale:547:37: error: expected ';' at end of declaration list
    547 | char_type toupper(char_type __c) const
        | ^
    /usr/include/c++/v1/__locale:553:48: error: too many arguments provided to function-like macro invocation
    553 | const char_type* toupper(char_type* __low, const char_type* __high) const
        | ^
    /home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note: macro 'toupper' defined here
    146 | #define toupper(c) do_not_use_toupper_with_safe_ctype
        | ^

This is because libc++ uses different transitive includes than libstdc++, and some of those transitive includes pull in various ctype declarations (typically via <locale>).

There was already a special case for including <string> before safe-ctype.h, so move the rest of the C++ standard header includes to the same location, to fix the problem.

gcc/ChangeLog:

	* system.h: Include safe-ctype.h after C++ standard headers.

Signed-off-by: Dimitry Andric <dimitry@andric.com>
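The ordering rule can be shown in miniature with plain C. In this sketch the poisoning macro is copied from the diagnostic quoted in the commit message, while upper_ascii is a hypothetical stand-in for a safe replacement and assumes an ASCII execution character set. The point is only that the poison has to come after every header that declares the real toupper.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <ctype.h>   /* the system declaration of toupper is parsed here, unpoisoned */

/* Hypothetical ASCII-only replacement; not the libiberty implementation. */
static int upper_ascii(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Poison toupper only after all headers that declare it have been seen.
   If this macro preceded <ctype.h> (or a header that pulls in ctype
   declarations), the declaration itself would be rewritten by the macro
   and fail to parse, which is the failure mode described above.  */
#undef toupper
#define toupper(c) do_not_use_toupper_with_safe_ctype
```

From this point on, any call such as toupper (ch) expands to an undeclared identifier and is rejected at compile time, while upper_ascii remains usable.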
2024-03-07  analyzer: Fix up some -Wformat* warnings  Jakub Jelinek  5  -1/+5

I'm seeing warnings like

    ../../gcc/analyzer/access-diagram.cc: In member function ‘void ana::bit_size_expr::print(pretty_printer*) const’:
    ../../gcc/analyzer/access-diagram.cc:399:26: warning: unknown conversion type character ‘E’ in format [-Wformat=]
    399 | pp_printf (pp, _("%qE bytes"), bytes_expr);
        | ^~~~~~~~~~~

when building stage2/stage3 gcc. While such warnings would be understandable when building stage1 because one could e.g. have some older host compiler which doesn't understand some of the format specifiers, the above seems to be because we have in pretty-print.h

    #ifdef GCC_DIAG_STYLE
    #define GCC_PPDIAG_STYLE GCC_DIAG_STYLE
    #else
    #define GCC_PPDIAG_STYLE __gcc_diag__
    #endif

and use GCC_PPDIAG_STYLE e.g. for pp_printf, and while diagnostic-core.h has

    #ifndef GCC_DIAG_STYLE
    #define GCC_DIAG_STYLE __gcc_tdiag__
    #endif

(and similarly various FE headers include their own GCC_DIAG_STYLE) when including pretty-print.h before diagnostic-core.h we end up with __gcc_diag__ style rather than __gcc_tdiag__ style, which I think is the right thing for the analyzer, because analyzer seems to use default_tree_printer everywhere:

    grep pp_format_decoder.*=.default_tree_printer analyzer/* | wc -l
    57

The following patch fixes that by making sure diagnostic-core.h is included before pretty-print.h.

2024-03-07  Jakub Jelinek  <jakub@redhat.com>

	* access-diagram.cc: Include diagnostic-core.h before including diagnostic.h or diagnostic-path.h.
	* sm-malloc.cc: Likewise.
	* diagnostic-manager.cc: Likewise.
	* call-summary.cc: Likewise.
	* record-layout.cc: Likewise.
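The include-order effect can be reproduced in a single C file. The sketch below uses hypothetical macro names (DEMO_DIAG_STYLE, DEMO_PPDIAG_STYLE) and string values in place of the real format-attribute archetypes; the two preprocessor blocks stand in for pretty-print.h and diagnostic-core.h being seen in the problematic order.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <string.h>

/* Stand-in for pretty-print.h, seen first: DEMO_DIAG_STYLE is not defined
   yet, so the fallback branch is taken and the choice is frozen here.  */
#ifdef DEMO_DIAG_STYLE
#define DEMO_PPDIAG_STYLE DEMO_DIAG_STYLE
#else
#define DEMO_PPDIAG_STYLE "diag"
#endif

/* Stand-in for diagnostic-core.h, seen second: it supplies the default,
   but too late to influence the block above.  */
#ifndef DEMO_DIAG_STYLE
#define DEMO_DIAG_STYLE "tdiag"
#endif

/* Style that the pretty-printer declarations would end up using. */
static const char *pp_style(void)
{
    return DEMO_PPDIAG_STYLE;
}

/* Style that the diagnostic declarations would end up using. */
static const char *diag_style(void)
{
    return DEMO_DIAG_STYLE;
}
```

In this order pp_style() yields "diag" while diag_style() yields "tdiag". Swapping the two preprocessor blocks makes both agree on "tdiag", which mirrors the reordering applied by the patch.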
2024-03-07  c++: Fix ICE diagnosing incomplete type of overloaded function set [PR98356]  Nathaniel Shead  2  -6/+14

In the linked PR the result of 'get_first_fn' is a USING_DECL against the template parameter, to be filled in on instantiation. But we don't actually need to get the first set of the member functions: it's enough to know that we have a (possibly overloaded) member function at all.

	PR c++/98356

gcc/cp/ChangeLog:

	* typeck2.cc (cxx_incomplete_type_diagnostic): Don't assume 'member' will be a FUNCTION_DECL (or something like it).

gcc/testsuite/ChangeLog:

	* g++.dg/pr98356.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-03-07  c++: Stream DECL_CONTEXT for template template parms [PR98881]  Nathaniel Shead  5  -31/+47

When streaming in a nested template-template parameter as in the attached testcase, we end up reaching the containing template-template parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to this (nested) template-template parameter, as it should already be the struct that the outer template-template parameter is declared on.

The precise logic for what DECL_CONTEXT should be for a template template parameter in various situations seems rather obscure. Rather than trying to determine the assumptions that need to hold, it seems simpler to just always re-stream the DECL_CONTEXT as needed for now.

	PR c++/98881

gcc/cp/ChangeLog:

	* module.cc (trees_out::tpl_parms_fini): Stream out DECL_CONTEXT for template template parameters.
	(trees_in::tpl_parms_fini): Read it.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/tpl-tpl-parm-3.h: New test.
	* g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
	* g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
	* g++.dg/modules/tpl-tpl-parm-3_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07  bb-reorder: Fix -freorder-blocks-and-partition ICEs on aarch64 with asm goto [PR110079]  Jakub Jelinek  2  -1/+45

The following testcase ICEs, because fix_crossing_unconditional_branches thinks that asm goto is an unconditional jump and removes it, replacing it with unconditional jump to one of the labels. This doesn't happen on x86 because the function in question isn't invoked there at all:

    /* If the architecture does not have unconditional branches that
       can span all of memory, convert crossing unconditional branches
       into indirect jumps.  Since adding an indirect jump also adds
       a new register usage, update the register usage information as
       well.  */
    if (!HAS_LONG_UNCOND_BRANCH)
      fix_crossing_unconditional_branches ();

I think for the asm goto case, for the non-fallthru edge if any we should handle it like any other fallthru (and fix_crossing_unconditional_branches doesn't really deal with those, it only looks at explicit branches at the end of bbs and we are in cfglayout mode at that point) and for the labels we just pass the labels as immediates to the assembly and it is up to the user to figure out how to store them/branch to them or whatever they want to do. So, the following patch fixes this by not treating asm goto as a simple unconditional jump.

I really think that on the !HAS_LONG_UNCOND_BRANCH targets we have a bug somewhere else, where outofcfglayout or whatever should actually create those indirect jumps on the crossing edges instead of adding normal unconditional jumps, I see e.g. in

    __attribute__((cold)) int bar (char *);
    __attribute__((hot)) int baz (char *);
    void qux (int x)
    {
      if (__builtin_expect (!x, 1))
        goto l1;
      bar ("");
      goto l1;
    l1:
      baz ("");
    }
    void corge (int x)
    {
      if (__builtin_expect (!x, 0))
        goto l1;
      baz ("");
    l2:
      return;
    l1:
      bar ("");
      goto l2;
    }

with -O2 -freorder-blocks-and-partition on aarch64 before/after this patch just b .L? jumps which I believe are +-32MB, so if .text is larger than 32MB, it could fail to link, but this patch doesn't address that.

2024-03-07  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/110079
	* bb-reorder.cc (fix_crossing_unconditional_branches): Don't adjust asm goto.
	* gcc.dg/pr110079.c: New test.
2024-03-07  expand: Fix UB in choose_mult_variant [PR105533]  Jakub Jelinek  2  -5/+18

As documented in the function comment, choose_mult_variant attempts to compute costs of 3 different cases, val, -val and val - 1. The -val case is actually only done if val fits into host int, so there should be no overflow, but the val - 1 case is done unconditionally. val is shwi (but inside of synth_mult already uhwi), so when val is HOST_WIDE_INT_MIN, val - 1 invokes UB.

The following patch fixes that by using val - HOST_WIDE_INT_1U, but I'm not really convinced it would DTRT for > 64-bit modes, so I've guarded it as well. Though, arch would need to have really strange costs that something that could be expressed as x << 63 would be better expressed as (x * 0x7fffffffffffffff) + x.

In the long term, I think we should just rewrite choose_mult_variant/synth_mult etc. to work on wide_int.

2024-03-07  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/105533
	* expmed.cc (choose_mult_variant): Only try the val - 1 variant if val is not HOST_WIDE_INT_MIN or if mode has exactly HOST_BITS_PER_WIDE_INT precision. Avoid triggering UB while computing val - 1.
	* gcc.dg/pr105533.c: New test.
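The two parts of the fix, an unsigned subtraction plus a guard on the mode precision, can be sketched in standalone C. This is not the expmed.cc code: int64_t and uint64_t stand in for HOST_WIDE_INT and its unsigned counterpart, and minus_one_wrapping and can_try_val_minus_one are hypothetical helper names.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

/* Compute val - 1 without signed overflow.  Converting to unsigned first
   makes the subtraction well defined (arithmetic modulo 2^64), even when
   val is INT64_MIN, where the signed expression val - 1 would be UB.  */
static uint64_t minus_one_wrapping(int64_t val)
{
    return (uint64_t) val - UINT64_C(1);
}

/* Guard in the spirit of the ChangeLog entry: try the "val - 1" variant
   unless val is the minimum value and the mode precision differs from the
   host word width (assumed to be 64 bits here).  */
static int can_try_val_minus_one(int64_t val, unsigned mode_precision)
{
    return val != INT64_MIN || mode_precision == 64;
}
```

For val equal to INT64_MIN, minus_one_wrapping returns 0x7fffffffffffffff, which is the multiplier mentioned in the commit message, and the guard rejects the variant for any mode whose precision is not 64 bits.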