path: root/gcc
Age  Commit message  Author  Files  Lines
2021-01-14Daily bump.GCC Administrator4-1/+258
2021-01-14or1k: Fixup exception header data encodingsStafford Horne1-0/+4
While running glibc tests several *-textrel tests failed showing that relocations remained against read only sections. It turned out this was related to exception headers data encoding being wrong. By default pointer encoding will always use the DW_EH_PE_absptr format. This patch uses format DW_EH_PE_pcrel and DW_EH_PE_sdata4. Optionally DW_EH_PE_indirect is included for global symbols. This eliminates the relocations. gcc/ChangeLog: * config/or1k/or1k.h (ASM_PREFERRED_EH_DATA_FORMAT): New macro.
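The encoding choice described above can be sketched as a small stand-alone function. This is only an illustration, not the macro from or1k.h: the helper name `preferred_eh_data_format` and its `global` parameter are hypothetical, while the `DW_EH_PE_*` values are the standard DWARF exception-header pointer-encoding constants.

```cpp
#include <cassert>
#include <cstdint>

// Standard DWARF EH pointer-encoding constants.
constexpr std::uint8_t DW_EH_PE_absptr   = 0x00;  // absolute pointer (needs a relocation)
constexpr std::uint8_t DW_EH_PE_sdata4   = 0x0b;  // signed 4-byte value
constexpr std::uint8_t DW_EH_PE_pcrel    = 0x10;  // relative to the current location
constexpr std::uint8_t DW_EH_PE_indirect = 0x80;  // value is the address of the real pointer

// Hypothetical helper: PC-relative signed 4-byte encoding, with the
// indirect bit added only when the symbol is global.
constexpr std::uint8_t preferred_eh_data_format(bool global)
{
  return static_cast<std::uint8_t>((global ? DW_EH_PE_indirect : 0)
                                   | DW_EH_PE_pcrel | DW_EH_PE_sdata4);
}
```

Because the resulting value is PC-relative rather than `DW_EH_PE_absptr`, no absolute address has to be patched into the read-only section at load time.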
2021-01-14or1k: Add note to indicate execstackStafford Horne1-0/+2
Define TARGET_ASM_FILE_END as file_end_indicate_exec_stack to allow generation of the ".note.GNU-stack" section note. This allows binutils to properly set PT_GNU_STACK in the program header. This fixes a glibc execstack testsuite test failure found while working on the OpenRISC glibc port. gcc/ChangeLog: * config/or1k/linux.h (TARGET_ASM_FILE_END): Define macro.
2021-01-14or1k: Add builtin define to detect hard floatStafford Horne1-0/+2
This is used in libgcc and now glibc to detect when hardware floating point operations are supported by the target. gcc/ChangeLog: * config/or1k/or1k.h (TARGET_CPU_CPP_BUILTINS): Add builtin define for __or1k_hard_float__.
2021-01-14or1k: Implement profile hook calling _mcountStafford Horne1-2/+13
Define this so that it does not abort, a problem found while running tests in the glibc test suite. We implement this with a call to _mcount with no arguments. The required return addresses will be pulled from the stack. Passing the LR (r9) as an argument had problems as sometimes r9 is clobbered by the GOT logic in the prologue before the call to _mcount. gcc/ChangeLog: * config/or1k/or1k.h (NO_PROFILE_COUNTERS): Define as 1. (PROFILE_HOOK): Define to call _mcount. (FUNCTION_PROFILER): Change from abort to no-op.
2021-01-13c++: Failure to lookup using-decl name [PR98231]Marek Polacek4-0/+31
In r11-4690 we removed the call to finish_nonmember_using_decl in tsubst_expr/DECL_EXPR in the USING_DECL block. This was done not to perform name lookup twice for a non-dependent using-decl, which sounds sensible. However, finish_nonmember_using_decl also pushes the decl's bindings which we still have to do so that we can find the USING_DECL's name later. In this case, we've got a USING_DECL N::operator<< that we are tsubstituting. We already looked it up while parsing the template "foo", and lookup_using_decl stashed the OVERLOAD it found into USING_DECL_DECLS. Now we just have to update the IDENTIFIER_BINDING of the identifier for operator<< with the overload the name is bound to. I didn't want to export push_local_binding so I've introduced a new wrapper. gcc/cp/ChangeLog: PR c++/98231 * name-lookup.c (push_using_decl_bindings): New. * name-lookup.h (push_using_decl_bindings): Declare. * pt.c (tsubst_expr): Call push_using_decl_bindings. gcc/testsuite/ChangeLog: PR c++/98231 * g++.dg/lookup/using63.C: New test.
2021-01-13match.pd: Fold (~X | C) ^ D into (X | C) ^ (~D ^ C) if (~D ^ C) can be ↵Jakub Jelinek2-1/+32
simplified [PR96691] These simplifications are only simplifications if the (~D ^ C) or (D ^ C) expressions fold into gimple vals, but in that case they decrease number of operations by 1. 2021-01-13 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/96691 * match.pd ((~X | C) ^ D -> (X | C) ^ (~D ^ C), (~X & C) ^ D -> (X & C) ^ (D ^ C)): New simplifications if (~D ^ C) or (D ^ C) can be simplified. * gcc.dg/tree-ssa/pr96691.c: New test.
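Both rewrites are plain bitwise identities, so they can be checked exhaustively on a small type. The following is a minimal sketch, assuming 8-bit operands are representative (the identities are bitwise, so each bit position behaves independently); the function name `fold_identities_hold` is hypothetical and is not part of match.pd.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if both identities hold for every 8-bit X, C and D:
//   (~X | C) ^ D == (X | C) ^ (~D ^ C)
//   (~X & C) ^ D == (X & C) ^ (D ^ C)
bool fold_identities_hold()
{
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      for (unsigned d = 0; d < 256; ++d)
        {
          const std::uint8_t X = static_cast<std::uint8_t>(x);
          const std::uint8_t C = static_cast<std::uint8_t>(c);
          const std::uint8_t D = static_cast<std::uint8_t>(d);

          // Cast back to 8 bits because ~ promotes its operand to int.
          const std::uint8_t lhs_or  = static_cast<std::uint8_t>((~X | C) ^ D);
          const std::uint8_t rhs_or  = static_cast<std::uint8_t>((X | C) ^ (~D ^ C));
          const std::uint8_t lhs_and = static_cast<std::uint8_t>((~X & C) ^ D);
          const std::uint8_t rhs_and = static_cast<std::uint8_t>((X & C) ^ (D ^ C));

          if (lhs_or != rhs_or || lhs_and != rhs_and)
            return false;
        }
  return true;
}
```

When C and D are constants, `~D ^ C` (or `D ^ C`) folds to a single constant, which is where the saved operation comes from.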
2021-01-13tree-optimization/92645 - avoid harmful early BIT_FIELD_REF canonicalizationRichard Biener4-5/+31
This avoids canonicalizing BIT_FIELD_REF <T1> (a, <sz>, 0) to (T1)a on integer typed a. This confuses the vectorizer SLP matching. With this delayed to after vector lowering the testcase in PR92645 from Skia is now finally optimized to reasonable assembly. 2021-01-13 Richard Biener <rguenther@suse.de> PR tree-optimization/92645 * match.pd (BIT_FIELD_REF to conversion): Delay canonicalization until after vector lowering. * gcc.target/i386/pr92645-7.c: New testcase. * gcc.dg/tree-ssa/ssa-fre-54.c: Adjust. * gcc.dg/pr69047.c: Likewise.
2021-01-13c++: Fix cp_build_function_call_vec [PR 98626]Nathan Sidwell1-2/+2
I misunderstood the cp_build_function_call_vec API, thinking a NULL vector was an acceptable way of passing no arguments. You need to pass a vector of no elements. PR c++/98626 gcc/cp/ * module.cc (module_add_import_initializers): Pass a zero-element argument vector.
2021-01-13aarch64: Add support for unpacked SVE MLS and MSBRichard Sandiford7-44/+246
This patch extends the MLS/MSB patterns to support unpacked integer vectors. The type suffix could be either the element size or the container size, but using the element size should be more efficient. gcc/ * config/aarch64/aarch64-sve.md (fnma<mode>4): Extend from SVE_FULL_I to SVE_I. (@aarch64_pred_fnma<mode>, cond_fnma<mode>, *cond_fnma<mode>_2) (*cond_fnma<mode>_4, *cond_fnma<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/mls_2.c: New test. * g++.target/aarch64/sve/cond_mls_1.C: Likewise. * g++.target/aarch64/sve/cond_mls_2.C: Likewise. * g++.target/aarch64/sve/cond_mls_3.C: Likewise. * g++.target/aarch64/sve/cond_mls_4.C: Likewise. * g++.target/aarch64/sve/cond_mls_5.C: Likewise.
2021-01-13aarch64: Add support for unpacked SVE MLA and MADRichard Sandiford7-44/+246
This patch extends the MLA/MAD patterns to support unpacked integer vectors. The type suffix could be either the element size or the container size, but using the element size should be more efficient. gcc/ * config/aarch64/aarch64-sve.md (fma<mode>4): Extend from SVE_FULL_I to SVE_I. (@aarch64_pred_fma<mode>, cond_fma<mode>, *cond_fma<mode>_2) (*cond_fma<mode>_4, *cond_fma<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/mla_2.c: New test. * g++.target/aarch64/sve/cond_mla_1.C: Likewise. * g++.target/aarch64/sve/cond_mla_2.C: Likewise. * g++.target/aarch64/sve/cond_mla_3.C: Likewise. * g++.target/aarch64/sve/cond_mla_4.C: Likewise. * g++.target/aarch64/sve/cond_mla_5.C: Likewise.
2021-01-13tree-optimization/92645 - improve SLP with existing vectorsRichard Biener2-2/+63
This improves SLP discovery in the face of existing vectors allowing punning of the vector shape (or even punning from an integer type). For punning from integer types this does not yet handle lane zero extraction being represented as conversion rather than BIT_FIELD_REF. 2021-01-13 Richard Biener <rguenther@suse.de> PR tree-optimization/92645 * tree-vect-slp.c (vect_build_slp_tree_1): Relax supported BIT_FIELD_REF argument. (vect_build_slp_tree_2): Record the desired vector type on the external vector def. (vectorizable_slp_permutation): Handle required punning of existing vector defs. * gcc.target/i386/pr92645-6.c: New testcase.
2021-01-13aarch64: Tighten condition on sve/sel* testsRichard Sandiford3-3/+3
Noticed while testing on a different machine that the sve/sel_*.c tests require .variant_pcs support but don't test for it. .variant_pcs post-dates SVE so there shouldn't be a need to test for both. gcc/testsuite/ * gcc.target/aarch64/sve/sel_1.c: Require aarch64_variant_pcs. * gcc.target/aarch64/sve/sel_2.c: Likewise. * gcc.target/aarch64/sve/sel_3.c: Likewise.
2021-01-13rtl-ssa: Fix reversed comparisons in accesses.h commentRichard Sandiford1-4/+4
Noticed while looking at something else that the comment above def_lookup got the description of the comparisons the wrong way round. gcc/ * rtl-ssa/accesses.h (def_lookup): Fix order of comparison results.
2021-01-13sh: Remove match_scratch operand testRichard Sandiford1-2/+1
This patch fixes a regression on sh4 introduced by the rtl-ssa stuff. The port had a pattern: (define_insn "movsf_ie" [(set (match_operand:SF 0 "general_movdst_operand" "=f,r,f,f,fy, f,m, r, r,m,f,y,y,rf,r,y,<,y,y") (match_operand:SF 1 "general_movsrc_operand" " f,r,G,H,FQ,mf,f,FQ,mr,r,y,f,>,fr,y,r,y,>,y")) (use (reg:SI FPSCR_MODES_REG)) (clobber (match_scratch:SI 2 "=X,X,X,X,&z, X,X, X, X,X,X,X,X, y,X,X,X,X,X"))] "TARGET_SH2E && (arith_reg_operand (operands[0], SFmode) || fpul_operand (operands[0], SFmode) || arith_reg_operand (operands[1], SFmode) || fpul_operand (operands[1], SFmode) || arith_reg_operand (operands[2], SImode))" But recog can generate this pattern from something that matches: [(set (match_operand:SF 0 "general_movdst_operand") (match_operand:SF 1 "general_movsrc_operand")) (use (reg:SI FPSCR_MODES_REG))] with recog adding the (clobber (match_scratch:SI)) automatically. recog tests the C condition before adding the clobber, so there might not be an operands[2] to test. Similarly, gen_movsf_ie takes only two arguments, with operand 2 being filled in automatically. The only way to create this pattern with a REG operands[2] before RA would be to generate it directly from RTL. AFAICT the only things that do this are the secondary reload patterns, which are generated during RA and come with pre-vetted operands. arith_reg_operand rejects 6 specific registers: return (regno != T_REG && regno != PR_REG && regno != FPUL_REG && regno != FPSCR_REG && regno != MACH_REG && regno != MACL_REG); The fpul_operand tests allow FPUL_REG, leaving 5 invalid registers. However, in all alternatives of movsf_ie, either operand 0 or operand 1 is a register that belongs to r, f or y, none of which include any of the 5 rejected registers. This means that any post-RA pattern would satisfy the operands[0] or operands[1] condition without the operands[2] test being necessary. gcc/ * config/sh/sh.md (movsf_ie): Remove operands[2] test.
2021-01-13Hurd: Enable ifunc by defaultSamuel Thibault1-1/+3
The binutils bugs seem to have been fixed. gcc/ * config.gcc [$target == *-*-gnu*]: Enable 'default_gnu_indirect_function'.
2021-01-13i386, expand: Optimize also 256-bit and 512-bit permutations as vpmovzx if ↵Jakub Jelinek13-11/+388
possible [PR95905] The following patch implements what I've talked about, i.e. to no longer force operands of vec_perm_const into registers in the generic code, but let each of the (currently 8) targets force it into registers individually, giving the targets better control on if it does that and when and allowing them to do something special with some particular operands. And then defines the define_insn_and_split for the 256-bit and 512-bit permutations into vpmovzx* (only the bw, wd and dq cases, in theory we could add define_insn_and_split patterns also for the bd, bq and wq). 2021-01-13 Jakub Jelinek <jakub@redhat.com> PR target/95905 * optabs.c (expand_vec_perm_const): Don't force v0 and v1 into registers before calling targetm.vectorize.vec_perm_const, only after that. * config/i386/i386-expand.c (ix86_vectorize_vec_perm_const): Handle two argument permutation when one operand is zero vector and only after that force operands into registers. * config/i386/sse.md (*avx2_zero_extendv16qiv16hi2_1): New define_insn_and_split pattern. (*avx512bw_zero_extendv32qiv32hi2_1): Likewise. (*avx512f_zero_extendv16hiv16si2_1): Likewise. (*avx2_zero_extendv8hiv8si2_1): Likewise. (*avx512f_zero_extendv8siv8di2_1): Likewise. (*avx2_zero_extendv4siv4di2_1): Likewise. * config/mips/mips.c (mips_vectorize_vec_perm_const): Force operands into registers. * config/arm/arm.c (arm_vectorize_vec_perm_const): Likewise. * config/sparc/sparc.c (sparc_vectorize_vec_perm_const): Likewise. * config/ia64/ia64.c (ia64_vectorize_vec_perm_const): Likewise. * config/aarch64/aarch64.c (aarch64_vectorize_vec_perm_const): Likewise. * config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const): Likewise. * config/gcn/gcn.c (gcn_vectorize_vec_perm_const): Likewise. Use std::swap. * gcc.target/i386/pr95905-2.c: Use scan-assembler-times instead of scan-assembler. Add tests with zero vector as first __builtin_shuffle operand. * gcc.target/i386/pr95905-3.c: New test. * gcc.target/i386/pr95905-4.c: New test.
2021-01-13if-to-switch: fix also virtual phisMartin Liska2-7/+23
gcc/ChangeLog: PR tree-optimization/98455 * gimple-if-to-switch.cc (condition_info::record_phi_mapping): Record also virtual PHIs. (pass_if_to_switch::execute): Return TODO_cleanup_cfg only conditionally. gcc/testsuite/ChangeLog: PR tree-optimization/98455 * gcc.dg/tree-ssa/pr98455.c: New test.
2021-01-13doc: Fix typos in C++ Modules documentationJonathan Wakely1-2/+2
gcc/ChangeLog: * doc/invoke.texi (C++ Modules): Fix typos.
2021-01-13tree-optimization/98640 - fix bogus sign-extension with VNRichard Biener2-6/+31
VN tried to express a sign extension from int to long of a truncated quantity with a plain conversion but that loses the truncation. Since there's no single operand doing truncate plus sign extend (there was a proposed SEXT_EXPR to do that at some point mapping to RTL sign_extract) don't bother to appropriately model this with two ops (which the VN insert machinery doesn't handle and which is unlikely to CSE fully). 2021-01-13 Richard Biener <rguenther@suse.de> PR tree-optimization/98640 * tree-ssa-sccvn.c (visit_nary_op): Do not try to handle plus or minus from a truncated operand to be sign-extended. * gcc.dg/torture/pr98640.c: New testcase.
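Why a truncation followed by a sign extension cannot be replaced by a plain conversion can be seen in a few lines. This is a minimal sketch and not the PR testcase: the function name is hypothetical, and it assumes the usual modular behaviour for out-of-range conversions to a signed type (what GCC implements, and what C++20 guarantees).

```cpp
#include <cassert>
#include <cstdint>

// Truncate a 64-bit value to 32 bits, then sign-extend it back to 64 bits.
// Assumption: conversion to std::int32_t wraps modulo 2^32.
std::int64_t truncate_then_sign_extend(std::int64_t v)
{
  return static_cast<std::int64_t>(static_cast<std::int32_t>(v));
}
```

For a value such as 0x80000000 the result is negative, whereas the untouched 64-bit value is positive, so dropping the truncation changes the result.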
2021-01-13i386: Add define_insn_and_split patterns for btrl [PR96938]Jakub Jelinek2-0/+131
In the following testcase we only optimize f2 and f7 to btrl, although we should optimize that way all of the functions. The problem is the type demotion/narrowing (which is performed solely during the generic folding and not later), without it we see the AND performed in SImode and match it as btrl, but with it while the shifts are still performed in SImode, the AND is already done in QImode or HImode low part of the shift. 2021-01-13 Jakub Jelinek <jakub@redhat.com> PR target/96938 * config/i386/i386.md (*btr<mode>_1, *btr<mode>_2): New define_insn_and_split patterns. (splitter after *btr<mode>_2): New splitter. * gcc.target/i386/pr96938.c: New test.
2021-01-13ipa: remove a dead codeMartin Liska1-2/+0
gcc/ChangeLog: PR ipa/98652 * cgraphunit.c (analyze_functions): Remove dead code.
2021-01-13[PATCH v2] aarch64: Add cpu cost tables for A64FXQian Jianhua2-4/+171
This patch adds cost tables for A64FX. 2021-01-13 Qian jianhua <qianjh@cn.fujitsu.com> gcc/ * config/aarch64/aarch64-cost-tables.h (a64fx_extra_costs): New. * config/aarch64/aarch64.c (a64fx_addrcost_table): New. (a64fx_regmove_cost, a64fx_vector_cost): New. (a64fx_tunings): Use the new added cost tables.
2021-01-13i386: Optimize _mm_unpacklo_epi8 of 0 vector as second argument or similar ↵Jakub Jelinek4-0/+188
VEC_PERM_EXPRs into pmovzx [PR95905] The following patch adds patterns (so far 128-bit only) for permutations like { 0 16 1 17 2 18 3 19 4 20 5 21 6 22 7 23 } where the second operand is CONST0_RTX CONST_VECTOR to be emitted as pmovzx. 2021-01-13 Jakub Jelinek <jakub@redhat.com> PR target/95905 * config/i386/predicates.md (pmovzx_parallel): New predicate. * config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): New define_insn_and_split pattern. (*sse4_1_zero_extendv4hiv4si2_3): Likewise. (*sse4_1_zero_extendv2siv2di2_3): Likewise. * gcc.target/i386/pr95905-1.c: New test. * gcc.target/i386/pr95905-2.c: New test.
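The equivalence between this interleaving permutation and a zero extension can be sketched with ordinary arrays. The code below is an illustration only and uses no intrinsics: `V16`, `permute` and `unpacklo_with_zero` are hypothetical names, and the 16-bit lanes are assembled in little-endian byte order, as on x86.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16 = std::array<std::uint8_t, 16>;

// Two-operand byte permutation: selector values 0..15 pick from a,
// 16..31 pick from b.
V16 permute(const V16 &a, const V16 &b, const std::array<int, 16> &sel)
{
  V16 out{};
  for (std::size_t i = 0; i < 16; ++i)
    {
      const std::size_t s = static_cast<std::size_t>(sel[i]);
      out[i] = s < 16 ? a[s] : b[s - 16];
    }
  return out;
}

// Interleave the low 8 bytes of a with zero bytes, then read the result
// as eight little-endian 16-bit lanes.
std::array<std::uint16_t, 8> unpacklo_with_zero(const V16 &a)
{
  const V16 zero{};
  const std::array<int, 16> sel = {0, 16, 1, 17, 2, 18, 3, 19,
                                   4, 20, 5, 21, 6, 22, 7, 23};
  const V16 p = permute(a, zero, sel);

  std::array<std::uint16_t, 8> lanes{};
  for (std::size_t k = 0; k < 8; ++k)
    lanes[k] = static_cast<std::uint16_t>(p[2 * k] | (p[2 * k + 1] << 8));
  return lanes;
}
```

Each 16-bit lane ends up holding the corresponding source byte with a zero high byte, which is the result a byte-to-word zero extension produces.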
2021-01-12amdgcn: Remove dead code for fixed v0 registerJulian Brown1-4/+0
This patch removes code to fix the v0 register in gcn_conditional_register_usage that was missed out of the previous patch removing the need for that: https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534284.html 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_conditional_register_usage): Remove dead code to fix v0 register.
2021-01-12amdgcn: Fix exec register live-on-entry to BB in md-reorgJulian Brown1-1/+16
This patch fixes a corner case in the AMD GCN md-reorg pass when the EXEC register is live on entry to a BB, and could be clobbered by code inserted by the pass before a use in (e.g.) a different BB. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_md_reorg): Fix case where EXEC reg is live on entry to a BB.
2021-01-12amdgcn: Improve FP division accuracyJulian Brown3-20/+81
GCN has a reciprocal-approximation instruction but no hardware divide. This patch adjusts the open-coded reciprocal approximation/Newton-Raphson refinement steps to use fused multiply-add instructions as is necessary to obtain a properly-rounded result, and adds further refinement steps to correctly round the full division result. The patterns in question are still guarded by a flag_reciprocal_math condition, and do not yet support denormals. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn-valu.md (recip<mode>2<exec>, recip<mode>2): Use unspec for reciprocal-approximation instructions. (div<mode>3): Use fused multiply-accumulate operations for reciprocal refinement and division result. * config/gcn/gcn.md (UNSPEC_RCP): New unspec constant. gcc/testsuite/ * gcc.target/gcn/fpdiv.c: New test.
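The general shape of reciprocal refinement with fused multiply-add can be sketched in portable code. This is not the GCN pattern from gcn-valu.md: `approx_recip` is a hypothetical stand-in for a hardware reciprocal approximation (it cheats by dividing and then discarding mantissa bits), the number of refinement steps is an assumption, and no claim is made that the result is correctly rounded in every case.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal approximation:
// computes 1/b and clears the low 12 mantissa bits to lose precision.
float approx_recip(float b)
{
  float r = 1.0f / b;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof bits);
  bits &= ~std::uint32_t{0xfff};
  std::memcpy(&r, &bits, sizeof r);
  return r;
}

// Division via Newton-Raphson refinement of the reciprocal, using fused
// multiply-add so the error terms are computed with a single rounding.
float refined_div(float a, float b)
{
  float x = approx_recip(b);
  for (int i = 0; i < 2; ++i)
    {
      const float e = std::fma(-b, x, 1.0f);  // e = 1 - b*x
      x = std::fma(x, e, x);                  // x = x + x*e
    }
  const float q = a * x;                      // first quotient estimate
  const float r = std::fma(-b, q, a);         // residual a - b*q
  return std::fma(r, x, q);                   // corrected quotient
}
```

The final residual step is what adjusts the quotient itself rather than only the reciprocal.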
2021-01-12amdgcn: Fix subdf3 patternJulian Brown1-1/+1
This patch fixes a typo in the subdf3 pattern that meant it had a non-standard name and thus the compiler would emit a libcall rather than the proper hardware instruction for DFmode subtraction. 2021-01-13 Julian Brown <julian@codesourcery.com> gcc/ * config/gcn/gcn-valu.md (subdf): Rename to... (subdf3): This.
2021-01-13Daily bump.GCC Administrator6-1/+186
2021-01-12syscall: ensure openat uses variadic libc wrapperPaul E. Murphy1-1/+1
On powerpc64le, this caused a failure in TestUnshareUidGidMapping due to stack corruption which resulted in a bogus execve syscall. Use the existing C wrapper to ensure we respect the PPC ABI for variadic functions. Fixes PR go/98610 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/282717
2021-01-12Avoid a couple more ICEs in print_mem_ref (PR c/98597).Martin Sebor3-6/+79
Resolves: PR c/98597 - ICE in -Wuninitialized printing a MEM_REF PR c/98592 - ICE in gimple_canonical_types_compatible_p while formatting gcc/c-family/ChangeLog: PR c/98597 PR c/98592 * c-pretty-print.c (print_mem_ref): Avoid assuming MEM_REF operand has pointer type. Remove redundant code. Avoid calling gimple_canonical_types_compatible_p. gcc/testsuite/ChangeLog: PR c/98597 PR c/98592 * g++.dg/warn/Wuninitialized-13.C: New test. * gcc.dg/uninit-39.c: New test.
2021-01-12gcov: fix printf format for 32-bit hostsMartin Liska1-2/+2
gcc/ChangeLog: * gcov.c (source_info::debug): Fix printf format for 32-bit hosts.
2021-01-12Fix typo in function-abi.hAndrea Corallo1-1/+1
gcc/ChangeLog 2021-01-12 Andrea Corallo <andrea.corallo@arm.com> * function-abi.h: Fix typo.
2021-01-12arm: Add movmisalign patterns for MVE (PR target/97875)Christophe Lyon5-25/+89
This patch adds new movmisalign<mode>_mve_load and store patterns for MVE to help vectorization. They are very similar to their Neon counterparts, but use different iterators and instructions. Indeed MVE supports fewer vector modes than Neon, so we use the MVE_VLD_ST iterator where Neon uses VQX. Since the supported modes are different from the ones valid for arithmetic operators, we introduce two new sets of macros: ARM_HAVE_NEON_<MODE>_LDST true if Neon has vector load/store instructions for <MODE> ARM_HAVE_<MODE>_LDST true if any vector extension has vector load/store instructions for <MODE> We move the movmisalign<mode> expander from neon.md to vec-common.md, and replace the TARGET_NEON enabler with ARM_HAVE_<MODE>_LDST. The patch also updates the mve-vneg.c test to scan for the better code generation when loading and storing the vectors involved: it checks that no 'orr' instruction is generated to cope with misalignment at runtime. This test was chosen among the other mve tests, but any other should be OK. Using a plain vector copy loop (dest[i] = a[i]) is not a good test because the compiler chooses to use memcpy. For instance we now generate: test_vneg_s32x4: vldrw.32 q3, [r1] vneg.s32 q3, q3 vstrw.32 q3, [r0] bx lr instead of: test_vneg_s32x4: orr r3, r1, r0 lsls r3, r3, #28 bne .L15 vldrw.32 q3, [r1] vneg.s32 q3, q3 vstrw.32 q3, [r0] bx lr .L15: push {r4, r5} ldrd r2, r3, [r1, #8] ldrd r5, r4, [r1] rsbs r2, r2, #0 rsbs r5, r5, #0 rsbs r4, r4, #0 rsbs r3, r3, #0 strd r5, r4, [r0] pop {r4, r5} strd r2, r3, [r0, #8] bx lr 2021-01-12 Christophe Lyon <christophe.lyon@linaro.org> PR target/97875 gcc/ * config/arm/arm.h (ARM_HAVE_NEON_V8QI_LDST): New macro. (ARM_HAVE_NEON_V16QI_LDST, ARM_HAVE_NEON_V4HI_LDST): Likewise. (ARM_HAVE_NEON_V8HI_LDST, ARM_HAVE_NEON_V2SI_LDST): Likewise. (ARM_HAVE_NEON_V4SI_LDST, ARM_HAVE_NEON_V4HF_LDST): Likewise. (ARM_HAVE_NEON_V8HF_LDST, ARM_HAVE_NEON_V4BF_LDST): Likewise. (ARM_HAVE_NEON_V8BF_LDST, ARM_HAVE_NEON_V2SF_LDST): Likewise.
(ARM_HAVE_NEON_V4SF_LDST, ARM_HAVE_NEON_DI_LDST): Likewise. (ARM_HAVE_NEON_V2DI_LDST): Likewise. (ARM_HAVE_V8QI_LDST, ARM_HAVE_V16QI_LDST): Likewise. (ARM_HAVE_V4HI_LDST, ARM_HAVE_V8HI_LDST): Likewise. (ARM_HAVE_V2SI_LDST, ARM_HAVE_V4SI_LDST, ARM_HAVE_V4HF_LDST): Likewise. (ARM_HAVE_V8HF_LDST, ARM_HAVE_V4BF_LDST, ARM_HAVE_V8BF_LDST): Likewise. (ARM_HAVE_V2SF_LDST, ARM_HAVE_V4SF_LDST, ARM_HAVE_DI_LDST): Likewise. (ARM_HAVE_V2DI_LDST): Likewise. * config/arm/mve.md (*movmisalign<mode>_mve_store): New pattern. (*movmisalign<mode>_mve_load): New pattern. * config/arm/neon.md (movmisalign<mode>): Move to ... * config/arm/vec-common.md: ... here. PR target/97875 gcc/testsuite/ * gcc.target/arm/simd/mve-vneg.c: Update test.
2021-01-12[PR97969] LRA: Transform pattern `plus (plus (hard reg, const), pseudo)` ↵Vladimir N. Makarov2-1/+81
after elimination LRA can loop infinitely on targets without `reg + imm` insns. Register elimination on such targets can increase register pressure resulting in permanent stack size increase and changing elimination offset. To avoid such situation, a simple transformation can be done to avoid register pressure increase after generating reload insns containing eliminated hard regs. gcc/ChangeLog: PR target/97969 * lra-eliminations.c (eliminate_regs_in_insn): Add transformation of pattern 'plus (plus (hard reg, const), pseudo)'. gcc/testsuite/ChangeLog: PR target/97969 * gcc.target/arm/pr97969.c: New.
2021-01-12c++: Fix ICE with CTAD in concept [PR98611]Patrick Palka3-1/+33
This patch teaches cp_walk_subtrees to visit the template represented by a CTAD placeholder, which would otherwise be not visited during find_template_parameters. The template may be a template template parameter (as in the first testcase), or it may implicitly use the template parameters of an enclosing class template (as in the second testcase), and in either case we need to visit this tree to record the template parameters used therein for later satisfaction. gcc/cp/ChangeLog: PR c++/98611 * tree.c (cp_walk_subtrees) <case TEMPLATE_TYPE_PARM>: Visit the template of a CTAD placeholder. gcc/testsuite/ChangeLog: PR c++/98611 * g++.dg/cpp2a/concepts-ctad1.C: New test. * g++.dg/cpp2a/concepts-ctad2.C: New test.
2021-01-12tree-optimization/98550 - fix BB vect unrolling checkRichard Biener2-8/+128
This fixes the check that disqualifies BB vectorization because of required unrolling to match up with the later exact_div we do. To not disable the ability to split groups that do not match up exactly with a chosen vector type this also introduces a soft-fail mechanism to vect_build_slp_tree_1 which delays failing to after the matches[] array is populated from other checks and only then determines the split point according to the vector type. 2021-01-12 Richard Biener <rguenther@suse.de> PR tree-optimization/98550 * tree-vect-slp.c (vect_record_max_nunits): Check whether the group size is a multiple of the vector element count. (vect_build_slp_tree_1): When we need to fail because the vector type chosen causes unrolling do so lazily without affecting matches only at the end to guide group splitting. * g++.dg/opt/pr98550.C: New testcase.
2021-01-12options: properly compare string argumentsMartin Liska1-2/+4
Similarly to 7f967bd2a7ba156ede3fbb147e66dea5fb7137a6, we need to compare string with strcmp. gcc/ChangeLog: PR c++/97284 * optc-save-gen.awk: Compare also n_target_save vars with strcmp.
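The difference between comparing string pointers and comparing string contents can be sketched as follows. This is not the code that optc-save-gen.awk generates: both helper names are hypothetical, and the null-pointer handling is an assumption added to keep the example safe to call.

```cpp
#include <cassert>
#include <cstring>

// Pointer comparison: true only if both arguments refer to the same storage.
bool same_pointer(const char *a, const char *b)
{
  return a == b;
}

// Content comparison: true if both are null, or both are non-null and
// hold the same characters.
bool same_string(const char *a, const char *b)
{
  if (a == b)
    return true;
  if (!a || !b)
    return false;
  return std::strcmp(a, b) == 0;
}
```

Two option values with identical text but separate storage compare unequal by pointer and equal by `strcmp`, which is the case a pointer comparison gets wrong.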
2021-01-12gcov: add more debugging facilityMartin Liska2-1/+47
gcc/ChangeLog: * gcov.c (source_info::debug): New. (print_usage): Add --debug (-D) option. (process_args): Likewise. (generate_results): Call src->debug after accumulate_line_counts. (read_graph_file): Properly assign id for EXIT_BLOCK. * profile.c (branch_prob): Dump function body before it is instrumented.
2021-01-12widening_mul: Fix up ICE caused by my signed multiplication overflow pattern ↵Jakub Jelinek2-14/+32
recognition changes [PR98629] As the testcase shows, my latest changes caused ICE on that testcase. The problem is that arith_overflow_check_p now can change the use_stmt argument (has a reference), so that if it succeeds (returns non-zero), it points it to the GIMPLE_COND or EQ/NE or COND_EXPR assignment from the TRUNC_DIV_EXPR assignment. The problem was that it would change use_stmt also if it returned 0 in some cases, such as multiple imm uses of the division, and in one of the callers if arith_overflow_check_p returns 0 it looks at use_stmt again and performs other checks, which of course assumes that use_stmt is the one passed to arith_overflow_check_p and not e.g. NULL instead or some other unrelated stmt. The following patch fixes that by only changing use_stmt when we are about to return non-zero (for the MULT_EXPR case, which is the only one with the need to use different use_stmt). 2021-01-12 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/98629 * tree-ssa-math-opts.c (arith_overflow_check_p): Don't update use_stmt unless returning non-zero. * gcc.c-torture/compile/pr98629.c: New test.
2021-01-12reassoc: Optimize in reassoc x < 0 && y < 0 to (x | y) < 0 etc. [PR95731]Jakub Jelinek3-12/+112
We already had x != 0 && y != 0 to (x | y) != 0 and x != -1 && y != -1 to (x & y) != -1 and x < 32U && y < 32U to (x | y) < 32U, this patch adds signed x < 0 && y < 0 to (x | y) < 0. In that case, the low/high seem to be always the same and just in_p indicates whether it is >= 0 or < 0, also, all types in the same bucket (same precision) should be type compatible, but we can have some >= 0 and some < 0 comparison mixed, so the patch handles that by using the right BIT_IOR_EXPR or BIT_AND_EXPR and doing one set of < 0 or >= 0 first, then BIT_NOT_EXPR and then the other one. I had to move optimize_range_tests_var_bound before this optimization because that one deals with signed a >= 0 && a < b, and limited it to the last reassoc pass as reassoc itself can't virtually undo this optimization yet (and not sure if vrp would be able to). 2021-01-12 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/95731 * tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Also optimize x < 0 && y < 0 && z < 0 into (x | y | z) < 0 for signed x, y, z. (optimize_range_tests): Call optimize_range_tests_cmp_bitwise only after optimize_range_tests_var_bound. * gcc.dg/tree-ssa/pr95731.c: New test. * gcc.c-torture/execute/pr95731.c: New test.
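The sign-bit facts that make this kind of merge possible can be checked exhaustively on a small signed type. The sketch below is independent of tree-ssa-reassoc.c and the function name is hypothetical; it verifies which pairing of logical connective and bitwise operator holds as a plain Boolean identity, which matters when some comparisons are >= 0 and others are < 0.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the sign-bit identities hold for every pair of 8-bit
// signed values (two's complement assumed):
//   x >= 0 && y >= 0  <=>  (x | y) >= 0
//   x <  0 || y <  0  <=>  (x | y) <  0
//   x <  0 && y <  0  <=>  (x & y) <  0
//   x >= 0 || y >= 0  <=>  (x & y) >= 0
bool sign_bit_identities_hold()
{
  for (int xi = -128; xi <= 127; ++xi)
    for (int yi = -128; yi <= 127; ++yi)
      {
        const std::int8_t x = static_cast<std::int8_t>(xi);
        const std::int8_t y = static_cast<std::int8_t>(yi);
        const std::int8_t ior = static_cast<std::int8_t>(x | y);
        const std::int8_t band = static_cast<std::int8_t>(x & y);

        if ((x >= 0 && y >= 0) != (ior >= 0)) return false;
        if ((x < 0 || y < 0) != (ior < 0)) return false;
        if ((x < 0 && y < 0) != (band < 0)) return false;
        if ((x >= 0 || y >= 0) != (band >= 0)) return false;
      }
  return true;
}
```

In each case a chain of sign comparisons collapses to one bitwise operation followed by a single comparison against zero.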
2021-01-12configure, make: Fix up --enable-link-serializationJakub Jelinek2-2/+12
As reported by Matthias, --enable-link-serialization=1 can currently start two concurrent links first (e.g. gnat1 and cc1). The problem is that make var = value values seem to work differently between dependencies and actual rules (where it was tested). As the language make fragments can be in different order, we can have: ada.prev = ... magic that will become $(c.serial) under --enable-link-serialization=1 gnat1$(exe): ..... $(ada.prev) ... c.serial = cc1$(exe) and while if I add echo $(ada.prev) in the gnat1 rule's command, it prints cc1, the dependencies are actually evaluated during reading of the goal or when. The configure creates (and puts into Makefile) some serialization order of the languages and in that order c always comes first, and the rest is actually sorted the way the all_lang_makefrags are already sorted, so just by forcing c/Make-lang.in first we achieve that X.serial variable is always defined before some other Y.prev will use it in its goal dependencies. 2021-01-12 Jakub Jelinek <jakub@redhat.com> * configure.ac: Ensure c/Make-lang.in comes first in @all_lang_makefrags@. * configure: Regenerated.
2021-01-11c++: -Wmissing-field-initializers in unevaluated ctx [PR98620]Marek Polacek2-0/+46
This PR wants us not to warn about missing field initializers when the code in question takes place in decltype and similar. Fixed thus. gcc/cp/ChangeLog: PR c++/98620 * typeck2.c (process_init_constructor_record): Don't emit -Wmissing-field-initializers warnings in unevaluated contexts. gcc/testsuite/ChangeLog: PR c++/98620 * g++.dg/warn/Wmissing-field-initializers-2.C: New test.
2021-01-12Delete dead code in ix86_expand_sse_comi.liuhongt2-9/+0
d->flag is always 0 for builtins located in BDESC_FIRST (comi,COMI,...) ... BDESC_END (COMI, PCMPESTR) gcc/ChangeLog: PR target/98612 * config/i386/i386-builtins.h (BUILTIN_DESC_SWAP_OPERANDS): Deleted. * config/i386/i386-expand.c (ix86_expand_sse_comi): Delete dead code.
2021-01-11make FOR_EACH_IMM_USE_STMT safe for early exitsAlexandre Oliva16-67/+54
Use a dtor to automatically remove ITER from IMM_USE list in FOR_EACH_IMM_USE_STMT. for gcc/ChangeLog * ssa-iterators.h (end_imm_use_stmt_traverse): Forward declare. (auto_end_imm_use_stmt_traverse): New struct. (FOR_EACH_IMM_USE_STMT): Use it. (BREAK_FROM_IMM_USE_STMT, RETURN_FROM_IMM_USE_STMT): Remove, along with uses... * gimple-ssa-strength-reduction.c: ... here, ... * graphite-scop-detection.c: ... here, ... * ipa-modref.c, ipa-pure-const.c, ipa-sra.c: ... here, ... * tree-predcom.c, tree-ssa-ccp.c: ... here, ... * tree-ssa-dce.c, tree-ssa-dse.c: ... here, ... * tree-ssa-loop-ivopts.c, tree-ssa-math-opts.c: ... here, ... * tree-ssa-phiprop.c, tree-ssa.c: ... here, ... * tree-vect-slp.c: ... and here, ... * doc/tree-ssa.texi: ... and the example here.
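The idea of letting a destructor do the cleanup, so that `break` and `return` inside the loop body become safe, can be sketched outside GCC. All names below are hypothetical stand-ins: `traversal_state` only counts linked traversals, and `auto_end_traverse` merely illustrates the role the commit gives to `auto_end_imm_use_stmt_traverse`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the immediate-use list bookkeeping: counts
// how many traversals are currently linked in.
struct traversal_state
{
  int active = 0;
};

// Guard whose destructor unlinks the traversal however the scope is left.
struct auto_end_traverse
{
  traversal_state &state;

  explicit auto_end_traverse(traversal_state &s) : state(s) { ++state.active; }
  ~auto_end_traverse() { --state.active; }

  auto_end_traverse(const auto_end_traverse &) = delete;
  auto_end_traverse &operator=(const auto_end_traverse &) = delete;
};

// Returns the index of the first negative element, or -1 if there is none.
int find_first_negative(traversal_state &state, const std::vector<int> &v)
{
  auto_end_traverse guard(state);
  for (std::size_t i = 0; i < v.size(); ++i)
    if (v[i] < 0)
      return static_cast<int>(i);  // early exit: the guard's destructor still runs
  return -1;
}
```

With the cleanup tied to scope exit, dedicated break and return helper macros are no longer needed.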
2021-01-11analyzer: fix ICE merging dereferencing unknown ptrs [PR98628]David Malcolm2-2/+24
gcc/analyzer/ChangeLog: PR analyzer/98628 * store.cc (binding_cluster::make_unknown_relative_to): Don't mark dereferenced unknown pointers as having escaped. gcc/testsuite/ChangeLog: PR analyzer/98628 * gcc.dg/analyzer/pr98628.c: New test.
2021-01-12Daily bump.GCC Administrator5-1/+294
2021-01-11aarch64: Add support for unpacked SVE ASRDRichard Sandiford8-38/+290
This patch adds support for both conditional and unconditional unpacked ASRD. This meant adding a new define_insn for the unconditional form, instead of reusing the conditional instructions. It also meant extending the current conditional patterns to support merging with any independent value, not just zero. gcc/ * config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3): Extend from SVE_FULL_I to SVE_I. Generate an UNSPEC_PRED_X. (*sdiv_pow2<mode>3): New pattern. (@cond_<sve_int_op><mode>): Extend from SVE_FULL_I to SVE_I. Wrap the ASRD in an UNSPEC_PRED_X. (*cond_<sve_int_op><mode>_2): Likewise. Replace the UNSPEC_PRED_X predicate with a constant PTRUE, if it isn't already. (*cond_<sve_int_op><mode>_z): Replace with... (*cond_<sve_int_op><mode>_any): ...this new pattern. gcc/testsuite/ * gcc.target/aarch64/sve/asrdiv_4.c: New test. * gcc.target/aarch64/sve/cond_asrd_1.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_2.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_3.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_3_run.c: Likewise.
2021-01-11aarch64: Add support for unpacked SVE conditional BICRichard Sandiford5-13/+156
This patch adds support for unpacked conditional BIC. The type suffix could be taken from the element size or the container size, so the patch continues to use the element size. This is consistent with the existing support for unconditional BIC. gcc/ * config/aarch64/aarch64-sve.md (*cond_bic<mode>_2): Extend from SVE_FULL_I to SVE_I. (*cond_bic<mode>_any): Likewise. gcc/testsuite/ * g++.target/aarch64/sve/cond_bic_1.C: New test. * g++.target/aarch64/sve/cond_bic_2.C: Likewise. * g++.target/aarch64/sve/cond_bic_3.C: Likewise. * g++.target/aarch64/sve/cond_bic_4.C: Likewise.
2021-01-11aarch64: Add support for unpacked SVE MULHRichard Sandiford2-10/+44
This patch extends the SMULH and UMULH support to unpacked vectors. The type suffix must be taken from the element size rather than the container size. The main use of these patterns is to support division and modulus by a constant. The conditional forms would be hard to trigger from non-ACLE code, and ACLE code needs fully-packed vectors only. gcc/ * config/aarch64/aarch64-sve.md (<su>mul<mode>3_highpart) (@aarch64_pred_<MUL_HIGHPART:optab><mode>): Extend from SVE_FULL_I to SVE_I. gcc/testsuite/ * gcc.target/aarch64/sve/mul_highpart_3.c: New test.