Age | Commit message | Author | Files | Lines
2020-11-03c++: Disable -Winit-list-lifetime in unevaluated operand [PR97632]Marek Polacek2-1/+16
Jon suggested turning this warning off when we're not actually evaluating the operand. This patch does that. gcc/cp/ChangeLog: PR c++/97632 * init.c (build_new_1): Disable -Winit-list-lifetime for an unevaluated operand. gcc/testsuite/ChangeLog: PR c++/97632 * g++.dg/warn/Winit-list4.C: New test.
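As a rough illustration of what "unevaluated operand" means here, the sketch below puts a `new` of a `std::initializer_list` inside `decltype`, where no allocation ever happens. The alias name `list_ptr` is made up for this example, and this is not the content of the new test Winit-list4.C.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// The new-expression is only an operand of decltype, so it is never
// evaluated: no array is created and no lifetime problem can arise.
// `list_ptr` is a hypothetical name used only for this sketch.
using list_ptr = decltype(new std::initializer_list<int>{1, 2, 3});

// The expression still has a type that the compiler can inspect.
static_assert(std::is_same<list_ptr, std::initializer_list<int> *>::value,
              "new of an initializer_list yields a pointer to it");
```

The same new-expression in an evaluated context is the kind of code the warning is meant to flag.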
2020-11-03Cleanup of a merge mistake in fold-const.cBernd Edlinger1-5/+0
This removes a duplicated statement. It was apparently introduced due to a merge mistake. 2020-11-03 Bernd Edlinger <bernd.edlinger@hotmail.de> * fold-const.c (getbyterep): Remove duplicated statement.
2020-11-03Fix PR97205Bernd Edlinger4-17/+45
This makes sure that stack allocated SSA_NAMEs are at least MODE_ALIGNED. Also increase the MEM_ALIGN for the corresponding rtl objects. gcc: 2020-11-03 Bernd Edlinger <bernd.edlinger@hotmail.de> PR target/97205 * cfgexpand.c (align_local_variable): Make SSA_NAMEs at least MODE_ALIGNED. (expand_one_stack_var_at): Increase MEM_ALIGN for SSA_NAMEs. gcc/testsuite: 2020-11-03 Bernd Edlinger <bernd.edlinger@hotmail.de> PR target/97205 * gcc.c-torture/compile/pr97205.c: New test.
2020-11-03libcpp: unbreak bootstrapNathan Sidwell1-1/+1
This fixes the bootstrap breakage I caused. Sorry about that. libcpp/ * init.c (cpp_read_main_file): Use cpp_get_deps result.
2020-11-03AArch64: Add FLAG for AES/SHA/SM3/SM4 intrinsics [PR94442]zhengnannan1-27/+27
2020-11-03 Zhiheng Xie <xiezhiheng@huawei.com> Nannan Zheng <zhengnannan@huawei.com> gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def: Add proper FLAG for AES/SHA/SM3/SM4 intrinsics.
2020-11-03AArch64: Add FLAG for compare intrinsics [PR94442]zhengnannan1-9/+9
2020-11-03 Zhiheng Xie <xiezhiheng@huawei.com> Nannan Zheng <zhengnannan@huawei.com> gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def: Add proper FLAG for compare intrinsics.
2020-11-03Save some memory at debug stream-in timeRichard Biener1-0/+1
This allows us to release references to BLOCKs by not keeping them rooted in the external_die_map but instead removing them from there as soon as we have created the corresponding stub DIE. For decls it doesn't help since we still keep the decl_die_table. 2020-11-03 Richard Biener <rguenther@suse.de> * dwarf2out.c (maybe_create_die_with_external_ref): Remove hashtable entry.
2020-11-03arm: Add vstN_lane_bf16 + vstNq_lane_bf16 intrinsicsAndrea Corallo9-12/+133
gcc/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon.h (vst2_lane_bf16, vst2q_lane_bf16) (vst3_lane_bf16, vst3q_lane_bf16, vst4_lane_bf16) (vst4q_lane_bf16): New intrinsics. * config/arm/arm_neon_builtins.def: Touch it for: __builtin_neon_vst2_lanev4bf, __builtin_neon_vst2_lanev8bf, __builtin_neon_vst3_lanev4bf, __builtin_neon_vst3_lanev8bf, __builtin_neon_vst4_lanev4bf,__builtin_neon_vst4_lanev8bf. gcc/testsuite/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/advsimd-intrinsics/vst2_lane_bf16_indices_1.c: Run it also for arm-*-*. * gcc.target/aarch64/advsimd-intrinsics/vst2q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst3_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst3q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst4_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst4q_lane_bf16_indices_1.c: Likewise. * gcc.target/arm/simd/vstn_lane_bf16_1.c: New test.
2020-11-03arm: Add vldN_lane_bf16 + vldNq_lane_bf16 intrinsicsAndrea Corallo10-13/+148
gcc/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon.h (vld2_lane_bf16, vld2q_lane_bf16) (vld3_lane_bf16, vld3q_lane_bf16, vld4_lane_bf16) (vld4q_lane_bf16): Add intrinsics. * config/arm/arm_neon_builtins.def: Touch for: __builtin_neon_vld2_lanev4bf, __builtin_neon_vld2_lanev8bf, __builtin_neon_vld3_lanev4bf, __builtin_neon_vld3_lanev8bf, __builtin_neon_vld4_lanev4bf, __builtin_neon_vld4_lanev8bf. * config/arm/iterators.md (VQ_HS): Add V8BF to the iterator. gcc/testsuite/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/advsimd-intrinsics/vld2_lane_bf16_indices_1.c: Run it also for the arm backend. * gcc.target/aarch64/advsimd-intrinsics/vld2q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld3_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld3q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld4_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld4q_lane_bf16_indices_1.c: Likewise. * gcc.target/arm/simd/vldn_lane_bf16_1.c: New test.
2020-11-03arm: Add vst1_bf16 + vst1q_bf16 intrinsicsAndrea Corallo3-2/+46
gcc/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon.h (vst1_bf16, vst1q_bf16): Add intrinsics. * config/arm/arm_neon_builtins.def : Touch for: __builtin_neon_vst1v4bf, __builtin_neon_vst1v8bf. gcc/testsuite/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/arm/simd/vst1_bf16_1.c: New test.
2020-11-03arm: Add vld1_bf16 + vld1q_bf16 intrinsicsAndrea Corallo4-2/+49
gcc/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm-builtins.c (VAR14): Define macro. * config/arm/arm_neon_builtins.def: Touch for: __builtin_neon_vld1v4bf, __builtin_neon_vld1v8bf. * config/arm/arm_neon.h (vld1_bf16, vld1q_bf16): Add intrinsics. gcc/testsuite/ChangeLog 2020-10-29 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/arm/simd/vld1_bf16_1.c: New test.
2020-11-03arm: Add vst1_lane_bf16 + vst1q_lane_bf16 intrinsicsAndrea Corallo5-2/+67
gcc/ChangeLog 2020-10-23 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon.h (vst1_lane_bf16, vst1q_lane_bf16): Add intrinsics. * config/arm/arm_neon_builtins.def (STORE1LANE): Add v4bf, v8bf. gcc/testsuite/ChangeLog 2020-10-23 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/arm/simd/vst1_lane_bf16_1.c: New testcase. * gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise. * gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
2020-11-03arm: Add vld1_lane_bf16 + vld1q_lane_bf16 intrinsicsAndrea Corallo5-2/+71
gcc/ChangeLog 2020-10-21 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon_builtins.def: Add to LOAD1LANE v4bf, v8bf. * config/arm/arm_neon.h (vld1_lane_bf16, vld1q_lane_bf16): Add intrinsics. gcc/testsuite/ChangeLog 2020-10-21 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/arm/simd/vld1_lane_bf16_1.c: New testcase. * gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise. * gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
2020-11-03c++: cp_tree_equal cleanupsNathan Sidwell1-22/+26
A couple of small fixes. I noticed bind_template_template_parms was not marking the parm a template parm (this broke some module handling). Debugging CALL_EXPR comparisons led me to refactor cp_tree_equal's CALL_EXPR code (and my recent fix to debug printing of same). Finally TREE_VECS are best compared by comp_template_args. I recall that last piece being a left over from fixes during gcc-10. I've been using it on the modules branch since then. gcc/cp/ * tree.c (bind_template_template_parm): Mark the parm as a template parm. (cp_tree_equal): Refactor CALL_EXPR. Use comp_template_args for TREE_VECs.
2020-11-03c++: rtti cleanupsNathan Sidwell1-40/+48
Here are a few cleanups from the modules branch. Generally some RAII, and a bit of lazy namespace pushing. gcc/cp/ * rtti.c (init_rtti_processing): Move var decl to its init. (get_tinfo_decl): Likewise. Break out creation to called helper ... (get_tinfo_decl_direct): ... here. (build_dynamic_cast_1): Move var decls to their initializers. (tinfo_base_init): Set decl's location to BUILTINS_LOCATION. (get_tinfo_desc): Only push ABI namespace when needed. Set type's context.
2020-11-03libcpp: dependency emission tidyingNathan Sidwell6-24/+26
This patch cleans up the interface to the dependency generation a little. We now only check the option in one place, and the cpp_get_deps function returns nullptr if there are no dependencies. I also reworded the -MT and -MQ help text to be make agnostic -- as there are ideas about emitting, say, JSON. libcpp/ * include/mkdeps.h: Include cpplib.h (deps_write): Adjust first parm type. * mkdeps.c: Include internal.h (make_write): Adjust first parm type. Check phony option directly. (deps_write): Adjust first parm type. * init.c (cpp_read_main_file): Use get_deps. * directives.c (cpp_get_deps): Check option before initializing. gcc/c-family/ * c.opt (MQ,MT): Reword description to be make-agnostic. gcc/fortran/ * cpp.c (gfc_cpp_add_dep): Only add dependency if we're recording them. (gfc_cpp_init): Likewise for target.
2020-11-03aarch64: ACLE intrinsics convert BF16 to Float32Dennis Zhang7-0/+117
This patch enables intrinsics to convert BFloat16 scalar and vector operands to Float32 modes. The intrinsics are implemented by shifting each BFloat16 item 16 bits to the left using shl/shll/shll2 instructions. gcc/ChangeLog: 2020-11-03 Dennis Zhang <dennis.zhang@arm.com> * config/aarch64/aarch64-simd-builtins.def (vbfcvt): New entry. (vbfcvt_high, bfcvt): Likewise. * config/aarch64/aarch64-simd.md (aarch64_vbfcvt<mode>): New entry. (aarch64_vbfcvt_highv8bf, aarch64_bfcvtsf): Likewise. * config/aarch64/arm_bf16.h (vcvtah_f32_bf16): New intrinsic. * config/aarch64/arm_neon.h (vcvt_f32_bf16): Likewise. (vcvtq_low_f32_bf16, vcvtq_high_f32_bf16): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c (test_vcvt_f32_bf16, test_vcvtq_low_f32_bf16): New tests. (test_vcvtq_high_f32_bf16, test_vcvth_f32_bf16): Likewise.
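The 16-bit shift works because a BFloat16 value is the upper half of an IEEE-754 binary32 value. A portable scalar sketch of the same idea follows; `bf16_bits_to_float` is a hypothetical helper operating on raw bit patterns, not one of the ACLE intrinsics added by the patch, and it assumes `float` is binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Hypothetical helper: widen one BFloat16 value, given as its raw 16-bit
// pattern, to float by moving it into the top half of a 32-bit word.
inline float bf16_bits_to_float(std::uint16_t bits)
{
    std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
    float result;
    std::memcpy(&result, &widened, sizeof result);  // bit copy, avoids aliasing issues
    return result;
}
```

The low 16 bits of the result are always zero, so every BFloat16 value maps to an exactly representable float.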
2020-11-03bootstrap/97666 - fix array of bool allocationRichard Biener1-1/+1
This fixes the bad assumption that sizeof (bool) == 1 2020-11-03 Richard Biener <rguenther@suse.de> PR bootstrap/97666 * tree-vect-slp.c (vect_build_slp_tree_2): Scale allocation of skip_args by sizeof (bool).
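A minimal sketch of the pattern behind this fix: size a raw allocation for an array of bool by multiplying the element count by sizeof (bool) instead of assuming one byte per element. `alloc_flags` is a hypothetical helper for illustration, not the code in vect_build_slp_tree_2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocate `count` bools, all initially false.
// calloc takes the element size separately, so the byte count is
// count * sizeof(bool) whatever sizeof(bool) is on the target.
inline bool *alloc_flags(std::size_t count)
{
    return static_cast<bool *>(std::calloc(count, sizeof(bool)));
}
```

The caller releases the array with std::free.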
2020-11-03tree-optimization/80928 - SLP vectorize nested loop inductionRichard Biener3-65/+164
This adds SLP vectorization of nested inductions. 2020-11-03 Richard Biener <rguenther@suse.de> PR tree-optimization/80928 * tree-vect-loop.c (vectorizable_induction): SLP vectorize nested inductions. * gcc.dg/vect/vect-outer-slp-2.c: New testcase. * gcc.dg/vect/vect-outer-slp-3.c: Likewise.
2020-11-03testsuite: Fix gcc.target/i386/zero-scratch-regs-*.c scan-asm directivesUros Bizjak29-159/+163
Improve zero-scratch-regs-*.c scan-asm regexps and add target selectors for 32bit targets. 2020-11-03 Uroš Bizjak <ubizjak@gmail.com> gcc/testsuite/ChangeLog: * gcc.target/i386/zero-scratch-regs-1.c: Add ia32 target selector where appropriate. Improve scan-assembler regexp. * gcc.target/i386/zero-scratch-regs-2.c: Ditto. * gcc.target/i386/zero-scratch-regs-3.c: Ditto. * gcc.target/i386/zero-scratch-regs-4.c: Ditto. * gcc.target/i386/zero-scratch-regs-5.c: Ditto. * gcc.target/i386/zero-scratch-regs-6.c: Ditto. * gcc.target/i386/zero-scratch-regs-7.c: Ditto. * gcc.target/i386/zero-scratch-regs-8.c: Ditto. * gcc.target/i386/zero-scratch-regs-9.c: Ditto. * gcc.target/i386/zero-scratch-regs-10.c: Ditto. * gcc.target/i386/zero-scratch-regs-13.c: Ditto. * gcc.target/i386/zero-scratch-regs-14.c: Ditto. * gcc.target/i386/zero-scratch-regs-15.c: Ditto. * gcc.target/i386/zero-scratch-regs-16.c: Ditto. * gcc.target/i386/zero-scratch-regs-17.c: Ditto. * gcc.target/i386/zero-scratch-regs-18.c: Ditto. * gcc.target/i386/zero-scratch-regs-19.c: Ditto. * gcc.target/i386/zero-scratch-regs-20.c: Ditto. * gcc.target/i386/zero-scratch-regs-21.c: Ditto. * gcc.target/i386/zero-scratch-regs-22.c: Ditto. * gcc.target/i386/zero-scratch-regs-23.c: Ditto. * gcc.target/i386/zero-scratch-regs-24.c: Ditto. * gcc.target/i386/zero-scratch-regs-25.c: Ditto. * gcc.target/i386/zero-scratch-regs-26.c: Ditto. * gcc.target/i386/zero-scratch-regs-27.c: Ditto. * gcc.target/i386/zero-scratch-regs-28.c: Ditto. * gcc.target/i386/zero-scratch-regs-29.c: Ditto. * gcc.target/i386/zero-scratch-regs-30.c: Ditto. * gcc.target/i386/zero-scratch-regs-31.c: Ditto.
2020-11-03Add missing require-effective-target ltoOlivier Hainque1-0/+1
This prevents failure of an lto test in configurations missing LTO support, such as VxWorks for kernel mode. 2020-11-02 Olivier Hainque <hainque@adacore.com> gcc/testsuite/ * gcc.dg/tree-ssa/pr71077.c: Add dg-require-effective-target lto.
2020-11-03Add dg-require-effective-target fpic to gcc i386 testsOlivier Hainque14-0/+14
This change adds /* { dg-require-effective-target fpic } */ to tests in gcc.target/i386 that do use -fpic or -fPIC but don't currently query the target support. This corresponds to what many other fpic tests do and helps the VxWorks ports at least, as -fpic is typically not supported in at least one of the two major modes of such a port (kernel vs RTP). 2020-11-03 Olivier Hainque <hainque@adacore.com> gcc/testsuite/ * gcc.target/i386/pr45352-1.c: Add dg-require-effective-target fpic. * gcc.target/i386/pr47602.c: Likewise. * gcc.target/i386/pr55151.c: Likewise. * gcc.target/i386/pr55458.c: Likewise. * gcc.target/i386/pr56348.c: Likewise. * gcc.target/i386/pr57097.c: Likewise. * gcc.target/i386/pr65753.c: Likewise. * gcc.target/i386/pr65915.c: Likewise. * gcc.target/i386/pr66232-5.c: Likewise. * gcc.target/i386/pr66334.c: Likewise. * gcc.target/i386/pr66819-2.c: Likewise. * gcc.target/i386/pr67265.c: Likewise. * gcc.target/i386/pr81481.c: Likewise. * gcc.target/i386/pr83994.c: Likewise.
2020-11-03Avoid recursion in tree-inlineJan Hubicka2-0/+38
gcc/ChangeLog: 2020-11-03 Jan Hubicka <hubicka@ucw.cz> PR ipa/97578 * ipa-inline-transform.c (maybe_materialize_called_clones): New function. (inline_transform): Use it. gcc/testsuite/ChangeLog: 2020-11-03 Jan Hubicka <hubicka@ucw.cz> * gcc.c-torture/compile/pr97578.c: New test.
2020-11-03testsuite/97688 - fix check_vect () with __AVX2__Richard Biener1-1/+1
This fixes the cpuid check to always specify a subleaf zero which is required to detect AVX2 and doesn't hurt for level one. Without this fix we get zero runtime coverage when -mavx2 is specified. 2020-11-03 Richard Biener <rguenther@suse.de> PR testsuite/97688 * gcc.dg/vect/tree-vect.h (check_vect): Fix the x86 cpuid check to always specify subleaf zero.
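For context, AVX2 is reported in CPUID leaf 7, subleaf 0, EBX bit 5, which is why the subleaf has to be specified. The sketch below queries that bit with `__get_cpuid_count` from GCC's `<cpuid.h>`. `cpu_reports_avx2` is a hypothetical name, the code is not the check_vect implementation, and it deliberately skips the OS-support (XGETBV) check that a complete runtime test would also need.

```cpp
#include <cassert>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

// Hypothetical helper: true when CPUID leaf 7, subleaf 0 sets EBX bit 5
// (AVX2). Always false on non-x86 targets.
bool cpu_reports_avx2()
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // The second argument is the subleaf; leaf 7 needs it set to zero.
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;  // leaf 7 is not supported on this CPU
    return (ebx & (1u << 5)) != 0;
#else
    return false;
#endif
}
```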
2020-11-03tree-optimization/97678 - fix SLP induction epilogue vectorizationRichard Biener3-7/+79
This goes back to not tracking SLP nodes for induction initial values in non-nested contexts, because tracking them interferes with peeling and epilogue vectorization. 2020-11-03 Richard Biener <rguenther@suse.de> PR tree-optimization/97678 * tree-vect-slp.c (vect_build_slp_tree_2): Do not track the initial values of inductions when not nested. * tree-vect-loop.c (vectorizable_induction): Look at PHI node initial values again for SLP and not nested inductions. Handle LOOP_VINFO_MASK_SKIP_NITERS and cost invariants. * gcc.dg/vect/pr97678.c: New testcase.
2020-11-03Fortran: Add !GCC$ attributes DEPRECATEDTobias Burnus6-1/+56
gcc/fortran/ChangeLog: * decl.c (ext_attr_list): Add EXT_ATTR_DEPRECATED. * gfortran.h (ext_attr_id_t): Ditto. * gfortran.texi (GCC$ ATTRIBUTES): Document it. * resolve.c (resolve_variable, resolve_function, resolve_call, resolve_values): Show -Wdeprecated-declarations warning. * trans-decl.c (add_attributes_to_decl): Skip those with no middle_end_name. gcc/testsuite/ChangeLog: * gfortran.dg/attr_deprecated.f90: New test.
2020-11-03x86: Optimize aes<aeswideklvariant>u8 a bit, fix whitespaceUros Bizjak1-32/+35
2020-11-03 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/sse.md (aes<aeswideklvariant>u8): Do not use xmm_regs array. Fix whitespace.
2020-11-03x86: Fix comment in ix86_expand_builtinUros Bizjak1-4/+4
2020-11-03 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/i386-expand.c (ix86_expand_builtin): Fix comment.
2020-11-03[OpenACC] Enable inconsistent nested 'reduction' clauses checking for OpenACC 'kernels'Thomas Schwinge5-40/+1063
gcc/ * omp-low.c (scan_omp_for) <OpenACC>: Move earlier inconsistent nested 'reduction' clauses checking. gcc/testsuite/ * c-c++-common/goacc/nested-reductions-1-kernels.c: Extend. * c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise. * gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise.
2020-11-03[OpenACC] Split up testcases for inconsistent nested 'reduction' clauses checkingThomas Schwinge12-561/+589
gcc/testsuite/ * c-c++-common/goacc/nested-reductions.c: Split file into... * c-c++-common/goacc/nested-reductions-1-kernels.c: ... this... * c-c++-common/goacc/nested-reductions-1-parallel.c: ..., this... * c-c++-common/goacc/nested-reductions-1-routine.c: ..., and this. * c-c++-common/goacc/nested-reductions-warn.c: Split file into... * c-c++-common/goacc/nested-reductions-2-kernels.c: ... this... * c-c++-common/goacc/nested-reductions-2-parallel.c: ..., this... * c-c++-common/goacc/nested-reductions-2-routine.c: ..., and this. * gfortran.dg/goacc/nested-reductions.f90: Split file into... * gfortran.dg/goacc/nested-reductions-1-kernels.f90: ... this... * gfortran.dg/goacc/nested-reductions-1-parallel.f90: ..., this... * gfortran.dg/goacc/nested-reductions-1-routine.f90: ..., and this. * gfortran.dg/goacc/nested-reductions-warn.f90: Split file into... * gfortran.dg/goacc/nested-reductions-2-kernels.f90: ... this... * gfortran.dg/goacc/nested-reductions-2-parallel.f90: ..., this... * gfortran.dg/goacc/nested-reductions-2-routine.f90: ..., and this.
2020-11-03libstdc++: use lt_host_flags for libstdc++.laJonathan Yong2-2/+2
For platforms like MinGW and Cygwin, Cygwin refuses to generate the shared library unless -no-undefined is used. The attached patch makes sure the right flags are used, since libtool is already used to link libstdc++. libstdc++-v3/ChangeLog: * src/Makefile.am (libstdc___la_LINK): Add lt_host_flags. * src/Makefile.in: Regenerate.
2020-11-03[Fortran] More precise location information for OpenACC 'gang', 'worker', 'vector' clauses with argument [PR92793]Thomas Schwinge2-33/+36
gcc/fortran/ PR fortran/92793 * trans-openmp.c (gfc_trans_omp_clauses): More precise location information for OpenACC 'gang', 'worker', 'vector' clauses with argument. gcc/testsuite/ PR fortran/92793 * gfortran.dg/goacc/pr92793-1.f90: Adjust.
2020-11-03[OpenACC] More precise diagnostics for 'gang', 'worker', 'vector' clauses with arguments on 'loop' only allowed in 'kernels' regionsThomas Schwinge3-10/+108
Instead of at the location of the 'loop' directive, 'error_at' the location of the improper clause, and 'inform' at the location of the enclosing parent compute construct/routine. The Fortran testcases come with some XFAILing, to be resolved later. gcc/ * omp-low.c (scan_omp_for) <OpenACC>: More precise diagnostics for 'gang', 'worker', 'vector' clauses with arguments only allowed in 'kernels' regions. gcc/testsuite/ * c-c++-common/goacc/pr92793-1.c: Extend. * gfortran.dg/goacc/pr92793-1.f90: Likewise.
2020-11-02pass: Run cleanup passes before SLP [PR96789]Kewen Lin10-6/+136
As discussed in PR96789, some scalar stmts can be eliminated by passes that run after SLP, but we still model their costs when trying to SLP, which can affect the vectorizer's decision. One typical case is the one in PR96789 on target Power. As Richard suggested there, this patch introduces one pass called pre_slp_scalar_cleanup which has some secondary cleanup passes; for now they are FRE and DSE. It introduces one new TODO flags group called pending TODO flags. Unlike normal TODO flags, the pending TODO flags are passed down in the pipeline until one of their consumers can perform the requested action. Consumers should then clear the flags for the actions that they have taken. Some compilation time statistics on all SPEC2017 INT bmks were collected on one Power9 machine for the option sets below: A1: -Ofast -funroll-loops, A2: -O1, A3: -O1 -funroll-loops, A4: -O2, A5: -O2 -funroll-loops. The corresponding compile-time increase is negligible: A1 0.08%, A2 0.00%, A3 -0.38%, A4 -0.10%, A5 -0.05%. Bootstrapped/regtested on powerpc64le-linux-gnu P8. gcc/ChangeLog: PR tree-optimization/96789 * function.h (struct function): New member unsigned pending_TODOs. * passes.c (class pass_pre_slp_scalar_cleanup): New class. (make_pass_pre_slp_scalar_cleanup): New function. (pass_data_pre_slp_scalar_cleanup): New pass data. * passes.def: (pass_pre_slp_scalar_cleanup): New pass, add pass_fre and pass_dse as its children. * timevar.def (TV_SCALAR_CLEANUP): New timevar. * tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New pending TODO flag. (make_pass_pre_slp_scalar_cleanup): New declare. * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Once any outermost loop gets unrolled, flag cfun pending_TODOs PENDING_TODO_force_next_scalar_cleanup on. gcc/testsuite/ChangeLog: PR tree-optimization/96789 * gcc.dg/tree-ssa/ssa-dse-28.c: Adjust. * gcc.dg/tree-ssa/ssa-dse-29.c: Likewise. * gcc.dg/vect/bb-slp-41.c: Likewise. * gcc.dg/tree-ssa/pr96789.c: New test.
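The pending-TODO mechanism can be sketched as a per-function flag word that a producer sets and a later consumer tests and clears. Everything below is a simplified stand-in: `FunctionState`, `kPendingForceScalarCleanup`, and both functions are hypothetical names, not GCC's `struct function`, `PENDING_TODO_force_next_scalar_cleanup`, or pass classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the new pending TODO flag.
constexpr unsigned kPendingForceScalarCleanup = 1u << 1;

// Hypothetical stand-in for the per-function state carrying the flags.
struct FunctionState {
    unsigned pending_todos = 0;
};

// Producer: an earlier pass (complete unrolling in the patch) requests
// a scalar cleanup further down the pipeline.
inline void note_outermost_loop_unrolled(FunctionState &fn)
{
    fn.pending_todos |= kPendingForceScalarCleanup;
}

// Consumer: runs only when the request is pending and clears the flag
// for the action it has taken. Returns whether the cleanup ran.
inline bool run_scalar_cleanup_if_pending(FunctionState &fn)
{
    if ((fn.pending_todos & kPendingForceScalarCleanup) == 0)
        return false;
    fn.pending_todos &= ~kPendingForceScalarCleanup;
    // The child passes (FRE and DSE in the patch) would run here.
    return true;
}
```

Unlike a normal TODO, the request survives any number of intermediate passes and is dropped only once a consumer has acted on it.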
2020-11-03libgcc: Expose the instruction pointer and stack pointer in SEH _Unwind_BacktraceMartin Storsjö1-0/+5
Previously, the SEH version of _Unwind_Backtrace did unwind the stack and call the provided callback function as intended, but there was little the caller could do within the callback to actually get any info about that particular level in the unwind. Set the ra and cfa pointers, which are used by _Unwind_GetIP and _Unwind_GetCFA, to allow using these functions from the callback to inspect the state at each stack frame. 2020-09-08 Martin Storsjö <martin@martin.st> libgcc/ * unwind-seh.c (_Unwind_Backtrace): Set the ra and cfa pointers before calling the callback.
2020-11-03Daily bump.GCC Administrator3-1/+32
2020-11-03can_implement_as_sibling_call_p REG_PARM_STACK_SPACE checkAlan Modra4-17/+34
This moves an #ifdef block of code from calls.c to targetm.function_ok_for_sibcall. Only two targets, x86 and rs6000, define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros that might vary depending on the called function. Macros like UNITS_PER_WORD don't change over a function boundary, nor does the MIPS ABI, nor does TARGET_64BIT on PA-RISC. Other targets are even more trivially proven to not need the calls.c code. Besides cleaning up a small piece of #ifdef code, the motivation for this patch is to allow tail calls on PowerPC for functions that require less reg_parm_stack_space than their caller. The original code in calls.c only permitted tail calls when exactly equal, but on PowerPC we can tail call if the callee has less or equal REG_PARM_STACK_SPACE than the caller, as demonstrated by the testcase. So we should use /* If reg parm stack space increases, we cannot sibcall. */ if (REG_PARM_STACK_SPACE (decl ? decl : fntype) > INCOMING_REG_PARM_STACK_SPACE (current_function_decl)) and note the change to use INCOMING_REG_PARM_STACK_SPACE. REG_PARM_STACK_SPACE has always been wrong there for PowerPC. See https://gcc.gnu.org/pipermail/gcc-patches/2014-May/389867.html for why if you're curious. Not that it matters, because PowerPC can do without this check entirely, relying on a stack slot test in generic code. a) The generic code checks that arg passing stack in the callee is not greater than that in the caller, and, b) ELFv2 only allocates reg_parm_stack_space when some parameter is passed on the stack. Point (b) means that zero reg_parm_stack_space implies zero stack space, and non-zero reg_parm_stack_space implies non-zero stack space. So the case of 0 reg_parm_stack_space in the caller and 64 in the callee will be caught by (a). gcc/ PR middle-end/97267 * calls.h (maybe_complain_about_tail_call): Declare. * calls.c (maybe_complain_about_tail_call): Make global. (can_implement_as_sibling_call_p): Delete reg_parm_stack_space param. Adjust caller. 
Move REG_PARM_STACK_SPACE check to.. * config/i386/i386.c (ix86_function_ok_for_sibcall): ..here. gcc/testsuite/ PR middle-end/97267 * gcc.target/powerpc/pr97267.c: New test.
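The difference between the old and the relaxed rule can be modelled with two plain predicates over byte counts. Both function names are hypothetical and exist only for this sketch; the real check uses the REG_PARM_STACK_SPACE and INCOMING_REG_PARM_STACK_SPACE target macros quoted in the message.

```cpp
#include <cassert>

// Old rule from calls.c as described above: sibcall only when the
// callee's and caller's reg parm stack space are exactly equal.
constexpr bool old_rule_permits_sibcall(unsigned callee_space,
                                        unsigned caller_incoming_space)
{
    return callee_space == caller_incoming_space;
}

// Relaxed rule: sibcall unless the callee needs more reg parm stack
// space than the caller received.
constexpr bool relaxed_rule_permits_sibcall(unsigned callee_space,
                                            unsigned caller_incoming_space)
{
    return callee_space <= caller_incoming_space;
}
```

A callee needing 0 bytes called from a caller with 64 passes only the relaxed rule, while the 0-in-caller, 64-in-callee case from the message is rejected by both.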
2020-11-02Expand reg_equiv when scratches are removed.Vladimir N. Makarov1-9/+16
gcc/ChangeLog: * ira.c (ira_remove_scratches): Rename to remove_scratches. Make it static and returning flag of any change. (ira.c): Call ira_expand_reg_equiv in case of removing scratches.
2020-11-02x86: Also require MMX for __builtin_ia32_maskmovqH.J. Lu2-1/+17
MMX emulation with SSE is implemented at the MMX intrinsic level, not at the MMX instruction level. The _mm_maskmove_si64 intrinsic for "MASKMOVQ mm1, mm2" is emulated with __builtin_ia32_maskmovdqu. Since the SSE "MASKMOVQ mm1, mm2" builtin function, __builtin_ia32_maskmovq, can't be emulated with XMM registers, make __builtin_ia32_maskmovq also require MMX instead of SSE only. gcc/ PR target/97140 * config/i386/i386-expand.c (ix86_expand_builtin): Require MMX for __builtin_ia32_maskmovq. gcc/testsuite/ PR target/97140 * gcc.target/i386/pr97140.c: New test.
2020-11-02Daily bump.GCC Administrator13-1/+967
2020-11-02Correct -Wstringop-overflow and -Wstringop-overread.Martin Sebor1-19/+19
gcc/ChangeLog: * doc/invoke.texi (-Wstringop-overflow): Correct default setting. (-Wstringop-overread): Move past -Wstringop-overflow.
2020-11-02gcc: quote characters in texi sourceFrançois-Xavier Coudert1-1/+1
gcc/ChangeLog: PR bootstrap/57076 * Makefile.in (gcc-vers.texi): Quote @, { and }.
2020-11-02libstdc++: Add c++2a <syncstream>Thomas Rodgers15-1/+939
libstdc++-v3/ChangeLog: * doc/doxygen/user.cfg.in (INPUT): Add new header. * include/Makefile.am (std_headers): Add new header. * include/Makefile.in: Regenerate. * include/precompiled/stdc++.h: Include new header. * include/std/syncstream: New header. * include/std/version: Add __cpp_lib_syncbuf. * testsuite/27_io/basic_syncbuf/1.cc: New test. * testsuite/27_io/basic_syncbuf/2.cc: Likewise. * testsuite/27_io/basic_syncbuf/basic_ops/1.cc: Likewise. * testsuite/27_io/basic_syncbuf/requirements/types.cc: Likewise. * testsuite/27_io/basic_syncbuf/sync_ops/1.cc: Likewise. * testsuite/27_io/basic_syncstream/1.cc: Likewise. * testsuite/27_io/basic_syncstream/2.cc: Likewise. * testsuite/27_io/basic_syncstream/basic_ops/1.cc: Likewise. * testsuite/27_io/basic_syncstream/requirements/types.cc: Likewise.
2020-11-02c++: Fixup some vardecls and whitespaceNathan Sidwell1-13/+10
Move some var decls to their initializers. Correct some whitespace. gcc/cp/ * decl.c (start_decl_1): Refactor declarations. Fixup some whitespace. (lookup_and_check_tag): Fixup some whitespace.
2020-11-02c++: refactor duplicate declsNathan Sidwell1-23/+29
A couple of paths in duplicate decls dealing with templates and builtins were overly complicated. Fixing thusly. gcc/cp/ * decl.c (duplicate_decls): Refactor some template & builtin handling.
2020-11-02c++: Delete unused hash typeNathan Sidwell2-29/+0
Since I redid block-scope extern decls, the need for a uid->decl hasher has gone away. Deleting thusly. gcc/cp/ * cp-tree.h (struct cxx_int_tree_map): Delete. (struct cxx_int_tree_map_hasher): Delete. * cp-gimplify.c (cxx_int_tree_map_hasher::equal): Delete. (cxx_int_tree_map_hasher::hash): Delete.
2020-11-02c++: Don't purge the satisfaction cachesPatrick Palka6-30/+4
The adoption of P2104 ("Disallow changing concept values") means we can memoize the result of satisfaction indefinitely and no longer have to clear the satisfaction caches on various events that would affect satisfaction. To that end, this patch removes the invalidation routine clear_satisfaction_cache and adjusts its callers appropriately. This provides a large reduction in compile time and memory use in some cases. For example, on the libstdc++ test std/ranges/adaptor/join.cc, compile time and memory usage drops nearly 75%, from 7.5s/770MB to 2s/230MB, with a --enable-checking=release compiler. gcc/cp/ChangeLog: * class.c (finish_struct_1): Don't call clear_satisfaction_cache. * constexpr.c (clear_cv_and_fold_caches): Likewise. Remove bool parameter. * constraint.cc (clear_satisfaction_cache): Remove definition. * cp-tree.h (clear_satisfaction_cache): Remove declaration. (clear_cv_and_fold_caches): Remove bool parameter. * typeck2.c (store_init_value): Remove argument to clear_cv_and_fold_caches. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-complete1.C: Delete test that became ill-formed after P2104.
2020-11-02Add bcd builtins listed in appendix B of the ABICarl Love9-36/+823
2020-10-29 Carl Love <cel@us.ibm.com> gcc/ PR target/93449 * config/rs6000/altivec.h (__builtin_bcdadd, __builtin_bcdadd_lt, __builtin_bcdadd_eq, __builtin_bcdadd_gt, __builtin_bcdadd_ofl, __builtin_bcdadd_ov, __builtin_bcdsub, __builtin_bcdsub_lt, __builtin_bcdsub_eq, __builtin_bcdsub_gt, __builtin_bcdsub_ofl, __builtin_bcdsub_ov, __builtin_bcdinvalid, __builtin_bcdmul10, __builtin_bcddiv10, __builtin_bcd2dfp, __builtin_bcdcmpeq, __builtin_bcdcmpgt, __builtin_bcdcmplt, __builtin_bcdcmpge, __builtin_bcdcmple): Add defines. * config/rs6000/altivec.md: Add UNSPEC_BCDSHIFT. (BCD_TEST): Add le, ge to code iterator. Add VBCD mode iterator. (bcd<bcd_add_sub>_test, *bcd<bcd_add_sub>_test2, bcd<bcd_add_sub>_<code>, bcd<bcd_add_sub>_<code>): Add mode to name. Change iterator from V1TI to VBCD. (*bcdinvalid_<mode>, bcdshift_v16qi): New define_insn. (bcdinvalid_<mode>, bcdmul10_v16qi, bcddiv10_v16qi): New define. * config/rs6000/dfp.md (dfp_denbcd_v16qi_inst): New define_insn. (dfp_denbcd_v16qi): New define_expand. * config/rs6000/rs6000-builtin.def (BU_P8V_MISC_1): New define. (BCDADD): Replaced with BCDADD_V1TI and BCDADD_V16QI. (BCDADD_LT): Replaced with BCDADD_LT_V1TI and BCDADD_LT_V16QI. (BCDADD_EQ): Replaced with BCDADD_EQ_V1TI and BCDADD_EQ_V16QI. (BCDADD_GT): Replaced with BCDADD_GT_V1TI and BCDADD_GT_V16QI. (BCDADD_OV): Replaced with BCDADD_OV_V1TI and BCDADD_OV_V16QI. (BCDSUB_V1TI, BCDSUB_V16QI, BCDSUB_LT_V1TI, BCDSUB_LT_V16QI, BCDSUB_LE_V1TI, BCDSUB_LE_V16QI, BCDSUB_EQ_V1TI, BCDSUB_EQ_V16QI, BCDSUB_GT_V1TI, BCDSUB_GT_V16QI, BCDSUB_GE_V1TI, BCDSUB_GE_V16QI, BCDSUB_OV_V1TI, BCDSUB_OV_V16QI, BCDINVALID_V1TI, BCDINVALID_V16QI, BCDMUL10_V16QI, BCDDIV10_V16QI, DENBCD_V16QI): New builtin definitions. (BCDADD, BCDADD_LT, BCDADD_EQ, BCDADD_GT, BCDADD_OV, BCDSUB, BCDSUB_LT, BCDSUB_LE, BCDSUB_EQ, BCDSUB_GT, BCDSUB_GE, BCDSUB_OV, BCDINVALID, BCDMUL10, BCDDIV10, DENBCD): New overload definitions. 
* config/rs6000/rs6000-call.c (P8V_BUILTIN_VEC_BCDADD, P8V_BUILTIN_VEC_BCDADD_LT, P8V_BUILTIN_VEC_BCDADD_EQ, P8V_BUILTIN_VEC_BCDADD_GT, P8V_BUILTIN_VEC_BCDADD_OV, P8V_BUILTIN_VEC_BCDINVALID, P9V_BUILTIN_VEC_BCDMUL10, P8V_BUILTIN_VEC_DENBCD. P8V_BUILTIN_VEC_BCDSUB, P8V_BUILTIN_VEC_BCDSUB_LT, P8V_BUILTIN_VEC_BCDSUB_LE, P8V_BUILTIN_VEC_BCDSUB_EQ, P8V_BUILTIN_VEC_BCDSUB_GT, P8V_BUILTIN_VEC_BCDSUB_GE, P8V_BUILTIN_VEC_BCDSUB_OV): New overloaded specifications. (CODE_FOR_bcdadd): Replaced with CODE_FOR_bcdadd_v16qi and CODE_FOR_bcdadd_v1ti. (CODE_FOR_bcdadd_lt): Replaced with CODE_FOR_bcdadd_lt_v16qi and CODE_FOR_bcdadd_lt_v1ti. (CODE_FOR_bcdadd_eq): Replaced with CODE_FOR_bcdadd_eq_v16qi and CODE_FOR_bcdadd_eq_v1ti. (CODE_FOR_bcdadd_gt): Replaced with CODE_FOR_bcdadd_gt_v16qi and CODE_FOR_bcdadd_gt_v1ti. (CODE_FOR_bcdsub): Replaced with CODE_FOR_bcdsub_v16qi and CODE_FOR_bcdsub_v1ti. (CODE_FOR_bcdsub_lt): Replaced with CODE_FOR_bcdsub_lt_v16qi and CODE_FOR_bcdsub_lt_v1ti. (CODE_FOR_bcdsub_eq): Replaced with CODE_FOR_bcdsub_eq_v16qi and CODE_FOR_bcdsub_eq_v1ti. (CODE_FOR_bcdsub_gt): Replaced with CODE_FOR_bcdsub_gt_v16qi and CODE_FOR_bcdsub_gt_v1ti. (rs6000_expand_ternop_builtin): Add CODE_FOR_dfp_denbcd_v16qi to else if. * doc/extend.texi: Add documentation for new builtins. gcc/testsuite/ * gcc.target/powerpc/bcd-2.c: Add include altivec.h. * gcc.target/powerpc/bcd-3.c: Add include altivec.h. * gcc.target/powerpc/bcd-4.c: New test.
2020-11-02c++: Some additional testsNathan Sidwell3-1/+33
I created a few tests on the modules branch that are not actually module-related. Here they are. gcc/testsuite/ * g++.dg/concepts/pack-1.C: New. * g++.dg/lookup/using53.C: Add an enum. * g++.dg/template/error25.C: Relax 'export' error check.
2020-11-02options: Tiny refactorNathan Sidwell1-3/+1
This changes more on the modules branch, but let's move the declaration to the initializer now. gcc/c-family/ * c-opts.c (c_common_post_options): Move var decl to its initialization point.