Age | Commit message | Author | Files | Lines
2021-02-01 | c++: alias in qualified-id in template arg [PR98570] | Jason Merrill | 6 | -37/+53
template_args_equal has handled dependent alias specializations for a while, but in this testcase the actual template argument is a SCOPE_REF, so we called cp_tree_equal, which doesn't handle aliases specially when we get to them. This patch generalizes this by setting a flag so structural_comptypes will check for template alias equivalence (if we aren't doing partial ordering). The existing flag, comparing_specializations, was too broad; in particular, when we're doing decls_match, we want to treat corresponding parameters as equivalent, so we need to separate that from alias comparison. So I introduce the comparing_dependent_aliases flag. From looking at other uses of comparing_specializations, it seems to me that the new flag is what modules wants, as well. The other use of comparing_specializations in structural_comptypes is a hack to deal with spec_hasher::equal not calling push_to_top_level, which we also don't want to tie to the alias comparison semantics. This patch also changes how we get to structural comparison of aliases from checking TYPE_CANONICAL in comptypes to marking the aliases as getting structural comparison when they are built, which is more consistent with how e.g. typename is handled. As I mention in the comment for comparing_dependent_aliases, I think the default should be to treat different dependent aliases for the same type as distinct, only treating them as equal during deduction (particularly partial ordering). But that's a matter for the C++ committee, to try in stage 1. gcc/cp/ChangeLog: PR c++/98570 * cp-tree.h: Declare it. * pt.c (comparing_dependent_aliases): New flag. (template_args_equal, spec_hasher::equal): Set it. (dependent_alias_template_spec_p): Assert that we don't get non-types other than error_mark_node. (instantiate_alias_template): SET_TYPE_STRUCTURAL_EQUALITY on complex alias specializations. Set TYPE_DEPENDENT_P here. (tsubst_decl): Not here. 
* module.cc (module_state::read_cluster): Set comparing_dependent_aliases instead of comparing_specializations. * tree.c (cp_tree_equal): Remove comparing_specializations module handling. * typeck.c (structural_comptypes): Adjust. (comptypes): Remove comparing_specializations handling. gcc/testsuite/ChangeLog: PR c++/98570 * g++.dg/cpp0x/alias-decl-targ1.C: New test.
2021-02-01 | testsuite: aarch64: Add tests for vmlXl_high intrinsics | Jonathan Wright | 12 | -0/+445
Add tests for vmlal_high_* and vmlsl_high_* Neon intrinsics. Since these intrinsics are only supported for AArch64, these tests are restricted to only run on AArch64 targets. gcc/testsuite/ChangeLog: 2021-01-31 Jonathan Wright <jonathan.wright@arm.com> * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc: New test template. * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_lane.inc: New test template. * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_laneq.inc: New test template. * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_n.inc: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlal_high.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_lane.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_laneq.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_n.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_lane.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_laneq.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_n.c: New test.
2021-02-01 | testsuite: aarch64: Add tests for vmull_high intrinsics | Jonathan Wright | 4 | -0/+277
Add tests for vmull_high_* Neon intrinsics. Since these intrinsics are only supported for AArch64, these tests are restricted to only run on AArch64 targets. gcc/testsuite/ChangeLog: 2021-01-29 Jonathan Wright <jonathan.wright@arm.com> * gcc.target/aarch64/advsimd-intrinsics/vmull_high.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmull_high_laneq.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vmull_high_n.c: New test.
2021-02-01 | AArch64: Change canonization of smlal and smlsl in order to be able to optimize the vec_dup | Tamar Christina | 2 | -9/+54
g:87301e3956d44ad45e384a8eb16c79029d20213a and g:ee4c4fe289e768d3c6b6651c8bfa3fdf458934f4 changed the intrinsics to be proper RTL but accidentally ended up creating a regression because of the ordering in the RTL pattern. The existing RTL that combine should try to match to remove the vec_dup is aarch64_vec_<su>mlal_lane<Qlane> and aarch64_vec_<su>mult_lane<Qlane> which expects the select register to be the second operand of mult. The pattern introduced has it as the first operand so combine was unable to remove the vec_dup. This flips the order such that the patterns optimize correctly. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_<su>mlal_n<mode>, aarch64_<su>mlsl<mode>, aarch64_<su>mlsl_n<mode>): Flip mult operands. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c: New test.
2021-02-01 | c++: Add testcase for PR84494 | Patrick Palka | 1 | -0/+11
We correctly accept this testcase ever since r10-5143. gcc/testsuite/ChangeLog: PR c++/84494 * g++.dg/cpp1y/constexpr-84494.C: New test.
2021-02-01 | RISC-V: Fix gcc.target/riscv/attribute-18.c | Xing GUO | 1 | -1/+1
gcc/testsuite/ChangeLog: * gcc.target/riscv/attribute-18.c: Add -mriscv-attribute option.
2021-02-01 | rtl-optimization/98863 - prune RD with LIVE in STV | Richard Biener | 1 | -1/+1
This sets DF_RD_PRUNE_DEAD_DEFS like all other uses of the UD/DU chain problems which makes the RD problem consume a lot less memory. 2021-02-01 Richard Biener <rguenther@suse.de> PR rtl-optimization/98863 * config/i386/i386-features.c (convert_scalars_to_vector): Set DF_RD_PRUNE_DEAD_DEFS.
2021-01-31 | testsuite: Update pr79251 ilp32 store regex | Xionghu Luo | 2 | -2/+2
BE ilp32 Linux generates extra stack stwu instructions, which shouldn't be counted. \m … \M is needed around each instruction, not just at the beginning and end of the entire pattern. gcc/testsuite/ChangeLog: 2021-02-01 Xionghu Luo <luoxhu@linux.ibm.com> * gcc.target/powerpc/pr79251.p8.c: Update store count regex. * gcc.target/powerpc/pr79251.p9.c: Likewise.
2021-02-01 | Daily bump. | GCC Administrator | 3 | -1/+13
2021-01-31 | Add missing definition of SIZE_MAX | Eric Botcazou | 1 | -0/+4
If the stdint.h system file follows the ISO C99 specification, it might not define SIZE_MAX in C++ by default, so provide a local fallback. gcc/ * system.h (SIZE_MAX): Define if not already defined.
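A fallback of the kind described above might look like the following sketch. The exact definition added to system.h is not shown in this log, so the macro body and the `mul_overflows` helper are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed fallback (not the actual system.h text): define SIZE_MAX only if
// the headers above did not already provide it.
#ifndef SIZE_MAX
#define SIZE_MAX (static_cast<std::size_t>(-1))
#endif

// Hypothetical helper showing a typical use: would a * b overflow size_t?
inline bool mul_overflows(std::size_t a, std::size_t b) {
    return a != 0 && b > SIZE_MAX / a;
}

// Small usage example for the helper above.
inline void size_max_example() {
    assert(!mul_overflows(0, SIZE_MAX));
    assert(!mul_overflows(2, 3));
    assert(mul_overflows(SIZE_MAX, 2));
}
```

Guarding the definition with `#ifndef` keeps the fallback inert on systems whose headers already define the macro.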
2021-01-31 | testsuite, Darwin : Skip ELF-specific tests. | Iain Sandoe | 5 | -0/+5
A number of ELF-specific tests were introduced in r11-6140, one of which fails on all Mach-O/Darwin platforms. On examination, the tests have no meaningful parallel for Mach-O which dead strips at the symbol level, and does not make use of function sections (the fact that a used and an unused symbol are placed in the same section will not affect dead stripping). Given that the tests do not demonstrate anything useful on Darwin, skip them. gcc/testsuite/ChangeLog: * c-c++-common/attr-used-5.c: Skip for Darwin. * c-c++-common/attr-used-6.c: Likewise. * c-c++-common/attr-used-7.c: Likewise. * c-c++-common/attr-used-8.c: Likewise. * c-c++-common/attr-used-9.c: Likewise.
2021-01-31 | Daily bump. | GCC Administrator | 4 | -1/+54
2021-01-30 | testsuite: Update pr79251 ilp32 store counts. | David Edelsohn | 2 | -2/+2
With the recent changes to vector insert optimization, the number of expected stores for the two testcases has changed. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr79251.p8.c: Update ilp32 store counts. * gcc.target/powerpc/pr79251.p9.c: Same.
2021-01-30 | Fusion patterns for logical-logical | Aaron Sawdey | 5 | -1/+2279
This patch adds a new function to genfusion.pl to generate patterns for logical-logical fusion. They are enabled by default for power10 and can be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion. gcc/ChangeLog * config/rs6000/genfusion.pl (gen_2logical): New function to generate patterns for logical-logical fusion. * config/rs6000/fusion.md: Regenerated patterns. * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION_2LOGICAL. * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable logical-logical fusion for p10. * config/rs6000/rs6000.opt: Add -mpower10-fusion-2logical.
2021-01-30 | aix: add periods to option explanation. | David Edelsohn | 1 | -2/+2
gcc/ChangeLog: * config/rs6000/rs6000.opt: Add periods to new AIX options.
2021-01-30 | aix: Permit use of AIX Vector extended ABI mode | David Edelsohn | 4 | -3/+20
AIX only permits use of Altivec VSRs 20-31 in a Vector Extended ABI mode. This patch explicitly enables use of the VSRs using the new -mabi=vec-extabi command line option also implemented in LLVM for AIX. Bootstrapped on powerpc-ibm-aix7.2.3.0 and powerpc64le-linux-gnu. gcc/ChangeLog: * config/rs6000/rs6000.opt (mabi=vec-extabi): New. (mabi=vec-default): New. * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define __EXTABI__ for AIX Vector extended ABI. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print AIX Vector extabi info. (conditional_register_usage): If AIX vec_extabi enabled, vs20-vs31 are non-volatile. * doc/invoke.texi (PowerPC mabi): Add AIX vec-extabi and vec-default.
2021-01-30 | libphobos: Synchronize libdruntime bindings with upstream druntime | Iain Buclaw | 1 | -16/+0
Reviewed-on: https://github.com/dlang/druntime/pull/3348 gcc/d/ChangeLog: * typeinfo.cc (TypeInfoVisitor::visit (TypeInfoDeclaration *)): Don't layout m_arg1 and m_arg2 fields. libphobos/ChangeLog: * Makefile.in: Regenerate. * configure: Regenerate. * libdruntime/MERGE: Merge upstream druntime e4aae28e. * libdruntime/Makefile.am (DRUNTIME_DSOURCES): Refresh module list. (DRUNTIME_DSOURCES_BIONIC): Add core/sys/bionic/err.d. (DRUNTIME_DSOURCES_DARWIN): Add core/sys/darwin/err.d, core/sys/darwin/ifaddrs.d, core/sys/darwin/mach/nlist.d, core/sys/darwin/mach/stab.d, and core/sys/darwin/sys/attr.d. (DRUNTIME_DSOURCES_DRAGONFLYBSD): Add core/sys/dragonflybsd/err.d. (DRUNTIME_DSOURCES_FREEBSD): Add core/sys/freebsd/err.d. (DRUNTIME_DSOURCES_LINUX): Add core/sys/linux/err.d. (DRUNTIME_DSOURCES_NETBSD): Add core/sys/netbsd/err.d. (DRUNTIME_DSOURCES_OPENBSD): Add core/sys/openbsd/err.d. (DRUNTIME_DSOURCES_POSIX): Add core/sys/posix/locale.d, core/sys/posix/stdc/time.d, core/sys/posix/string.d, and core/sys/posix/strings.d. (DRUNTIME_DSOURCES_SOLARIS): Add core/sys/solaris/err.d. (DRUNTIME_DSOURCES_WINDOWS): Add core/sys/windows/sdkddkver.d, and core/sys/windows/stdc/time.d * libdruntime/Makefile.in: Regenerate. * libdruntime/gcc/sections/elf_shared.d (sizeofTLS): New function. * testsuite/libphobos.thread/fiber_guard_page.d: Use __traits(getMember) to get internal fields.
2021-01-30 | i386, df: Fix up gcc.c-torture/compile/20051216-1.c -O1 -march=cascadelake | Jakub Jelinek | 2 | -0/+6
> rtl-optimization/98863 - tame i386 specific RPAD pass > > caused > > FAIL: gcc.c-torture/compile/20051216-1.c -O1 (internal compiler error) > FAIL: gcc.c-torture/compile/20051216-1.c -O1 (test for excess errors) The problem is that we don't revert the df flags back. This patch fixes it by clearing DF_DEFER_INSN_RESCAN after calling df_process_deferred_rescans, so that it doesn't leak into following unprepared passes that expect non-deferred rescans. 2021-01-30 Jakub Jelinek <jakub@redhat.com> * config/i386/i386-features.c (remove_partial_avx_dependency): Clear DF_DEFER_INSN_RESCAN after calling df_process_deferred_rescans. * gcc.target/i386/20051216-1.c: New test.
2021-01-30 | testsuite: Fix up gomp/simd-{2,3}.c tests [PR98243] | Jakub Jelinek | 2 | -2/+4
The test (intentionally) is not gcc.dg/vect/, as it needs -fopenmp and uses OpenMP directives other than simd and therefore can't rely on default VECTFLAGS and so I think can't safely use vect_int effective target either. So, I'm just making sure it is vectorized on x86 and on aarch64 (the latter as an example of a target that doesn't need any extra options to get the vectorization). 2021-01-30 Jakub Jelinek <jakub@redhat.com> PR testsuite/98243 * gcc.dg/gomp/simd-2.c: Add -msse2 on x86. Restrict scan-tree-dump-times to x86 and aarch64 targets. * gcc.dg/gomp/simd-3.c: Likewise.
2021-01-30 | Daily bump. | GCC Administrator | 5 | -1/+302
2021-01-29 | internal/cpu: correctly link to getsystemcfg | Clément Chigot | 1 | -1/+1
Directly set getsystemcfg as //extern in internal/cpu instead of trying to use the runtime as in Go toolchain. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/287932
2021-01-29 | PR testsuite/98870: Fix IEEE 128-bit fortran test | Michael Meissner | 1 | -1/+1
This test started failing when I changed the mapping of IEEE 128-bit long double built-in functions on 2021-01-28. This patch fixes the test so it uses the correct name. gcc/testsuite/ 2021-01-29 Michael Meissner <meissner@linux.ibm.com> PR testsuite/98870 * gcc.target/powerpc/ppc-fortran/ieee128-math.f90: Fix the expected result.
2021-01-29 | [PATCH, rs6000] Fix typo in gcc.target/pr91903.c dg-require stanza | Will Schmidt | 1 | -1/+1
Fix obvious typo in testcase's dg-require stanza. 2021-01-29 Will Schmidt <will_schmidt@vnet.ibm.com> gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr91903.c: Fix dg-require stanza.
2021-01-29 | [PR97701] Modify test for trunk | Vladimir N. Makarov | 1 | -1/+1
The original test was for gcc-10. The modified one is for trunk. gcc/testsuite/ChangeLog: PR target/97701 * gcc.target/aarch64/pr97701.c: Modify.
2021-01-29 | analyzer: consolidate conditionals in paths | David Malcolm | 5 | -0/+283
This patch adds a simplification to analyzer paths for repeated CFG edges generated from compound conditionals. For example, it simplifies:
  |    5 |   if (a && b && c)
  |      |       ^~~~~~~~~~~~
  |      |       |    |    |
  |      |       |    |    (4) ...to here
  |      |       |    |    (5) following ‘true’ branch (when ‘c != 0’)...
  |      |       |    (2) ...to here
  |      |       |    (3) following ‘true’ branch (when ‘b != 0’)...
  |      |       (1) following ‘true’ branch (when ‘a != 0’)...
  |    6 |     __analyzer_dump_path ();
  |      |     ~~~~~~~~~~~~~~~~~~~~~~~
  |      |     |
  |      |     (6) ...to here
to:
  |    5 |   if (a && b && c)
  |      |       ^
  |      |       |
  |      |       (1) following ‘true’ branch...
  |    6 |     __analyzer_dump_path ();
  |      |     ~~~~~~~~~~~~~~~~~~~~~~~
  |      |     |
  |      |     (2) ...to here
gcc/analyzer/ChangeLog: * checker-path.cc (event_kind_to_string): Handle EK_START_CONSOLIDATED_CFG_EDGES and EK_END_CONSOLIDATED_CFG_EDGES. (start_consolidated_cfg_edges_event::get_desc): New. (checker_path::cfg_edge_pair_at_p): New. * checker-path.h (enum event_kind): Add EK_START_CONSOLIDATED_CFG_EDGES and EK_END_CONSOLIDATED_CFG_EDGES. (class start_consolidated_cfg_edges_event): New class. (class end_consolidated_cfg_edges_event): New class. (checker_path::delete_events): New. (checker_path::replace_event): New. (checker_path::cfg_edge_pair_at_p): New decl. * diagnostic-manager.cc (diagnostic_manager::prune_path): Call consolidate_conditions. (same_line_as_p): New. (diagnostic_manager::consolidate_conditions): New. * diagnostic-manager.h (diagnostic_manager::consolidate_conditions): New decl. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/combined-conditionals-1.c: New test.
2021-01-29 | [PR97701] LRA: Don't narrow class only for REG or MEM. | Vladimir N. Makarov | 2 | -6/+23
Reload pseudos of ALL_REGS class did not narrow class from constraint in insn (set (pseudo) (lo_sum ...)) because lo_sum is considered an object (OBJECT_P) although the insn is not a classic move. To permit narrowing we are starting to use MEM_P and REG_P instead of OBJECT_P. gcc/ChangeLog: PR target/97701 * lra-constraints.c (in_class_p): Don't narrow class only for REG or MEM. gcc/testsuite/ChangeLog: PR target/97701 * gcc.target/aarch64/pr97701.c: New.
2021-01-29 | libgo: update to Go1.16rc1 | Ian Lance Taylor | 1 | -1/+1
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/287493
2021-01-29 | [PATCH, rs6000] improve vec_ctf invalid parameter handling. | Will Schmidt | 4 | -6/+81
Hi, Per PR91903, GCC ICEs when we attempt to pass a variable (or out of range value) into the vec_ctf() builtin. Per investigation, the parameter checking exists for this builtin with the int types, but was missing for the long long types. This problem also occurs for the vec_cts() builtin, which is also fixed by this patch. This patch adds the missing CODE_FOR_* entries to the rs6000_expand_binup_builtin to cover that scenario. This patch also updates some existing tests to remove calls to vec_ctf() and vec_cts() that contain negative values. PR target/91903 2020-01-29 Will Schmidt <will_schmidt@vnet.ibm.com> gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_expand_binup_builtin): Add clauses for CODE_FOR_vsx_xvcvuxddp_scale and CODE_FOR_vsx_xvcvsxddp_scale to the parameter checking code. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr91903.c: New test. * gcc.target/powerpc/builtins-1.fold.h: Update. * gcc.target/powerpc/builtins-2.c: Update.
2021-01-29 | c++: Fix unordered entity array [PR 98843] | Nathan Sidwell | 4 | -11/+52
A couple of module invariants are that the modules are always allocated in ascending order and appended to the module array. The entity array is likewise ordered, with each module having spans in that array in ascending order. Prior to header-units, this was provided by the way import declarations were encountered. With header-units we need to load the preprocessor state of header units before we parse the C++, and this can lead to incorrect ordering of the entity array. I had made the initialization of a module's language state a little too lazy. This moves the allocation of entity array spans into the initial read of a module, thus ensuring the ordering of those spans. We won't be looking in them until we've loaded the language portions of that particular module, and even if we did, we'd find NULLs there and issue a diagnostic. PR c++/98843 gcc/cp/ * module.cc (module_state_config): Add num_entities field. (module_state::read_entities): The entity_ary span is already allocated. (module_state::write_config): Write num_entities. (module_state::read_config): Read num_entities. (module_state::write): Set config's num_entities. (module_state::read_initial): Allocate the entity ary span here. (module_state::read_language): Do not set entity_lwm here. gcc/testsuite/ * g++.dg/modules/pr98843_a.C: New. * g++.dg/modules/pr98843_b.H: New. * g++.dg/modules/pr98843_c.C: New.
2021-01-29 | tree-optimization/98866 - Compile time hog in VRP | Andrew MacLeod | 3 | -5/+29
Don't track [1, +INF] for pointer types, treat them as invariant for caching purposes as they cannot be further refined without evaluating to UNDEFINED. PR tree-optimization/98866 * gimple-range-gori.h (gori_compute:set_range_invariant): New. * gimple-range-gori.cc (gori_map::set_range_invariant): New. (gori_map::m_maybe_invariant): Rename from all_outgoing. (gori_map::gori_map): Rename all_outgoing to m_maybe_invariant. (gori_map::is_export_p): Ditto. (gori_map::calculate_gori): Ditto. (gori_compute::set_range_invariant): New. * gimple-range.cc (gimple_ranger::range_of_stmt): Set range invariant for pointers evaluating to [1, +INF].
2021-01-29 | rtl-optimization/98863 - tame i386 specific RPAD pass | Richard Biener | 1 | -10/+7
This removes the DF analysis with expensive problems that we do not use at all and that somehow cause 5GB of memory to leak. Instead, just do a deferred rescan of added insns. 2021-01-29 Richard Biener <rguenther@suse.de> PR rtl-optimization/98863 * config/i386/i386-features.c (remove_partial_avx_dependency): Do not perform DF analysis. (pass_data_remove_partial_avx_dependency): Remove TODO_df_finish.
2021-01-29 | aarch64: Use RTL builtins for [su]mull_n intrinsics | Jonathan Wright | 3 | -24/+20
Rewrite [su]mull_n Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization. gcc/ChangeLog: 2021-01-19 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: Add [su]mull_n builtin generator macros. * config/aarch64/aarch64-simd.md (aarch64_<su>mull_n<mode>): Define. * config/aarch64/arm_neon.h (vmull_n_s16): Use RTL builtin instead of inline asm. (vmull_n_s32): Likewise. (vmull_n_u16): Likewise. (vmull_n_u32): Likewise.
2021-01-29 | aarch64: Reimplement vabdl_high* intrinsics using builtins | Kyrylo Tkachov | 3 | -41/+15
This patch reimplements the vabdl_high intrinsics using builtins. It slightly cleans up the RTL pattern (the mode iterators) but nothing interesting apart from that. gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (sabdl2, uabdl2): Define builtins. * config/aarch64/aarch64-simd.md (aarch64_<sur>abdl2<mode>_3): Rename to... (aarch64_<sur>abdl2<mode>): ... This. (<sur>sadv16qi): Adjust use of above. * config/aarch64/arm_neon.h (vabdl_high_s8): Reimplement using builtin. (vabdl_high_s16): Likewise. (vabdl_high_s32): Likewise. (vabdl_high_u8): Likewise. (vabdl_high_u16): Likewise. (vabdl_high_u32): Likewise.
2021-01-29 | aarch64: Re-implement vabal_high* intrinsics using builtins | Kyrylo Tkachov | 5 | -36/+27
This patch reimplements the vabal_high* intrinsics using RTL builtins. It's straightforward, defining new unspecs and a new pattern. gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (sabal2): Define builtin. (uabal2): Likewise. * config/aarch64/aarch64-simd.md (aarch64_<sur>abal2<mode>): New pattern. * config/aarch64/aarch64.md (unspec): Add UNSPEC_SABAL2 and UNSPEC_UABAL2. * config/aarch64/arm_neon.h (vabal_high_s8): Reimplement using builtin. (vabal_high_s16): Likewise. (vabal_high_s32): Likewise. (vabal_high_u8): Likewise. (vabal_high_u16): Likewise. (vabal_high_u32): Likewise. * config/aarch64/iterators.md (ABAL2): New mode iterator. (sur): Handle UNSPEC_SABAL2, UNSPEC_UABAL2.
2021-01-29 | aarch64: Reimplement vabal* intrinsics using builtins | Kyrylo Tkachov | 3 | -45/+21
This patch reimplements the vabal intrinsics with builtins. The RTL pattern is cleaned up to emit the right .8b suffixes for the inputs (though .16b is also accepted) and iterate over the right modes. The pattern's only other use is through the sadv16qi expander, which is adjusted. I've verified that the codegen for sadv16qi is not worse off. gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (sabal): Define builtin. (uabal): Likewise. * config/aarch64/aarch64-simd.md (aarch64_<sur>abal<mode>_4): Rename to... (aarch64_<sur>abal<mode>): ... This. (<sur>sadv16qi): Adjust use of the above. * config/aarch64/arm_neon.h (vabal_s8): Reimplement using builtin. (vabal_s16): Likewise. (vabal_s32): Likewise. (vabal_u8): Likewise. (vabal_u16): Likewise. (vabal_u32): Likewise.
2021-01-29 | aarch64: Reimplement vaddlv* intrinsics using builtins | Kyrylo Tkachov | 5 | -66/+104
This patch reimplements the vaddlv* intrinsics using builtins. The vaddlv_s32 and vaddlv_u32 intrinsics actually perform a pairwise SADDLP/UADDLP instead of a SADDLV/UADDLV but because they only use two elements it has the same semantics. gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (saddlv, uaddlv): Define builtins. * config/aarch64/aarch64-simd.md (aarch64_<su>addlv<mode>): Define. * config/aarch64/arm_neon.h (vaddlv_s8): Reimplement using builtin. (vaddlv_s16): Likewise. (vaddlv_u8): Likewise. (vaddlv_u16): Likewise. (vaddlvq_s8): Likewise. (vaddlvq_s16): Likewise. (vaddlvq_s32): Likewise. (vaddlvq_u8): Likewise. (vaddlvq_u16): Likewise. (vaddlvq_u32): Likewise. (vaddlv_s32): Likewise. (vaddlv_u32): Likewise. * config/aarch64/iterators.md (VDQV_L): New mode iterator. (unspec): Add UNSPEC_SADDLV, UNSPEC_UADDLV. (Vwstype): New mode attribute. (Vwsuf): Likewise. (VWIDE_S): Likewise. (USADDLV): New int iterator. (su): Handle UNSPEC_SADDLV, UNSPEC_UADDLV. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/vaddlv_1.c: New test.
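The equivalence claimed for the two-element case can be checked with a scalar model. This is only a sketch in portable C++: `addlv2` and `addlp2` are hypothetical stand-ins for the across-lanes and pairwise widening adds, not the Neon intrinsics or the GCC builtins themselves.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2S32 = std::array<std::int32_t, 2>;  // models a 2 x 32-bit vector

// Across-lanes widening add: sum every lane into one 64-bit scalar.
inline std::int64_t addlv2(const V2S32 &v) {
    std::int64_t sum = 0;
    for (std::int32_t lane : v) sum += lane;
    return sum;
}

// Pairwise widening add: each adjacent pair becomes one 64-bit lane.
// With two input lanes there is exactly one pair, so one result lane.
inline std::array<std::int64_t, 1> addlp2(const V2S32 &v) {
    return {static_cast<std::int64_t>(v[0]) + static_cast<std::int64_t>(v[1])};
}

// Usage example: both forms agree for a two-lane input.
inline void addlv_example() {
    const V2S32 v{3, 4};
    assert(addlv2(v) == 7);
    assert(addlp2(v)[0] == addlv2(v));
}
```

Because a two-lane vector has a single adjacent pair, the pairwise result has one lane and it holds the same widened sum as the across-lanes form.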
2021-01-29 | aarch64: Use RTL builtins for [su]mlsl_lane[q] intrinsics | Jonathan Wright | 3 | -104/+77
Rewrite [su]mlsl_lane[q] Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization. gcc/ChangeLog: 2021-01-28 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_lane[q] builtin generator macros. * config/aarch64/aarch64-simd.md (aarch64_vec_<su>mlsl_lane<Qlane>): Define. * config/aarch64/arm_neon.h (vmlsl_lane_s16): Use RTL builtin instead of inline asm. (vmlsl_lane_s32): Likewise. (vmlsl_lane_u16): Likewise. (vmlsl_lane_u32): Likewise. (vmlsl_laneq_s16): Likewise. (vmlsl_laneq_s32): Likewise. (vmlsl_laneq_u16): Likewise. (vmlsl_laneq_u32): Likewise.
2021-01-29 | change unit of --param max-gcse-memory to kB | Richard Biener | 3 | -5/+5
This changes it from bytes to kB since its value is limited to 2147483648. 2021-01-29 Richard Biener <rguenther@suse.de> * doc/invoke.texi (--param max-gcse-memory): Document unit of size. * gcse.c (gcse_or_cprop_is_too_expensive): Adjust. * params.opt (--param max-gcse-memory): Adjust default and document unit of size.
2021-01-29 | rtl-optimization/98863 - fix PRE/CPROP memory usage check | Richard Biener | 1 | -5/+6
This fixes an overflow of the memory usage estimate, which in turn caused the pass to fail to disable itself on WRF with LTO, leading to a memory peak of a few GB. 2021-01-29 Richard Biener <rguenther@suse.de> PR rtl-optimization/98863 * gcse.c (gcse_or_cprop_is_too_expensive): Use unsigned HOST_WIDE_INT for the memory estimate.
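The kind of overflow described here can be illustrated with a small sketch. The formula below is a hypothetical estimate chosen for illustration, not the computation in gcse.c; the function names and the example limit are likewise assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical estimate: one bitmap row of n_exprs bits per basic block.
// A 32-bit result wraps modulo 2^32 for large inputs.
inline std::uint32_t estimate_narrow(std::uint32_t n_blocks, std::uint32_t n_exprs) {
    return n_blocks * (n_exprs / 8u);
}

// The same estimate carried out in 64 bits, so it cannot wrap here.
inline std::uint64_t estimate_wide(std::uint32_t n_blocks, std::uint32_t n_exprs) {
    return static_cast<std::uint64_t>(n_blocks) * (n_exprs / 8u);
}

// The limit is taken in kB (assumed unit for this sketch).
inline bool too_expensive(std::uint64_t bytes, std::uint64_t limit_kb) {
    return bytes / 1024u > limit_kb;
}

// Usage example: a wrapped estimate slips under a limit that the
// wide estimate correctly exceeds.
inline void estimate_example() {
    const std::uint32_t blocks = 1u << 20, exprs = 1u << 16;
    assert(!too_expensive(estimate_narrow(blocks, exprs), 131072));
    assert(too_expensive(estimate_wide(blocks, exprs), 131072));
}
```

With a wrapped estimate the expense check passes silently, which is the failure mode the commit message describes: the pass does not disable itself even though the real memory demand is far above the limit.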
2021-01-29 | tree-optimization/97627 - Avoid computing niters for fake edges | Richard Biener | 2 | -0/+49
This avoids computing niters information for fake edges. 2021-01-29 Bin Cheng <bin.cheng@linux.alibaba.com> Richard Biener <rguenther@suse.de> PR tree-optimization/97627 * tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions): Do not analyze fake edges. * g++.dg/pr97627.C: New testcase.
2021-01-29 | rtl-optimization/98144 - tame REE memory usage | Richard Biener | 2 | -7/+22
This changes the REE dataflow to change the explicit all-ones starting solution to be implicit via a visited flag, removing the need to initially start with fully populated bitmaps for all basic-blocks. That reduces peak memory use when compiling the RTL checking enabled insn-extract.c testcase from PR98144 from 6GB to less than 2GB. 2021-01-29 Richard Biener <rguenther@suse.de> PR rtl-optimization/98144 * df.h (df_mir_bb_info): Add con_visited member. * df-problems.c (df_mir_alloc): Initialize con_visited, do not fully populate IN and OUT. (df_mir_reset): Likewise. (df_mir_confluence_0): Set con_visited. (df_mir_confluence_n): Properly handle implicitly fully populated IN and OUT as designated by con_visited and update con_visited accordingly.
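The idea of an implicit all-ones starting solution can be sketched with a toy intersection problem. This is a simplified model under stated assumptions, not the df-problems.c code: `BlockInfo` and `confluence_n` are hypothetical, a fixed 64-bit set stands in for the bitmaps, and one flag covers both IN and OUT.

```cpp
#include <bitset>
#include <cassert>

using Bits = std::bitset<64>;

struct BlockInfo {
    Bits in;                   // starts empty; "all ones" is implied instead
    Bits out;
    bool con_visited = false;  // false means in/out stand for the full set
};

// Merge predecessor SRC into DST for a "must" (intersection) problem.
inline void confluence_n(BlockInfo &dst, const BlockInfo &src) {
    if (!src.con_visited) return;  // intersecting with the full set is a no-op
    if (!dst.con_visited) {
        dst.in = src.out;          // full set AND src.out == src.out
        dst.con_visited = true;
    } else {
        dst.in &= src.out;
    }
}

// Usage example: the first merge copies, later merges intersect.
inline void confluence_example() {
    BlockInfo a, b, d;
    a.out = Bits(0xA); a.con_visited = true;
    b.out = Bits(0xC); b.con_visited = true;
    confluence_n(d, a);
    confluence_n(d, b);
    assert(d.in == Bits(0x8));
}
```

The design point is that no block ever has to allocate a fully populated set up front: the flag records "still the universe" in one bit, and a real set is only materialized on the first merge.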
2021-01-29 | arm: Fix up -mcpu=iwmmxt ICEs [PR98849] | Jakub Jelinek | 2 | -4/+64
The https://gcc.gnu.org/r11-6707-g7432f255b70811dafaf325d94036ac580891de69 https://gcc.gnu.org/r11-6708-gbfab355012ca0f5219da8beb04f2fdaf757d34b7 changes moved the vashl/vashr/vlshr expanders from neon.md to vec-common.md and changed their condition from TARGET_NEON to ARM_HAVE_<MODE>_ARITH, so that they apply also for TARGET_HAVE_MVE. But, the ARM_HAVE_<MODE>_ARITH macros are sometimes true also for TARGET_REALLY_IWMMXT, which at least from quick skimming of former iwmmxt*.md doesn't have such instructions, so it seems incorrect to enable them for iwmmxt. Furthermore, even if it had them, iwmmxt doesn't support any way to broadcast values in those modes (vec_duplicate and vec_init optabs) and the middle end relies on if the vector x vector shift/rotate patterns are supported it can emit vector x scalar shift/rotate by broadcasting the shift amount to a vector. As the TARGET_NEON vs. TARGET_REALLY_IWMMXT vs. TARGET_HAVE_MVE never seem to be enabled together, I think we can just write it the following way. Note, seems iwmmxt actually does support vector x scalar shifts, but doesn't really enable the optabs that would tell the middle-end code that it does (and neon and mve don't seem to support those). I'll defer that to anybody that cares about iwmmxt (if any). 2021-01-29 Jakub Jelinek <jakub@redhat.com> PR target/98849 * config/arm/vec-common.md (mve_vshlq_<supf><mode>, vashl<mode>3, vashr<mode>3, vlshr<mode>3): Add && !TARGET_REALLY_IWMMXT to conditions. * gcc.c-torture/compile/pr98849.c: New test.
2021-01-29 | expand: Fix up find_bb_boundaries [PR98331] | Jakub Jelinek | 2 | -0/+19
When expansion emits some control flow insns etc. inside of a former GIMPLE basic block, find_bb_boundaries needs to split it into multiple basic blocks. The code needs to ignore debug insns in decisions how many splits to do or where in between some non-debug insns the split should be done, but it can decide where to put debug insns if they can be kept and otherwise throws them away (they can't stay outside of basic blocks). On the following testcase, we end up in the bb from expander with control flow insn debug insns barrier some other insn (the some other insn is effectively dead after __builtin_unreachable and we'll optimize that out later). Without debug insns, we'd do the split when encountering some other insn and split after PREV_INSN (some other insn), i.e. after barrier (and the splitting code then moves the barrier in between basic blocks). But if there are debug insns, we actually split before the first debug insn that appeared after the control flow insn, so after control flow insn, and get a basic block that starts with debug insns and then has a barrier in the middle that nothing moves it out of the bb. This leads to ICEs and even if it wouldn't, different behavior from -g0. The reason for treating debug insns that way is a different case, e.g. control flow insn debug insns some other insn or even control flow insn barrier debug insns some other insn where splitting before the first such debug insn allows us to keep them while otherwise we would have to drop them on the floor, and in those situations we behave the same with -g0 and -g. 
So, the following patch fixes it by resetting debug_insn not just when splitting the blocks (it is set only after seeing a control flow insn and before splitting for it if needed), but also when seeing a barrier, which effectively means we always throw away debug insns after a control flow insn and before following barrier if any, but there is no way around that, control flow insn must be the last in the bb (BB_END) and BARRIER after it, debug insns aren't allowed outside of bb. We still handle the other cases fine (when there is no barrier or when debug insns appear only after the barrier). 2021-01-29 Jakub Jelinek <jakub@redhat.com> PR debug/98331 * cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing a BARRIER. * gcc.dg/pr98331.c: New test.
2021-01-29 | testsuite: Run vec_insert case on P8 and P9 with option specified | Xionghu Luo | 6 | -34/+45
Move run_test and TEST_VEC_INSERT_ALL to a header file for shared use. gcc/testsuite/ChangeLog: 2021-01-29 Xionghu Luo <luoxhu@linux.ibm.com> * gcc.target/powerpc/pr79251.p8.c: Move TEST_VEC_INSERT_ALL to ... * gcc.target/powerpc/pr79251.h: ...this. * gcc.target/powerpc/pr79251.p9.c: Likewise. * gcc.target/powerpc/pr79251-run.c: Move run_test to pr79251.h. Rename to... * gcc.target/powerpc/pr79251-run.p8.c: ...this. * gcc.target/powerpc/pr79251-run.p9.c: New test.
2021-01-28 | c++: Fix infinite looping with invalid operator [PR96137] | Marek Polacek | 2 | -1/+11
My r11-86 adjusted cp_parser_class_name to do - scope = parser->scope; + scope = parser->scope ? parser->scope : parser->context->object_type; if (scope == error_mark_node) return error_mark_node; but that caused endless looping in cp_parser_type_specifier_seq (the while (true) loop) in this invalid test, because we never set a parser error, therefore cp_parser_type_specifier returned error_mark_node instead of NULL_TREE, and we never issued the "expected type-specifier" error. At first I thought I'd just add cp_parser_simulate_error right before the return, but that regresses crash81.C -- we'd emit multiple errors for "T::X". So the next best thing seemed to revert to pre-r11-86 behavior: return early when parser->scope is bad, otherwise proceed to get the parser error. gcc/cp/ChangeLog: PR c++/96137 * parser.c (cp_parser_class_name): If parser->scope is error_mark_node, return it, otherwise continue. gcc/testsuite/ChangeLog: PR c++/96137 * g++.dg/parse/error63.C: New test.
2021-01-29  Daily bump.  GCC Administrator  7  -1/+223
2021-01-28  gccgo driver: always act as though -g is passed  Ian Lance Taylor  1  -0/+24
The go1 compiler always turns on debugging, to support Go stack traces and functions like runtime.Callers. With the recent switch to turn on DWARF 5 by default, this caused failures with some versions of gas, such as 2.35.1, because the assembly code would assume DWARF 5 but the driver would not pass --gdwarf-5 to gas. gas would then give an error: "file number less than one".

This change avoids that problem by having the gccgo driver spec add a -g option to the command line if no other -g option is present. The newly added -g option is passed to the assembler as --gdwarf-5.

        * gospec.c (lang_specific_driver): Add -g if no debugging
        options were passed.
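The rule "add -g if no other -g option is present" can be sketched as follows. This is a minimal illustration, not the code in gospec.c: add_default_debug_option is a hypothetical helper, and treating every argument that starts with "-g" as a debugging option is a simplifying assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, NOT the real lang_specific_driver: returns the
// argument list with "-g" appended when no argument starting with "-g"
// is already present.  The prefix test is a simplifying assumption.
std::vector<std::string> add_default_debug_option(
    std::vector<std::string> args) {
  for (const std::string& arg : args) {
    if (arg.rfind("-g", 0) == 0) {
      return args;  // a debugging option was passed; leave the list alone
    }
  }
  args.push_back("-g");
  return args;
}
```

Under these assumptions, {"gccgo", "-c", "x.go"} gains a trailing "-g", while a list that already contains "-g1" or "-gdwarf-4" is returned unchanged.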
2021-01-29  c++: Fix -Weffc++ in templates [PR98841]  Jakub Jelinek  2  -1/+26
We emit a bogus warning on the following testcase, suggesting that the operator should return *this even when it does that already. The problem is that normally cp_build_indirect_ref_1 ensures that *this is folded as current_class_ref, but in templates (if the return type is non-dependent; otherwise check_return_expr doesn't check it) it didn't go through cp_build_indirect_ref_1, but just built another INDIRECT_REF, which then doesn't compare pointer-equal to current_class_ref.

The following patch fixes it by doing in build_x_indirect_ref for *this what cp_build_indirect_ref_1 would do.

2021-01-28  Jakub Jelinek  <jakub@redhat.com>

        PR c++/98841
        * typeck.c (build_x_indirect_ref): For *this, return
        current_class_ref.

        * g++.dg/warn/effc5.C: New test.
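Code of the shape the message describes might look like the sketch below: an assignment operator that is a template, has a non-dependent return type, and already returns *this. This is not the contents of g++.dg/warn/effc5.C; Counter is a hypothetical class, and whether this exact snippet triggered the bogus warning before the fix is an assumption.

```cpp
// Sketch only; Counter is a hypothetical class, not the real testcase.
struct Counter {
  int value = 0;

  // Member template whose return type (Counter&) is non-dependent, so the
  // return expression is checked while still inside a template.
  template <typename T>
  Counter& operator=(const T& v) {
    value = static_cast<int>(v);
    return *this;  // already returns *this, so -Weffc++ should stay quiet
  }
};
```

Compiling such code with -Weffc++ should produce no "should return a reference to *this" diagnostic, since the operator already does so.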
2021-01-28  tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]  Marek Polacek  4  -4/+64
A year ago I submitted this patch:

~~
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.

When we strip_typedefs the element of the array "const d", we see it's a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is char, but we need to add the const qualifier, so we call cp_build_qualified_type -> build_qualified_type where get_qualified_type checks to see if we already have such a type by walking the variants list, which in this case is:

  char -> c -> const char -> const char -> d -> const d

Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN, we choose the first const char, which has TYPE_USER_ALIGN set. If the element type of an array has TYPE_USER_ALIGN, the array type gets it too.

So we can make check_base_type stricter. I was afraid that it might make us reuse types less often, but measuring showed that we build the same number of types with and without the patch, while bootstrapping.
~~

However, the patch broke a few tests on STRICT_ALIGNMENT platforms and had to be reverted. This is another try. The original patch is kept unchanged, but I added the finalize_type_size hunk that ought to fix the STRICT_ALIGNMENT issues.

The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the main variant of a type, but doesn't clear it on any of the variants. Then we end up with types which share the same TYPE_MAIN_VARIANT, but their TYPE_CANONICAL differs and then the usual "canonical types differ for identical types" follows.

I've created alignas19.C to exercise this scenario. What happens is (T_U_A stands for TYPE_USER_ALIGN):
- when parsing the class S we create a type S in xref_tag,
- we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
- we parse the member function fn and build_memfn_type creates a copy of S to add const; this variant has T_U_A set,
- we finish_struct S which calls layout_class_type -> finish_record_type -> finalize_type_size where we reset T_U_A in S (but const S keeps it),
- finish_non_static_data_member for arr calls maybe_dummy_object with type = S,
- maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p to check if S and TREE_TYPE (current_class_ref), which is const S, are the same,
- same_type_ignoring_top_level_qualifiers_p creates cv-unqualified versions of the passed types. Previously we'd use our main variant S when stripping "const S" of const, but since the T_U_A flags don't match (check_base_type), we create a new variant S'. Then we crash in comptypes because S and S' have the same TYPE_MAIN_VARIANT but different TYPE_CANONICALs.

With my patch we'll clear T_U_A for S's variants too, and then instead of S' we'll just use S.

gcc/ChangeLog:

        PR c++/94775
        * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
        the main variant, maybe reset it in its variants too.
        * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
        (check_aligned_type): Check if TYPE_USER_ALIGN match.

gcc/testsuite/ChangeLog:

        PR c++/94775
        * g++.dg/cpp0x/alignas19.C: New test.
        * g++.dg/warn/Warray-bounds15.C: New test.
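The effect of the stricter check on the variants walk can be sketched with a toy model. Everything here is hypothetical: Variant, find_variant and example_variants are illustration-only names, the fields only loosely mirror TYPE_NAME, the const qualifier, TYPE_ALIGN and TYPE_USER_ALIGN, and the real get_qualified_type and check_base_type compare more than this. Of the flag values in the list, only "the first const char has TYPE_USER_ALIGN set" comes from the message; the rest are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a type variant; NOT GCC's tree nodes.
struct Variant {
  std::string name;
  bool is_const;
  unsigned align;
  bool user_align;
};

// Returns the index of the first variant of `base` with the wanted
// constness, or -1 if there is none.  With strict == false only `align`
// is compared (the old check); with strict == true `user_align` has to
// match as well (the stricter check).
int find_variant(const std::vector<Variant>& variants, const Variant& base,
                 bool want_const, bool strict) {
  for (std::size_t i = 0; i < variants.size(); ++i) {
    const Variant& cand = variants[i];
    if (cand.name != base.name || cand.is_const != want_const) continue;
    if (cand.align != base.align) continue;
    if (strict && cand.user_align != base.user_align) continue;
    return static_cast<int>(i);
  }
  return -1;
}

// The list from the message:
//   char -> c -> const char -> const char -> d -> const d
// Flag values other than the first const char's are assumptions.
std::vector<Variant> example_variants() {
  return {
      {"char", false, 8, false},
      {"c", false, 8, true},
      {"char", true, 8, true},   // first const char, user_align set
      {"char", true, 8, false},  // second const char
      {"d", false, 8, false},
      {"d", true, 8, false},
  };
}
```

With base = char and a request for the const variant, the loose walk stops at the first const char (index 2), which carries user_align, while the strict walk skips it and picks the second one (index 3).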
2021-01-28  arm: Adjust cost of vector of constant zero  Christophe Lyon  2  -5/+17
Neon vector comparisons have a dedicated form for comparing against constant zero, so the constant zero vector costs nothing there. Adjust the cost in arm_rtx_costs_internal accordingly, for Neon only, since MVE does not support this.

2021-01-28  Christophe Lyon  <christophe.lyon@linaro.org>

        gcc/
        PR target/98730
        * config/arm/arm.c (arm_rtx_costs_internal): Adjust cost of vector
        of constant zero for comparisons.

        gcc/testsuite/
        PR target/98730
        * gcc.target/arm/simd/vceqzq_p64.c: Update expected result.