path: root/gcc
Age  Commit message  Author  Files  Lines
2020-10-10PR97359: Do not cache relops in GORI cache.Aldy Hernandez2-8/+13
logical_stmt_cache::cacheable_p() returns true for relops, but logical_combine (which does the caching) doesn't handle them and ICEs. This patch fixes the inconsistency by returning false for relops. This was working before because even though logical_combine doesn't handle relops, statements with only one SSA are handled in cache_stmt, which seems like the only statement we've ever encountered (even through a full Fedora build). lhs = s_5 > 999; However, with two SSA operands we ICE because logical_combine doesn't handle them: lhs = s_5 > y_8; We can either return false for relops in cacheable_p, or fix logical_combine to handle them. The original idea was to only cache ANDs and ORs, so I've done the former to unbreak trunk. We can decide later if there was ever any benefit in caching relops. gcc/ChangeLog: PR tree-optimization/97359 * gimple-range-gori.cc (logical_stmt_cache::cacheable_p): Only handle ANDs and ORs. (gori_compute_cache::cache_stmt): Adjust comment. gcc/testsuite/ChangeLog: * gcc.dg/pr97359.c: New test.
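For illustration, the statement shapes named in this message correspond roughly to C source like the following. This is a minimal sketch and not the gcc.dg/pr97359.c testcase: the function names are made up, and the GIMPLE shown in the comments is only the approximate form each expression would lower to.

```c
#include <assert.h>

/* One SSA operand compared with a constant, roughly:  lhs = s_5 > 999;  */
int relop_with_constant(int s)
{
    return s > 999;
}

/* Two SSA operands, roughly:  lhs = s_5 > y_8;
   This is the shape the message says logical_combine does not handle.  */
int relop_with_two_names(int s, int y)
{
    return s > y;
}

/* A logical AND of two relops, the kind of statement the message says
   the cache was originally meant for.  */
int and_of_relops(int s, int y)
{
    int in_range = (s > 999) & (s < y);
    assert(in_range == 0 || in_range == 1);
    return in_range;
}
```

Whether a particular function actually reaches the GORI cache depends on the optimization passes that run, so the sketch only shows the source-level shapes.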
2020-10-10Daily bump.GCC Administrator3-1/+208
2020-10-09Don't keep strict_low_part in reloads for non-registers. [PR97313]Vladimir N. Makarov2-1/+30
gcc/ChangeLog: 2020-10-09 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/97313 * lra-constraints.c (match_reload): Don't keep strict_low_part in reloads for non-registers. gcc/testsuite/ChangeLog: 2020-10-09 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/97313 * gcc.target/i386/pr97313.c: New.
2020-10-09x86: Add <x86gprintrin.h>H.J. Lu41-272/+463
For sources which can't use any vector instructions, <x86intrin.h> and <immintrin.h> cannot be included for compiler intrinsics: $ echo "#include <x86intrin.h>" | gcc -S -O2 -mno-sse -mno-mmx -x c - In file included from /usr/include/stdlib.h:1013, from /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27, from /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34, from /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29, from /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32, from <stdin>:1: /usr/include/bits/stdlib-float.h: In function ‘atof’: /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled 26 | { | ^ $ libgcc/config/i386/shadow-stack-unwind.h has a workaround: /* NB: We need _get_ssp and _inc_ssp from <cetintrin.h>. But we can't include <x86intrin.h> which ends up including <mm_malloc.h>, which includes <stdlib.h> and <errno.h> unconditionally. But we can't include any libc system headers unconditionally from libgcc. Avoid including <mm_malloc.h> here by defining _IMMINTRIN_H_INCLUDED. */ #define _IMMINTRIN_H_INCLUDED #include <cetintrin.h> #undef _IMMINTRIN_H_INCLUDED Add a standalone intrinsic header file, <x86gprintrin.h>, to provide integer only intrinsics. All integer only intrinsics are placed in <x86gprintrin.h>. <x86intrin.h> and <immintrin.h> simply include <x86gprintrin.h>. gcc/ PR target/97148 * config.gcc (extra_headers): Add x86gprintrin.h. * config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for <x86gprintrin.h>. * config/i386/bmi2intrin.h: Likewise. * config/i386/bmiintrin.h: Likewise. * config/i386/cetintrin.h: Likewise. * config/i386/cldemoteintrin.h: Likewise. * config/i386/clflushoptintrin.h: Likewise. * config/i386/clwbintrin.h: Likewise. * config/i386/enqcmdintrin.h: Likewise. * config/i386/fxsrintrin.h: Likewise. * config/i386/ia32intrin.h: Likewise. * config/i386/lwpintrin.h: Likewise. * config/i386/lzcntintrin.h: Likewise. * config/i386/movdirintrin.h: Likewise. 
* config/i386/pconfigintrin.h: Likewise. * config/i386/pkuintrin.h: Likewise. * config/i386/rdseedintrin.h: Likewise. * config/i386/rtmintrin.h: Likewise. * config/i386/serializeintrin.h: Likewise. * config/i386/tbmintrin.h: Likewise. * config/i386/tsxldtrkintrin.h: Likewise. * config/i386/waitpkgintrin.h: Likewise. * config/i386/wbnoinvdintrin.h: Likewise. * config/i386/xsavecintrin.h: Likewise. * config/i386/xsaveintrin.h: Likewise. * config/i386/xsaveoptintrin.h: Likewise. * config/i386/xsavesintrin.h: Likewise. * config/i386/xtestintrin.h: Likewise. * config/i386/immintrin.h: Include <x86gprintrin.h> instead of <fxsrintrin.h>, <xsaveintrin.h>, <xsaveoptintrin.h>, <xsavesintrin.h>, <xsavecintrin.h>, <lzcntintrin.h>, <bmiintrin.h>, <bmi2intrin.h>, <xtestintrin.h>, <cetintrin.h>, <movdirintrin.h>, <sgxintrin.h, <pconfigintrin.h>, <waitpkgintrin.h>, <cldemoteintrin.h>, <enqcmdintrin.h>, <serializeintrin.h>, <tsxldtrkintrin.h>, <adxintrin.h>, <clwbintrin.h>, <clflushoptintrin.h>, <wbnoinvdintrin.h> and <pkuintrin.h>. (_wbinvd): Moved to config/i386/x86gprintrin.h. (_rdrand16_step): Likewise. (_rdrand32_step): Likewise. (_rdpid_u32): Likewise. (_readfsbase_u32): Likewise. (_readfsbase_u64): Likewise. (_readgsbase_u32): Likewise. (_readgsbase_u64): Likewise. (_writefsbase_u32): Likewise. (_writefsbase_u64): Likewise. (_writegsbase_u32): Likewise. (_writegsbase_u64): Likewise. (_rdrand64_step): Likewise. (_ptwrite64): Likewise. (_ptwrite32): Likewise. * config/i386/x86gprintrin.h: New file. * config/i386/x86intrin.h: Include <x86gprintrin.h>. Don't include <ia32intrin.h>, <lwpintrin.h>, <tbmintrin.h>, <popcntintrin.h>, <mwaitxintrin.h> and <clzerointrin.h>. gcc/testsuite/ * gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to support <lwpintrin.h> included in <x86gprintrin.h>. (__builtin_ia32_lwpval64): Likewise. (__builtin_ia32_lwpins32): Likewise. (__builtin_ia32_lwpins64): Likewise. 
(__builtin_ia32_bextri_u32): New to support <tbmintrin.h> included in <x86gprintrin.h>. (__builtin_ia32_bextri_u64): Likewise. * gcc.target/i386/x86gprintrin-1.c: New test. * gcc.target/i386/x86gprintrin-2.c: Likewise. * gcc.target/i386/x86gprintrin-3.c: Likewise. * gcc.target/i386/x86gprintrin-4.c: Likewise. * gcc.target/i386/x86gprintrin-4a.c: Likewise. * gcc.target/i386/x86gprintrin-5.c: Likewise. * gcc.target/i386/x86gprintrin-5a.c: Likewise. * gcc.target/i386/x86gprintrin-5b.c: Likewise. * gcc.target/i386/x86gprintrin-6.c: Likewise. libgcc/ PR target/97148 * config/i386/shadow-stack-unwind.h: Include <x86gprintrin.h> instead of <cetintrin.h>.
2020-10-09[nvptx] Set -misa=sm_35 by defaultTom de Vries2-2/+6
The nvptx-as assembler verifies the ptx code using ptxas, if there's any in the PATH. The default in the nvptx port for -misa=sm_xx is sm_30, but the ptxas of the latest cuda release (11.1) no longer supports sm_30. Consequently we cannot build gcc against that release (although we should still be able to build without any cuda release). Fix this by setting -misa=sm_35 by default. Tested check-gcc on nvptx. Tested libgomp on x86_64-linux with nvptx accelerator. Both built against cuda 9.1. gcc/ChangeLog: 2020-10-09 Tom de Vries <tdevries@suse.de> PR target/97348 * config/nvptx/nvptx.h (ASM_SPEC): Also pass -m to nvptx-as if default is used. * config/nvptx/nvptx.opt (misa): Init with PTX_ISA_SM35.
2020-10-09Fixup gcc.dg/vect/pr65947-3.c when masked loads are availableRichard Biener3-4/+16
The following adds an effective target to properly allow the gcc.dg/vect/pr65947-3.c expected vectorization to be adjusted when run with, say, -march=cascadelake. 2020-10-09 Richard Biener <rguenther@suse.de> gcc/ * doc/sourcebuild.texi (vect_masked_load): Document. gcc/testsuite * lib/target-supports.exp (check_effective_target_vect_masked_load): New effective target. * gcc.dg/vect/pr65947-3.c: Update.
2020-10-09tree-optimization/97334 - improve BB SLP discoveryRichard Biener2-0/+25
We're running into a multiplication with one unvectorizable operand we expect to build from scalars but SLP discovery fatally fails the build of both since one stmt is commutated: _60 = _58 * _59; _63 = _59 * _62; _66 = _59 * _65; ... where _59 is the "bad" operand. The following patch makes the case work where the first stmt has a good operand by not fatally failing the SLP build for the operand but communicating upwards how to commutate. 2020-10-09 Richard Biener <rguenther@suse.de> PR tree-optimization/97334 * tree-vect-slp.c (vect_build_slp_tree_1): Do not fatally fail lanes other than zero when BB vectorizing. * gcc.dg/vect/bb-slp-pr65935.c: Amend.
2020-10-09IPA modref: fix miscompilation in clone when IPA modref is usedJan Hubicka1-1/+2
gcc/ChangeLog: PR ipa/97292 PR ipa/97335 * ipa-modref-tree.h (copy_from): Drop summary in a clone.
2020-10-09tree-optimization/97347 - fix another SLP constant insertion issueRichard Biener2-6/+54
Just use edge insertion which will appropriately handle the situation from botan. 2020-10-09 Richard Biener <rguenther@suse.de> PR tree-optimization/97347 * tree-vect-slp.c (vect_create_constant_vectors): Use edge insertion when inserting on the fallthru edge, appropriately insert at the start of BBs when inserting after PHIs. * g++.dg/vect/pr97347.cc: New testcase.
2020-10-09Fix for PR97317.Andrew MacLeod2-7/+29
gcc/ChangeLog: PR tree-optimization/97317 * range-op.cc (operator_cast::op1_range): Handle casts where the precision of the RHS is only 1 greater than the precision of the LHS. gcc/testsuite/ChangeLog: * gcc.dg/pr97317.c: New test.
2020-10-09random memory leak fixesRichard Biener6-13/+30
This fixes leaks discovered checking whether I introduced new ones with the last vectorizer changes. 2020-10-09 Richard Biener <rguenther@suse.de> * cgraphunit.c (expand_all_functions): Free tp_first_run_order. * ipa-modref.c (pass_ipa_modref::execute): Free order. * tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Free loop body. * tree-vect-data-refs.c (vect_find_stmt_data_reference): Free data references upon failure. * tree-vect-loop.c (update_epilogue_loop_vinfo): Free BBs array of the original loop. * tree-vect-slp.c (vect_slp_bbs): Use an auto_vec for dataref_groups to release its memory.
2020-10-09vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]Jakub Jelinek3-66/+136
> Perhaps another way out of this would be document and enforce that > __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn > calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2 The following patch implements that, i.e. __builtin_c?z* now take full advantage of them being UB at zero, while the ifns are well defined at zero if *_DEFINED_VALUE_AT_ZERO (*) == 2. That is what fixes PR94801. Furthermore, to fix PR97312, if it is well defined at zero and the value at zero is prec, we don't lower the maximum unless the argument is known to be non-zero. For gimple-range.cc I guess we could improve it if needed e.g. by returning a [0,7][32,32] range for .CTZ of e.g. [0,137], but for now it (roughly) matches what vr-values.c does. 2020-10-09 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94801 PR target/97312 * vr-values.c (vr_values::extract_range_basic) <CASE_CFN_CLZ, CASE_CFN_CTZ>: When stmt is not an internal-fn call or C?Z_DEFINED_VALUE_AT_ZERO is not 2, assume argument is not zero and thus use [0, prec-1] range unless it can be further improved. For CTZ, don't update maxi from upper bound if it was previously prec. * gimple-range.cc (gimple_ranger::range_of_builtin_call) <CASE_CFN_CLZ, CASE_CFN_CTZ>: Likewise. * gcc.dg/tree-ssa/pr94801.c: New test.
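Since the builtins are treated as undefined at zero, portable callers need to handle zero themselves. A minimal sketch of such wrappers follows; the helper names and the choice of returning the type width for zero are assumptions made for illustration, not part of the patch.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned int in bits (assuming no padding bits).  */
#define UINT_BITS ((int) (sizeof (unsigned int) * CHAR_BIT))

/* Leading-zero count defined for every input.  __builtin_clz (0) is
   undefined, so zero is handled before the builtin is reached.  */
int clz_or_width(unsigned int x)
{
    return x ? __builtin_clz(x) : UINT_BITS;
}

/* Trailing-zero count defined for every input, same approach.  */
int ctz_or_width(unsigned int x)
{
    return x ? __builtin_ctz(x) : UINT_BITS;
}

/* For callers that can promise a nonzero argument.  */
int clz_nonzero(unsigned int x)
{
    assert(x != 0);
    return __builtin_clz(x);
}
```

With the explicit zero check, the result for a nonzero argument stays within [0, prec-1], which is the range the message describes for the builtin calls.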
2020-10-09match.pd: Fix up FFS -> CTZ + 1 optimization [PR97325]Jakub Jelinek2-1/+17
And no testcase was included, I'm including one below. Anyway, this PR and the other CTZ related discussions led me to discover a bug I've made earlier, CLZ/CTZ builtins have unsigned arguments and e.g. both the vr-values.cc and now gimple-range.cc code heavily relies on that, but __builtin_ffs has a signed operand and this optimization was incorrectly making the operand signed too, so I guess it would greatly confuse VRP in some cases. 2020-10-09 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/97325 * match.pd (FFS(nonzero) -> CTZ(nonzero) + 1): Cast argument to corresponding unsigned type. * gcc.c-torture/execute/pr97325.c: New test.
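The identity behind this optimization can be written out in C. This is a sketch with made-up helper names and is not the pr97325.c testcase; the cast to unsigned mirrors the fix described above, since the ctz builtin takes an unsigned operand while ffs takes a signed one.

```c
#include <assert.h>

/* ffs (x) expressed through ctz for any int x.  */
int ffs_via_ctz(int x)
{
    if (x == 0)
        return 0;   /* ffs (0) is defined as 0; ctz (0) is undefined */
    return __builtin_ctz((unsigned int) x) + 1;
}

/* Checks that the rewrite agrees with __builtin_ffs on a few values,
   including negative ones.  */
void check_ffs_rewrite(void)
{
    static const int samples[] = { 1, 2, 8, 12, -1, -8, 1 << 20 };
    unsigned int i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(ffs_via_ctz(samples[i]) == __builtin_ffs(samples[i]));
}
```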
2020-10-09Move pr97315-1.c test to g++.dg/opt/.Aldy Hernandez1-1/+1
gcc/testsuite/ChangeLog: PR testsuite/97337 * gcc.dg/pr97315-1.c: Moved to... * g++.dg/opt/pr97315-1.C: ...here.
2020-10-09fix ICE with BB vectorization of PHIsRichard Biener2-1/+23
This fixes a vector CTOR insertion issue when we try to insert after a PHI node. 2020-10-09 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_create_constant_vectors): Properly insert after PHIs. * gcc.dg/vect/bb-slp-phis-1.c: New testcase.
2020-10-09Daily bump.GCC Administrator4-1/+342
2020-10-08c++: Fix member alias template in C++17 and up. [PR96805]Jason Merrill2-2/+16
Here we're trying to push into a<T>::c<N> in order to instantiate t<N>, but were building a TYPENAME_TYPE for it because a<T> isn't open yet. Don't do that when we know we're trying to enter the scope. gcc/cp/ChangeLog: PR c++/96805 PR c++/96199 * pt.c (tsubst_aggr_type): Don't build a TYPENAME_TYPE when entering_scope. (tsubst_template_decl): Use tsubst_aggr_type. gcc/testsuite/ChangeLog: PR c++/96805 * g++.dg/cpp0x/alias-decl-pr96805.C: New test.
2020-10-08take type from intrinsic in sincos passAlexandre Oliva3-90/+164
This is a first step towards enabling the sincos optimization in Ada. The issue this patch solves is that sincos takes the type to be looked up with mathfn_built_in from variables or temporaries passed as arguments to SIN and COS intrinsics. In Ada, different float types may be used but, despite their representation equivalence, their distinctness causes the optimization to be skipped, because they are not the types that mathfn_built_in expects. This patch introduces a function that maps intrinsics to the type they're associated with, and uses that type, obtained from the intrinsics used in calls to be optimized, to look up the corresponding CEXPI intrinsic. For the sake of defensive programming, when using the type obtained from the intrinsic, it now checks that, if different types are found for the used argument, or for other calls that use it, the types are interchangeable. for gcc/ChangeLog * builtins.c (mathfn_built_in_type): New. * builtins.h (mathfn_built_in_type): Declare. * tree-ssa-math-opts.c (execute_cse_sincos_1): Use it to obtain the type expected by the intrinsic.
2020-10-08[PATCH, rs6000] Rename BU_P10_MISC_2 define to BU_P10_POWERPC64_MISC_2Will Schmidt1-6/+6
Rename our BU_P10_MISC_2 built-in define macro to be BU_P10_POWERPC64_MISC_2. This more accurately reflects that the macro includes the RS6000_BTM_POWERPC64 entry, and matches the style we used for the P7 equivalent. gcc/ChangeLog: * config/rs6000/rs6000-builtin.def (BU_P10_MISC_2): Rename to BU_P10_POWERPC64_MISC_2. (CFUGED, CNTLZDM, CNTTZDM, PDEPD, PEXTD): Call renamed macro.
2020-10-08Disable TBAA in some uses of call_may_clobber_ref_pJan Hubicka5-9/+9
* tree-nrv.c (dest_safe_for_nrv_p): Disable tbaa in call_may_clobber_ref_p and ref_maybe_used_by_stmt_p. * tree-tailcall.c (find_tail_calls): Likewise. * tree-ssa-alias.c (call_may_clobber_ref_p): Add tbaa_p parameter. * tree-ssa-alias.h (call_may_clobber_ref_p): Update prototype. * tree-ssa-sccvn.c (vn_reference_lookup_3): Pass data->tbaa_p to call_may_clobber_ref_p_1.
2020-10-08debug: Make sure to output .file 0 when generating DWARF5.Mark Wielaard1-0/+21
When gas outputs DWARF5 .debug_line[_str] then we have to tell it the comp_dir and main file name for the zero entry line table. Otherwise gas has to guess at the CU compilation directory and file.

Before a gcc -gdwarf-5 ../src/hello.c line table looked like:

 Directory table:
  0 ../src (24)
  1 ../src (24)
  2 /usr/include (31)

 File name table:
  0 hello.c (16), 0
  1 hello.c (16), 1
  2 stdio.h (44), 2

With this patch it looks like:

 Directory table:
  0 /tmp/obj (0)
  1 ../src (24)
  2 /usr/include (31)

 File name table:
  0 ../src/hello.c (9), 0
  1 hello.c (16), 1
  2 stdio.h (44), 2

gcc/ChangeLog: * dwarf2out.c (dwarf2out_finish): Emit .file 0 entry when generating DWARF5 .debug_line table through gas.
2020-10-08Improve documentation of -fallow-store-data-racesqing zhao1-1/+12
2020-10-08 John Henning <john.henning@oracle.com> gcc/ PR other/97309 * doc/invoke.texi: Improve documentation of -fallow-store-data-races.
2020-10-08arm: [MVE] Add missing __arm_vcvtnq_u32_f32 intrinsic (PR 96914)Christophe Lyon2-0/+21
__arm_vcvtnq_u32_f32 was missing from arm_mve.h, although the s32_f32 and [su]16_f16 versions were present. This patch adds the missing version and testcase, which are cut-and-paste from the other versions. 2020-10-08 Christophe Lyon <christophe.lyon@linaro.org> gcc/ PR target/96914 * config/arm/arm_mve.h (__arm_vcvtnq_u32_f32): New. gcc/testsuite/ PR target/96914 * gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c: New test.
2020-10-08SLP vectorize multiple BBs at onceRichard Biener6-179/+203
This work from Martin Liska was motivated by gcc.dg/vect/bb-slp-22.c which shows how poorly we currently BB vectorize code like a0 = in[0] + 23; a1 = in[1] + 142; a2 = in[2] + 2; a3 = in[3] + 31; if (x > y) { b[0] = a0; b[1] = a1; b[2] = a2; b[3] = a3; } else { out[0] = a0 * (x + 1); out[1] = a1 * (y + 1); out[2] = a2 * (x + 1); out[3] = a3 * (y + 1); } namely by vectorizing the stores but not the common load (and add) they are fed by. Thus with the following patch we change the BB vectorizer from operating on a single basic-block at a time to consider somewhat larger regions (but not the whole function yet because of issues with vector size iteration). I took the opportunity to remove the fancy region iterations again now that we operate on BB granularity and in the end need to visit PHI nodes as well. 2020-10-08 Martin Liska <mliska@suse.cz> Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_bb_vec_info::const_iterator): Remove. (_bb_vec_info::const_reverse_iterator): Likewise. (_bb_vec_info::region_stmts): Likewise. (_bb_vec_info::reverse_region_stmts): Likewise. (_bb_vec_info::_bb_vec_info): Adjust. (_bb_vec_info::bb): Remove. (_bb_vec_info::region_begin): Remove. (_bb_vec_info::region_end): Remove. (_bb_vec_info::bbs): New vector of BBs. (vect_slp_function): Declare. * tree-vect-patterns.c (vect_determine_precisions): Use regular stmt iteration. (vect_pattern_recog): Likewise. * tree-vect-slp.c: Include cfganal.h, tree-eh.h and tree-cfg.h. (vect_build_slp_tree_1): Properly refuse to vectorize volatile and throwing stmts. (vect_build_slp_tree_2): Pass group-size down to get_vectype_for_scalar_type. (_bb_vec_info::_bb_vec_info): Use regular stmt iteration, adjust for changed region specification. (_bb_vec_info::~_bb_vec_info): Likewise. (vect_slp_check_for_constructors): Likewise. (vect_slp_region): Likewise. (vect_slp_bbs): New worker operating on a vector of BBs. (vect_slp_bb): Wrap it.
(vect_slp_function): New function splitting the function into multi-BB regions. (vect_create_constant_vectors): Handle the case of inserting after a throwing def. (vect_schedule_slp_instance): Adjust. * tree-vectorizer.c (vec_info::remove_stmt): Simplify again. (vec_info::insert_seq_on_entry): Adjust. (pass_slp_vectorize::execute): Also init PHIs. Call vect_slp_function. * gcc.dg/vect/bb-slp-22.c: Adjust. * gfortran.dg/pr68627.f: Likewise.
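The code pattern quoted in this message can be wrapped into a complete function as below. This is a sketch modeled on the snippet in the message, not a copy of gcc.dg/vect/bb-slp-22.c; the function name and signature are assumptions. The four loads and adds sit in one basic block while the stores they feed sit in two others, which is why a region spanning several blocks is needed to vectorize them together.

```c
#include <assert.h>

/* Four adjacent loads and adds feed two different groups of stores,
   one group in each arm of the branch.  */
void shared_loads_two_arms(const int *in, int *b, int *out, int x, int y)
{
    assert(in && b && out);

    int a0 = in[0] + 23;
    int a1 = in[1] + 142;
    int a2 = in[2] + 2;
    int a3 = in[3] + 31;

    if (x > y) {
        b[0] = a0;
        b[1] = a1;
        b[2] = a2;
        b[3] = a3;
    } else {
        out[0] = a0 * (x + 1);
        out[1] = a1 * (y + 1);
        out[2] = a2 * (x + 1);
        out[3] = a3 * (y + 1);
    }
}
```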
2020-10-08tree-optimization/97330 - fix bad load sinkingRichard Biener3-1/+36
This fixes bad placement of sunk loads. 2020-10-08 Richard Biener <rguenther@suse.de> PR tree-optimization/97330 * tree-ssa-sink.c (statement_sink_location): Avoid skipping PHIs when they dominate the insert location. * gcc.dg/torture/pr97330-1.c: New testcase. * gcc.dg/torture/pr97330-2.c: Likewise.
2020-10-08Fix handling of parm_offset in ipa-modref on 32bit targets.Jan Hubicka2-14/+25
* ipa-modref.c (get_access): Fix handling of offsets. * tree-ssa-alias.c (modref_may_conflict): Watch for overflows.
2020-10-08IPA MOD REF: add debug counter.Martin Liska2-0/+5
gcc/ChangeLog: * dbgcnt.def (DEBUG_COUNTER): Add ipa_mod_ref debug counter. * tree-ssa-alias.c (modref_may_conflict): Handle the counter.
2020-10-08adjust BB vectorization dump scanningRichard Biener72-83/+75
This adjusts BB vectorization testcases to look for the number of SLP subgraphs vectorized rather than for the number of basic blocks we've found opportunities in because followup patches will play with the granularity we work on, vectorizing multiple basic blocks at a time. Together with this, because I noticed when looking at non-obvious mismatches, I avoid analyzing group-size 1 SLP instances which result in pointless V1mode vectorizations. It might be interesting to work on adding sth like dg-warning to look for -fopt-info-{optimized,missing} so we could directly annotate (not) vectorized loops instead of relying on fragile counts. 2020-10-08 Richard Biener <rguenther@suse.de> * tree-vectorizer.c (try_vectorize_loop_1): Do not dump "basic block vectorized". (pass_slp_vectorize::execute): Likewise. * tree-vect-slp.c (vect_analyze_slp_instance): Avoid re-analyzing split single stmts. * g++.dg/vect/slp-pr50819.cc: Adjust. * gcc.dg/vect/bb-slp-1.c: Adjust. * gcc.dg/vect/bb-slp-10.c: Adjust. * gcc.dg/vect/bb-slp-11.c: Adjust. * gcc.dg/vect/bb-slp-13.c: Adjust. * gcc.dg/vect/bb-slp-14.c: Adjust. * gcc.dg/vect/bb-slp-15.c: Adjust. * gcc.dg/vect/bb-slp-16.c: Adjust. * gcc.dg/vect/bb-slp-17.c: Adjust. * gcc.dg/vect/bb-slp-18.c: Adjust. * gcc.dg/vect/bb-slp-19.c: Adjust. * gcc.dg/vect/bb-slp-2.c: Adjust. * gcc.dg/vect/bb-slp-20.c: Adjust. * gcc.dg/vect/bb-slp-21.c: Adjust. * gcc.dg/vect/bb-slp-22.c: Adjust. * gcc.dg/vect/bb-slp-23.c: Adjust. * gcc.dg/vect/bb-slp-24.c: Adjust. * gcc.dg/vect/bb-slp-25.c: Adjust. * gcc.dg/vect/bb-slp-26.c: Adjust. * gcc.dg/vect/bb-slp-27.c: Adjust. * gcc.dg/vect/bb-slp-28.c: Adjust. * gcc.dg/vect/bb-slp-29.c: Adjust. * gcc.dg/vect/bb-slp-3.c: Adjust. * gcc.dg/vect/bb-slp-30.c: Adjust. * gcc.dg/vect/bb-slp-31.c: Adjust. * gcc.dg/vect/bb-slp-34.c: Adjust. * gcc.dg/vect/bb-slp-35.c: Adjust. * gcc.dg/vect/bb-slp-36.c: Adjust. * gcc.dg/vect/bb-slp-38.c: Adjust. * gcc.dg/vect/bb-slp-4.c: Adjust. * gcc.dg/vect/bb-slp-45.c: Adjust. 
* gcc.dg/vect/bb-slp-46.c: Adjust. * gcc.dg/vect/bb-slp-48.c: Adjust. * gcc.dg/vect/bb-slp-5.c: Adjust. * gcc.dg/vect/bb-slp-6.c: Adjust. * gcc.dg/vect/bb-slp-7.c: Adjust. * gcc.dg/vect/bb-slp-8.c: Adjust. * gcc.dg/vect/bb-slp-8a.c: Adjust. * gcc.dg/vect/bb-slp-8b.c: Adjust. * gcc.dg/vect/bb-slp-9.c: Adjust. * gcc.dg/vect/bb-slp-div-2.c: Adjust. * gcc.dg/vect/bb-slp-over-widen-1.c: Adjust. * gcc.dg/vect/bb-slp-over-widen-2.c: Adjust. * gcc.dg/vect/bb-slp-pattern-2.c: Adjust. * gcc.dg/vect/bb-slp-pow-1.c: Adjust. * gcc.dg/vect/bb-slp-pr58135.c: Adjust. * gcc.dg/vect/bb-slp-pr65935.c: Adjust. * gcc.dg/vect/bb-slp-pr78205.c: Adjust. * gcc.dg/vect/bb-slp-pr81635-1.c: Adjust. * gcc.dg/vect/bb-slp-pr81635-3.c: Adjust. * gcc.dg/vect/bb-slp-pr95839-2.c: Adjust. * gcc.dg/vect/bb-slp-pr95839.c: Adjust. * gcc.dg/vect/bb-slp-pr95866.c: Adjust. * gcc.dg/vect/bb-slp-subgroups-1.c: Adjust. * gcc.dg/vect/bb-slp-subgroups-2.c: Adjust. * gcc.dg/vect/bb-slp-subgroups-3.c: Adjust. * gcc.dg/vect/fast-math-bb-slp-call-1.c: Adjust. * gcc.dg/vect/no-tree-reassoc-bb-slp-12.c: Adjust. * gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Adjust. * gfortran.dg/vect/pr62283-2.f: Adjust. * gcc.target/i386/pr68961.c: Adjust. * gcc.target/i386/pr84101.c: Adjust. * gcc.dg/vect/bb-slp-pr81635-2.c: Adjust. * gcc.dg/vect/bb-slp-pr81635-4.c: Adjust. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Adjust. * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Adjust. * gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c: Adjust. * gcc.dg/vect/bb-slp-div-1.c: Adjust. * gcc.dg/vect/bb-slp-pr90006.c: Adjust. * g++.dg/vect/slp-pr50413.cc: Adjust.
2020-10-08arm: [MVE] Remove illegal intrinsics (PR target/96914)Christophe Lyon16-443/+19
A few MVE intrinsics had an unsigned variant implemented although they are not supported by the hardware. This patch removes them: __arm_vqrdmlashq_n_u8 __arm_vqrdmlahq_n_u8 __arm_vqdmlahq_n_u8 __arm_vqrdmlashq_n_u16 __arm_vqrdmlahq_n_u16 __arm_vqdmlahq_n_u16 __arm_vqrdmlashq_n_u32 __arm_vqrdmlahq_n_u32 __arm_vqdmlahq_n_u32 __arm_vmlaldavaxq_p_u32 __arm_vmlaldavaxq_p_u16 2020-10-08 Christophe Lyon <christophe.lyon@linaro.org> gcc/ PR target/96914 * config/arm/arm_mve.h (vqrdmlashq_n_u8, vqrdmlashq_n_u16) (vqrdmlashq_n_u32, vqrdmlahq_n_u8, vqrdmlahq_n_u16) (vqrdmlahq_n_u32, vqdmlahq_n_u8, vqdmlahq_n_u16, vqdmlahq_n_u32) (vmlaldavaxq_p_u16, vmlaldavaxq_p_u32): Remove. * config/arm/arm_mve_builtins.def (vqrdmlashq_n_u, vqrdmlahq_n_u) (vqdmlahq_n_u, vmlaldavaxq_p_u): Remove. * config/arm/unspecs.md (VQDMLAHQ_N_U, VQRDMLAHQ_N_U) (VQRDMLASHQ_N_U) (VMLALDAVAXQ_P_U): Remove unspecs. * config/arm/iterators.md (VQDMLAHQ_N_U, VQRDMLAHQ_N_U) (VQRDMLASHQ_N_U, VMLALDAVAXQ_P_U): Remove attributes. (VQDMLAHQ_N, VQRDMLAHQ_N, VQRDMLASHQ_N, VMLALDAVAXQ_P): Remove unsigned variants from iterators. * config/arm/mve.md (mve_vqdmlahq_n_<supf><mode>) (mve_vqrdmlahq_n_<supf><mode>) (mve_vqrdmlashq_n_<supf><mode>, mve_vmlaldavaxq_p_<supf><mode>): Update comment. gcc/testsuite/ PR target/96914 * gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_u16.c: Remove. * gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_u32.c: Remove. * gcc.target/arm/mve/intrinsics/vqdmlahq_n_u16.c: Remove. * gcc.target/arm/mve/intrinsics/vqdmlahq_n_u32.c: Remove. * gcc.target/arm/mve/intrinsics/vqdmlahq_n_u8.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlahq_n_u16.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlahq_n_u32.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlahq_n_u8.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_u16.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_u32.c: Remove. * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_u8.c: Remove.
2020-10-08arm: [MVE] Add vqdmlashq intrinsics (PR target/96914)Christophe Lyon11-0/+288
This patch adds: vqdmlashq_m_n_s16 vqdmlashq_m_n_s32 vqdmlashq_m_n_s8 vqdmlashq_n_s16 vqdmlashq_n_s32 vqdmlashq_n_s8 2020-10-08 Christophe Lyon <christophe.lyon@linaro.org> gcc/ PR target/96914 * config/arm/arm_mve.h (vqdmlashq, vqdmlashq_m): Define. * config/arm/arm_mve_builtins.def (vqdmlashq_n_s) (vqdmlashq_m_n_s,): New. * config/arm/unspecs.md (VQDMLASHQ_N_S, VQDMLASHQ_M_N_S): New unspecs. * config/arm/iterators.md (VQDMLASHQ_N_S, VQDMLASHQ_M_N_S): New attributes. (VQDMLASHQ_N): New iterator. * config/arm/mve.md (mve_vqdmlashq_n_, mve_vqdmlashq_m_n_s): New patterns. gcc/testsuite/ PR target/96914 * gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: New test. * gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: New test. * gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s8.c: New test. * gcc.target/arm/mve/intrinsics/vqdmlashq_n_s16.c: New test. * gcc.target/arm/mve/intrinsics/vqdmlashq_n_s32.c: New test. * gcc.target/arm/mve/intrinsics/vqdmlashq_n_s8.c: New test.
2020-10-08arm: Fix ICE on glibc compilation after my DIVMOD optimization [PR97322]Jakub Jelinek2-3/+18
The arm target hook for divmod wasn't prepared to handle constants passed to the function. 2020-10-08 Jakub Jelinek <jakub@redhat.com> PR target/97322 * config/arm/arm.c (arm_expand_divmod_libfunc): Pass mode instead of GET_MODE (op0) or GET_MODE (op1) to emit_library_call_value. * gcc.dg/pr97322.c: New test.
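As a rough illustration of how a constant can end up as an operand of a combined division and modulo, consider source like the following. This is a sketch only and not the gcc.dg/pr97322.c testcase; the struct and function names are made up, and whether a given target expands it through a divmod library call depends on the target and options. In RTL an integer constant carries no machine mode of its own, which is consistent with the fix passing the mode explicitly instead of querying it from the operands.

```c
#include <assert.h>

struct quot_rem {
    long long quot;
    long long rem;
};

/* Quotient and remainder of the same operand pair, where the dividend
   is a compile-time constant.  */
struct quot_rem divide_constant_by(long long divisor)
{
    struct quot_rem r;

    assert(divisor != 0);
    r.quot = 1000LL / divisor;
    r.rem = 1000LL % divisor;
    return r;
}
```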
2020-10-08Fix PR97325.Aldy Hernandez1-0/+2
gcc/ChangeLog: PR tree-optimization/97325 * gimple-range.cc (gimple_ranger::range_of_builtin_call): Handle negative numbers in __builtin_ffs and __builtin_popcount.
2020-10-08Fix PR97315 (part 2 of 2)Aldy Hernandez2-0/+20
gcc/ChangeLog: PR tree-optimization/97315 * range-op.cc (value_range_with_overflow): Change any non-overflow calculation in which both bounds are overflow/underflow to be undefined. gcc/testsuite/ChangeLog: * gcc.dg/pr97315-2.c: New test.
2020-10-08Fix PR97315 (part 1 of 2)Aldy Hernandez2-21/+54
gcc/ChangeLog: PR tree-optimization/97315 * gimple-ssa-evrp.c (hybrid_folder::choose_value): Removes the trap and instead annotates the listing. gcc/testsuite/ChangeLog: * gcc.dg/pr97315-1.c: New test.
2020-10-08openmp: Set cfun->calls_alloca when needed in OpenMP outlined regions [PR97294]Jakub Jelinek3-0/+44
The following testcase FAILs, because we don't mark the child OpenMP function as cfun->calls_alloca when it does call alloca. When optimizing, during DCE we reset those flags and recompute them again, but with -O0 DCE is not performed. Fixed by calling notice_special_calls when moving insns to the child function. cfun->calls_alloca is normally set during gimplification and most of the alloca calls omp-low.c does go through the gimplifier, but one spot didn't and built the gcall directly, so that one needs to set calls_alloca too. 2020-10-08 Jakub Jelinek <jakub@redhat.com> PR sanitizer/97294 * tree-cfg.c (move_block_to_fn): Call notice_special_calls on call stmts being moved into dest_cfun. * omp-low.c (lower_rec_input_clauses): Set cfun->calls_alloca when adding __builtin_alloca_with_align call without gimplification. * gcc.dg/asan/pr97294.c: New test.
2020-10-08c++: ICE in dependent_type_p with constrained auto [PR97052]Patrick Palka3-0/+17
This patch fixes an "unguarded" call to coerce_template_parms in build_standard_check: processing_template_decl could be zero if we get here during processing of the first 'auto' parameter of an abbreviated function template, or if we're processing the type constraint of a non-templated variable. In the testcase below, this leads to an ICE when coerce_template_parms instantiates C's dependent default template argument. gcc/cp/ChangeLog: PR c++/97052 * constraint.cc (build_type_constraint): Temporarily increment processing_template_decl before calling build_concept_check. * pt.c (make_constrained_placeholder_type): Likewise. gcc/testsuite/ChangeLog: PR c++/97052 * g++.dg/cpp2a/concepts-defarg2.C: New test.
2020-10-08c++: Set the constraints of a class type sooner [PR96229]Patrick Palka3-7/+19
In the testcase below, during processing (at parse time) of Y's base class X<Y>, convert_template_argument calls is_compatible_template_arg to check if the template argument Y is no more constrained than the parameter P. But at this point we haven't yet set Y's constraints, so get_normalized_constraints_from_decl yields NULL_TREE as the normal form and caches this result into the normalized_map. We set Y's constraints later in cp_parser_class_specifier_1 but the stale normal form in the normalized_map remains. This ultimately causes us to miss the constraint failure for Y<Z> because according to the cached normal form, Y is not constrained. This patch fixes this issue by moving up the call to associate_classtype_constraints so that we set constraints before we start processing a class's bases. gcc/cp/ChangeLog: PR c++/96229 * parser.c (cp_parser_class_specifier_1): Move call to associate_classtype_constraints from here to ... (cp_parser_class_head): ... here. * pt.c (is_compatible_template_arg): Correct documentation to say "argument is _no_ more constrained than the parameter". gcc/testsuite/ChangeLog: PR c++/96229 * g++.dg/cpp2a/concepts-class2.C: New test.
2020-10-08Daily bump.GCC Administrator6-1/+218
2020-10-07c++: Fix P0846 (ADL and function templates) in template [PR97010]Marek Polacek3-9/+77
To quickly recap, P0846 says that a name is also considered to refer to a template if it is an unqualified-id followed by a < and name lookup finds either one or more functions or finds nothing. In a template, when parsing a function call that has type-dependent arguments, we can't perform ADL right away so we set KOENIG_LOOKUP_P in the call to remember to do it when instantiating the call (tsubst_copy_and_build/CALL_EXPR). When the called function is a function template, we represent the call with a TEMPLATE_ID_EXPR; usually the operand is an OVERLOAD. In the P0846 case though, the operand can be an IDENTIFIER_NODE, when name lookup found nothing when parsing the template name. But we weren't handling this correctly in tsubst_copy_and_build. First we need to pass the FUNCTION_P argument from <case TEMPLATE_ID_EXPR> to <case IDENTIFIER_NODE>, otherwise we give a bogus error. And then in <case CALL_EXPR> we need to perform ADL. The rest of the changes is to give better errors when ADL didn't find anything. gcc/cp/ChangeLog: PR c++/97010 * pt.c (tsubst_copy_and_build) <case TEMPLATE_ID_EXPR>: Call tsubst_copy_and_build explicitly instead of using the RECUR macro. Handle a TEMPLATE_ID_EXPR with an IDENTIFIER_NODE as its operand. <case CALL_EXPR>: Perform ADL for a TEMPLATE_ID_EXPR with an IDENTIFIER_NODE as its operand. gcc/testsuite/ChangeLog: PR c++/97010 * g++.dg/cpp2a/fn-template21.C: New test. * g++.dg/cpp2a/fn-template22.C: New test.
2020-10-07  libgo/configure: remove -fno-section-anchors for AIX  Clément Chigot  (1 file, -1/+1)
This option is no longer needed. There is no crash without it since at least gcc-9.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/260157
2020-10-07  libgo: handle go1.10+ correctly in match.sh  Nikhil Benesch  (1 file, -1/+1)
match.sh was not correctly handling build constraints for Go versions that have a two-digit suffix, like "go1.10". The same issue will arise with Go 1.100, but that is a long ways off.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/260077
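The kind of check involved can be sketched as follows. This is not the shell logic from match.sh, which the commit message does not show; `go_tag_satisfied` is a hypothetical helper, and the rule it applies (a `go1.N` tag is satisfied by release 1.M when N <= M) is the usual meaning of Go release tags. The point of the sketch is that the whole numeric suffix is parsed rather than a single digit, so "go1.10" is not mistaken for "go1.1".

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `tag` has the form "go1.N" and a
// toolchain at release 1.`current_minor` satisfies it (N <= current_minor).
// Very long digit strings are not guarded against overflow in this sketch.
bool go_tag_satisfied(const std::string& tag, int current_minor) {
    const std::string prefix = "go1.";
    if (tag.size() <= prefix.size() ||
        tag.compare(0, prefix.size(), prefix) != 0)
        return false;

    int minor = 0;
    for (std::size_t i = prefix.size(); i < tag.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(tag[i])))
            return false;  // reject anything that is not purely numeric
        minor = minor * 10 + (tag[i] - '0');  // accumulate every digit
    }
    return minor <= current_minor;
}
```

For example, `go_tag_satisfied("go1.10", 9)` is false, whereas a matcher that looked only at the first digit of the suffix would treat the tag as "go1.1" and accept it.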
2020-10-07  Rename -fevrp-mode= to --param=evrp-mode=.  Aldy Hernandez  (4 files, -36/+36)
    * common.opt (-fevrp-mode): Rename and move...
    * params.opt (--param=evrp-mode): ...here.
    * gimple-range.h (DEBUG_RANGE_CACHE): Use param_evrp_mode instead of flag_evrp_mode.
    * gimple-ssa-evrp.c (rvrp_folder): Same.
    (hybrid_folder): Same.
    (execute_early_vrp): Same.
2020-10-07  tree-optimization/97307 - improve sinking of loads  Richard Biener  (3 files, -22/+43)
This improves the heuristic for finding a sink location for loads that does not cross any store.

2020-10-07 Richard Biener <rguenther@suse.de>

    PR tree-optimization/97307
    * tree-ssa-sink.c (statement_sink_location): Change heuristic for not skipping stores to look for virtual definitions rather than uses.
    * gcc.dg/tree-ssa/ssa-sink-17.c: New testcase.
    * gcc.dg/vect/pr65947-3.c: XFAIL.
2020-10-07  c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273]  Patrick Palka  (2 files, -0/+15)
cp_tree_equal currently considers alignof the same as __alignof__, but these operators are semantically different ever since r8-7957. In the testcase below, this causes the second static_assert to fail on targets where alignof(double) != __alignof__(double) because the specialization table (which uses cp_tree_equal as its equality predicate) conflates the two dependent specializations integral_constant<__alignof__(T)> and integral_constant<alignof(T)>.

This patch makes cp_tree_equal distinguish between these two operators by inspecting the ALIGNOF_EXPR_STD_P flag.

gcc/cp/ChangeLog:

    PR c++/88115
    PR libstdc++/97273
    * tree.c (cp_tree_equal) <case ALIGNOF_EXPR>: Return false if ALIGNOF_EXPR_STD_P differ.

gcc/testsuite/ChangeLog:

    PR c++/88115
    PR libstdc++/97273
    * g++.dg/template/alignof3.C: New test.
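The effect of the fix can be illustrated with a small model. This is a sketch only: `AlignofExpr`, `equal_before` and `equal_after` are hypothetical stand-ins rather than GCC's tree representation, with the `std_p` field playing the role of the ALIGNOF_EXPR_STD_P flag named above. The last function uses the GNU `__alignof__` extension, so it assumes a GNU-compatible compiler.

```cpp
#include <string>

// Hypothetical stand-in for an ALIGNOF_EXPR node.
struct AlignofExpr {
    std::string operand;  // the operand, e.g. the dependent type "T"
    bool std_p;           // true for alignof, false for __alignof__
};

// Behaviour described as the bug: only the operand is compared, so
// alignof(T) and __alignof__(T) are treated as the same expression.
bool equal_before(const AlignofExpr& a, const AlignofExpr& b) {
    return a.operand == b.operand;
}

// Behaviour described as the fix: a differing flag means "not equal".
bool equal_after(const AlignofExpr& a, const AlignofExpr& b) {
    if (a.std_p != b.std_p)
        return false;
    return a.operand == b.operand;
}

// Whether the two operators actually disagree for double on this target
// (they may well agree, in which case the conflation is harmless here).
bool alignof_variants_differ_for_double() {
    return alignof(double) != __alignof__(double);
}
```

With an equality predicate like `equal_before`, a table keyed on such expressions would hand back the entry for `__alignof__(T)` when asked about `alignof(T)`, which is the conflation the commit message describes.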
2020-10-07  Off by one final fix.  Andrew MacLeod  (1 file, -7/+6)
Allocate the memory in an approved portable way.

gcc/ChangeLog:

2020-10-06 Andrew MacLeod <amacleod@redhat.com>

    * value-range.h (irange_allocator::allocate): Allocate in two hunks instead of using the variably-sized trailing array approach.
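The general idea of "two hunks" versus a trailing array can be sketched like this. Everything here is an assumption made for illustration: `Bound`, `Range` and `allocate_range` are hypothetical names, and the real irange_allocator in value-range.h is not reproduced. Instead of over-allocating one block and indexing past a one-element trailing member, the object and its storage are allocated separately and linked by a pointer.

```cpp
#include <memory>
#include <utility>

// Hypothetical element type: one lower/upper bound pair.
struct Bound {
    long lo;
    long hi;
};

// Hypothetical range object. The storage for the pairs lives in its own
// allocation instead of in a variably-sized array at the end of the struct.
struct Range {
    std::unique_ptr<Bound[]> bounds;
    unsigned max_pairs;
};

// Allocate the pair storage (hunk 1) and the object itself (hunk 2).
std::unique_ptr<Range> allocate_range(unsigned pairs) {
    if (pairs == 0)
        pairs = 1;  // always leave room for at least one pair
    std::unique_ptr<Bound[]> bounds(new Bound[pairs]());  // zero-initialized
    return std::unique_ptr<Range>(new Range{std::move(bounds), pairs});
}
```

A caller would write `auto r = allocate_range(3);` and then index `r->bounds[0]` through `r->bounds[2]`. The extra pointer costs one indirection, but no index ever goes past the declared size of a member.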
2020-10-07  This patch fixes PR47469 - a trivial bit of tidying up.  Paul Thomas  (1 file, -6/+2)
2020-07-10 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
    PR fortran/47469
    * trans-expr.c (arrayfunc_assign_needs_temporary): Tidy detection of pointer and allocatable functions.
2020-10-07  analyzer: handle C++ argument numbers and "this" [PR97116]  David Malcolm  (2 files, -14/+86)
gcc/analyzer/ChangeLog:

    PR analyzer/97116
    * sm-malloc.cc (method_p): New.
    (describe_argument_index): New.
    (inform_nonnull_attribute): Use describe_argument_index.
    (possible_null_arg::describe_final_event): Likewise.
    (null_arg::describe_final_event): Likewise.

gcc/testsuite/ChangeLog:

    PR analyzer/97116
    * g++.dg/analyzer/pr97116.C: New test.
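The commit subject suggests that diagnostics need to name arguments differently for C++ member functions, where the implicit object parameter occupies the first slot. A minimal sketch of one plausible numbering scheme follows. The signature, the returned strings and the scheme itself are assumptions for illustration; only the function name is borrowed from the ChangeLog, and the sm-malloc.cc implementation is not shown in the commit message.

```cpp
#include <string>

// Hypothetical model: `arg_idx` is the zero-based position in the full
// argument list, which for a method includes the implicit object parameter.
std::string describe_argument_index(bool is_method, int arg_idx) {
    if (is_method) {
        if (arg_idx == 0)
            return "'this'";             // the implicit object parameter
        return std::to_string(arg_idx);  // slot N is the Nth written argument
    }
    return std::to_string(arg_idx + 1);  // plain function: make it one-based
}
```

Under this scheme the first argument the user wrote is reported as "1" in both cases, and a problem with the object itself is reported as 'this' rather than as a numbered argument.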
2020-10-07  Add -fdiagnostics-path-format=separate-events to -fdiagnostics-plain-output  David Malcolm  (9 files, -151/+53)
The path-printing default of -fdiagnostics-path-format=inline-events interacted poorly with -fdiagnostics-plain-output, so it makes most sense to add -fdiagnostics-path-format=separate-events to -fdiagnostics-plain-output.

Seen when adding an experimental analyzer plugin to gcc.dg/plugin.exp.

gcc/ChangeLog:

    * doc/invoke.texi (-fdiagnostics-plain-output): Add -fdiagnostics-path-format=separate-events to list of options injected by -fdiagnostics-plain-output.
    * opts-common.c (decode_cmdline_options_to_array): Likewise.

gcc/testsuite/ChangeLog:

    * g++.dg/analyzer/analyzer.exp (DEFAULT_CXXFLAGS): Remove -fdiagnostics-path-format=separate-events.
    * gcc.dg/analyzer/analyzer.exp (DEFAULT_CFLAGS): Likewise.
    * gcc.dg/plugin/diagnostic-path-format-default.c: Rename to...
    * gcc.dg/plugin/diagnostic-path-format-plain.c: ...this. Remove dg-options directive. Copy remainder of test from diagnostic-path-format-separate-events.c.
    * gcc.dg/plugin/diagnostic-test-paths-2.c: Add -fdiagnostics-path-format=inline-events to options. Fix expected output for location of conditional within "for" loop.
    * gcc.dg/plugin/plugin.exp (plugin_test_list): Update for renaming.
    * gfortran.dg/analyzer/analyzer.exp (DEFAULT_FFLAGS): Remove -fdiagnostics-path-format=separate-events.
2020-10-07  c++: block-scope externs get an alias [PR95677,PR31775,PR95677]  Nathan Sidwell  (21 files, -178/+206)
This patch improves block-scope extern handling by always injecting a hidden copy into the enclosing namespace (or using a match already there). This hidden copy will be revealed if the user explicitly declares it later. We can get from the DECL_LOCAL_DECL_P local extern to the alias via DECL_LOCAL_DECL_ALIAS. This fixes several bugs and removes the kludgy per-function extern_decl_map. We only do this pushing for non-dependent local externs -- dependent ones will be pushed during instantiation.

User code that expected to be able to handle incompatible local externs in different block-scopes will no longer work. That code is ill-formed (and always was, despite what 31775 claimed). I had to adjust a number of testcases that fell into this.

I tried using DECL_VALUE_EXPR, but that didn't work out. Due to constexpr requirements we have to do the replacement very late (it happens in the gimplifier). Consider:

    extern int l[]; // #1
    constexpr bool foo ()
    {
      extern int l[3]; // this does not complete the type of decl #1
      constexpr int *p = &l[2]; // ok
      return !p;
    }

This requirement, coupled with our use of the common folding machinery, makes pr97306 hard to fix, as we end up with an expression containing the two different decls for 'l', and only the c++ FE knows how to reconcile those. I punted on this.

gcc/cp/

    * cp-tree.h (struct language_function): Delete extern_decl_map.
    (DECL_LOCAL_DECL_ALIAS): New.
    * name-lookup.h (is_local_extern): Delete.
    * name-lookup.c (set_local_extern_decl_linkage): Replace with ...
    (push_local_extern_decl): ... this new function.
    (do_pushdecl): Call new function after pushing new decl. Unhide hidden non-functions.
    (is_local_extern): Delete.
    * decl.c (layout_var_decl): Do not allow VLA local externs.
    * decl2.c (mark_used): Also mark DECL_LOCAL_DECL_ALIAS. Drop old local-extern treatment.
    * parser.c (cp_parser_oacc_declare): Deal with local extern aliases.
    * pt.c (tsubst_expr): Adjust local extern instantiation.
    * cp-gimplify.c (cp_genericize_r): Remap DECL_LOCAL_DECLs.

gcc/testsuite/

    * g++.dg/cpp0x/lambda/lambda-sfinae1.C: Avoid ill-formed local extern
    * g++.dg/init/pr42844.C: Add expected error.
    * g++.dg/lookup/extern-redecl1.C: Likewise.
    * g++.dg/lookup/koenig15.C: Avoid ill-formed.
    * g++.dg/lto/pr95677.C: New.
    * g++.dg/other/nested-extern-1.C: Correct expected behaviour.
    * g++.dg/other/nested-extern-2.C: Likewise.
    * g++.dg/other/nested-extern.cc: Split ...
    * g++.dg/other/nested-extern-1.cc: ... here ...
    * g++.dg/other/nested-extern-2.cc: ... here.
    * g++.dg/template/scope5.C: Avoid ill-formed
    * g++.old-deja/g++.law/missed-error2.C: Allow extension.
    * g++.old-deja/g++.pt/crash3.C: Add expected error.
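For readers unfamiliar with the feature, a block-scope extern in well-formed user code looks like the sketch below. The names `counter` and `read_counter` are made up for illustration and do not come from the patch or its testcases. The local declaration introduces no new object; it refers to the namespace-scope entity that is defined later, which is the entity the hidden alias described above stands for.

```cpp
// A function that refers to a variable through a block-scope extern
// declaration, before any namespace-scope declaration is visible.
int read_counter() {
    extern int counter;  // declares, but does not define, ::counter
    return counter;
}

// The namespace-scope definition that the local extern refers to.
// Declaring it with an incompatible type here would be ill-formed.
int counter = 7;
```

Calling `read_counter()` reads the namespace-scope `counter`, so it returns the value that definition was initialized with.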
2020-10-07  ipa-prop: Fix multiple-target speculation resolution  Martin Jambor  (2 files, -4/+70)
As the FIXME which this patch removes states, the current code does not work when a call with multiple speculative targets gets resolved through parameter tracking during inlining - it feeds the inliner an edge it has already dealt with. The patch makes the code which should prevent this aware of the possibility that a speculation can now have more than one target.

gcc/ChangeLog:

2020-09-30 Martin Jambor <mjambor@suse.cz>

    PR ipa/96394
    * ipa-prop.c (update_indirect_edges_after_inlining): Do not add resolved speculation edges to vector of new direct edges even in presence of multiple speculative direct edges for a single call.

gcc/testsuite/ChangeLog:

2020-09-30 Martin Jambor <mjambor@suse.cz>

    PR ipa/96394
    * gcc.dg/tree-prof/pr96394.c: New test.