Age | Commit message | Author | Files | Lines
2020-08-05 | libstdc++: Replace operator>>(istream&, char*) [LWG 2499] | Jonathan Wakely | 18 | -41/+386
P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads that have a high risk of buffer overflows. They were replaced by equivalents that only accept a reference to an array, and so can guarantee not to write past the end of the array. In order to support both the old and new functionality, this patch introduces a new overloaded __istream_extract function which takes a maximum length. The new operator>> overloads use the array size as the maximum length. The old overloads now use __builtin_object_size to determine the available buffer size where possible (which requires -O2) or use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This is a change in behaviour, as the old overloads previously always used numeric_limits<streamsize>::max(), without considering sizeof(char_type) and without attempting to prevent overflows. Because they now do little more than call __istream_extract, the old operator>> overloads are very small inline functions. This means there is no advantage to explicitly instantiating them in the library (in fact that would prevent the __builtin_object_size checks from ever working). As a result, the explicit instantiation declarations can be removed from the header. The explicit instantiation definitions are still needed, for backwards compatibility with existing code that expects to link to the definitions in the library. While working on this change I noticed that src/c++11/istream-inst.cc has the following explicit instantiation definition: template istream& operator>>(istream&, char*); This had no effect (and so should not have been present in that file), because there was an explicit specialization declared in <istream> and defined in src/c++98/istream.cc. However, this change removes the explicit specialization, and now the explicit instantiation definition is necessary to ensure the symbol gets defined in the library. libstdc++-v3/ChangeLog: * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols. 
* include/bits/istream.tcc (__istream_extract): New function template implementing both operator>>(istream&, char*) and operator>>(istream&, char(&)[N]). Add explicit instantiation declaration for it. Remove explicit instantiation declarations for old function templates. * include/std/istream (__istream_extract): Declare. (operator>>(basic_istream<C,T>&, C*)): Define inline and simply call __istream_extract. (operator>>(basic_istream<char,T>&, signed char*)): Likewise. (operator>>(basic_istream<char,T>&, unsigned char*)): Likewise. (operator>>(basic_istream<C,T>&, C(&)[N])): Define for LWG 2499. (operator>>(basic_istream<char,T>&, signed char(&)[N])): Likewise. (operator>>(basic_istream<char,T>&, unsigned char(&)[N])): Likewise. * include/std/streambuf (basic_streambuf): Declare char overload of __istream_extract as a friend. * src/c++11/istream-inst.cc: Add explicit instantiation definition for wchar_t overload of __istream_extract. Remove explicit instantiation definitions of old operator>> overloads for versioned-namespace build. * src/c++98/istream.cc (operator>>(istream&, char*)): Replace with __istream_extract(istream&, char*, streamsize). * testsuite/27_io/basic_istream/extractors_character/char/3.cc: Do not use variable-length array. * testsuite/27_io/basic_istream/extractors_character/char/4.cc: Do not run test for C++20. * testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc: Do not test writing to pointers for C++20. * testsuite/27_io/basic_istream/extractors_character/char/9826.cc: Use array instead of pointer. * testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc: Do not use variable-length array. * testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc: Do not run test for C++20. * testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc: Do not test writing to pointers for C++20. * testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc: New test. 
* testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc: New test. * testsuite/27_io/basic_istream/extractors_character/char/overflow.cc: New test. * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc: New test. * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc: New test.
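The bound that the array overloads apply can be illustrated with standard facilities alone. The following is a minimal sketch, not code from the patch: `std::setw` limits extraction into a character buffer to width-1 characters in every language mode, which is the same limit the C++20 `char(&)[N]` overload applies on its own. The helper name `read_bounded` and the buffer size of 8 are assumptions made for illustration.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: extract one whitespace-delimited word into a
// fixed-size buffer without writing past its end.
std::string read_bounded(const std::string& input)
{
    std::istringstream in(input);
    char buf[8] = {};
    // setw(N) makes operator>> store at most N-1 characters plus '\0'.
    // In C++20 the array overload applies the same limit even without setw.
    in >> std::setw(sizeof buf) >> buf;
    return std::string(buf);
}
```

With a 16-character input word the helper returns only the first seven characters; shorter words are returned whole.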
2020-08-05 | c++: cxx_eval_vec_init after zero-initialization [PR96282] | Patrick Palka | 4 | -1/+59
In the first testcase below, expand_aggr_init_1 sets up t's default constructor such that the ctor first zero-initializes the entire base b, followed by calling b's default constructor, the latter of which just default-initializes the array member b::m via a VEC_INIT_EXPR. So upon constexpr evaluation of this latter VEC_INIT_EXPR, ctx->ctor is nonempty due to the prior zero-initialization, and we proceed in cxx_eval_vec_init to append new constructor_elts to the end of ctx->ctor without first checking if a matching constructor_elt already exists. This leads to ctx->ctor having two matching constructor_elts for each index. This patch fixes this issue by truncating a zero-initialized array CONSTRUCTOR in cxx_eval_vec_init_1 before we begin appending array elements to it. We propagate its zeroed out state during evaluation by clearing CONSTRUCTOR_NO_CLEARING on each new appended aggregate element. gcc/cp/ChangeLog: PR c++/96282 * constexpr.c (cxx_eval_vec_init_1): Truncate ctx->ctor and then clear CONSTRUCTOR_NO_CLEARING on each appended element initializer if we're initializing a previously zero-initialized array object. gcc/testsuite/ChangeLog: PR c++/96282 * g++.dg/cpp0x/constexpr-array26.C: New test. * g++.dg/cpp0x/constexpr-array27.C: New test. * g++.dg/cpp2a/constexpr-init18.C: New test. Co-authored-by: Jason Merrill <jason@redhat.com>
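The situation described above (a base that is zero-initialized and then has its array member default-initialized during constant evaluation) can be sketched as follows. This is an illustrative example in the spirit of the description, not one of the committed testcases; the type names `Elem`, `Base` and `Derived` are hypothetical.

```cpp
#include <cassert>

struct Elem {
    int v;
    constexpr Elem() : v(7) {}   // non-trivial constexpr default constructor
};

struct Base {
    Elem m[3];                   // default-initialized element by element
};

struct Derived : Base {
    // Base() value-initializes the base: it is zero-initialized first,
    // then its implicit default constructor runs and initializes m.
    constexpr Derived() : Base() {}
};

constexpr Derived d;

// Each array element must end up initialized exactly once.
static_assert(d.m[0].v == 7 && d.m[1].v == 7 && d.m[2].v == 7,
              "array elements are initialized by Elem's constructor");

int sum_of_elements() { return d.m[0].v + d.m[1].v + d.m[2].v; }
```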
2020-08-05 | Added test case to make sure that legal cases still pass. | Thomas Koenig | 1 | -0/+56
gcc/testsuite/ChangeLog: PR fortran/96469 * gfortran.dg/do_check_14.f90: New test.
2020-08-05 | Static analysis for definition of DO index variables in contained procedures. | Thomas Koenig | 3 | -11/+357
When encountering a procedure call in a DO loop, this patch checks whether the call is to a contained procedure and, if it is, checks for changes to the index variable. gcc/fortran/ChangeLog: PR fortran/96469 * frontend-passes.c (doloop_contained_function_call): New function. (doloop_contained_procedure_code): New function. (CHECK_INQ): Macro for inquire checks. (doloop_code): Invoke doloop_contained_procedure_code and doloop_contained_function_call if appropriate. (do_intent): Likewise. gcc/testsuite/ChangeLog: PR fortran/96469 * gfortran.dg/do_check_4.f90: Hide change in index variable from compile-time analysis. * gfortran.dg/do_check_13.f90: New test.
2020-08-05 | VEC_COND_EXPR optimizations | Marc Glisse | 4 | -12/+96
When vector comparisons were forced to use vec_cond_expr, we lost a number of optimizations (my fault for not adding enough testcases to prevent that). This patch tries to unwrap vec_cond_expr a bit so some optimizations can still happen. I wasn't planning to add all those transformations together, but adding one caused a regression, whose fix introduced a second regression, etc. Restricting to constant folding would not be sufficient, we also need at least things like X|0 or X&X. The transformations are quite conservative with :s and folding only if everything simplifies, we may want to relax this later. And of course we are going to miss things like a?b:c + a?c:b -> b+c. In terms of number of operations, some transformations turning 2 VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not look like a gain... I expect the bit_not disappears in most cases, and VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR. 2020-08-05 Marc Glisse <marc.glisse@inria.fr> PR tree-optimization/95906 PR target/70314 * match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e), (v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations. (op (c ? a : b)): Update to match the new transformations. * gcc.dg/tree-ssa/andnot-2.c: New file. * gcc.dg/tree-ssa/pr95906.c: Likewise. * gcc.target/i386/pr70314.c: Likewise.
2020-08-05 | aarch64: Clear canary value after stack_protect_test [PR96191] | Richard Sandiford | 3 | -19/+110
The stack_protect_test patterns were leaving the canary value in the temporary register, meaning that it was often still in registers on return from the function. An attacker might therefore have been able to use it to defeat stack-smash protection for a later function. gcc/ PR target/96191 * config/aarch64/aarch64.md (stack_protect_test_<mode>): Set the CC register directly, instead of a GPR. Replace the original GPR destination with an extra scratch register. Zero out operand 3 after use. (stack_protect_test): Update accordingly. gcc/testsuite/ PR target/96191 * gcc.target/aarch64/stack-protector-1.c: New test. * gcc.target/aarch64/stack-protector-2.c: Likewise.
2020-08-05 | aarch64: Add missing %z prefixes to LDP/STP patterns | Richard Sandiford | 2 | -17/+17
For LDP/STP Q, the memory operand might not be valid for "m", so we need to use %z<N> instead of %<N> in the asm template. This patch does that for all Ump LDP/STP patterns, regardless of whether it's strictly needed. This is needed to unbreak bootstrap. 2020-08-05 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.md (load_pair_sw_<SX:mode><SX2:mode>) (load_pair_dw_<DX:mode><DX2:mode>, load_pair_dw_tftf) (store_pair_sw_<SX:mode><SX2:mode>) (store_pair_dw_<DX:mode><DX2:mode>, store_pair_dw_tftf) (*load_pair_extendsidi2_aarch64) (*load_pair_zero_extendsidi2_aarch64): Use %z for the memory operand. * config/aarch64/aarch64-simd.md (load_pair<DREG:mode><DREG2:mode>) (vec_store_pair<DREG:mode><DREG2:mode>, load_pair<VQ:mode><VQ2:mode>) (vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
2020-08-05 | refactor LIM a bit | Richard Biener | 1 | -95/+58
This refactors LIM to eschew alloc_aux_for_edges and re-uses the RPO order of the move_computations walk for invariantness computation as well. It also removes one unnecessary sorting (but retaining it as checking code because we bsearch the vector) and moves edge insert commit code to the place where it doesn't have to scan all the functions edges. This was all done when investigating whether LIM can be refactored to work on a specific loop for on-demand processing (but we're not there yet). 2020-08-05 Richard Biener <rguenther@suse.de> * tree-ssa-loop-im.c (invariantness_dom_walker): Remove. (invariantness_dom_walker::before_dom_children): Move to ... (compute_invariantness): ... this function. (move_computations): Inline ... (tree_ssa_lim): ... here, share RPO order and avoid some cfun references. (analyze_memory_references): Remove sorting of location lists, instead assert they are sorted already when checking. (prev_flag_edges): Remove. (execute_sm_if_changed): Pass down and adjust prev edge state. (execute_sm_exit): Likewise. (hoist_memory_references): Likewise. Commit edge insertions of each processed exit. (store_motion_loop): Do not commit edge insertions on all edges in the function. (tree_ssa_lim_initialize): Do not call alloc_aux_for_edges. (tree_ssa_lim_finalize): Do not call free_aux_for_edges.
2020-08-05 | Make genmatch transform failure handling more consistent | Richard Biener | 1 | -15/+29
Currently, whether a failure during the transform stage is fatal or whether following patterns are still considered is a bit random, depending for example on whether the pattern is wrapped in a for. The following makes it consistent by replacing early returns with gotos to the end of the pattern processing. 2020-08-05 Richard Biener <rguenther@suse.de> * genmatch.c (fail_label): New global. (expr::gen_transform): Branch to fail_label instead of returning. Fix indent of call argument checking. (dt_simplify::gen_1): Compute and emit fail_label, branch to it instead of returning early.
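The control-flow change described above can be sketched with a hand-written stand-in for generated matcher code. Everything here is hypothetical (the function `simplify`, the two toy patterns, the label names and the `Result` type); it only illustrates how jumping to the end of a pattern, rather than returning, lets later patterns still be tried.

```cpp
#include <cassert>

struct Result {
    bool ok;     // true if some pattern applied
    int value;   // simplified value when ok is true
};

// Two toy patterns are tried in order. A failed transform jumps to the end
// of its own pattern instead of returning, so the next pattern gets a chance.
Result simplify(int x)
{
    // Pattern 1: halve even numbers.
    if (x % 2 != 0)
        goto fail1;              // an early "return" here would skip pattern 2
    return Result{true, x / 2};
fail1:;

    // Pattern 2: increment non-negative numbers.
    if (x < 0)
        goto fail2;
    return Result{true, x + 1};
fail2:;

    return Result{false, 0};     // no pattern applied
}
```

An odd non-negative input fails the first pattern but is still handled by the second, which is the consistent behaviour the commit message asks for.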
2020-08-05 | openmp: Handle even some combined non-rectangular loops | Jakub Jelinek | 3 | -5/+378
The number of loops computation and logical iteration -> actual iterator values computations can now be done separately even on composite constructs (though for triangular loops it would still be more efficient to propagate a few values through, will handle that incrementally). simd and taskloop are still unhandled. 2020-08-05 Jakub Jelinek <jakub@redhat.com> * omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular loops. * testsuite/libgomp.c/loop-22.c: New test. * testsuite/libgomp.c/loop-23.c: New test.
2020-08-05 | openmp: Handle reduction clauses on host teams construct [PR96459] | Jakub Jelinek | 4 | -28/+83
As the new testcase shows, we weren't actually performing reductions on host teams construct. And fixing that revealed a flaw in the for-14.c testcase. The problem is that the tests also perform initialization and checking around the calls to the functions with the OpenMP constructs. In that testcase, all the tests were spawned from a teams construct but only the tested loops were distribute constructs, which means the initialization and checking was performed redundantly and racily in each team. Fixed by performing the initialization and checking outside of host teams and only doing the calls to functions with the tested constructs inside of host teams. 2020-08-05 Jakub Jelinek <jakub@redhat.com> PR middle-end/96459 * omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even for host teams. * testsuite/libgomp.c/teams-3.c: New test. * testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing if not defined yet. (N(test)): Use it before all N(f*) calls. * testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define. (main): Don't call all test_* functions from within #pragma omp teams reduction(|:err), call them directly.
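A reduction on a host teams construct looks roughly like the sketch below. It is not the committed teams-3.c test; the function name `sum_with_teams` is hypothetical, and the example assumes a compiler with OpenMP 5.0 host teams support when built with OpenMP enabled. Without OpenMP the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>

// Sum 0..n-1 with a reduction clause on a host teams construct.
// Each team accumulates into its own private copy of sum; the copies
// are combined into the original variable at the end of the teams region.
int sum_with_teams(int n)
{
    int sum = 0;
#pragma omp teams reduction(+ : sum)
#pragma omp distribute
    for (int i = 0; i < n; ++i)
        sum += i;
    return sum;
}
```

If the reduction were not performed, the result would depend on how iterations were distributed among teams instead of always being n*(n-1)/2.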
2020-08-05 | openmp: Use more efficient logical -> actual computation even if # iterations is computed at runtime | Jakub Jelinek | 1 | -7/+22
For triangular loops use more efficient logical iteration number to actual iterator values computation even for non-rectangular loops where number of loop iterations could not be computed at compile time. 2020-08-05 Jakub Jelinek <jakub@redhat.com> * omp-expand.c (expand_omp_for_init_counts): Remember first_inner_iterations, factor and n1o from the number of iterations computation in *fd. (expand_omp_for_init_vars): Use more efficient logical iteration number to actual iterator values computation even for non-rectangular loops where number of loop iterations could not be computed at compile time.
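To make "logical iteration number to actual iterator values" concrete, here is a sketch for one simple triangular nest. It shows a closed-form mapping based on the quadratic formula and checks it against the nested loops; it is an illustration under stated assumptions, not necessarily the computation omp-expand.c emits. The function names and the particular loop bounds are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// For the triangular nest
//     for (long i = 1; i < n; ++i)
//         for (long j = 0; j < i; ++j)
// logical iteration k (counting from 0) satisfies k = i*(i-1)/2 + j.
std::pair<long, long> logical_to_ij(long k)
{
    // Find the largest i with i*(i-1)/2 <= k using the quadratic formula.
    long i = static_cast<long>(
        (1.0 + std::sqrt(1.0 + 8.0 * static_cast<double>(k))) / 2.0);
    // Correct any floating-point rounding in either direction.
    while (i * (i - 1) / 2 > k) --i;
    while ((i + 1) * i / 2 <= k) ++i;
    return {i, k - i * (i - 1) / 2};
}

// Reference check: walk the nest and compare with the closed form.
bool matches_nested_loops(long n)
{
    long k = 0;
    for (long i = 1; i < n; ++i)
        for (long j = 0; j < i; ++j, ++k)
            if (logical_to_ij(k) != std::make_pair(i, j))
                return false;
    return true;
}
```

The point of such a mapping is that a thread handed logical iteration k can recover (i, j) directly instead of stepping through the nest from the start.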
2020-08-04 | rs6000 Add vector blend, permute builtin support | Carl Love | 8 | -6/+830
GCC maintainers: The following patch adds support for the vec_blendv and vec_permx builtins. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) with no regression errors. The test cases were compiled on a Power 9 system and then tested on Mambo. Carl Love rs6000 RFC2609 vector blend, permute instructions gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.h (vec_blendv, vec_permx): Add define. * config/rs6000/altivec.md (UNSPEC_XXBLEND, UNSPEC_XXPERMX.): New unspecs. (VM3): New define_mode. (VM3_char): New define_attr. (xxblend_<mode> mode VM3): New define_insn. (xxpermx): New define_expand. (xxpermx_inst): New define_insn. * config/rs6000/rs6000-builtin.def (VXXBLEND_V16QI, VXXBLEND_V8HI, VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): New BU_P10V_3 definitions. (XXBLEND): New BU_P10_OVERLOAD_3 definition. (XXPERMX): New BU_P10_OVERLOAD_4 definition. * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): (P10_BUILTIN_VXXPERMX): Add if statement. * config/rs6000/rs6000-call.c (P10_BUILTIN_VXXBLEND_V16QI, P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI, P10_BUILTIN_VXXBLEND_V2DI, P10_BUILTIN_VXXBLEND_V4SF, P10_BUILTIN_VXXBLEND_V2DF, P10_BUILTIN_VXXPERMX): Define overloaded arguments. (rs6000_expand_quaternop_builtin): Add if case for CODE_FOR_xxpermx. (builtin_quaternary_function_type): Add v16uqi_type and xxpermx_type variables, add case statement for P10_BUILTIN_VXXPERMX. (builtin_function_type): Add case statements for P10_BUILTIN_VXXBLEND_V16QI, P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI, P10_BUILTIN_VXXBLEND_V2DI. * doc/extend.texi: Add documentation for vec_blendv and vec_permx. gcc/testsuite/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vec-blend-runnable.c: New test. * gcc.target/powerpc/vec-permute-ext-runnable.c: New test.
2020-08-04 | rs6000, Add vector splat builtin support | Carl Love | 9 | -0/+379
GCC maintainers: The following patch adds support for the vec_splati, vec_splatid and vec_splati_ins builtins. This patch adds support for instructions that take a 32-bit immediate value that represents a floating point value. This support adds new predicates and a support function to properly handle the immediate value. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) with no regression errors. The test case was compiled on a Power 9 system and then tested on Mambo. Please let me know if this patch is acceptable for the mainline branch. Thanks. Carl Love -------------------------------------------------------- gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.h (vec_splati, vec_splatid, vec_splati_ins): Add defines. * config/rs6000/altivec.md (UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID, UNSPEC_XXSPLTI32DX): New. (vxxspltiw_v4si, vxxspltiw_v4sf_inst, vxxspltidp_v2df_inst, vxxsplti32dx_v4si_inst, vxxsplti32dx_v4sf_inst): New define_insn. (vxxspltiw_v4sf, vxxspltidp_v2df, vxxsplti32dx_v4si, vxxsplti32dx_v4sf.): New define_expands. * config/rs6000/predicates.md (u1bit_cint_operand, s32bit_cint_operand, c32bit_cint_operand): New predicates. * config/rs6000/rs6000-builtin.def (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF, VXXSPLTID): New definitions. (VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF): New BU_P10V_3 definitions. (XXSPLTIW, XXSPLTID): New definitions. (XXSPLTI32DX): Add definitions. * config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_XXSPLTIW, P10_BUILTIN_VEC_XXSPLTID, P10_BUILTIN_VEC_XXSPLTI32DX): New definitions. * config/rs6000/rs6000-protos.h (rs6000_constF32toI32): New extern declaration. * config/rs6000/rs6000.c (rs6000_constF32toI32): New function. * doc/extend.texi: Add documentation for vec_splati, vec_splatid, and vec_splati_ins. gcc/testsuite/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vec-splati-runnable.c: New test.
2020-08-04 | rs6000, Add vector shift double builtin support | Carl Love | 6 | -0/+539
GCC maintainers: The following patch adds support for the vector shift double builtins. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) and Mambo with no regression errors. Please let me know if this patch is acceptable for the mainline branch. Thanks. Carl Love ------------------------------------------------------- gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.h (vec_sldb, vec_srdb): New defines. * config/rs6000/altivec.md (UNSPEC_SLDB, UNSPEC_SRDB): New. (SLDB_lr): New attribute. (VSHIFT_DBL_LR): New iterator. (vs<SLDB_lr>db_<mode>): New define_insn. * config/rs6000/rs6000-builtin.def (VSLDB_V16QI, VSLDB_V8HI, VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI, VSRDB_V2DI): New BU_P10V_3 definitions. (SLDB, SRDB): New BU_P10_OVERLOAD_3 definitions. * config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_SLDB, P10_BUILTIN_VEC_SRDB): New definitions. (rs6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi, CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di, CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si, CODE_FOR_vsrdb_v2di]: Add clauses. * doc/extend.texi: Add description for vec_sldb and vec_srdb. gcc/testsuite/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vec-shift-double-runnable.c: New test file.
2020-08-04 | rs6000, Add vector replace builtin support | Carl Love | 6 | -0/+478
GCC maintainers: The following patch adds support for builtins vec_replace_elt and vec_replace_unaligned. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) and mambo with no regression errors. Please let me know if this patch is acceptable for the mainline branch. Thanks. Carl Love ------------------------------------------------------- gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.h: Add define for vec_replace_elt and vec_replace_unaligned. * config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New unspecs. (REPLACE_ELT): New mode iterator. (REPLACE_ELT_char, REPLACE_ELT_sh, REPLACE_ELT_max): New mode attributes. (vreplace_un_<mode>, vreplace_elt_<mode>_inst): New. * config/rs6000/rs6000-builtin.def (VREPLACE_ELT_V4SI, VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF, VREPLACE_ELT_UV2DI, VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI, VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI, VREPLACE_UN_V2DF, REPLACE_ELT, REPLACE_UN, VREPLACE_ELT_V2DI): New builtin entries. * config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_REPLACE_ELT, P10_BUILTIN_VEC_REPLACE_UN): New builtin argument definitions. (rs6000_expand_quaternop_builtin): Add 3rd argument checks for CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf, CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf. (builtin_function_type) [P10_BUILTIN_VREPLACE_ELT_UV4SI, P10_BUILTIN_VREPLACE_ELT_UV2DI, P10_BUILTIN_VREPLACE_UN_UV4SI, P10_BUILTIN_VREPLACE_UN_UV2DI]: New cases. * doc/extend.texi: Add description for vec_replace_elt and vec_replace_unaligned builtins. gcc/testsuite/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vec-replace-word-runnable.c: New test.
2020-08-04 | rs6000 Add vector insert builtin support | Carl Love | 6 | -0/+595
GCC maintainers: This patch adds support for vec_insertl and vec_inserth builtins. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) and mambo with no regression errors. Please let me know if this patch is acceptable for the mainline branch. Thanks. Carl Love -------------------------------------------------------------- gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines. * config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL, VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL, VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR, VINSERTVPRHR, VINSERTVPRWR): New builtins. (INSERTL, INSERTH): New builtins. * config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_INSERTL, P10_BUILTIN_VEC_INSERTH): New overloaded definitions. (P10_BUILTIN_VINSERTGPRBL, P10_BUILTIN_VINSERTGPRHL, P10_BUILTIN_VINSERTGPRWL, P10_BUILTIN_VINSERTGPRDL, P10_BUILTIN_VINSERTVPRBL, P10_BUILTIN_VINSERTVPRHL, P10_BUILTIN_VINSERTVPRWL): Add case entries. * config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL, UNSPEC_INSERTR. (define_expand): Add vinsertvl_<mode>, vinsertvr_<mode>, vinsertgl_<mode>, vinsertgr_<mode>, mode is VI2. (define_insn): vinsertvl_internal_<mode>, vinsertvr_internal_<mode>, vinsertgl_internal_<mode>, vinsertgr_internal_<mode>, mode VEC_I. * doc/extend.texi: Add documentation for vec_insertl, vec_inserth. gcc/testsuite/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vec-insert-word-runnable.c: New test case.
2020-08-04 | rs6000, Update support for vec_extract | Carl Love | 3 | -103/+110
GCC maintainers: Move the existing vector extract support in altivec.md to vsx.md so all of the vector insert and extract support is in the same file. The patch also updates the names of the builtins and the descriptions of the builtins in the documentation file so they match the approved builtin names and descriptions. The patch does not make any functional changes. Please let me know if the changes are acceptable for mainline. Thanks. Carl Love ------------------------------------------------------ gcc/ChangeLog 2020-08-04 Carl Love <cel@us.ibm.com> * config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR) (vextractl<mode>, vextractr<mode>) (vextractl<mode>_internal, vextractr<mode>_internal for mode VI2) (VI2): Move to ... * config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR) (vextractl<mode>, vextractr<mode>) (vextractl<mode>_internal, vextractr<mode>_internal for mode VI2) (VI2): ... here. * doc/extend.texi: Update documentation for vec_extractl. Replace builtin name vec_extractr with vec_extracth. Update description of vec_extracth.
2020-08-05 | Daily bump. | GCC Administrator | 7 | -1/+318
2020-08-04 | aarch64: Delete duplicated option docs. | Jim Wilson | 1 | -18/+0
Noticed while reviewing the RISC-V -mstack-protector-guard docs. The AArch64 section has two identical copies of the docs for this option. gcc/ * doc/invoke.texi (AArch64 Options): Delete duplicate -mstack-protector-guard docs.
2020-08-05 | [PATCH] nvptx: Add support for PTX highpart multiplications (HI/SI) | Roger Sayle | 3 | -0/+78
This patch adds support for signed and unsigned, HImode and SImode highpart multiplications to the nvptx backend. This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu with a "make" and "make -k check" with no new failures with the above patch. 2020-08-04 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog: * config/nvptx/nvptx.md (smulhi3_highpart, smulsi3_highpart) (umulhi3_highpart, umulsi3_highpart): New instructions. gcc/testsuite/ChangeLog: * gcc.target/nvptx/mul-hi.c: New test. * gcc.target/nvptx/umul-hi.c: New test.
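For readers unfamiliar with the term, a highpart multiplication returns the upper half of the double-width product. The sketch below shows the SImode (32-bit) case in portable C++; the function names are hypothetical, and the code only illustrates the arithmetic rather than the nvptx patterns themselves.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the 64-bit product of two signed 32-bit values.
// The right shift of a negative value is assumed to be arithmetic, which
// is guaranteed from C++20 and is what mainstream compilers do before it.
int32_t smul_highpart(int32_t a, int32_t b)
{
    return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 32);
}

// High 32 bits of the 64-bit product of two unsigned 32-bit values.
uint32_t umul_highpart(uint32_t a, uint32_t b)
{
    return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}
```

When a backend provides highpart patterns, the compiler can use a single instruction for this idiom instead of forming the full double-width product.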
2020-08-04 | c++: Template keyword following :: [PR96082] | Marek Polacek | 2 | -1/+12
In r9-4235 I tried to make sure that the template keyword follows a nested-name-specifier. :: is a valid nested-name-specifier, so I also have to check 'globalscope' before giving the error. gcc/cp/ChangeLog: PR c++/96082 * parser.c (cp_parser_elaborated_type_specifier): Allow 'template' following ::. gcc/testsuite/ChangeLog: PR c++/96082 * g++.dg/template/template-keyword3.C: New test.
2020-08-04 | compiler: delete lowered constant strings | Ian Lance Taylor | 2 | -2/+9
If we lower a constant string operation in a Binary_expression, delete the strings. This is safe because constant strings are always newly allocated. This is a hack to use much less memory when compiling the new time/tzdata package, which has a file that contains the sum of over 13,000 constant strings. We don't do this for numeric expressions because that could cause us to delete an Iota_expression. We should have a cleaner approach to memory usage some day. Fixes PR go/96450
2020-08-04 | amdgcn: Remove dead defines from gcn-run | Andrew Stubbs | 1 | -18/+0
Nothing uses these since the switch to HSACOv3. gcc/ChangeLog: * config/gcn/gcn-run.c (R_AMDGPU_NONE): Delete. (R_AMDGPU_ABS32_LO): Delete. (R_AMDGPU_ABS32_HI): Delete. (R_AMDGPU_ABS64): Delete. (R_AMDGPU_REL32): Delete. (R_AMDGPU_REL64): Delete. (R_AMDGPU_ABS32): Delete. (R_AMDGPU_GOTPCREL): Delete. (R_AMDGPU_GOTPCREL32_LO): Delete. (R_AMDGPU_GOTPCREL32_HI): Delete. (R_AMDGPU_REL32_LO): Delete. (R_AMDGPU_REL32_HI): Delete. (reserved): Delete. (R_AMDGPU_RELATIVE64): Delete.
2020-08-04 | [Arm] Modify default tuning of armv8.1-m.main to use Cortex-M55 | Omar Tahir | 1 | -1/+1
Previously, compiling with -march=armv8.1-m.main would tune for Cortex-M7. However, the Cortex-M7 only supports up to Armv7e-M. The Cortex-M55 is the earliest CPU that supports Armv8.1-M Mainline so is more appropriate. This also has the effect of changing the branch cost function used, which will be necessary to correctly prioritise conditional instructions over branches in the rest of this patch series. Regression tested on arm-none-eabi. gcc/ChangeLog 2020-08-04 Omar Tahir <omar.tahir@arm.com> * config/arm/arm-cpus.in (armv8.1-m.main): Tune for Cortex-M55.
2020-08-04 | aarch64: Delete unnecessary code | Hu Jiangping | 1 | -2/+0
gcc/ * config/aarch64/aarch64.c (aarch64_if_then_else_costs): Delete redundant extra_cost variable.
2020-08-04 | c++: fix template parm count leak | Nathan Sidwell | 3 | -30/+34
I noticed that we could leak parser->num_template_parameter_lists with erroneous specializations. We'd increment, notice a problem and then bail out. This refactors cp_parser_explicit_specialization to avoid that code path. A couple of tests get different diagnostics because of the fix. pr39425 then goes to unbounded template instantiation and exceeds the implementation limit. gcc/cp/ * parser.c (cp_parser_explicit_specialization): Refactor to avoid leak of num_template_parameter_lists value. gcc/testsuite/ * g++.dg/template/pr39425.C: Adjust errors, (unbounded template recursion). * g++.old-deja/g++.pt/spec20.C: Remove fallout diagnostics.
2020-08-04 | AArch64: Use FLOAT_MODE_P macro and add FLAG_AUTO_FP [PR94442] | xiezhiheng | 1 | -20/+6
All FP intrinsics are given FLAG_FP by default, but not all FP intrinsics raise FP exceptions or read the FPCR register, so we add a global flag FLAG_AUTO_FP to suppress FLAG_FP. 2020-08-04 Zhiheng Xie <xiezhiheng@huawei.com> gcc/ChangeLog: * config/aarch64/aarch64-builtins.c (aarch64_call_properties): Use FLOAT_MODE_P macro instead of enumerating all floating-point modes and add global flag FLAG_AUTO_FP.
2020-08-04 | Fortran/OpenMP: Fix detecting not perfectly nested loops | Tobias Burnus | 3 | -4/+34
gcc/fortran/ChangeLog: * openmp.c (resolve_omp_do): Detect not perfectly nested loop with innermost collapse. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/collapse1.f90: Add dg-error. * gfortran.dg/gomp/collapse2.f90: New test.
2020-08-04 | doc: Add @cindex to symver attribute | Jakub Jelinek | 1 | -0/+1
When looking at the symver attr documentation in html, I found there is no name to refer to for it. 2020-08-04 Jakub Jelinek <jakub@redhat.com> * doc/extend.texi (symver): Add @cindex for symver function attribute.
2020-08-04 | Test case for PR rtl-optimization/60473 | Roger Sayle | 1 | -0/+12
PR rtl-optimization/60473 is a code quality regression that has been cured by improvements to register allocation. For the function in the test case, GCC 4.4, 4.5 and 4.6 generated very poor code requiring two mov instructions, and GCC 4.7 and 4.8 (when the PR was filed) produced better but still poor code with one mov instruction. Since GCC 4.9 (including current mainline), it generates optimal code with no mov instructions, matching what used to be generated in GCC 4.1. 2020-08-04 Roger Sayle <roger@nextmovesoftware.com> gcc/testsuite/ChangeLog PR rtl-optimization/60473 * gcc.target/i386/pr60473.c: New test.
2020-08-04 | Simplify X * C1 == C2 with undefined overflow | Marc Glisse | 3 | -1/+23
This transformation is quite straightforward: without overflow, 3*X==15 is the same as X==5, and 3*X==5 cannot happen. Adding a single_use restriction for the first case didn't seem necessary, although of course it can slightly increase register pressure in some cases. 2020-08-04 Marc Glisse <marc.glisse@inria.fr> PR tree-optimization/95433 * match.pd (X * C1 == C2): New transformation. * gcc.c-torture/execute/pr23135.c: Add -fwrapv to avoid undefined behavior. * gcc.dg/tree-ssa/pr95433.c: New file.
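The two cases named in the message can be checked directly for a range of values where the multiplication cannot overflow. This is a small sketch with hypothetical function names, not the committed pr95433.c test.

```cpp
#include <cassert>

bool original_form(int x)   { return 3 * x == 15; }
bool simplified_form(int x) { return x == 5; }
bool never_true(int x)      { return 3 * x == 5; }  // 5 is not a multiple of 3

// Compare the forms over [lo, hi]; the caller must pick a range in which
// 3 * x does not overflow, since the equivalence relies on that.
bool forms_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; ++x) {
        if (original_form(x) != simplified_form(x)) return false;
        if (never_true(x)) return false;
    }
    return true;
}
```

With wrapping arithmetic the equivalence would not hold in general, because 3*X could wrap around to 15 for other values of X; that is why the transformation is restricted to types with undefined overflow.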
2020-08-04 | Adjust gimple-ssa-sprintf.c for irange API. | Aldy Hernandez | 1 | -21/+16
gcc/ChangeLog: * gimple-ssa-sprintf.c (get_int_range): Adjust for irange API. (format_integer): Same. (handle_printf_call): Same.
2020-08-04 | d: Fix struct literals that have non-deterministic hash values (PR96153) | Iain Buclaw | 3 | -36/+101
Adds code generation for generating a temporary for, and pre-filling struct and array literals with zeroes before assigning, so that alignment holes don't cause objects to produce a non-deterministic hash value. A new field has been added to the expression visitor to track whether the result is being generated for another literal, so that memset() is only called once on the top-level literal expression, and not for nested structs or arrays. gcc/d/ChangeLog: PR d/96153 * d-tree.h (build_expr): Add literalp argument. * expr.cc (ExprVisitor): Add literalp_ field. (ExprVisitor::ExprVisitor): Initialize literalp_. (ExprVisitor::visit (AssignExp *)): Call memset() on blits where RHS is a struct literal. Elide assignment if initializer is all zeroes. (ExprVisitor::visit (CastExp *)): Forward literalp_ to generation of subexpression. (ExprVisitor::visit (AddrExp *)): Likewise. (ExprVisitor::visit (ArrayLiteralExp *)): Use memset() to pre-fill object with zeroes. Set literalp in subexpressions. (ExprVisitor::visit (StructLiteralExp *)): Likewise. (ExprVisitor::visit (TupleExp *)): Set literalp in subexpressions. (ExprVisitor::visit (VectorExp *)): Likewise. (ExprVisitor::visit (VectorArrayExp *)): Likewise. (build_expr): Forward literalp to ExprVisitor. gcc/testsuite/ChangeLog: PR d/96153 * gdc.dg/pr96153.d: New test.
2020-08-04amdgcn: TImode shiftsAndrew Stubbs1-0/+105
Implement TImode shifts in the backend. The middle-end support that does it for other architectures doesn't work for GCN because BITS_PER_WORD==32, meaning that TImode is quad-word, not double-word. gcc/ChangeLog: * config/gcn/gcn.md ("<expander>ti3"): New.
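To see why a quad-word value needs more than the double-word handling, here is a minimal C sketch of a 128-bit logical left shift over four 32-bit words. It is not the RTL expander added to gcn.md; the type `u128_words` and the function `shl128` are hypothetical, and the word order (least significant first) is an assumption made for the example.

```c
#include <stdint.h>

/* A 128-bit value as four 32-bit words, least significant word first.
   Hypothetical type for illustration only.  */
typedef struct
{
  uint32_t w[4];
} u128_words;

/* Logical left shift by n bits, for 0 <= n < 128.  */
u128_words
shl128 (u128_words x, unsigned n)
{
  u128_words r = { { 0, 0, 0, 0 } };
  unsigned word_shift = n / 32;   /* whole words to move up */
  unsigned bit_shift = n % 32;    /* remaining bits within a word */

  for (unsigned i = word_shift; i < 4; i++)
    {
      unsigned src = i - word_shift;
      r.w[i] = x.w[src] << bit_shift;
      /* Bring in the bits that spill over from the next lower word.
         Skipped when bit_shift is 0, since shifting a 32-bit value
         by 32 is undefined in C.  */
      if (bit_shift != 0 && src > 0)
        r.w[i] |= x.w[src - 1] >> (32 - bit_shift);
    }
  return r;
}
```

Each result word may draw from two source words, and the shift count selects among four word positions rather than two, which is the extra work a double-word scheme does not cover.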
2020-08-04c++: Member initializer list diagnostic locations [PR94024]Patrick Palka4-1/+52
This patch preserves the source locations of each node in a member initializer list so that during processing of the list we can set input_location appropriately for generally more accurate diagnostic locations. Since TREE_LIST nodes are tcc_exceptional, they can't have source locations, so we instead store the location in a dummy tcc_expression node within the TREE_TYPE of the list node. gcc/cp/ChangeLog: PR c++/94024 * init.c (sort_mem_initializers): Preserve TREE_TYPE of the member initializer list node. (emit_mem_initializers): Set input_location when performing each member initialization. * parser.c (cp_parser_mem_initializer): Attach the source location of this initializer to a dummy EMPTY_CLASS_EXPR within the TREE_TYPE of the list node. * pt.c (tsubst_initializer_list): Preserve TREE_TYPE of the member initializer list node. gcc/testsuite/ChangeLog: PR c++/94024 * g++.dg/diagnostic/mem-init1.C: New test.
2020-08-04tree-optimization/88240 - stopgap for floating point code-hoisting issuesRichard Biener4-1/+49
This adds a stopgap measure to avoid performing code-hoisting on mixed type loads when the load we'd insert in the hoisting position would be a floating point one. This is because certain targets (hello x87) cannot perform floating point loads without possibly altering the bit representation and thus cannot be used in place of integral loads. 2020-08-04 Richard Biener <rguenther@suse.de> PR tree-optimization/88240 * tree-ssa-sccvn.h (vn_reference_s::punned): New flag. * tree-ssa-sccvn.c (vn_reference_insert): Initialize punned. (vn_reference_insert_pieces): Likewise. (visit_reference_op_call): Likewise. (visit_reference_op_load): Track whether a ref was punned. * tree-ssa-pre.c (do_hoist_insertion): Refuse to perform hoist insertion on punned floating point loads. * gcc.target/i386/pr88240.c: New testcase.
2020-08-04Fortran: Fix for OpenMP's 'lastprivate(conditional:'Tobias Burnus2-8/+6
gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_do): Fix 'lastprivate(conditional:'. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/lastprivate-conditional-3.f90: Enable some previously disabled 'lastprivate(conditional:' dg-warnings.
2020-08-04aarch64: Use Q-reg loads/stores in movmem expansionSudakshina Das3-7/+52
This is my attempt at reviving the old patch https://gcc.gnu.org/pipermail/gcc-patches/2019-January/514632.html I have followed up on Kyrill's comment upstream on the link above and I am using the recommended option iii that he mentioned. "1) Adjust the copy_limit to 256 bits after checking AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS in the tuning. 2) Adjust aarch64_copy_one_block_and_progress_pointers to handle 256-bit moves. by iii: iii) Emit explicit V4SI (or any other 128-bit vector mode) pairs ldp/stps. This wouldn't need any adjustments to MD patterns, but would make aarch64_copy_one_block_and_progress_pointers more complex as it would now have two paths, where one handles two adjacent memory addresses in one call." gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case for E_V4SImode. (aarch64_gen_load_pair): Likewise. (aarch64_copy_one_block_and_progress_pointers): Handle 256 bit copy. (aarch64_expand_cpymem): Expand copy_limit to 256bits where appropriate. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cpymem-q-reg_1.c: New test. * gcc.target/aarch64/large_struct_copy_2.c: Update for ldp q regs.
2020-08-04aarch64: Add missing clobber for fjcvtzsAndrea Corallo4-1/+59
gcc/ChangeLog 2020-07-30 Andrea Corallo <andrea.corallo@arm.com> * config/aarch64/aarch64.md (aarch64_fjcvtzs): Add missing clobber. * doc/sourcebuild.texi (aarch64_fjcvtzs_hw): Document new target supports option. gcc/testsuite/ChangeLog 2020-07-30 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/acle/jcvt_2.c: New testcase. * lib/target-supports.exp (check_effective_target_aarch64_fjcvtzs_hw): Add new check for FJCVTZS hw.
2020-08-04[nvptx] Handle V2DI/V2SI mode in nvptx_gen_shuffleTom de Vries3-0/+95
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle. Fix this by adding handling of V2DI as well as V2SI mode in nvptx_gen_shuffle. Build and reg-tested on x86_64 with nvptx accelerator. gcc/ChangeLog: PR target/96428 * config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI. libgomp/ChangeLog: PR target/96428 * testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test. * testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
2020-08-04veclower: Don't ICE on .VEC_CONVERT calls with no lhs [PR96426]Jakub Jelinek2-0/+16
.VEC_CONVERT is a const internal call, so normally if the lhs is not used, we'd DCE it far before getting to veclower, but with -O0 (or perhaps -fno-tree-dce and some other -fno-* options) such a call can survive until then. But as the internal fn needs the lhs to know the type to which the conversion is done (and I think that is a reasonable representation; adding another magic argument and having to create constants of that type looks like overkill to me), we should just DCE those calls ourselves. During veclower, we can't really remove insns, as the callers would be upset, so this just replaces it with a GIMPLE_NOP. 2020-08-04 Jakub Jelinek <jakub@redhat.com> PR middle-end/96426 * tree-vect-generic.c (expand_vector_conversion): Replace .VEC_CONVERT call with GIMPLE_NOP if there is no lhs. * gcc.c-torture/compile/pr96426.c: New test.
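A conversion call with no lhs can arise from source like the sketch below, assuming the conversion is written with GCC's `__builtin_convertvector` and the result is discarded. This is a hypothetical reproducer shape, not the contents of the committed pr96426.c; the typedef and function names are made up for the example.

```c
/* GCC vector extension types: four ints and four floats, 16 bytes each.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* The converted value is discarded, so the conversion has no lhs.  */
void
convert_unused (v4si *p)
{
  (void) __builtin_convertvector (*p, v4sf);
}

/* The usual case: the result is used, so the call has an lhs whose
   type tells the compiler what to convert to.  */
v4sf
convert_used (v4si *p)
{
  return __builtin_convertvector (*p, v4sf);
}
```

Compiling `convert_unused` at -O0 is the kind of situation in which the unused call is not removed before vector lowering runs.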
2020-08-04gimple-fold: Fix ICE in maybe_canonicalize_mem_ref_addr on debug stmt [PR96354]Jakub Jelinek2-3/+31
In debug stmts, we are less strict about what is and what is not accepted there, so this patch just punts on optimization of a debug stmt rather than ICEing. 2020-08-04 Jakub Jelinek <jakub@redhat.com> PR debug/96354 * gimple-fold.c (maybe_canonicalize_mem_ref_addr): Add IS_DEBUG argument. Return false instead of gcc_unreachable if it is true and get_addr_base_and_unit_offset returns NULL. (fold_stmt_1) <case GIMPLE_DEBUG>: Adjust caller. * g++.dg/opt/pr96354.C: New test.
2020-08-04Add is_gimple_min_invariant dropped from previous patch.Aldy Hernandez1-1/+3
gcc/ChangeLog: * vr-values.c (simplify_using_ranges::vrp_evaluate_conditional): Call is_gimple_min_invariant dropped from previous patch.
2020-08-04openmp: Compute number of collapsed loop iterations more efficiently for some non-rectangular loopsJakub Jelinek1-100/+352
2020-08-04 Jakub Jelinek <jakub@redhat.com> * omp-expand.c (expand_omp_for_init_counts): For triangular loops compute number of iterations at runtime more efficiently. (expand_omp_for_init_vars): Adjust immediate dominators. (extract_omp_for_update_vars): Likewise.
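As an illustration of what a cheaper iteration count for a triangular nest can look like, the C sketch below compares a summing loop with a closed form. The log entry does not say which formula omp-expand.c uses, so the closed form here is an assumption, and both function names are hypothetical.

```c
/* Count the iterations of the collapsed nest
     for (i = 0; i < n; i++)
       for (j = 0; j < i; j++)
   by walking the outer loop and summing the inner trip counts.  */
unsigned long long
count_by_summing (unsigned n)
{
  unsigned long long total = 0;
  for (unsigned i = 0; i < n; i++)
    total += i;
  return total;
}

/* The same count in closed form: 0 + 1 + ... + (n - 1) = n * (n - 1) / 2.  */
unsigned long long
count_closed_form (unsigned n)
{
  unsigned long long m = n;
  return m == 0 ? 0 : m * (m - 1) / 2;
}
```

The summing version costs one step per outer iteration, while the closed form is a constant number of operations regardless of `n`.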
2020-08-04d: Fix PR96429: Pointer subtraction uses TRUNC_DIV_EXPRIain Buclaw2-0/+38
gcc/d/ChangeLog: PR d/96429 * expr.cc (ExprVisitor::visit (BinExp*)): Use EXACT_DIV_EXPR for pointer diff expressions. gcc/testsuite/ChangeLog: PR d/96429 * gdc.dg/pr96429.d: New test.
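EXACT_DIV_EXPR is GCC's tree code for an integer division whose numerator is known to be an exact multiple of the denominator. Pointer subtraction fits that description, as this C sketch shows; the function names are hypothetical and the code only mirrors the arithmetic, not what the D front end generates.

```c
#include <stddef.h>

/* Element distance between two pointers into the same array, computed
   by hand from the byte distance.  */
ptrdiff_t
elem_diff (const int *hi, const int *lo)
{
  ptrdiff_t bytes = (const char *) hi - (const char *) lo;
  return bytes / (ptrdiff_t) sizeof (int);
}

/* Remainder of the same division.  For two pointers to elements of one
   int array the byte distance is a multiple of sizeof (int), so this
   is zero.  */
ptrdiff_t
elem_diff_remainder (const int *hi, const int *lo)
{
  ptrdiff_t bytes = (const char *) hi - (const char *) lo;
  return bytes % (ptrdiff_t) sizeof (int);
}
```

Because the remainder is always zero, the rounding mode of the division never matters, and the compiler can choose the cheaper exact-division expansion.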
2020-08-04Change testcase for pr96325 from run to compile.Paul Thomas1-1/+1
2020-08-04 Paul Thomas <pault@gcc.gnu.org> gcc/testsuite/ PR fortran/96325 * gfortran.dg/pr96325.f90: Change from run to compile.
2020-08-04Adjust two_valued_val_range_p for irange API.Aldy Hernandez1-22/+9
gcc/ChangeLog: * vr-values.c (simplify_using_ranges::two_valued_val_range_p): Use irange API.
2020-08-04Adjust simplify_conversion_using_ranges for irange API.Aldy Hernandez1-4/+7
gcc/ChangeLog: * vr-values.c (simplify_conversion_using_ranges): Convert to irange API.
2020-08-04Use irange API in test_for_singularity.Aldy Hernandez1-5/+8
gcc/ChangeLog: * vr-values.c (test_for_singularity): Use irange API. (simplify_using_ranges::simplify_cond_using_ranges_1): Do not special case VR_RANGE.