|
When vector comparisons were forced to use vec_cond_expr, we lost a number of optimizations (my fault for not adding enough testcases to
prevent that). This patch tries to unwrap vec_cond_expr a bit so some optimizations can still happen.
I wasn't planning to add all those transformations together, but adding one caused a regression, whose fix introduced a second regression,
etc.
Restricting to constant folding would not be sufficient; we also need at
least things like X|0 or X&X. The transformations are quite
conservative with :s and folding only if everything simplifies; we may want
to relax this later. And of course we are going to miss things
like a?b:c + a?c:b -> b+c.
In terms of the number of operations, some transformations turning 2
VEC_COND_EXPRs into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not
look like a gain... I expect the BIT_NOT_EXPR disappears in most cases, and
a VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.
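As an illustration, here is a hedged C sketch (not one of the committed
testcases; the names are made up) of the kind of folding this re-enables:

  typedef int v4si __attribute__ ((vector_size (16)));

  /* Both uses of c go through VEC_COND_EXPR internally; the new
     patterns let X|0 and X&X simplify back to the comparison.  */
  v4si
  f (v4si a, v4si b)
  {
    v4si c = a < b;                       /* 0 / -1 per element */
    v4si x = c | (v4si) { 0, 0, 0, 0 };   /* X|0 -> X */
    return x & x;                         /* X&X -> X, leaving just c */
  }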
2020-08-05 Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
(op (c ? a : b)): Update to match the new transformations.
* gcc.dg/tree-ssa/andnot-2.c: New file.
* gcc.dg/tree-ssa/pr95906.c: Likewise.
* gcc.target/i386/pr70314.c: Likewise.
|
|
The stack_protect_test patterns were leaving the canary value in the
temporary register, meaning that it was often still in registers on
return from the function. An attacker might therefore have been
able to use it to defeat stack-smash protection for a later function.
gcc/
PR target/96191
* config/aarch64/aarch64.md (stack_protect_test_<mode>): Set the
CC register directly, instead of a GPR. Replace the original GPR
destination with an extra scratch register. Zero out operand 3
after use.
(stack_protect_test): Update accordingly.
gcc/testsuite/
PR target/96191
* gcc.target/aarch64/stack-protector-1.c: New test.
* gcc.target/aarch64/stack-protector-2.c: Likewise.
|
|
For LDP/STP Q, the memory operand might not be valid for "m",
so we need to use %z<N> instead of %<N> in the asm template.
This patch does that for all Ump LDP/STP patterns, regardless
of whether it's strictly needed.
This is needed to unbreak bootstrap.
2020-08-05 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.md (load_pair_sw_<SX:mode><SX2:mode>)
(load_pair_dw_<DX:mode><DX2:mode>, load_pair_dw_tftf)
(store_pair_sw_<SX:mode><SX2:mode>)
(store_pair_dw_<DX:mode><DX2:mode>, store_pair_dw_tftf)
(*load_pair_extendsidi2_aarch64)
(*load_pair_zero_extendsidi2_aarch64): Use %z for the memory operand.
* config/aarch64/aarch64-simd.md (load_pair<DREG:mode><DREG2:mode>)
(vec_store_pair<DREG:mode><DREG2:mode>, load_pair<VQ:mode><VQ2:mode>)
(vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
|
|
This refactors LIM to eschew alloc_aux_for_edges and re-uses the RPO
order of the move_computations walk for invariantness computation as well.
It also removes one unnecessary sort (retaining it as checking
code because we bsearch the vector) and moves the edge insert commit code
to the place where it doesn't have to scan all of the function's edges.
This was all done when investigating whether LIM can be refactored
to work on a specific loop for on-demand processing (but we're not
there yet).
2020-08-05 Richard Biener <rguenther@suse.de>
* tree-ssa-loop-im.c (invariantness_dom_walker): Remove.
(invariantness_dom_walker::before_dom_children): Move to ...
(compute_invariantness): ... this function.
(move_computations): Inline ...
(tree_ssa_lim): ... here, share RPO order and avoid some
cfun references.
(analyze_memory_references): Remove sorting of location
lists, instead assert they are sorted already when checking.
(prev_flag_edges): Remove.
(execute_sm_if_changed): Pass down and adjust prev edge state.
(execute_sm_exit): Likewise.
(hoist_memory_references): Likewise. Commit edge insertions
of each processed exit.
(store_motion_loop): Do not commit edge insertions on all
edges in the function.
(tree_ssa_lim_initialize): Do not call alloc_aux_for_edges.
(tree_ssa_lim_finalize): Do not call free_aux_for_edges.
|
|
Currently, whether a failure during the transform stage is fatal, or
whether the following patterns are still considered, is a bit random,
depending for example on whether the pattern is wrapped in a for.
The following makes it consistent by replacing early returns with
gotos to the end of the pattern processing.
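A minimal C sketch (not actual genmatch output; the pattern bodies are
invented) of the control-flow change in the generated code:

  #include <stdbool.h>

  static bool
  simplify (int op)
  {
    /* A failed transform step now branches to the end of the current
       pattern's code instead of returning from the whole function,
       so the next pattern is still considered.  */
    {
      if (op != 1)
        goto fail1;     /* was: return false; */
      return true;      /* pattern 1 applied */
    }
  fail1:;
    {
      if (op != 2)
        goto fail2;
      return true;      /* pattern 2 still gets its chance */
    }
  fail2:;
    return false;       /* no pattern applied */
  }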
2020-08-05 Richard Biener <rguenther@suse.de>
* genmatch.c (fail_label): New global.
(expr::gen_transform): Branch to fail_label instead of
returning. Fix indent of call argument checking.
(dt_simplify::gen_1): Compute and emit fail_label, branch
to it instead of returning early.
|
|
The number of iterations computation and the logical iteration -> actual
iterator values computation can now be done separately even on composite
constructs (though for triangular loops it would still be more efficient
to propagate a few values through; will handle that incrementally).
simd and taskloop are still unhandled.
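For reference, a hedged sketch of a combined construct with a
non-rectangular loop nest that expand_omp_for no longer rejects:

  extern void work (int, int);

  void
  f (int n)
  {
    /* Combined construct plus a non-rectangular nest: the inner
       bound depends on the outer iterator.  */
  #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < i; j++)
        work (i, j);
  }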
2020-08-05 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular
loops.
* testsuite/libgomp.c/loop-22.c: New test.
* testsuite/libgomp.c/loop-23.c: New test.
|
|
As the new testcase shows, we weren't actually performing reductions on the
host teams construct. And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests also perform initialization and checking around
the calls to the functions with the OpenMP constructs. In that testcase, all
the tests were spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking were performed
redundantly and racily in each team. Fixed by performing the initialization
and checking outside of host teams and only making the calls to the functions
with the tested constructs inside of host teams.
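A hedged, simplified sketch (hypothetical function names) of the test
restructuring described above:

  extern void init_data (void);
  extern int test_distribute (void);
  extern void check_results (void);

  int err;

  void
  run (void)
  {
    init_data ();               /* once, outside host teams */
  #pragma omp teams reduction(|:err)
    err |= test_distribute ();  /* only the tested construct per team */
    check_results ();           /* once, after the teams region */
  }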
2020-08-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even
for host teams.
* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.
|
|
For triangular loops, use the more efficient logical iteration number to
actual iterator values computation even for loops where the number of
iterations could not be computed at compile time and is only computed at
runtime.
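A minimal sketch (an assumed closed form, not GCC's generated code) of
such a logical-to-actual mapping for the triangular nest
for (i = 0; i < n; i++) for (j = 0; j < i; j++):

  #include <math.h>

  /* Rows 0..i-1 contribute i*(i-1)/2 iterations, so invert that to
     recover i from the logical iteration number L; j then follows.  */
  static void
  logical_to_actual (long L, long *i, long *j)
  {
    long t = (long) ((1.0 + sqrt (1.0 + 8.0 * (double) L)) / 2.0);
    while (t * (t - 1) / 2 > L)       /* guard against rounding */
      t--;
    while ((t + 1) * t / 2 <= L)
      t++;
    *i = t;
    *j = L - t * (t - 1) / 2;
  }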
2020-08-05 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for_init_counts): Remember
first_inner_iterations, factor and n1o from the number of iterations
computation in *fd.
(expand_omp_for_init_vars): Use more efficient logical iteration number
to actual iterator values computation even for non-rectangular loops
where number of loop iterations could not be computed at compile time.
|
|
GCC maintainers:
The following patch adds support for the vec_blendv and vec_permx
builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
with no regression errors.
The test cases were compiled on a Power 9 system and then tested on
Mambo.
Carl Love
rs6000 RFC2609 vector blend, permute instructions
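A hedged usage sketch (prototype paraphrased from the documentation the
patch adds to extend.texi; treat the exact signature as an assumption;
requires -mcpu=power10):

  #include <altivec.h>

  /* vec_blendv: pick each element from a or b according to the high
     bit of the corresponding mask element.  vec_permx additionally
     takes a 3-bit immediate.  */
  vector signed int
  blend (vector signed int a, vector signed int b,
         vector unsigned int mask)
  {
    return vec_blendv (a, b, mask);
  }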
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_blendv, vec_permx): Add define.
* config/rs6000/altivec.md (UNSPEC_XXBLEND, UNSPEC_XXPERMX): New
unspecs.
(VM3): New define_mode_iterator.
(VM3_char): New define_attr.
(xxblend_<mode> mode VM3): New define_insn.
(xxpermx): New define_expand.
(xxpermx_inst): New define_insn.
* config/rs6000/rs6000-builtin.def (VXXBLEND_V16QI, VXXBLEND_V8HI,
VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): New
BU_P10V_3 definitions.
(XXBLEND): New BU_P10_OVERLOAD_3 definition.
(XXPERMX): New BU_P10_OVERLOAD_4 definition.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin)
[P10_BUILTIN_VXXPERMX]: Add if statement.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VXXBLEND_V16QI,
P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI,
P10_BUILTIN_VXXBLEND_V2DI, P10_BUILTIN_VXXBLEND_V4SF,
P10_BUILTIN_VXXBLEND_V2DF, P10_BUILTIN_VXXPERMX): Define
overloaded arguments.
(rs6000_expand_quaternop_builtin): Add if case for CODE_FOR_xxpermx.
(builtin_quaternary_function_type): Add v16uqi_type and xxpermx_type
variables, add case statement for P10_BUILTIN_VXXPERMX.
(builtin_function_type): Add case statements for
P10_BUILTIN_VXXBLEND_V16QI, P10_BUILTIN_VXXBLEND_V8HI,
P10_BUILTIN_VXXBLEND_V4SI, P10_BUILTIN_VXXBLEND_V2DI.
* doc/extend.texi: Add documentation for vec_blendv and vec_permx.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-blend-runnable.c: New test.
* gcc.target/powerpc/vec-permute-ext-runnable.c: New test.
|
|
GCC maintainers:
The following patch adds support for the vec_splati, vec_splatid and
vec_splati_ins builtins.
This patch adds support for instructions that take a 32-bit immediate
value that represents a floating point value. This support adds new
predicates and a support function to properly handle the immediate value.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
with no regression errors.
The test case was compiled on a Power 9 system and then tested on
Mambo.
Please let me know if this patch is acceptable for the mainline
branch. Thanks.
Carl Love
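A hedged usage sketch (signatures are an assumption based on the
documentation added by the patch; requires -mcpu=power10):

  #include <altivec.h>

  /* vec_splati: splat a 32-bit immediate across the vector.  */
  vector signed int
  splat4 (void)
  {
    return vec_splati (4);
  }

  /* vec_splatid: splat a float immediate converted to double.  */
  vector double
  splat_d (void)
  {
    return vec_splatid (1.5f);
  }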
--------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_splati, vec_splatid, vec_splati_ins):
Add defines.
* config/rs6000/altivec.md (UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID,
UNSPEC_XXSPLTI32DX): New.
(vxxspltiw_v4si, vxxspltiw_v4sf_inst, vxxspltidp_v2df_inst,
vxxsplti32dx_v4si_inst, vxxsplti32dx_v4sf_inst): New define_insn.
(vxxspltiw_v4sf, vxxspltidp_v2df, vxxsplti32dx_v4si,
vxxsplti32dx_v4sf): New define_expands.
* config/rs6000/predicates.md (u1bit_cint_operand,
s32bit_cint_operand, c32bit_cint_operand): New predicates.
* config/rs6000/rs6000-builtin.def (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF,
VXXSPLTID): New definitions.
(VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF): New BU_P10V_3
definitions.
(XXSPLTIW, XXSPLTID): New definitions.
(XXSPLTI32DX): Add definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_XXSPLTIW,
P10_BUILTIN_VEC_XXSPLTID, P10_BUILTIN_VEC_XXSPLTI32DX):
New definitions.
* config/rs6000/rs6000-protos.h (rs6000_constF32toI32): New extern
declaration.
* config/rs6000/rs6000.c (rs6000_constF32toI32): New function.
* doc/extend.texi: Add documentation for vec_splati,
vec_splatid, and vec_splati_ins.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-splati-runnable.c: New test.
|
|
GCC maintainers:
The following patch adds support for the vector shift double builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and Mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline branch.
Thanks.
Carl Love
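A hedged usage sketch (signature paraphrased from the new extend.texi
text; requires -mcpu=power10):

  #include <altivec.h>

  /* vec_sldb: shift the double vector (the concatenation of a and b)
     left by a constant 0-7 bits, returning the high half;
     vec_srdb is the right-shift counterpart.  */
  vector unsigned char
  shift_pair_left (vector unsigned char a, vector unsigned char b)
  {
    return vec_sldb (a, b, 3);
  }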
-------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_sldb, vec_srdb): New defines.
* config/rs6000/altivec.md (UNSPEC_SLDB, UNSPEC_SRDB): New.
(SLDB_lr): New attribute.
(VSHIFT_DBL_LR): New iterator.
(vs<SLDB_lr>db_<mode>): New define_insn.
* config/rs6000/rs6000-builtin.def (VSLDB_V16QI, VSLDB_V8HI,
VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI,
VSRDB_V2DI): New BU_P10V_3 definitions.
(SLDB, SRDB): New BU_P10_OVERLOAD_3 definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_SLDB,
P10_BUILTIN_VEC_SRDB): New definitions.
(rs6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi,
CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di,
CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si,
CODE_FOR_vsrdb_v2di]: Add clauses.
* doc/extend.texi: Add description for vec_sldb and vec_srdb.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-shift-double-runnable.c: New test file.
|
|
The following patch adds support for builtins vec_replace_elt and
vec_replace_unaligned.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline
branch. Thanks.
Carl Love
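A hedged usage sketch (signature is an assumption based on the
documentation added by the patch; requires -mcpu=power10):

  #include <altivec.h>

  /* vec_replace_elt: replace the element at a constant index;
     vec_replace_unaligned instead replaces bytes at an arbitrary
     byte offset.  */
  vector unsigned int
  set_word2 (vector unsigned int v, unsigned int x)
  {
    return vec_replace_elt (v, x, 2);
  }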
-------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h: Add define for vec_replace_elt and
vec_replace_unaligned.
* config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New
unspecs.
(REPLACE_ELT): New mode iterator.
(REPLACE_ELT_char, REPLACE_ELT_sh, REPLACE_ELT_max): New mode attributes.
(vreplace_un_<mode>, vreplace_elt_<mode>_inst): New.
* config/rs6000/rs6000-builtin.def (VREPLACE_ELT_V4SI,
VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF, VREPLACE_ELT_UV2DI,
VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI,
VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI,
VREPLACE_UN_V2DF, REPLACE_ELT, REPLACE_UN, VREPLACE_ELT_V2DI): New builtin
entries.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_REPLACE_ELT,
P10_BUILTIN_VEC_REPLACE_UN): New builtin argument definitions.
(rs6000_expand_quaternop_builtin): Add 3rd argument checks for
CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf,
CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf.
(builtin_function_type) [P10_BUILTIN_VREPLACE_ELT_UV4SI,
P10_BUILTIN_VREPLACE_ELT_UV2DI, P10_BUILTIN_VREPLACE_UN_UV4SI,
P10_BUILTIN_VREPLACE_UN_UV2DI]: New cases.
* doc/extend.texi: Add description for vec_replace_elt and
vec_replace_unaligned builtins.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-replace-word-runnable.c: New test.
|
|
GCC maintainers:
This patch adds support for vec_insertl and vec_inserth builtins.
The patch has been compiled and tested on
powerpc64le-unknown-linux-gnu (Power 8 LE)
powerpc64le-unknown-linux-gnu (Power 9 LE)
and mambo with no regression errors.
Please let me know if this patch is acceptable for the mainline branch.
Thanks.
Carl Love
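A hedged usage sketch (signature is an assumption based on the new
extend.texi text; requires -mcpu=power10):

  #include <altivec.h>

  /* vec_insertl: insert an element at a variable position counted
     from the left; vec_inserth counts from the right.  */
  vector unsigned char
  insert_byte (unsigned char x, vector unsigned char v, unsigned int pos)
  {
    return vec_insertl (x, v, pos);
  }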
--------------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.
* config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
VINSERTVPRHR, VINSERTVPRWR): New builtins.
(INSERTL, INSERTH): New builtins.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_INSERTL,
P10_BUILTIN_VEC_INSERTH): New overloaded definitions.
(P10_BUILTIN_VINSERTGPRBL, P10_BUILTIN_VINSERTGPRHL,
P10_BUILTIN_VINSERTGPRWL, P10_BUILTIN_VINSERTGPRDL,
P10_BUILTIN_VINSERTVPRBL, P10_BUILTIN_VINSERTVPRHL,
P10_BUILTIN_VINSERTVPRWL): Add case entries.
* config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL,
UNSPEC_INSERTR.
(define_expand): Add vinsertvl_<mode>, vinsertvr_<mode>,
vinsertgl_<mode>, vinsertgr_<mode>, mode is VI2.
(define_insn): Add vinsertvl_internal_<mode>, vinsertvr_internal_<mode>,
vinsertgl_internal_<mode>, vinsertgr_internal_<mode>, mode VEC_I.
* doc/extend.texi: Add documentation for vec_insertl, vec_inserth.
gcc/testsuite/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/vec-insert-word-runnable.c: New test case.
|
|
GCC maintainers:
Move the existing vector extract support in altivec.md to vsx.md
so all of the vector insert and extract support is in the same file.
The patch also updates the name of the builtins and descriptions for the
builtins in the documentation file so they match the approved builtin
names and descriptions.
The patch does not make any functional changes.
Please let me know if the changes are acceptable for mainline. Thanks.
Carl Love
------------------------------------------------------
gcc/ChangeLog
2020-08-04 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl<mode>, vextractr<mode>)
(vextractl<mode>_internal, vextractr<mode>_internal for mode VI2)
(VI2): Move to ...
* config/rs6000/vsx.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl<mode>, vextractr<mode>)
(vextractl<mode>_internal, vextractr<mode>_internal for mode VI2)
(VI2): ... here.
* doc/extend.texi: Update documentation for vec_extractl.
Replace builtin name vec_extractr with vec_extracth. Update
description of vec_extracth.
|
|
Noticed while reviewing the RISC-V -mstack-protector-guard docs. The
AArch64 section has two identical copies of the docs for this option.
gcc/
* doc/invoke.texi (AArch64 Options): Delete duplicate
-mstack-protector-guard docs.
|
|
This patch adds support for signed and unsigned, HImode and SImode highpart
multiplications to the nvptx backend.
This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures.
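For reference, a minimal C sketch of the operation the new instructions
implement (the high half of the widened product):

  /* smulsi3_highpart: high 32 bits of the signed 64-bit product.  */
  int
  smul_highpart (int x, int y)
  {
    return (int) (((long long) x * y) >> 32);
  }

  /* umulsi3_highpart: unsigned counterpart.  */
  unsigned int
  umul_highpart (unsigned int x, unsigned int y)
  {
    return (unsigned int) (((unsigned long long) x * y) >> 32);
  }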
2020-08-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog:
* config/nvptx/nvptx.md (smulhi3_highpart, smulsi3_highpart)
(umulhi3_highpart, umulsi3_highpart): New instructions.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/mul-hi.c: New test.
* gcc.target/nvptx/umul-hi.c: New test.
|
|
In r9-4235 I tried to make sure that the template keyword follows
a nested-name-specifier. Since :: is a valid nested-name-specifier,
I also have to check 'globalscope' before giving the error.
gcc/cp/ChangeLog:
PR c++/96082
* parser.c (cp_parser_elaborated_type_specifier): Allow
'template' following ::.
gcc/testsuite/ChangeLog:
PR c++/96082
* g++.dg/template/template-keyword3.C: New test.
|
|
If we lower a constant string operation in a Binary_expression,
delete the strings. This is safe because constant strings are always
newly allocated.
This is a hack to use much less memory when compiling the new
time/tzdata package, which has a file that contains the sum of over
13,000 constant strings. We don't do this for numeric expressions
because that could cause us to delete an Iota_expression.
We should have a cleaner approach to memory usage some day.
Fixes PR go/96450
|
|
Nothing uses these since the switch to HSACOv3.
gcc/ChangeLog:
* config/gcn/gcn-run.c (R_AMDGPU_NONE): Delete.
(R_AMDGPU_ABS32_LO): Delete.
(R_AMDGPU_ABS32_HI): Delete.
(R_AMDGPU_ABS64): Delete.
(R_AMDGPU_REL32): Delete.
(R_AMDGPU_REL64): Delete.
(R_AMDGPU_ABS32): Delete.
(R_AMDGPU_GOTPCREL): Delete.
(R_AMDGPU_GOTPCREL32_LO): Delete.
(R_AMDGPU_GOTPCREL32_HI): Delete.
(R_AMDGPU_REL32_LO): Delete.
(R_AMDGPU_REL32_HI): Delete.
(reserved): Delete.
(R_AMDGPU_RELATIVE64): Delete.
|
|
Previously, compiling with -march=armv8.1-m.main would tune for
Cortex-M7.
However, the Cortex-M7 only supports up to Armv7e-M. The Cortex-M55 is
the earliest CPU that supports Armv8.1-M Mainline, so it is more appropriate.
This also has the effect of changing the branch cost function used, which
will be necessary to correctly prioritise conditional instructions over branches
in the rest of this patch series.
Regression tested on arm-none-eabi.
gcc/ChangeLog
2020-08-04 Omar Tahir <omar.tahir@arm.com>
* config/arm/arm-cpus.in (armv8.1-m.main): Tune for Cortex-M55.
|
|
gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs): Delete
redundant extra_cost variable.
|
|
I noticed that we could leak parser->num_template_parameter_lists with
erroneous specializations. We'd increment it, notice a problem and then
bail out. This refactors cp_parser_explicit_specialization to avoid
that code path. A couple of tests get different diagnostics because
of the fix. pr39425 then goes into unbounded template instantiation
and exceeds the implementation limit.
gcc/cp/
* parser.c (cp_parser_explicit_specialization): Refactor
to avoid leak of num_template_parameter_lists value.
gcc/testsuite/
* g++.dg/template/pr39425.C: Adjust errors (unbounded
template recursion).
* g++.old-deja/g++.pt/spec20.C: Remove fallout diagnostics.
|
|
All FP intrinsics have FLAG_FP set by default, but not all FP intrinsics
raise FP exceptions or read the FPCR register, so we add a global flag
FLAG_AUTO_FP to suppress the flag FLAG_FP.
2020-08-04 Zhiheng Xie <xiezhiheng@huawei.com>
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_call_properties):
Use FLOAT_MODE_P macro instead of enumerating all floating-point
modes and add global flag FLAG_AUTO_FP.
|
|
gcc/fortran/ChangeLog:
* openmp.c (resolve_omp_do): Detect not perfectly
nested loop with innermost collapse.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/collapse1.f90: Add dg-error.
* gfortran.dg/gomp/collapse2.f90: New test.
|
|
When looking at the symver attribute documentation in HTML, I found
there is no index entry to refer to it by.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
* doc/extend.texi (symver): Add @cindex for symver function attribute.
|
|
PR rtl-optimization/60473 is a code quality regression that has
been cured by improvements to register allocation. For the function
in the test case, GCC 4.4, 4.5 and 4.6 generated very poor code
requiring two mov instructions, and GCC 4.7 and 4.8 (when the PR was
filed) produced better but still poor code with one mov instruction.
Since GCC 4.9 (including current mainline), it generates optimal
code with no mov instructions, matching what used to be generated
in GCC 4.1.
2020-08-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
PR rtl-optimization/60473
* gcc.target/i386/pr60473.c: New test.
|
|
This transformation is quite straightforward: without overflow, 3*X==15 is
the same as X==5, and 3*X==5 cannot happen. Adding a single_use restriction
for the first case didn't seem necessary, although of course it can
slightly increase register pressure in some cases.
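A hedged sketch of the two cases (in the spirit of the new pr95433.c
testcase, not its exact contents):

  /* With signed overflow undefined, both fold at GIMPLE time.  */
  int f (int x) { return 3 * x == 15; }  /* becomes x == 5 */
  int g (int x) { return 3 * x == 5; }   /* becomes 0: 5 isn't a multiple of 3 */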
2020-08-04 Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/95433
* match.pd (X * C1 == C2): New transformation.
* gcc.c-torture/execute/pr23135.c: Add -fwrapv to avoid
undefined behavior.
* gcc.dg/tree-ssa/pr95433.c: New file.
|
|
gcc/ChangeLog:
* gimple-ssa-sprintf.c (get_int_range): Adjust for irange API.
(format_integer): Same.
(handle_printf_call): Same.
|
|
Adds code generation for creating a temporary for struct and array
literals and pre-filling it with zeroes before assigning, so that
alignment holes don't cause objects to produce a non-deterministic
hash value. A new field has been added to the expression visitor to
track whether the result is being generated for another literal, so
that memset() is only called once on the top-level literal expression,
and not for nested structs or arrays.
gcc/d/ChangeLog:
PR d/96153
* d-tree.h (build_expr): Add literalp argument.
* expr.cc (ExprVisitor): Add literalp_ field.
(ExprVisitor::ExprVisitor): Initialize literalp_.
(ExprVisitor::visit (AssignExp *)): Call memset() on blits where RHS
is a struct literal. Elide assignment if initializer is all zeroes.
(ExprVisitor::visit (CastExp *)): Forward literalp_ to generation of
subexpression.
(ExprVisitor::visit (AddrExp *)): Likewise.
(ExprVisitor::visit (ArrayLiteralExp *)): Use memset() to pre-fill
object with zeroes. Set literalp in subexpressions.
(ExprVisitor::visit (StructLiteralExp *)): Likewise.
(ExprVisitor::visit (TupleExp *)): Set literalp in subexpressions.
(ExprVisitor::visit (VectorExp *)): Likewise.
(ExprVisitor::visit (VectorArrayExp *)): Likewise.
(build_expr): Forward literal_p to ExprVisitor.
gcc/testsuite/ChangeLog:
PR d/96153
* gdc.dg/pr96153.d: New test.
|
|
Implement TImode shifts in the backend.
The middle-end support that does it for other architectures doesn't work for
GCN because BITS_PER_WORD==32, meaning that TImode is quad-word, not
double-word.
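A minimal C sketch of an operation that now maps to the new expander
(assuming __int128 corresponds to TImode on the target):

  /* A 128-bit (TImode) shift; on GCN this is a quad-word value
     because BITS_PER_WORD is 32.  */
  __int128
  shift_left (__int128 x, int n)
  {
    return x << n;
  }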
gcc/ChangeLog:
* config/gcn/gcn.md ("<expander>ti3"): New.
|
|
This patch preserves the source locations of each node in a member
initializer list so that during processing of the list we can set
input_location appropriately for generally more accurate diagnostic
locations. Since TREE_LIST nodes are tcc_exceptional, they can't have
source locations, so we instead store the location in a dummy
tcc_expression node within the TREE_TYPE of the list node.
gcc/cp/ChangeLog:
PR c++/94024
* init.c (sort_mem_initializers): Preserve TREE_TYPE of the
member initializer list node.
(emit_mem_initializers): Set input_location when performing each
member initialization.
* parser.c (cp_parser_mem_initializer): Attach the source
location of this initializer to a dummy EMPTY_CLASS_EXPR
within the TREE_TYPE of the list node.
* pt.c (tsubst_initializer_list): Preserve TREE_TYPE of the
member initializer list node.
gcc/testsuite/ChangeLog:
PR c++/94024
* g++.dg/diagnostic/mem-init1.C: New test.
|
|
This adds a stopgap measure to avoid performing code-hoisting
on mixed type loads when the load we'd insert in the hoisting
position would be a floating point one. This is because certain
targets (hello x87) cannot perform floating point loads without
possibly altering the bit representation, so such loads cannot be
used in place of integral loads.
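A hedged sketch (hypothetical code, not the pr88240.c testcase) of the
kind of mixed-type loads involved:

  /* The same bytes are loaded as int on one path and as float on the
     other.  Hoisting a single x87 float load to cover both paths
     could alter the bit pattern (e.g. a signaling NaN).  */
  union U { int i; float f; };

  int
  read_it (union U *p, int as_float)
  {
    if (as_float)
      return (int) p->f;   /* floating point load */
    return p->i;           /* integer load of the same memory */
  }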
2020-08-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/88240
* tree-ssa-sccvn.h (vn_reference_s::punned): New flag.
* tree-ssa-sccvn.c (vn_reference_insert): Initialize punned.
(vn_reference_insert_pieces): Likewise.
(visit_reference_op_call): Likewise.
(visit_reference_op_load): Track whether a ref was punned.
* tree-ssa-pre.c (do_hoist_insertion): Refuse to perform hoist
insertion on punned floating point loads.
* gcc.target/i386/pr88240.c: New testcase.
|
|
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Fix 'lastprivate(conditional:'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/lastprivate-conditional-3.f90: Enable some
previously disabled 'lastprivate(conditional:' dg-warnings.
|
|
This is my attempt at reviving the old patch
https://gcc.gnu.org/pipermail/gcc-patches/2019-January/514632.html
I have followed up on Kyrill's comment upstream at the link above and
am using the recommended option iii) that he mentioned:
"1) Adjust the copy_limit to 256 bits after checking
AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS in the tuning.
2) Adjust aarch64_copy_one_block_and_progress_pointers to handle
256-bit moves, using option iii):
iii) Emit explicit V4SI (or any other 128-bit vector mode) pairs of
ldp/stps. This wouldn't need any adjustments to MD patterns,
but would make aarch64_copy_one_block_and_progress_pointers
more complex as it would now have two paths, where one
handles two adjacent memory addresses in one call."
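A hedged example of a copy that can now be expanded as a single
Q-register LDP/STP pair (256 bits in total):

  /* A 32-byte block copy: with the patch this can use one ldp/stp of
     two 128-bit vector registers instead of two pairs of X-register
     loads and stores.  */
  struct block { char bytes[32]; };

  void
  copy_block (struct block *dst, const struct block *src)
  {
    *dst = *src;
  }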
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case
for E_V4SImode.
(aarch64_gen_load_pair): Likewise.
(aarch64_copy_one_block_and_progress_pointers): Handle 256 bit copy.
(aarch64_expand_cpymem): Expand copy_limit to 256bits where
appropriate.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpymem-q-reg_1.c: New test.
* gcc.target/aarch64/large_struct_copy_2.c: Update for ldp q regs.
|
|
gcc/ChangeLog
2020-07-30 Andrea Corallo <andrea.corallo@arm.com>
* config/aarch64/aarch64.md (aarch64_fjcvtzs): Add missing
clobber.
* doc/sourcebuild.texi (aarch64_fjcvtzs_hw): Document new
target-supports option.
gcc/testsuite/ChangeLog
2020-07-30 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/aarch64/acle/jcvt_2.c: New testcase.
* lib/target-supports.exp
(check_effective_target_aarch64_fjcvtzs_hw): Add new check for
FJCVTZS hw.
|
|
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an
ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle.
Fix this by adding handling of V2DI as well as V2SI mode in
nvptx_gen_shuffle.
Build and reg-tested on x86_64 with nvptx accelerator.
gcc/ChangeLog:
PR target/96428
* config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI.
libgomp/ChangeLog:
PR target/96428
* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test.
* testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
|
|
.VEC_CONVERT is a const internal call, so normally, if the lhs is not used,
we'd DCE it far before getting to veclower, but with -O0 (or perhaps
-fno-tree-dce and some other -fno-* options) it can happen.
But as the internal fn needs the lhs to know the type to which the
conversion is done (and I think that is a reasonable representation; having
some magic extra argument and having to create constants with that type
looks like overkill to me), we should just DCE those calls ourselves.
During veclower, we can't really remove insns, as the callers would be
upset, so this just replaces it with a GIMPLE_NOP.
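A hedged C sketch of how such a dead .VEC_CONVERT call can arise at -O0
(__builtin_convertvector is lowered to this internal function):

  typedef int   v4si __attribute__ ((vector_size (16)));
  typedef float v4sf __attribute__ ((vector_size (16)));

  void
  f (v4si x)
  {
    /* Result unused: at -O0 the .VEC_CONVERT call survives until
       veclower, which now replaces it with a GIMPLE_NOP.  */
    (void) __builtin_convertvector (x, v4sf);
  }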
2020-08-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96426
* tree-vect-generic.c (expand_vector_conversion): Replace .VEC_CONVERT
call with GIMPLE_NOP if there is no lhs.
* gcc.c-torture/compile/pr96426.c: New test.
|
|
In debug stmts, we are less strict about what is and what is not accepted,
so this patch just punts on optimization of a debug stmt rather than
ICEing.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
PR debug/96354
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Add IS_DEBUG
argument. Return false instead of gcc_unreachable if it is true and
get_addr_base_and_unit_offset returns NULL.
(fold_stmt_1) <case GIMPLE_DEBUG>: Adjust caller.
* g++.dg/opt/pr96354.C: New test.
|
|
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::vrp_evaluate_conditional):
Call is_gimple_min_invariant, which was dropped from the previous
patch.
|
|
For some non-rectangular (triangular) loops, compute the number of
iterations at runtime more efficiently.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for_init_counts): For triangular loops
compute number of iterations at runtime more efficiently.
(expand_omp_for_init_vars): Adjust immediate dominators.
(extract_omp_for_update_vars): Likewise.
|
|
gcc/d/ChangeLog:
PR d/96429
* expr.cc (ExprVisitor::visit (BinExp*)): Use EXACT_DIV_EXPR for
pointer diff expressions.
gcc/testsuite/ChangeLog:
PR d/96429
* gdc.dg/pr96429.d: New test.
|
|
2020-08-04 Paul Thomas <pault@gcc.gnu.org>
gcc/testsuite/
PR fortran/96325
* gfortran.dg/pr96325.f90: Change from run to compile.
|
|
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::two_valued_val_range_p):
Use irange API.
|
|
gcc/ChangeLog:
* vr-values.c (simplify_conversion_using_ranges): Convert to irange API.
|
|
gcc/ChangeLog:
* vr-values.c (test_for_singularity): Use irange API.
(simplify_using_ranges::simplify_cond_using_ranges_1): Do not
special case VR_RANGE.
|
|
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::vrp_evaluate_conditional): Adjust
for irange API.
|
|
gcc/ChangeLog:
* vr-values.c (simplify_using_ranges::op_with_boolean_value_range_p): Adjust
for irange API.
|
|
gcc/ChangeLog:
* tree-ssanames.c (get_range_info): Use irange instead of value_range.
* tree-ssanames.h (get_range_info): Same.
|
|
gcc/ChangeLog:
* fold-const.c (expr_not_equal_to): Adjust for irange API.
|