path: root/gcc
Age  Commit message  Author  Files  Lines
2021-12-08nvptx: Add test-case gcc.target/nvptx/exttrunc-1.cRoger Sayle1-0/+20
Add new test-case converting short to char and back to short. Tested on nvptx. gcc/testsuite/ChangeLog: * gcc.target/nvptx/exttrunc-1.c: New test case.
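As a rough illustration of the conversion described above (short to char and back to short), a minimal sketch follows. It is not the content of exttrunc-1.c; the function name `ext_trunc` is hypothetical, and `signed char` is assumed so that the result does not depend on the signedness of plain `char`.

```cpp
// Hypothetical sketch, not the actual test case from the commit.
// Truncates a 16-bit value to 8 bits, then sign-extends it back to 16 bits.
short ext_trunc(short x) {
  // Narrowing conversion keeps the low 8 bits (modular on GCC and Clang;
  // guaranteed modular from C++20 onwards).
  signed char c = static_cast<signed char>(x);
  // Widening conversion sign-extends the 8-bit value.
  return static_cast<short>(c);
}
```

On a target such as nvptx, a backend would be expected to implement the pair of conversions with a truncate followed by a sign extension.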
2021-12-08openmp: Improve OpenMP target support for C++ (PR92120)Chung-Lin Tang19-114/+1237
This patch implements several C++ specific mapping capabilities introduced for OpenMP 5.0, including implicit mapping of this[:1] for non-static member functions, zero-length array section mapping of pointer-typed members, lambda captured variable access in target regions, and use of lambda objects inside target regions. Several adjustments to the C/C++ front-ends to allow more member-access syntax as valid is also included. PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (finish_omp_target_clauses): Likewise. * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. (cp_parser_omp_target): Factor out code, adjust into calls to new function finish_omp_target. * pt.c (tsubst_expr): Add call to finish_omp_target_clauses for OMP_TARGET case. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. Remove case of rejecting 'this' when not in declare simd. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. Handle 'A->member' case in map clauses. Remove case of rejecting 'this' when not in declare simd. (struct omp_target_walk_data): New struct for walking over target-directive tree body. (finish_omp_target_clauses_r): New function for tree walk. (finish_omp_target_clauses): New function. (finish_omp_target): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in call to c_parser_omp_variable_list to 'true'. * c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in array base handling. (c_finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> *. Adjust struct map handling to handle cases of *A and A->B expressions. 
Under !DECL_P case of GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when handling component_ref_p case. Add unshare_expr and gimplification when created GOMP_MAP_STRUCT is not a DECL. Add code to add firstprivate pointer for *pointer-to-struct case. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/gomp/target-3.c: New testcase. * g++.dg/gomp/target-3.C: New testcase. * g++.dg/gomp/target-lambda-1.C: New testcase. * g++.dg/gomp/target-lambda-2.C: New testcase. * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. * g++.dg/gomp/target-this-5.C: New testcase. * g++.dg/gomp/this-2.C: Adjust testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. 
(gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-23.C: New testcase. * testsuite/libgomp.c++/target-lambda-1.C: New testcase. * testsuite/libgomp.c++/target-lambda-2.C: New testcase. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase. * testsuite/libgomp.c++/target-this-5.C: New testcase.
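The implicit mapping of this[:1] for non-static member functions can be pictured with a small sketch. The class and member names below are hypothetical and are not taken from the new test cases; without -fopenmp the pragma is simply ignored and the code runs on the host.

```cpp
// Hypothetical example of a target region inside a non-static member function.
struct Accumulator {
  int base;

  int add(int x) {
    int result = 0;
    // 'base' is really 'this->base'. With OpenMP 5.0 semantics, the object
    // pointed to by 'this' is implicitly mapped as this[:1], so no explicit
    // map clause for it is written here.
    #pragma omp target map(tofrom: result)
    {
      result = base + x;
    }
    return result;
  }
};
```

A call such as `Accumulator{40}.add(2)` would then be expected to give the same answer whether the region runs on the host or on an offload device.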
2021-12-08dwarf: Multi-register CFI address support.Andrew Stubbs3-70/+284
Add support for architectures such as AMD GCN, in which the pointer size is larger than the register size. This allows the CFI information to include multi-register locations for the stack pointer, frame pointer, and return address. This patch was originally posted by Andrew Stubbs in https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552873.html It has now been re-worked according to the review comments. It does not use DW_OP_piece or DW_OP_LLVM_piece_end. Instead it uses DW_OP_bregx/DW_OP_shl/DW_OP_bregx/DW_OP_plus to build the CFA from multiple consecutive registers. Here is how .debug_frame looks before and after this patch: $ cat factorial.c int factorial(int n) { if (n == 0) return 1; return n * factorial (n - 1); } $ amdgcn-amdhsa-gcc -g factorial.c -O0 -c -o fac.o $ llvm-dwarfdump -debug-frame fac.o *** without this patch (edited for brevity)*** 00000000 00000014 ffffffff CIE DW_CFA_def_cfa: reg48 +0 DW_CFA_register: reg16 reg50 00000018 0000002c 00000000 FDE cie=00000000 pc=00000000...000001ac DW_CFA_advance_loc4: 96 DW_CFA_offset: reg46 0 DW_CFA_offset: reg47 4 DW_CFA_offset: reg50 8 DW_CFA_offset: reg51 12 DW_CFA_offset: reg16 8 DW_CFA_advance_loc4: 4 DW_CFA_def_cfa_sf: reg46 -16 *** with this patch (edited for brevity)*** 00000000 00000024 ffffffff CIE DW_CFA_def_cfa_expression: DW_OP_bregx SGPR49+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR48+0, DW_OP_plus DW_CFA_expression: reg16 DW_OP_bregx SGPR51+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR50+0, DW_OP_plus 00000028 0000003c 00000000 FDE cie=00000000 pc=00000000...000001ac DW_CFA_advance_loc4: 96 DW_CFA_offset: reg46 0 DW_CFA_offset: reg47 4 DW_CFA_offset: reg50 8 DW_CFA_offset: reg51 12 DW_CFA_offset: reg16 8 DW_CFA_advance_loc4: 4 DW_CFA_def_cfa_expression: DW_OP_bregx SGPR47+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR46+0, DW_OP_plus, DW_OP_lit16, DW_OP_minus gcc/ChangeLog: * dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg. 
(dw_frame_pointer_regnum): Likewise. (new_cfi_row): Use set_by_dwreg. (get_cfa_from_loc_descr): Use set_by_dwreg. Support register spans. handle DW_OP_bregx with DW_OP_breg{0-31}. Support DW_OP_lit*, DW_OP_const*, DW_OP_minus, DW_OP_shl and DW_OP_plus. (lookup_cfa_1): Use set_by_dwreg. (def_cfa_0): Update for cfa_reg and support register spans. (reg_save): Change sreg parameter to struct cfa_reg. Support register spans. (dwf_cfa_reg): New function. (dwarf2out_flush_queued_reg_saves): Use dwf_cfa_reg instead of dwf_regno. (dwarf2out_frame_debug_def_cfa): Likewise. (dwarf2out_frame_debug_adjust_cfa): Likewise. (dwarf2out_frame_debug_cfa_offset): Likewise. Update reg_save usage. (dwarf2out_frame_debug_cfa_register): Likewise. (dwarf2out_frame_debug_expr): Likewise. (create_pseudo_cfg): Use set_by_dwreg. (initial_return_save): Use set_by_dwreg and dwf_cfa_reg, (create_cie_data): Use dwf_cfa_reg. (execute_dwarf2_frame): Use dwf_cfa_reg. (dump_cfi_row): Use set_by_dwreg. * dwarf2out.c (build_span_loc, build_breg_loc): New function. (build_cfa_loc): Support register spans. (build_cfa_aligned_loc): Update cfa_reg usage. (convert_cfa_to_fb_loc_list): Use set_by_dwreg. * dwarf2out.h (struct cfa_reg): New type. (struct dw_cfa_location): Use struct cfa_reg. (build_span_loc): New prototype. co-authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>
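The DW_OP_bregx/DW_OP_shl/DW_OP_bregx/DW_OP_plus sequence shown in the dump above amounts to simple arithmetic on two 32-bit registers. The following sketch mirrors that sequence step by step; the function names are hypothetical and this is not code from dwarf2out.c.

```cpp
#include <cstdint>

// Hypothetical model of:
//   DW_OP_bregx hi+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx lo+0, DW_OP_plus
// where 'hi' and 'lo' are the values of two consecutive 32-bit registers.
std::uint64_t cfa_from_register_pair(std::uint32_t hi, std::uint32_t lo) {
  std::uint64_t value = hi;  // DW_OP_bregx hi+0: push the upper half
  value <<= 0x20;            // DW_OP_const1u 0x20, DW_OP_shl: shift left by 32
  value += lo;               // DW_OP_bregx lo+0, DW_OP_plus: add the lower half
  return value;
}

// Models the trailing "DW_OP_lit16, DW_OP_minus" of the FDE expression.
std::uint64_t cfa_minus_offset(std::uint32_t hi, std::uint32_t lo,
                               std::uint64_t offset) {
  return cfa_from_register_pair(hi, lo) - offset;
}
```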
2021-12-08Add combine splitter to transform vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0Haochen Jiang2-0/+46
gcc/ChangeLog: PR target/100738 * config/i386/sse.md (*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_not_ltint): Add new define_insn_and_split. gcc/testsuite/ChangeLog: PR target/100738 * g++.target/i386/pr100738-1.C: New test.
2021-12-07[PR103149] detach values through mem only if general regs won't doAlexandre Oliva2-6/+75
When hardening compares or conditional branches, we perform redundant tests, and to prevent them from being optimized out, we use asm statements that preserve a value used in a compare, but in a way that the compiler can no longer assume it's the same value, so it can't optimize the redundant test away. We used to use +g, but that requires general regs or mem. You might think that, if a reg constraint can't be satisfied, the register allocator will fall back to memory, but that's not so: we decide on matching MEMs very early on, by using the same addressable operand on both input and output, and only if the constraint does not allow registers. If it does, we use gimple registers and then pseudos as inputs and outputs, and then inputs can be substituted by equivalent expressions, and then, if no register constraint fits (e.g. because that mode won't fit in general regs, or won't fit in regs at all), the register allocator will give up before even trying to allocate some temporary memory to unify input and output. This patch arranges for us to create and use the temporary stack slot if we can tell the mode requires memory, or won't otherwise fit in general regs, and thus to use +m for that asm. for gcc/ChangeLog PR middle-end/103149 * gimple-harden-conditionals.cc (detach_value): Use memory if general regs won't do. for gcc/testsuite/ChangeLog PR middle-end/103149 * gcc.target/aarch64/pr103149.c: New.
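The idea of detaching a value with an empty asm can be sketched at source level. This is only an analogy: the actual pass works on GIMPLE inside the compiler, and the helper names below are hypothetical. The sketch relies on GNU-style inline asm and uses the "+g" constraint that the message says was used before this patch.

```cpp
// Hypothetical source-level analogy of the hardening described above.
// The empty asm claims to read and write 'v', so the compiler can no longer
// assume the returned value equals the original one.
int detach_value(int v) {
  __asm__ __volatile__("" : "+g"(v));  // "+g": general register or memory
  return v;
}

// Performs the comparison twice; the second, redundant test uses detached
// copies so it cannot be folded into the first.
bool hardened_less(int a, int b) {
  bool first = a < b;
  bool second = detach_value(a) < detach_value(b);
  if (first != second)
    __builtin_trap();  // the two results disagree: something went wrong
  return first;
}
```

For a mode that does not fit in general registers, the patch instead goes through a temporary stack slot and a "+m" constraint.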
2021-12-08Daily bump.GCC Administrator5-1/+132
2021-12-07Fortran: perform array subscript checks only for valid INTEGER boundsHarald Anlauf2-0/+16
gcc/fortran/ChangeLog: PR fortran/103607 * frontend-passes.c (do_subscript): Ensure that array bounds are of type INTEGER before performing checks on array subscripts. gcc/testsuite/ChangeLog: PR fortran/103607 * gfortran.dg/pr103607.f90: New test.
2021-12-07c++: Fix decltype-bitfield1.C on i?86Marek Polacek1-24/+24
This test was failing on i?86 because of: warning: width of 'A::l' exceeds its type so change the type to 'long long' and make the test run only on arches where sizeof(long long) == 8 to avoid failing like this again. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype-bitfield1.C: Change a type to unsigned long long. Only run on longlong64 targets.
2021-12-07testsuite: Fix check_effective_target_rop_ok [PR103556, PR103586]Peter Bergner1-2/+1
The new rop_ok effective target test doesn't correctly compute its expression result because a new line starts a new statement. Solution is to remove the new line. 2021-12-07 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ PR testsuite/103556 PR testsuite/103586 * lib/target-supports.exp (check_effective_target_rop_ok): Remove '\n'.
2021-12-07Fortran: catch failed simplification of bad stride expressionHarald Anlauf2-5/+12
gcc/fortran/ChangeLog: PR fortran/103588 * array.c (gfc_ref_dimen_size): Do not generate internal error on failed simplification of stride expression; just return failure. gcc/testsuite/ChangeLog: PR fortran/103588 * gfortran.dg/pr103588.f90: New test.
2021-12-07Fortran: add check for type of upper bound in case rangeHarald Anlauf2-0/+19
gcc/fortran/ChangeLog: PR fortran/103591 * match.c (match_case_selector): Check type of upper bound in case range. gcc/testsuite/ChangeLog: PR fortran/103591 * gfortran.dg/select_9.f90: New test.
2021-12-07Fix --help -Q outputMartin Liska4-11/+18
PR middle-end/103438 gcc/ChangeLog: * config/s390/s390.c (s390_valid_target_attribute_inner_p): Use new enum CLVC_INTEGER. * opt-functions.awk: Use new CLVC_INTEGER. * opts-common.c (set_option): Likewise. (option_enabled): Return -1,0,1 for CLVC_INTEGER. (get_option_state): Use new CLVC_INTEGER. (control_warning_option): Likewise. * opts.h (enum cl_var_type): Likewise.
2021-12-07c++: Fix for decltype and bit-fields [PR95009]Marek Polacek3-3/+94
Here, decltype deduces the wrong type for certain expressions involving bit-fields. Unlike in C, in C++ bit-field width is explicitly not part of the type, so I think decltype should never deduce to 'int:N'. The problem isn't that we're not calling unlowered_expr_type--we are--it's that is_bitfield_expr_with_lowered_type only handles certain codes, but not others. For example, += works fine but ++ does not. This also fixes decltype-bitfield2.C where we were crashing (!), but unfortunately it does not fix 84516 or 70733 where the problem is likely a missing call to unlowered_expr_type. It occurs to me now that typeof likely has had the same issue, but this patch should fix that too. PR c++/95009 gcc/cp/ChangeLog: * typeck.c (is_bitfield_expr_with_lowered_type) <case MODIFY_EXPR>: Handle UNARY_PLUS_EXPR, NEGATE_EXPR, NON_LVALUE_EXPR, BIT_NOT_EXPR, P*CREMENT_EXPR too. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype-bitfield1.C: New test. * g++.dg/cpp0x/decltype-bitfield2.C: New test.
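The expected deductions can be written down as compile-time checks. This is a sketch of what the C++ rules require for an int bit-field, not a copy of decltype-bitfield1.C; the struct and variable names are hypothetical.

```cpp
#include <type_traits>

// Hypothetical bit-field holder; the width must never leak into decltype.
struct A {
  int i : 3;
};

A a;  // only used in unevaluated operands below

// Unary operators yield prvalues of the promoted type, here plain int.
static_assert(std::is_same<decltype(+a.i), int>::value, "unary plus");
static_assert(std::is_same<decltype(-a.i), int>::value, "negate");
static_assert(std::is_same<decltype(~a.i), int>::value, "bitwise not");

// Compound assignment and prefix increment yield lvalues of type int.
static_assert(std::is_same<decltype(a.i += 1), int&>::value, "+=");
static_assert(std::is_same<decltype(++a.i), int&>::value, "prefix ++");

// Postfix increment yields a prvalue of type int.
static_assert(std::is_same<decltype(a.i++), int>::value, "postfix ++");
```

According to the message, the += case already worked before the patch while the ++ case did not.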
2021-12-07x86: Check FUNCTION_DECL before calling cgraph_node::getH.J. Lu2-1/+16
gcc/ PR target/103594 * config/i386/i386.c (ix86_call_use_plt_p): Check FUNCTION_DECL before calling cgraph_node::get. gcc/testsuite/ PR target/103594 * gcc.dg/pr103594.c: New test.
2021-12-07tree-optimization/103596 - fix missed propagation into switchesRichard Biener4-40/+62
may_propagate_copy unnecessarily restricts propagating non-abnormals into places that currently contain an abnormal SSA name but are not the PHI argument for an abnormal edge. This causes VN to not elide a CFG path that it assumes is elided, resulting in released SSA names in the IL. The fix is to enhance the may_propagate_copy API to specify the destination is _not_ a PHI argument. I chose to not update only the relevant caller in VN and the may_propagate_copy_into_stmt API at this point because this is a regression and needs backporting. 2021-12-07 Richard Biener <rguenther@suse.de> PR tree-optimization/103596 * tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt): Note we are not propagating into a PHI argument to may_propagate_copy. * tree-ssa-propagate.h (may_propagate_copy): Add argument specifying whether we propagate into a PHI arg. * tree-ssa-propagate.c (may_propagate_copy): Likewise. When not doing so we can replace an abnormal with something else. (may_propagate_into_stmt): Update may_propagate_copy calls. (replace_exp_1): Move propagation checking code to propagate_value and rename to ... (replace_exp): ... this and elide previous wrapper. (propagate_value): Perform checking with adjusted may_propagate_copy call and dispatch to replace_exp. * gcc.dg/torture/pr103596.c: New testcase.
2021-12-07Fix hash_map::traverse overloadMatthias Kretz2-3/+5
The hash_map::traverse overload taking a non-const Value pointer breaks if the callback returns false. The other overload should behave the same. Signed-off-by: Matthias Kretz <m.kretz@gsi.de> gcc/ChangeLog: * hash-map.h (hash_map::traverse): Let both overloads behave the same. * predict.c (assert_is_empty): Return true, thus not changing behavior.
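The callback contract described above, where returning false stops the walk, can be sketched with a standard container. This is a hypothetical stand-in built on std::unordered_map, not GCC's hash_map implementation.

```cpp
#include <unordered_map>

// Hypothetical traverse helper: calls 'callback(key, &value)' for each entry
// and stops as soon as the callback returns false.
template <typename Map, typename Callback>
void traverse(Map &map, Callback callback) {
  for (auto &entry : map) {
    if (!callback(entry.first, &entry.second))
      break;
  }
}
```

Under this contract, a callback that wants to visit every entry has to return true, which matches the adjustment made to assert_is_empty in predict.c.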
2021-12-07MIPS: R6: load/store can process unaligned addressYunQiang Su7-1/+136
MIPS release 6 requires that lw/ld/sw/sd work with unaligned addresses, although this may be implemented either fully in hardware or by trap-and-emulate. Since it doesn't have to be fully done by hardware, we add a pair of options -m(no-)unaligned-access. Kernels may need them. gcc/ChangeLog: * config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS, STRICT_ALIGNMENT): R6 supports unaligned access. * config/mips/mips.md (movmisalign<mode>): Likewise. * config/mips/mips.opt: Add -m(no-)unaligned-access. * doc/invoke.texi: Likewise. gcc/testsuite/ChangeLog: * gcc.target/mips/mips.exp: Add unaligned-access. * gcc.target/mips/unaligned-2.c: New test. * gcc.target/mips/unaligned-3.c: New test.
2021-12-06Improve AutoFDO count propagation algorithmEugene Rozenfeld2-2/+61
When a basic block A has been annotated with a count and it has only one successor (or predecessor) B, we can propagate A's count to B. The algorithm without this change could leave B without an annotation if B had other unannotated predecessors (or successors). For example, in the test case I added, the loop header block was left unannotated, which prevented loop unrolling. gcc/ChangeLog: * auto-profile.c (afdo_propagate_edge): Improve count propagation algorithm. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/init-array.c: New test for unrolling inner loops.
2021-12-07Daily bump.GCC Administrator4-1/+109
2021-12-06analyzer: fix equivalence class state purging [PR103533]David Malcolm2-2/+149
Whilst debugging state explosions seen when enabling taint detection with -fanalyzer (PR analyzer/103533), I noticed that constraint manager instances could contain stray, redundant constants, such as this instance: constraint_manager: equiv classes: ec0: {(int)0 == [m_constant]‘0’} ec1: {(size_t)4 == [m_constant]‘4’} constraints: where there are two equivalence classes, each just containing a constant, with no constraints using them. This patch makes constraint_manager::canonicalize more aggressive about purging state, handling the case of purging a redundant EC containing just a constant. gcc/analyzer/ChangeLog: PR analyzer/103533 * constraint-manager.cc (equiv_class::contains_non_constant_p): New. (constraint_manager::canonicalize): Call it when determining redundant ECs. (selftest::test_purging): New selftest. (selftest::run_constraint_manager_tests): Likewise. * constraint-manager.h (equiv_class::contains_non_constant_p): New decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-12-06rs6000: Fix errant "vector" instead of "__vector"Paul A. Clarke1-1/+1
Fixes 85289ba36c2e62de84cc0232c954d9a74bda708a. 2021-12-06 Paul A. Clarke <pc@us.ibm.com> gcc PR target/103545 * config/rs6000/xmmintrin.h (_mm_movemask_ps): Replace "vector" with "__vector".
2021-12-06bpf: mark/remove unused arguments and remove an unused functionJose E. Marchesi1-20/+5
This patch does a little bit of cleanup by removing some unused arguments, or marking them as unused. It also removes the function ctfc_debuginfo_early_finish_p and the corresponding hook macro definition, which are not used by GCC. gcc/ * config/bpf/bpf.c (bpf_handle_preserve_access_index_attribute): Mark arguments `args' and `flags' as unused. (bpf_core_newdecl): Remove unused local `newdecl'. (bpf_core_newdecl): Remove unused argument `loc'. (ctfc_debuginfo_early_finish_p): Remove unused function. (TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P): Remove definition. (bpf_core_walk): Do not pass a location to bpf_core_newdecl.
2021-12-06ranger: Add shortcuts for single-successor blocksRichard Sandiford2-0/+6
When compiling an optabs.ii at -O2 with a release-checking build, there were 6,643,575 calls to gimple_outgoing_range_stmt_p. 96.8% of them were for blocks with a single successor, which never have a control statement that generates new range info. This patch therefore adds a shortcut for that case. This gives a ~1% compile-time improvement for the test. I tried making the function inline (in the header) so that the single_succ_p didn't need to be repeated, but it seemed to make things slightly worse. gcc/ * gimple-range-edge.cc (gimple_outgoing_range::edge_range_p): Add a shortcut for blocks with single successors. * gimple-range-gori.cc (gori_map::calculate_gori): Likewise.
2021-12-06ranger: Optimise irange_unionRichard Sandiford1-33/+13
When compiling an optabs.ii at -O2 with a release-checking build, the hottest function in the profile was irange_union. This patch tries to optimise it a bit. The specific changes are: - Use quick_push rather than safe_push, since the final number of entries is known in advance. - Avoid assigning wi::to_wide & co. to a temporary wide_int, such as in: wide_int val_j = wi::to_wide (res[j]); wi::to_wide returns a wide_int "view" of the in-place INTEGER_CST storage. Assigning the result to wide_int forces an unnecessary copy to temporary storage. This is one area where "auto" helps a lot. In the end though, it seemed more readable to inline the wi::to_*s rather than use auto. - Use to_widest_int rather than to_wide_int. Both are functionally correct, but to_widest_int is more efficient, for three reasons: - to_wide returns a wide-int representation in which the most significant element might not be canonically sign-extended. This is because we want to allow the storage of an INTEGER_CST like 0x1U << 31 to be accessed directly with both a wide_int view (where only 32 bits matter) and a widest_int view (where many more bits matter, and where the 32 bits are zero-extended to match the unsigned type). However, operating on uncanonicalised wide_int forms is less efficient than operating on canonicalised forms. - to_widest_int has a constant rather than variable precision and there are never any redundant upper bits to worry about. - Using widest_int avoids the need for an overflow check, since there is enough precision to add 1 to any IL constant without wrap-around. This gives a ~2% compile-time speed up with the test above. I also tried adding a path for two single-pair ranges, but it wasn't a win. gcc/ * value-range.cc (irange::irange_union): Use quick_push rather than safe_push. Use widest_int rather than wide_int. Avoid assigning wi::to_* results to wide*_int temporaries.
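The point about avoiding an overflow check by using a wider type can be illustrated with fixed-width integers. This is only an analogy for widest_int, with a hypothetical helper name; it is not code from value-range.cc.

```cpp
#include <cstdint>

// Hypothetical analogy: decide whether a range ending at 'hi1' touches or
// overlaps a range starting at 'lo2', i.e. whether hi1 + 1 >= lo2.
// Computing hi1 + 1 in 32 bits would overflow for INT32_MAX, so the addition
// is done in 64 bits, where it cannot wrap and needs no overflow check.
bool adjacent_or_overlapping(std::int32_t hi1, std::int32_t lo2) {
  return std::int64_t{hi1} + 1 >= std::int64_t{lo2};
}
```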
2021-12-06Use dominators to reduce cache-filling.Andrew MacLeod2-0/+74
Before walking the CFG and filling all cache entries, check if the same information is available in a dominator. * gimple-range-cache.cc (ranger_cache::fill_block_cache): Check for a range from dominators before filling the cache. (ranger_cache::range_from_dom): New. * gimple-range-cache.h (ranger_cache::range_from_dom): Add prototype.
2021-12-06Add BB option for outgoing_edge_range_p and may_recompute_p.Andrew MacLeod2-29/+51
There are times we only need to know if any edge from a block can calculate a range. * gimple-range-gori.h (class gori_compute): Add prototypes. * gimple-range-gori.cc (gori_compute::has_edge_range_p): Add alternate API for basic block. Call for edge alternative. (gori_compute::may_recompute_p): Ditto.
2021-12-06tree-optimization/103581 - fix masked gather on x86Richard Biener2-2/+61
The recent fix to PR103527 exposed an issue with how the various special casing for AVX512 masks in vect_build_gather_load_calls are handled. The following makes that more obvious, fixing the miscompile of 403.gcc. 2021-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/103581 * tree-vect-stmts.c (vect_build_gather_load_calls): Properly guard all the AVX512 mask cases. * gcc.dg/vect/pr103581.c: New testcase.
2021-12-06tree-optimization/103544 - SLP reduction chain as SLP reduction issueRichard Biener2-3/+33
When SLP reduction chain vectorization support added handling of an outer conversion in the chain, picking a failed reduction up as an SLP reduction broke the invariant that the whole reduction was forward reachable. The following plugs that hole, noting a future enhancement possibility. 2021-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/103544 * tree-vect-slp.c (vect_analyze_slp): Only add a SLP reduction opportunity if the stmt in question is the reduction root. (dot_slp_tree): Add missing check for NULL child. * gcc.dg/vect/pr103544.c: New testcase.
2021-12-06avr: Fix AVR build [PR71934]Jakub Jelinek1-2/+2
On Mon, Dec 06, 2021 at 11:00:30AM +0100, Martin Liška wrote: > Jakub, I think the patch broke avr-linux target: > > g++ -fno-PIE -c -g -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-erro > /home/marxin/Programming/gcc/gcc/config/avr/avr.c: In function ‘void avr_output_data_section_asm_op(const void*)’: > /home/marxin/Programming/gcc/gcc/config/avr/avr.c:10097:26: error: invalid conversion from ‘const void*’ to ‘const char*’ [-fpermissive] This patch fixes that. 2021-12-06 Jakub Jelinek <jakub@redhat.com> PR pch/71934 * config/avr/avr.c (avr_output_data_section_asm_op, avr_output_bss_section_asm_op): Change argument type from const void * to const char *.
2021-12-06cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]Tamar Christina2-1/+38
CSE uses equivalence classes to keep track of expressions that all have the same values at the current point in the program. Normal equivalences through SETs only insert and perform lookups in this set but equivalence determined from comparisons, e.g. (insn 46 44 47 7 (set (reg:CCZ 17 flags) (compare:CCZ (reg:SI 105 [ iD.2893 ]) (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1} (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ]) (nil))) creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]). This causes a merge to happen between the two equivalence sets denoted by (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively. The operation happens through merge_equiv_classes; however, this function has an invariant that the classes to be merged do not contain any duplicates. This is because it frees entries before merging. The given testcase, when using the supplied flags, triggers an ICE due to the equivalence set being (rr) p dump_class (class1) Equivalence chain for (reg:SI 105 [ iD.2893 ]): (reg:SI 105 [ iD.2893 ]) $3 = void (rr) p dump_class (class2) Equivalence chain for (const_int 0 [0]): (const_int 0 [0]) (reg:SI 97 [ _10 ]) (reg:SI 97 [ _10 ]) $4 = void This happens because the original INSN being recorded is (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0) (const_vector:V1SI [ (const_int 0 [0]) ])) "cse.c":11:9 1363 {*movv1si_internal} (expr_list:REG_UNUSED (reg:SI 97 [ _10 ]) (nil))) and we end up generating two equivalences. The first one is simply that reg:SI 97 is 0. The second one is that 0 can be extracted from the V1SI, so subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0. This nested subreg gets folded away to just reg:SI 97 and we re-insert the same equivalence. This patch changes it so that if the nunits of a subreg is 1 then we don't generate a vec_select from the subreg, as the subreg will be folded away and we get a dup. 
gcc/ChangeLog: PR rtl-optimization/103404 * cse.c (find_sets_in_insn): Don't select elements out of a V1 mode subreg. gcc/testsuite/ChangeLog: PR rtl-optimization/103404 * gcc.target/i386/pr103404.c: New test.
2021-12-06Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class.liuhongt3-2/+38
When moves between integer and sse registers are cheap. 2021-12-06 Hongtao Liu <Hongtao.liu@intel.com> Uroš Bizjak <ubizjak@gmail.com> gcc/ChangeLog: PR target/95740 * config/i386/i386.c (ix86_preferred_reload_class): Allow integer regs when moves between register units are cheap. * config/i386/i386.h (INT_SSE_CLASS_P): New. gcc/testsuite/ChangeLog: * gcc.target/i386/pr95740.c: New test.
2021-12-06Daily bump.GCC Administrator3-1/+14
2021-12-05Objective-C, NeXT: Reorganise meta-data declarations.Iain Sandoe4-23/+6
This moves the GTY declaration of the meta-data identifier array into the header that enumerates these and provides shorthand defines for them. This avoids a problem seen with a relocatable PCH implementation. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/objc/ChangeLog: * objc-next-metadata-tags.h (objc_rt_trees): Declare here. * objc-next-runtime-abi-01.c: Remove from here. * objc-next-runtime-abi-02.c: Likewise. * objc-runtime-shared-support.c: Reorder headers, provide a GTY declaration the definition of objc_rt_trees.
2021-12-04aix: Move AIX math builtins before new builtin machinery.David Edelsohn1-23/+23
The new builtin machinery has an early exit, so move the AIX-specific builtins before the new machinery. gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_init_builtins): Move AIX math builtin initialization before new_builtins_are_live.
2021-12-05Daily bump.GCC Administrator5-1/+81
2021-12-04c++: Add fixed test [PR93614]Marek Polacek1-0/+17
This was fixed by r11-86. PR c++/93614 gcc/testsuite/ChangeLog: * g++.dg/template/lookup18.C: New test.
2021-12-04Fortran/OpenMP: Support most of 5.1 atomic extensionsTobias Burnus20-261/+1248
Implements most of the OpenMP 5.1 atomic extensions, except that 'compare' is parsed but rejected during resolution. (As the trans-openmp.c handling is missing.) gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle weak/compare/fail clause. * gfortran.h (gfc_omp_clauses): Add weak, compare, fail. * openmp.c (enum omp_mask1, gfc_match_omp_clauses, OMP_ATOMIC_CLAUSES): Update for new clauses. (gfc_match_omp_atomic): Update for 5.1 atomic changes. (is_conversion): Support widening in one go. (is_scalar_intrinsic_expr): New. (resolve_omp_atomic): Update for 5.1 atomic changes. * parse.c (parse_omp_oacc_atomic): Update for compare. * resolve.c (gfc_resolve_blocks): Update asserts. * trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/atomic-2.f90: Move now supported code to ... * gfortran.dg/gomp/atomic.f90: here. * gfortran.dg/gomp/atomic-10.f90: New test. * gfortran.dg/gomp/atomic-12.f90: New test. * gfortran.dg/gomp/atomic-15.f90: New test. * gfortran.dg/gomp/atomic-16.f90: New test. * gfortran.dg/gomp/atomic-17.f90: New test. * gfortran.dg/gomp/atomic-18.f90: New test. * gfortran.dg/gomp/atomic-19.f90: New test. * gfortran.dg/gomp/atomic-20.f90: New test. * gfortran.dg/gomp/atomic-22.f90: New test. * gfortran.dg/gomp/atomic-24.f90: New test. * gfortran.dg/gomp/atomic-25.f90: New test. * gfortran.dg/gomp/atomic-26.f90: New test. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Update status.
2021-12-04i386, ipa-modref: Comment spelling fixJakub Jelinek2-3/+3
This patch fixes spelling of prefer (misspelled as preffer). 2021-12-04 Jakub Jelinek <jakub@redhat.com> * config/i386/x86-tune.def (X86_TUNE_PARTIAL_REG_DEPENDENCY): Fix comment typo, Preffer -> prefer. * ipa-modref-tree.c (modref_access_node::closer_pair_p): Likewise.
2021-12-04c++: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4Jakub Jelinek7-2/+411
P1272R4, alongside the new std::byteswap, added a clarification for std::bit_cast that seems quite unrelated to me. The patch treats it as a DR, applying to all languages. We no longer diagnose if padding bits are stored into unsigned char or std::byte result, fields or bitfields, instead arrange for that result, those fields or bitfields to get indeterminate value (empty CONSTRUCTOR with CONSTRUCTOR_NO_ZEROING or just leaving the member's initializer out and setting CONSTRUCTOR_NO_ZEROING on parent). We still have a bug that we don't diagnose in lots of places lvalue-to-rvalue conversions of indeterminate values or class objects with some indeterminate members. 2021-12-04 Jakub Jelinek <jakub@redhat.com> * cp-tree.h (is_byte_access_type_not_plain_char): Declare. * tree.c (is_byte_access_type_not_plain_char): New function. * constexpr.c (clear_uchar_or_std_byte_in_mask): New function. (cxx_eval_bit_cast): Don't error about padding bits if target type is unsigned char or std::byte, instead return no clearing ctor. Use clear_uchar_or_std_byte_in_mask. * g++.dg/cpp2a/bit-cast11.C: New test. * g++.dg/cpp2a/bit-cast12.C: New test. * g++.dg/cpp2a/bit-cast13.C: New test. * g++.dg/cpp2a/bit-cast14.C: New test.
2021-12-04libcpp: Fix up handling of deferred pragmas [PR102432]Jakub Jelinek2-0/+46
The https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557903.html change broke the following testcases. The problem is when a pragma namespace allows expansion (i.e. p->is_nspace && p->allow_expansion), e.g. the omp or acc namespaces do, then when parsing the second pragma token we do it with pfile->state.in_directive set, pfile->state.prevent_expansion clear and pfile->state.in_deferred_pragma clear (the last one because we don't know yet if it will be a deferred pragma or not). If the pragma line only contains a single name and newline after it, and there exists a function-like macro with the same name, the preprocessor needs to peek in funlike_invocation_p the next token whether it isn't ( but in this case it will see a newline. As pfile->state.in_directive is set, we don't read anything after the newline, pfile->buffer->need_line is set and CPP_EOF is lexed, which funlike_invocation_p doesn't push back. Because name is a function-like macro and on the pragma line there is no ( after the name, it isn't expanded, and control flow returns to do_pragma. If name is valid deferred pragma, we set pfile->state.in_deferred_pragma (and really need it set so that e.g. end_directive later on doesn't eat all the tokens from the pragma line). Before Nathan's change (which unfortunately didn't contain rationale on why it is better to do it like that), this wasn't a problem, next _cpp_lex_direct called when we want next token would return CPP_PRAGMA_EOF when it saw buffer->need_line, which would turn off pfile->state.in_deferred_pragma and following get token would already read the next line. But Nathan's patch replaced it with an assertion failure that now triggers and CPP_PRAGMA_EOL is done only when lexing the '\n'. Except for this special case that works fine, but in this case it doesn't because when peeking the token we still didn't know that it will be a deferred pragma. 
I've tried to fix that up in do_pragma by detecting this and pushing CPP_PRAGMA_EOL as lookahead, but that doesn't work because end_directive still needs to see pfile->state.in_deferred_pragma set. So this patch effectively reverts part of Nathan's change: the CPP_PRAGMA_EOL addition is no longer done only when parsing the '\n', but in both places, replacing the assertion failure in the first one. 2021-12-04 Jakub Jelinek <jakub@redhat.com> PR preprocessor/102432 * lex.c (_cpp_lex_direct): If buffer->need_line while pfile->state.in_deferred_pragma, return CPP_PRAGMA_EOL token instead of assertion failure. * c-c++-common/gomp/pr102432.c: New test. * c-c++-common/goacc/pr102432.c: New test.
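A minimal sketch of the kind of input described above, not the actual pr102432.c testcase: a function-like macro whose name matches an OpenMP directive, followed by a pragma line that contains only that name. The macro name `barrier` and the function `answer` are illustrative assumptions. Since no ( follows the name on the pragma line, the macro is not expanded there, while the later call-style use expands normally.

```cpp
// Hypothetical reproducer shape (an assumption, not the committed test).
// Intended for compilation with -fopenmp; without it the pragma is ignored.

// Function-like macro that shares its name with an OpenMP directive.
#define barrier(x) (x)

int answer() {
  // Only the bare name follows "omp" here, so the preprocessor peeks past it,
  // finds the end of the line rather than '(', and leaves the name unexpanded.
#pragma omp barrier
  // With an argument list the macro expands as usual.
  return barrier(42);
}
```

On a compiler that has the fix, this should build cleanly both with and without OpenMP enabled.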
2021-12-04[PR103028] test ifcvt trap_if seq more strictly after reloadAlexandre Oliva2-1/+24
When -fif-conversion2 is enabled, we attempt to replace conditional branches around unconditional traps with conditional traps. That canonicalizes compares, which may change an immediate that barely fits into one that doesn't. The compare for the trap is first checked using the cbranch predicates, and then the compare and conditional trap insns are emitted and recognized. In the failing s390x testcase, i <=u 0xffff_ffff is canonicalized into i <u 0x1_0000_0000, and the latter immediate doesn't fit. The insn predicates (both cbranch and cmpdi_ccu) happily accept it, since the register allocator has no trouble getting them into registers. The problem is that ifcvt2 runs after reload, so we recognize the compare insn successfully, but later on we barf when we find that none of the constraints fit. This patch arranges for the trap_if-issuing bits in ifcvt to validate post-reload insns using a stricter test that also checks that operands fit the constraints. for gcc/ChangeLog PR rtl-optimization/103028 * ifcvt.c (find_cond_trap): Validate new insns more strictly after reload. for gcc/testsuite/ChangeLog PR rtl-optimization/103028 * gcc.dg/pr103028.c: New.
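The arithmetic behind the canonicalization can be checked with a small sketch. The helper names are hypothetical, and the 32-bit unsigned immediate limit is an assumption used for illustration rather than a statement of the exact s390x constraint.

```cpp
#include <cstdint>

// Original form of the comparison: i <=u 0xffff_ffff.
constexpr bool le_form(std::uint64_t i) { return i <= 0xffffffffULL; }

// Canonicalized form: i <u 0x1_0000_0000.
constexpr bool lt_form(std::uint64_t i) { return i < 0x100000000ULL; }

// Assumed constraint: the immediate must fit in 32 unsigned bits.
constexpr bool fits_u32_immediate(std::uint64_t imm) {
  return imm <= 0xffffffffULL;
}

// Both forms agree on either side of the boundary...
static_assert(le_form(0xffffffffULL) && lt_form(0xffffffffULL), "in range");
static_assert(!le_form(0x100000000ULL) && !lt_form(0x100000000ULL), "out of range");

// ...but only the original immediate satisfies the assumed constraint.
static_assert(fits_u32_immediate(0xffffffffULL), "original immediate fits");
static_assert(!fits_u32_immediate(0x100000000ULL), "canonical immediate does not");
```

The two comparisons are equivalent, so the canonicalization is valid in itself; the difficulty lies only in whether the new immediate can still be encoded after reload.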
2021-12-03testsuite: powerpc/vec_reve_1.c requires VSX.David Edelsohn1-2/+2
vector long long int and vector double require VSX not just Altivec. gcc/testsuite/ChangeLog: * gcc.target/powerpc/vec_reve_1.c: Require VSX.
2021-12-04Daily bump.GCC Administrator6-1/+323
2021-12-03c++: avoid redundant scope in diagnosticsJason Merrill2-1/+21
We can make some function signatures shorter to print by omitting redundant nested-name-specifiers in the rest of the declarator. gcc/cp/ChangeLog: * error.c (current_dump_scope): New variable. (dump_scope): Check it. (dump_function_decl): Set it. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/scope1.C: New test.
2021-12-03rs6000: Fix up flag_shrink_wrap handling in presence of -mrop-protect [PR101324]Martin Liska2-4/+21
PR101324 shows a problem in disabling shrink-wrapping when using -mrop-protect when there is an optimize attribute or pragma. The fix involves moving the handling of flag_shrink_wrap so it gets re-disabled when we change or add options. 2021-12-03 Martin Liska <mliska@suse.cz> gcc/ PR target/101324 * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the disabling of shrink-wrapping when using -mrop-protect from here... (rs6000_override_options_after_change): ...to here. 2021-12-03 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ PR target/101324 * gcc.target/powerpc/pr101324.c: New test.
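For illustration only, a function carrying a per-function optimize attribute might look like the sketch below. The function name and the attribute argument are assumptions, and this is not the committed pr101324.c test; the scenario in the commit would additionally require a powerpc build with -mrop-protect.

```cpp
// Hypothetical input. The optimize attribute changes options for this one
// function, which is the point at which shrink-wrapping has to be disabled
// again when -mrop-protect is in effect.
__attribute__((optimize("O2")))
int sum_squares(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i * i;
  return total;
}
```

The attribute is a generic GCC extension, so the snippet itself builds on any GCC target; only the interaction with -mrop-protect is rs6000-specific.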
2021-12-03rs6000: testsuite: Add rop_ok effective-target functionPeter Bergner6-5/+12
This patch adds a new effective-target function that tests whether it is safe to emit the ROP-protect instructions and updates the ROP test cases to use it. 2021-12-03 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ * lib/target-supports.exp (check_effective_target_rop_ok): New function. * gcc.target/powerpc/rop-1.c: Use it. * gcc.target/powerpc/rop-2.c: Likewise. * gcc.target/powerpc/rop-3.c: Likewise. * gcc.target/powerpc/rop-4.c: Likewise. * gcc.target/powerpc/rop-5.c: Likewise.
2021-12-03Fortran: improve checking of array specificationsHarald Anlauf4-0/+39
gcc/fortran/ChangeLog: PR fortran/103505 * array.c (match_array_element_spec): Try to simplify array element specifications to improve early checking. * expr.c (gfc_try_simplify_expr): New. Try simplification of an expression via gfc_simplify_expr. When an error occurs, roll back. * gfortran.h (gfc_try_simplify_expr): Declare it. gcc/testsuite/ChangeLog: PR fortran/103505 * gfortran.dg/pr103505.f90: New test. Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
2021-12-03c++: Fix for decltype(auto) and parenthesized expr [PR103403]Marek Polacek7-20/+133
In r11-4758, I tried to fix this problem:

  int &&i = 0;
  decltype(auto) j = i; // should behave like int &&j = i; error

wherein do_auto_deduction was getting confused with a REFERENCE_REF_P and it didn't realize its operand was a name, not an expression, and deduced the wrong type. Unfortunately that fix broke this:

  int&& r = 1;
  decltype(auto) rr = (r);

where 'rr' should be 'int &' since '(r)' is an expression, not a name. But because I stripped the INDIRECT_REF with the r11-4758 change, we deduced 'rr's type as if decltype had gotten a name, resulting in 'int &&'. I suspect I thought that the REF_PARENTHESIZED_P check when setting 'bool id' in do_auto_deduction would handle the (r) case, but that's not the case; while the documentation for REF_PARENTHESIZED_P specifically says it can be set in INDIRECT_REF, we don't actually do so. This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that do_auto_deduction can use it. It also removes code in maybe_undo_parenthesized_ref that I think is dead -- and we don't hit it while running dg.exp. To adduce more data, it also looks dead here: https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html (It's dead since r9-1417.) Also add a fixed test for c++/81176. PR c++/103403 gcc/cp/ChangeLog: * cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref doesn't change its argument. * pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees if they are REF_PARENTHESIZED_P. Use stripped_init when checking for id-expression. * semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on REFERENCE_REF_P trees too. (maybe_undo_parenthesized_ref): Remove dead code. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/decltype-auto2.C: New test. * g++.dg/cpp1y/decltype-auto3.C: New test. * g++.dg/cpp1y/decltype-auto4.C: New test. * g++.dg/cpp1z/decomp-decltype1.C: New test.
2021-12-03x86: Add -mmove-max=bits and -mstore-max=bitsH.J. Lu17-18/+276
Add -mmove-max=bits and -mstore-max=bits to enable 256-bit/512-bit move and store, independent of -mprefer-vector-width=bits: 1. Add X86_TUNE_AVX512_MOVE_BY_PIECES and X86_TUNE_AVX512_STORE_BY_PIECES which are enabled for Intel Sapphire Rapids processor. 2. Add -mmove-max=bits to set the maximum number of bits that can be moved from memory to memory efficiently. The default value is derived from X86_TUNE_AVX512_MOVE_BY_PIECES, X86_TUNE_AVX256_MOVE_BY_PIECES, and the preferred vector width. 3. Add -mstore-max=bits to set the maximum number of bits that can be stored to memory efficiently. The default value is derived from X86_TUNE_AVX512_STORE_BY_PIECES, X86_TUNE_AVX256_STORE_BY_PIECES and the preferred vector width. gcc/ PR target/103269 * config/i386/i386-expand.c (ix86_expand_builtin): Pass PVW_NONE and PVW_NONE to ix86_target_string. * config/i386/i386-options.c (ix86_target_string): Add arguments for move_max and store_max. (ix86_target_string::add_vector_width): New lambda. (ix86_debug_options): Pass ix86_move_max and ix86_store_max to ix86_target_string. (ix86_function_specific_print): Pass ptr->x_ix86_move_max and ptr->x_ix86_store_max to ix86_target_string. (ix86_valid_target_attribute_tree): Handle x_ix86_move_max and x_ix86_store_max. (ix86_option_override_internal): Set the default x_ix86_move_max and x_ix86_store_max. * config/i386/i386-options.h (ix86_target_string): Add prefer_vector_width and prefer_vector_width. * config/i386/i386.h (TARGET_AVX256_MOVE_BY_PIECES): Removed. (TARGET_AVX256_STORE_BY_PIECES): Likewise. (MOVE_MAX): Use 64 if ix86_move_max or ix86_store_max == PVW_AVX512. Use 32 if ix86_move_max or ix86_store_max >= PVW_AVX256. (STORE_MAX_PIECES): Use 64 if ix86_store_max == PVW_AVX512. Use 32 if ix86_store_max >= PVW_AVX256. * config/i386/i386.opt: Add -mmove-max=bits and -mstore-max=bits. * config/i386/x86-tune.def (X86_TUNE_AVX512_MOVE_BY_PIECES): New. (X86_TUNE_AVX512_STORE_BY_PIECES): Likewise. 
* doc/invoke.texi: Document -mmove-max=bits and -mstore-max=bits. gcc/testsuite/ PR target/103269 * gcc.target/i386/pieces-memcpy-17.c: New test. * gcc.target/i386/pieces-memcpy-18.c: Likewise. * gcc.target/i386/pieces-memcpy-19.c: Likewise. * gcc.target/i386/pieces-memcpy-20.c: Likewise. * gcc.target/i386/pieces-memcpy-21.c: Likewise. * gcc.target/i386/pieces-memset-45.c: Likewise. * gcc.target/i386/pieces-memset-46.c: Likewise. * gcc.target/i386/pieces-memset-47.c: Likewise. * gcc.target/i386/pieces-memset-48.c: Likewise. * gcc.target/i386/pieces-memset-49.c: Likewise.
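As a rough sketch of the kind of code these options affect, consider fixed-size copies and clears that the compiler can expand by pieces. The type and function names are hypothetical, and the flag combination mentioned afterwards is an assumption rather than the options used by the committed tests.

```cpp
#include <cstring>

// 64 bytes: a candidate for one 512-bit move, or two 256-bit moves.
struct Block {
  unsigned char bytes[64];
};

// Memory-to-memory copy of a known size; the width of the moves chosen
// for it is what -mmove-max=bits is meant to bound.
void copy_block(Block *dst, const Block *src) {
  std::memcpy(dst, src, sizeof(Block));
}

// Store of a known size; the width of the stores chosen for it is what
// -mstore-max=bits is meant to bound.
void clear_block(Block *dst) {
  std::memset(dst, 0, sizeof(Block));
}
```

Compiling this for x86 with something like `-O2 -mavx512f -mmove-max=512 -mstore-max=512` and then with `=256` and comparing the generated assembly is one way to observe the effect; which instructions actually appear depends on the compiler version and tuning.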
2021-12-03rs6000: Fix use of wrong enum for built-in function codeBill Schmidt1-2/+2
I discovered this bug while working on patches to remove the old built-ins infrastructure. I missed a spot in converting from the rs6000_builtins enum to the rs6000_gen_builtins_enum. This fixes it. The fix is technically not right if new_builtins_are_enabled were to be set to zero, but we're not going to do that anymore, and the remnants of that code will be removed shortly. 2021-12-02 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Fix builtin identifiers.