author     Martin Liska <mliska@suse.cz>   2021-06-17 12:05:57 +0200
committer  Martin Liska <mliska@suse.cz>   2021-06-17 12:05:57 +0200
commit     d79a408d0e2693048ac20d7ac469115fc906f2da (patch)
tree       cee29b35d07339f02ee1edbd5c41bd32c2218db2
parent     78a55ff9ef07c948d7fde6d7b9a88f99b8e93112 (diff)
parent     8eac92a07e386301f7b09f7ef6146e6e3ac6b6cd (diff)
Merge branch 'master' into devel/sphinx
79 files changed, 1778 insertions, 362 deletions
diff --git a/MAINTAINERS b/MAINTAINERS
index a0e6c71..e044778 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -224,6 +224,8 @@
 option handling		Joseph Myers	<joseph@codesourcery.com>
 middle-end		Jeff Law	<jeffreyalaw@gmail.com>
 middle-end		Ian Lance Taylor	<ian@airs.com>
 middle-end		Richard Biener	<rguenther@suse.de>
+*vrp, ranger		Aldy Hernandez	<aldyh@redhat.com>
+*vrp, ranger		Andrew MacLeod	<amacleod@redhat.com>
 tree-ssa		Andrew MacLeod	<amacleod@redhat.com>
 tree browser/unparser	Sebastian Pop	<sebpop@gmail.com>
 scev, data dependence	Sebastian Pop	<sebpop@gmail.com>
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 92423fd..61a714d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,131 @@
+2021-06-17  Andrew MacLeod  <amacleod@redhat.com>
+
+	* gimple-range-gori.cc (gori_compute::has_edge_range_p): Check with
+	may_recompute_p.
+	(gori_compute::may_recompute_p): New.
+	(gori_compute::outgoing_edge_range_p): Perform recomputations.
+	* gimple-range-gori.h (class gori_compute): Add prototype.
+
+2021-06-17  Andrew MacLeod  <amacleod@redhat.com>
+
+	* gimple-range-cache.cc (ranger_cache::range_on_edge): Always return
+	true when a range can be calculated.
+	* gimple-range.cc (gimple_ranger::dump_bb): Check has_edge_range_p.
+
+2021-06-16  Martin Sebor  <msebor@redhat.com>
+
+	* doc/invoke.texi (-Wmismatched-dealloc, -Wmismatched-new-delete):
+	Correct documented defaults.
+
+2021-06-16  Andrew MacLeod  <amacleod@redhat.com>
+
+	* gimple-range-cache.cc (ranger_cache::ranger_cache): Initialize
+	m_new_value_p directly.
+
+2021-06-16  Uroš Bizjak  <ubizjak@gmail.com>
+
+	PR target/89021
+	* config/i386/i386-expand.c (expand_vec_perm_2perm_pblendv):
+	Handle 64bit modes for TARGET_SSE4_1.
+	(expand_vec_perm_pshufb2): Handle 64bit modes for TARGET_SSSE3.
+	(expand_vec_perm_even_odd_pack): Handle V4HI mode.
+	(expand_vec_perm_even_odd_1) <case E_V4HImode>: Expand via
+	expand_vec_perm_pshufb2 for TARGET_SSSE3 and via
+	expand_vec_perm_even_odd_pack for TARGET_SSE4_1.
+	* config/i386/mmx.md (mmx_packusdw): New insn pattern.
+
+2021-06-16  Jonathan Wright  <jonathan.wright@arm.com>
+
+	* config/aarch64/aarch64-simd.md (aarch64_<sur><addsub>hn<mode>):
+	Change to an expander that emits the correct instruction
+	depending on endianness.
+	(aarch64_<sur><addsub>hn<mode>_insn_le): Define.
+	(aarch64_<sur><addsub>hn<mode>_insn_be): Define.
+
+2021-06-16  Jonathan Wright  <jonathan.wright@arm.com>
+
+	* config/aarch64/aarch64-simd-builtins.def: Split generator
+	for aarch64_<su>qmovn builtins into scalar and vector
+	variants.
+	* config/aarch64/aarch64-simd.md (aarch64_<su>qmovn<mode>_insn_le):
+	Define.
+	(aarch64_<su>qmovn<mode>_insn_be): Define.
+	(aarch64_<su>qmovn<mode>): Split into scalar and vector
+	variants. Change vector variant to an expander that emits the
+	correct instruction depending on endianness.
+
+2021-06-16  Jonathan Wright  <jonathan.wright@arm.com>
+
+	* config/aarch64/aarch64-simd-builtins.def: Split generator
+	for aarch64_sqmovun builtins into scalar and vector variants.
+	* config/aarch64/aarch64-simd.md (aarch64_sqmovun<mode>):
+	Split into scalar and vector variants. Change vector variant
+	to an expander that emits the correct instruction depending
+	on endianness.
+	(aarch64_sqmovun<mode>_insn_le): Define.
+	(aarch64_sqmovun<mode>_insn_be): Define.
+
+2021-06-16  Jonathan Wright  <jonathan.wright@arm.com>
+
+	* config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le):
+	Define - modeling zero-high-half semantics.
+	(aarch64_xtn<mode>): Change to an expander that emits the
+	appropriate instruction depending on endianness.
+	(aarch64_xtn<mode>_insn_be): Define - modeling zero-high-half
+	semantics.
+	(aarch64_xtn2<mode>_le): Rename to...
+	(aarch64_xtn2<mode>_insn_le): This.
+	(aarch64_xtn2<mode>_be): Rename to...
+	(aarch64_xtn2<mode>_insn_be): This.
+	(vec_pack_trunc_<mode>): Emit truncation instruction instead
+	of aarch64_xtn.
+	* config/aarch64/iterators.md (Vnarrowd): Add Vnarrowd mode
+	attribute iterator.
+
+2021-06-16  Martin Jambor  <mjambor@suse.cz>
+
+	PR tree-optimization/100453
+	* tree-sra.c (create_access): Disqualify any const candidates
+	which are written to.
+	(sra_modify_expr): Do not store sub-replacements back to a const base.
+	(handle_unscalarized_data_in_subtree): Likewise.
+	(sra_modify_assign): Likewise. Earlier, use TREE_READONLY test
+	instead of constant_decl_p.
+
+2021-06-16  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/101062
+	* stor-layout.c (finish_bitfield_representative): For fields in unions
+	assume nextf is always NULL.
+	(finish_bitfield_layout): Compute bit field representatives also in
+	unions, but handle it as if each bitfield was the only field in the
+	aggregate.
+
+2021-06-16  Richard Biener  <rguenther@suse.de>
+
+	PR tree-optimization/101088
+	* tree-ssa-loop-im.c (sm_seq_valid_bb): Only look for
+	supported refs on edges. Do not assert same ref but
+	different kind stores are unsupported but mark them so.
+	(hoist_memory_references): Only look for supported refs
+	on exits.
+
+2021-06-16  Roger Sayle  <roger@nextmovesoftware.com>
+
+	PR rtl-optimization/46235
+	* config/i386/i386.md: New define_split for bt followed by cmov.
+	(*bt<mode>_setcqi): New define_insn_and_split for bt followed by setc.
+	(*bt<mode>_setncqi): New define_insn_and_split for bt then setnc.
+	(*bt<mode>_setnc<mode>): New define_insn_and_split for bt followed
+	by setnc with zero extension.
+
+2021-06-16  Richard Biener  <rguenther@suse.de>
+
+	PR tree-optimization/101083
+	* tree-vect-slp.c (vect_slp_build_two_operator_nodes): Get
+	vectype as argument.
+	(vect_build_slp_tree_2): Adjust.
+
 2021-06-15  Martin Sebor  <msebor@redhat.com>
 
 	PR middle-end/100876
diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP
index 052decd..f84fbff 100644
--- a/gcc/DATESTAMP
+++ b/gcc/DATESTAMP
@@ -1 +1 @@
-20210616
+20210617
diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 31eca3f..f102600 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -1,3 +1,189 @@
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_util.adb (Is_Volatile_Function): Follow the exact wording
+	of SPARK (regarding volatile functions) and Ada (regarding
+	protected functions).
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_util.adb (Is_OK_Volatile_Context): All references to
+	volatile objects are legal in preanalysis.
+	(Within_Volatile_Function): Previously it was wrongly called on
+	Empty entities; now it is only called on E_Return_Statement,
+	which allows the body to be greatly simplified.
+
+2021-06-16  Yannick Moy  <moy@adacore.com>
+
+	* sem_res.adb (Set_Slice_Subtype): Revert special-case
+	introduced previously, which is not needed as Itypes created for
+	slices are precisely always used.
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* urealp.adb (Scale): Change first parameter to Uint and adjust.
+	(Equivalent_Decimal_Exponent): Pass U.Den directly to Scale.
+	* libgnat/s-exponr.adb (Negative): Rename to...
+	(Safe_Negative): ...this and change its lower bound.
+	(Exponr): Adjust to above renaming and deal with Integer'First.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_res.adb (Flag_Effectively_Volatile_Objects): Detect also
+	allocators within restricted contexts and not just entity names.
+	(Resolve_Actuals): Remove duplicated code for detecting
+	restricted contexts; it is now exclusively done in
+	Is_OK_Volatile_Context.
+	(Resolve_Entity_Name): Adapt to new parameter of
+	Is_OK_Volatile_Context.
+	* sem_util.ads, sem_util.adb (Is_OK_Volatile_Context): Adapt to
+	handle contexts both inside and outside of subprogram call
+	actual parameters.
+	(Within_Subprogram_Call): Remove; now handled by
+	Is_OK_Volatile_Context itself and its parameter.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sinput.adb (Sloc_Range): Refactor several repeated calls to
+	Sloc and two comparisons with No_Location.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* checks.adb (Apply_Scalar_Range_Check): Fix handling of check
+	depending on the parameter passing mechanism. Grammar adjustment
+	("has" => "have").
+	(Parameter_Passing_Mechanism_Specified): Add a hyphen in a comment.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* exp_ch3.adb (Build_Slice_Assignment): Remove unused
+	initialization.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* restrict.adb, sem_attr.adb, types.ads: Fix typos in
+	"occuring"; refill comment as necessary.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_util.ads (Is_Actual_Parameter): Update comment.
+	* sem_util.adb (Is_Actual_Parameter): Also detect entry parameters.
+
+2021-06-16  Arnaud Charlet  <charlet@adacore.com>
+
+	* rtsfind.ads, libgnarl/s-taskin.ads, exp_ch3.adb, exp_ch4.adb,
+	exp_ch6.adb, exp_ch9.adb, sem_ch6.adb: Move master related
+	entities to the expander directly.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_res.adb (Is_Assignment_Or_Object_Expression): Whitespace
+	cleanup.
+	(Is_Attribute_Expression): Prevent AST climbing from going to
+	the root of the compilation unit.
+
+2021-06-16  Steve Baird  <baird@adacore.com>
+
+	* doc/gnat_rm/implementation_advice.rst: Add a section for RM
+	A.18.
+	* gnat_rm.texi: Regenerate.
+
+2021-06-16  Justin Squirek  <squirek@adacore.com>
+
+	* sem_ch13.adb (Analyze_Enumeration_Representation_Clause): Add
+	check for the mixing of entries.
+
+2021-06-16  Justin Squirek  <squirek@adacore.com>
+
+	* sem_ch13.adb (Make_Aitem_Pragma): Check for static expressions
+	in Priority aspect arguments for restriction Static_Priorities.
+
+2021-06-16  Justin Squirek  <squirek@adacore.com>
+
+	* sem_util.adb (Accessibility_Level): Take into account
+	renamings of loop parameters.
+
+2021-06-16  Matthieu Eyraud  <eyraud@adacore.com>
+
+	* par_sco.adb (Set_Statement_Entry): Change sloc for dominance
+	marker.
+	(Traverse_One): Fix typo.
+	(Output_Header): Fix comment.
+
+2021-06-16  Richard Kenner  <kenner@adacore.com>
+
+	* exp_unst.adb (Register_Subprogram): Don't look for aliases for
+	subprograms that are generic. Reorder tests for efficiency.
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* sem_util.adb (Incomplete_Or_Partial_View): Retrieve the scope of
+	the parameter and use it to find its incomplete view, if any.
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* freeze.adb (Check_No_Parts_Violations): Return earlier if the
+	type is elementary or does not come from source.
+
+2021-06-16  Bob Duff  <duff@adacore.com>
+
+	* ghost.adb: Add another special case where full analysis is
+	needed. This bug is due to quirks in the way
+	Mark_And_Set_Ghost_Assignment works (it happens very early,
+	before name resolution is done).
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* sem_util.adb (Current_Entity_In_Scope): Reimplement.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_ch8.adb (End_Scope): Remove extra parens.
+
+2021-06-16  Javier Miranda  <miranda@adacore.com>
+
+	* exp_disp.adb (Build_Class_Wide_Check): Ensure that evaluation
+	of actuals is side effects free (since the check duplicates
+	actuals).
+
+2021-06-16  Ed Schonberg  <schonberg@adacore.com>
+
+	* sem_res.adb (Resolve_Raise_Expression): Apply Ada_2020 rules
+	concerning the need for parentheses around Raise_Expressions in
+	various contexts.
+
+2021-06-16  Piotr Trojanek  <trojanek@adacore.com>
+
+	* sem_ch13.adb (Validate_Unchecked_Conversion): Move detection
+	of generic types before switching to their private views; fix
+	style in using AND THEN.
+
+2021-06-16  Arnaud Charlet  <charlet@adacore.com>
+
+	* sem_ch3.adb (Analyze_Component_Declaration): Do not special
+	case raise expressions.
+
+2021-06-16  Sergey Rybin  <rybin@adacore.com>
+
+	* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
+	Instead of referring to the formatting of the Ada examples in
+	the Ada RM, use the list of checks that are actually performed.
+	* gnat_ugn.texi: Regenerate.
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* initialize.c: Do not include vxWorks.h and fcntl.h from here.
+	(__gnat_initialize) [__MINGW32__]: Remove #ifdef and attribute
+	(__gnat_initialize) [init_float]: Delete.
+	(__gnat_initialize) [VxWorks]: Likewise.
+	(__gnat_initialize) [PA-RISC HP-UX 10]: Likewise.
+	* runtime.h: Add comment about vxWorks.h include.
+
+2021-06-16  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* libgnat/s-except.ads (ZCX_By_Default): Delete.
+	(Require_Body): Likewise.
+	* libgnat/s-except.adb: Replace body with pragma No_Body.
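An aside on the aarch64 entries in gcc/ChangeLog above: the patterns split into _le/_be variants back the NEON narrowing intrinsics from arm_neon.h. A minimal user-level sketch of one affected intrinsic follows; vqmovn_s32 is a real intrinsic, while the function name and the expected-assembly comment are assumptions based on the ChangeLog, not output verified against this commit.

```c++
// Sketch: a saturating-narrow intrinsic whose expansion the aarch64
// ChangeLog entries above rework.  Compile for aarch64 with -O2.
#include <arm_neon.h>

int16x4_t narrow_sat (int32x4_t x)   // hypothetical wrapper name
{
  // Assumed to go through the new aarch64_<su>qmovn<mode> expander,
  // emitting "sqxtn v0.4h, v0.4s" on little-endian per the ChangeLog.
  return vqmovn_s32 (x);
}
```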
+ 2021-06-15 Steve Baird <baird@adacore.com> * exp_util.adb (Kill_Dead_Code): Generalize the existing diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in index 896e760..b68081e 100644 --- a/gcc/ada/gcc-interface/Make-lang.in +++ b/gcc/ada/gcc-interface/Make-lang.in @@ -87,7 +87,8 @@ endif ifeq ($(STAGE1),True) ADA_INCLUDES=$(COMMON_ADA_INCLUDES) - GNATLIB=$(dir $(shell $(CC) -print-libgcc-file-name))adalib/libgnat.a $(STAGE1_LIBS) + adalib=$(dir $(shell $(CC) -print-libgcc-file-name))adalib + GNATLIB=$(adalib)/$(if $(wildcard $(adalib)/libgnat.a),libgnat.a,libgnat.so) $(STAGE1_LIBS) else ADA_INCLUDES=-nostdinc $(COMMON_ADA_INCLUDES) -Iada/libgnat -I$(srcdir)/ada/libgnat -Iada/gcc-interface -I$(srcdir)/ada/gcc-interface endif diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index ed788dc..43d6fa0 100644 --- a/gcc/auto-profile.c +++ b/gcc/auto-profile.c @@ -1155,13 +1155,10 @@ afdo_find_equiv_class (bb_set *annotated_bb) FOR_ALL_BB_FN (bb, cfun) { - vec<basic_block> dom_bbs; - if (bb->aux != NULL) continue; bb->aux = bb; - dom_bbs = get_dominated_by (CDI_DOMINATORS, bb); - for (basic_block bb1 : dom_bbs) + for (basic_block bb1 : get_dominated_by (CDI_DOMINATORS, bb)) if (bb1->aux == NULL && dominated_by_p (CDI_POST_DOMINATORS, bb, bb1) && bb1->loop_father == bb->loop_father) { @@ -1172,8 +1169,8 @@ afdo_find_equiv_class (bb_set *annotated_bb) set_bb_annotated (bb, annotated_bb); } } - dom_bbs = get_dominated_by (CDI_POST_DOMINATORS, bb); - for (basic_block bb1 : dom_bbs) + + for (basic_block bb1 : get_dominated_by (CDI_POST_DOMINATORS, bb)) if (bb1->aux == NULL && dominated_by_p (CDI_DOMINATORS, bb, bb1) && bb1->loop_father == bb->loop_father) { diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index 17edc4f..7b1e1ba 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -3027,7 +3027,7 @@ delete_unreachable_blocks (void) delete_basic_block (b); else { - vec<basic_block> h + auto_vec<basic_block> h = get_all_dominated_blocks (CDI_DOMINATORS, b); while (h.length ()) @@ -3040,8 +3040,6 @@ delete_unreachable_blocks (void) delete_basic_block (b); } - - h.release (); } changed = true; diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h index 113241d..5e69927 100644 --- a/gcc/cfgloop.h +++ b/gcc/cfgloop.h @@ -840,7 +840,7 @@ enum extern void doloop_optimize_loops (void); extern void move_loop_invariants (void); -extern vec<basic_block> get_loop_hot_path (const class loop *loop); +extern auto_vec<basic_block> get_loop_hot_path (const class loop *loop); /* Returns the outermost loop of the loop nest that contains LOOP.*/ static inline class loop * diff --git a/gcc/cfgloopanal.c b/gcc/cfgloopanal.c index 54426b5..fdd8d3f 100644 --- a/gcc/cfgloopanal.c +++ b/gcc/cfgloopanal.c @@ -500,7 +500,7 @@ single_likely_exit (class loop *loop, vec<edge> exits) order against direction of edges from latch. Specially, if header != latch, latch is the 1-st block. 
*/ -vec<basic_block> +auto_vec<basic_block> get_loop_hot_path (const class loop *loop) { basic_block bb = loop->header; diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c index 4a9ab74..e6df280 100644 --- a/gcc/cfgloopmanip.c +++ b/gcc/cfgloopmanip.c @@ -1414,13 +1414,12 @@ duplicate_loop_to_header_edge (class loop *loop, edge e, for (i = 0; i < n; i++) { basic_block dominated, dom_bb; - vec<basic_block> dom_bbs; unsigned j; bb = bbs[i]; bb->aux = 0; - dom_bbs = get_dominated_by (CDI_DOMINATORS, bb); + auto_vec<basic_block> dom_bbs = get_dominated_by (CDI_DOMINATORS, bb); FOR_EACH_VEC_ELT (dom_bbs, j, dominated) { if (flow_bb_inside_loop_p (loop, dominated)) @@ -1429,7 +1428,6 @@ duplicate_loop_to_header_edge (class loop *loop, edge e, CDI_DOMINATORS, first_active[i], first_active_latch); set_immediate_dominator (CDI_DOMINATORS, dominated, dom_bb); } - dom_bbs.release (); } free (first_active); diff --git a/gcc/cgraph.c b/gcc/cgraph.c index d7c78d5..abe4e3e 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -3074,10 +3074,10 @@ collect_callers_of_node_1 (cgraph_node *node, void *data) /* Collect all callers of cgraph_node and its aliases that are known to lead to cgraph_node (i.e. are not overwritable). */ -vec<cgraph_edge *> +auto_vec<cgraph_edge *> cgraph_node::collect_callers (void) { - vec<cgraph_edge *> redirect_callers = vNULL; + auto_vec<cgraph_edge *> redirect_callers; call_for_symbol_thunks_and_aliases (collect_callers_of_node_1, &redirect_callers, false); return redirect_callers; diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 4a1f899..9f4338f 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1139,7 +1139,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node /* Collect all callers of cgraph_node and its aliases that are known to lead to NODE (i.e. are not overwritable) and that are not thunks. */ - vec<cgraph_edge *> collect_callers (void); + auto_vec<cgraph_edge *> collect_callers (void); /* Remove all callers from the node. */ void remove_callers (void); diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 18baa67..ac5d4fc 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -263,14 +263,18 @@ BUILTIN_VQ_HSI (TERNOP, smlal_hi_n, 0, NONE) BUILTIN_VQ_HSI (TERNOPU, umlal_hi_n, 0, NONE) - BUILTIN_VSQN_HSDI (UNOPUS, sqmovun, 0, NONE) + /* Implemented by aarch64_sqmovun<mode>. */ + BUILTIN_VQN (UNOPUS, sqmovun, 0, NONE) + BUILTIN_SD_HSDI (UNOPUS, sqmovun, 0, NONE) /* Implemented by aarch64_sqxtun2<mode>. */ BUILTIN_VQN (BINOP_UUS, sqxtun2, 0, NONE) /* Implemented by aarch64_<su>qmovn<mode>. */ - BUILTIN_VSQN_HSDI (UNOP, sqmovn, 0, NONE) - BUILTIN_VSQN_HSDI (UNOP, uqmovn, 0, NONE) + BUILTIN_VQN (UNOP, sqmovn, 0, NONE) + BUILTIN_SD_HSDI (UNOP, sqmovn, 0, NONE) + BUILTIN_VQN (UNOP, uqmovn, 0, NONE) + BUILTIN_SD_HSDI (UNOP, uqmovn, 0, NONE) /* Implemented by aarch64_<su>qxtn2<mode>. */ BUILTIN_VQN (BINOP, sqxtn2, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index e750fae..540244c 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1690,17 +1690,48 @@ ;; Narrowing operations. -;; For doubles. 
+(define_insn "aarch64_xtn<mode>_insn_le" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w")) + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero")))] + "TARGET_SIMD && !BYTES_BIG_ENDIAN" + "xtn\\t%0.<Vntype>, %1.<Vtype>" + [(set_attr "type" "neon_move_narrow_q")] +) -(define_insn "aarch64_xtn<mode>" - [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w") - (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w")))] - "TARGET_SIMD" +(define_insn "aarch64_xtn<mode>_insn_be" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero") + (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w"))))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" "xtn\\t%0.<Vntype>, %1.<Vtype>" [(set_attr "type" "neon_move_narrow_q")] ) -(define_insn "aarch64_xtn2<mode>_le" +(define_expand "aarch64_xtn<mode>" + [(set (match_operand:<VNARROWQ> 0 "register_operand") + (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand")))] + "TARGET_SIMD" + { + rtx tmp = gen_reg_rtx (<VNARROWQ2>mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_xtn<mode>_insn_be (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + else + emit_insn (gen_aarch64_xtn<mode>_insn_le (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + + /* The intrinsic expects a narrow result, so emit a subreg that will get + optimized away as appropriate. */ + emit_move_insn (operands[0], lowpart_subreg (<VNARROWQ>mode, tmp, + <VNARROWQ2>mode)); + DONE; + } +) + +(define_insn "aarch64_xtn2<mode>_insn_le" [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") (vec_concat:<VNARROWQ2> (match_operand:<VNARROWQ> 1 "register_operand" "0") @@ -1710,7 +1741,7 @@ [(set_attr "type" "neon_move_narrow_q")] ) -(define_insn "aarch64_xtn2<mode>_be" +(define_insn "aarch64_xtn2<mode>_insn_be" [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") (vec_concat:<VNARROWQ2> (truncate:<VNARROWQ> (match_operand:VQN 2 "register_operand" "w")) @@ -1727,15 +1758,17 @@ "TARGET_SIMD" { if (BYTES_BIG_ENDIAN) - emit_insn (gen_aarch64_xtn2<mode>_be (operands[0], operands[1], - operands[2])); + emit_insn (gen_aarch64_xtn2<mode>_insn_be (operands[0], operands[1], + operands[2])); else - emit_insn (gen_aarch64_xtn2<mode>_le (operands[0], operands[1], - operands[2])); + emit_insn (gen_aarch64_xtn2<mode>_insn_le (operands[0], operands[1], + operands[2])); DONE; } ) +;; Packing doubles. + (define_expand "vec_pack_trunc_<mode>" [(match_operand:<VNARROWD> 0 "register_operand") (match_operand:VDN 1 "register_operand") @@ -1748,10 +1781,35 @@ emit_insn (gen_move_lo_quad_<Vdbl> (tempreg, operands[lo])); emit_insn (gen_move_hi_quad_<Vdbl> (tempreg, operands[hi])); - emit_insn (gen_aarch64_xtn<Vdbl> (operands[0], tempreg)); + emit_insn (gen_trunc<Vdbl><Vnarrowd>2 (operands[0], tempreg)); DONE; }) +;; Packing quads. + +(define_expand "vec_pack_trunc_<mode>" + [(set (match_operand:<VNARROWQ2> 0 "register_operand") + (vec_concat:<VNARROWQ2> + (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand")) + (truncate:<VNARROWQ> (match_operand:VQN 2 "register_operand"))))] + "TARGET_SIMD" + { + rtx tmpreg = gen_reg_rtx (<VNARROWQ>mode); + int lo = BYTES_BIG_ENDIAN ? 2 : 1; + int hi = BYTES_BIG_ENDIAN ? 
1 : 2; + + emit_insn (gen_trunc<mode><Vnarrowq>2 (tmpreg, operands[lo])); + + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_xtn2<mode>_insn_be (operands[0], tmpreg, + operands[hi])); + else + emit_insn (gen_aarch64_xtn2<mode>_insn_le (operands[0], tmpreg, + operands[hi])); + DONE; + } +) + (define_insn "aarch64_shrn<mode>_insn_le" [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") (vec_concat:<VNARROWQ2> @@ -1936,29 +1994,6 @@ } ) -;; For quads. - -(define_expand "vec_pack_trunc_<mode>" - [(set (match_operand:<VNARROWQ2> 0 "register_operand") - (vec_concat:<VNARROWQ2> - (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand")) - (truncate:<VNARROWQ> (match_operand:VQN 2 "register_operand"))))] - "TARGET_SIMD" - { - rtx tmpreg = gen_reg_rtx (<VNARROWQ>mode); - int lo = BYTES_BIG_ENDIAN ? 2 : 1; - int hi = BYTES_BIG_ENDIAN ? 1 : 2; - - emit_insn (gen_aarch64_xtn<mode> (tmpreg, operands[lo])); - - if (BYTES_BIG_ENDIAN) - emit_insn (gen_aarch64_xtn2<mode>_be (operands[0], tmpreg, operands[hi])); - else - emit_insn (gen_aarch64_xtn2<mode>_le (operands[0], tmpreg, operands[hi])); - DONE; - } -) - ;; Widening operations. (define_insn "aarch64_simd_vec_unpack<su>_lo_<mode>" @@ -4626,16 +4661,53 @@ ;; <r><addsub>hn<q>. -(define_insn "aarch64_<sur><addsub>hn<mode>" - [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w") - (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand" "w") - (match_operand:VQN 2 "register_operand" "w")] - ADDSUBHN))] - "TARGET_SIMD" +(define_insn "aarch64_<sur><addsub>hn<mode>_insn_le" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand" "w") + (match_operand:VQN 2 "register_operand" "w")] + ADDSUBHN) + (match_operand:<VNARROWQ> 3 "aarch64_simd_or_scalar_imm_zero")))] + "TARGET_SIMD && !BYTES_BIG_ENDIAN" + "<sur><addsub>hn\\t%0.<Vntype>, %1.<Vtype>, %2.<Vtype>" + [(set_attr "type" "neon_<addsub>_halve_narrow_q")] +) + +(define_insn "aarch64_<sur><addsub>hn<mode>_insn_be" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (match_operand:<VNARROWQ> 3 "aarch64_simd_or_scalar_imm_zero") + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand" "w") + (match_operand:VQN 2 "register_operand" "w")] + ADDSUBHN)))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" "<sur><addsub>hn\\t%0.<Vntype>, %1.<Vtype>, %2.<Vtype>" [(set_attr "type" "neon_<addsub>_halve_narrow_q")] ) +(define_expand "aarch64_<sur><addsub>hn<mode>" + [(set (match_operand:<VNARROWQ> 0 "register_operand") + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand") + (match_operand:VQN 2 "register_operand")] + ADDSUBHN))] + "TARGET_SIMD" + { + rtx tmp = gen_reg_rtx (<VNARROWQ2>mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_<sur><addsub>hn<mode>_insn_be (tmp, operands[1], + operands[2], CONST0_RTX (<VNARROWQ>mode))); + else + emit_insn (gen_aarch64_<sur><addsub>hn<mode>_insn_le (tmp, operands[1], + operands[2], CONST0_RTX (<VNARROWQ>mode))); + + /* The intrinsic expects a narrow result, so emit a subreg that will get + optimized away as appropriate. 
*/ + emit_move_insn (operands[0], lowpart_subreg (<VNARROWQ>mode, tmp, + <VNARROWQ2>mode)); + DONE; + } +) + (define_insn "aarch64_<sur><addsub>hn2<mode>_insn_le" [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") (vec_concat:<VNARROWQ2> @@ -4835,26 +4907,59 @@ [(set_attr "type" "neon_qadd<q>")] ) -;; sqmovun - -(define_insn "aarch64_sqmovun<mode>" - [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w") - (unspec:<VNARROWQ> [(match_operand:VSQN_HSDI 1 "register_operand" "w")] - UNSPEC_SQXTUN))] - "TARGET_SIMD" - "sqxtun\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" - [(set_attr "type" "neon_sat_shift_imm_narrow_q")] -) - ;; sqmovn and uqmovn (define_insn "aarch64_<su>qmovn<mode>" [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w") (SAT_TRUNC:<VNARROWQ> - (match_operand:VSQN_HSDI 1 "register_operand" "w")))] + (match_operand:SD_HSDI 1 "register_operand" "w")))] "TARGET_SIMD" "<su>qxtn\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" - [(set_attr "type" "neon_sat_shift_imm_narrow_q")] + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_insn "aarch64_<su>qmovn<mode>_insn_le" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (SAT_TRUNC:<VNARROWQ> + (match_operand:VQN 1 "register_operand" "w")) + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero")))] + "TARGET_SIMD && !BYTES_BIG_ENDIAN" + "<su>qxtn\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_insn "aarch64_<su>qmovn<mode>_insn_be" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero") + (SAT_TRUNC:<VNARROWQ> + (match_operand:VQN 1 "register_operand" "w"))))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" + "<su>qxtn\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_expand "aarch64_<su>qmovn<mode>" + [(set (match_operand:<VNARROWQ> 0 "register_operand") + (SAT_TRUNC:<VNARROWQ> + (match_operand:VQN 1 "register_operand")))] + "TARGET_SIMD" + { + rtx tmp = gen_reg_rtx (<VNARROWQ2>mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_<su>qmovn<mode>_insn_be (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + else + emit_insn (gen_aarch64_<su>qmovn<mode>_insn_le (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + + /* The intrinsic expects a narrow result, so emit a subreg that will get + optimized away as appropriate. 
*/ + emit_move_insn (operands[0], lowpart_subreg (<VNARROWQ>mode, tmp, + <VNARROWQ2>mode)); + DONE; + } ) (define_insn "aarch64_<su>qxtn2<mode>_le" @@ -4896,6 +5001,61 @@ } ) +;; sqmovun + +(define_insn "aarch64_sqmovun<mode>" + [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w") + (unspec:<VNARROWQ> [(match_operand:SD_HSDI 1 "register_operand" "w")] + UNSPEC_SQXTUN))] + "TARGET_SIMD" + "sqxtun\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_insn "aarch64_sqmovun<mode>_insn_le" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand" "w")] + UNSPEC_SQXTUN) + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero")))] + "TARGET_SIMD && !BYTES_BIG_ENDIAN" + "sqxtun\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_insn "aarch64_sqmovun<mode>_insn_be" + [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") + (vec_concat:<VNARROWQ2> + (match_operand:<VNARROWQ> 2 "aarch64_simd_or_scalar_imm_zero") + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand" "w")] + UNSPEC_SQXTUN)))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" + "sqxtun\\t%<vn2>0<Vmntype>, %<v>1<Vmtype>" + [(set_attr "type" "neon_sat_shift_imm_narrow_q")] +) + +(define_expand "aarch64_sqmovun<mode>" + [(set (match_operand:<VNARROWQ> 0 "register_operand") + (unspec:<VNARROWQ> [(match_operand:VQN 1 "register_operand")] + UNSPEC_SQXTUN))] + "TARGET_SIMD" + { + rtx tmp = gen_reg_rtx (<VNARROWQ2>mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_sqmovun<mode>_insn_be (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + else + emit_insn (gen_aarch64_sqmovun<mode>_insn_le (tmp, operands[1], + CONST0_RTX (<VNARROWQ>mode))); + + /* The intrinsic expects a narrow result, so emit a subreg that will get + optimized away as appropriate. */ + emit_move_insn (operands[0], lowpart_subreg (<VNARROWQ>mode, tmp, + <VNARROWQ2>mode)); + DONE; + } +) + (define_insn "aarch64_sqxtun2<mode>_le" [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w") (vec_concat:<VNARROWQ2> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index e9047d0..caa42f8 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -1257,6 +1257,8 @@ ;; Narrowed modes for VDN. (define_mode_attr VNARROWD [(V4HI "V8QI") (V2SI "V4HI") (DI "V2SI")]) +(define_mode_attr Vnarrowd [(V4HI "v8qi") (V2SI "v4hi") + (DI "v2si")]) ;; Narrowed double-modes for VQN (Used for XTN). 
(define_mode_attr VNARROWQ [(V8HI "V8QI") (V4SI "V4HI") diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index dee3df2..eb6f9b0 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -17633,8 +17633,10 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) if (vmode == V8QImode) { + rtx m128 = GEN_INT (-128); + for (i = nelt; i < 16; ++i) - rperm[i] = constm1_rtx; + rperm[i] = m128; vpmode = V16QImode; } @@ -18972,7 +18974,8 @@ expand_vec_perm_2perm_pblendv (struct expand_vec_perm_d *d, bool two_insn) ; else if (TARGET_AVX && (vmode == V4DFmode || vmode == V8SFmode)) ; - else if (TARGET_SSE4_1 && GET_MODE_SIZE (vmode) == 16) + else if (TARGET_SSE4_1 && (GET_MODE_SIZE (vmode) == 16 + || GET_MODE_SIZE (vmode) == 8)) ; else return false; @@ -19229,14 +19232,31 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d) { rtx rperm[2][16], vperm, l, h, op, m128; unsigned int i, nelt, eltsz; + machine_mode mode; + rtx (*gen) (rtx, rtx, rtx); - if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16) + if (!TARGET_SSSE3 || (GET_MODE_SIZE (d->vmode) != 16 + && GET_MODE_SIZE (d->vmode) != 8)) return false; gcc_assert (!d->one_operand_p); if (d->testing_p) return true; + switch (GET_MODE_SIZE (d->vmode)) + { + case 8: + mode = V8QImode; + gen = gen_mmx_pshufbv8qi3; + break; + case 16: + mode = V16QImode; + gen = gen_ssse3_pshufbv16qi3; + break; + default: + gcc_unreachable (); + } + nelt = d->nelt; eltsz = GET_MODE_UNIT_SIZE (d->vmode); @@ -19247,7 +19267,7 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d) m128 = GEN_INT (-128); for (i = 0; i < nelt; ++i) { - unsigned j, e = d->perm[i]; + unsigned j, k, e = d->perm[i]; unsigned which = (e >= nelt); if (e >= nelt) e -= nelt; @@ -19257,26 +19277,29 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d) rperm[which][i*eltsz + j] = GEN_INT (e*eltsz + j); rperm[1-which][i*eltsz + j] = m128; } + + for (k = i*eltsz + j; k < 16; ++k) + rperm[0][k] = rperm[1][k] = m128; } vperm = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, rperm[0])); vperm = force_reg (V16QImode, vperm); - l = gen_reg_rtx (V16QImode); - op = gen_lowpart (V16QImode, d->op0); - emit_insn (gen_ssse3_pshufbv16qi3 (l, op, vperm)); + l = gen_reg_rtx (mode); + op = gen_lowpart (mode, d->op0); + emit_insn (gen (l, op, vperm)); vperm = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, rperm[1])); vperm = force_reg (V16QImode, vperm); - h = gen_reg_rtx (V16QImode); - op = gen_lowpart (V16QImode, d->op1); - emit_insn (gen_ssse3_pshufbv16qi3 (h, op, vperm)); + h = gen_reg_rtx (mode); + op = gen_lowpart (mode, d->op1); + emit_insn (gen (h, op, vperm)); op = d->target; - if (d->vmode != V16QImode) - op = gen_reg_rtx (V16QImode); - emit_insn (gen_iorv16qi3 (op, l, h)); + if (d->vmode != mode) + op = gen_reg_rtx (mode); + emit_insn (gen_rtx_SET (op, gen_rtx_IOR (mode, l, h))); if (op != d->target) emit_move_insn (d->target, gen_lowpart (d->vmode, op)); @@ -19455,6 +19478,17 @@ expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d) switch (d->vmode) { + case E_V4HImode: + /* Required for "pack". */ + if (!TARGET_SSE4_1) + return false; + c = 0xffff; + s = 16; + half_mode = V2SImode; + gen_and = gen_andv2si3; + gen_pack = gen_mmx_packusdw; + gen_shift = gen_lshrv2si3; + break; case E_V8HImode: /* Required for "pack". 
*/ if (!TARGET_SSE4_1) @@ -19507,7 +19541,7 @@ expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d) end_perm = true; break; default: - /* Only V8QI, V8HI, V16QI, V16HI and V32QI modes + /* Only V4HI, V8QI, V8HI, V16QI, V16HI and V32QI modes are more profitable than general shuffles. */ return false; } @@ -19698,18 +19732,25 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) break; case E_V4HImode: - if (d->testing_p) - break; - /* We need 2*log2(N)-1 operations to achieve odd/even - with interleave. */ - t1 = gen_reg_rtx (V4HImode); - emit_insn (gen_mmx_punpckhwd (t1, d->op0, d->op1)); - emit_insn (gen_mmx_punpcklwd (d->target, d->op0, d->op1)); - if (odd) - t2 = gen_mmx_punpckhwd (d->target, d->target, t1); + if (TARGET_SSE4_1) + return expand_vec_perm_even_odd_pack (d); + else if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB) + return expand_vec_perm_pshufb2 (d); else - t2 = gen_mmx_punpcklwd (d->target, d->target, t1); - emit_insn (t2); + { + if (d->testing_p) + break; + /* We need 2*log2(N)-1 operations to achieve odd/even + with interleave. */ + t1 = gen_reg_rtx (V4HImode); + emit_insn (gen_mmx_punpckhwd (t1, d->op0, d->op1)); + emit_insn (gen_mmx_punpcklwd (d->target, d->op0, d->op1)); + if (odd) + t2 = gen_mmx_punpckhwd (d->target, d->target, t1); + else + t2 = gen_mmx_punpcklwd (d->target, d->target, t1); + emit_insn (t2); + } break; case E_V8HImode: diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 1a9e7b0..59a16f4 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -2477,6 +2477,22 @@ (set_attr "type" "mmxshft,sselog,sselog") (set_attr "mode" "DI,TI,TI")]) +(define_insn_and_split "mmx_packusdw" + [(set (match_operand:V4HI 0 "register_operand" "=Yr,*x,Yw") + (vec_concat:V4HI + (us_truncate:V2HI + (match_operand:V2SI 1 "register_operand" "0,0,Yw")) + (us_truncate:V2HI + (match_operand:V2SI 2 "register_operand" "Yr,*x,Yw"))))] + "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" + "#" + "&& reload_completed" + [(const_int 0)] + "ix86_split_mmx_pack (operands, US_TRUNCATE); DONE;" + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "type" "sselog") + (set_attr "mode" "TI")]) + (define_insn_and_split "mmx_punpckhbw" [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yw") (vec_select:V8QI diff --git a/gcc/config/s390/vecintrin.h b/gcc/config/s390/vecintrin.h index 6bd26f8..9a3f7c3 100644 --- a/gcc/config/s390/vecintrin.h +++ b/gcc/config/s390/vecintrin.h @@ -109,8 +109,8 @@ __lcbb(const void *ptr, int bndry) #define vec_rint(X) __builtin_s390_vfi((X), 0, 0) #define vec_roundc(X) __builtin_s390_vfi((X), 4, 0) #define vec_round(X) __builtin_s390_vfi((X), 4, 4) -#define vec_doublee(X) __builtin_s390_vfll((X)) -#define vec_floate(X) __builtin_s390_vflr((X), 0, 0) +#define vec_doublee(X) __builtin_s390_vflls((X)) +#define vec_floate(X) __builtin_s390_vflrd((X), 0, 0) #define vec_load_len_r(X,L) \ (__vector unsigned char)__builtin_s390_vlrlr((L),(X)) #define vec_store_len_r(X,Y,L) \ diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog index 3016da8..8d14d38 100644 --- a/gcc/cp/ChangeLog +++ b/gcc/cp/ChangeLog @@ -1,3 +1,9 @@ +2021-06-16 Jason Merrill <jason@redhat.com> + + PR c++/101078 + PR c++/91706 + * pt.c (tsubst_baselink): Update binfos in non-dependent case. + 2021-06-15 Robin Dapp <rdapp@linux.ibm.com> * decl.c (duplicate_decls): Likewise. 
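A hedged illustration of the i386 change above (PR target/89021): with SSE4.1, an even-elements permutation of 64-bit V4HI vectors can now be expanded through the and/shift + packusdw path instead of the interleave sequence. The code below uses GCC's generic vector extension; the instruction selection noted in the comment is an assumption drawn from the patch, and even_elems is a made-up name.

```c++
// Sketch of a permutation the new V4HI path in
// expand_vec_perm_even_odd_pack handles: even elements of two
// 4x16-bit vectors.  With -msse4.1 this is assumed to lower via
// pand/psrld + packusdw (the new mmx_packusdw pattern).
typedef short v4hi __attribute__ ((vector_size (8)));

v4hi even_elems (v4hi a, v4hi b)
{
  // Indices 0,2,4,6 select the even lanes of the concatenation {a,b}.
  return __builtin_shuffle (a, b, (v4hi) { 0, 2, 4, 6 });
}
```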
diff --git a/gcc/cp/init.c b/gcc/cp/init.c index 622d6e9..4bd942f 100644 --- a/gcc/cp/init.c +++ b/gcc/cp/init.c @@ -4226,7 +4226,7 @@ build_vec_init (tree base, tree maxindex, tree init, { /* Shortcut zero element case to avoid unneeded constructor synthesis. */ if (init && TREE_SIDE_EFFECTS (init)) - base = build2 (COMPOUND_EXPR, void_type_node, init, base); + base = build2 (COMPOUND_EXPR, ptype, init, base); return base; } diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d4bb5cc..15947b2 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -16249,8 +16249,19 @@ tsubst_baselink (tree baselink, tree object_type, fns = BASELINK_FUNCTIONS (baselink); } else - /* We're going to overwrite pieces below, make a duplicate. */ - baselink = copy_node (baselink); + { + /* We're going to overwrite pieces below, make a duplicate. */ + baselink = copy_node (baselink); + + if (qualifying_scope != BINFO_TYPE (BASELINK_ACCESS_BINFO (baselink))) + { + /* The decl we found was from non-dependent scope, but we still need + to update the binfos for the instantiated qualifying_scope. */ + BASELINK_ACCESS_BINFO (baselink) = TYPE_BINFO (qualifying_scope); + BASELINK_BINFO (baselink) = lookup_base (qualifying_scope, binfo_type, + ba_unique, nullptr, complain); + } + } /* If lookup found a single function, mark it as used at this point. (If lookup found multiple functions the one selected later by diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 040cd4c..95020a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -218,7 +218,7 @@ in the following sections. -Wno-inherited-variadic-ctor -Wno-init-list-lifetime @gol -Winvalid-imported-macros @gol -Wno-invalid-offsetof -Wno-literal-suffix @gol --Wno-mismatched-new-delete -Wmismatched-tags @gol +-Wmismatched-new-delete -Wmismatched-tags @gol -Wmultiple-inheritance -Wnamespaces -Wnarrowing @gol -Wnoexcept -Wnoexcept-type -Wnon-virtual-dtor @gol -Wpessimizing-move -Wno-placement-new -Wplacement-new=@var{n} @gol @@ -3926,7 +3926,7 @@ The warning is inactive inside a system header file, such as the STL, so one can still use the STL. One may also instantiate or specialize templates. -@item -Wno-mismatched-new-delete @r{(C++ and Objective-C++ only)} +@item -Wmismatched-new-delete @r{(C++ and Objective-C++ only)} @opindex Wmismatched-new-delete @opindex Wno-mismatched-new-delete Warn for mismatches between calls to @code{operator new} or @code{operator @@ -3958,7 +3958,7 @@ The related option @option{-Wmismatched-dealloc} diagnoses mismatches involving allocation and deallocation functions other than @code{operator new} and @code{operator delete}. -@option{-Wmismatched-new-delete} is enabled by default. +@option{-Wmismatched-new-delete} is included in @option{-Wall}. @item -Wmismatched-tags @r{(C++ and Objective-C++ only)} @opindex Wmismatched-tags @@ -5502,6 +5502,8 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}. -Wmemset-elt-size @gol -Wmemset-transposed-args @gol -Wmisleading-indentation @r{(only for C/C++)} @gol +-Wmismatched-dealloc @gol +-Wmismatched-new-delete @r{(only for C/C++)} @gol -Wmissing-attributes @gol -Wmissing-braces @r{(only for C/ObjC)} @gol -Wmultistatement-macros @gol @@ -6398,7 +6400,7 @@ Ignoring the warning can result in poorly optimized code. disable the warning, but this is not recommended and should be done only when non-existent profile data is justified. 
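To make the -Wmismatched-new-delete documentation change above concrete (the option is now documented as part of -Wall rather than on by default), here is a minimal C++ case of the mismatch it diagnoses; the warning text in the comment is paraphrased, not copied from GCC.

```c++
// Minimal sketch of what -Wmismatched-new-delete flags: memory from
// new[] must be released with delete[], not scalar delete.
void f ()
{
  int *p = new int[4];
  delete p;   // warning: 'operator delete' called on a pointer
              // returned from 'operator new []' (paraphrased)
}
```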
-@item -Wno-mismatched-dealloc +@item -Wmismatched-dealloc @opindex Wmismatched-dealloc @opindex Wno-mismatched-dealloc @@ -6431,7 +6433,7 @@ void f (void) In C++, the related option @option{-Wmismatched-new-delete} diagnoses mismatches involving either @code{operator new} or @code{operator delete}. -Option @option{-Wmismatched-dealloc} is enabled by default. +Option @option{-Wmismatched-dealloc} is included in @option{-Wall}. @item -Wmultistatement-macros @opindex Wmultistatement-macros @@ -7921,9 +7923,9 @@ Warnings controlled by the option can be disabled either by specifying Disable @option{-Wframe-larger-than=} warnings. The option is equivalent to @option{-Wframe-larger-than=}@samp{SIZE_MAX} or larger. -@item -Wno-free-nonheap-object -@opindex Wno-free-nonheap-object +@item -Wfree-nonheap-object @opindex Wfree-nonheap-object +@opindex Wno-free-nonheap-object Warn when attempting to deallocate an object that was either not allocated on the heap, or by using a pointer that was not returned from a prior call to the corresponding allocation function. For example, because the call @@ -7940,7 +7942,7 @@ void f (char *p) @} @end smallexample -@option{-Wfree-nonheap-object} is enabled by default. +@option{-Wfree-nonheap-object} is included in @option{-Wall}. @item -Wstack-usage=@var{byte-size} @opindex Wstack-usage @@ -9905,7 +9907,7 @@ This option causes GCC to create markers in the internal representation at the beginning of statements, and to keep them roughly in place throughout compilation, using them to guide the output of @code{is_stmt} markers in the line number table. This is enabled by default when -compiling with optimization (@option{-Os}, @option{-O}, @option{-O2}, +compiling with optimization (@option{-Os}, @option{-O1}, @option{-O2}, @dots{}), and outputting DWARF 2 debug information at the normal level. @item -gvariable-location-views @@ -10184,7 +10186,7 @@ that do not involve a space-speed tradeoff. As compared to @option{-O}, this option increases both compilation time and the performance of the generated code. -@option{-O2} turns on all optimization flags specified by @option{-O}. It +@option{-O2} turns on all optimization flags specified by @option{-O1}. It also turns on the following optimization flags: @c Please keep the following list alphabetized! @@ -10331,7 +10333,7 @@ instructions and checks if the result can be simplified. If loop unrolling is active, two passes are performed and the second is scheduled after loop unrolling. -This option is enabled by default at optimization levels @option{-O}, +This option is enabled by default at optimization levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. @item -ffp-contract=@var{style} @@ -10359,7 +10361,7 @@ Note that @option{-fno-omit-frame-pointer} doesn't guarantee the frame pointer is used in all functions. Several targets always omit the frame pointer in leaf functions. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -foptimize-sibling-calls @opindex foptimize-sibling-calls @@ -10513,7 +10515,7 @@ This option is the default for optimized compilation if the assembler and linker support it. Use @option{-fno-merge-constants} to inhibit this behavior. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}. +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. 
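Similarly, for the -Wmismatched-dealloc entry earlier in this hunk, a small sketch of a diagnosed mismatch. The open_log wrapper is hypothetical; the malloc-attribute pairing mirrors the style of the invoke.texi example, and the warning wording in the comment is approximate.

```c++
// Sketch of the mismatch -Wmismatched-dealloc diagnoses (GCC 11+):
// a pointer tied to fclose by the malloc attribute is passed to free.
#include <stdio.h>
#include <stdlib.h>

__attribute__ ((malloc, malloc (fclose)))
FILE *open_log (const char *path)   // hypothetical helper
{
  return fopen (path, "a");
}

void g ()
{
  FILE *fp = open_log ("log.txt");
  if (fp)
    free (fp);   // warning: 'free' called on pointer returned from
                 // 'open_log', which expects 'fclose' (approximate)
}
```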
@item -fmerge-all-constants @opindex fmerge-all-constants @@ -10599,7 +10601,7 @@ long} on a 32-bit system, split the registers apart and allocate them independently. This normally generates better code for those types, but may make debugging more difficult. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. @item -fsplit-wide-types-early @@ -10711,18 +10713,18 @@ Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}. @opindex fauto-inc-dec Combine increments or decrements of addresses with memory accesses. This pass is always skipped on architectures that do not have -instructions to support this. Enabled by default at @option{-O} and +instructions to support this. Enabled by default at @option{-O1} and higher on architectures that support this. @item -fdce @opindex fdce Perform dead code elimination (DCE) on RTL@. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fdse @opindex fdse Perform dead store elimination (DSE) on RTL@. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fif-conversion @opindex fif-conversion @@ -10731,7 +10733,7 @@ includes use of conditional moves, min, max, set flags and abs instructions, and some tricks doable by standard arithmetics. The use of conditional execution on chips where it is available is controlled by @option{-fif-conversion2}. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}, but +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}, but not with @option{-Og}. @item -fif-conversion2 @@ -10739,7 +10741,7 @@ not with @option{-Og}. Use conditional execution (where available) to transform conditional jumps into branch-less equivalents. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}, but +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}, but not with @option{-Og}. @item -fdeclone-ctor-dtor @@ -10919,7 +10921,7 @@ If supported for the target machine, attempt to reorder instructions to exploit instruction slots available after delayed branch instructions. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}, +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}, but not at @option{-Og}. @item -fschedule-insns @@ -11157,7 +11159,7 @@ and the @option{large-stack-frame-growth} parameter to 400. @item -ftree-reassoc @opindex ftree-reassoc Perform reassociation on trees. This flag is enabled by default -at @option{-O} and higher. +at @option{-O1} and higher. @item -fcode-hoisting @opindex fcode-hoisting @@ -11180,7 +11182,7 @@ enabled by default at @option{-O3}. @item -ftree-forwprop @opindex ftree-forwprop Perform forward propagation on trees. This flag is enabled by default -at @option{-O} and higher. +at @option{-O1} and higher. @item -ftree-fre @opindex ftree-fre @@ -11188,12 +11190,12 @@ Perform full redundancy elimination (FRE) on trees. The difference between FRE and PRE is that FRE only considers expressions that are computed on all paths leading to the redundant computation. This analysis is faster than PRE, though it exposes fewer redundancies. -This flag is enabled by default at @option{-O} and higher. +This flag is enabled by default at @option{-O1} and higher. @item -ftree-phiprop @opindex ftree-phiprop Perform hoisting of loads from conditional pointers on trees. This -pass is enabled by default at @option{-O} and higher. 
+pass is enabled by default at @option{-O1} and higher. @item -fhoist-adjacent-loads @opindex fhoist-adjacent-loads @@ -11205,24 +11207,24 @@ by default at @option{-O2} and higher. @item -ftree-copy-prop @opindex ftree-copy-prop Perform copy propagation on trees. This pass eliminates unnecessary -copy operations. This flag is enabled by default at @option{-O} and +copy operations. This flag is enabled by default at @option{-O1} and higher. @item -fipa-pure-const @opindex fipa-pure-const Discover which functions are pure or constant. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fipa-reference @opindex fipa-reference Discover which static variables do not escape the compilation unit. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fipa-reference-addressable @opindex fipa-reference-addressable Discover read-only, write-only and non-addressable static variables. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fipa-stack-alignment @opindex fipa-stack-alignment @@ -11243,14 +11245,14 @@ cold functions are marked as cold. Also functions executed once (such as @code{cold}, @code{noreturn}, static constructors or destructors) are identified. Cold functions and loop less parts of functions executed once are then optimized for size. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fipa-modref @opindex fipa-modref Perform interprocedural mod/ref analysis. This optimization analyzes the side effects of functions (memory locations that are modified or referenced) and enables better optimization across the function call boundary. This flag is -enabled by default at @option{-O} and higher. +enabled by default at @option{-O1} and higher. @item -fipa-cp @opindex fipa-cp @@ -11378,7 +11380,7 @@ currently enabled, but may be enabled by @option{-O2} in the future. @item -ftree-sink @opindex ftree-sink Perform forward store motion on trees. This flag is -enabled by default at @option{-O} and higher. +enabled by default at @option{-O1} and higher. @item -ftree-bit-ccp @opindex ftree-bit-ccp @@ -11392,14 +11394,14 @@ It requires that @option{-ftree-ccp} is enabled. @opindex ftree-ccp Perform sparse conditional constant propagation (CCP) on trees. This pass only operates on local scalar variables and is enabled by default -at @option{-O} and higher. +at @option{-O1} and higher. @item -fssa-backprop @opindex fssa-backprop Propagate information about uses of a value up the definition chain in order to simplify the definitions. For example, this pass strips sign operations if the sign of a value never matters. The flag is -enabled by default at @option{-O} and higher. +enabled by default at @option{-O1} and higher. @item -fssa-phiopt @opindex fssa-phiopt @@ -11425,7 +11427,7 @@ be limited using @option{max-tail-merge-comparisons} parameter and @item -ftree-dce @opindex ftree-dce Perform dead code elimination (DCE) on trees. This flag is enabled by -default at @option{-O} and higher. +default at @option{-O1} and higher. @item -ftree-builtin-call-dce @opindex ftree-builtin-call-dce @@ -11450,26 +11452,26 @@ Perform a variety of simple scalar cleanups (constant/copy propagation, redundancy elimination, range propagation and expression simplification) based on a dominator tree traversal. This also performs jump threading (to reduce jumps to jumps). This flag is -enabled by default at @option{-O} and higher. 
+enabled by default at @option{-O1} and higher. @item -ftree-dse @opindex ftree-dse Perform dead store elimination (DSE) on trees. A dead store is a store into a memory location that is later overwritten by another store without any intervening loads. In this case the earlier store can be deleted. This -flag is enabled by default at @option{-O} and higher. +flag is enabled by default at @option{-O1} and higher. @item -ftree-ch @opindex ftree-ch Perform loop header copying on trees. This is beneficial since it increases effectiveness of code motion optimizations. It also saves one jump. This flag -is enabled by default at @option{-O} and higher. It is not enabled +is enabled by default at @option{-O1} and higher. It is not enabled for @option{-Os}, since it usually increases code size. @item -ftree-loop-optimize @opindex ftree-loop-optimize Perform loop optimizations on trees. This flag is enabled by default -at @option{-O} and higher. +at @option{-O1} and higher. @item -ftree-loop-linear @itemx -floop-strip-mine @@ -11624,7 +11626,7 @@ in such a way that its value when exiting the loop can be determined using only its initial value and the number of loop iterations, replace uses of the final value by such a computation, provided it is sufficiently cheap. This reduces data dependencies and may allow further simplifications. -Enabled by default at @option{-O} and higher. +Enabled by default at @option{-O1} and higher. @item -fivopts @opindex fivopts @@ -11666,13 +11668,13 @@ Perform temporary expression replacement during the SSA->normal phase. Single use/single def temporaries are replaced at their use location with their defining expression. This results in non-GIMPLE code, but gives the expanders much more complex trees to work on resulting in better RTL generation. This is -enabled by default at @option{-O} and higher. +enabled by default at @option{-O1} and higher. @item -ftree-slsr @opindex ftree-slsr Perform straight-line strength reduction on trees. This recognizes related expressions involving multiplications and replaces them by less expensive -calculations when possible. This is enabled by default at @option{-O} and +calculations when possible. This is enabled by default at @option{-O1} and higher. @item -ftree-vectorize @@ -11853,7 +11855,7 @@ The default is @option{-fguess-branch-probability} at levels Reorder basic blocks in the compiled function in order to reduce number of taken branches and improve code locality. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}. +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. @item -freorder-blocks-algorithm=@var{algorithm} @opindex freorder-blocks-algorithm @@ -11864,7 +11866,7 @@ or @samp{stc}, the ``software trace cache'' algorithm, which tries to put all often executed code together, minimizing the number of branches executed by making extra copies of code. -The default is @samp{simple} at levels @option{-O}, @option{-Os}, and +The default is @samp{simple} at levels @option{-O1}, @option{-Os}, and @samp{stc} at levels @option{-O2}, @option{-O3}. @item -freorder-blocks-and-partition @@ -12434,7 +12436,7 @@ explicit comparison operation. This pass only applies to certain targets that cannot explicitly represent the comparison operation before register allocation is complete. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}. +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. 
@item -fcprop-registers @opindex fcprop-registers @@ -12442,7 +12444,7 @@ After register allocation and post-register allocation instruction splitting, perform a copy-propagation pass to try to reduce scheduling dependencies and occasionally eliminate the copy. -Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}. +Enabled at levels @option{-O1}, @option{-O2}, @option{-O3}, @option{-Os}. @item -fprofile-correction @opindex fprofile-correction diff --git a/gcc/dominance.c b/gcc/dominance.c index 5fa172f..6a262ce 100644 --- a/gcc/dominance.c +++ b/gcc/dominance.c @@ -883,17 +883,17 @@ set_immediate_dominator (enum cdi_direction dir, basic_block bb, /* Returns the list of basic blocks immediately dominated by BB, in the direction DIR. */ -vec<basic_block> +auto_vec<basic_block> get_dominated_by (enum cdi_direction dir, basic_block bb) { unsigned int dir_index = dom_convert_dir_to_idx (dir); struct et_node *node = bb->dom[dir_index], *son = node->son, *ason; - vec<basic_block> bbs = vNULL; + auto_vec<basic_block> bbs; gcc_checking_assert (dom_computed[dir_index]); if (!son) - return vNULL; + return bbs; bbs.safe_push ((basic_block) son->data); for (ason = son->right; ason != son; ason = ason->right) @@ -906,13 +906,13 @@ get_dominated_by (enum cdi_direction dir, basic_block bb) direction DIR) by some block between N_REGION ones stored in REGION, except for blocks in the REGION itself. */ -vec<basic_block> +auto_vec<basic_block> get_dominated_by_region (enum cdi_direction dir, basic_block *region, unsigned n_region) { unsigned i; basic_block dom; - vec<basic_block> doms = vNULL; + auto_vec<basic_block> doms; for (i = 0; i < n_region; i++) region[i]->flags |= BB_DUPLICATED; @@ -933,10 +933,10 @@ get_dominated_by_region (enum cdi_direction dir, basic_block *region, produce a vector containing all dominated blocks. The vector will be sorted in preorder. */ -vec<basic_block> +auto_vec<basic_block> get_dominated_to_depth (enum cdi_direction dir, basic_block bb, int depth) { - vec<basic_block> bbs = vNULL; + auto_vec<basic_block> bbs; unsigned i; unsigned next_level_start; @@ -965,7 +965,7 @@ get_dominated_to_depth (enum cdi_direction dir, basic_block bb, int depth) /* Returns the list of basic blocks including BB dominated by BB, in the direction DIR. The vector will be sorted in preorder. 
*/ -vec<basic_block> +auto_vec<basic_block> get_all_dominated_blocks (enum cdi_direction dir, basic_block bb) { return get_dominated_to_depth (dir, bb, 0); diff --git a/gcc/dominance.h b/gcc/dominance.h index 4eeac59..1a8c248 100644 --- a/gcc/dominance.h +++ b/gcc/dominance.h @@ -46,14 +46,14 @@ extern void free_dominance_info_for_region (function *, extern basic_block get_immediate_dominator (enum cdi_direction, basic_block); extern void set_immediate_dominator (enum cdi_direction, basic_block, basic_block); -extern vec<basic_block> get_dominated_by (enum cdi_direction, basic_block); -extern vec<basic_block> get_dominated_by_region (enum cdi_direction, +extern auto_vec<basic_block> get_dominated_by (enum cdi_direction, basic_block); +extern auto_vec<basic_block> get_dominated_by_region (enum cdi_direction, basic_block *, unsigned); -extern vec<basic_block> get_dominated_to_depth (enum cdi_direction, - basic_block, int); -extern vec<basic_block> get_all_dominated_blocks (enum cdi_direction, - basic_block); +extern auto_vec<basic_block> get_dominated_to_depth (enum cdi_direction, + basic_block, int); +extern auto_vec<basic_block> get_all_dominated_blocks (enum cdi_direction, + basic_block); extern void redirect_immediate_dominators (enum cdi_direction, basic_block, basic_block); extern basic_block nearest_common_dominator (enum cdi_direction, diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index 3c71933..f073f34 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,36 @@ +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * expr.c (gfc_check_pointer_assign): Avoid NULL pointer + dereference. + * match.c (gfc_match_pointer_assignment): Likewise. + * parse.c (gfc_check_do_variable): Avoid comparison with NULL + symtree. + +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + Revert: + 2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * expr.c (gfc_check_pointer_assign): Avoid NULL pointer + dereference. + * match.c (gfc_match_pointer_assignment): Likewise. + * parse.c (gfc_check_do_variable): Avoid comparison with NULL + symtree. + +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * expr.c (gfc_check_pointer_assign): Avoid NULL pointer + dereference. + * match.c (gfc_match_pointer_assignment): Likewise. + * parse.c (gfc_check_do_variable): Avoid comparison with NULL + symtree. 
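Stepping back to the dominance.h hunk above: get_dominated_by, get_dominated_by_region, get_dominated_to_depth and get_all_dominated_blocks now return an auto_vec by value instead of a plain vec, so callers stop managing the storage by hand. A hedged before/after sketch of a hypothetical caller (the real conversions follow in the gcse.c, tree-cfg.c and other hunks below):

  /* Before: a plain vec must be released explicitly, which is easy to
     forget on early-return paths.  */
  vec<basic_block> bbs = get_dominated_by (CDI_DOMINATORS, bb);
  /* ... use bbs ...  */
  bbs.release ();

  /* After: the auto_vec frees its storage when it goes out of scope.  */
  auto_vec<basic_block> doms = get_dominated_by (CDI_DOMINATORS, bb);
  /* ... use doms ...  */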
+ 2021-06-15 Tobias Burnus <tobias@codesourcery.com> PR fortran/92568 diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index 956003e..b11ae7c 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -3815,6 +3815,9 @@ gfc_check_pointer_assign (gfc_expr *lvalue, gfc_expr *rvalue, int proc_pointer; bool same_rank; + if (!lvalue->symtree) + return false; + lhs_attr = gfc_expr_attr (lvalue); if (lvalue->ts.type == BT_UNKNOWN && !lhs_attr.proc_pointer) { diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c index 2946201..d148de3 100644 --- a/gcc/fortran/match.c +++ b/gcc/fortran/match.c @@ -1409,7 +1409,7 @@ gfc_match_pointer_assignment (void) gfc_matching_procptr_assignment = 0; m = gfc_match (" %v =>", &lvalue); - if (m != MATCH_YES) + if (m != MATCH_YES || !lvalue->symtree) { m = MATCH_NO; goto cleanup; diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 0522b39..6d7845e 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -4588,6 +4588,9 @@ gfc_check_do_variable (gfc_symtree *st) { gfc_state_data *s; + if (!st) + return 0; + for (s=gfc_state_stack; s; s = s->previous) if (s->do_variable == st) { diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h index f7584eb..ff92afe 100644 --- a/gcc/gcov-io.h +++ b/gcc/gcov-io.h @@ -42,15 +42,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see Numbers are recorded in the 32 bit unsigned binary form of the endianness of the machine generating the file. 64 bit numbers are - stored as two 32 bit numbers, the low part first. Strings are - padded with 1 to 4 NUL bytes, to bring the length up to a multiple - of 4. The number of 4 bytes is stored, followed by the padded + stored as two 32 bit numbers, the low part first. + The number of bytes is stored, followed by the string. Zero length and NULL strings are simply stored as a length of zero (they have no trailing NUL or padding). int32: byte3 byte2 byte1 byte0 | byte0 byte1 byte2 byte3 int64: int32:low int32:high - string: int32:0 | int32:length char* char:0 padding + string: int32:0 | int32:length char* char:0 padding: | char:0 | char:0 char:0 | char:0 char:0 char:0 item: int32 | int64 | string @@ -3050,9 +3050,7 @@ static int hoist_code (void) { basic_block bb, dominated; - vec<basic_block> dom_tree_walk; unsigned int dom_tree_walk_index; - vec<basic_block> domby; unsigned int i, j, k; struct gcse_expr **index_map; struct gcse_expr *expr; @@ -3106,15 +3104,16 @@ hoist_code (void) if (flag_ira_hoist_pressure) hoisted_bbs = BITMAP_ALLOC (NULL); - dom_tree_walk = get_all_dominated_blocks (CDI_DOMINATORS, - ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb); + auto_vec<basic_block> dom_tree_walk + = get_all_dominated_blocks (CDI_DOMINATORS, + ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb); /* Walk over each basic block looking for potentially hoistable expressions, nothing gets hoisted from the entry block. 
*/ FOR_EACH_VEC_ELT (dom_tree_walk, dom_tree_walk_index, bb) { - domby = get_dominated_to_depth (CDI_DOMINATORS, bb, - param_max_hoist_depth); + auto_vec<basic_block> domby + = get_dominated_to_depth (CDI_DOMINATORS, bb, param_max_hoist_depth); if (domby.length () == 0) continue; @@ -3315,10 +3314,8 @@ hoist_code (void) bitmap_clear (from_bbs); } } - domby.release (); } - dom_tree_walk.release (); BITMAP_FREE (from_bbs); if (flag_ira_hoist_pressure) BITMAP_FREE (hoisted_bbs); diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index d9a57c2..cc2b709 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -727,7 +727,7 @@ ranger_cache::ranger_cache (gimple_ranger &q) : query (q) if (bb) m_gori.exports (bb); } - enable_new_values (true); + m_new_value_p = true; } ranger_cache::~ranger_cache () @@ -978,8 +978,7 @@ ranger_cache::range_of_expr (irange &r, tree name, gimple *stmt) } -// Implement range_on_edge. Return TRUE if the edge generates a range, -// otherwise false.. but still return a range. +// Implement range_on_edge. Always return the best available range. bool ranger_cache::range_on_edge (irange &r, edge e, tree expr) @@ -989,14 +988,11 @@ ranger_cache::range_of_expr (irange &r, tree name, gimple *stmt) exit_range (r, expr, e->src); int_range_max edge_range; if (m_gori.outgoing_edge_range_p (edge_range, e, expr, *this)) - { - r.intersect (edge_range); - return true; - } + r.intersect (edge_range); + return true; } - else - get_tree_range (r, expr, NULL); - return false; + + return get_tree_range (r, expr, NULL); } diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc index 09dcd69..647f496 100644 --- a/gcc/gimple-range-gori.cc +++ b/gcc/gimple-range-gori.cc @@ -972,16 +972,18 @@ gori_compute::compute_operand1_and_operand2_range (irange &r, r.intersect (op_range); return true; } -// Return TRUE if a range can be calcalated for NAME on edge E. +// Return TRUE if a range can be calculated or recomputed for NAME on edge E. bool gori_compute::has_edge_range_p (tree name, edge e) { - // If no edge is specified, check if NAME is an export on any edge. - if (!e) - return is_export_p (name); + // Check if NAME is an export or can be recomputed. + if (e) + return is_export_p (name, e->src) || may_recompute_p (name, e); - return is_export_p (name, e->src); + // If no edge is specified, check if NAME can have a range calculated + // on any edge. + return is_export_p (name) || may_recompute_p (name); } // Dump what is known to GORI computes to listing file F. @@ -992,6 +994,32 @@ gori_compute::dump (FILE *f) gori_map::dump (f); } +// Return TRUE if NAME can be recomputed on edge E. If any direct dependent +// is exported on edge E, it may change the computed value of NAME. + +bool +gori_compute::may_recompute_p (tree name, edge e) +{ + tree dep1 = depend1 (name); + tree dep2 = depend2 (name); + + // If the first dependency is not set, there is no recomputation. + if (!dep1) + return false; + + // Don't recalculate PHIs or statements with side_effects. + gimple *s = SSA_NAME_DEF_STMT (name); + if (is_a<gphi *> (s) || gimple_has_side_effects (s)) + return false; + + // If edge is specified, check if NAME can be recalculated on that edge. + if (e) + return ((is_export_p (dep1, e->src)) + || (dep2 && is_export_p (dep2, e->src))); + + return (is_export_p (dep1)) || (dep2 && is_export_p (dep2)); +} + // Calculate a range on edge E and return it in R. Try to evaluate a // range for NAME on this edge.
Return FALSE if this is either not a // control edge or NAME is not defined by this edge. @@ -1026,6 +1054,27 @@ gori_compute::outgoing_edge_range_p (irange &r, edge e, tree name, return true; } } + // If NAME isn't exported, check if it can be recomputed. + else if (may_recompute_p (name, e)) + { + gimple *def_stmt = SSA_NAME_DEF_STMT (name); + + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "recomputation attempt on edge %d->%d for ", + e->src->index, e->dest->index); + print_generic_expr (dump_file, name, TDF_SLIM); + } + // Simply calculate DEF_STMT on edge E using the range query Q. + fold_range (r, def_stmt, e, &q); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, " : Calculated :"); + r.dump (dump_file); + fputc ('\n', dump_file); + } + return true; + } return false; } diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h index 06f3b79..6f187db 100644 --- a/gcc/gimple-range-gori.h +++ b/gcc/gimple-range-gori.h @@ -157,6 +157,7 @@ public: bool has_edge_range_p (tree name, edge e = NULL); void dump (FILE *f); private: + bool may_recompute_p (tree name, edge e = NULL); bool compute_operand_range (irange &r, gimple *stmt, const irange &lhs, tree name, class fur_source &src); bool compute_operand_range_switch (irange &r, gswitch *s, const irange &lhs, diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc index 481b89b..5810970 100644 --- a/gcc/gimple-range.cc +++ b/gcc/gimple-range.cc @@ -1394,7 +1394,8 @@ gimple_ranger::dump_bb (FILE *f, basic_block bb) for (x = 1; x < num_ssa_names; x++) { tree name = gimple_range_ssa_p (ssa_name (x)); - if (name && m_cache.range_on_edge (range, e, name)) + if (name && gori ().has_edge_range_p (name, e) + && m_cache.range_on_edge (range, e, name)) { gimple *s = SSA_NAME_DEF_STMT (name); // Only print the range if this is the def block, or @@ -1661,4 +1662,83 @@ disable_ranger (struct function *fun) fun->x_range_query = &global_ranges; } +// ========================================= +// Debugging helpers. +// ========================================= + +// Query all statements in the IL to precalculate computable ranges in RANGER. + +static DEBUG_FUNCTION void +debug_seed_ranger (gimple_ranger &ranger) +{ + // Recalculate SCEV to make sure the dump lists everything. + if (scev_initialized_p ()) + { + scev_finalize (); + scev_initialize (); + } + + basic_block bb; + int_range_max r; + gimple_stmt_iterator gsi; + FOR_EACH_BB_FN (bb, cfun) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + + if (is_gimple_debug (stmt)) + continue; + + ranger.range_of_stmt (r, stmt); + } +} + +// Dump all that ranger knows for the current function. + +DEBUG_FUNCTION void +dump_ranger (FILE *out) +{ + gimple_ranger ranger; + debug_seed_ranger (ranger); + ranger.dump (out); +} + +DEBUG_FUNCTION void +debug_ranger () +{ + dump_ranger (stderr); +} + +// Dump all that ranger knows on a path of BBs. +// +// Note that the blocks are in reverse order, thus the exit block is +// path[0].
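To make the recomputation support added in the gimple-range-gori.cc hunks above concrete, here is a hedged sketch of the situation may_recompute_p is designed to catch (hypothetical GIMPLE, not taken from the patch):

  b_2 = a_1 + 5;
  if (a_1 > 100)   /* Only a_1 is exported on the outgoing edges.  */
    use (b_2);

b_2 itself generates no range on the true edge, but its direct dependency a_1 does, namely [101, +INF], so outgoing_edge_range_p now re-folds the defining statement b_2 = a_1 + 5 on that edge and recomputes b_2 as [106, +INF].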
+ +DEBUG_FUNCTION void +dump_ranger (FILE *dump_file, const vec<basic_block> &path) +{ + if (path.length () == 0) + { + fprintf (dump_file, "empty\n"); + return; + } + + gimple_ranger ranger; + debug_seed_ranger (ranger); + + unsigned i = path.length (); + do + { + i--; + ranger.dump_bb (dump_file, path[i]); + } + while (i > 0); +} + +DEBUG_FUNCTION void +debug_ranger (const vec<basic_block> &path) +{ + dump_ranger (stderr, path); +} + #include "gimple-range-tests.cc" diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 2cae69e..57c18af 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -4527,7 +4527,7 @@ create_specialized_node (struct cgraph_node *node, vec<tree> known_csts, vec<ipa_polymorphic_call_context> known_contexts, struct ipa_agg_replacement_value *aggvals, - vec<cgraph_edge *> callers) + vec<cgraph_edge *> &callers) { ipa_node_params *new_info, *info = ipa_node_params_sum->get (node); vec<ipa_replace_map *, va_gc> *replace_trees = NULL; @@ -4672,7 +4672,6 @@ create_specialized_node (struct cgraph_node *node, ipcp_discover_new_direct_edges (new_node, known_csts, known_contexts, aggvals); - callers.release (); return new_node; } @@ -5562,6 +5561,7 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset, offset, val->value)); val->spec_node = create_specialized_node (node, known_csts, known_contexts, aggvals, callers); + callers.release (); overall_size += val->local_size_cost; if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " overall size reached %li\n", @@ -5638,7 +5638,7 @@ decide_whether_version_node (struct cgraph_node *node) } struct cgraph_node *clone; - vec<cgraph_edge *> callers = node->collect_callers (); + auto_vec<cgraph_edge *> callers = node->collect_callers (); for (int i = callers.length () - 1; i >= 0; i--) { @@ -5654,7 +5654,6 @@ decide_whether_version_node (struct cgraph_node *node) /* If node is not called by anyone, or all its caller edges are self-recursive, the node is not really in use, no need to do cloning. 
*/ - callers.release (); info->do_clone_for_all_contexts = false; return ret; } diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c index 3f90d4d..3272daf 100644 --- a/gcc/ipa-sra.c +++ b/gcc/ipa-sra.c @@ -3755,7 +3755,7 @@ process_isra_node_results (cgraph_node *node, unsigned &suffix_counter = clone_num_suffixes->get_or_insert ( IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME ( node->decl))); - vec<cgraph_edge *> callers = node->collect_callers (); + auto_vec<cgraph_edge *> callers = node->collect_callers (); cgraph_node *new_node = node->create_virtual_clone (callers, NULL, new_adjustments, "isra", suffix_counter); diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c index 8b6fcf5..66d9348 100644 --- a/gcc/loop-unroll.c +++ b/gcc/loop-unroll.c @@ -884,7 +884,7 @@ unroll_loop_runtime_iterations (class loop *loop) { rtx old_niter, niter, tmp; rtx_insn *init_code, *branch_code; - unsigned i, j; + unsigned i; profile_probability p; basic_block preheader, *body, swtch, ezc_swtch = NULL; int may_exit_copy; @@ -908,15 +908,9 @@ unroll_loop_runtime_iterations (class loop *loop) body = get_loop_body (loop); for (i = 0; i < loop->num_nodes; i++) { - vec<basic_block> ldom; - basic_block bb; - - ldom = get_dominated_by (CDI_DOMINATORS, body[i]); - FOR_EACH_VEC_ELT (ldom, j, bb) + for (basic_block bb : get_dominated_by (CDI_DOMINATORS, body[i])) if (!flow_bb_inside_loop_p (loop, bb)) dom_bbs.safe_push (bb); - - ldom.release (); } free (body); @@ -1013,7 +1007,7 @@ unroll_loop_runtime_iterations (class loop *loop) gcc_assert (ok); /* Create item for switch. */ - j = n_peel - i - (extra_zero_check ? 0 : 1); + unsigned j = n_peel - i - (extra_zero_check ? 0 : 1); p = profile_probability::always ().apply_scale (1, i + 2); preheader = split_edge (loop_preheader_edge (loop)); diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 8b27352..c6e8817 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,93 @@ +2021-06-16 Jason Merrill <jason@redhat.com> + + PR c++/101078 + PR c++/91706 + * g++.dg/template/access39.C: New test. + +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * gfortran.dg/pr95502.f90: New test. + +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + Revert: + 2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * gfortran.dg/pr95502.f90: New test. + +2021-06-16 Harald Anlauf <anlauf@gmx.de> + + PR fortran/95501 + PR fortran/95502 + * gfortran.dg/pr95502.f90: New test. + +2021-06-16 Jason Merrill <jason@redhat.com> + + PR c++/100796 + PR preprocessor/96391 + * g++.dg/plugin/location-overflow-test-pr100796.c: New test. + * g++.dg/plugin/plugin.exp: Run it. + +2021-06-16 Jonathan Wright <jonathan.wright@arm.com> + + * gcc.target/aarch64/narrow_zero_high_half.c: Add new tests. + +2021-06-16 Jonathan Wright <jonathan.wright@arm.com> + + * gcc.target/aarch64/narrow_zero_high_half.c: Add new tests. + +2021-06-16 Jonathan Wright <jonathan.wright@arm.com> + + * gcc.target/aarch64/narrow_zero_high_half.c: Add new tests. + +2021-06-16 Jonathan Wright <jonathan.wright@arm.com> + + * gcc.target/aarch64/narrow_zero_high_half.c: Add new tests. + +2021-06-16 Jonathan Wright <jonathan.wright@arm.com> + + * gcc.target/aarch64/narrow_zero_high_half.c: New test. + +2021-06-16 Martin Jambor <mjambor@suse.cz> + + PR tree-optimization/100453 + * gcc.dg/tree-ssa/pr100453.c: New test. + +2021-06-16 Jakub Jelinek <jakub@redhat.com> + + * gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of + noinline, noclone. 
+ +2021-06-16 Jakub Jelinek <jakub@redhat.com> + + PR middle-end/101062 + * gcc.dg/pr101062.c: New test. + +2021-06-16 Richard Biener <rguenther@suse.de> + + PR tree-optimization/101088 + * gcc.dg/torture/pr101088.c: New testcase. + +2021-06-16 Roger Sayle <roger@nextmovesoftware.com> + + PR rtl-optimization/46235 + * gcc.target/i386/bt-5.c: New test. + * gcc.target/i386/bt-6.c: New test. + * gcc.target/i386/bt-7.c: New test. + +2021-06-16 Arnaud Charlet <charlet@adacore.com> + + * gnat.dg/limited4.adb: Disable illegal code. + +2021-06-16 Richard Biener <rguenther@suse.de> + + PR tree-optimization/101083 + * gcc.dg/vect/pr97832-4.c: New testcase. + 2021-06-15 Tobias Burnus <tobias@codesourcery.com> PR fortran/92568 diff --git a/gcc/testsuite/g++.dg/plugin/location-overflow-test-pr100796.c b/gcc/testsuite/g++.dg/plugin/location-overflow-test-pr100796.c new file mode 100644 index 0000000..7fa964c --- /dev/null +++ b/gcc/testsuite/g++.dg/plugin/location-overflow-test-pr100796.c @@ -0,0 +1,25 @@ +// PR c++/100796 +// { dg-additional-options "-Wsuggest-override -fplugin-arg-location_overflow_plugin-value=0x60000001" } +// Passing LINE_MAP_MAX_LOCATION_WITH_COLS meant we stopped distinguishing between lines in a macro. + +#define DO_PRAGMA(text) _Pragma(#text) +#define WARNING_PUSH DO_PRAGMA(GCC diagnostic push) +#define WARNING_POP DO_PRAGMA(GCC diagnostic pop) +#define WARNING_DISABLE(text) DO_PRAGMA(GCC diagnostic ignored text) +#define NO_OVERRIDE_WARNING WARNING_DISABLE("-Wsuggest-override") + +#define BOILERPLATE \ + WARNING_PUSH \ + NO_OVERRIDE_WARNING \ + void f(); \ + WARNING_POP + +struct B +{ + virtual void f(); +}; + +struct D: B +{ + BOILERPLATE +}; diff --git a/gcc/testsuite/g++.dg/plugin/plugin.exp b/gcc/testsuite/g++.dg/plugin/plugin.exp index 5cd4b4b..74e12df 100644 --- a/gcc/testsuite/g++.dg/plugin/plugin.exp +++ b/gcc/testsuite/g++.dg/plugin/plugin.exp @@ -73,7 +73,8 @@ set plugin_test_list [list \ ../../gcc.dg/plugin/diagnostic-test-string-literals-3.c \ ../../gcc.dg/plugin/diagnostic-test-string-literals-4.c } \ { ../../gcc.dg/plugin/location_overflow_plugin.c \ - location-overflow-test-pr96391.c } \ + location-overflow-test-pr96391.c \ + location-overflow-test-pr100796.c } \ { show_template_tree_color_plugin.c \ show-template-tree-color.C \ show-template-tree-color-labels.C \ diff --git a/gcc/testsuite/g++.dg/template/access39.C b/gcc/testsuite/g++.dg/template/access39.C new file mode 100644 index 0000000..d941555 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/access39.C @@ -0,0 +1,17 @@ +// PR c++/101078 + +struct A { + static void f(); +}; + +template<class> +struct B : private A { + struct C { + void g() { f(); } + void g2() { B::f(); } + }; +}; + +int main() { + B<int>::C().g(); +} diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c new file mode 100644 index 0000000..6b427aa --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c @@ -0,0 +1,43 @@ +/* { dg-do run } */ +/* { dg-require-effective-target vect_float} */ +/* { dg-additional-options "-w -Wno-psabi -ffast-math" } */ + +#include "tree-vect.h" + +typedef float v4sf __attribute__((vector_size(sizeof(float)*4))); + +float __attribute__((noipa)) +f(v4sf v) +{ + return v[0]+v[1]+v[2]+v[3]; +} + +float __attribute__((noipa)) +g(float *v) +{ + return v[0]+v[1]+v[2]+v[3]; +} + +float __attribute__((noipa)) +h(float *v) +{ + return 2*v[0]+3*v[1]+4*v[2]+5*v[3]; +} + +int +main () +{ + check_vect (); + v4sf v = (v4sf) { 1.f, 3.f, 4.f, 2.f }; + if (f (v) != 10.f) + 
abort (); + if (g (&v[0]) != 10.f) + abort (); + if (h (&v[0]) != 37.f) + abort (); + return 0; +} + +/* We are lacking an effective target for .REDUC_PLUS support. */ +/* { dg-final { scan-tree-dump-times "basic block part vectorized" 3 "slp2" { target x86_64-*-* } } } */ +/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target x86_64-*-* } } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/narrow_zero_high_half.c b/gcc/testsuite/gcc.target/aarch64/narrow_zero_high_half.c new file mode 100644 index 0000000..dd5ddf8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/narrow_zero_high_half.c @@ -0,0 +1,130 @@ +/* { dg-skip-if "" { arm*-*-* } } */ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +#include <arm_neon.h> + +#define TEST_SHIFT(name, rettype, intype, fs, rs) \ + rettype test_ ## name ## _ ## fs ## _zero_high \ + (intype a) \ + { \ + return vcombine_ ## rs (name ## _ ## fs (a, 4), \ + vdup_n_ ## rs (0)); \ + } + +TEST_SHIFT (vshrn_n, int8x16_t, int16x8_t, s16, s8) +TEST_SHIFT (vshrn_n, int16x8_t, int32x4_t, s32, s16) +TEST_SHIFT (vshrn_n, int32x4_t, int64x2_t, s64, s32) +TEST_SHIFT (vshrn_n, uint8x16_t, uint16x8_t, u16, u8) +TEST_SHIFT (vshrn_n, uint16x8_t, uint32x4_t, u32, u16) +TEST_SHIFT (vshrn_n, uint32x4_t, uint64x2_t, u64, u32) + +TEST_SHIFT (vrshrn_n, int8x16_t, int16x8_t, s16, s8) +TEST_SHIFT (vrshrn_n, int16x8_t, int32x4_t, s32, s16) +TEST_SHIFT (vrshrn_n, int32x4_t, int64x2_t, s64, s32) +TEST_SHIFT (vrshrn_n, uint8x16_t, uint16x8_t, u16, u8) +TEST_SHIFT (vrshrn_n, uint16x8_t, uint32x4_t, u32, u16) +TEST_SHIFT (vrshrn_n, uint32x4_t, uint64x2_t, u64, u32) + +TEST_SHIFT (vqshrn_n, int8x16_t, int16x8_t, s16, s8) +TEST_SHIFT (vqshrn_n, int16x8_t, int32x4_t, s32, s16) +TEST_SHIFT (vqshrn_n, int32x4_t, int64x2_t, s64, s32) +TEST_SHIFT (vqshrn_n, uint8x16_t, uint16x8_t, u16, u8) +TEST_SHIFT (vqshrn_n, uint16x8_t, uint32x4_t, u32, u16) +TEST_SHIFT (vqshrn_n, uint32x4_t, uint64x2_t, u64, u32) + +TEST_SHIFT (vqrshrn_n, int8x16_t, int16x8_t, s16, s8) +TEST_SHIFT (vqrshrn_n, int16x8_t, int32x4_t, s32, s16) +TEST_SHIFT (vqrshrn_n, int32x4_t, int64x2_t, s64, s32) +TEST_SHIFT (vqrshrn_n, uint8x16_t, uint16x8_t, u16, u8) +TEST_SHIFT (vqrshrn_n, uint16x8_t, uint32x4_t, u32, u16) +TEST_SHIFT (vqrshrn_n, uint32x4_t, uint64x2_t, u64, u32) + +TEST_SHIFT (vqshrun_n, uint8x16_t, int16x8_t, s16, u8) +TEST_SHIFT (vqshrun_n, uint16x8_t, int32x4_t, s32, u16) +TEST_SHIFT (vqshrun_n, uint32x4_t, int64x2_t, s64, u32) + +TEST_SHIFT (vqrshrun_n, uint8x16_t, int16x8_t, s16, u8) +TEST_SHIFT (vqrshrun_n, uint16x8_t, int32x4_t, s32, u16) +TEST_SHIFT (vqrshrun_n, uint32x4_t, int64x2_t, s64, u32) + +#define TEST_UNARY(name, rettype, intype, fs, rs) \ + rettype test_ ## name ## _ ## fs ## _zero_high \ + (intype a) \ + { \ + return vcombine_ ## rs (name ## _ ## fs (a), \ + vdup_n_ ## rs (0)); \ + } + +TEST_UNARY (vmovn, int8x16_t, int16x8_t, s16, s8) +TEST_UNARY (vmovn, int16x8_t, int32x4_t, s32, s16) +TEST_UNARY (vmovn, int32x4_t, int64x2_t, s64, s32) +TEST_UNARY (vmovn, uint8x16_t, uint16x8_t, u16, u8) +TEST_UNARY (vmovn, uint16x8_t, uint32x4_t, u32, u16) +TEST_UNARY (vmovn, uint32x4_t, uint64x2_t, u64, u32) + +TEST_UNARY (vqmovun, uint8x16_t, int16x8_t, s16, u8) +TEST_UNARY (vqmovun, uint16x8_t, int32x4_t, s32, u16) +TEST_UNARY (vqmovun, uint32x4_t, int64x2_t, s64, u32) + +TEST_UNARY (vqmovn, int8x16_t, int16x8_t, s16, s8) +TEST_UNARY (vqmovn, int16x8_t, int32x4_t, s32, s16) +TEST_UNARY (vqmovn, int32x4_t, int64x2_t, s64, s32) +TEST_UNARY (vqmovn, uint8x16_t, uint16x8_t, u16, u8) 
+TEST_UNARY (vqmovn, uint16x8_t, uint32x4_t, u32, u16) +TEST_UNARY (vqmovn, uint32x4_t, uint64x2_t, u64, u32) + +#define TEST_ARITH(name, rettype, intype, fs, rs) \ + rettype test_ ## name ## _ ## fs ## _zero_high \ + (intype a, intype b) \ + { \ + return vcombine_ ## rs (name ## _ ## fs (a, b), \ + vdup_n_ ## rs (0)); \ + } + +TEST_ARITH (vaddhn, int8x16_t, int16x8_t, s16, s8) +TEST_ARITH (vaddhn, int16x8_t, int32x4_t, s32, s16) +TEST_ARITH (vaddhn, int32x4_t, int64x2_t, s64, s32) +TEST_ARITH (vaddhn, uint8x16_t, uint16x8_t, u16, u8) +TEST_ARITH (vaddhn, uint16x8_t, uint32x4_t, u32, u16) +TEST_ARITH (vaddhn, uint32x4_t, uint64x2_t, u64, u32) + +TEST_ARITH (vraddhn, int8x16_t, int16x8_t, s16, s8) +TEST_ARITH (vraddhn, int16x8_t, int32x4_t, s32, s16) +TEST_ARITH (vraddhn, int32x4_t, int64x2_t, s64, s32) +TEST_ARITH (vraddhn, uint8x16_t, uint16x8_t, u16, u8) +TEST_ARITH (vraddhn, uint16x8_t, uint32x4_t, u32, u16) +TEST_ARITH (vraddhn, uint32x4_t, uint64x2_t, u64, u32) + +TEST_ARITH (vsubhn, int8x16_t, int16x8_t, s16, s8) +TEST_ARITH (vsubhn, int16x8_t, int32x4_t, s32, s16) +TEST_ARITH (vsubhn, int32x4_t, int64x2_t, s64, s32) +TEST_ARITH (vsubhn, uint8x16_t, uint16x8_t, u16, u8) +TEST_ARITH (vsubhn, uint16x8_t, uint32x4_t, u32, u16) +TEST_ARITH (vsubhn, uint32x4_t, uint64x2_t, u64, u32) + +TEST_ARITH (vrsubhn, int8x16_t, int16x8_t, s16, s8) +TEST_ARITH (vrsubhn, int16x8_t, int32x4_t, s32, s16) +TEST_ARITH (vrsubhn, int32x4_t, int64x2_t, s64, s32) +TEST_ARITH (vrsubhn, uint8x16_t, uint16x8_t, u16, u8) +TEST_ARITH (vrsubhn, uint16x8_t, uint32x4_t, u32, u16) +TEST_ARITH (vrsubhn, uint32x4_t, uint64x2_t, u64, u32) + +/* { dg-final { scan-assembler-not "dup\\t" } } */ + +/* { dg-final { scan-assembler-times "\\tshrn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\trshrn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\tsqshrn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tuqshrn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tsqrshrn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tuqrshrn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tsqshrun\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tsqrshrun\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\txtn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\tsqxtun\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tuqxtn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\tsqxtn\\tv" 3} } */ +/* { dg-final { scan-assembler-times "\\taddhn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\tsubhn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\trsubhn\\tv" 6} } */ +/* { dg-final { scan-assembler-times "\\traddhn\\tv" 6} } */ diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-doublee.c b/gcc/testsuite/gcc.target/s390/zvector/vec-doublee.c new file mode 100644 index 0000000..11610f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/zvector/vec-doublee.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -march=z14 -mzarch -mzvector --save-temps" } */ +/* { dg-do run { target { s390_z14_hw } } } */ + +/* + * The vector intrinsic vec_doublee(a) converts the even-indexed + * single-precision numbers in a vector to double precision. 
+ */ +#include <assert.h> +#include <vecintrin.h> + +int +main (void) +{ + vector float in = { 1.0, 2.0, 3.0, 4.0 }; + + vector double result = vec_doublee(in); + /* { dg-final { scan-assembler-times {\n\tvldeb} 1 } } */ + + assert(result[0] == (double)in[0]); + assert(result[1] == (double)in[2]); +} diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-floate.c b/gcc/testsuite/gcc.target/s390/zvector/vec-floate.c new file mode 100644 index 0000000..0b9cbe3 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/zvector/vec-floate.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -march=z14 -mzarch -mzvector --save-temps" } */ +/* { dg-do run { target { s390_z14_hw } } } */ + +/* + * The vector intrinsic vec_floate(a) rounds a vector of double-precision + * numbers to single-precision. The results are stored in the even-numbered + * target elements. + */ +#include <assert.h> +#include <vecintrin.h> + +int +main (void) +{ + vector double in = { 1.0, 2.0 }; + + vector float result = vec_floate(in); + /* { dg-final { scan-assembler-times {\n\tvledb} 1 } } */ + + assert(result[0] == (float)in[0]); + assert(result[2] == (float)in[1]); +} diff --git a/gcc/testsuite/gfortran.dg/pr95502.f90 b/gcc/testsuite/gfortran.dg/pr95502.f90 new file mode 100644 index 0000000..d40fd9a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr95502.f90 @@ -0,0 +1,8 @@ +! { dg-do compile } +! PR fortran/95502 - ICE in gfc_check_do_variable, at fortran/parse.c:4446 + +program p + integer, pointer :: z + nullify (z%kind) ! { dg-error "in variable definition context" } + z%kind => NULL() ! { dg-error "constant expression" } +end diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 0225658..4c32f83 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -6495,7 +6495,6 @@ gimple_duplicate_sese_region (edge entry, edge exit, bool free_region_copy = false, copying_header = false; class loop *loop = entry->dest->loop_father; edge exit_copy; - vec<basic_block> doms = vNULL; edge redirected; profile_count total_count = profile_count::uninitialized (); profile_count entry_count = profile_count::uninitialized (); @@ -6549,9 +6548,9 @@ gimple_duplicate_sese_region (edge entry, edge exit, /* Record blocks outside the region that are dominated by something inside. */ + auto_vec<basic_block> doms; if (update_dominance) { - doms.create (0); doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region); } @@ -6596,7 +6595,6 @@ gimple_duplicate_sese_region (edge entry, edge exit, set_immediate_dominator (CDI_DOMINATORS, entry->dest, entry->src); doms.safe_push (get_bb_original (entry->dest)); iterate_fix_dominators (CDI_DOMINATORS, doms, false); - doms.release (); } /* Add the other PHI node arguments. */ @@ -6662,7 +6660,6 @@ gimple_duplicate_sese_tail (edge entry, edge exit, class loop *loop = exit->dest->loop_father; class loop *orig_loop = entry->dest->loop_father; basic_block switch_bb, entry_bb, nentry_bb; - vec<basic_block> doms; profile_count total_count = profile_count::uninitialized (), exit_count = profile_count::uninitialized (); edge exits[2], nexits[2], e; @@ -6705,7 +6702,8 @@ gimple_duplicate_sese_tail (edge entry, edge exit, /* Record blocks outside the region that are dominated by something inside. 
*/ - doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region); + auto_vec<basic_block> doms = get_dominated_by_region (CDI_DOMINATORS, region, + n_region); total_count = exit->src->count; exit_count = exit->count (); @@ -6785,7 +6783,6 @@ gimple_duplicate_sese_tail (edge entry, edge exit, /* Anything that is outside of the region, but was dominated by something inside needs to update dominance info. */ iterate_fix_dominators (CDI_DOMINATORS, doms, false); - doms.release (); /* Update the SSA web. */ update_ssa (TODO_update_ssa); @@ -7567,7 +7564,7 @@ basic_block move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb, basic_block exit_bb, tree orig_block) { - vec<basic_block> bbs, dom_bbs; + vec<basic_block> bbs; basic_block dom_entry = get_immediate_dominator (CDI_DOMINATORS, entry_bb); basic_block after, bb, *entry_pred, *exit_succ, abb; struct function *saved_cfun = cfun; @@ -7599,9 +7596,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb, /* The blocks that used to be dominated by something in BBS will now be dominated by the new block. */ - dom_bbs = get_dominated_by_region (CDI_DOMINATORS, - bbs.address (), - bbs.length ()); + auto_vec<basic_block> dom_bbs = get_dominated_by_region (CDI_DOMINATORS, + bbs.address (), + bbs.length ()); /* Detach ENTRY_BB and EXIT_BB from CFUN->CFG. We need to remember the predecessor edges to ENTRY_BB and the successor edges to @@ -7937,7 +7934,6 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb, set_immediate_dominator (CDI_DOMINATORS, bb, dom_entry); FOR_EACH_VEC_ELT (dom_bbs, i, abb) set_immediate_dominator (CDI_DOMINATORS, abb, bb); - dom_bbs.release (); if (exit_bb) { @@ -8687,7 +8683,6 @@ gimple_flow_call_edges_add (sbitmap blocks) void remove_edge_and_dominated_blocks (edge e) { - vec<basic_block> bbs_to_remove = vNULL; vec<basic_block> bbs_to_fix_dom = vNULL; edge f; edge_iterator ei; @@ -8738,6 +8733,7 @@ remove_edge_and_dominated_blocks (edge e) } auto_bitmap df, df_idom; + auto_vec<basic_block> bbs_to_remove; if (none_removed) bitmap_set_bit (df_idom, get_immediate_dominator (CDI_DOMINATORS, e->dest)->index); @@ -8804,7 +8800,6 @@ remove_edge_and_dominated_blocks (edge e) iterate_fix_dominators (CDI_DOMINATORS, bbs_to_fix_dom, true); - bbs_to_remove.release (); bbs_to_fix_dom.release (); } @@ -9917,22 +9912,20 @@ test_linear_chain () calculate_dominance_info (CDI_DOMINATORS); ASSERT_EQ (bb_a, get_immediate_dominator (CDI_DOMINATORS, bb_b)); ASSERT_EQ (bb_b, get_immediate_dominator (CDI_DOMINATORS, bb_c)); - vec<basic_block> dom_by_b = get_dominated_by (CDI_DOMINATORS, bb_b); + auto_vec<basic_block> dom_by_b = get_dominated_by (CDI_DOMINATORS, bb_b); ASSERT_EQ (1, dom_by_b.length ()); ASSERT_EQ (bb_c, dom_by_b[0]); free_dominance_info (CDI_DOMINATORS); - dom_by_b.release (); /* Similarly for post-dominance: each BB in our chain is post-dominated by the one after it. 
*/ calculate_dominance_info (CDI_POST_DOMINATORS); ASSERT_EQ (bb_b, get_immediate_dominator (CDI_POST_DOMINATORS, bb_a)); ASSERT_EQ (bb_c, get_immediate_dominator (CDI_POST_DOMINATORS, bb_b)); - vec<basic_block> postdom_by_b = get_dominated_by (CDI_POST_DOMINATORS, bb_b); + auto_vec<basic_block> postdom_by_b = get_dominated_by (CDI_POST_DOMINATORS, bb_b); ASSERT_EQ (1, postdom_by_b.length ()); ASSERT_EQ (bb_a, postdom_by_b[0]); free_dominance_info (CDI_POST_DOMINATORS); - postdom_by_b.release (); pop_cfun (); } @@ -9991,10 +9984,10 @@ test_diamond () ASSERT_EQ (bb_a, get_immediate_dominator (CDI_DOMINATORS, bb_b)); ASSERT_EQ (bb_a, get_immediate_dominator (CDI_DOMINATORS, bb_c)); ASSERT_EQ (bb_a, get_immediate_dominator (CDI_DOMINATORS, bb_d)); - vec<basic_block> dom_by_a = get_dominated_by (CDI_DOMINATORS, bb_a); + auto_vec<basic_block> dom_by_a = get_dominated_by (CDI_DOMINATORS, bb_a); ASSERT_EQ (3, dom_by_a.length ()); /* B, C, D, in some order. */ dom_by_a.release (); - vec<basic_block> dom_by_b = get_dominated_by (CDI_DOMINATORS, bb_b); + auto_vec<basic_block> dom_by_b = get_dominated_by (CDI_DOMINATORS, bb_b); ASSERT_EQ (0, dom_by_b.length ()); dom_by_b.release (); free_dominance_info (CDI_DOMINATORS); @@ -10004,10 +9997,10 @@ test_diamond () ASSERT_EQ (bb_d, get_immediate_dominator (CDI_POST_DOMINATORS, bb_a)); ASSERT_EQ (bb_d, get_immediate_dominator (CDI_POST_DOMINATORS, bb_b)); ASSERT_EQ (bb_d, get_immediate_dominator (CDI_POST_DOMINATORS, bb_c)); - vec<basic_block> postdom_by_d = get_dominated_by (CDI_POST_DOMINATORS, bb_d); + auto_vec<basic_block> postdom_by_d = get_dominated_by (CDI_POST_DOMINATORS, bb_d); ASSERT_EQ (3, postdom_by_d.length ()); /* A, B, C in some order. */ postdom_by_d.release (); - vec<basic_block> postdom_by_b = get_dominated_by (CDI_POST_DOMINATORS, bb_b); + auto_vec<basic_block> postdom_by_b = get_dominated_by (CDI_POST_DOMINATORS, bb_b); ASSERT_EQ (0, postdom_by_b.length ()); postdom_by_b.release (); free_dominance_info (CDI_POST_DOMINATORS); diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index deff2d5..fe1baef 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -3949,7 +3949,7 @@ oacc_entry_exit_ok (class loop *loop, reduction_info_table_type *reduction_list) { basic_block *loop_bbs = get_loop_body_in_dom_order (loop); - vec<basic_block> region_bbs + auto_vec<basic_block> region_bbs = get_all_dominated_blocks (CDI_DOMINATORS, ENTRY_BLOCK_PTR_FOR_FN (cfun)); bitmap in_loop_bbs = BITMAP_ALLOC (NULL); @@ -3972,7 +3972,6 @@ oacc_entry_exit_ok (class loop *loop, } } - region_bbs.release (); free (loop_bbs); BITMAP_FREE (in_loop_bbs); diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c index def6ae6..e2d3b63 100644 --- a/gcc/tree-ssa-dce.c +++ b/gcc/tree-ssa-dce.c @@ -1275,7 +1275,6 @@ eliminate_unnecessary_stmts (void) gimple_stmt_iterator gsi, psi; gimple *stmt; tree call; - vec<basic_block> h; auto_vec<edge> to_remove_edges; if (dump_file && (dump_flags & TDF_DETAILS)) @@ -1306,6 +1305,7 @@ eliminate_unnecessary_stmts (void) as desired. */ gcc_assert (dom_info_available_p (CDI_DOMINATORS)); + auto_vec<basic_block> h; h = get_all_dominated_blocks (CDI_DOMINATORS, single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun))); @@ -1460,7 +1460,6 @@ eliminate_unnecessary_stmts (void) something_changed |= remove_dead_phis (bb); } - h.release (); /* Since we don't track liveness of virtual PHI nodes, it is possible that we rendered some PHI nodes unreachable while they are still in use. 
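A note on why the auto_vec call-site conversions above and below are safe: returning an auto_vec by value moves the underlying heap storage rather than copying it; the vec.h hunk near the end of this patch supplies the missing auto_vec move constructor and deletes the copy operations. A hedged sketch of the two resulting idioms (hypothetical caller; process () is not a real function):

  /* Bind the result: ownership of the heap storage is moved in.  */
  auto_vec<basic_block> bbs
    = get_all_dominated_blocks (CDI_DOMINATORS, bb);

  /* Or iterate the temporary directly, as the loop-unroll.c hunk above
     now does; in a range-based for the temporary lives until the loop
     finishes.  */
  for (basic_block dom : get_dominated_by (CDI_DOMINATORS, bb))
    process (dom);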
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c index 3f9e9d0..b1971f8 100644 --- a/gcc/tree-ssa-loop-ivcanon.c +++ b/gcc/tree-ssa-loop-ivcanon.c @@ -218,7 +218,7 @@ tree_estimate_loop_size (class loop *loop, edge exit, edge edge_to_cancel, gimple_stmt_iterator gsi; unsigned int i; bool after_exit; - vec<basic_block> path = get_loop_hot_path (loop); + auto_vec<basic_block> path = get_loop_hot_path (loop); size->overall = 0; size->eliminated_by_peeling = 0; @@ -342,7 +342,6 @@ tree_estimate_loop_size (class loop *loop, edge exit, edge edge_to_cancel, - size->last_iteration_eliminated_by_peeling) > upper_bound) { free (body); - path.release (); return true; } } @@ -379,7 +378,7 @@ tree_estimate_loop_size (class loop *loop, edge exit, edge edge_to_cancel, size->num_branches_on_hot_path++; } } - path.release (); + if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "size: %i-%i, last_iteration: %i-%i\n", size->overall, size->eliminated_by_peeling, size->last_iteration, diff --git a/gcc/tree-ssa-phiprop.c b/gcc/tree-ssa-phiprop.c index 64d6eda..78b0461 100644 --- a/gcc/tree-ssa-phiprop.c +++ b/gcc/tree-ssa-phiprop.c @@ -484,7 +484,6 @@ public: unsigned int pass_phiprop::execute (function *fun) { - vec<basic_block> bbs; struct phiprop_d *phivn; bool did_something = false; basic_block bb; @@ -499,8 +498,9 @@ pass_phiprop::execute (function *fun) phivn = XCNEWVEC (struct phiprop_d, n); /* Walk the dominator tree in preorder. */ - bbs = get_all_dominated_blocks (CDI_DOMINATORS, - single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun))); + auto_vec<basic_block> bbs + = get_all_dominated_blocks (CDI_DOMINATORS, + single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun))); FOR_EACH_VEC_ELT (bbs, i, bb) { /* Since we're going to move dereferences across predecessor @@ -514,7 +514,6 @@ pass_phiprop::execute (function *fun) if (did_something) gsi_commit_edge_inserts (); - bbs.release (); free (phivn); free_dominance_info (CDI_POST_DOMINATORS); diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 2694d1a..bb086c6 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -843,9 +843,9 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, slp_instance instance) DUMP_VECT_SCOPE ("vect_slp_analyze_instance_dependence"); /* The stores of this instance are at the root of the SLP tree. */ - slp_tree store = SLP_INSTANCE_TREE (instance); - if (! STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (store))) - store = NULL; + slp_tree store = NULL; + if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_store) + store = SLP_INSTANCE_TREE (instance); /* Verify we can sink stores to the vectorized stmt insert location. */ stmt_vec_info last_store_info = NULL; @@ -2464,8 +2464,7 @@ vect_slp_analyze_instance_alignment (vec_info *vinfo, if (! vect_slp_analyze_node_alignment (vinfo, node)) return false; - node = SLP_INSTANCE_TREE (instance); - if (STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (node)) + if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_store && ! vect_slp_analyze_node_alignment (vinfo, SLP_INSTANCE_TREE (instance))) return false; diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index ee79808..51a46a6 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -3209,7 +3209,7 @@ fold_left_reduction_fn (tree_code code, internal_fn *reduc_fn) Return FALSE if CODE currently cannot be vectorized as reduction. 
*/ -static bool +bool reduction_fn_for_scalar_code (enum tree_code code, internal_fn *reduc_fn) { switch (code) diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index 8ec589b..0c1f85b 100644 --- a/gcc/tree-vect-slp.c +++ b/gcc/tree-vect-slp.c @@ -1442,6 +1442,84 @@ dt_sort_cmp (const void *op1_, const void *op2_, void *) return (int)op1->code - (int)op2->code; } +/* Linearize the associatable expression chain at START with the + associatable operation CODE (where PLUS_EXPR also allows MINUS_EXPR), + filling CHAIN with the result and using WORKLIST as intermediate storage. + CODE_STMT and ALT_CODE_STMT are filled with the first stmt using CODE + or MINUS_EXPR. *CHAIN_STMTS if not NULL is filled with all computation + stmts, starting with START. */ + +static void +vect_slp_linearize_chain (vec_info *vinfo, + vec<std::pair<tree_code, gimple *> > &worklist, + vec<chain_op_t> &chain, + enum tree_code code, gimple *start, + gimple *&code_stmt, gimple *&alt_code_stmt, + vec<gimple *> *chain_stmts) +{ + /* For each lane linearize the addition/subtraction (or other + uniform associatable operation) expression tree. */ + worklist.safe_push (std::make_pair (code, start)); + while (!worklist.is_empty ()) + { + auto entry = worklist.pop (); + gassign *stmt = as_a <gassign *> (entry.second); + enum tree_code in_code = entry.first; + enum tree_code this_code = gimple_assign_rhs_code (stmt); + /* Pick some stmts suitable for SLP_TREE_REPRESENTATIVE. */ + if (!code_stmt + && gimple_assign_rhs_code (stmt) == code) + code_stmt = stmt; + else if (!alt_code_stmt + && gimple_assign_rhs_code (stmt) == MINUS_EXPR) + alt_code_stmt = stmt; + if (chain_stmts) + chain_stmts->safe_push (stmt); + for (unsigned opnum = 1; opnum <= 2; ++opnum) + { + tree op = gimple_op (stmt, opnum); + vect_def_type dt; + stmt_vec_info def_stmt_info; + bool res = vect_is_simple_use (op, vinfo, &dt, &def_stmt_info); + gcc_assert (res); + if (dt == vect_internal_def) + { + stmt_vec_info orig_def_stmt_info = def_stmt_info; + def_stmt_info = vect_stmt_to_vectorize (def_stmt_info); + if (def_stmt_info != orig_def_stmt_info) + op = gimple_get_lhs (def_stmt_info->stmt); + } + gimple *use_stmt; + use_operand_p use_p; + if (dt == vect_internal_def + && single_imm_use (op, &use_p, &use_stmt) + && is_gimple_assign (def_stmt_info->stmt) + && (gimple_assign_rhs_code (def_stmt_info->stmt) == code + || (code == PLUS_EXPR + && (gimple_assign_rhs_code (def_stmt_info->stmt) + == MINUS_EXPR)))) + { + tree_code op_def_code = this_code; + if (op_def_code == MINUS_EXPR && opnum == 1) + op_def_code = PLUS_EXPR; + if (in_code == MINUS_EXPR) + op_def_code = op_def_code == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR; + worklist.safe_push (std::make_pair (op_def_code, + def_stmt_info->stmt)); + } + else + { + tree_code op_def_code = this_code; + if (op_def_code == MINUS_EXPR && opnum == 1) + op_def_code = PLUS_EXPR; + if (in_code == MINUS_EXPR) + op_def_code = op_def_code == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR; + chain.safe_push (chain_op_t (op_def_code, dt, op)); + } + } + } +} + typedef hash_map <vec <stmt_vec_info>, slp_tree, simple_hashmap_traits <bst_traits, slp_tree> > scalar_stmts_to_slp_tree_map_t; @@ -1784,63 +1862,14 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node, { /* For each lane linearize the addition/subtraction (or other uniform associatable operation) expression tree. 
*/ - worklist.safe_push (std::make_pair (code, stmts[lane]->stmt)); - while (!worklist.is_empty ()) - { - auto entry = worklist.pop (); - gassign *stmt = as_a <gassign *> (entry.second); - enum tree_code in_code = entry.first; - enum tree_code this_code = gimple_assign_rhs_code (stmt); - /* Pick some stmts suitable for SLP_TREE_REPRESENTATIVE. */ - if (!op_stmt_info - && gimple_assign_rhs_code (stmt) == code) - op_stmt_info = vinfo->lookup_stmt (stmt); - else if (!other_op_stmt_info - && gimple_assign_rhs_code (stmt) == MINUS_EXPR) - other_op_stmt_info = vinfo->lookup_stmt (stmt); - for (unsigned opnum = 1; opnum <= 2; ++opnum) - { - tree op = gimple_op (stmt, opnum); - vect_def_type dt; - stmt_vec_info def_stmt_info; - bool res = vect_is_simple_use (op, vinfo, &dt, &def_stmt_info); - gcc_assert (res); - if (dt == vect_internal_def) - { - def_stmt_info = vect_stmt_to_vectorize (def_stmt_info); - op = gimple_get_lhs (def_stmt_info->stmt); - } - gimple *use_stmt; - use_operand_p use_p; - if (dt == vect_internal_def - && single_imm_use (op, &use_p, &use_stmt) - && is_gimple_assign (def_stmt_info->stmt) - && (gimple_assign_rhs_code (def_stmt_info->stmt) == code - || (code == PLUS_EXPR - && (gimple_assign_rhs_code (def_stmt_info->stmt) - == MINUS_EXPR)))) - { - tree_code op_def_code = this_code; - if (op_def_code == MINUS_EXPR && opnum == 1) - op_def_code = PLUS_EXPR; - if (in_code == MINUS_EXPR) - op_def_code - = op_def_code == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR; - worklist.safe_push (std::make_pair (op_def_code, - def_stmt_info->stmt)); - } - else - { - tree_code op_def_code = this_code; - if (op_def_code == MINUS_EXPR && opnum == 1) - op_def_code = PLUS_EXPR; - if (in_code == MINUS_EXPR) - op_def_code - = op_def_code == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR; - chain.safe_push (chain_op_t (op_def_code, dt, op)); - } - } - } + gimple *op_stmt = NULL, *other_op_stmt = NULL; + vect_slp_linearize_chain (vinfo, worklist, chain, code, + stmts[lane]->stmt, op_stmt, other_op_stmt, + NULL); + if (!op_stmt_info && op_stmt) + op_stmt_info = vinfo->lookup_stmt (op_stmt); + if (!other_op_stmt_info && other_op_stmt) + other_op_stmt_info = vinfo->lookup_stmt (other_op_stmt); if (chain.length () == 2) { /* In a chain of just two elements resort to the regular @@ -3915,6 +3944,48 @@ vect_optimize_slp (vec_info *vinfo) } } } + + /* And any permutations of BB reductions. */ + if (is_a <bb_vec_info> (vinfo)) + { + for (slp_instance instance : vinfo->slp_instances) + { + if (SLP_INSTANCE_KIND (instance) != slp_inst_kind_bb_reduc) + continue; + slp_tree old = SLP_INSTANCE_TREE (instance); + if (SLP_TREE_CODE (old) == VEC_PERM_EXPR + && SLP_TREE_CHILDREN (old).length () == 1) + { + slp_tree child = SLP_TREE_CHILDREN (old)[0]; + if (SLP_TREE_DEF_TYPE (child) == vect_external_def) + { + /* Preserve the special VEC_PERM we use to shield existing + vector defs from the rest. But make it a no-op. */ + unsigned i = 0; + for (std::pair<unsigned, unsigned> &p + : SLP_TREE_LANE_PERMUTATION (old)) + p.second = i++; + } + else + { + SLP_INSTANCE_TREE (instance) = child; + SLP_TREE_REF_COUNT (child)++; + vect_free_slp_tree (old); + } + } + else if (SLP_TREE_LOAD_PERMUTATION (old).exists () + && SLP_TREE_REF_COUNT (old) == 1) + { + /* ??? For loads the situation is more complex since + we can't modify the permute in place in case the + node is used multiple times. In fact for loads this + should be somehow handled in the propagation engine. 
*/ + auto fn = [] (const void *a, const void *b) + { return *(const int *)a - *(const int *)b; }; + SLP_TREE_LOAD_PERMUTATION (old).qsort (fn); + } + } + } } /* Gather loads reachable from the individual SLP graph entries. */ @@ -4492,7 +4563,6 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node, return res; } - /* Mark lanes of NODE that are live outside of the basic-block vectorized region and that can be vectorized using vectorizable_live_operation with STMT_VINFO_LIVE_P. Not handled live operations will cause the @@ -4596,6 +4666,55 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node, cost_vec, svisited, visited); } +/* Determine whether we can vectorize the reduction epilogue for INSTANCE. */ + +static bool +vectorizable_bb_reduc_epilogue (slp_instance instance, + stmt_vector_for_cost *cost_vec) +{ + enum tree_code reduc_code + = gimple_assign_rhs_code (instance->root_stmts[0]->stmt); + if (reduc_code == MINUS_EXPR) + reduc_code = PLUS_EXPR; + internal_fn reduc_fn; + tree vectype = SLP_TREE_VECTYPE (SLP_INSTANCE_TREE (instance)); + if (!reduction_fn_for_scalar_code (reduc_code, &reduc_fn) + || reduc_fn == IFN_LAST + || !direct_internal_fn_supported_p (reduc_fn, vectype, OPTIMIZE_FOR_BOTH)) + return false; + + /* There's no way to cost a horizontal vector reduction via REDUC_FN so + cost log2 vector operations plus shuffles. */ + unsigned steps = floor_log2 (vect_nunits_for_cost (vectype)); + record_stmt_cost (cost_vec, steps, vector_stmt, instance->root_stmts[0], + vectype, 0, vect_body); + record_stmt_cost (cost_vec, steps, vec_perm, instance->root_stmts[0], + vectype, 0, vect_body); + return true; +} + +/* Prune from ROOTS all stmts that are computed as part of lanes of NODE + and recurse to children. */ + +static void +vect_slp_prune_covered_roots (slp_tree node, hash_set<stmt_vec_info> &roots, + hash_set<slp_tree> &visited) +{ + if (SLP_TREE_DEF_TYPE (node) != vect_internal_def + || visited.add (node)) + return; + + stmt_vec_info stmt; + unsigned i; + FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt) + roots.remove (vect_orig_stmt (stmt)); + + slp_tree child; + FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child) + if (child) + vect_slp_prune_covered_roots (child, roots, visited); +} + /* Analyze statements in SLP instances of VINFO. Return true if the operations are supported. */ @@ -4619,15 +4738,20 @@ vect_slp_analyze_operations (vec_info *vinfo) SLP_INSTANCE_TREE (instance), instance, visited, visited_vec, &cost_vec) - /* Instances with a root stmt require vectorized defs for the - SLP tree root. */ - /* ??? Do inst->kind check instead. */ - || (!SLP_INSTANCE_ROOT_STMTS (instance).is_empty () + /* CTOR instances require vectorized defs for the SLP tree root. */ + || (SLP_INSTANCE_KIND (instance) == slp_inst_kind_ctor && (SLP_TREE_DEF_TYPE (SLP_INSTANCE_TREE (instance)) - != vect_internal_def))) + != vect_internal_def)) + /* Check we can vectorize the reduction. 
*/ + || (SLP_INSTANCE_KIND (instance) == slp_inst_kind_bb_reduc + && !vectorizable_bb_reduc_epilogue (instance, &cost_vec))) { slp_tree node = SLP_INSTANCE_TREE (instance); - stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0]; + stmt_vec_info stmt_info; + if (!SLP_INSTANCE_ROOT_STMTS (instance).is_empty ()) + stmt_info = SLP_INSTANCE_ROOT_STMTS (instance)[0]; + else + stmt_info = SLP_TREE_SCALAR_STMTS (node)[0]; if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "removing SLP instance operations starting from: %G", @@ -4654,6 +4778,34 @@ vect_slp_analyze_operations (vec_info *vinfo) } } + /* Now look for SLP instances with a root that are covered by other + instances and remove them. */ + hash_set<stmt_vec_info> roots; + for (i = 0; vinfo->slp_instances.iterate (i, &instance); ++i) + if (!SLP_INSTANCE_ROOT_STMTS (instance).is_empty ()) + roots.add (SLP_INSTANCE_ROOT_STMTS (instance)[0]); + if (!roots.is_empty ()) + { + visited.empty (); + for (i = 0; vinfo->slp_instances.iterate (i, &instance); ++i) + vect_slp_prune_covered_roots (SLP_INSTANCE_TREE (instance), roots, + visited); + for (i = 0; vinfo->slp_instances.iterate (i, &instance); ) + if (!SLP_INSTANCE_ROOT_STMTS (instance).is_empty () + && !roots.contains (SLP_INSTANCE_ROOT_STMTS (instance)[0])) + { + stmt_vec_info root = SLP_INSTANCE_ROOT_STMTS (instance)[0]; + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "removing SLP instance operations starting " + "from: %G", root->stmt); + vect_free_slp_instance (instance); + vinfo->slp_instances.ordered_remove (i); + } + else + ++i; + } + /* Compute vectorizable live stmts. */ if (bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo)) { @@ -5115,7 +5267,10 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo) continue; tree rhs = gimple_assign_rhs1 (assign); - if (gimple_assign_rhs_code (assign) == CONSTRUCTOR) + enum tree_code code = gimple_assign_rhs_code (assign); + use_operand_p use_p; + gimple *use_stmt; + if (code == CONSTRUCTOR) { if (!VECTOR_TYPE_P (TREE_TYPE (rhs)) || maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)), @@ -5136,7 +5291,7 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo) stmt_vec_info stmt_info = bb_vinfo->lookup_stmt (assign); BB_VINFO_GROUPED_STORES (bb_vinfo).safe_push (stmt_info); } - else if (gimple_assign_rhs_code (assign) == BIT_INSERT_EXPR + else if (code == BIT_INSERT_EXPR && VECTOR_TYPE_P (TREE_TYPE (rhs)) && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).is_constant () && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).to_constant () > 1 @@ -5230,6 +5385,69 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo) else roots.release (); } + else if (!VECTOR_TYPE_P (TREE_TYPE (rhs)) + && (associative_tree_code (code) || code == MINUS_EXPR) + /* ??? The flag_associative_math and TYPE_OVERFLOW_WRAPS + checks pessimize a two-element reduction. PR54400. + ??? In-order reduction could be handled if we only + traverse one operand chain in vect_slp_linearize_chain. */ + && ((FLOAT_TYPE_P (TREE_TYPE (rhs)) && flag_associative_math) + || (INTEGRAL_TYPE_P (TREE_TYPE (rhs)) + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (rhs)))) + /* Ops with constants at the tail can be stripped here. */ + && TREE_CODE (rhs) == SSA_NAME + && TREE_CODE (gimple_assign_rhs2 (assign)) == SSA_NAME + /* Should be the chain end. 
*/ + && (!single_imm_use (gimple_assign_lhs (assign), + &use_p, &use_stmt) + || !is_gimple_assign (use_stmt) + || (gimple_assign_rhs_code (use_stmt) != code + && ((code != PLUS_EXPR && code != MINUS_EXPR) + || (gimple_assign_rhs_code (use_stmt) + != (code == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR)))))) + { + /* We start the match at the end of a possible association + chain. */ + auto_vec<chain_op_t> chain; + auto_vec<std::pair<tree_code, gimple *> > worklist; + auto_vec<gimple *> chain_stmts; + gimple *code_stmt = NULL, *alt_code_stmt = NULL; + if (code == MINUS_EXPR) + code = PLUS_EXPR; + internal_fn reduc_fn; + if (!reduction_fn_for_scalar_code (code, &reduc_fn) + || reduc_fn == IFN_LAST) + continue; + vect_slp_linearize_chain (bb_vinfo, worklist, chain, code, assign, + /* ??? */ + code_stmt, alt_code_stmt, &chain_stmts); + if (chain.length () > 1) + { + /* Sort the chain according to def_type and operation. */ + chain.sort (dt_sort_cmp, bb_vinfo); + /* ??? Now we'd want to strip externals and constants + but record those to be handled in the epilogue. */ + /* ??? For now do not allow mixing ops or externs/constants. */ + bool invalid = false; + for (unsigned i = 0; i < chain.length (); ++i) + if (chain[i].dt != vect_internal_def + || chain[i].code != code) + invalid = true; + if (!invalid) + { + vec<stmt_vec_info> stmts; + stmts.create (chain.length ()); + for (unsigned i = 0; i < chain.length (); ++i) + stmts.quick_push (bb_vinfo->lookup_def (chain[i].op)); + vec<stmt_vec_info> roots; + roots.create (chain_stmts.length ()); + for (unsigned i = 0; i < chain_stmts.length (); ++i) + roots.quick_push (bb_vinfo->lookup_stmt (chain_stmts[i])); + bb_vinfo->roots.safe_push (slp_root (slp_inst_kind_bb_reduc, + stmts, roots)); + } + } + } } } @@ -6861,6 +7079,39 @@ vectorize_slp_instance_root_stmt (slp_tree node, slp_instance instance) rstmt = gimple_build_assign (lhs, r_constructor); } } + else if (instance->kind == slp_inst_kind_bb_reduc) + { + /* Largely inspired by reduction chain epilogue handling in + vect_create_epilog_for_reduction. */ + vec<tree> vec_defs = vNULL; + vect_get_slp_defs (node, &vec_defs); + enum tree_code reduc_code + = gimple_assign_rhs_code (instance->root_stmts[0]->stmt); + /* ??? We actually have to reflect signs somewhere. */ + if (reduc_code == MINUS_EXPR) + reduc_code = PLUS_EXPR; + gimple_seq epilogue = NULL; + /* We may end up with more than one vector result, reduce them + to one vector. */ + tree vec_def = vec_defs[0]; + for (unsigned i = 1; i < vec_defs.length (); ++i) + vec_def = gimple_build (&epilogue, reduc_code, TREE_TYPE (vec_def), + vec_def, vec_defs[i]); + vec_defs.release (); + /* ??? Support other schemes than direct internal fn. 
*/ + internal_fn reduc_fn; + if (!reduction_fn_for_scalar_code (reduc_code, &reduc_fn) + || reduc_fn == IFN_LAST) + gcc_unreachable (); + tree scalar_def = gimple_build (&epilogue, as_combined_fn (reduc_fn), + TREE_TYPE (TREE_TYPE (vec_def)), vec_def); + + gimple_stmt_iterator rgsi = gsi_for_stmt (instance->root_stmts[0]->stmt); + gsi_insert_seq_before (&rgsi, epilogue, GSI_SAME_STMT); + gimple_assign_set_rhs_from_tree (&rgsi, scalar_def); + update_stmt (gsi_stmt (rgsi)); + return; + } else gcc_unreachable (); diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 1fb46c6..04c20f8 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -190,6 +190,7 @@ enum slp_instance_kind { slp_inst_kind_store, slp_inst_kind_reduc_group, slp_inst_kind_reduc_chain, + slp_inst_kind_bb_reduc, slp_inst_kind_ctor }; @@ -1971,6 +1972,7 @@ extern tree vect_get_loop_len (loop_vec_info, vec_loop_lens *, unsigned int, unsigned int); extern gimple_seq vect_gen_len (tree, tree, tree, tree); extern stmt_vec_info info_for_reduction (vec_info *, stmt_vec_info); +extern bool reduction_fn_for_scalar_code (enum tree_code, internal_fn *); /* Drive for loop transformation stage. */ extern class loop *vect_transform_loop (loop_vec_info, gimple *); diff --git a/gcc/vec.h b/gcc/vec.h @@ -1570,14 +1570,43 @@ public: this->m_vec = r.m_vec; r.m_vec = NULL; } + + auto_vec (auto_vec<T> &&r) + { + gcc_assert (!r.using_auto_storage ()); + this->m_vec = r.m_vec; + r.m_vec = NULL; + } + auto_vec& operator= (vec<T, va_heap>&& r) { + if (this == &r) + return *this; + + gcc_assert (!r.using_auto_storage ()); + this->release (); + this->m_vec = r.m_vec; + r.m_vec = NULL; + return *this; + } + + auto_vec& operator= (auto_vec<T> &&r) + { + if (this == &r) + return *this; + gcc_assert (!r.using_auto_storage ()); this->release (); this->m_vec = r.m_vec; r.m_vec = NULL; return *this; } + + // You probably don't want to copy a vector, so these are deleted to prevent + // unintentional use. If you really need a copy of the vector's contents you + // can use copy (). + auto_vec(const auto_vec &) = delete; + auto_vec &operator= (const auto_vec &) = delete; }; @@ -2147,7 +2176,7 @@ template<typename T> inline bool vec<T, va_heap, vl_ptr>::using_auto_storage () const { - return m_vec->m_vecpfx.m_using_auto_storage; + return m_vec ? m_vec->m_vecpfx.m_using_auto_storage : false; } /* Release VEC and call release of all element vectors. */ diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog index ea3bec8..7b9896f 100644 --- a/libcpp/ChangeLog +++ b/libcpp/ChangeLog @@ -1,3 +1,10 @@ +2021-06-16 Jason Merrill <jason@redhat.com> + + PR c++/100796 + PR preprocessor/96391 + * line-map.c (linemap_compare_locations): Only use comparison with + LINE_MAP_MAX_LOCATION_WITH_COLS to avoid abort. + 2021-05-20 Christophe Lyon <christophe.lyon@linaro.org> Torbjörn Svensson <torbjorn.svensson@st.com> diff --git a/libcpp/line-map.c b/libcpp/line-map.c index a03d676..1a6902a 100644 --- a/libcpp/line-map.c +++ b/libcpp/line-map.c @@ -1421,23 +1421,25 @@ linemap_compare_locations (line_maps *set, if (l0 == l1 && pre_virtual_p - && post_virtual_p - && l0 <= LINE_MAP_MAX_LOCATION_WITH_COLS) + && post_virtual_p) { /* So pre and post represent two tokens that are present in a same macro expansion. Let's see if the token for pre was before the token for post in that expansion. */ - unsigned i0, i1; const struct line_map *map = first_map_in_common (set, pre, post, &l0, &l1); if (map == NULL) - /* This should not be possible.
diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog index ea3bec8..7b9896f 100644 --- a/libcpp/ChangeLog +++ b/libcpp/ChangeLog @@ -1,3 +1,10 @@ +2021-06-16 Jason Merrill <jason@redhat.com> + + PR c++/100796 + PR preprocessor/96391 + * line-map.c (linemap_compare_locations): Only use comparison with + LINE_MAP_MAX_LOCATION_WITH_COLS to avoid abort. + 2021-05-20 Christophe Lyon <christophe.lyon@linaro.org> Torbjörn Svensson <torbjorn.svensson@st.com> diff --git a/libcpp/line-map.c b/libcpp/line-map.c index a03d676..1a6902a 100644 --- a/libcpp/line-map.c +++ b/libcpp/line-map.c @@ -1421,23 +1421,25 @@ linemap_compare_locations (line_maps *set, if (l0 == l1 && pre_virtual_p - && post_virtual_p - && l0 <= LINE_MAP_MAX_LOCATION_WITH_COLS) + && post_virtual_p) { /* So pre and post represent two tokens that are present in a same macro expansion. Let's see if the token for pre was before the token for post in that expansion. */ - unsigned i0, i1; const struct line_map *map = first_map_in_common (set, pre, post, &l0, &l1); if (map == NULL) - /* This should not be possible. */ - abort (); - i0 = l0 - MAP_START_LOCATION (map); - i1 = l1 - MAP_START_LOCATION (map); - return i1 - i0; + /* This should not be possible while we have column information, but if + we don't, the tokens could be from separate macro expansions on the + same line. */ + gcc_assert (l0 > LINE_MAP_MAX_LOCATION_WITH_COLS); + else + { + unsigned i0 = l0 - MAP_START_LOCATION (map); + unsigned i1 = l1 - MAP_START_LOCATION (map); + return i1 - i0; + } } if (IS_ADHOC_LOC (l0)) diff --git a/libffi/ChangeLog b/libffi/ChangeLog index 58ce572..8ecc9de 100644 --- a/libffi/ChangeLog +++ b/libffi/ChangeLog @@ -1,3 +1,11 @@ +2021-06-16 Jakub Jelinek <jakub@redhat.com> + + * src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words + to number of words needed for type->size + byte_offset bytes rather + than just type->size bytes. Compute pos before the loop and check + total size of the structure. + * testsuite/libffi.call/nested_struct12.c: New test. + 2021-01-05 Samuel Thibault <samuel.thibault@ens-lyon.org> * configure: Re-generate. diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index c802dac..eaebcea 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,36 @@ +2021-06-16 Jonathan Wakely <jwakely@redhat.com> + + * include/bits/iterator_concepts.h (__decay_copy): Name type. + +2021-06-16 Jonathan Wakely <jwakely@redhat.com> + + * include/bits/ranges_base.h (ranges::begin, ranges::end) + (ranges::cbegin, ranges::cend, ranges::rbegin, ranges::rend) + (ranges::crbegin, ranges::crend, ranges::size, ranges::ssize) + (ranges::empty, ranges::data, ranges::cdata): Remove final + keywords and deleted operator& overloads. + * testsuite/24_iterators/customization_points/iter_move.cc: Use + new is_customization_point_object function. + * testsuite/24_iterators/customization_points/iter_swap.cc: + Likewise. + * testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc: + Likewise. + * testsuite/std/ranges/access/begin.cc: Likewise. + * testsuite/std/ranges/access/cbegin.cc: Likewise. + * testsuite/std/ranges/access/cdata.cc: Likewise. + * testsuite/std/ranges/access/cend.cc: Likewise. + * testsuite/std/ranges/access/crbegin.cc: Likewise. + * testsuite/std/ranges/access/crend.cc: Likewise. + * testsuite/std/ranges/access/data.cc: Likewise. + * testsuite/std/ranges/access/empty.cc: Likewise. + * testsuite/std/ranges/access/end.cc: Likewise. + * testsuite/std/ranges/access/rbegin.cc: Likewise. + * testsuite/std/ranges/access/rend.cc: Likewise. + * testsuite/std/ranges/access/size.cc: Likewise. + * testsuite/std/ranges/access/ssize.cc: Likewise. + * testsuite/util/testsuite_iterators.h + (is_customization_point_object): New function.
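The net effect of the ranges_base.h entry just above is that the range access customization point objects become ordinary semiregular function objects: copyable, default-constructible, and with their address takeable, which is what the new testsuite helper asserts. An illustrative sketch of what a C++20 translation unit can now do with them (not part of the patch):

#include <ranges>

int
main ()
{
  auto b = std::ranges::begin;		// copyable ...
  decltype (b) b2;			// ... and default-constructible,
  b2 = b;				// i.e. semiregular;
  const auto *p = &std::ranges::begin;	// operator& is no longer deleted.
  (void) b2;
  (void) p;
}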
+ 2021-06-15 Jonathan Wakely <jwakely@redhat.com> * include/bits/ranges_base.h (ranges::begin, ranges::end) diff --git a/libstdc++-v3/include/bits/iterator_concepts.h b/libstdc++-v3/include/bits/iterator_concepts.h index d18ae32..11748e5 100644 --- a/libstdc++-v3/include/bits/iterator_concepts.h +++ b/libstdc++-v3/include/bits/iterator_concepts.h @@ -930,7 +930,8 @@ namespace ranges { using std::__detail::__class_or_enum; - struct { + struct _Decay_copy final + { template<typename _Tp> constexpr decay_t<_Tp> operator()(_Tp&& __t) const diff --git a/libstdc++-v3/include/bits/ranges_base.h b/libstdc++-v3/include/bits/ranges_base.h index e392c37..25af4b7 100644 --- a/libstdc++-v3/include/bits/ranges_base.h +++ b/libstdc++-v3/include/bits/ranges_base.h @@ -91,7 +91,7 @@ namespace ranges using std::ranges::__detail::__maybe_borrowed_range; using std::__detail::__range_iter_t; - struct _Begin final + struct _Begin { private: template<typename _Tp> @@ -106,8 +106,6 @@ namespace ranges return noexcept(__decay_copy(begin(std::declval<_Tp&>()))); } - void operator&() const = delete; - public: template<__maybe_borrowed_range _Tp> requires is_array_v<remove_reference_t<_Tp>> || __member_begin<_Tp> @@ -144,7 +142,7 @@ namespace ranges { __decay_copy(end(__t)) } -> sentinel_for<__range_iter_t<_Tp>>; }; - struct _End final + struct _End { private: template<typename _Tp> @@ -159,8 +157,6 @@ namespace ranges return noexcept(__decay_copy(end(std::declval<_Tp&>()))); } - void operator&() const = delete; - public: template<__maybe_borrowed_range _Tp> requires is_bounded_array_v<remove_reference_t<_Tp>> @@ -193,7 +189,7 @@ namespace ranges return static_cast<const _Tp&&>(__t); } - struct _CBegin final + struct _CBegin { template<typename _Tp> constexpr auto @@ -203,8 +199,6 @@ namespace ranges { return _Begin{}(__cust_access::__as_const<_Tp>(__e)); } - - void operator&() const = delete; }; struct _CEnd final @@ -217,8 +211,6 @@ namespace ranges { return _End{}(__cust_access::__as_const<_Tp>(__e)); } - - void operator&() const = delete; }; template<typename _Tp> @@ -244,7 +236,7 @@ namespace ranges { _End{}(__t) } -> same_as<decltype(_Begin{}(__t))>; }; - struct _RBegin final + struct _RBegin { private: template<typename _Tp> @@ -268,8 +260,6 @@ namespace ranges } } - void operator&() const = delete; - public: template<__maybe_borrowed_range _Tp> requires __member_rbegin<_Tp> || __adl_rbegin<_Tp> || __reversable<_Tp> @@ -304,7 +294,7 @@ namespace ranges -> sentinel_for<decltype(_RBegin{}(std::forward<_Tp>(__t)))>; }; - struct _REnd final + struct _REnd { private: template<typename _Tp> @@ -328,8 +318,6 @@ namespace ranges } } - void operator&() const = delete; - public: template<__maybe_borrowed_range _Tp> requires __member_rend<_Tp> || __adl_rend<_Tp> || __reversable<_Tp> @@ -346,7 +334,7 @@ namespace ranges } }; - struct _CRBegin final + struct _CRBegin { template<typename _Tp> constexpr auto @@ -356,11 +344,9 @@ namespace ranges { return _RBegin{}(__cust_access::__as_const<_Tp>(__e)); } - - void operator&() const = delete; }; - struct _CREnd final + struct _CREnd { template<typename _Tp> constexpr auto @@ -370,8 +356,6 @@ namespace ranges { return _REnd{}(__cust_access::__as_const<_Tp>(__e)); } - - void operator&() const = delete; }; template<typename _Tp> @@ -402,7 +386,7 @@ namespace ranges __detail::__to_unsigned_like(_End{}(__t) - _Begin{}(__t)); }; - struct _Size final + struct _Size { private: template<typename _Tp> @@ -420,8 +404,6 @@ namespace ranges - _Begin{}(std::declval<_Tp&>())); } - void 
operator&() const = delete; - public: template<typename _Tp> requires is_bounded_array_v<remove_reference_t<_Tp>> @@ -440,7 +422,7 @@ namespace ranges } }; - struct _SSize final + struct _SSize { // _GLIBCXX_RESOLVE_LIB_DEFECTS // 3403. Domain of ranges::ssize(E) doesn't match ranges::size(E) @@ -469,8 +451,6 @@ namespace ranges else // Must be one of __max_diff_type or __max_size_type. return __detail::__max_diff_type(__size); } - - void operator&() const = delete; }; template<typename _Tp> @@ -487,7 +467,7 @@ namespace ranges bool(_Begin{}(__t) == _End{}(__t)); }; - struct _Empty final + struct _Empty { private: template<typename _Tp> @@ -503,8 +483,6 @@ namespace ranges == _End{}(std::declval<_Tp&>()))); } - void operator&() const = delete; - public: template<typename _Tp> requires __member_empty<_Tp> || __size0_empty<_Tp> @@ -534,7 +512,7 @@ namespace ranges template<typename _Tp> concept __begin_data = contiguous_iterator<__range_iter_t<_Tp>>; - struct _Data final + struct _Data { private: template<typename _Tp> @@ -547,8 +525,6 @@ namespace ranges return noexcept(_Begin{}(std::declval<_Tp&>())); } - void operator&() const = delete; - public: template<__maybe_borrowed_range _Tp> requires __member_data<_Tp> || __begin_data<_Tp> @@ -562,7 +538,7 @@ namespace ranges } }; - struct _CData final + struct _CData { template<typename _Tp> constexpr auto @@ -572,8 +548,6 @@ namespace ranges { return _Data{}(__cust_access::__as_const<_Tp>(__e)); } - - void operator&() const = delete; }; } // namespace __cust_access diff --git a/libstdc++-v3/testsuite/24_iterators/customization_points/iter_move.cc b/libstdc++-v3/testsuite/24_iterators/customization_points/iter_move.cc index 22030ec..a434485 100644 --- a/libstdc++-v3/testsuite/24_iterators/customization_points/iter_move.cc +++ b/libstdc++-v3/testsuite/24_iterators/customization_points/iter_move.cc @@ -20,6 +20,9 @@ #include <iterator> #include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::iter_move)); struct X { diff --git a/libstdc++-v3/testsuite/24_iterators/customization_points/iter_swap.cc b/libstdc++-v3/testsuite/24_iterators/customization_points/iter_swap.cc index 3580272..f170e177 100644 --- a/libstdc++-v3/testsuite/24_iterators/customization_points/iter_swap.cc +++ b/libstdc++-v3/testsuite/24_iterators/customization_points/iter_swap.cc @@ -20,6 +20,9 @@ #include <iterator> #include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::iter_swap)); struct X { diff --git a/libstdc++-v3/testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc b/libstdc++-v3/testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc index 4479c26..ea469e9 100644 --- a/libstdc++-v3/testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc +++ b/libstdc++-v3/testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc @@ -19,6 +19,10 @@ // { dg-do compile { target c++2a } } #include <concepts> +#include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::swap)); namespace nu { diff --git a/libstdc++-v3/testsuite/std/ranges/access/begin.cc b/libstdc++-v3/testsuite/std/ranges/access/begin.cc index 6ef44ee..a08ad37 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/begin.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/begin.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> 
+static_assert(__gnu_test::is_customization_point_object(std::ranges::begin)); + using std::same_as; void diff --git a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc index dc06695..ed80af5 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc @@ -20,6 +20,10 @@ #include <ranges> #include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::cbegin)); + using std::same_as; void diff --git a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc index 2dfb683..d51ff1d 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc @@ -20,6 +20,9 @@ #include <ranges> #include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::cdata)); template<typename T> concept has_cdata diff --git a/libstdc++-v3/testsuite/std/ranges/access/cend.cc b/libstdc++-v3/testsuite/std/ranges/access/cend.cc index fcb80f2..3e685ae 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/cend.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/cend.cc @@ -20,6 +20,9 @@ #include <ranges> #include <testsuite_hooks.h> +#include <testsuite_iterators.h> + +static_assert(__gnu_test::is_customization_point_object(std::ranges::cend)); using std::same_as; diff --git a/libstdc++-v3/testsuite/std/ranges/access/crbegin.cc b/libstdc++-v3/testsuite/std/ranges/access/crbegin.cc index a41234b..6b02b47 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/crbegin.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/crbegin.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::crbegin)); + struct R1 { int i = 0; diff --git a/libstdc++-v3/testsuite/std/ranges/access/crend.cc b/libstdc++-v3/testsuite/std/ranges/access/crend.cc index 8f8b08a..eb010d9 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/crend.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/crend.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::crend)); + struct R1 { int i = 0; diff --git a/libstdc++-v3/testsuite/std/ranges/access/data.cc b/libstdc++-v3/testsuite/std/ranges/access/data.cc index 4f16f44..b2083d1 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/data.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/data.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::data)); + template<typename T> concept has_data = requires (T&& t) { std::ranges::data(std::forward<T>(t)); }; diff --git a/libstdc++-v3/testsuite/std/ranges/access/empty.cc b/libstdc++-v3/testsuite/std/ranges/access/empty.cc index b2d8b10..3cad474 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/empty.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/empty.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::empty)); + using std::same_as; void diff --git a/libstdc++-v3/testsuite/std/ranges/access/end.cc b/libstdc++-v3/testsuite/std/ranges/access/end.cc index 7bf0dd4..25f21c7 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/end.cc +++ 
b/libstdc++-v3/testsuite/std/ranges/access/end.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::end)); + using std::same_as; void diff --git a/libstdc++-v3/testsuite/std/ranges/access/rbegin.cc b/libstdc++-v3/testsuite/std/ranges/access/rbegin.cc index a166ad7..0006f89 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/rbegin.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/rbegin.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::rbegin)); + struct R1 { int i = 0; diff --git a/libstdc++-v3/testsuite/std/ranges/access/rend.cc b/libstdc++-v3/testsuite/std/ranges/access/rend.cc index 4ba5447..0ac86bc 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/rend.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/rend.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::rend)); + struct R1 { int i = 0; diff --git a/libstdc++-v3/testsuite/std/ranges/access/size.cc b/libstdc++-v3/testsuite/std/ranges/access/size.cc index f25a1cb..c7e4f78 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/size.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/size.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::size)); + void test01() { diff --git a/libstdc++-v3/testsuite/std/ranges/access/ssize.cc b/libstdc++-v3/testsuite/std/ranges/access/ssize.cc index fdbf245..7337f62 100644 --- a/libstdc++-v3/testsuite/std/ranges/access/ssize.cc +++ b/libstdc++-v3/testsuite/std/ranges/access/ssize.cc @@ -22,6 +22,8 @@ #include <testsuite_hooks.h> #include <testsuite_iterators.h> +static_assert(__gnu_test::is_customization_point_object(std::ranges::ssize)); + using std::ptrdiff_t; void diff --git a/libstdc++-v3/testsuite/util/testsuite_iterators.h b/libstdc++-v3/testsuite/util/testsuite_iterators.h index 4e668d6..6b835ac 100644 --- a/libstdc++-v3/testsuite/util/testsuite_iterators.h +++ b/libstdc++-v3/testsuite/util/testsuite_iterators.h @@ -894,6 +894,22 @@ namespace __gnu_test // This is also true for test_container, although only when it has forward // iterators (because output_iterator_wrapper and input_iterator_wrapper are // not default constructible so do not model std::input_or_output_iterator). + + + // Test for basic properties of C++20 16.3.3.6 [customization.point.object]. + template<typename T> + constexpr bool + is_customization_point_object(T& obj) noexcept + { + // A [CPO] is a function object with a literal class type. + static_assert( std::is_class_v<T> || std::is_union_v<T> ); + static_assert( __is_literal_type(T) ); + // The type of a [CPO], ignoring cv-qualifiers, shall model semiregular. + static_assert( std::semiregular<std::remove_cv_t<T>> ); + + return true; + } + #endif // C++20 } // namespace __gnu_test #endif // _TESTSUITE_ITERATORS
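Since the helper leans on the GCC-internal __is_literal_type builtin, it lives in the testsuite header; the portable part of the check can be reproduced standalone. An illustrative analogue (assumes C++20; is_cpo_like is an invented name, not part of the patch):

#include <concepts>
#include <ranges>
#include <type_traits>

/* Standalone analogue of is_customization_point_object covering only
   the portable requirements from [customization.point.object]: the
   object has class type and models semiregular once cv-qualifiers
   are stripped.  */
template<typename T>
constexpr bool
is_cpo_like (T &) noexcept
{
  static_assert (std::is_class_v<T> || std::is_union_v<T>);
  static_assert (std::semiregular<std::remove_cv_t<T>>);
  return true;
}

static_assert (is_cpo_like (std::ranges::begin));
static_assert (is_cpo_like (std::ranges::size));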