author    | Martin Liska <mliska@suse.cz> | 2021-06-24 16:09:54 +0200
committer | Martin Liska <mliska@suse.cz> | 2021-06-24 16:09:54 +0200
commit    | 441aa2ce23465dbc6f0b108de3a72cb7f8003a9f (patch)
tree      | 440e6f5f76f63f7154071330d749fb416ad32d53
parent    | 0c6508fe976763cf4fe57c3cb6954b7ab7d55619 (diff)
parent    | addd5f0e61f73659c29f47a02e93bfc5e534dbf6 (diff)
Merge branch 'master' into devel/sphinx
135 files changed, 5383 insertions, 2503 deletions
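Among the changes below, the h8300 patch makes output_logical_op and compute_logical_op_length inspect the insn's PARALLEL: if its second element is not a CLOBBER of the flags register, the condition codes are live and the operation must be performed in full width. The following is a standalone sketch of that test using simplified stand-in types — the `rtx` struct, codes, and demo helpers here are mocks for illustration, not GCC's real RTL API (the real code uses PATTERN (insn) and XVECEXP (pattern, 0, 1)):

```c
#include <assert.h>
#include <stdbool.h>

/* Mock RTL codes -- stand-ins for GCC's rtx_code, illustration only.  */
enum rtx_code { SET, CLOBBER, PARALLEL };

/* Mock rtx: a code plus (for PARALLEL) its two elements.  */
struct rtx {
  enum rtx_code code;
  struct rtx *elems[2];
};

/* The key idea from the patch: look at the second element of the insn's
   PARALLEL.  If it is not a CLOBBER, the condition codes are meaningful
   and the logical op must be done in MODE, not in a smaller size.  */
bool
cc_meaningful (const struct rtx *pattern)
{
  assert (pattern->code == PARALLEL);
  return pattern->elems[1]->code != CLOBBER;
}

/* (set ...) (clobber (reg CC)) -- flags are dead, narrow ops allowed.  */
bool
demo_clobber_case (void)
{
  struct rtx op = { SET, { 0, 0 } }, clb = { CLOBBER, { 0, 0 } };
  struct rtx par = { PARALLEL, { &op, &clb } };
  return cc_meaningful (&par);
}

/* (set ...) (set (reg CC) ...) -- flags are used, full-width op required.  */
bool
demo_flags_case (void)
{
  struct rtx op = { SET, { 0, 0 } }, flags = { SET, { 0, 0 } };
  struct rtx par = { PARALLEL, { &op, &flags } };
  return cc_meaningful (&par);
}
```

This mirrors why the patch passes `insn` down into the two helpers: only the enclosing insn's shape, not the operands, tells the backend whether a shorter multi-insn sequence would corrupt a condition-code result the compare-elimination pass wants to reuse.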
@@ -32,6 +32,7 @@ POTFILES
 TAGS
 TAGS.sub
+cscope.out
 .local.vimrc
 .lvimrc
diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index b46a215..9cfa3f0 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,9 @@
+2021-06-23  Martin Liska  <mliska@suse.cz>
+
+        * gcc-git-customization.sh: Use the new wrapper.
+        * git-commit-mklog.py: New file.
+        * prepare-commit-msg: Support GCC_MKLOG_ARGS.
+
 2021-06-22  Martin Liska  <mliska@suse.cz>
 
         * mklog.py: Fix flake8 issue.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 502a814..71534e4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,148 @@
+2021-06-23  Dimitar Dimitrov  <dimitar@dinux.eu>
+
+        * doc/lto.texi (Design Overview): Update that slim objects are
+        the default.
+
+2021-06-23  Aaron Sawdey  <acsawdey@linux.ibm.com>
+
+        * config/rs6000/rs6000-cpus.def: Take OPTION_MASK_PCREL_OPT out
+        of OTHER_POWER10_MASKS so it will not be enabled by default.
+
+2021-06-23  Richard Biener  <rguenther@suse.de>
+            Martin Jambor  <mjambor@suse.cz>
+
+        * tree-inline.c (setup_one_parameter): Set TREE_READONLY of the
+        param replacement unconditionally.  Adjust comment.
+
+2021-06-23  Andrew MacLeod  <amacleod@redhat.com>
+
+        * Makefile.in (OBJS): Add gimple-range-fold.o
+        * gimple-range-fold.cc: New.
+        * gimple-range-fold.h: New.
+        * gimple-range-gori.cc (gimple_range_calc_op1): Move to here.
+        (gimple_range_calc_op2): Ditto.
+        * gimple-range-gori.h: Move prototypes to here.
+        * gimple-range.cc: Adjust include files.
+        (fur_source:fur_source): Relocate to gimple-range-fold.cc.
+        (fur_source::get_operand): Ditto.
+        (fur_source::get_phi_operand): Ditto.
+        (fur_source::query_relation): Ditto.
+        (fur_source::register_relation): Ditto.
+        (class fur_edge): Ditto.
+        (fur_edge::fur_edge): Ditto.
+        (fur_edge::get_operand): Ditto.
+        (fur_edge::get_phi_operand): Ditto.
+        (fur_stmt::fur_stmt): Ditto.
+        (fur_stmt::get_operand): Ditto.
+        (fur_stmt::get_phi_operand): Ditto.
+        (fur_stmt::query_relation): Ditto.
+        (class fur_depend): Relocate to gimple-range-fold.h.
+        (fur_depend::fur_depend): Relocate to gimple-range-fold.cc.
+        (fur_depend::register_relation): Ditto.
+        (fur_depend::register_relation): Ditto.
+        (class fur_list): Ditto.
+        (fur_list::fur_list): Ditto.
+        (fur_list::get_operand): Ditto.
+        (fur_list::get_phi_operand): Ditto.
+        (fold_range): Ditto.
+        (adjust_pointer_diff_expr): Ditto.
+        (gimple_range_adjustment): Ditto.
+        (gimple_range_base_of_assignment): Ditto.
+        (gimple_range_operand1): Ditto.
+        (gimple_range_operand2): Ditto.
+        (gimple_range_calc_op1): Relocate to gimple-range-gori.cc.
+        (gimple_range_calc_op2): Ditto.
+        (fold_using_range::fold_stmt): Relocate to gimple-range-fold.cc.
+        (fold_using_range::range_of_range_op): Ditto.
+        (fold_using_range::range_of_address): Ditto.
+        (fold_using_range::range_of_phi): Ditto.
+        (fold_using_range::range_of_call): Ditto.
+        (fold_using_range::range_of_builtin_ubsan_call): Ditto.
+        (fold_using_range::range_of_builtin_call): Ditto.
+        (fold_using_range::range_of_cond_expr): Ditto.
+        (fold_using_range::range_of_ssa_name_with_loop_info): Ditto.
+        (fold_using_range::relation_fold_and_or): Ditto.
+        (fold_using_range::postfold_gcond_edges): Ditto.
+        * gimple-range.h: Add gimple-range-fold.h to include files.  Change
+        GIMPLE_RANGE_STMT_H to GIMPLE_RANGE_H.
+        (gimple_range_handler): Relocate to gimple-range-fold.h.
+        (gimple_range_ssa_p): Ditto.
+        (range_compatible_p): Ditto.
+        (class fur_source): Ditto.
+        (class fur_stmt): Ditto.
+        (class fold_using_range): Ditto.
+        (gimple_range_calc_op1): Relocate to gimple-range-gori.h
+        (gimple_range_calc_op2): Ditto.
+
+2021-06-23  Andrew MacLeod  <amacleod@redhat.com>
+
+        PR tree-optimization/101148
+        PR tree-optimization/101014
+        * gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust.
+        (ranger_cache::~ranger_cache): Adjust.
+        (ranger_cache::block_range): Check if propagation disallowed.
+        (ranger_cache::propagate_cache): Disallow propagation if new value
+        can't be stored properly.
+        * gimple-range-cache.h (ranger_cache::m_propfail): New member.
+
+2021-06-23  Andrew MacLeod  <amacleod@redhat.com>
+
+        * gimple-range-cache.cc (class ssa_block_ranges): Adjust prototype.
+        (sbr_vector::set_bb_range): Return true.
+        (class sbr_sparse_bitmap): Adjust.
+        (sbr_sparse_bitmap::set_bb_range): Return value.
+        (block_range_cache::set_bb_range): Return value.
+        (ranger_cache::propagate_cache): Use return value to print msg.
+        * gimple-range-cache.h (class block_range_cache): Adjust.
+
+2021-06-23  Andrew MacLeod  <amacleod@redhat.com>
+
+        * gimple-range.cc (dump_bb): Use range_on_edge from the cache.
+
+2021-06-23  Jeff Law  <jeffreyalaw@gmail.com>
+
+        * config/h8300/logical.md (<code><mode>3<ccnz>): Use <cczn>
+        so this pattern can be used for test/compare removal.  Pass
+        current insn to compute_logical_op_length and output_logical_op.
+        * config/h8300/h8300.c (compute_logical_op_cc): Remove.
+        (h8300_and_costs): Add argument to compute_logical_op_length.
+        (output_logical_op): Add new argument.  Use it to determine if the
+        condition codes are used and adjust the output accordingly.
+        (compute_logical_op_length): Add new argument and update length
+        computations when condition codes are used.
+        * config/h8300/h8300-protos.h (compute_logical_op_length): Update
+        prototype.
+        (output_logical_op): Likewise.
+
+2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
+
+        PR target/89021
+        * config/i386/i386-expand.c (expand_vec_perm_pshufb):
+        Handle 64bit modes for TARGET_XOP.  Use indirect gen_* functions.
+        * config/i386/mmx.md (mmx_ppermv64): New insn pattern.
+        * config/i386/i386.md (unspec): Move UNSPEC_XOP_PERMUTE from ...
+        * config/i386/sse.md (unspec): ... here.
+
+2021-06-23  Martin Liska  <mliska@suse.cz>
+
+        PR target/98636
+        * optc-save-gen.awk: Put back arm_fp16_format to
+        checked_options.
+
+2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
+
+        PR target/101175
+        * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
+        (bsr): Ditto.
+        (*bsrhi): Remove.
+        (clz<mode>2): Update RTX pattern for additions.
+
+2021-06-23  Jakub Jelinek  <jakub@redhat.com>
+
+        PR middle-end/101167
+        * omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
+        and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
+
 2021-06-22  Sergei Trofimovich  <siarheit@google.com>
 
         * doc/rtl.texi: drop unbalanced parenthesis.
diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP
index e8c8a9f..fb7726b 100644
--- a/gcc/DATESTAMP
+++ b/gcc/DATESTAMP
@@ -1 +1 @@
-20210623
+20210624
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ebf2644..d32de22 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1398,6 +1398,7 @@ OBJS = \
         gimple-range.o \
         gimple-range-cache.o \
         gimple-range-edge.o \
+        gimple-range-fold.o \
         gimple-range-gori.o \
         gimple-ssa-backprop.o \
         gimple-ssa-evrp.o \
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index be4b29a..88022d0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1208,7 +1208,9 @@ enum c_omp_region_type
   C_ORT_OMP = 1 << 0,
   C_ORT_ACC = 1 << 1,
   C_ORT_DECLARE_SIMD = 1 << 2,
-  C_ORT_OMP_DECLARE_SIMD = C_ORT_OMP | C_ORT_DECLARE_SIMD
+  C_ORT_TARGET = 1 << 3,
+  C_ORT_OMP_DECLARE_SIMD = C_ORT_OMP | C_ORT_DECLARE_SIMD,
+  C_ORT_OMP_TARGET = C_ORT_OMP | C_ORT_TARGET
 };
 
 extern tree c_finish_omp_master (location_t, tree);
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 28fbb1d..cd81a08 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -2092,6 +2092,19 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
           s = C_OMP_CLAUSE_SPLIT_TEAMS;
           break;
         case OMP_CLAUSE_IN_REDUCTION:
+          if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_MAP)) != 0)
+            {
+              /* When on target, map(always, tofrom: item) is added as
+                 well.  For non-combined target it is added in the FEs.
*/ + c = build_omp_clause (OMP_CLAUSE_LOCATION (clauses), + OMP_CLAUSE_MAP); + OMP_CLAUSE_DECL (c) = OMP_CLAUSE_DECL (clauses); + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_ALWAYS_TOFROM); + OMP_CLAUSE_CHAIN (c) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET]; + cclauses[C_OMP_CLAUSE_SPLIT_TARGET] = c; + s = C_OMP_CLAUSE_SPLIT_TARGET; + break; + } /* in_reduction on taskloop simd becomes reduction on the simd and keeps being in_reduction on taskloop. */ if (code == OMP_SIMD) diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index b90710c..c0f7020 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -18701,7 +18701,9 @@ omp_split_clauses (location_t loc, enum tree_code code, c_omp_split_clauses (loc, code, mask, clauses, cclauses); for (i = 0; i < C_OMP_CLAUSE_SPLIT_COUNT; i++) if (cclauses[i]) - cclauses[i] = c_finish_omp_clauses (cclauses[i], C_ORT_OMP); + cclauses[i] = c_finish_omp_clauses (cclauses[i], + i == C_OMP_CLAUSE_SPLIT_TARGET + ? C_ORT_OMP_TARGET : C_ORT_OMP); } /* OpenMP 5.0: @@ -20013,6 +20015,7 @@ c_parser_omp_target_exit_data (location_t loc, c_parser *parser, | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FIRSTPRIVATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_ALLOCATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEFAULTMAP) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR)) static bool @@ -20179,7 +20182,18 @@ c_parser_omp_target (c_parser *parser, enum pragma_context context, bool *if_p) OMP_TARGET_CLAUSES (stmt) = c_parser_omp_all_clauses (parser, OMP_TARGET_CLAUSE_MASK, - "#pragma omp target"); + "#pragma omp target", false); + for (tree c = OMP_TARGET_CLAUSES (stmt); c; c = OMP_CLAUSE_CHAIN (c)) + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION) + { + tree nc = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); + OMP_CLAUSE_DECL (nc) = OMP_CLAUSE_DECL (c); + OMP_CLAUSE_SET_MAP_KIND (nc, GOMP_MAP_ALWAYS_TOFROM); + OMP_CLAUSE_CHAIN (nc) = OMP_CLAUSE_CHAIN (c); + OMP_CLAUSE_CHAIN (c) 
= nc; + } + OMP_TARGET_CLAUSES (stmt) + = c_finish_omp_clauses (OMP_TARGET_CLAUSES (stmt), C_ORT_OMP_TARGET); c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true); pc = &OMP_TARGET_CLAUSES (stmt); diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 77de881..d0d36c3 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -13648,32 +13648,29 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort) && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)) return false; gcc_assert (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FORCE_DEVICEPTR); - if (ort == C_ORT_OMP || ort == C_ORT_ACC) - switch (OMP_CLAUSE_MAP_KIND (c)) - { - case GOMP_MAP_ALLOC: - case GOMP_MAP_IF_PRESENT: - case GOMP_MAP_TO: - case GOMP_MAP_FROM: - case GOMP_MAP_TOFROM: - case GOMP_MAP_ALWAYS_TO: - case GOMP_MAP_ALWAYS_FROM: - case GOMP_MAP_ALWAYS_TOFROM: - case GOMP_MAP_RELEASE: - case GOMP_MAP_DELETE: - case GOMP_MAP_FORCE_TO: - case GOMP_MAP_FORCE_FROM: - case GOMP_MAP_FORCE_TOFROM: - case GOMP_MAP_FORCE_PRESENT: - OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION (c) = 1; - break; - default: - break; - } + switch (OMP_CLAUSE_MAP_KIND (c)) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_IF_PRESENT: + case GOMP_MAP_TO: + case GOMP_MAP_FROM: + case GOMP_MAP_TOFROM: + case GOMP_MAP_ALWAYS_TO: + case GOMP_MAP_ALWAYS_FROM: + case GOMP_MAP_ALWAYS_TOFROM: + case GOMP_MAP_RELEASE: + case GOMP_MAP_DELETE: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_FORCE_TOFROM: + case GOMP_MAP_FORCE_PRESENT: + OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION (c) = 1; + break; + default: + break; + } tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); - if (ort != C_ORT_OMP && ort != C_ORT_ACC) - OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_POINTER); - else if (TREE_CODE (t) == COMPONENT_REF) + if (TREE_CODE (t) == COMPONENT_REF) OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH); else OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER); @@ -13970,6 +13967,7 @@ c_finish_omp_clauses (tree 
clauses, enum c_omp_region_type ort) int reduction_seen = 0; bool allocate_seen = false; bool implicit_moved = false; + bool target_in_reduction_seen = false; bitmap_obstack_initialize (NULL); bitmap_initialize (&generic_head, &bitmap_default_obstack); @@ -13981,7 +13979,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) bitmap_initialize (&map_field_head, &bitmap_default_obstack); bitmap_initialize (&map_firstprivate_head, &bitmap_default_obstack); /* If ort == C_ORT_OMP used as nontemporal_head or use_device_xxx_head - instead. */ + instead and for ort == C_ORT_OMP_TARGET used as in_reduction_head. */ bitmap_initialize (&oacc_reduction_head, &bitmap_default_obstack); if (ort & C_ORT_ACC) @@ -14374,8 +14372,22 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) || (ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR || (OMP_CLAUSE_CODE (c) - == OMP_CLAUSE_USE_DEVICE_ADDR)))) + == OMP_CLAUSE_USE_DEVICE_ADDR))) + || (ort == C_ORT_OMP_TARGET + && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION)) { + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION + && (bitmap_bit_p (&generic_head, DECL_UID (t)) + || bitmap_bit_p (&firstprivate_head, DECL_UID (t)))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%qD appears more than once in data-sharing " + "clauses", t); + remove = true; + break; + } + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION) + target_in_reduction_seen = true; if (bitmap_bit_p (&oacc_reduction_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), @@ -14390,7 +14402,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } else if (bitmap_bit_p (&generic_head, DECL_UID (t)) || bitmap_bit_p (&firstprivate_head, DECL_UID (t)) - || bitmap_bit_p (&lastprivate_head, DECL_UID (t))) + || bitmap_bit_p (&lastprivate_head, DECL_UID (t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qE appears more than once in data clauses", t); @@ -14457,7 +14470,8 
@@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) && bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) remove = true; else if (bitmap_bit_p (&generic_head, DECL_UID (t)) - || bitmap_bit_p (&firstprivate_head, DECL_UID (t))) + || bitmap_bit_p (&firstprivate_head, DECL_UID (t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qE appears more than once in data clauses", t); @@ -14861,7 +14875,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) if (VAR_P (t) || TREE_CODE (t) == PARM_DECL) { if (bitmap_bit_p (&map_field_head, DECL_UID (t)) - || (ort == C_ORT_OMP + || (ort != C_ORT_ACC && bitmap_bit_p (&map_head, DECL_UID (t)))) break; } @@ -14918,7 +14932,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER) { if (bitmap_bit_p (&generic_head, DECL_UID (t)) - || bitmap_bit_p (&firstprivate_head, DECL_UID (t))) + || bitmap_bit_p (&firstprivate_head, DECL_UID (t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD appears more than once in data clauses", t); @@ -14935,13 +14950,10 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; } else - { - bitmap_set_bit (&generic_head, DECL_UID (t)); - bitmap_set_bit (&map_firstprivate_head, DECL_UID (t)); - } + bitmap_set_bit (&map_firstprivate_head, DECL_UID (t)); } else if (bitmap_bit_p (&map_head, DECL_UID (t)) - && (ort != C_ORT_OMP + && (ort == C_ORT_ACC || !bitmap_bit_p (&map_field_head, DECL_UID (t)))) { if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) @@ -14955,8 +14967,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) "%qD appears more than once in map clauses", t); remove = true; } - else if (bitmap_bit_p (&generic_head, DECL_UID (t)) - && ort == C_ORT_ACC) + else if (ort == C_ORT_ACC + && bitmap_bit_p (&generic_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD 
appears more than once in data clauses", t); @@ -15050,7 +15062,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE) { if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR - && ort == C_ORT_OMP) + && ort != C_ORT_ACC) { error_at (OMP_CLAUSE_LOCATION (c), "%qs variable is not a pointer", @@ -15335,7 +15347,10 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) reduction_seen = -2; } - if (linear_variable_step_check || reduction_seen == -2 || allocate_seen) + if (linear_variable_step_check + || reduction_seen == -2 + || allocate_seen + || target_in_reduction_seen) for (pc = &clauses, c = clauses; c ; c = *pc) { bool remove = false; @@ -15383,6 +15398,20 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION && reduction_seen == -2) OMP_CLAUSE_REDUCTION_INSCAN (c) = 0; + if (target_in_reduction_seen + && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP) + { + tree t = OMP_CLAUSE_DECL (c); + while (handled_component_p (t) + || TREE_CODE (t) == INDIRECT_REF + || TREE_CODE (t) == ADDR_EXPR + || TREE_CODE (t) == MEM_REF + || TREE_CODE (t) == NON_LVALUE_EXPR) + t = TREE_OPERAND (t, 0); + if (DECL_P (t) + && bitmap_bit_p (&oacc_reduction_head, DECL_UID (t))) + OMP_CLAUSE_MAP_IN_REDUCTION (c) = 1; + } if (remove) *pc = OMP_CLAUSE_CHAIN (c); diff --git a/gcc/config/h8300/h8300-protos.h b/gcc/config/h8300/h8300-protos.h index af65329..d7efa97 100644 --- a/gcc/config/h8300/h8300-protos.h +++ b/gcc/config/h8300/h8300-protos.h @@ -36,10 +36,11 @@ extern const char *output_simode_bld (int, rtx[]); extern void final_prescan_insn (rtx_insn *, rtx *, int); extern int h8300_expand_movsi (rtx[]); extern machine_mode h8300_select_cc_mode (RTX_CODE, rtx, rtx); -extern const char *output_logical_op (machine_mode, rtx_code code, rtx *); -extern unsigned int compute_logical_op_length (machine_mode, rtx_code, rtx *); +extern const char *output_logical_op 
(machine_mode, rtx_code code, + rtx *, rtx_insn *); +extern unsigned int compute_logical_op_length (machine_mode, rtx_code, + rtx *, rtx_insn *); -extern int compute_logical_op_cc (machine_mode, rtx *); extern int compute_a_shift_cc (rtx, rtx *); #ifdef HAVE_ATTR_cc extern enum attr_cc compute_plussi_cc (rtx *); diff --git a/gcc/config/h8300/h8300.c b/gcc/config/h8300/h8300.c index 2b88325..511c2b2 100644 --- a/gcc/config/h8300/h8300.c +++ b/gcc/config/h8300/h8300.c @@ -1100,7 +1100,7 @@ h8300_and_costs (rtx x) operands[1] = XEXP (x, 0); operands[2] = XEXP (x, 1); operands[3] = x; - return compute_logical_op_length (GET_MODE (x), AND, operands) / 2; + return compute_logical_op_length (GET_MODE (x), AND, operands, NULL) / 2; } /* Compute the cost of a shift insn. */ @@ -2881,7 +2881,7 @@ compute_plussi_cc (rtx *operands) /* Output a logical insn. */ const char * -output_logical_op (machine_mode mode, rtx_code code, rtx *operands) +output_logical_op (machine_mode mode, rtx_code code, rtx *operands, rtx_insn *insn) { /* Pretend that every byte is affected if both operands are registers. */ const unsigned HOST_WIDE_INT intval = @@ -2906,6 +2906,19 @@ output_logical_op (machine_mode mode, rtx_code code, rtx *operands) const char *opname; char insn_buf[100]; + /* INSN is the current insn, we examine its overall form to see if we're + supposed to set or clobber the condition codes. + + This is important to know. If we are setting condition codes, then we + must do the operation in MODE and not in some smaller size. + + The key is to look at the second object in the PARALLEL. If it is not + a CLOBBER, then we care about the condition codes. 
*/ + rtx pattern = PATTERN (insn); + gcc_assert (GET_CODE (pattern) == PARALLEL); + rtx second_op = XVECEXP (pattern, 0, 1); + bool cc_meaningful = (GET_CODE (second_op) != CLOBBER); + switch (code) { case AND: @@ -2928,8 +2941,9 @@ output_logical_op (machine_mode mode, rtx_code code, rtx *operands) output_asm_insn (insn_buf, operands); break; case E_HImode: - /* First, see if we can finish with one insn. */ - if (b0 != 0 && b1 != 0) + /* First, see if we can (or must) finish with one insn. */ + if (cc_meaningful + || (b0 != 0 && b1 != 0)) { sprintf (insn_buf, "%s.w\t%%T2,%%T0", opname); output_asm_insn (insn_buf, operands); @@ -2964,10 +2978,11 @@ output_logical_op (machine_mode mode, rtx_code code, rtx *operands) /* Check if doing everything with one insn is no worse than using multiple insns. */ - if (w0 != 0 && w1 != 0 - && !(lower_half_easy_p && upper_half_easy_p) - && !(code == IOR && w1 == 0xffff - && (w0 & 0x8000) != 0 && lower_half_easy_p)) + if (cc_meaningful + || (w0 != 0 && w1 != 0 + && !(lower_half_easy_p && upper_half_easy_p) + && !(code == IOR && w1 == 0xffff + && (w0 & 0x8000) != 0 && lower_half_easy_p))) { sprintf (insn_buf, "%s.l\t%%S2,%%S0", opname); output_asm_insn (insn_buf, operands); @@ -3037,7 +3052,7 @@ output_logical_op (machine_mode mode, rtx_code code, rtx *operands) /* Compute the length of a logical insn. */ unsigned int -compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands) +compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands, rtx_insn *insn) { /* Pretend that every byte is affected if both operands are registers. */ const unsigned HOST_WIDE_INT intval = @@ -3061,6 +3076,23 @@ compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands) /* Insn length. */ unsigned int length = 0; + /* INSN is the current insn, we examine its overall form to see if we're + supposed to set or clobber the condition codes. + + This is important to know. 
If we are setting condition codes, then we + must do the operation in MODE and not in some smaller size. + + The key is to look at the second object in the PARALLEL. If it is not + a CLOBBER, then we care about the condition codes. */ + bool cc_meaningful = false; + if (insn) + { + rtx pattern = PATTERN (insn); + gcc_assert (GET_CODE (pattern) == PARALLEL); + rtx second_op = XVECEXP (pattern, 0, 1); + cc_meaningful = (GET_CODE (second_op) != CLOBBER); + } + switch (mode) { case E_QImode: @@ -3068,7 +3100,8 @@ compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands) case E_HImode: /* First, see if we can finish with one insn. */ - if (b0 != 0 && b1 != 0) + if (cc_meaningful + || (b0 != 0 && b1 != 0)) { length = h8300_length_from_table (operands[1], operands[2], &logicw_length_table); @@ -3098,10 +3131,11 @@ compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands) /* Check if doing everything with one insn is no worse than using multiple insns. */ - if (w0 != 0 && w1 != 0 - && !(lower_half_easy_p && upper_half_easy_p) - && !(code == IOR && w1 == 0xffff - && (w0 & 0x8000) != 0 && lower_half_easy_p)) + if (cc_meaningful + || (w0 != 0 && w1 != 0 + && !(lower_half_easy_p && upper_half_easy_p) + && !(code == IOR && w1 == 0xffff + && (w0 & 0x8000) != 0 && lower_half_easy_p))) { length = h8300_length_from_table (operands[1], operands[2], &logicl_length_table); @@ -3158,80 +3192,6 @@ compute_logical_op_length (machine_mode mode, rtx_code code, rtx *operands) return length; } -/* Compute which flag bits are valid after a logical insn. */ - -int -compute_logical_op_cc (machine_mode mode, rtx *operands) -{ - /* Figure out the logical op that we need to perform. */ - enum rtx_code code = GET_CODE (operands[3]); - /* Pretend that every byte is affected if both operands are registers. 
*/ - const unsigned HOST_WIDE_INT intval = - (unsigned HOST_WIDE_INT) ((GET_CODE (operands[2]) == CONST_INT) - /* Always use the full instruction if the - first operand is in memory. It is better - to use define_splits to generate the shorter - sequence where valid. */ - && register_operand (operands[1], VOIDmode) - ? INTVAL (operands[2]) : 0x55555555); - /* The determinant of the algorithm. If we perform an AND, 0 - affects a bit. Otherwise, 1 affects a bit. */ - const unsigned HOST_WIDE_INT det = (code != AND) ? intval : ~intval; - /* Break up DET into pieces. */ - const unsigned HOST_WIDE_INT b0 = (det >> 0) & 0xff; - const unsigned HOST_WIDE_INT b1 = (det >> 8) & 0xff; - const unsigned HOST_WIDE_INT w0 = (det >> 0) & 0xffff; - const unsigned HOST_WIDE_INT w1 = (det >> 16) & 0xffff; - int lower_half_easy_p = 0; - int upper_half_easy_p = 0; - /* Condition code. */ - enum attr_old_cc cc = OLD_CC_CLOBBER; - - switch (mode) - { - case E_HImode: - /* First, see if we can finish with one insn. */ - if (b0 != 0 && b1 != 0) - { - cc = OLD_CC_SET_ZNV; - } - break; - case E_SImode: - /* Determine if the lower half can be taken care of in no more - than two bytes. */ - lower_half_easy_p = (b0 == 0 - || b1 == 0 - || (code != IOR && w0 == 0xffff)); - - /* Determine if the upper half can be taken care of in no more - than two bytes. */ - upper_half_easy_p = ((code != IOR && w1 == 0xffff) - || (code == AND && w1 == 0xff00)); - - /* Check if doing everything with one insn is no worse than - using multiple insns. */ - if (w0 != 0 && w1 != 0 - && !(lower_half_easy_p && upper_half_easy_p) - && !(code == IOR && w1 == 0xffff - && (w0 & 0x8000) != 0 && lower_half_easy_p)) - { - cc = OLD_CC_SET_ZNV; - } - else - { - if (code == IOR - && w1 == 0xffff - && (w0 & 0x8000) != 0) - { - cc = OLD_CC_SET_ZNV; - } - } - break; - default: - gcc_unreachable (); - } - return cc; -} #if 0 /* Expand a conditional branch. 
*/ diff --git a/gcc/config/h8300/logical.md b/gcc/config/h8300/logical.md index 07d36cf..f07c79e 100644 --- a/gcc/config/h8300/logical.md +++ b/gcc/config/h8300/logical.md @@ -251,17 +251,16 @@ (logicals:QHSI (match_dup 1) (match_dup 2))) (clobber (reg:CC CC_REG))])]) -(define_insn "*<code><mode>3_clobber_flags" +(define_insn "*<code><mode>3<cczn>" [(set (match_operand:QHSI 0 "h8300_dst_operand" "=rQ") (logicals:QHSI (match_operand:QHSI 1 "h8300_dst_operand" "%0") (match_operand:QHSI 2 "h8300_src_operand" "rQi"))) (clobber (reg:CC CC_REG))] "h8300_operands_match_p (operands)" - { return output_logical_op (<MODE>mode, <CODE>, operands); } + { return output_logical_op (<MODE>mode, <CODE>, operands, insn); } [(set (attr "length") - (symbol_ref "compute_logical_op_length (<MODE>mode, <CODE>, operands)"))]) - + (symbol_ref "compute_logical_op_length (<MODE>mode, <CODE>, operands, insn)"))]) ;; ---------------------------------------------------------------------- ;; NOT INSTRUCTIONS diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 31df3a6..ea79e0b 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -855,8 +855,8 @@ BDESC (OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv1di3, "__ BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_movshdup, "__builtin_ia32_movshdup", IX86_BUILTIN_MOVSHDUP, UNKNOWN, (int) V4SF_FTYPE_V4SF) BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_movsldup, "__builtin_ia32_movsldup", IX86_BUILTIN_MOVSLDUP, UNKNOWN, (int) V4SF_FTYPE_V4SF) -BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_addsubv4sf3, "__builtin_ia32_addsubps", IX86_BUILTIN_ADDSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) -BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_addsubv2df3, "__builtin_ia32_addsubpd", IX86_BUILTIN_ADDSUBPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) +BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_vec_addsubv4sf3, "__builtin_ia32_addsubps", IX86_BUILTIN_ADDSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) 
+BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_vec_addsubv2df3, "__builtin_ia32_addsubpd", IX86_BUILTIN_ADDSUBPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_haddv4sf3, "__builtin_ia32_haddps", IX86_BUILTIN_HADDPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_haddv2df3, "__builtin_ia32_haddpd", IX86_BUILTIN_HADDPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF) BDESC (OPTION_MASK_ISA_SSE3, 0, CODE_FOR_sse3_hsubv4sf3, "__builtin_ia32_hsubps", IX86_BUILTIN_HSUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF) @@ -996,8 +996,8 @@ BDESC (OPTION_MASK_ISA_SSE2, 0, CODE_FOR_pclmulqdq, 0, IX86_BUILTIN_PCLMULQDQ128 /* AVX */ BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_addv4df3, "__builtin_ia32_addpd256", IX86_BUILTIN_ADDPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_addv8sf3, "__builtin_ia32_addps256", IX86_BUILTIN_ADDPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_addsubv4df3, "__builtin_ia32_addsubpd256", IX86_BUILTIN_ADDSUBPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_addsubv8sf3, "__builtin_ia32_addsubps256", IX86_BUILTIN_ADDSUBPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_vec_addsubv4df3, "__builtin_ia32_addsubpd256", IX86_BUILTIN_ADDSUBPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_vec_addsubv8sf3, "__builtin_ia32_addsubps256", IX86_BUILTIN_ADDSUBPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_andv4df3, "__builtin_ia32_andpd256", IX86_BUILTIN_ANDPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_andv8sf3, "__builtin_ia32_andps256", IX86_BUILTIN_ANDPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_andnotv4df3, "__builtin_ia32_andnpd256", IX86_BUILTIN_ANDNPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF) diff --git 
a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 2986b49..2cb939e 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -17467,10 +17467,23 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) if (!d->one_operand_p) { - if (!TARGET_XOP || GET_MODE_SIZE (d->vmode) != 16) + if (GET_MODE_SIZE (d->vmode) == 8) + { + if (!TARGET_XOP) + return false; + vmode = V8QImode; + } + else if (GET_MODE_SIZE (d->vmode) == 16) + { + if (!TARGET_XOP) + return false; + } + else if (GET_MODE_SIZE (d->vmode) == 32) { - if (TARGET_AVX2 - && valid_perm_using_mode_p (V2TImode, d)) + if (!TARGET_AVX2) + return false; + + if (valid_perm_using_mode_p (V2TImode, d)) { if (d->testing_p) return true; @@ -17492,6 +17505,8 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) } return false; } + else + return false; } else { @@ -17651,8 +17666,22 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) { rtx m128 = GEN_INT (-128); + /* Remap elements from the second operand, as we have to + account for inactive top 8 elements from the first operand. */ + if (!d->one_operand_p) + for (i = 0; i < nelt; ++i) + { + int ival = INTVAL (rperm[i]); + if (ival >= 8) + ival += 8; + rperm[i] = GEN_INT (ival); + } + + /* V8QI is emulated with V16QI instruction, fill inactive + elements in the top 8 positions with zeros. 
*/ for (i = nelt; i < 16; ++i) rperm[i] = m128; + vpmode = V16QImode; } @@ -17660,36 +17689,54 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) gen_rtvec_v (GET_MODE_NUNITS (vpmode), rperm)); vperm = force_reg (vpmode, vperm); - target = d->target; - if (d->vmode != vmode) + if (vmode == d->vmode) + target = d->target; + else target = gen_reg_rtx (vmode); + op0 = gen_lowpart (vmode, d->op0); + if (d->one_operand_p) { + rtx (*gen) (rtx, rtx, rtx); + if (vmode == V8QImode) - emit_insn (gen_mmx_pshufbv8qi3 (target, op0, vperm)); + gen = gen_mmx_pshufbv8qi3; else if (vmode == V16QImode) - emit_insn (gen_ssse3_pshufbv16qi3 (target, op0, vperm)); + gen = gen_ssse3_pshufbv16qi3; else if (vmode == V32QImode) - emit_insn (gen_avx2_pshufbv32qi3 (target, op0, vperm)); + gen = gen_avx2_pshufbv32qi3; else if (vmode == V64QImode) - emit_insn (gen_avx512bw_pshufbv64qi3 (target, op0, vperm)); + gen = gen_avx512bw_pshufbv64qi3; else if (vmode == V8SFmode) - emit_insn (gen_avx2_permvarv8sf (target, op0, vperm)); + gen = gen_avx2_permvarv8sf; else if (vmode == V8SImode) - emit_insn (gen_avx2_permvarv8si (target, op0, vperm)); + gen = gen_avx2_permvarv8si; else if (vmode == V16SFmode) - emit_insn (gen_avx512f_permvarv16sf (target, op0, vperm)); + gen = gen_avx512f_permvarv16sf; else if (vmode == V16SImode) - emit_insn (gen_avx512f_permvarv16si (target, op0, vperm)); + gen = gen_avx512f_permvarv16si; else gcc_unreachable (); + + emit_insn (gen (target, op0, vperm)); } else { + rtx (*gen) (rtx, rtx, rtx, rtx); + op1 = gen_lowpart (vmode, d->op1); - emit_insn (gen_xop_pperm (target, op0, op1, vperm)); + + if (vmode == V8QImode) + gen = gen_mmx_ppermv64; + else if (vmode == V16QImode) + gen = gen_xop_pperm; + else + gcc_unreachable (); + + emit_insn (gen (target, op0, op1, vperm)); } + if (target != d->target) emit_move_insn (d->target, gen_lowpart (d->vmode, target)); @@ -20658,8 +20705,9 @@ ix86_expand_vec_interleave (rtx targ, rtx op0, rtx op1, bool high_p) gcc_assert (ok); } 
-/* Optimize vector MUL generation for V8QI, V16QI and V32QI - under TARGET_AVX512BW. i.e. for v16qi a * b, it has +/* This function is similar as ix86_expand_vecop_qihi, + but optimized under AVX512BW by using vpmovwb. + For example, optimize vector MUL generation like vpmovzxbw ymm2, xmm0 vpmovzxbw ymm3, xmm1 @@ -20669,13 +20717,14 @@ ix86_expand_vec_interleave (rtx targ, rtx op0, rtx op1, bool high_p) it would take less instructions than ix86_expand_vecop_qihi. Return true if success. */ -bool -ix86_expand_vecmul_qihi (rtx dest, rtx op1, rtx op2) +static bool +ix86_expand_vecop_qihi2 (enum rtx_code code, rtx dest, rtx op1, rtx op2) { machine_mode himode, qimode = GET_MODE (dest); rtx hop1, hop2, hdest; rtx (*gen_extend)(rtx, rtx); rtx (*gen_truncate)(rtx, rtx); + bool uns_p = (code == ASHIFTRT) ? false : true; /* There's no V64HImode multiplication instruction. */ if (qimode == E_V64QImode) @@ -20696,17 +20745,17 @@ ix86_expand_vecmul_qihi (rtx dest, rtx op1, rtx op2) { case E_V8QImode: himode = V8HImode; - gen_extend = gen_zero_extendv8qiv8hi2; + gen_extend = uns_p ? gen_zero_extendv8qiv8hi2 : gen_extendv8qiv8hi2; gen_truncate = gen_truncv8hiv8qi2; break; case E_V16QImode: himode = V16HImode; - gen_extend = gen_zero_extendv16qiv16hi2; + gen_extend = uns_p ? gen_zero_extendv16qiv16hi2 : gen_extendv16qiv16hi2; gen_truncate = gen_truncv16hiv16qi2; break; case E_V32QImode: himode = V32HImode; - gen_extend = gen_zero_extendv32qiv32hi2; + gen_extend = uns_p ? 
gen_zero_extendv32qiv32hi2 : gen_extendv32qiv32hi2; gen_truncate = gen_truncv32hiv32qi2; break; default: @@ -20718,7 +20767,7 @@ ix86_expand_vecmul_qihi (rtx dest, rtx op1, rtx op2) hdest = gen_reg_rtx (himode); emit_insn (gen_extend (hop1, op1)); emit_insn (gen_extend (hop2, op2)); - emit_insn (gen_rtx_SET (hdest, simplify_gen_binary (MULT, himode, + emit_insn (gen_rtx_SET (hdest, simplify_gen_binary (code, himode, hop1, hop2))); emit_insn (gen_truncate (dest, hdest)); return true; @@ -20726,8 +20775,9 @@ ix86_expand_vecmul_qihi (rtx dest, rtx op1, rtx op2) /* Expand a vector operation shift by constant for a V*QImode in terms of the same operation on V*HImode. Return true if success. */ -bool -ix86_expand_vec_shift_qihi_constant (enum rtx_code code, rtx dest, rtx op1, rtx op2) +static bool +ix86_expand_vec_shift_qihi_constant (enum rtx_code code, + rtx dest, rtx op1, rtx op2) { machine_mode qimode, himode; HOST_WIDE_INT and_constant, xor_constant; @@ -20839,6 +20889,16 @@ ix86_expand_vecop_qihi (enum rtx_code code, rtx dest, rtx op1, rtx op2) bool uns_p = false; int i; + if (CONST_INT_P (op2) + && (code == ASHIFT || code == LSHIFTRT || code == ASHIFTRT) + && ix86_expand_vec_shift_qihi_constant (code, dest, op1, op2)) + return; + + if (TARGET_AVX512BW + && VECTOR_MODE_P (GET_MODE (op2)) + && ix86_expand_vecop_qihi2 (code, dest, op1, op2)) + return; + switch (qimode) { case E_V16QImode: @@ -20860,7 +20920,6 @@ ix86_expand_vecop_qihi (enum rtx_code code, rtx dest, rtx op1, rtx op2) gcc_unreachable (); } - op2_l = op2_h = op2; switch (code) { case MULT: @@ -20889,17 +20948,46 @@ ix86_expand_vecop_qihi (enum rtx_code code, rtx dest, rtx op1, rtx op2) op1_h = gen_reg_rtx (himode); ix86_expand_sse_unpack (op1_l, op1, uns_p, false); ix86_expand_sse_unpack (op1_h, op1, uns_p, true); + /* vashr/vlshr/vashl */ + if (GET_MODE_CLASS (GET_MODE (op2)) == MODE_VECTOR_INT) + { + rtx tmp = force_reg (qimode, op2); + op2_l = gen_reg_rtx (himode); + op2_h = gen_reg_rtx (himode); + 
ix86_expand_sse_unpack (op2_l, tmp, uns_p, false); + ix86_expand_sse_unpack (op2_h, tmp, uns_p, true); + } + else + op2_l = op2_h = op2; + full_interleave = true; break; default: gcc_unreachable (); } - /* Perform the operation. */ - res_l = expand_simple_binop (himode, code, op1_l, op2_l, NULL_RTX, - 1, OPTAB_DIRECT); - res_h = expand_simple_binop (himode, code, op1_h, op2_h, NULL_RTX, - 1, OPTAB_DIRECT); + /* Perform vashr/vlshr/vashl. */ + if (code != MULT + && GET_MODE_CLASS (GET_MODE (op2)) == MODE_VECTOR_INT) + { + res_l = gen_reg_rtx (himode); + res_h = gen_reg_rtx (himode); + emit_insn (gen_rtx_SET (res_l, + simplify_gen_binary (code, himode, + op1_l, op2_l))); + emit_insn (gen_rtx_SET (res_h, + simplify_gen_binary (code, himode, + op1_h, op2_h))); + } + /* Perform mult/ashr/lshr/ashl. */ + else + { + res_l = expand_simple_binop (himode, code, op1_l, op2_l, NULL_RTX, + 1, OPTAB_DIRECT); + res_h = expand_simple_binop (himode, code, op1_h, op2_h, NULL_RTX, + 1, OPTAB_DIRECT); + } + gcc_assert (res_l && res_h); /* Merge the data back into the right place. 
*/ diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 1d05206..65fc307 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -208,10 +208,7 @@ extern void ix86_expand_round (rtx, rtx); extern void ix86_expand_rounddf_32 (rtx, rtx); extern void ix86_expand_round_sse4 (rtx, rtx); -extern bool ix86_expand_vecmul_qihi (rtx, rtx, rtx); extern void ix86_expand_vecop_qihi (enum rtx_code, rtx, rtx, rtx); -extern bool ix86_expand_vec_shift_qihi_constant (enum rtx_code, rtx, rtx, rtx); - extern rtx ix86_split_stack_guard (void); extern void ix86_move_vector_high_sse_to_mmx (rtx); diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 700c158..9043be3 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -120,6 +120,7 @@ UNSPEC_MOVMSK UNSPEC_BLENDV UNSPEC_PSHUFB + UNSPEC_XOP_PERMUTE UNSPEC_RCP UNSPEC_RSQRT UNSPEC_PSADBW @@ -14533,10 +14534,12 @@ (set_attr "mode" "SI")]) (define_insn "bsr_rex64" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ (match_operand:DI 1 "nonimmediate_operand" "rm") + (const_int 0))) + (set (match_operand:DI 0 "register_operand" "=r") (minus:DI (const_int 63) - (clz:DI (match_operand:DI 1 "nonimmediate_operand" "rm")))) - (clobber (reg:CC FLAGS_REG))] + (clz:DI (match_dup 1))))] "TARGET_64BIT" "bsr{q}\t{%1, %0|%0, %1}" [(set_attr "type" "alu1") @@ -14545,10 +14548,12 @@ (set_attr "mode" "DI")]) (define_insn "bsr" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ (match_operand:SI 1 "nonimmediate_operand" "rm") + (const_int 0))) + (set (match_operand:SI 0 "register_operand" "=r") (minus:SI (const_int 31) - (clz:SI (match_operand:SI 1 "nonimmediate_operand" "rm")))) - (clobber (reg:CC FLAGS_REG))] + (clz:SI (match_dup 1))))] "" "bsr{l}\t{%1, %0|%0, %1}" [(set_attr "type" "alu1") @@ -14556,25 +14561,15 @@ (set_attr "znver1_decode" "vector") (set_attr "mode" "SI")]) 
-(define_insn "*bsrhi" - [(set (match_operand:HI 0 "register_operand" "=r") - (minus:HI (const_int 15) - (clz:HI (match_operand:HI 1 "nonimmediate_operand" "rm")))) - (clobber (reg:CC FLAGS_REG))] - "" - "bsr{w}\t{%1, %0|%0, %1}" - [(set_attr "type" "alu1") - (set_attr "prefix_0f" "1") - (set_attr "znver1_decode" "vector") - (set_attr "mode" "HI")]) - (define_expand "clz<mode>2" [(parallel - [(set (match_operand:SWI48 0 "register_operand") + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ (match_operand:SWI48 1 "nonimmediate_operand" "rm") + (const_int 0))) + (set (match_operand:SWI48 0 "register_operand") (minus:SWI48 (match_dup 2) - (clz:SWI48 (match_operand:SWI48 1 "nonimmediate_operand")))) - (clobber (reg:CC FLAGS_REG))]) + (clz:SWI48 (match_dup 1))))]) (parallel [(set (match_dup 0) (xor:SWI48 (match_dup 0) (match_dup 2))) (clobber (reg:CC FLAGS_REG))])] diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index a107ac5..7a827dc 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -2331,6 +2331,19 @@ "vpcmov\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "sse4arg")]) +;; XOP permute instructions +(define_insn "mmx_ppermv64" + [(set (match_operand:V8QI 0 "register_operand" "=x") + (unspec:V8QI + [(match_operand:V8QI 1 "register_operand" "x") + (match_operand:V8QI 2 "register_operand" "x") + (match_operand:V16QI 3 "nonimmediate_operand" "xm")] + UNSPEC_XOP_PERMUTE))] + "TARGET_XOP && TARGET_MMX_WITH_SSE" + "vpperm\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "sse4arg") + (set_attr "mode" "TI")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Parallel integral logical operations diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index f5f9403..2d29877 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -53,7 +53,6 @@ UNSPEC_FMADDSUB UNSPEC_XOP_UNSIGNED_CMP UNSPEC_XOP_TRUEFALSE - UNSPEC_XOP_PERMUTE UNSPEC_FRCZ ;; For AES support @@ -398,6 +397,10 @@ (define_mode_iterator 
VI1_AVX512F [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI]) +(define_mode_iterator VI12_256_512_AVX512VL + [V64QI (V32QI "TARGET_AVX512VL") + V32HI (V16HI "TARGET_AVX512VL")]) + (define_mode_iterator VI2_AVX2 [(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI]) @@ -2407,69 +2410,36 @@ (set_attr "prefix" "<round_saeonly_scalar_prefix>") (set_attr "mode" "<ssescalarmode>")]) -(define_insn "avx_addsubv4df3" - [(set (match_operand:V4DF 0 "register_operand" "=x") - (vec_merge:V4DF - (minus:V4DF - (match_operand:V4DF 1 "register_operand" "x") - (match_operand:V4DF 2 "nonimmediate_operand" "xm")) - (plus:V4DF (match_dup 1) (match_dup 2)) - (const_int 5)))] - "TARGET_AVX" - "vaddsubpd\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sseadd") - (set_attr "prefix" "vex") - (set_attr "mode" "V4DF")]) - -(define_insn "sse3_addsubv2df3" - [(set (match_operand:V2DF 0 "register_operand" "=x,x") - (vec_merge:V2DF - (minus:V2DF - (match_operand:V2DF 1 "register_operand" "0,x") - (match_operand:V2DF 2 "vector_operand" "xBm,xm")) - (plus:V2DF (match_dup 1) (match_dup 2)) - (const_int 1)))] - "TARGET_SSE3" - "@ - addsubpd\t{%2, %0|%0, %2} - vaddsubpd\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx") - (set_attr "type" "sseadd") - (set_attr "atom_unit" "complex") - (set_attr "prefix" "orig,vex") - (set_attr "mode" "V2DF")]) - -(define_insn "avx_addsubv8sf3" - [(set (match_operand:V8SF 0 "register_operand" "=x") - (vec_merge:V8SF - (minus:V8SF - (match_operand:V8SF 1 "register_operand" "x") - (match_operand:V8SF 2 "nonimmediate_operand" "xm")) - (plus:V8SF (match_dup 1) (match_dup 2)) - (const_int 85)))] - "TARGET_AVX" - "vaddsubps\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sseadd") - (set_attr "prefix" "vex") - (set_attr "mode" "V8SF")]) +(define_mode_attr addsub_cst [(V4DF "5") (V2DF "1") + (V4SF "5") (V8SF "85")]) -(define_insn "sse3_addsubv4sf3" - [(set (match_operand:V4SF 0 "register_operand" "=x,x") - (vec_merge:V4SF - (minus:V4SF - (match_operand:V4SF 1 
"register_operand" "0,x") - (match_operand:V4SF 2 "vector_operand" "xBm,xm")) - (plus:V4SF (match_dup 1) (match_dup 2)) - (const_int 5)))] +(define_insn "vec_addsub<mode>3" + [(set (match_operand:VF_128_256 0 "register_operand" "=x,x") + (vec_merge:VF_128_256 + (minus:VF_128_256 + (match_operand:VF_128_256 1 "register_operand" "0,x") + (match_operand:VF_128_256 2 "vector_operand" "xBm, xm")) + (plus:VF_128_256 (match_dup 1) (match_dup 2)) + (const_int <addsub_cst>)))] "TARGET_SSE3" "@ - addsubps\t{%2, %0|%0, %2} - vaddsubps\t{%2, %1, %0|%0, %1, %2}" + addsub<ssemodesuffix>\t{%2, %0|%0, %2} + vaddsub<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set (attr "atom_unit") + (if_then_else + (match_test "<MODE>mode == V2DFmode") + (const_string "complex") + (const_string "*"))) (set_attr "prefix" "orig,vex") - (set_attr "prefix_rep" "1,*") - (set_attr "mode" "V4SF")]) + (set (attr "prefix_rep") + (if_then_else + (and (match_test "<MODE>mode == V4SFmode") + (eq_attr "alternative" "0")) + (const_string "1") + (const_string "*"))) + (set_attr "mode" "<MODE>")]) (define_split [(set (match_operand:VF_128_256 0 "register_operand") @@ -11781,9 +11751,9 @@ [(set (match_operand:V8QI 0 "register_operand") (mult:V8QI (match_operand:V8QI 1 "register_operand") (match_operand:V8QI 2 "register_operand")))] - "TARGET_AVX512VL && TARGET_AVX512BW" + "TARGET_AVX512VL && TARGET_AVX512BW && TARGET_64BIT" { - gcc_assert (ix86_expand_vecmul_qihi (operands[0], operands[1], operands[2])); + ix86_expand_vecop_qihi (MULT, operands[0], operands[1], operands[2]); DONE; }) @@ -11793,8 +11763,6 @@ (match_operand:VI1_AVX512 2 "register_operand")))] "TARGET_SSE2" { - if (ix86_expand_vecmul_qihi (operands[0], operands[1], operands[2])) - DONE; ix86_expand_vecop_qihi (MULT, operands[0], operands[1], operands[2]); DONE; }) @@ -20240,12 +20208,20 @@ (lshiftrt:VI12_128 (match_operand:VI12_128 1 "register_operand") (match_operand:VI12_128 2 
"nonimmediate_operand")))] - "TARGET_XOP" + "TARGET_XOP || (TARGET_AVX512BW && TARGET_AVX512VL)" { - rtx neg = gen_reg_rtx (<MODE>mode); - emit_insn (gen_neg<mode>2 (neg, operands[2])); - emit_insn (gen_xop_shl<mode>3 (operands[0], operands[1], neg)); - DONE; + if (TARGET_XOP) + { + rtx neg = gen_reg_rtx (<MODE>mode); + emit_insn (gen_neg<mode>2 (neg, operands[2])); + emit_insn (gen_xop_shl<mode>3 (operands[0], operands[1], neg)); + DONE; + } + else if (<MODE>mode == V16QImode) + { + ix86_expand_vecop_qihi (LSHIFTRT, operands[0], operands[1], operands[2]); + DONE; + } }) (define_expand "vlshr<mode>3" @@ -20264,6 +20240,31 @@ } }) +(define_expand "v<insn><mode>3" + [(set (match_operand:VI12_256_512_AVX512VL 0 "register_operand") + (any_shift:VI12_256_512_AVX512VL + (match_operand:VI12_256_512_AVX512VL 1 "register_operand") + (match_operand:VI12_256_512_AVX512VL 2 "nonimmediate_operand")))] + "TARGET_AVX512BW" +{ + if (<MODE>mode == V32QImode || <MODE>mode == V64QImode) + { + ix86_expand_vecop_qihi (<CODE>, operands[0], operands[1], operands[2]); + DONE; + } +}) + +(define_expand "v<insn>v8qi3" + [(set (match_operand:V8QI 0 "register_operand") + (any_shift:V8QI + (match_operand:V8QI 1 "register_operand") + (match_operand:V8QI 2 "nonimmediate_operand")))] + "TARGET_AVX512BW && TARGET_AVX512VL && TARGET_64BIT" +{ + ix86_expand_vecop_qihi (<CODE>, operands[0], operands[1], operands[2]); + DONE; +}) + (define_expand "vlshr<mode>3" [(set (match_operand:VI48_512 0 "register_operand") (lshiftrt:VI48_512 @@ -20278,33 +20279,32 @@ (match_operand:VI48_256 2 "nonimmediate_operand")))] "TARGET_AVX2") -(define_expand "vashrv8hi3<mask_name>" - [(set (match_operand:V8HI 0 "register_operand") - (ashiftrt:V8HI - (match_operand:V8HI 1 "register_operand") - (match_operand:V8HI 2 "nonimmediate_operand")))] +(define_expand "vashr<mode>3" + [(set (match_operand:VI8_256_512 0 "register_operand") + (ashiftrt:VI8_256_512 + (match_operand:VI8_256_512 1 "register_operand") + 
(match_operand:VI8_256_512 2 "nonimmediate_operand")))] + "TARGET_AVX512F") + +(define_expand "vashr<mode>3" + [(set (match_operand:VI12_128 0 "register_operand") + (ashiftrt:VI12_128 + (match_operand:VI12_128 1 "register_operand") + (match_operand:VI12_128 2 "nonimmediate_operand")))] "TARGET_XOP || (TARGET_AVX512BW && TARGET_AVX512VL)" { if (TARGET_XOP) { - rtx neg = gen_reg_rtx (V8HImode); - emit_insn (gen_negv8hi2 (neg, operands[2])); - emit_insn (gen_xop_shav8hi3 (operands[0], operands[1], neg)); + rtx neg = gen_reg_rtx (<MODE>mode); + emit_insn (gen_neg<mode>2 (neg, operands[2])); + emit_insn (gen_xop_sha<mode>3 (operands[0], operands[1], neg)); + DONE; + } + else if(<MODE>mode == V16QImode) + { + ix86_expand_vecop_qihi (ASHIFTRT, operands[0],operands[1], operands[2]); DONE; } -}) - -(define_expand "vashrv16qi3" - [(set (match_operand:V16QI 0 "register_operand") - (ashiftrt:V16QI - (match_operand:V16QI 1 "register_operand") - (match_operand:V16QI 2 "nonimmediate_operand")))] - "TARGET_XOP" -{ - rtx neg = gen_reg_rtx (V16QImode); - emit_insn (gen_negv16qi2 (neg, operands[2])); - emit_insn (gen_xop_shav16qi3 (operands[0], operands[1], neg)); - DONE; }) (define_expand "vashrv2di3<mask_name>" @@ -20355,10 +20355,18 @@ (ashift:VI12_128 (match_operand:VI12_128 1 "register_operand") (match_operand:VI12_128 2 "nonimmediate_operand")))] - "TARGET_XOP" + "TARGET_XOP || (TARGET_AVX512BW && TARGET_AVX512VL)" { - emit_insn (gen_xop_sha<mode>3 (operands[0], operands[1], operands[2])); - DONE; + if (TARGET_XOP) + { + emit_insn (gen_xop_sha<mode>3 (operands[0], operands[1], operands[2])); + DONE; + } + else if (<MODE>mode == V16QImode) + { + ix86_expand_vecop_qihi (ASHIFT, operands[0], operands[1], operands[2]); + DONE; + } }) (define_expand "vashl<mode>3" @@ -20462,8 +20470,7 @@ gen = (<CODE> == LSHIFTRT ? 
gen_xop_shlv16qi3 : gen_xop_shav16qi3); emit_insn (gen (operands[0], operands[1], tmp)); } - else if (!ix86_expand_vec_shift_qihi_constant (<CODE>, operands[0], - operands[1], operands[2])) + else ix86_expand_vecop_qihi (<CODE>, operands[0], operands[1], operands[2]); DONE; }) diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index 52ce848..6758296 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -75,9 +75,11 @@ | OPTION_MASK_P9_VECTOR) /* Flags that need to be turned off if -mno-power10. */ +/* We comment out PCREL_OPT here to disable it by default because SPEC2017 + performance was degraded by it. */ #define OTHER_POWER10_MASKS (OPTION_MASK_MMA \ | OPTION_MASK_PCREL \ - | OPTION_MASK_PCREL_OPT \ + /* | OPTION_MASK_PCREL_OPT */ \ | OPTION_MASK_PREFIXED) #define ISA_3_1_MASKS_SERVER (ISA_3_0_MASKS_SERVER \ diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 6bbeb64..590dd8f 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -13110,33 +13110,25 @@ output_asm_nops (const char *user, int hw) } } -/* Output assembler code to FILE to increment profiler label # LABELNO - for profiling a function entry. */ +/* Output assembler code to FILE to call a profiler hook. */ void -s390_function_profiler (FILE *file, int labelno) +s390_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) { - rtx op[8]; - - char label[128]; - ASM_GENERATE_INTERNAL_LABEL (label, "LP", labelno); + rtx op[4]; fprintf (file, "# function profiler \n"); op[0] = gen_rtx_REG (Pmode, RETURN_REGNUM); op[1] = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); op[1] = gen_rtx_MEM (Pmode, plus_constant (Pmode, op[1], UNITS_PER_LONG)); - op[7] = GEN_INT (UNITS_PER_LONG); - - op[2] = gen_rtx_REG (Pmode, 1); - op[3] = gen_rtx_SYMBOL_REF (Pmode, label); - SYMBOL_REF_FLAGS (op[3]) = SYMBOL_FLAG_LOCAL; + op[3] = GEN_INT (UNITS_PER_LONG); - op[4] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? 
"__fentry__" : "_mcount"); + op[2] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? "__fentry__" : "_mcount"); if (flag_pic) { - op[4] = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op[4]), UNSPEC_PLT); - op[4] = gen_rtx_CONST (Pmode, op[4]); + op[2] = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op[2]), UNSPEC_PLT); + op[2] = gen_rtx_CONST (Pmode, op[2]); } if (flag_record_mcount) @@ -13150,20 +13142,19 @@ s390_function_profiler (FILE *file, int labelno) warning (OPT_Wcannot_profile, "nested functions cannot be profiled " "with %<-mfentry%> on s390"); else - output_asm_insn ("brasl\t0,%4", op); + output_asm_insn ("brasl\t0,%2", op); } else if (TARGET_64BIT) { if (flag_nop_mcount) - output_asm_nops ("-mnop-mcount", /* stg */ 3 + /* larl */ 3 + - /* brasl */ 3 + /* lg */ 3); + output_asm_nops ("-mnop-mcount", /* stg */ 3 + /* brasl */ 3 + + /* lg */ 3); else { output_asm_insn ("stg\t%0,%1", op); if (flag_dwarf2_cfi_asm) - output_asm_insn (".cfi_rel_offset\t%0,%7", op); - output_asm_insn ("larl\t%2,%3", op); - output_asm_insn ("brasl\t%0,%4", op); + output_asm_insn (".cfi_rel_offset\t%0,%3", op); + output_asm_insn ("brasl\t%0,%2", op); output_asm_insn ("lg\t%0,%1", op); if (flag_dwarf2_cfi_asm) output_asm_insn (".cfi_restore\t%0", op); @@ -13172,15 +13163,14 @@ s390_function_profiler (FILE *file, int labelno) else { if (flag_nop_mcount) - output_asm_nops ("-mnop-mcount", /* st */ 2 + /* larl */ 3 + - /* brasl */ 3 + /* l */ 2); + output_asm_nops ("-mnop-mcount", /* st */ 2 + /* brasl */ 3 + + /* l */ 2); else { output_asm_insn ("st\t%0,%1", op); if (flag_dwarf2_cfi_asm) - output_asm_insn (".cfi_rel_offset\t%0,%7", op); - output_asm_insn ("larl\t%2,%3", op); - output_asm_insn ("brasl\t%0,%4", op); + output_asm_insn (".cfi_rel_offset\t%0,%3", op); + output_asm_insn ("brasl\t%0,%2", op); output_asm_insn ("l\t%0,%1", op); if (flag_dwarf2_cfi_asm) output_asm_insn (".cfi_restore\t%0", op); diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index 3b87616..fb16a45 100644 --- 
a/gcc/config/s390/s390.h +++ b/gcc/config/s390/s390.h @@ -787,6 +787,8 @@ CUMULATIVE_ARGS; #define PROFILE_BEFORE_PROLOGUE 1 +#define NO_PROFILE_COUNTERS 1 + /* Trampolines for nested functions. */ diff --git a/gcc/configure b/gcc/configure index dd0194a..f0b2ebd 100755 --- a/gcc/configure +++ b/gcc/configure @@ -29145,8 +29145,8 @@ fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; } then - if test x$gcc_cv_readelf != x \ - && $gcc_cv_readelf -wi conftest.o 2>&1 \ + if test x$gcc_cv_objdump != x \ + && $gcc_cv_objdump -Wi conftest.o 2>&1 \ | grep DW_TAG_compile_unit > /dev/null 2>&1; then gcc_cv_as_gdwarf_5_flag=yes; fi @@ -29166,6 +29166,16 @@ $as_echo "#define HAVE_AS_GDWARF_5_DEBUG_FLAG 1" >>confdefs.h fi + case $target_os in + win32 | pe | cygwin* | mingw32*) + section_flags=\"dr\" + function_type=".def foo; .scl 2; .type 32; .endef" + function_size="";; + *) + section_flags=\"\",%progbits + function_type=".type foo, %function" + function_size=".size foo, .-foo";; + esac dwarf4_debug_info_size=0x46 dwarf4_high_pc_form=7 dwarf4_debug_aranges_size=0x2c @@ -29177,16 +29187,16 @@ fi .Ltext0: .p2align 4 .globl foo - .type foo, %function + $function_type foo: .LFB0: .LM1: $insn .LM2: .LFE0: - .size foo, .-foo + $function_size .Letext0: - .section .debug_info,\"\",%progbits + .section .debug_info,$section_flags .Ldebug_info0: .4byte $dwarf4_debug_info_size .2byte 0x4 @@ -29210,7 +29220,7 @@ foo: .byte 0x1 .byte 0x9c .byte 0 - .section .debug_abbrev,\"\",%progbits + .section .debug_abbrev,$section_flags .Ldebug_abbrev0: .byte 0x1 .byte 0x11 @@ -29253,7 +29263,7 @@ foo: .byte 0 .byte 0 .byte 0 - .section .debug_aranges,\"\",%progbits + .section .debug_aranges,$section_flags .4byte $dwarf4_debug_aranges_size .2byte 0x2 .4byte .Ldebug_info0 @@ -29265,7 +29275,7 @@ foo: .${dwarf4_addr_size}byte .Letext0-.Ltext0 .${dwarf4_addr_size}byte 0 .${dwarf4_addr_size}byte 0 - .section .debug_line,\"\",%progbits + .section 
.debug_line,$section_flags .Ldebug_line0: .4byte .LELT0-.LSLT0 .LSLT0: @@ -29319,7 +29329,7 @@ foo: .byte 0x1 .byte 0x1 .LELT0: - .section .debug_str,\"\",%progbits + .section .debug_str,$section_flags .ident \"GCC\" " dwarf4_success=no @@ -29491,10 +29501,10 @@ fi conftest_s="\ .text .globl foo - .type foo, %function + $function_type foo: $insn - .size foo, .-foo + $function_size .file 1 \"foo.c\" " { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for working --gdwarf-4/--gdwarf-5 for all sources" >&5 @@ -29512,8 +29522,8 @@ else $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; } then - if test x$gcc_cv_readelf != x \ - && $gcc_cv_readelf -w conftest.o 2>&1 \ + if test x$gcc_cv_objdump != x \ + && $gcc_cv_objdump -W conftest.o 2>&1 \ | grep conftest.s > /dev/null 2>&1; then gcc_cv_as_working_gdwarf_n_flag=no else diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f30f80..7008939 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -5457,13 +5457,23 @@ if test x"$insn" != x; then gcc_GAS_CHECK_FEATURE([--gdwarf-5 option], gcc_cv_as_gdwarf_5_flag, [elf,2,36,0], [--gdwarf-5], [$insn], - [if test x$gcc_cv_readelf != x \ - && $gcc_cv_readelf -wi conftest.o 2>&1 \ + [if test x$gcc_cv_objdump != x \ + && $gcc_cv_objdump -Wi conftest.o 2>&1 \ | grep DW_TAG_compile_unit > /dev/null 2>&1; then gcc_cv_as_gdwarf_5_flag=yes; fi],[AC_DEFINE(HAVE_AS_GDWARF_5_DEBUG_FLAG, 1, [Define if your assembler supports the --gdwarf-5 option.])]) + case $target_os in + win32 | pe | cygwin* | mingw32*) + section_flags=\"dr\" + function_type=".def foo; .scl 2; .type 32; .endef" + function_size="";; + *) + section_flags=\"\",%progbits + function_type=".type foo, %function" + function_size=".size foo, .-foo";; + esac dwarf4_debug_info_size=0x46 dwarf4_high_pc_form=7 dwarf4_debug_aranges_size=0x2c @@ -5475,16 +5485,16 @@ if test x"$insn" != x; then .Ltext0: .p2align 4 .globl foo - .type foo, %function + $function_type foo: .LFB0: .LM1: $insn 
.LM2: .LFE0: - .size foo, .-foo + $function_size .Letext0: - .section .debug_info,\"\",%progbits + .section .debug_info,$section_flags .Ldebug_info0: .4byte $dwarf4_debug_info_size .2byte 0x4 @@ -5508,7 +5518,7 @@ foo: .byte 0x1 .byte 0x9c .byte 0 - .section .debug_abbrev,\"\",%progbits + .section .debug_abbrev,$section_flags .Ldebug_abbrev0: .byte 0x1 .byte 0x11 @@ -5551,7 +5561,7 @@ foo: .byte 0 .byte 0 .byte 0 - .section .debug_aranges,\"\",%progbits + .section .debug_aranges,$section_flags .4byte $dwarf4_debug_aranges_size .2byte 0x2 .4byte .Ldebug_info0 @@ -5563,7 +5573,7 @@ foo: .${dwarf4_addr_size}byte .Letext0-.Ltext0 .${dwarf4_addr_size}byte 0 .${dwarf4_addr_size}byte 0 - .section .debug_line,\"\",%progbits + .section .debug_line,$section_flags .Ldebug_line0: .4byte .LELT0-.LSLT0 .LSLT0: @@ -5617,7 +5627,7 @@ foo: .byte 0x1 .byte 0x1 .LELT0: - .section .debug_str,\"\",%progbits + .section .debug_str,$section_flags .ident \"GCC\" " dwarf4_success=no @@ -5673,10 +5683,10 @@ foo: conftest_s="\ .text .globl foo - .type foo, %function + $function_type foo: $insn - .size foo, .-foo + $function_size .file 1 \"foo.c\" " gcc_GAS_CHECK_FEATURE([working --gdwarf-4/--gdwarf-5 for all sources], @@ -5684,8 +5694,8 @@ foo: [--gdwarf-4], [$conftest_s], [changequote(,)dnl - if test x$gcc_cv_readelf != x \ - && $gcc_cv_readelf -w conftest.o 2>&1 \ + if test x$gcc_cv_objdump != x \ + && $gcc_cv_objdump -W conftest.o 2>&1 \ | grep conftest.s > /dev/null 2>&1; then gcc_cv_as_working_gdwarf_n_flag=no else diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog index cfe9aa4..368ef75 100644 --- a/gcc/cp/ChangeLog +++ b/gcc/cp/ChangeLog @@ -1,3 +1,21 @@ +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/101174 + * pt.c (push_access_scope): For artificial deduction guides, + set the access scope to that of the constructor. + (pop_access_scope): Likewise. + (build_deduction_guide): Don't set DECL_CONTEXT on the guide. 
+ +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/86439 + * call.c (print_error_for_call_failure): Constify 'args' parameter. + (perform_dguide_overload_resolution): Define. + * cp-tree.h: (perform_dguide_overload_resolution): Declare. + * pt.c (do_class_deduction): Use perform_dguide_overload_resolution + instead of build_new_function_call. Don't use tf_decltype or + set cp_unevaluated_operand. Remove unnecessary NULL_TREE tests. + 2021-06-21 Patrick Palka <ppalka@redhat.com> PR c++/67302 diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 9f03534..aafc7ac 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -4629,7 +4629,7 @@ perform_overload_resolution (tree fn, functions. */ static void -print_error_for_call_failure (tree fn, vec<tree, va_gc> *args, +print_error_for_call_failure (tree fn, const vec<tree, va_gc> *args, struct z_candidate *candidates) { tree targs = NULL_TREE; @@ -4654,6 +4654,40 @@ print_error_for_call_failure (tree fn, vec<tree, va_gc> *args, print_z_candidates (loc, candidates); } +/* Perform overload resolution on the set of deduction guides DGUIDES + using ARGS. Returns the selected deduction guide, or error_mark_node + if overload resolution fails. */ + +tree +perform_dguide_overload_resolution (tree dguides, const vec<tree, va_gc> *args, + tsubst_flags_t complain) +{ + z_candidate *candidates; + bool any_viable_p; + tree result; + + gcc_assert (deduction_guide_p (OVL_FIRST (dguides))); + + /* Get the high-water mark for the CONVERSION_OBSTACK. */ + void *p = conversion_obstack_alloc (0); + + z_candidate *cand = perform_overload_resolution (dguides, args, &candidates, + &any_viable_p, complain); + if (!cand) + { + if (complain & tf_error) + print_error_for_call_failure (dguides, args, candidates); + result = error_mark_node; + } + else + result = cand->fn; + + /* Free all the conversions we allocated. 
*/ + obstack_free (&conversion_obstack, p); + + return result; +} + /* Return an expression for a call to FN (a namespace-scope function, or a static member function) with the ARGS. This may change ARGS. */ diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 36f99cc..6f71371 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6437,6 +6437,8 @@ extern void complain_about_bad_argument (location_t arg_loc, tree from_type, tree to_type, tree fndecl, int parmnum); extern void maybe_inform_about_fndecl_for_bogus_argument_init (tree, int); +extern tree perform_dguide_overload_resolution (tree, const vec<tree, va_gc> *, + tsubst_flags_t); /* A class for recording information about access failures (e.g. private diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index d57ddc4..b7a4298 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -40877,7 +40877,9 @@ cp_omp_split_clauses (location_t loc, enum tree_code code, c_omp_split_clauses (loc, code, mask, clauses, cclauses); for (i = 0; i < C_OMP_CLAUSE_SPLIT_COUNT; i++) if (cclauses[i]) - cclauses[i] = finish_omp_clauses (cclauses[i], C_ORT_OMP); + cclauses[i] = finish_omp_clauses (cclauses[i], + i == C_OMP_CLAUSE_SPLIT_TARGET + ? 
C_ORT_OMP_TARGET : C_ORT_OMP); } /* OpenMP 5.0: @@ -42219,6 +42221,7 @@ cp_parser_omp_target_update (cp_parser *parser, cp_token *pragma_tok, | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_FIRSTPRIVATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEFAULTMAP) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_ALLOCATE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR)) static bool @@ -42381,7 +42384,18 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, OMP_TARGET_CLAUSES (stmt) = cp_parser_omp_all_clauses (parser, OMP_TARGET_CLAUSE_MASK, - "#pragma omp target", pragma_tok); + "#pragma omp target", pragma_tok, false); + for (tree c = OMP_TARGET_CLAUSES (stmt); c; c = OMP_CLAUSE_CHAIN (c)) + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION) + { + tree nc = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); + OMP_CLAUSE_DECL (nc) = OMP_CLAUSE_DECL (c); + OMP_CLAUSE_SET_MAP_KIND (nc, GOMP_MAP_ALWAYS_TOFROM); + OMP_CLAUSE_CHAIN (nc) = OMP_CLAUSE_CHAIN (c); + OMP_CLAUSE_CHAIN (c) = nc; + } + OMP_TARGET_CLAUSES (stmt) + = finish_omp_clauses (OMP_TARGET_CLAUSES (stmt), C_ORT_OMP_TARGET); c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true); pc = &OMP_TARGET_CLAUSES (stmt); diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 15947b2..1af8120 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -236,6 +236,10 @@ push_access_scope (tree t) push_nested_class (DECL_FRIEND_CONTEXT (t)); else if (DECL_CLASS_SCOPE_P (t)) push_nested_class (DECL_CONTEXT (t)); + else if (deduction_guide_p (t) && DECL_ARTIFICIAL (t)) + /* An artificial deduction guide should have the same access as + the constructor. 
*/ + push_nested_class (TREE_TYPE (TREE_TYPE (t))); else push_to_top_level (); @@ -255,7 +259,9 @@ pop_access_scope (tree t) if (TREE_CODE (t) == FUNCTION_DECL) current_function_decl = saved_access_scope->pop(); - if (DECL_FRIEND_CONTEXT (t) || DECL_CLASS_SCOPE_P (t)) + if (DECL_FRIEND_CONTEXT (t) + || DECL_CLASS_SCOPE_P (t) + || (deduction_guide_p (t) && DECL_ARTIFICIAL (t))) pop_nested_class (); else pop_from_top_level (); @@ -18880,9 +18886,12 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl, case OACC_DATA: case OMP_TARGET_DATA: case OMP_TARGET: - tmp = tsubst_omp_clauses (OMP_CLAUSES (t), (TREE_CODE (t) == OACC_DATA) - ? C_ORT_ACC : C_ORT_OMP, args, complain, - in_decl); + tmp = tsubst_omp_clauses (OMP_CLAUSES (t), + TREE_CODE (t) == OACC_DATA + ? C_ORT_ACC + : TREE_CODE (t) == OMP_TARGET + ? C_ORT_OMP_TARGET : C_ORT_OMP, + args, complain, in_decl); keep_next_level (true); stmt = begin_omp_structured_block (); @@ -28804,9 +28813,6 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com DECL_ABSTRACT_ORIGIN (ded_tmpl) = fn_tmpl; if (ci) set_constraints (ded_tmpl, ci); - /* The artificial deduction guide should have same access as the - constructor. */ - DECL_CONTEXT (ded_fn) = type; return ded_tmpl; } @@ -29382,7 +29388,7 @@ do_class_deduction (tree ptype, tree tmpl, tree init, if (tree guide = maybe_aggr_guide (tmpl, init, args)) cands = lookup_add (guide, cands); - tree call = error_mark_node; + tree fndecl = error_mark_node; /* If this is list-initialization and the class has a list constructor, first try deducing from the list as a single argument, as [over.match.list]. 
*/ @@ -29396,11 +29402,9 @@ do_class_deduction (tree ptype, tree tmpl, tree init, } if (list_cands) { - ++cp_unevaluated_operand; - call = build_new_function_call (list_cands, &args, tf_decltype); - --cp_unevaluated_operand; + fndecl = perform_dguide_overload_resolution (list_cands, args, tf_none); - if (call == error_mark_node) + if (fndecl == error_mark_node) { /* That didn't work, now try treating the list as a sequence of arguments. */ @@ -29416,31 +29420,22 @@ do_class_deduction (tree ptype, tree tmpl, tree init, "user-declared constructors", type); return error_mark_node; } - else if (!cands && call == error_mark_node) + else if (!cands && fndecl == error_mark_node) { error ("cannot deduce template arguments of %qT, as it has no viable " "deduction guides", type); return error_mark_node; } - if (call == error_mark_node) - { - ++cp_unevaluated_operand; - call = build_new_function_call (cands, &args, tf_decltype); - --cp_unevaluated_operand; - } + if (fndecl == error_mark_node) + fndecl = perform_dguide_overload_resolution (cands, args, tf_none); - if (call == error_mark_node) + if (fndecl == error_mark_node) { if (complain & tf_warning_or_error) { error ("class template argument deduction failed:"); - - ++cp_unevaluated_operand; - call = build_new_function_call (cands, &args, - complain | tf_decltype); - --cp_unevaluated_operand; - + perform_dguide_overload_resolution (cands, args, complain); if (elided) inform (input_location, "explicit deduction guides not considered " "for copy-initialization"); @@ -29451,8 +29446,7 @@ do_class_deduction (tree ptype, tree tmpl, tree init, constructor is chosen, the initialization is ill-formed. 
*/ else if (flags & LOOKUP_ONLYCONVERTING) { - tree fndecl = cp_get_callee_fndecl_nofold (call); - if (fndecl && DECL_NONCONVERTING_P (fndecl)) + if (DECL_NONCONVERTING_P (fndecl)) { if (complain & tf_warning_or_error) { @@ -29470,12 +29464,10 @@ do_class_deduction (tree ptype, tree tmpl, tree init, /* If CTAD succeeded but the type doesn't have any explicit deduction guides, this deduction might not be what the user intended. */ - if (call != error_mark_node && !any_dguides_p) + if (fndecl != error_mark_node && !any_dguides_p) { - tree fndecl = cp_get_callee_fndecl_nofold (call); - if (fndecl != NULL_TREE - && (!DECL_IN_SYSTEM_HEADER (fndecl) - || global_dc->dc_warn_system_headers) + if ((!DECL_IN_SYSTEM_HEADER (fndecl) + || global_dc->dc_warn_system_headers) && warning (OPT_Wctad_maybe_unsupported, "%qT may not intend to support class template argument " "deduction", type)) @@ -29483,7 +29475,8 @@ do_class_deduction (tree ptype, tree tmpl, tree init, "warning"); } - return cp_build_qualified_type (TREE_TYPE (call), cp_type_quals (ptype)); + return cp_build_qualified_type (TREE_TYPE (TREE_TYPE (fndecl)), + cp_type_quals (ptype)); } /* Replace occurrences of 'auto' in TYPE with the appropriate type deduced diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 384c54b..fbaabf6 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -5042,7 +5042,7 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types, omp_clause_code_name[OMP_CLAUSE_CODE (c)]); return error_mark_node; } - else if (ort == C_ORT_OMP + else if ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP && TREE_CODE (t) == PARM_DECL && DECL_ARTIFICIAL (t) && DECL_NAME (t) == this_identifier @@ -5069,7 +5069,7 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types, return ret; } - if (ort == C_ORT_OMP + if ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION || OMP_CLAUSE_CODE (c) == 
OMP_CLAUSE_TASK_REDUCTION) @@ -5571,33 +5571,30 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort) || (TREE_CODE (t) == COMPONENT_REF && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)) return false; - if (ort == C_ORT_OMP || ort == C_ORT_ACC) - switch (OMP_CLAUSE_MAP_KIND (c)) - { - case GOMP_MAP_ALLOC: - case GOMP_MAP_IF_PRESENT: - case GOMP_MAP_TO: - case GOMP_MAP_FROM: - case GOMP_MAP_TOFROM: - case GOMP_MAP_ALWAYS_TO: - case GOMP_MAP_ALWAYS_FROM: - case GOMP_MAP_ALWAYS_TOFROM: - case GOMP_MAP_RELEASE: - case GOMP_MAP_DELETE: - case GOMP_MAP_FORCE_TO: - case GOMP_MAP_FORCE_FROM: - case GOMP_MAP_FORCE_TOFROM: - case GOMP_MAP_FORCE_PRESENT: - OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION (c) = 1; - break; - default: - break; - } + switch (OMP_CLAUSE_MAP_KIND (c)) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_IF_PRESENT: + case GOMP_MAP_TO: + case GOMP_MAP_FROM: + case GOMP_MAP_TOFROM: + case GOMP_MAP_ALWAYS_TO: + case GOMP_MAP_ALWAYS_FROM: + case GOMP_MAP_ALWAYS_TOFROM: + case GOMP_MAP_RELEASE: + case GOMP_MAP_DELETE: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_FORCE_TOFROM: + case GOMP_MAP_FORCE_PRESENT: + OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION (c) = 1; + break; + default: + break; + } tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); - if ((ort & C_ORT_OMP_DECLARE_SIMD) != C_ORT_OMP && ort != C_ORT_ACC) - OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_POINTER); - else if (TREE_CODE (t) == COMPONENT_REF) + if (TREE_CODE (t) == COMPONENT_REF) OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH); else if (REFERENCE_REF_P (t) && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF) @@ -6592,6 +6589,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) tree detach_seen = NULL_TREE; bool mergeable_seen = false; bool implicit_moved = false; + bool target_in_reduction_seen = false; bitmap_obstack_initialize (NULL); bitmap_initialize (&generic_head, &bitmap_default_obstack); @@ -6603,7 +6601,7 @@ 
finish_omp_clauses (tree clauses, enum c_omp_region_type ort) bitmap_initialize (&map_field_head, &bitmap_default_obstack); bitmap_initialize (&map_firstprivate_head, &bitmap_default_obstack); /* If ort == C_ORT_OMP used as nontemporal_head or use_device_xxx_head - instead. */ + instead and for ort == C_ORT_OMP_TARGET used as in_reduction_head. */ bitmap_initialize (&oacc_reduction_head, &bitmap_default_obstack); if (ort & C_ORT_ACC) @@ -6866,8 +6864,22 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) || (ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR || (OMP_CLAUSE_CODE (c) - == OMP_CLAUSE_USE_DEVICE_ADDR)))) + == OMP_CLAUSE_USE_DEVICE_ADDR))) + || (ort == C_ORT_OMP_TARGET + && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION)) { + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION + && (bitmap_bit_p (&generic_head, DECL_UID (t)) + || bitmap_bit_p (&firstprivate_head, DECL_UID (t)))) + { + error_at (OMP_CLAUSE_LOCATION (c), + "%qD appears more than once in data-sharing " + "clauses", t); + remove = true; + break; + } + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION) + target_in_reduction_seen = true; if (bitmap_bit_p (&oacc_reduction_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), @@ -6882,7 +6894,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } else if (bitmap_bit_p (&generic_head, DECL_UID (t)) || bitmap_bit_p (&firstprivate_head, DECL_UID (t)) - || bitmap_bit_p (&lastprivate_head, DECL_UID (t))) + || bitmap_bit_p (&lastprivate_head, DECL_UID (t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD appears more than once in data clauses", t); @@ -6982,7 +6995,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) && bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) remove = true; else if (bitmap_bit_p (&generic_head, DECL_UID (t)) - || bitmap_bit_p (&firstprivate_head, DECL_UID (t))) + || bitmap_bit_p (&firstprivate_head, DECL_UID 
(t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD appears more than once in data clauses", t); @@ -7795,13 +7809,10 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) t = TREE_OPERAND (t, 0); OMP_CLAUSE_DECL (c) = t; } - if ((ort == C_ORT_ACC || ort == C_ORT_OMP) - && TREE_CODE (t) == COMPONENT_REF + if (TREE_CODE (t) == COMPONENT_REF && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF) t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); if (TREE_CODE (t) == COMPONENT_REF - && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP - || ort == C_ORT_ACC) && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_) { if (type_dependent_expression_p (t)) @@ -7842,7 +7853,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) if (VAR_P (t) || TREE_CODE (t) == PARM_DECL) { if (bitmap_bit_p (&map_field_head, DECL_UID (t)) - || (ort == C_ORT_OMP + || (ort != C_ORT_ACC && bitmap_bit_p (&map_head, DECL_UID (t)))) goto handle_map_references; } @@ -7924,7 +7935,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER) { if (bitmap_bit_p (&generic_head, DECL_UID (t)) - || bitmap_bit_p (&firstprivate_head, DECL_UID (t))) + || bitmap_bit_p (&firstprivate_head, DECL_UID (t)) + || bitmap_bit_p (&map_firstprivate_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD appears more than once in data clauses", t); @@ -7941,10 +7953,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; } else - { - bitmap_set_bit (&generic_head, DECL_UID (t)); - bitmap_set_bit (&map_firstprivate_head, DECL_UID (t)); - } + bitmap_set_bit (&map_firstprivate_head, DECL_UID (t)); } else if (bitmap_bit_p (&map_head, DECL_UID (t)) && !bitmap_bit_p (&map_field_head, DECL_UID (t))) @@ -7960,8 +7969,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) "%qD appears more than once in map clauses", t); remove = true; } - else if (bitmap_bit_p 
(&generic_head, DECL_UID (t)) - && ort == C_ORT_ACC) + else if (ort == C_ORT_ACC + && bitmap_bit_p (&generic_head, DECL_UID (t))) { error_at (OMP_CLAUSE_LOCATION (c), "%qD appears more than once in data clauses", t); @@ -8511,6 +8520,22 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } pc = &OMP_CLAUSE_CHAIN (c); continue; + case OMP_CLAUSE_MAP: + if (target_in_reduction_seen && !processing_template_decl) + { + t = OMP_CLAUSE_DECL (c); + while (handled_component_p (t) + || TREE_CODE (t) == INDIRECT_REF + || TREE_CODE (t) == ADDR_EXPR + || TREE_CODE (t) == MEM_REF + || TREE_CODE (t) == NON_LVALUE_EXPR) + t = TREE_OPERAND (t, 0); + if (DECL_P (t) + && bitmap_bit_p (&oacc_reduction_head, DECL_UID (t))) + OMP_CLAUSE_MAP_IN_REDUCTION (c) = 1; + } + pc = &OMP_CLAUSE_CHAIN (c); + continue; case OMP_CLAUSE_NOWAIT: if (copyprivate_seen) { diff --git a/gcc/df-scan.c b/gcc/df-scan.c index e9da64f..3dbda7a 100644 --- a/gcc/df-scan.c +++ b/gcc/df-scan.c @@ -2576,9 +2576,21 @@ df_ref_record (enum df_ref_class cl, if (GET_CODE (reg) == SUBREG) { - regno += subreg_regno_offset (regno, GET_MODE (SUBREG_REG (reg)), - SUBREG_BYTE (reg), GET_MODE (reg)); - endregno = regno + subreg_nregs (reg); + int off = subreg_regno_offset (regno, GET_MODE (SUBREG_REG (reg)), + SUBREG_BYTE (reg), GET_MODE (reg)); + unsigned int nregno = regno + off; + endregno = nregno + subreg_nregs (reg); + if (off < 0 && regno < (unsigned) -off) + /* Deal with paradoxical SUBREGs on big endian where + in debug insns the hard reg number might be smaller + than -off, such as (subreg:DI (reg:SI 0 [+4 ]) 0)); + RA decisions shouldn't be affected by debug insns + and so RA can decide to put pseudo into a hard reg + with small REGNO, even when it is referenced in + a paradoxical SUBREG in a debug insn. 
*/ + regno = 0; + else + regno = nregno; } else endregno = END_REGNO (reg); diff --git a/gcc/doc/lto.texi b/gcc/doc/lto.texi index 1f55216..3c5de21 100644 --- a/gcc/doc/lto.texi +++ b/gcc/doc/lto.texi @@ -36,11 +36,18 @@ bytecode representation of GIMPLE that is emitted in special sections of @code{.o} files. Currently, LTO support is enabled in most ELF-based systems, as well as darwin, cygwin and mingw systems. -Since GIMPLE bytecode is saved alongside final object code, object -files generated with LTO support are larger than regular object files. -This ``fat'' object format makes it easy to integrate LTO into -existing build systems, as one can, for instance, produce archives of -the files. Additionally, one might be able to ship one set of fat +By default, object files generated with LTO support contain only GIMPLE +bytecode. Such objects are called ``slim'', and they require that +tools like @code{ar} and @code{nm} understand symbol tables of LTO +sections. For most targets these tools have been extended to use the +plugin infrastructure, so GCC can support ``slim'' objects consisting +of the intermediate code alone. + +GIMPLE bytecode can also be saved alongside the final object code if +the @option{-ffat-lto-objects} option is passed, or if no plugin support +is detected for @code{ar} and @code{nm} when GCC is configured. This makes +the object files generated with LTO support larger than regular object +files. This ``fat'' object format allows shipping one set of fat objects which could be used both for development and the production of optimized builds. A, perhaps surprising, side effect of this feature is that any mistake in the toolchain leads to LTO information not @@ -49,14 +56,6 @@ This is both an advantage, as the system is more robust, and a disadvantage, as the user is not informed that the optimization has been disabled.
-The current implementation only produces ``fat'' objects, effectively -doubling compilation time and increasing file sizes up to 5x the -original size. This hides the problem that some tools, such as -@code{ar} and @code{nm}, need to understand symbol tables of LTO -sections. These tools were extended to use the plugin infrastructure, -and with these problems solved, GCC will also support ``slim'' objects -consisting of the intermediate code alone. - At the highest level, LTO splits the compiler in two. The first half (the ``writer'') produces a streaming representation of all the internal data structures needed to optimize and generate code. This diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 00caf38..1b91814 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5682,6 +5682,14 @@ signed/unsigned elements of size S@. Subtract the high/low elements of 2 from 1 and widen the resulting elements. Put the N/2 results of size 2*S in the output vector (operand 0). +@cindex @code{vec_addsub@var{m}3} instruction pattern +@item @samp{vec_addsub@var{m}3} +Alternating subtract and add, with even lanes doing subtraction and odd +lanes doing addition. Operands 1 and 2 and the output operand are vectors +with mode @var{m}. + +These instructions are not allowed to @code{FAIL}. + @cindex @code{mulhisi3} instruction pattern +@item @samp{mulhisi3} Multiply operands 1 and 2, which have mode @code{HImode}, and store diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 88eb3f9..9a91981 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -29363,6 +29363,30 @@ dwarf2out_assembly_start (void) && dwarf2out_do_cfi_asm () && !dwarf2out_do_eh_frame ()) fprintf (asm_out_file, "\t.cfi_sections\t.debug_frame\n"); + +#if defined(HAVE_AS_GDWARF_5_DEBUG_FLAG) && defined(HAVE_AS_WORKING_DWARF_N_FLAG) + if (output_asm_line_debug_info () && dwarf_version >= 5) + { + /* When gas outputs DWARF5 .debug_line[_str] then we have to + tell it the comp_dir and main file name for the zero entry + line table.
*/ + const char *comp_dir, *filename0; + + comp_dir = comp_dir_string (); + if (comp_dir == NULL) + comp_dir = ""; + + filename0 = get_AT_string (comp_unit_die (), DW_AT_name); + if (filename0 == NULL) + filename0 = ""; + + fprintf (asm_out_file, "\t.file 0 "); + output_quoted_string (asm_out_file, remap_debug_filename (comp_dir)); + fputc (' ', asm_out_file); + output_quoted_string (asm_out_file, remap_debug_filename (filename0)); + fputc ('\n', asm_out_file); + } +#endif } /* A helper function for dwarf2out_finish called through @@ -32315,27 +32339,6 @@ dwarf2out_finish (const char *filename) ASM_OUTPUT_LABEL (asm_out_file, debug_line_section_label); if (! output_asm_line_debug_info ()) output_line_info (false); - else if (asm_outputs_debug_line_str ()) - { - /* When gas outputs DWARF5 .debug_line[_str] then we have to - tell it the comp_dir and main file name for the zero entry - line table. */ - const char *comp_dir, *filename0; - - comp_dir = comp_dir_string (); - if (comp_dir == NULL) - comp_dir = ""; - - filename0 = get_AT_string (comp_unit_die (), DW_AT_name); - if (filename0 == NULL) - filename0 = ""; - - fprintf (asm_out_file, "\t.file 0 "); - output_quoted_string (asm_out_file, remap_debug_filename (comp_dir)); - fputc (' ', asm_out_file); - output_quoted_string (asm_out_file, remap_debug_filename (filename0)); - fputc ('\n', asm_out_file); - } if (dwarf_split_debug_info && info_section_emitted) { diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index e57f613..aded48c 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,14 @@ +2021-06-23 Tobias Burnus <tobias@codesourcery.com> + + * dump-parse-tree.c (show_omp_clauses): Fix enum type used + for dumping gfc_omp_defaultmap_category. + +2021-06-23 Andre Vehreschild <vehre@gcc.gnu.org> + + PR fortran/100337 + * trans-intrinsic.c (conv_co_collective): Check stat for null ptr + before dereferrencing. 
+ 2021-06-18 Harald Anlauf <anlauf@gmx.de> PR fortran/100283 diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c index 07e98b7..26841ee 100644 --- a/gcc/fortran/dump-parse-tree.c +++ b/gcc/fortran/dump-parse-tree.c @@ -1781,7 +1781,7 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses) if (i != OMP_DEFAULTMAP_CAT_UNCATEGORIZED) { fputc (':', dumpfile); - switch ((enum gfc_omp_defaultmap) i) + switch ((enum gfc_omp_defaultmap_category) i) { case OMP_DEFAULTMAP_CAT_SCALAR: dfltmap = "SCALAR"; break; case OMP_DEFAULTMAP_CAT_AGGREGATE: dfltmap = "AGGREGATE"; break; diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c index e578449..46670ba 100644 --- a/gcc/fortran/trans-intrinsic.c +++ b/gcc/fortran/trans-intrinsic.c @@ -11242,8 +11242,28 @@ conv_co_collective (gfc_code *code) if (flag_coarray == GFC_FCOARRAY_SINGLE) { if (stat != NULL_TREE) - gfc_add_modify (&block, stat, - fold_convert (TREE_TYPE (stat), integer_zero_node)); + { + /* For optional stats, check the pointer is valid before zero'ing. 
*/ + if (gfc_expr_attr (stat_expr).optional) + { + tree tmp; + stmtblock_t ass_block; + gfc_start_block (&ass_block); + gfc_add_modify (&ass_block, stat, + fold_convert (TREE_TYPE (stat), + integer_zero_node)); + tmp = fold_build2 (NE_EXPR, logical_type_node, + gfc_build_addr_expr (NULL_TREE, stat), + null_pointer_node); + tmp = fold_build3 (COND_EXPR, void_type_node, tmp, + gfc_finish_block (&ass_block), + build_empty_stmt (input_location)); + gfc_add_expr_to_block (&block, tmp); + } + else + gfc_add_modify (&block, stat, + fold_convert (TREE_TYPE (stat), integer_zero_node)); + } return gfc_finish_block (&block); } diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index 4347485..a377261 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -132,7 +132,7 @@ non_null_ref::process_name (tree name) class ssa_block_ranges { public: - virtual void set_bb_range (const basic_block bb, const irange &r) = 0; + virtual bool set_bb_range (const basic_block bb, const irange &r) = 0; virtual bool get_bb_range (irange &r, const basic_block bb) = 0; virtual bool bb_range_p (const basic_block bb) = 0; @@ -165,7 +165,7 @@ class sbr_vector : public ssa_block_ranges public: sbr_vector (tree t, irange_allocator *allocator); - virtual void set_bb_range (const basic_block bb, const irange &r) OVERRIDE; + virtual bool set_bb_range (const basic_block bb, const irange &r) OVERRIDE; virtual bool get_bb_range (irange &r, const basic_block bb) OVERRIDE; virtual bool bb_range_p (const basic_block bb) OVERRIDE; protected: @@ -196,7 +196,7 @@ sbr_vector::sbr_vector (tree t, irange_allocator *allocator) // Set the range for block BB to be R. -void +bool sbr_vector::set_bb_range (const basic_block bb, const irange &r) { irange *m; @@ -208,6 +208,7 @@ sbr_vector::set_bb_range (const basic_block bb, const irange &r) else m = m_irange_allocator->allocate (r); m_tab[bb->index] = m; + return true; } // Return the range associated with block BB in R. 
Return false if @@ -252,7 +253,7 @@ class sbr_sparse_bitmap : public ssa_block_ranges { public: sbr_sparse_bitmap (tree t, irange_allocator *allocator, bitmap_obstack *bm); - virtual void set_bb_range (const basic_block bb, const irange &r) OVERRIDE; + virtual bool set_bb_range (const basic_block bb, const irange &r) OVERRIDE; virtual bool get_bb_range (irange &r, const basic_block bb) OVERRIDE; virtual bool bb_range_p (const basic_block bb) OVERRIDE; private: @@ -312,13 +313,13 @@ sbr_sparse_bitmap::bitmap_get_quad (const_bitmap head, int quad) // Set the range on entry to basic block BB to R. -void +bool sbr_sparse_bitmap::set_bb_range (const basic_block bb, const irange &r) { if (r.undefined_p ()) { bitmap_set_quad (bitvec, bb->index, SBR_UNDEF); - return; + return true; } // Loop thru the values to see if R is already present. @@ -328,11 +329,11 @@ sbr_sparse_bitmap::set_bb_range (const basic_block bb, const irange &r) if (!m_range[x]) m_range[x] = m_irange_allocator->allocate (r); bitmap_set_quad (bitvec, bb->index, x + 1); - return; + return true; } // All values are taken, default to VARYING. bitmap_set_quad (bitvec, bb->index, SBR_VARYING); - return; + return false; } // Return the range associated with block BB in R. Return false if @@ -387,7 +388,7 @@ block_range_cache::~block_range_cache () // Set the range for NAME on entry to block BB to R. // If it has not been accessed yet, allocate it first. 
-void +bool block_range_cache::set_bb_range (tree name, const basic_block bb, const irange &r) { @@ -413,7 +414,7 @@ block_range_cache::set_bb_range (tree name, const basic_block bb, m_irange_allocator); } } - m_ssa_ranges[v]->set_bb_range (bb, r); + return m_ssa_ranges[v]->set_bb_range (bb, r); } @@ -730,10 +731,12 @@ ranger_cache::ranger_cache () if (bb) m_gori.exports (bb); } + m_propfail = BITMAP_ALLOC (NULL); } ranger_cache::~ranger_cache () { + BITMAP_FREE (m_propfail); if (m_oracle) delete m_oracle; delete m_temporal; @@ -989,7 +992,9 @@ ranger_cache::block_range (irange &r, basic_block bb, tree name, bool calc) void ranger_cache::add_to_update (basic_block bb) { - if (!m_update_list.contains (bb)) + // If propagation has failed for BB, or its already in the list, don't + // add it again. + if (!bitmap_bit_p (m_propfail, bb->index) && !m_update_list.contains (bb)) m_update_list.quick_push (bb); } @@ -1006,6 +1011,7 @@ ranger_cache::propagate_cache (tree name) int_range_max current_range; int_range_max e_range; + gcc_checking_assert (bitmap_empty_p (m_propfail)); // Process each block by seeing if its calculated range on entry is // the same as its cached value. If there is a difference, update // the cache to reflect the new value, and check to see if any @@ -1061,13 +1067,21 @@ ranger_cache::propagate_cache (tree name) // If the range on entry has changed, update it. if (new_range != current_range) { + bool ok_p = m_on_entry.set_bb_range (name, bb, new_range); + // If the cache couldn't set the value, mark it as failed. 
+ if (!ok_p) + bitmap_set_bit (m_propfail, bb->index); if (DEBUG_RANGE_CACHE) { - fprintf (dump_file, " Updating range to "); - new_range.dump (dump_file); + if (!ok_p) + fprintf (dump_file, " Cache failure to store value."); + else + { + fprintf (dump_file, " Updating range to "); + new_range.dump (dump_file); + } fprintf (dump_file, "\n Updating blocks :"); } - m_on_entry.set_bb_range (name, bb, new_range); // Mark each successor that has a range to re-check its range FOR_EACH_EDGE (e, ei, bb->succs) if (m_on_entry.bb_range_p (name, e->dest)) @@ -1080,12 +1094,13 @@ ranger_cache::propagate_cache (tree name) fprintf (dump_file, "\n"); } } - if (DEBUG_RANGE_CACHE) - { - fprintf (dump_file, "DONE visiting blocks for "); - print_generic_expr (dump_file, name, TDF_SLIM); - fprintf (dump_file, "\n"); - } + if (DEBUG_RANGE_CACHE) + { + fprintf (dump_file, "DONE visiting blocks for "); + print_generic_expr (dump_file, name, TDF_SLIM); + fprintf (dump_file, "\n"); + } + bitmap_clear (m_propfail); } // Check to see if an update to the value for NAME in BB has any effect diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h index 04150ea..ecf63dc 100644 --- a/gcc/gimple-range-cache.h +++ b/gcc/gimple-range-cache.h @@ -50,7 +50,7 @@ public: block_range_cache (); ~block_range_cache (); - void set_bb_range (tree name, const basic_block bb, const irange &r); + bool set_bb_range (tree name, const basic_block bb, const irange &r); bool get_bb_range (irange &r, tree name, const basic_block bb); bool bb_range_p (tree name, const basic_block bb); @@ -121,6 +121,7 @@ private: void propagate_updated_value (tree name, basic_block bb); + bitmap m_propfail; vec<basic_block> m_workback; vec<basic_block> m_update_list; }; diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc new file mode 100644 index 0000000..583348e --- /dev/null +++ b/gcc/gimple-range-fold.cc @@ -0,0 +1,1331 @@ +/* Code for GIMPLE range related routines. 
+ Copyright (C) 2019-2021 Free Software Foundation, Inc. + Contributed by Andrew MacLeod <amacleod@redhat.com> + and Aldy Hernandez <aldyh@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "insn-codes.h" +#include "tree.h" +#include "gimple.h" +#include "ssa.h" +#include "gimple-pretty-print.h" +#include "optabs-tree.h" +#include "gimple-fold.h" +#include "wide-int.h" +#include "fold-const.h" +#include "case-cfn-macros.h" +#include "omp-general.h" +#include "cfgloop.h" +#include "tree-ssa-loop.h" +#include "tree-scalar-evolution.h" +#include "vr-values.h" +#include "range.h" +#include "value-query.h" +#include "range-op.h" +#include "gimple-range-fold.h" +#include "gimple-range-edge.h" +#include "gimple-range-gori.h" +// Construct a fur_source, and set the m_query field. + +fur_source::fur_source (range_query *q) +{ + if (q) + m_query = q; + else if (cfun) + m_query = get_range_query (cfun); + else + m_query = get_global_range_query (); + m_gori = NULL; +} + +// Invoke range_of_expr on EXPR. + +bool +fur_source::get_operand (irange &r, tree expr) +{ + return m_query->range_of_expr (r, expr); +} + +// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current +// range_query to get the range on the edge. 
+ +bool +fur_source::get_phi_operand (irange &r, tree expr, edge e) +{ + return m_query->range_on_edge (r, e, expr); +} + +// Default is no relation. + +relation_kind +fur_source::query_relation (tree op1 ATTRIBUTE_UNUSED, + tree op2 ATTRIBUTE_UNUSED) +{ + return VREL_NONE; +} + +// Default registers nothing. + +void +fur_source::register_relation (gimple *s ATTRIBUTE_UNUSED, + relation_kind k ATTRIBUTE_UNUSED, + tree op1 ATTRIBUTE_UNUSED, + tree op2 ATTRIBUTE_UNUSED) +{ +} + +// Default registers nothing. + +void +fur_source::register_relation (edge e ATTRIBUTE_UNUSED, + relation_kind k ATTRIBUTE_UNUSED, + tree op1 ATTRIBUTE_UNUSED, + tree op2 ATTRIBUTE_UNUSED) +{ +} + +// This version of fur_source will pick a range up off an edge. + +class fur_edge : public fur_source +{ +public: + fur_edge (edge e, range_query *q = NULL); + virtual bool get_operand (irange &r, tree expr) OVERRIDE; + virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE; +private: + edge m_edge; +}; + +// Instantiate an edge based fur_source. + +inline +fur_edge::fur_edge (edge e, range_query *q) : fur_source (q) +{ + m_edge = e; +} + +// Get the value of EXPR on edge m_edge. + +bool +fur_edge::get_operand (irange &r, tree expr) +{ + return m_query->range_on_edge (r, m_edge, expr); +} + +// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current +// range_query to get the range on the edge. + +bool +fur_edge::get_phi_operand (irange &r, tree expr, edge e) +{ + // Edge to edge recalculations not supported yet, until we sort it out. + gcc_checking_assert (e == m_edge); + return m_query->range_on_edge (r, e, expr); +} + +// Instantiate a stmt based fur_source. + +fur_stmt::fur_stmt (gimple *s, range_query *q) : fur_source (q) +{ + m_stmt = s; +} + +// Retrieve range of EXPR as it occurs as a use on stmt M_STMT.
+ +bool +fur_stmt::get_operand (irange &r, tree expr) +{ + return m_query->range_of_expr (r, expr, m_stmt); +} + +// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current +// range_query to get the range on the edge. + +bool +fur_stmt::get_phi_operand (irange &r, tree expr, edge e) +{ + // Pick up the range of expr from edge E. + fur_edge e_src (e, m_query); + return e_src.get_operand (r, expr); +} + +// Return relation based from m_stmt. + +relation_kind +fur_stmt::query_relation (tree op1, tree op2) +{ + return m_query->query_relation (m_stmt, op1, op2); +} + +// Instantiate a stmt based fur_source with a GORI object. + + +fur_depend::fur_depend (gimple *s, gori_compute *gori, range_query *q) + : fur_stmt (s, q) +{ + gcc_checking_assert (gori); + m_gori = gori; + // Set relations if there is an oracle in the range_query. + // This will enable registering of relationships as they are discovered. + m_oracle = q->oracle (); + +} + +// Register a relation on a stmt if there is an oracle. + +void +fur_depend::register_relation (gimple *s, relation_kind k, tree op1, tree op2) +{ + if (m_oracle) + m_oracle->register_relation (s, k, op1, op2); +} + +// Register a relation on an edge if there is an oracle. + +void +fur_depend::register_relation (edge e, relation_kind k, tree op1, tree op2) +{ + if (m_oracle) + m_oracle->register_relation (e, k, op1, op2); +} + +// This version of fur_source will pick a range up from a list of ranges +// supplied by the caller. + +class fur_list : public fur_source +{ +public: + fur_list (irange &r1); + fur_list (irange &r1, irange &r2); + fur_list (unsigned num, irange *list); + virtual bool get_operand (irange &r, tree expr) OVERRIDE; + virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE; +private: + int_range_max m_local[2]; + irange *m_list; + unsigned m_index; + unsigned m_limit; +}; + +// One range supplied for unary operations. 
+ +fur_list::fur_list (irange &r1) : fur_source (NULL) +{ + m_list = m_local; + m_index = 0; + m_limit = 1; + m_local[0] = r1; +} + +// Two ranges supplied for binary operations. + +fur_list::fur_list (irange &r1, irange &r2) : fur_source (NULL) +{ + m_list = m_local; + m_index = 0; + m_limit = 2; + m_local[0] = r1; + m_local[1] = r2; +} + +// Arbitrary number of ranges in a vector. + +fur_list::fur_list (unsigned num, irange *list) : fur_source (NULL) +{ + m_list = list; + m_index = 0; + m_limit = num; +} + +// Get the next operand from the vector, ensure types are compatible. + +bool +fur_list::get_operand (irange &r, tree expr) +{ + if (m_index >= m_limit) + return m_query->range_of_expr (r, expr); + r = m_list[m_index++]; + gcc_checking_assert (range_compatible_p (TREE_TYPE (expr), r.type ())); + return true; +} + +// This will simply pick the next operand from the vector. +bool +fur_list::get_phi_operand (irange &r, tree expr, edge e ATTRIBUTE_UNUSED) +{ + return get_operand (r, expr); +} + +// Fold stmt S into range R using R1 as the first operand. + +bool +fold_range (irange &r, gimple *s, irange &r1) +{ + fold_using_range f; + fur_list src (r1); + return f.fold_stmt (r, s, src); +} + +// Fold stmt S into range R using R1 and R2 as the first two operands. + +bool +fold_range (irange &r, gimple *s, irange &r1, irange &r2) +{ + fold_using_range f; + fur_list src (r1, r2); + return f.fold_stmt (r, s, src); +} + +// Fold stmt S into range R using NUM_ELEMENTS from VECTOR as the initial +// operands encountered. + +bool +fold_range (irange &r, gimple *s, unsigned num_elements, irange *vector) +{ + fold_using_range f; + fur_list src (num_elements, vector); + return f.fold_stmt (r, s, src); +} + +// Fold stmt S into range R using range query Q.
+
+bool
+fold_range (irange &r, gimple *s, range_query *q)
+{
+  fold_using_range f;
+  fur_stmt src (s, q);
+  return f.fold_stmt (r, s, src);
+}
+
+// Recalculate stmt S into R using range query Q as if it were on edge ON_EDGE.
+
+bool
+fold_range (irange &r, gimple *s, edge on_edge, range_query *q)
+{
+  fold_using_range f;
+  fur_edge src (on_edge, q);
+  return f.fold_stmt (r, s, src);
+}
+
+// -------------------------------------------------------------------------
+
+// Adjust the range for a pointer difference where the operands came
+// from a memchr.
+//
+// This notices the following sequence:
+//
+//   def = __builtin_memchr (arg, 0, sz)
+//   n = def - arg
+//
+// The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
+
+static void
+adjust_pointer_diff_expr (irange &res, const gimple *diff_stmt)
+{
+  tree op0 = gimple_assign_rhs1 (diff_stmt);
+  tree op1 = gimple_assign_rhs2 (diff_stmt);
+  tree op0_ptype = TREE_TYPE (TREE_TYPE (op0));
+  tree op1_ptype = TREE_TYPE (TREE_TYPE (op1));
+  gimple *call;
+
+  if (TREE_CODE (op0) == SSA_NAME
+      && TREE_CODE (op1) == SSA_NAME
+      && (call = SSA_NAME_DEF_STMT (op0))
+      && is_gimple_call (call)
+      && gimple_call_builtin_p (call, BUILT_IN_MEMCHR)
+      && TYPE_MODE (op0_ptype) == TYPE_MODE (char_type_node)
+      && TYPE_PRECISION (op0_ptype) == TYPE_PRECISION (char_type_node)
+      && TYPE_MODE (op1_ptype) == TYPE_MODE (char_type_node)
+      && TYPE_PRECISION (op1_ptype) == TYPE_PRECISION (char_type_node)
+      && vrp_operand_equal_p (op1, gimple_call_arg (call, 0))
+      && integer_zerop (gimple_call_arg (call, 1)))
+    {
+      tree max = vrp_val_max (ptrdiff_type_node);
+      wide_int wmax = wi::to_wide (max, TYPE_PRECISION (TREE_TYPE (max)));
+      tree expr_type = gimple_expr_type (diff_stmt);
+      tree range_min = build_zero_cst (expr_type);
+      tree range_max = wide_int_to_tree (expr_type, wmax - 1);
+      int_range<2> r (range_min, range_max);
+      res.intersect (r);
+    }
+}
+
+// This function looks for situations when
walking the use/def chains
+// may provide additional contextual range information not exposed on
+// this statement.  For example, the IMAGPART of the return value from
+// certain builtin functions is known to be a boolean result.
+
+// We should rework how we're called, as we have an op_unknown entry
+// for IMAGPART_EXPR and POINTER_DIFF_EXPR in range-ops just so this
+// function gets called.
+
+static void
+gimple_range_adjustment (irange &res, const gimple *stmt)
+{
+  switch (gimple_expr_code (stmt))
+    {
+    case POINTER_DIFF_EXPR:
+      adjust_pointer_diff_expr (res, stmt);
+      return;
+
+    case IMAGPART_EXPR:
+      {
+        tree name = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
+        if (TREE_CODE (name) == SSA_NAME)
+          {
+            gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+            if (def_stmt && is_gimple_call (def_stmt)
+                && gimple_call_internal_p (def_stmt))
+              {
+                switch (gimple_call_internal_fn (def_stmt))
+                  {
+                  case IFN_ADD_OVERFLOW:
+                  case IFN_SUB_OVERFLOW:
+                  case IFN_MUL_OVERFLOW:
+                  case IFN_ATOMIC_COMPARE_EXCHANGE:
+                    {
+                      int_range<2> r;
+                      r.set_varying (boolean_type_node);
+                      tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+                      range_cast (r, type);
+                      res.intersect (r);
+                    }
+                    break;
+                  default:
+                    break;
+                  }
+              }
+          }
+        break;
+      }
+
+    default:
+      break;
+    }
+}
+
+// Return the base of the RHS of an assignment.
+
+static tree
+gimple_range_base_of_assignment (const gimple *stmt)
+{
+  gcc_checking_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  tree op1 = gimple_assign_rhs1 (stmt);
+  if (gimple_assign_rhs_code (stmt) == ADDR_EXPR)
+    return get_base_address (TREE_OPERAND (op1, 0));
+  return op1;
+}
+
+// Return the first operand of this statement if it is a valid operand
+// supported by ranges, otherwise return NULL_TREE.  Special case is
+// &(SSA_NAME expr), return the SSA_NAME instead of the ADDR expr.
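The narrowing that adjust_pointer_diff_expr applies can be checked with plain interval arithmetic: for `n = __builtin_memchr (arg, 0, sz) - arg`, N is a valid offset into a string, so its range can be intersected with [0, PTRDIFF_MAX - 1]. The sketch below is an illustration, not the GCC code; `int64_t` stands in for a 64-bit ptrdiff_t.

```cpp
#include <cassert>
#include <cstdint>
#include <algorithm>

// Toy interval standing in for irange.
struct interval { int64_t lo, hi; };

// Model of the memchr pointer-difference adjustment: intersect the
// incoming range with [0, PTRDIFF_MAX - 1].
interval adjust_memchr_diff (interval r)
{
  const int64_t ptrdiff_max = INT64_MAX;   // assumes 64-bit ptrdiff_t
  r.lo = std::max (r.lo, int64_t (0));     // the difference is never negative
  r.hi = std::min (r.hi, ptrdiff_max - 1); // and never reaches PTRDIFF_MAX
  return r;
}
```

As in the real code, a range that is already inside [0, PTRDIFF_MAX - 1] is left untouched; only the out-of-bounds portions are clipped.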
+
+tree
+gimple_range_operand1 (const gimple *stmt)
+{
+  gcc_checking_assert (gimple_range_handler (stmt));
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_COND:
+      return gimple_cond_lhs (stmt);
+    case GIMPLE_ASSIGN:
+      {
+        tree base = gimple_range_base_of_assignment (stmt);
+        if (base && TREE_CODE (base) == MEM_REF)
+          {
+            // If the base address is an SSA_NAME, we return it
+            // here.  This allows processing of the range of that
+            // name, while the rest of the expression is simply
+            // ignored.  The code in range_ops will see the
+            // ADDR_EXPR and do the right thing.
+            tree ssa = TREE_OPERAND (base, 0);
+            if (TREE_CODE (ssa) == SSA_NAME)
+              return ssa;
+          }
+        return base;
+      }
+    default:
+      break;
+    }
+  return NULL_TREE;
+}
+
+// Return the second operand of statement STMT, otherwise return NULL_TREE.
+
+tree
+gimple_range_operand2 (const gimple *stmt)
+{
+  gcc_checking_assert (gimple_range_handler (stmt));
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_COND:
+      return gimple_cond_rhs (stmt);
+    case GIMPLE_ASSIGN:
+      if (gimple_num_ops (stmt) >= 3)
+        return gimple_assign_rhs2 (stmt);
+    default:
+      break;
+    }
+  return NULL_TREE;
+}
+
+// Calculate a range for statement S and return it in R.  If NAME is provided
+// it represents the SSA_NAME on the LHS of the statement.  It is only
+// required if there is more than one lhs/output.  If a range cannot be
+// calculated, return false.
+
+bool
+fold_using_range::fold_stmt (irange &r, gimple *s, fur_source &src, tree name)
+{
+  bool res = false;
+  // If name and S are specified, make sure it is an LHS of S.
+  gcc_checking_assert (!name || !gimple_get_lhs (s)
+                       || name == gimple_get_lhs (s));
+
+  if (!name)
+    name = gimple_get_lhs (s);
+
+  // Process addresses.
+  if (gimple_code (s) == GIMPLE_ASSIGN
+      && gimple_assign_rhs_code (s) == ADDR_EXPR)
+    return range_of_address (r, s, src);
+
+  if (gimple_range_handler (s))
+    res = range_of_range_op (r, s, src);
+  else if (is_a<gphi *> (s))
+    res = range_of_phi (r, as_a<gphi *> (s), src);
+  else if (is_a<gcall *> (s))
+    res = range_of_call (r, as_a<gcall *> (s), src);
+  else if (is_a<gassign *> (s) && gimple_assign_rhs_code (s) == COND_EXPR)
+    res = range_of_cond_expr (r, as_a<gassign *> (s), src);
+
+  if (!res)
+    {
+      // If no name is specified, try the expression kind.
+      if (!name)
+        {
+          tree t = gimple_expr_type (s);
+          if (!irange::supports_type_p (t))
+            return false;
+          r.set_varying (t);
+          return true;
+        }
+      if (!gimple_range_ssa_p (name))
+        return false;
+      // We don't understand the stmt, so return the global range.
+      r = gimple_range_global (name);
+      return true;
+    }
+
+  if (r.undefined_p ())
+    return true;
+
+  // We sometimes get compatible types copied from operands, make sure
+  // the correct type is being returned.
+  if (name && TREE_TYPE (name) != r.type ())
+    {
+      gcc_checking_assert (range_compatible_p (r.type (), TREE_TYPE (name)));
+      range_cast (r, TREE_TYPE (name));
+    }
+  return true;
+}
+
+// Calculate a range for range_op statement S and return it in R.
+// If a range cannot be calculated, return false.
+
+bool
+fold_using_range::range_of_range_op (irange &r, gimple *s, fur_source &src)
+{
+  int_range_max range1, range2;
+  tree type = gimple_expr_type (s);
+  range_operator *handler = gimple_range_handler (s);
+  gcc_checking_assert (handler);
+  gcc_checking_assert (irange::supports_type_p (type));
+
+  tree lhs = gimple_get_lhs (s);
+  tree op1 = gimple_range_operand1 (s);
+  tree op2 = gimple_range_operand2 (s);
+
+  if (src.get_operand (range1, op1))
+    {
+      if (!op2)
+        {
+          // Fold range, and register any dependency if available.
+ int_range<2> r2 (type); + handler->fold_range (r, type, range1, r2); + if (lhs && gimple_range_ssa_p (op1)) + { + if (src.gori ()) + src.gori ()->register_dependency (lhs, op1); + relation_kind rel; + rel = handler->lhs_op1_relation (r, range1, range1); + if (rel != VREL_NONE) + src.register_relation (s, rel, lhs, op1); + } + } + else if (src.get_operand (range2, op2)) + { + relation_kind rel = src.query_relation (op1, op2); + if (dump_file && (dump_flags & TDF_DETAILS) && rel != VREL_NONE) + { + fprintf (dump_file, " folding with relation "); + print_relation (dump_file, rel); + fputc ('\n', dump_file); + } + // Fold range, and register any dependency if available. + handler->fold_range (r, type, range1, range2, rel); + relation_fold_and_or (r, s, src); + if (lhs) + { + if (src.gori ()) + { + src.gori ()->register_dependency (lhs, op1); + src.gori ()->register_dependency (lhs, op2); + } + if (gimple_range_ssa_p (op1)) + { + rel = handler->lhs_op1_relation (r, range1, range2); + if (rel != VREL_NONE) + src.register_relation (s, rel, lhs, op1); + } + if (gimple_range_ssa_p (op2)) + { + rel= handler->lhs_op2_relation (r, range1, range2); + if (rel != VREL_NONE) + src.register_relation (s, rel, lhs, op2); + } + } + else if (is_a<gcond *> (s)) + postfold_gcond_edges (as_a<gcond *> (s), src); + } + else + r.set_varying (type); + } + else + r.set_varying (type); + // Make certain range-op adjustments that aren't handled any other way. + gimple_range_adjustment (r, s); + return true; +} + +// Calculate the range of an assignment containing an ADDR_EXPR. +// Return the range in R. +// If a range cannot be calculated, set it to VARYING and return true. 
+ +bool +fold_using_range::range_of_address (irange &r, gimple *stmt, fur_source &src) +{ + gcc_checking_assert (gimple_code (stmt) == GIMPLE_ASSIGN); + gcc_checking_assert (gimple_assign_rhs_code (stmt) == ADDR_EXPR); + + bool strict_overflow_p; + tree expr = gimple_assign_rhs1 (stmt); + poly_int64 bitsize, bitpos; + tree offset; + machine_mode mode; + int unsignedp, reversep, volatilep; + tree base = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, + &bitpos, &offset, &mode, &unsignedp, + &reversep, &volatilep); + + + if (base != NULL_TREE + && TREE_CODE (base) == MEM_REF + && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) + { + tree ssa = TREE_OPERAND (base, 0); + tree lhs = gimple_get_lhs (stmt); + if (lhs && gimple_range_ssa_p (ssa) && src.gori ()) + src.gori ()->register_dependency (lhs, ssa); + gcc_checking_assert (irange::supports_type_p (TREE_TYPE (ssa))); + src.get_operand (r, ssa); + range_cast (r, TREE_TYPE (gimple_assign_rhs1 (stmt))); + + poly_offset_int off = 0; + bool off_cst = false; + if (offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST) + { + off = mem_ref_offset (base); + if (offset) + off += poly_offset_int::from (wi::to_poly_wide (offset), + SIGNED); + off <<= LOG2_BITS_PER_UNIT; + off += bitpos; + off_cst = true; + } + /* If &X->a is equal to X, the range of X is the result. */ + if (off_cst && known_eq (off, 0)) + return true; + else if (flag_delete_null_pointer_checks + && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr))) + { + /* For -fdelete-null-pointer-checks -fno-wrapv-pointer we don't + allow going from non-NULL pointer to NULL. */ + if(!range_includes_zero_p (&r)) + return true; + } + /* If MEM_REF has a "positive" offset, consider it non-NULL + always, for -fdelete-null-pointer-checks also "negative" + ones. Punt for unknown offsets (e.g. variable ones). 
*/
+      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr))
+          && off_cst
+          && known_ne (off, 0)
+          && (flag_delete_null_pointer_checks || known_gt (off, 0)))
+        {
+          r = range_nonzero (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+          return true;
+        }
+      r = int_range<2> (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+      return true;
+    }
+
+  // Handle "= &a".
+  if (tree_single_nonzero_warnv_p (expr, &strict_overflow_p))
+    {
+      r = range_nonzero (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+      return true;
+    }
+
+  // Otherwise return varying.
+  r = int_range<2> (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+  return true;
+}
+
+// Calculate a range for phi statement S and return it in R.
+// If a range cannot be calculated, return false.
+
+bool
+fold_using_range::range_of_phi (irange &r, gphi *phi, fur_source &src)
+{
+  tree phi_def = gimple_phi_result (phi);
+  tree type = TREE_TYPE (phi_def);
+  int_range_max arg_range;
+  unsigned x;
+
+  if (!irange::supports_type_p (type))
+    return false;
+
+  // Start with an empty range, unioning in each argument's range.
+  r.set_undefined ();
+  for (x = 0; x < gimple_phi_num_args (phi); x++)
+    {
+      tree arg = gimple_phi_arg_def (phi, x);
+      edge e = gimple_phi_arg_edge (phi, x);
+
+      // Register potential dependencies for stale value tracking.
+      if (gimple_range_ssa_p (arg) && src.gori ())
+        src.gori ()->register_dependency (phi_def, arg);
+
+      // Get the range of the argument on its edge.
+      src.get_phi_operand (arg_range, arg, e);
+      r.union_ (arg_range);
+      // Once the value reaches varying, stop looking.
+      if (r.varying_p ())
+        break;
+    }
+
+  // If SCEV is available, query if this PHI has any known values.
+ if (scev_initialized_p () && !POINTER_TYPE_P (TREE_TYPE (phi_def))) + { + value_range loop_range; + class loop *l = loop_containing_stmt (phi); + if (l && loop_outer (l)) + { + range_of_ssa_name_with_loop_info (loop_range, phi_def, l, phi, src); + if (!loop_range.varying_p ()) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, " Loops range found for "); + print_generic_expr (dump_file, phi_def, TDF_SLIM); + fprintf (dump_file, ": "); + loop_range.dump (dump_file); + fprintf (dump_file, " and calculated range :"); + r.dump (dump_file); + fprintf (dump_file, "\n"); + } + r.intersect (loop_range); + } + } + } + + return true; +} + +// Calculate a range for call statement S and return it in R. +// If a range cannot be calculated, return false. + +bool +fold_using_range::range_of_call (irange &r, gcall *call, fur_source &src) +{ + tree type = gimple_call_return_type (call); + tree lhs = gimple_call_lhs (call); + bool strict_overflow_p; + + if (!irange::supports_type_p (type)) + return false; + + if (range_of_builtin_call (r, call, src)) + ; + else if (gimple_stmt_nonnegative_warnv_p (call, &strict_overflow_p)) + r.set (build_int_cst (type, 0), TYPE_MAX_VALUE (type)); + else if (gimple_call_nonnull_result_p (call) + || gimple_call_nonnull_arg (call)) + r = range_nonzero (type); + else + r.set_varying (type); + + // If there is an LHS, intersect that with what is known. + if (lhs) + { + value_range def; + def = gimple_range_global (lhs); + r.intersect (def); + } + return true; +} + +// Return the range of a __builtin_ubsan* in CALL and set it in R. +// CODE is the type of ubsan call (PLUS_EXPR, MINUS_EXPR or +// MULT_EXPR). 
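The PHI handling above follows a simple pattern: start from UNDEFINED, union in each argument's edge range, and stop early once the result degrades to VARYING. That loop can be sketched with a toy interval type; this is an illustration of the shape of range_of_phi, not the GCC classes, and `undefined` here stands in for irange's empty state.

```cpp
#include <cassert>
#include <climits>
#include <algorithm>
#include <vector>

// Toy interval; `undefined` marks the empty starting range.
struct interval
{
  long lo, hi;
  bool undefined;
};

static bool varying_p (const interval &r)
{
  return !r.undefined && r.lo == LONG_MIN && r.hi == LONG_MAX;
}

// Model of the range_of_phi loop: union every argument's range,
// breaking out as soon as the result is varying.
interval union_phi_args (const std::vector<interval> &args)
{
  interval r = { 0, 0, true };            // start with an empty range
  for (const interval &a : args)
    {
      if (r.undefined)
        r = a;                            // first argument seeds the range
      else
        {
          r.lo = std::min (r.lo, a.lo);   // union in this argument
          r.hi = std::max (r.hi, a.hi);
        }
      if (varying_p (r))                  // no point looking further
        break;
    }
  return r;
}
```

The early exit is purely an optimization: once the union is the full domain, later arguments cannot change the result.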
+ +void +fold_using_range::range_of_builtin_ubsan_call (irange &r, gcall *call, + tree_code code, fur_source &src) +{ + gcc_checking_assert (code == PLUS_EXPR || code == MINUS_EXPR + || code == MULT_EXPR); + tree type = gimple_call_return_type (call); + range_operator *op = range_op_handler (code, type); + gcc_checking_assert (op); + int_range_max ir0, ir1; + tree arg0 = gimple_call_arg (call, 0); + tree arg1 = gimple_call_arg (call, 1); + src.get_operand (ir0, arg0); + src.get_operand (ir1, arg1); + + bool saved_flag_wrapv = flag_wrapv; + // Pretend the arithmetic is wrapping. If there is any overflow, + // we'll complain, but will actually do wrapping operation. + flag_wrapv = 1; + op->fold_range (r, type, ir0, ir1); + flag_wrapv = saved_flag_wrapv; + + // If for both arguments vrp_valueize returned non-NULL, this should + // have been already folded and if not, it wasn't folded because of + // overflow. Avoid removing the UBSAN_CHECK_* calls in that case. + if (r.singleton_p ()) + r.set_varying (type); +} + +// For a builtin in CALL, return a range in R if known and return +// TRUE. Otherwise return FALSE. + +bool +fold_using_range::range_of_builtin_call (irange &r, gcall *call, + fur_source &src) +{ + combined_fn func = gimple_call_combined_fn (call); + if (func == CFN_LAST) + return false; + + tree type = gimple_call_return_type (call); + tree arg; + int mini, maxi, zerov = 0, prec; + scalar_int_mode mode; + + switch (func) + { + case CFN_BUILT_IN_CONSTANT_P: + if (cfun->after_inlining) + { + r.set_zero (type); + // r.equiv_clear (); + return true; + } + arg = gimple_call_arg (call, 0); + if (src.get_operand (r, arg) && r.singleton_p ()) + { + r.set (build_one_cst (type), build_one_cst (type)); + return true; + } + break; + + CASE_CFN_FFS: + CASE_CFN_POPCOUNT: + // __builtin_ffs* and __builtin_popcount* return [0, prec]. 
+      arg = gimple_call_arg (call, 0);
+      prec = TYPE_PRECISION (TREE_TYPE (arg));
+      mini = 0;
+      maxi = prec;
+      src.get_operand (r, arg);
+      // If arg is non-zero, then ffs or popcount are non-zero.
+      if (!range_includes_zero_p (&r))
+        mini = 1;
+      // If some high bits are known to be zero, decrease the maximum.
+      if (!r.undefined_p ())
+        {
+          if (TYPE_SIGN (r.type ()) == SIGNED)
+            range_cast (r, unsigned_type_for (r.type ()));
+          wide_int max = r.upper_bound ();
+          maxi = wi::floor_log2 (max) + 1;
+        }
+      r.set (build_int_cst (type, mini), build_int_cst (type, maxi));
+      return true;
+
+    CASE_CFN_PARITY:
+      r.set (build_zero_cst (type), build_one_cst (type));
+      return true;
+
+    CASE_CFN_CLZ:
+      // __builtin_clz* return [0, prec-1], except when the
+      // argument is 0, but that is undefined behavior.
+      //
+      // For __builtin_clz* consider argument of 0 always undefined
+      // behavior, for internal fns depending on CLZ_DEFINED_VALUE_AT_ZERO.
+      arg = gimple_call_arg (call, 0);
+      prec = TYPE_PRECISION (TREE_TYPE (arg));
+      mini = 0;
+      maxi = prec - 1;
+      mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg));
+      if (gimple_call_internal_p (call))
+        {
+          if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
+              && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
+            {
+              // Only handle the single common value.
+              if (zerov == prec)
+                maxi = prec;
+              else
+                // Magic value to give up, unless we can prove arg is non-zero.
+                mini = -2;
+            }
+        }
+
+      src.get_operand (r, arg);
+      if (!r.undefined_p ())
+        {
+          // From clz of minimum we can compute result maximum.
+          if (wi::gt_p (r.lower_bound (), 0, TYPE_SIGN (r.type ())))
+            {
+              maxi = prec - 1 - wi::floor_log2 (r.lower_bound ());
+              if (mini == -2)
+                mini = 0;
+            }
+          else if (!range_includes_zero_p (&r))
+            {
+              mini = 0;
+              maxi = prec - 1;
+            }
+          if (mini == -2)
+            break;
+          // From clz of maximum we can compute result minimum.
+          wide_int max = r.upper_bound ();
+          int newmini = prec - 1 - wi::floor_log2 (max);
+          if (max == 0)
+            {
+              // If CLZ_DEFINED_VALUE_AT_ZERO is 2 with VALUE of prec,
+              // return [prec, prec], otherwise ignore the range.
+              if (maxi == prec)
+                mini = prec;
+            }
+          else
+            mini = newmini;
+        }
+      if (mini == -2)
+        break;
+      r.set (build_int_cst (type, mini), build_int_cst (type, maxi));
+      return true;
+
+    CASE_CFN_CTZ:
+      // __builtin_ctz* return [0, prec-1], except when the
+      // argument is 0, but that is undefined behavior.
+      //
+      // For __builtin_ctz* consider argument of 0 always undefined
+      // behavior, for internal fns depending on CTZ_DEFINED_VALUE_AT_ZERO.
+      arg = gimple_call_arg (call, 0);
+      prec = TYPE_PRECISION (TREE_TYPE (arg));
+      mini = 0;
+      maxi = prec - 1;
+      mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg));
+      if (gimple_call_internal_p (call))
+        {
+          if (optab_handler (ctz_optab, mode) != CODE_FOR_nothing
+              && CTZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
+            {
+              // Handle only the two common values.
+              if (zerov == -1)
+                mini = -1;
+              else if (zerov == prec)
+                maxi = prec;
+              else
+                // Magic value to give up, unless we can prove arg is non-zero.
+                mini = -2;
+            }
+        }
+      src.get_operand (r, arg);
+      if (!r.undefined_p ())
+        {
+          // If arg is non-zero, then use [0, prec - 1].
+          if (!range_includes_zero_p (&r))
+            {
+              mini = 0;
+              maxi = prec - 1;
+            }
+          // If some high bits are known to be zero, we can decrease
+          // the maximum.
+          wide_int max = r.upper_bound ();
+          if (max == 0)
+            {
+              // Argument is [0, 0].  If CTZ_DEFINED_VALUE_AT_ZERO
+              // is 2 with value -1 or prec, return [-1, -1] or [prec, prec].
+              // Otherwise ignore the range.
+              if (mini == -1)
+                maxi = -1;
+              else if (maxi == prec)
+                mini = prec;
+            }
+          // If value at zero is prec and 0 is in the range, we can't lower
+          // the upper bound.  We could create two separate ranges,
+          // [0, floor_log2 (max)] and [prec, prec], though.
+ else if (maxi != prec) + maxi = wi::floor_log2 (max); + } + if (mini == -2) + break; + r.set (build_int_cst (type, mini), build_int_cst (type, maxi)); + return true; + + CASE_CFN_CLRSB: + arg = gimple_call_arg (call, 0); + prec = TYPE_PRECISION (TREE_TYPE (arg)); + r.set (build_int_cst (type, 0), build_int_cst (type, prec - 1)); + return true; + case CFN_UBSAN_CHECK_ADD: + range_of_builtin_ubsan_call (r, call, PLUS_EXPR, src); + return true; + case CFN_UBSAN_CHECK_SUB: + range_of_builtin_ubsan_call (r, call, MINUS_EXPR, src); + return true; + case CFN_UBSAN_CHECK_MUL: + range_of_builtin_ubsan_call (r, call, MULT_EXPR, src); + return true; + + case CFN_GOACC_DIM_SIZE: + case CFN_GOACC_DIM_POS: + // Optimizing these two internal functions helps the loop + // optimizer eliminate outer comparisons. Size is [1,N] + // and pos is [0,N-1]. + { + bool is_pos = func == CFN_GOACC_DIM_POS; + int axis = oacc_get_ifn_dim_arg (call); + int size = oacc_get_fn_dim_size (current_function_decl, axis); + if (!size) + // If it's dynamic, the backend might know a hardware limitation. + size = targetm.goacc.dim_limit (axis); + + r.set (build_int_cst (type, is_pos ? 0 : 1), + size + ? build_int_cst (type, size - is_pos) : vrp_val_max (type)); + return true; + } + + case CFN_BUILT_IN_STRLEN: + if (tree lhs = gimple_call_lhs (call)) + if (ptrdiff_type_node + && (TYPE_PRECISION (ptrdiff_type_node) + == TYPE_PRECISION (TREE_TYPE (lhs)))) + { + tree type = TREE_TYPE (lhs); + tree max = vrp_val_max (ptrdiff_type_node); + wide_int wmax + = wi::to_wide (max, TYPE_PRECISION (TREE_TYPE (max))); + tree range_min = build_zero_cst (type); + // To account for the terminating NULL, the maximum length + // is one less than the maximum array size, which in turn + // is one less than PTRDIFF_MAX (or SIZE_MAX where it's + // smaller than the former type). + // FIXME: Use max_object_size() - 1 here. 
+ tree range_max = wide_int_to_tree (type, wmax - 2); + r.set (range_min, range_max); + return true; + } + break; + default: + break; + } + return false; +} + + +// Calculate a range for COND_EXPR statement S and return it in R. +// If a range cannot be calculated, return false. + +bool +fold_using_range::range_of_cond_expr (irange &r, gassign *s, fur_source &src) +{ + int_range_max cond_range, range1, range2; + tree cond = gimple_assign_rhs1 (s); + tree op1 = gimple_assign_rhs2 (s); + tree op2 = gimple_assign_rhs3 (s); + + gcc_checking_assert (gimple_assign_rhs_code (s) == COND_EXPR); + gcc_checking_assert (useless_type_conversion_p (TREE_TYPE (op1), + TREE_TYPE (op2))); + if (!irange::supports_type_p (TREE_TYPE (op1))) + return false; + + src.get_operand (cond_range, cond); + src.get_operand (range1, op1); + src.get_operand (range2, op2); + + // If the condition is known, choose the appropriate expression. + if (cond_range.singleton_p ()) + { + // False, pick second operand. + if (cond_range.zero_p ()) + r = range2; + else + r = range1; + } + else + { + r = range1; + r.union_ (range2); + } + return true; +} + +// If SCEV has any information about phi node NAME, return it as a range in R. 
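Several of the builtin cases above (ffs/popcount, clz, ctz) derive their bounds from `floor_log2` of the argument's known bounds. The arithmetic can be checked in isolation; the sketch below models only the simple non-zero paths of the real code, and the helper names are illustrative, not GCC's.

```cpp
#include <cassert>

// floor_log2 as used above: index of the highest set bit, -1 for 0.
static int floor_log2 (unsigned long long x)
{
  int n = -1;
  while (x)
    {
      x >>= 1;
      n++;
    }
  return n;
}

// popcount/ffs upper bound when the argument is known to be <= max:
// only the bits up to floor_log2 (max) can possibly be set.
int popcount_max (unsigned long long max)
{
  return max ? floor_log2 (max) + 1 : 0;
}

// clz upper bound when the argument is known to be >= min > 0:
// "from clz of minimum we can compute result maximum".
int clz_max (unsigned long long min, int prec)
{
  return prec - 1 - floor_log2 (min);
}
```

For example, an argument known to fit in 8 bits caps popcount at 8, and a 32-bit argument known to be at least 16 has at most 27 leading zeros.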
+ +void +fold_using_range::range_of_ssa_name_with_loop_info (irange &r, tree name, + class loop *l, gphi *phi, + fur_source &src) +{ + gcc_checking_assert (TREE_CODE (name) == SSA_NAME); + tree min, max, type = TREE_TYPE (name); + if (bounds_of_var_in_loop (&min, &max, src.query (), l, phi, name)) + { + if (TREE_CODE (min) != INTEGER_CST) + { + if (src.query ()->range_of_expr (r, min, phi) && !r.undefined_p ()) + min = wide_int_to_tree (type, r.lower_bound ()); + else + min = vrp_val_min (type); + } + if (TREE_CODE (max) != INTEGER_CST) + { + if (src.query ()->range_of_expr (r, max, phi) && !r.undefined_p ()) + max = wide_int_to_tree (type, r.upper_bound ()); + else + max = vrp_val_max (type); + } + r.set (min, max); + } + else + r.set_varying (type); +} + +// ----------------------------------------------------------------------- + +// Check if an && or || expression can be folded based on relations. ie +// c_2 = a_6 > b_7 +// c_3 = a_6 < b_7 +// c_4 = c_2 && c_3 +// c_2 and c_3 can never be true at the same time, +// Therefore c_4 can always resolve to false based purely on the relations. + +void +fold_using_range::relation_fold_and_or (irange& lhs_range, gimple *s, + fur_source &src) +{ + // No queries or already folded. + if (!src.gori () || !src.query ()->oracle () || lhs_range.singleton_p ()) + return; + + // Only care about AND and OR expressions. + enum tree_code code = gimple_expr_code (s); + bool is_and = false; + if (code == BIT_AND_EXPR || code == TRUTH_AND_EXPR) + is_and = true; + else if (code != BIT_IOR_EXPR && code != TRUTH_OR_EXPR) + return; + + tree lhs = gimple_get_lhs (s); + tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (s)); + tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (s)); + + // Deal with || and && only when there is a full set of symbolics. 
+  if (!lhs || !ssa1 || !ssa2
+      || (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE)
+      || (TREE_CODE (TREE_TYPE (ssa1)) != BOOLEAN_TYPE)
+      || (TREE_CODE (TREE_TYPE (ssa2)) != BOOLEAN_TYPE))
+    return;
+
+  // Now we know it's a boolean AND or OR expression with boolean operands.
+  // Ideally we search dependencies for common names, and see what pops out.
+  // Until then, simply try to resolve direct dependencies.
+
+  // Both names will need to have 2 direct dependencies.
+  tree ssa1_dep2 = src.gori ()->depend2 (ssa1);
+  tree ssa2_dep2 = src.gori ()->depend2 (ssa2);
+  if (!ssa1_dep2 || !ssa2_dep2)
+    return;
+
+  tree ssa1_dep1 = src.gori ()->depend1 (ssa1);
+  tree ssa2_dep1 = src.gori ()->depend1 (ssa2);
+  // Make sure they are the same dependencies, and detect the order of the
+  // relationship.
+  bool reverse_op2 = true;
+  if (ssa1_dep1 == ssa2_dep1 && ssa1_dep2 == ssa2_dep2)
+    reverse_op2 = false;
+  else if (ssa1_dep1 != ssa2_dep2 || ssa1_dep2 != ssa2_dep1)
+    return;
+
+  range_operator *handler1 = gimple_range_handler (SSA_NAME_DEF_STMT (ssa1));
+  range_operator *handler2 = gimple_range_handler (SSA_NAME_DEF_STMT (ssa2));
+
+  int_range<2> bool_one (boolean_true_node, boolean_true_node);
+
+  relation_kind relation1 = handler1->op1_op2_relation (bool_one);
+  relation_kind relation2 = handler2->op1_op2_relation (bool_one);
+  if (relation1 == VREL_NONE || relation2 == VREL_NONE)
+    return;
+
+  if (reverse_op2)
+    relation2 = relation_negate (relation2);
+
+  // x && y is false if the relation intersection of the true cases is NULL.
+  if (is_and && relation_intersect (relation1, relation2) == VREL_EMPTY)
+    lhs_range = int_range<2> (boolean_false_node, boolean_false_node);
+  // x || y is true if the union of the true cases is NO-RELATION,
+  // i.e. one or the other being true covers the full range of possibilities.
+ else if (!is_and && relation_union (relation1, relation2) == VREL_NONE) + lhs_range = bool_one; + else + return; + + range_cast (lhs_range, TREE_TYPE (lhs)); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, " Relation adjustment: "); + print_generic_expr (dump_file, ssa1, TDF_SLIM); + fprintf (dump_file, " and "); + print_generic_expr (dump_file, ssa2, TDF_SLIM); + fprintf (dump_file, " combine to produce "); + lhs_range.dump (dump_file); + fputc ('\n', dump_file); + } + + return; +} + +// Register any outgoing edge relations from a conditional branch. + +void +fold_using_range::postfold_gcond_edges (gcond *s, fur_source &src) +{ + int_range_max r; + tree name; + range_operator *handler; + basic_block bb = gimple_bb (s); + + edge e0 = EDGE_SUCC (bb, 0); + if (!single_pred_p (e0->dest)) + e0 = NULL; + + edge e1 = EDGE_SUCC (bb, 1); + if (!single_pred_p (e1->dest)) + e1 = NULL; + + // At least one edge needs to be single pred. + if (!e0 && !e1) + return; + + // First, register the gcond itself. This will catch statements like + // if (a_2 < b_5) + tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (s)); + tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (s)); + if (ssa1 && ssa2) + { + handler = gimple_range_handler (s); + gcc_checking_assert (handler); + if (e0) + { + gcond_edge_range (r, e0); + relation_kind relation = handler->op1_op2_relation (r); + if (relation != VREL_NONE) + src.register_relation (e0, relation, ssa1, ssa2); + } + if (e1) + { + gcond_edge_range (r, e1); + relation_kind relation = handler->op1_op2_relation (r); + if (relation != VREL_NONE) + src.register_relation (e1, relation, ssa1, ssa2); + } + } + + // Outgoing relations of GORI exports require a gori engine. + if (!src.gori ()) + return; + + range_query *q = src.query (); + // Now look for other relations in the exports. 
This will find stmts + // leading to the condition such as: + // c_2 = a_4 < b_7 + // if (c_2) + + FOR_EACH_GORI_EXPORT_NAME (*(src.gori ()), bb, name) + { + if (TREE_CODE (TREE_TYPE (name)) != BOOLEAN_TYPE) + continue; + gimple *stmt = SSA_NAME_DEF_STMT (name); + handler = gimple_range_handler (stmt); + if (!handler) + continue; + tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (stmt)); + tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (stmt)); + if (ssa1 && ssa2) + { + if (e0 && src.gori ()->outgoing_edge_range_p (r, e0, name, *q) + && r.singleton_p ()) + { + relation_kind relation = handler->op1_op2_relation (r); + if (relation != VREL_NONE) + src.register_relation (e0, relation, ssa1, ssa2); + } + if (e1 && src.gori ()->outgoing_edge_range_p (r, e1, name, *q) + && r.singleton_p ()) + { + relation_kind relation = handler->op1_op2_relation (r); + if (relation != VREL_NONE) + src.register_relation (e1, relation, ssa1, ssa2); + } + } + } +} diff --git a/gcc/gimple-range-fold.h b/gcc/gimple-range-fold.h new file mode 100644 index 0000000..aeb9231 --- /dev/null +++ b/gcc/gimple-range-fold.h @@ -0,0 +1,163 @@ +/* Header file for the GIMPLE fold_using_range interface. + Copyright (C) 2019-2021 Free Software Foundation, Inc. + Contributed by Andrew MacLeod <amacleod@redhat.com> + and Aldy Hernandez <aldyh@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/
+
+#ifndef GCC_GIMPLE_RANGE_FOLD_H
+#define GCC_GIMPLE_RANGE_FOLD_H
+
+// This file is the main include point for gimple range folding.
+// These routines will fold stmt S into the result irange R.
+// Any ssa_names on the stmt will be calculated using the range_query
+// parameter via a call to range_of_expr.
+// If no range_query is provided, current global range info will be used.
+// The second variation specifies an edge, and stmt S is recalculated as if
+// it appeared on that edge.
+
+// Fold stmt S into range R using range query Q.
+bool fold_range (irange &r, gimple *s, range_query *q = NULL);
+// Recalculate stmt S into R using range query Q as if it were on edge ON_EDGE.
+bool fold_range (irange &r, gimple *s, edge on_edge, range_query *q = NULL);
+
+// These routines allow the operands to be specified when manually folding.
+// Any excess queries will be drawn from the current range_query.
+bool fold_range (irange &r, gimple *s, irange &r1);
+bool fold_range (irange &r, gimple *s, irange &r1, irange &r2);
+bool fold_range (irange &r, gimple *s, unsigned num_elements, irange *vector);
+
+// Return the range_operator pointer for this statement.  This routine
+// can also be used to gate whether a routine is range-ops enabled.
+
+static inline range_operator *
+gimple_range_handler (const gimple *s)
+{
+  if (const gassign *ass = dyn_cast<const gassign *> (s))
+    return range_op_handler (gimple_assign_rhs_code (ass),
+                             TREE_TYPE (gimple_assign_lhs (ass)));
+  if (const gcond *cond = dyn_cast<const gcond *> (s))
+    return range_op_handler (gimple_cond_code (cond),
+                             TREE_TYPE (gimple_cond_lhs (cond)));
+  return NULL;
+}
+
+// Return EXP if it is an SSA_NAME with a type supported by gimple ranges.
+
+static inline tree
+gimple_range_ssa_p (tree exp)
+{
+  if (exp && TREE_CODE (exp) == SSA_NAME &&
+      !SSA_NAME_IS_VIRTUAL_OPERAND (exp) &&
+      irange::supports_type_p (TREE_TYPE (exp)))
+    return exp;
+  return NULL_TREE;
+}
+
+// Return true if TYPE1 and TYPE2 are compatible range types.
+
+static inline bool
+range_compatible_p (tree type1, tree type2)
+{
+  // types_compatible_p requires conversion in both directions to be useless.
+  // GIMPLE only requires a cast one way in order to be compatible.
+  // Ranges really only need the sign and precision to be the same.
+  return (TYPE_PRECISION (type1) == TYPE_PRECISION (type2)
+	  && TYPE_SIGN (type1) == TYPE_SIGN (type2));
+}
+
+
+// Source of all operands for fold_using_range and gori_compute.
+// It abstracts out the source of an operand so it can come from a stmt or
+// an edge or anywhere a derived class of fur_source wants.
+// The default simply picks up ranges from the current range_query.
+
+class fur_source
+{
+public:
+  fur_source (range_query *q = NULL);
+  inline range_query *query () { return m_query; }
+  inline class gori_compute *gori () { return m_gori; };
+  virtual bool get_operand (irange &r, tree expr);
+  virtual bool get_phi_operand (irange &r, tree expr, edge e);
+  virtual relation_kind query_relation (tree op1, tree op2);
+  virtual void register_relation (gimple *stmt, relation_kind k, tree op1,
+				  tree op2);
+  virtual void register_relation (edge e, relation_kind k, tree op1,
+				  tree op2);
+protected:
+  range_query *m_query;
+  gori_compute *m_gori;
+};
+
+// fur_stmt is the specification for drawing an operand from range_query Q
+// via a range_of_expr call on stmt S.
+
+class fur_stmt : public fur_source
+{
+public:
+  fur_stmt (gimple *s, range_query *q = NULL);
+  virtual bool get_operand (irange &r, tree expr) OVERRIDE;
+  virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE;
+  virtual relation_kind query_relation (tree op1, tree op2) OVERRIDE;
+private:
+  gimple *m_stmt;
+};
+
+// This version of fur_source will pick a range from a stmt, and also register
+// dependencies via a gori_compute object.  This is mostly an internal API.
+
+class fur_depend : public fur_stmt
+{
+public:
+  fur_depend (gimple *s, gori_compute *gori, range_query *q = NULL);
+  virtual void register_relation (gimple *stmt, relation_kind k, tree op1,
+				  tree op2) OVERRIDE;
+  virtual void register_relation (edge e, relation_kind k, tree op1,
+				  tree op2) OVERRIDE;
+private:
+  relation_oracle *m_oracle;
+};
+
+extern tree gimple_range_operand1 (const gimple *s);
+extern tree gimple_range_operand2 (const gimple *s);
+
+// This class uses ranges to fold a gimple statement producing a range for
+// the LHS.  The source of all operands is supplied via the fur_source class
+// which provides a range_query as well as a source location and any other
+// required information.
+ +class fold_using_range +{ +public: + bool fold_stmt (irange &r, gimple *s, class fur_source &src, + tree name = NULL_TREE); +protected: + bool range_of_range_op (irange &r, gimple *s, fur_source &src); + bool range_of_call (irange &r, gcall *call, fur_source &src); + bool range_of_cond_expr (irange &r, gassign* cond, fur_source &src); + bool range_of_address (irange &r, gimple *s, fur_source &src); + bool range_of_builtin_call (irange &r, gcall *call, fur_source &src); + void range_of_builtin_ubsan_call (irange &r, gcall *call, tree_code code, + fur_source &src); + bool range_of_phi (irange &r, gphi *phi, fur_source &src); + void range_of_ssa_name_with_loop_info (irange &, tree, class loop *, gphi *, + fur_source &src); + void relation_fold_and_or (irange& lhs_range, gimple *s, fur_source &src); + void postfold_gcond_edges (gcond *s, fur_source &src); +}; +#endif // GCC_GIMPLE_RANGE_FOLD_H diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc index b58f249..17032ac 100644 --- a/gcc/gimple-range-gori.cc +++ b/gcc/gimple-range-gori.cc @@ -29,6 +29,72 @@ along with GCC; see the file COPYING3. If not see #include "gimple-pretty-print.h" #include "gimple-range.h" +// Calculate what we can determine of the range of this unary +// statement's operand if the lhs of the expression has the range +// LHS_RANGE. Return false if nothing can be determined. + +bool +gimple_range_calc_op1 (irange &r, const gimple *stmt, const irange &lhs_range) +{ + gcc_checking_assert (gimple_num_ops (stmt) < 3); + + // An empty range is viral. + tree type = TREE_TYPE (gimple_range_operand1 (stmt)); + if (lhs_range.undefined_p ()) + { + r.set_undefined (); + return true; + } + // Unary operations require the type of the first operand in the + // second range position. 
+  int_range<2> type_range (type);
+  return gimple_range_handler (stmt)->op1_range (r, type, lhs_range,
+						 type_range);
+}
+
+// Calculate what we can determine of the range of this statement's
+// first operand if the lhs of the expression has the range LHS_RANGE
+// and the second operand has the range OP2_RANGE.  Return false if
+// nothing can be determined.
+
+bool
+gimple_range_calc_op1 (irange &r, const gimple *stmt,
+		       const irange &lhs_range, const irange &op2_range)
+{
+  // Unary operations are allowed to pass a range in for second operand
+  // as there are often additional restrictions beyond the type which
+  // can be imposed.  See operator_cast::op1_range().
+  tree type = TREE_TYPE (gimple_range_operand1 (stmt));
+  // An empty range is viral.
+  if (op2_range.undefined_p () || lhs_range.undefined_p ())
+    {
+      r.set_undefined ();
+      return true;
+    }
+  return gimple_range_handler (stmt)->op1_range (r, type, lhs_range,
+						 op2_range);
+}
+
+// Calculate what we can determine of the range of this statement's
+// second operand if the lhs of the expression has the range LHS_RANGE
+// and the first operand has the range OP1_RANGE.  Return false if
+// nothing can be determined.
+
+bool
+gimple_range_calc_op2 (irange &r, const gimple *stmt,
+		       const irange &lhs_range, const irange &op1_range)
+{
+  tree type = TREE_TYPE (gimple_range_operand2 (stmt));
+  // An empty range is viral.
+  if (op1_range.undefined_p () || lhs_range.undefined_p ())
+    {
+      r.set_undefined ();
+      return true;
+    }
+  return gimple_range_handler (stmt)->op2_range (r, type, lhs_range,
+						 op1_range);
+}
+
 // Return TRUE if GS is a logical && or || expression.
 
 static inline bool
diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
index 6f187db..ad83324 100644
--- a/gcc/gimple-range-gori.h
+++ b/gcc/gimple-range-gori.h
@@ -182,6 +182,15 @@ private:
   gimple_outgoing_range outgoing;   // Edge values for COND_EXPR & SWITCH_EXPR.
 };
 
+// These routines provide a GIMPLE interface to the range-ops code.
+extern bool gimple_range_calc_op1 (irange &r, const gimple *s, + const irange &lhs_range); +extern bool gimple_range_calc_op1 (irange &r, const gimple *s, + const irange &lhs_range, + const irange &op2_range); +extern bool gimple_range_calc_op2 (irange &r, const gimple *s, + const irange &lhs_range, + const irange &op1_range); // For each name that is an import into BB's exports.. #define FOR_EACH_GORI_IMPORT_NAME(gori, bb, name) \ diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc index 385cecf..1851339 100644 --- a/gcc/gimple-range.cc +++ b/gcc/gimple-range.cc @@ -23,1186 +23,18 @@ along with GCC; see the file COPYING3. If not see #include "system.h" #include "coretypes.h" #include "backend.h" -#include "insn-codes.h" -#include "rtl.h" #include "tree.h" #include "gimple.h" #include "ssa.h" #include "gimple-pretty-print.h" #include "gimple-iterator.h" -#include "optabs-tree.h" -#include "gimple-fold.h" #include "tree-cfg.h" #include "fold-const.h" #include "tree-cfg.h" -#include "wide-int.h" -#include "fold-const.h" -#include "case-cfn-macros.h" -#include "omp-general.h" #include "cfgloop.h" -#include "tree-ssa-loop.h" #include "tree-scalar-evolution.h" -#include "dbgcnt.h" -#include "alloc-pool.h" -#include "vr-values.h" #include "gimple-range.h" -// Construct a fur_source, and set the m_query field. - -fur_source::fur_source (range_query *q) -{ - if (q) - m_query = q; - else if (cfun) - m_query = get_range_query (cfun); - else - m_query = get_global_range_query (); - m_gori = NULL; -} - -// Invoke range_of_expr on EXPR. - -bool -fur_source::get_operand (irange &r, tree expr) -{ - return m_query->range_of_expr (r, expr); -} - -// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current -// range_query to get the range on the edge. - -bool -fur_source::get_phi_operand (irange &r, tree expr, edge e) -{ - return m_query->range_on_edge (r, e, expr); -} - -// Default is no relation. 
- -relation_kind -fur_source::query_relation (tree op1 ATTRIBUTE_UNUSED, - tree op2 ATTRIBUTE_UNUSED) -{ - return VREL_NONE; -} - -// Default registers nothing. - -void -fur_source::register_relation (gimple *s ATTRIBUTE_UNUSED, - relation_kind k ATTRIBUTE_UNUSED, - tree op1 ATTRIBUTE_UNUSED, - tree op2 ATTRIBUTE_UNUSED) -{ -} - -// Default registers nothing. - -void -fur_source::register_relation (edge e ATTRIBUTE_UNUSED, - relation_kind k ATTRIBUTE_UNUSED, - tree op1 ATTRIBUTE_UNUSED, - tree op2 ATTRIBUTE_UNUSED) -{ -} - -// This version of fur_source will pick a range up off an edge. - -class fur_edge : public fur_source -{ -public: - fur_edge (edge e, range_query *q = NULL); - virtual bool get_operand (irange &r, tree expr) OVERRIDE; - virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE; -private: - edge m_edge; -}; - -// Instantiate an edge based fur_source. - -inline -fur_edge::fur_edge (edge e, range_query *q) : fur_source (q) -{ - m_edge = e; -} - -// Get the value of EXPR on edge m_edge. - -bool -fur_edge::get_operand (irange &r, tree expr) -{ - return m_query->range_on_edge (r, m_edge, expr); -} - -// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current -// range_query to get the range on the edge. - -bool -fur_edge::get_phi_operand (irange &r, tree expr, edge e) -{ - // Edge to edge recalculations not supoprted yet, until we sort it out. - gcc_checking_assert (e == m_edge); - return m_query->range_on_edge (r, e, expr); -} - -// Instantiate a stmt based fur_source. - -fur_stmt::fur_stmt (gimple *s, range_query *q) : fur_source (q) -{ - m_stmt = s; -} - -// Retreive range of EXPR as it occurs as a use on stmt M_STMT. - -bool -fur_stmt::get_operand (irange &r, tree expr) -{ - return m_query->range_of_expr (r, expr, m_stmt); -} - -// Evaluate EXPR for this stmt as a PHI argument on edge E. Use the current -// range_query to get the range on the edge. 
- -bool -fur_stmt::get_phi_operand (irange &r, tree expr, edge e) -{ - // Pick up the range of expr from edge E. - fur_edge e_src (e, m_query); - return e_src.get_operand (r, expr); -} - -// Return relation based from m_stmt. - -relation_kind -fur_stmt::query_relation (tree op1, tree op2) -{ - return m_query->query_relation (m_stmt, op1, op2); -} - -// This version of fur_source will pick a range from a stmt, and also register -// dependencies via a gori_compute object. This is mostly an internal API. - -class fur_depend : public fur_stmt -{ -public: - fur_depend (gimple *s, gori_compute *gori, range_query *q = NULL); - virtual void register_relation (gimple *stmt, relation_kind k, tree op1, - tree op2) OVERRIDE; - virtual void register_relation (edge e, relation_kind k, tree op1, - tree op2) OVERRIDE; -private: - relation_oracle *m_oracle; -}; - -// Instantiate a stmt based fur_source with a GORI object. - -inline -fur_depend::fur_depend (gimple *s, gori_compute *gori, range_query *q) - : fur_stmt (s, q) -{ - gcc_checking_assert (gori); - m_gori = gori; - // Set relations if there is an oracle in the range_query. - // This will enable registering of relationships as they are discovered. - m_oracle = q->oracle (); - -} - -// Register a relation on a stmt if there is an oracle. - -void -fur_depend::register_relation (gimple *s, relation_kind k, tree op1, tree op2) -{ - if (m_oracle) - m_oracle->register_relation (s, k, op1, op2); -} - -// Register a relation on an edge if there is an oracle. - -void -fur_depend::register_relation (edge e, relation_kind k, tree op1, tree op2) -{ - if (m_oracle) - m_oracle->register_relation (e, k, op1, op2); -} - -// This version of fur_source will pick a range up from a list of ranges -// supplied by the caller. 
- -class fur_list : public fur_source -{ -public: - fur_list (irange &r1); - fur_list (irange &r1, irange &r2); - fur_list (unsigned num, irange *list); - virtual bool get_operand (irange &r, tree expr) OVERRIDE; - virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE; -private: - int_range_max m_local[2]; - irange *m_list; - unsigned m_index; - unsigned m_limit; -}; - -// One range supplied for unary operations. - -fur_list::fur_list (irange &r1) : fur_source (NULL) -{ - m_list = m_local; - m_index = 0; - m_limit = 1; - m_local[0] = r1; -} - -// Two ranges supplied for binary operations. - -fur_list::fur_list (irange &r1, irange &r2) : fur_source (NULL) -{ - m_list = m_local; - m_index = 0; - m_limit = 2; - m_local[0] = r1; - m_local[0] = r2; -} - -// Arbitrary number of ranges in a vector. - -fur_list::fur_list (unsigned num, irange *list) : fur_source (NULL) -{ - m_list = list; - m_index = 0; - m_limit = num; -} - -// Get the next operand from the vector, ensure types are compatible. - -bool -fur_list::get_operand (irange &r, tree expr) -{ - if (m_index >= m_limit) - return m_query->range_of_expr (r, expr); - r = m_list[m_index++]; - gcc_checking_assert (range_compatible_p (TREE_TYPE (expr), r.type ())); - return true; -} - -// This will simply pick the next operand from the vector. -bool -fur_list::get_phi_operand (irange &r, tree expr, edge e ATTRIBUTE_UNUSED) -{ - return get_operand (r, expr); -} - -// Fold stmt S into range R using R1 as the first operand. - -bool -fold_range (irange &r, gimple *s, irange &r1) -{ - fold_using_range f; - fur_list src (r1); - return f.fold_stmt (r, s, src); -} - -// Fold stmt S into range R using R1 and R2 as the first two operands. - -bool -fold_range (irange &r, gimple *s, irange &r1, irange &r2) -{ - fold_using_range f; - fur_list src (r1, r2); - return f.fold_stmt (r, s, src); -} - -// Fold stmt S into range R using NUM_ELEMENTS from VECTOR as the initial -// operands encountered. 
- -bool -fold_range (irange &r, gimple *s, unsigned num_elements, irange *vector) -{ - fold_using_range f; - fur_list src (num_elements, vector); - return f.fold_stmt (r, s, src); -} - -// Fold stmt S into range R using range query Q. - -bool -fold_range (irange &r, gimple *s, range_query *q) -{ - fold_using_range f; - fur_stmt src (s, q); - return f.fold_stmt (r, s, src); -} - -// Recalculate stmt S into R using range query Q as if it were on edge ON_EDGE. - -bool -fold_range (irange &r, gimple *s, edge on_edge, range_query *q) -{ - fold_using_range f; - fur_edge src (on_edge, q); - return f.fold_stmt (r, s, src); -} - -// ------------------------------------------------------------------------- - -// Adjust the range for a pointer difference where the operands came -// from a memchr. -// -// This notices the following sequence: -// -// def = __builtin_memchr (arg, 0, sz) -// n = def - arg -// -// The range for N can be narrowed to [0, PTRDIFF_MAX - 1]. - -static void -adjust_pointer_diff_expr (irange &res, const gimple *diff_stmt) -{ - tree op0 = gimple_assign_rhs1 (diff_stmt); - tree op1 = gimple_assign_rhs2 (diff_stmt); - tree op0_ptype = TREE_TYPE (TREE_TYPE (op0)); - tree op1_ptype = TREE_TYPE (TREE_TYPE (op1)); - gimple *call; - - if (TREE_CODE (op0) == SSA_NAME - && TREE_CODE (op1) == SSA_NAME - && (call = SSA_NAME_DEF_STMT (op0)) - && is_gimple_call (call) - && gimple_call_builtin_p (call, BUILT_IN_MEMCHR) - && TYPE_MODE (op0_ptype) == TYPE_MODE (char_type_node) - && TYPE_PRECISION (op0_ptype) == TYPE_PRECISION (char_type_node) - && TYPE_MODE (op1_ptype) == TYPE_MODE (char_type_node) - && TYPE_PRECISION (op1_ptype) == TYPE_PRECISION (char_type_node) - && gimple_call_builtin_p (call, BUILT_IN_MEMCHR) - && vrp_operand_equal_p (op1, gimple_call_arg (call, 0)) - && integer_zerop (gimple_call_arg (call, 1))) - { - tree max = vrp_val_max (ptrdiff_type_node); - wide_int wmax = wi::to_wide (max, TYPE_PRECISION (TREE_TYPE (max))); - tree expr_type = 
gimple_expr_type (diff_stmt); - tree range_min = build_zero_cst (expr_type); - tree range_max = wide_int_to_tree (expr_type, wmax - 1); - int_range<2> r (range_min, range_max); - res.intersect (r); - } -} - -// This function looks for situations when walking the use/def chains -// may provide additonal contextual range information not exposed on -// this statement. Like knowing the IMAGPART return value from a -// builtin function is a boolean result. - -// We should rework how we're called, as we have an op_unknown entry -// for IMAGPART_EXPR and POINTER_DIFF_EXPR in range-ops just so this -// function gets called. - -static void -gimple_range_adjustment (irange &res, const gimple *stmt) -{ - switch (gimple_expr_code (stmt)) - { - case POINTER_DIFF_EXPR: - adjust_pointer_diff_expr (res, stmt); - return; - - case IMAGPART_EXPR: - { - tree name = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0); - if (TREE_CODE (name) == SSA_NAME) - { - gimple *def_stmt = SSA_NAME_DEF_STMT (name); - if (def_stmt && is_gimple_call (def_stmt) - && gimple_call_internal_p (def_stmt)) - { - switch (gimple_call_internal_fn (def_stmt)) - { - case IFN_ADD_OVERFLOW: - case IFN_SUB_OVERFLOW: - case IFN_MUL_OVERFLOW: - case IFN_ATOMIC_COMPARE_EXCHANGE: - { - int_range<2> r; - r.set_varying (boolean_type_node); - tree type = TREE_TYPE (gimple_assign_lhs (stmt)); - range_cast (r, type); - res.intersect (r); - } - default: - break; - } - } - } - break; - } - - default: - break; - } -} - -// Return the base of the RHS of an assignment. - -static tree -gimple_range_base_of_assignment (const gimple *stmt) -{ - gcc_checking_assert (gimple_code (stmt) == GIMPLE_ASSIGN); - tree op1 = gimple_assign_rhs1 (stmt); - if (gimple_assign_rhs_code (stmt) == ADDR_EXPR) - return get_base_address (TREE_OPERAND (op1, 0)); - return op1; -} - -// Return the first operand of this statement if it is a valid operand -// supported by ranges, otherwise return NULL_TREE. 
Special case is -// &(SSA_NAME expr), return the SSA_NAME instead of the ADDR expr. - -tree -gimple_range_operand1 (const gimple *stmt) -{ - gcc_checking_assert (gimple_range_handler (stmt)); - - switch (gimple_code (stmt)) - { - case GIMPLE_COND: - return gimple_cond_lhs (stmt); - case GIMPLE_ASSIGN: - { - tree base = gimple_range_base_of_assignment (stmt); - if (base && TREE_CODE (base) == MEM_REF) - { - // If the base address is an SSA_NAME, we return it - // here. This allows processing of the range of that - // name, while the rest of the expression is simply - // ignored. The code in range_ops will see the - // ADDR_EXPR and do the right thing. - tree ssa = TREE_OPERAND (base, 0); - if (TREE_CODE (ssa) == SSA_NAME) - return ssa; - } - return base; - } - default: - break; - } - return NULL; -} - -// Return the second operand of statement STMT, otherwise return NULL_TREE. - -tree -gimple_range_operand2 (const gimple *stmt) -{ - gcc_checking_assert (gimple_range_handler (stmt)); - - switch (gimple_code (stmt)) - { - case GIMPLE_COND: - return gimple_cond_rhs (stmt); - case GIMPLE_ASSIGN: - if (gimple_num_ops (stmt) >= 3) - return gimple_assign_rhs2 (stmt); - default: - break; - } - return NULL_TREE; -} - -// Calculate what we can determine of the range of this unary -// statement's operand if the lhs of the expression has the range -// LHS_RANGE. Return false if nothing can be determined. - -bool -gimple_range_calc_op1 (irange &r, const gimple *stmt, const irange &lhs_range) -{ - gcc_checking_assert (gimple_num_ops (stmt) < 3); - - // An empty range is viral. - tree type = TREE_TYPE (gimple_range_operand1 (stmt)); - if (lhs_range.undefined_p ()) - { - r.set_undefined (); - return true; - } - // Unary operations require the type of the first operand in the - // second range position. 
- int_range<2> type_range (type); - return gimple_range_handler (stmt)->op1_range (r, type, lhs_range, - type_range); -} - -// Calculate what we can determine of the range of this statement's -// first operand if the lhs of the expression has the range LHS_RANGE -// and the second operand has the range OP2_RANGE. Return false if -// nothing can be determined. - -bool -gimple_range_calc_op1 (irange &r, const gimple *stmt, - const irange &lhs_range, const irange &op2_range) -{ - // Unary operation are allowed to pass a range in for second operand - // as there are often additional restrictions beyond the type which - // can be imposed. See operator_cast::op1_range(). - tree type = TREE_TYPE (gimple_range_operand1 (stmt)); - // An empty range is viral. - if (op2_range.undefined_p () || lhs_range.undefined_p ()) - { - r.set_undefined (); - return true; - } - return gimple_range_handler (stmt)->op1_range (r, type, lhs_range, - op2_range); -} - -// Calculate what we can determine of the range of this statement's -// second operand if the lhs of the expression has the range LHS_RANGE -// and the first operand has the range OP1_RANGE. Return false if -// nothing can be determined. - -bool -gimple_range_calc_op2 (irange &r, const gimple *stmt, - const irange &lhs_range, const irange &op1_range) -{ - tree type = TREE_TYPE (gimple_range_operand2 (stmt)); - // An empty range is viral. - if (op1_range.undefined_p () || lhs_range.undefined_p ()) - { - r.set_undefined (); - return true; - } - return gimple_range_handler (stmt)->op2_range (r, type, lhs_range, - op1_range); -} - -// Calculate a range for statement S and return it in R. If NAME is provided it -// represents the SSA_NAME on the LHS of the statement. It is only required -// if there is more than one lhs/output. If a range cannot -// be calculated, return false. 
- -bool -fold_using_range::fold_stmt (irange &r, gimple *s, fur_source &src, tree name) -{ - bool res = false; - // If name and S are specified, make sure it is an LHS of S. - gcc_checking_assert (!name || !gimple_get_lhs (s) || - name == gimple_get_lhs (s)); - - if (!name) - name = gimple_get_lhs (s); - - // Process addresses. - if (gimple_code (s) == GIMPLE_ASSIGN - && gimple_assign_rhs_code (s) == ADDR_EXPR) - return range_of_address (r, s, src); - - if (gimple_range_handler (s)) - res = range_of_range_op (r, s, src); - else if (is_a<gphi *>(s)) - res = range_of_phi (r, as_a<gphi *> (s), src); - else if (is_a<gcall *>(s)) - res = range_of_call (r, as_a<gcall *> (s), src); - else if (is_a<gassign *> (s) && gimple_assign_rhs_code (s) == COND_EXPR) - res = range_of_cond_expr (r, as_a<gassign *> (s), src); - - if (!res) - { - // If no name is specified, try the expression kind. - if (!name) - { - tree t = gimple_expr_type (s); - if (!irange::supports_type_p (t)) - return false; - r.set_varying (t); - return true; - } - if (!gimple_range_ssa_p (name)) - return false; - // We don't understand the stmt, so return the global range. - r = gimple_range_global (name); - return true; - } - - if (r.undefined_p ()) - return true; - - // We sometimes get compatible types copied from operands, make sure - // the correct type is being returned. - if (name && TREE_TYPE (name) != r.type ()) - { - gcc_checking_assert (range_compatible_p (r.type (), TREE_TYPE (name))); - range_cast (r, TREE_TYPE (name)); - } - return true; -} - -// Calculate a range for range_op statement S and return it in R. If any -// If a range cannot be calculated, return false. 
- -bool -fold_using_range::range_of_range_op (irange &r, gimple *s, fur_source &src) -{ - int_range_max range1, range2; - tree type = gimple_expr_type (s); - range_operator *handler = gimple_range_handler (s); - gcc_checking_assert (handler); - gcc_checking_assert (irange::supports_type_p (type)); - - tree lhs = gimple_get_lhs (s); - tree op1 = gimple_range_operand1 (s); - tree op2 = gimple_range_operand2 (s); - - if (src.get_operand (range1, op1)) - { - if (!op2) - { - // Fold range, and register any dependency if available. - int_range<2> r2 (type); - handler->fold_range (r, type, range1, r2); - if (lhs && gimple_range_ssa_p (op1)) - { - if (src.gori ()) - src.gori ()->register_dependency (lhs, op1); - relation_kind rel; - rel = handler->lhs_op1_relation (r, range1, range1); - if (rel != VREL_NONE) - src.register_relation (s, rel, lhs, op1); - } - } - else if (src.get_operand (range2, op2)) - { - relation_kind rel = src.query_relation (op1, op2); - if (dump_file && (dump_flags & TDF_DETAILS) && rel != VREL_NONE) - { - fprintf (dump_file, " folding with relation "); - print_relation (dump_file, rel); - fputc ('\n', dump_file); - } - // Fold range, and register any dependency if available. - handler->fold_range (r, type, range1, range2, rel); - relation_fold_and_or (r, s, src); - if (lhs) - { - if (src.gori ()) - { - src.gori ()->register_dependency (lhs, op1); - src.gori ()->register_dependency (lhs, op2); - } - if (gimple_range_ssa_p (op1)) - { - rel = handler->lhs_op1_relation (r, range1, range2); - if (rel != VREL_NONE) - src.register_relation (s, rel, lhs, op1); - } - if (gimple_range_ssa_p (op2)) - { - rel= handler->lhs_op2_relation (r, range1, range2); - if (rel != VREL_NONE) - src.register_relation (s, rel, lhs, op2); - } - } - else if (is_a<gcond *> (s)) - postfold_gcond_edges (as_a<gcond *> (s), src); - } - else - r.set_varying (type); - } - else - r.set_varying (type); - // Make certain range-op adjustments that aren't handled any other way. 
- gimple_range_adjustment (r, s); - return true; -} - -// Calculate the range of an assignment containing an ADDR_EXPR. -// Return the range in R. -// If a range cannot be calculated, set it to VARYING and return true. - -bool -fold_using_range::range_of_address (irange &r, gimple *stmt, fur_source &src) -{ - gcc_checking_assert (gimple_code (stmt) == GIMPLE_ASSIGN); - gcc_checking_assert (gimple_assign_rhs_code (stmt) == ADDR_EXPR); - - bool strict_overflow_p; - tree expr = gimple_assign_rhs1 (stmt); - poly_int64 bitsize, bitpos; - tree offset; - machine_mode mode; - int unsignedp, reversep, volatilep; - tree base = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, - &bitpos, &offset, &mode, &unsignedp, - &reversep, &volatilep); - - - if (base != NULL_TREE - && TREE_CODE (base) == MEM_REF - && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) - { - tree ssa = TREE_OPERAND (base, 0); - tree lhs = gimple_get_lhs (stmt); - if (lhs && gimple_range_ssa_p (ssa) && src.gori ()) - src.gori ()->register_dependency (lhs, ssa); - gcc_checking_assert (irange::supports_type_p (TREE_TYPE (ssa))); - src.get_operand (r, ssa); - range_cast (r, TREE_TYPE (gimple_assign_rhs1 (stmt))); - - poly_offset_int off = 0; - bool off_cst = false; - if (offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST) - { - off = mem_ref_offset (base); - if (offset) - off += poly_offset_int::from (wi::to_poly_wide (offset), - SIGNED); - off <<= LOG2_BITS_PER_UNIT; - off += bitpos; - off_cst = true; - } - /* If &X->a is equal to X, the range of X is the result. */ - if (off_cst && known_eq (off, 0)) - return true; - else if (flag_delete_null_pointer_checks - && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr))) - { - /* For -fdelete-null-pointer-checks -fno-wrapv-pointer we don't - allow going from non-NULL pointer to NULL. */ - if(!range_includes_zero_p (&r)) - return true; - } - /* If MEM_REF has a "positive" offset, consider it non-NULL - always, for -fdelete-null-pointer-checks also "negative" - ones. 
Punt for unknown offsets (e.g. variable ones). */ - if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr)) - && off_cst - && known_ne (off, 0) - && (flag_delete_null_pointer_checks || known_gt (off, 0))) - { - r = range_nonzero (TREE_TYPE (gimple_assign_rhs1 (stmt))); - return true; - } - r = int_range<2> (TREE_TYPE (gimple_assign_rhs1 (stmt))); - return true; - } - - // Handle "= &a". - if (tree_single_nonzero_warnv_p (expr, &strict_overflow_p)) - { - r = range_nonzero (TREE_TYPE (gimple_assign_rhs1 (stmt))); - return true; - } - - // Otherwise return varying. - r = int_range<2> (TREE_TYPE (gimple_assign_rhs1 (stmt))); - return true; -} - -// Calculate a range for phi statement S and return it in R. -// If a range cannot be calculated, return false. - -bool -fold_using_range::range_of_phi (irange &r, gphi *phi, fur_source &src) -{ - tree phi_def = gimple_phi_result (phi); - tree type = TREE_TYPE (phi_def); - int_range_max arg_range; - unsigned x; - - if (!irange::supports_type_p (type)) - return false; - - // Start with an empty range, unioning in each argument's range. - r.set_undefined (); - for (x = 0; x < gimple_phi_num_args (phi); x++) - { - tree arg = gimple_phi_arg_def (phi, x); - edge e = gimple_phi_arg_edge (phi, x); - - // Register potential dependencies for stale value tracking. - if (gimple_range_ssa_p (arg) && src.gori ()) - src.gori ()->register_dependency (phi_def, arg); - - // Get the range of the argument on its edge. - src.get_phi_operand (arg_range, arg, e); - // If we're recomputing the argument elsewhere, try to refine it. - r.union_ (arg_range); - // Once the value reaches varying, stop looking. - if (r.varying_p ()) - break; - } - - // If SCEV is available, query if this PHI has any knonwn values. 
- if (scev_initialized_p () && !POINTER_TYPE_P (TREE_TYPE (phi_def))) - { - value_range loop_range; - class loop *l = loop_containing_stmt (phi); - if (l && loop_outer (l)) - { - range_of_ssa_name_with_loop_info (loop_range, phi_def, l, phi, src); - if (!loop_range.varying_p ()) - { - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, " Loops range found for "); - print_generic_expr (dump_file, phi_def, TDF_SLIM); - fprintf (dump_file, ": "); - loop_range.dump (dump_file); - fprintf (dump_file, " and calculated range :"); - r.dump (dump_file); - fprintf (dump_file, "\n"); - } - r.intersect (loop_range); - } - } - } - - return true; -} - -// Calculate a range for call statement S and return it in R. -// If a range cannot be calculated, return false. - -bool -fold_using_range::range_of_call (irange &r, gcall *call, fur_source &src) -{ - tree type = gimple_call_return_type (call); - tree lhs = gimple_call_lhs (call); - bool strict_overflow_p; - - if (!irange::supports_type_p (type)) - return false; - - if (range_of_builtin_call (r, call, src)) - ; - else if (gimple_stmt_nonnegative_warnv_p (call, &strict_overflow_p)) - r.set (build_int_cst (type, 0), TYPE_MAX_VALUE (type)); - else if (gimple_call_nonnull_result_p (call) - || gimple_call_nonnull_arg (call)) - r = range_nonzero (type); - else - r.set_varying (type); - - // If there is an LHS, intersect that with what is known. - if (lhs) - { - value_range def; - def = gimple_range_global (lhs); - r.intersect (def); - } - return true; -} - -// Return the range of a __builtin_ubsan* in CALL and set it in R. -// CODE is the type of ubsan call (PLUS_EXPR, MINUS_EXPR or -// MULT_EXPR). 
- -void -fold_using_range::range_of_builtin_ubsan_call (irange &r, gcall *call, - tree_code code, fur_source &src) -{ - gcc_checking_assert (code == PLUS_EXPR || code == MINUS_EXPR - || code == MULT_EXPR); - tree type = gimple_call_return_type (call); - range_operator *op = range_op_handler (code, type); - gcc_checking_assert (op); - int_range_max ir0, ir1; - tree arg0 = gimple_call_arg (call, 0); - tree arg1 = gimple_call_arg (call, 1); - src.get_operand (ir0, arg0); - src.get_operand (ir1, arg1); - - bool saved_flag_wrapv = flag_wrapv; - // Pretend the arithmetic is wrapping. If there is any overflow, - // we'll complain, but will actually do wrapping operation. - flag_wrapv = 1; - op->fold_range (r, type, ir0, ir1); - flag_wrapv = saved_flag_wrapv; - - // If for both arguments vrp_valueize returned non-NULL, this should - // have been already folded and if not, it wasn't folded because of - // overflow. Avoid removing the UBSAN_CHECK_* calls in that case. - if (r.singleton_p ()) - r.set_varying (type); -} - -// For a builtin in CALL, return a range in R if known and return -// TRUE. Otherwise return FALSE. - -bool -fold_using_range::range_of_builtin_call (irange &r, gcall *call, - fur_source &src) -{ - combined_fn func = gimple_call_combined_fn (call); - if (func == CFN_LAST) - return false; - - tree type = gimple_call_return_type (call); - tree arg; - int mini, maxi, zerov = 0, prec; - scalar_int_mode mode; - - switch (func) - { - case CFN_BUILT_IN_CONSTANT_P: - if (cfun->after_inlining) - { - r.set_zero (type); - // r.equiv_clear (); - return true; - } - arg = gimple_call_arg (call, 0); - if (src.get_operand (r, arg) && r.singleton_p ()) - { - r.set (build_one_cst (type), build_one_cst (type)); - return true; - } - break; - - CASE_CFN_FFS: - CASE_CFN_POPCOUNT: - // __builtin_ffs* and __builtin_popcount* return [0, prec]. 
- arg = gimple_call_arg (call, 0); - prec = TYPE_PRECISION (TREE_TYPE (arg)); - mini = 0; - maxi = prec; - src.get_operand (r, arg); - // If arg is non-zero, then ffs or popcount are non-zero. - if (!range_includes_zero_p (&r)) - mini = 1; - // If some high bits are known to be zero, decrease the maximum. - if (!r.undefined_p ()) - { - if (TYPE_SIGN (r.type ()) == SIGNED) - range_cast (r, unsigned_type_for (r.type ())); - wide_int max = r.upper_bound (); - maxi = wi::floor_log2 (max) + 1; - } - r.set (build_int_cst (type, mini), build_int_cst (type, maxi)); - return true; - - CASE_CFN_PARITY: - r.set (build_zero_cst (type), build_one_cst (type)); - return true; - - CASE_CFN_CLZ: - // __builtin_c[lt]z* return [0, prec-1], except when the - // argument is 0, but that is undefined behavior. - // - // For __builtin_c[lt]z* consider argument of 0 always undefined - // behavior, for internal fns depending on C?Z_DEFINED_VALUE_AT_ZERO. - arg = gimple_call_arg (call, 0); - prec = TYPE_PRECISION (TREE_TYPE (arg)); - mini = 0; - maxi = prec - 1; - mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg)); - if (gimple_call_internal_p (call)) - { - if (optab_handler (clz_optab, mode) != CODE_FOR_nothing - && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2) - { - // Only handle the single common value. - if (zerov == prec) - maxi = prec; - else - // Magic value to give up, unless we can prove arg is non-zero. - mini = -2; - } - } - - src.get_operand (r, arg); - // From clz of minimum we can compute result maximum. - if (!r.undefined_p ()) - { - // From clz of minimum we can compute result maximum. - if (wi::gt_p (r.lower_bound (), 0, TYPE_SIGN (r.type ()))) - { - maxi = prec - 1 - wi::floor_log2 (r.lower_bound ()); - if (mini == -2) - mini = 0; - } - else if (!range_includes_zero_p (&r)) - { - mini = 0; - maxi = prec - 1; - } - if (mini == -2) - break; - // From clz of maximum we can compute result minimum. 
- wide_int max = r.upper_bound (); - int newmini = prec - 1 - wi::floor_log2 (max); - if (max == 0) - { - // If CLZ_DEFINED_VALUE_AT_ZERO is 2 with VALUE of prec, - // return [prec, prec], otherwise ignore the range. - if (maxi == prec) - mini = prec; - } - else - mini = newmini; - } - if (mini == -2) - break; - r.set (build_int_cst (type, mini), build_int_cst (type, maxi)); - return true; - - CASE_CFN_CTZ: - // __builtin_ctz* return [0, prec-1], except for when the - // argument is 0, but that is undefined behavior. - // - // For __builtin_ctz* consider argument of 0 always undefined - // behavior, for internal fns depending on CTZ_DEFINED_VALUE_AT_ZERO. - arg = gimple_call_arg (call, 0); - prec = TYPE_PRECISION (TREE_TYPE (arg)); - mini = 0; - maxi = prec - 1; - mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg)); - if (gimple_call_internal_p (call)) - { - if (optab_handler (ctz_optab, mode) != CODE_FOR_nothing - && CTZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2) - { - // Handle only the two common values. - if (zerov == -1) - mini = -1; - else if (zerov == prec) - maxi = prec; - else - // Magic value to give up, unless we can prove arg is non-zero. - mini = -2; - } - } - src.get_operand (r, arg); - if (!r.undefined_p ()) - { - // If arg is non-zero, then use [0, prec - 1]. - if (!range_includes_zero_p (&r)) - { - mini = 0; - maxi = prec - 1; - } - // If some high bits are known to be zero, we can decrease - // the maximum. - wide_int max = r.upper_bound (); - if (max == 0) - { - // Argument is [0, 0]. If CTZ_DEFINED_VALUE_AT_ZERO - // is 2 with value -1 or prec, return [-1, -1] or [prec, prec]. - // Otherwise ignore the range. - if (mini == -1) - maxi = -1; - else if (maxi == prec) - mini = prec; - } - // If value at zero is prec and 0 is in the range, we can't lower - // the upper bound. We could create two separate ranges though, - // [0,floor_log2(max)][prec,prec] though. 
- else if (maxi != prec) - maxi = wi::floor_log2 (max); - } - if (mini == -2) - break; - r.set (build_int_cst (type, mini), build_int_cst (type, maxi)); - return true; - - CASE_CFN_CLRSB: - arg = gimple_call_arg (call, 0); - prec = TYPE_PRECISION (TREE_TYPE (arg)); - r.set (build_int_cst (type, 0), build_int_cst (type, prec - 1)); - return true; - case CFN_UBSAN_CHECK_ADD: - range_of_builtin_ubsan_call (r, call, PLUS_EXPR, src); - return true; - case CFN_UBSAN_CHECK_SUB: - range_of_builtin_ubsan_call (r, call, MINUS_EXPR, src); - return true; - case CFN_UBSAN_CHECK_MUL: - range_of_builtin_ubsan_call (r, call, MULT_EXPR, src); - return true; - - case CFN_GOACC_DIM_SIZE: - case CFN_GOACC_DIM_POS: - // Optimizing these two internal functions helps the loop - // optimizer eliminate outer comparisons. Size is [1,N] - // and pos is [0,N-1]. - { - bool is_pos = func == CFN_GOACC_DIM_POS; - int axis = oacc_get_ifn_dim_arg (call); - int size = oacc_get_fn_dim_size (current_function_decl, axis); - if (!size) - // If it's dynamic, the backend might know a hardware limitation. - size = targetm.goacc.dim_limit (axis); - - r.set (build_int_cst (type, is_pos ? 0 : 1), - size - ? build_int_cst (type, size - is_pos) : vrp_val_max (type)); - return true; - } - - case CFN_BUILT_IN_STRLEN: - if (tree lhs = gimple_call_lhs (call)) - if (ptrdiff_type_node - && (TYPE_PRECISION (ptrdiff_type_node) - == TYPE_PRECISION (TREE_TYPE (lhs)))) - { - tree type = TREE_TYPE (lhs); - tree max = vrp_val_max (ptrdiff_type_node); - wide_int wmax - = wi::to_wide (max, TYPE_PRECISION (TREE_TYPE (max))); - tree range_min = build_zero_cst (type); - // To account for the terminating NULL, the maximum length - // is one less than the maximum array size, which in turn - // is one less than PTRDIFF_MAX (or SIZE_MAX where it's - // smaller than the former type). - // FIXME: Use max_object_size() - 1 here. 
- tree range_max = wide_int_to_tree (type, wmax - 2); - r.set (range_min, range_max); - return true; - } - break; - default: - break; - } - return false; -} - - -// Calculate a range for COND_EXPR statement S and return it in R. -// If a range cannot be calculated, return false. - -bool -fold_using_range::range_of_cond_expr (irange &r, gassign *s, fur_source &src) -{ - int_range_max cond_range, range1, range2; - tree cond = gimple_assign_rhs1 (s); - tree op1 = gimple_assign_rhs2 (s); - tree op2 = gimple_assign_rhs3 (s); - - gcc_checking_assert (gimple_assign_rhs_code (s) == COND_EXPR); - gcc_checking_assert (useless_type_conversion_p (TREE_TYPE (op1), - TREE_TYPE (op2))); - if (!irange::supports_type_p (TREE_TYPE (op1))) - return false; - - src.get_operand (cond_range, cond); - src.get_operand (range1, op1); - src.get_operand (range2, op2); - - // If the condition is known, choose the appropriate expression. - if (cond_range.singleton_p ()) - { - // False, pick second operand. - if (cond_range.zero_p ()) - r = range2; - else - r = range1; - } - else - { - r = range1; - r.union_ (range2); - } - return true; -} - gimple_ranger::gimple_ranger () { // If the cache has a relation oracle, use it. @@ -1457,7 +289,7 @@ gimple_ranger::dump_bb (FILE *f, basic_block bb) m_cache.block_range (range, bb, name, false) || m_cache.block_range (range, e->dest, name, false)) { - range_on_edge (range, e, name); + m_cache.range_on_edge (range, e, name); if (!range.varying_p ()) { fprintf (f, "%d->%d ", e->src->index, @@ -1493,217 +325,6 @@ gimple_ranger::dump (FILE *f) m_cache.dump (f); } -// If SCEV has any information about phi node NAME, return it as a range in R. 
- -void -fold_using_range::range_of_ssa_name_with_loop_info (irange &r, tree name, - class loop *l, gphi *phi, - fur_source &src) -{ - gcc_checking_assert (TREE_CODE (name) == SSA_NAME); - tree min, max, type = TREE_TYPE (name); - if (bounds_of_var_in_loop (&min, &max, src.query (), l, phi, name)) - { - if (TREE_CODE (min) != INTEGER_CST) - { - if (src.query ()->range_of_expr (r, min, phi) && !r.undefined_p ()) - min = wide_int_to_tree (type, r.lower_bound ()); - else - min = vrp_val_min (type); - } - if (TREE_CODE (max) != INTEGER_CST) - { - if (src.query ()->range_of_expr (r, max, phi) && !r.undefined_p ()) - max = wide_int_to_tree (type, r.upper_bound ()); - else - max = vrp_val_max (type); - } - r.set (min, max); - } - else - r.set_varying (type); -} - -// ----------------------------------------------------------------------- - -// Check if an && or || expression can be folded based on relations. ie -// c_2 = a_6 > b_7 -// c_3 = a_6 < b_7 -// c_4 = c_2 && c_3 -// c_2 and c_3 can never be true at the same time, -// Therefore c_4 can always resolve to false based purely on the relations. - -void -fold_using_range::relation_fold_and_or (irange& lhs_range, gimple *s, - fur_source &src) -{ - // No queries or already folded. - if (!src.gori () || !src.query ()->oracle () || lhs_range.singleton_p ()) - return; - - // Only care about AND and OR expressions. - enum tree_code code = gimple_expr_code (s); - bool is_and = false; - if (code == BIT_AND_EXPR || code == TRUTH_AND_EXPR) - is_and = true; - else if (code != BIT_IOR_EXPR && code != TRUTH_OR_EXPR) - return; - - tree lhs = gimple_get_lhs (s); - tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (s)); - tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (s)); - - // Deal with || and && only when there is a full set of symbolics. 
- if (!lhs || !ssa1 || !ssa2 - || (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE) - || (TREE_CODE (TREE_TYPE (ssa1)) != BOOLEAN_TYPE) - || (TREE_CODE (TREE_TYPE (ssa2)) != BOOLEAN_TYPE)) - return; - - // Now we know its a boolean AND or OR expression with boolean operands. - // Ideally we search dependencies for common names, and see what pops out. - // until then, simply try to resolve direct dependencies. - - // Both names will need to have 2 direct dependencies. - tree ssa1_dep2 = src.gori ()->depend2 (ssa1); - tree ssa2_dep2 = src.gori ()->depend2 (ssa2); - if (!ssa1_dep2 || !ssa2_dep2) - return; - - tree ssa1_dep1 = src.gori ()->depend1 (ssa1); - tree ssa2_dep1 = src.gori ()->depend1 (ssa2); - // Make sure they are the same dependencies, and detect the order of the - // relationship. - bool reverse_op2 = true; - if (ssa1_dep1 == ssa2_dep1 && ssa1_dep2 == ssa2_dep2) - reverse_op2 = false; - else if (ssa1_dep1 != ssa2_dep2 || ssa1_dep2 != ssa2_dep1) - return; - - range_operator *handler1 = gimple_range_handler (SSA_NAME_DEF_STMT (ssa1)); - range_operator *handler2 = gimple_range_handler (SSA_NAME_DEF_STMT (ssa2)); - - int_range<2> bool_one (boolean_true_node, boolean_true_node); - - relation_kind relation1 = handler1->op1_op2_relation (bool_one); - relation_kind relation2 = handler2->op1_op2_relation (bool_one); - if (relation1 == VREL_NONE || relation2 == VREL_NONE) - return; - - if (reverse_op2) - relation2 = relation_negate (relation2); - - // x && y is false if the relation intersection of the true cases is NULL. - if (is_and && relation_intersect (relation1, relation2) == VREL_EMPTY) - lhs_range = int_range<2> (boolean_false_node, boolean_false_node); - // x || y is true if the union of the true cases is NO-RELATION.. - // ie, one or the other being true covers the full range of possibilties. 
- else if (!is_and && relation_union (relation1, relation2) == VREL_NONE) - lhs_range = bool_one; - else - return; - - range_cast (lhs_range, TREE_TYPE (lhs)); - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, " Relation adjustment: "); - print_generic_expr (dump_file, ssa1, TDF_SLIM); - fprintf (dump_file, " and "); - print_generic_expr (dump_file, ssa2, TDF_SLIM); - fprintf (dump_file, " combine to produce "); - lhs_range.dump (dump_file); - fputc ('\n', dump_file); - } - - return; -} - -// Register any outgoing edge relations from a conditional branch. - -void -fold_using_range::postfold_gcond_edges (gcond *s, fur_source &src) -{ - int_range_max r; - tree name; - range_operator *handler; - basic_block bb = gimple_bb (s); - - edge e0 = EDGE_SUCC (bb, 0); - if (!single_pred_p (e0->dest)) - e0 = NULL; - - edge e1 = EDGE_SUCC (bb, 1); - if (!single_pred_p (e1->dest)) - e1 = NULL; - - // At least one edge needs to be single pred. - if (!e0 && !e1) - return; - - // First, register the gcond itself. This will catch statements like - // if (a_2 < b_5) - tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (s)); - tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (s)); - if (ssa1 && ssa2) - { - handler = gimple_range_handler (s); - gcc_checking_assert (handler); - if (e0) - { - gcond_edge_range (r, e0); - relation_kind relation = handler->op1_op2_relation (r); - if (relation != VREL_NONE) - src.register_relation (e0, relation, ssa1, ssa2); - } - if (e1) - { - gcond_edge_range (r, e1); - relation_kind relation = handler->op1_op2_relation (r); - if (relation != VREL_NONE) - src.register_relation (e1, relation, ssa1, ssa2); - } - } - - // Outgoing relations of GORI exports require a gori engine. - if (!src.gori ()) - return; - - range_query *q = src.query (); - // Now look for other relations in the exports. 
This will find stmts - // leading to the condition such as: - // c_2 = a_4 < b_7 - // if (c_2) - - FOR_EACH_GORI_EXPORT_NAME (*(src.gori ()), bb, name) - { - if (TREE_CODE (TREE_TYPE (name)) != BOOLEAN_TYPE) - continue; - gimple *stmt = SSA_NAME_DEF_STMT (name); - handler = gimple_range_handler (stmt); - if (!handler) - continue; - tree ssa1 = gimple_range_ssa_p (gimple_range_operand1 (stmt)); - tree ssa2 = gimple_range_ssa_p (gimple_range_operand2 (stmt)); - if (ssa1 && ssa2) - { - if (e0 && src.gori ()->outgoing_edge_range_p (r, e0, name, *q) - && r.singleton_p ()) - { - relation_kind relation = handler->op1_op2_relation (r); - if (relation != VREL_NONE) - src.register_relation (e0, relation, ssa1, ssa2); - } - if (e1 && src.gori ()->outgoing_edge_range_p (r, e1, name, *q) - && r.singleton_p ()) - { - relation_kind relation = handler->op1_op2_relation (r); - if (relation != VREL_NONE) - src.register_relation (e1, relation, ssa1, ssa2); - } - } - } -} -// -------------------------------------------------------------------------- // trace_ranger implementation. diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h index 87911b9..aa62039 100644 --- a/gcc/gimple-range.h +++ b/gcc/gimple-range.h @@ -19,29 +19,18 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ -#ifndef GCC_GIMPLE_RANGE_STMT_H -#define GCC_GIMPLE_RANGE_STMT_H +#ifndef GCC_GIMPLE_RANGE_H +#define GCC_GIMPLE_RANGE_H #include "range.h" #include "value-query.h" #include "range-op.h" #include "gimple-range-edge.h" +#include "gimple-range-fold.h" #include "gimple-range-gori.h" #include "gimple-range-cache.h" -// This file is the main include point for gimple ranges. -// There are two fold_range routines of interest: -// bool fold_range (irange &r, gimple *s, range_query *q) -// bool fold_range (irange &r, gimple *s, edge on_edge, range_query *q) -// These routines will fold stmt S into the result irange R. 
-// Any ssa_names on the stmt will be calculated using the range_query -// parameter via a call to range_of_expr. -// If no range_query is provided, current global range info will be used. -// The second variation specifies an edge, and stmt S is recalculated as if -// it appeared on that edge. - - // This is the basic range generator interface. // // This base class provides all the API entry points, but only provides @@ -73,131 +62,6 @@ protected: ranger_cache m_cache; }; -// Source of all operands for fold_using_range and gori_compute. -// It abstracts out the source of an operand so it can come from a stmt or -// and edge or anywhere a derived class of fur_source wants. -// THe default simply picks up ranges from the current range_query. - -class fur_source -{ -public: - fur_source (range_query *q = NULL); - inline range_query *query () { return m_query; } - inline class gori_compute *gori () { return m_gori; }; - virtual bool get_operand (irange &r, tree expr); - virtual bool get_phi_operand (irange &r, tree expr, edge e); - virtual relation_kind query_relation (tree op1, tree op2); - virtual void register_relation (gimple *stmt, relation_kind k, tree op1, - tree op2); - virtual void register_relation (edge e, relation_kind k, tree op1, - tree op2); -protected: - range_query *m_query; - gori_compute *m_gori; -}; - -// fur_stmt is the specification for drawing an operand from range_query Q -// via a range_of_Expr call on stmt S. - -class fur_stmt : public fur_source -{ -public: - fur_stmt (gimple *s, range_query *q = NULL); - virtual bool get_operand (irange &r, tree expr) OVERRIDE; - virtual bool get_phi_operand (irange &r, tree expr, edge e) OVERRIDE; - virtual relation_kind query_relation (tree op1, tree op2) OVERRIDE; -private: - gimple *m_stmt; -}; - - -// Fold stmt S into range R using range query Q. -bool fold_range (irange &r, gimple *s, range_query *q = NULL); -// Recalculate stmt S into R using range query Q as if it were on edge ON_EDGE. 
-bool fold_range (irange &r, gimple *s, edge on_edge, range_query *q = NULL); -// These routines allow you to specify the operands to use when folding. -// Any excess queries will be drawn from the current range_query. -bool fold_range (irange &r, gimple *s, irange &r1); -bool fold_range (irange &r, gimple *s, irange &r1, irange &r2); -bool fold_range (irange &r, gimple *s, unsigned num_elements, irange *vector); - -// This class uses ranges to fold a gimple statement producinf a range for -// the LHS. The source of all operands is supplied via the fur_source class -// which provides a range_query as well as a source location and any other -// required information. - -class fold_using_range -{ -public: - bool fold_stmt (irange &r, gimple *s, class fur_source &src, - tree name = NULL_TREE); -protected: - bool range_of_range_op (irange &r, gimple *s, fur_source &src); - bool range_of_call (irange &r, gcall *call, fur_source &src); - bool range_of_cond_expr (irange &r, gassign* cond, fur_source &src); - bool range_of_address (irange &r, gimple *s, fur_source &src); - bool range_of_builtin_call (irange &r, gcall *call, fur_source &src); - void range_of_builtin_ubsan_call (irange &r, gcall *call, tree_code code, - fur_source &src); - bool range_of_phi (irange &r, gphi *phi, fur_source &src); - void range_of_ssa_name_with_loop_info (irange &, tree, class loop *, gphi *, - fur_source &src); - void relation_fold_and_or (irange& lhs_range, gimple *s, fur_source &src); - void postfold_gcond_edges (gcond *s, fur_source &src); -}; - - -// These routines provide a GIMPLE interface to the range-ops code. 
-extern tree gimple_range_operand1 (const gimple *s); -extern tree gimple_range_operand2 (const gimple *s); -extern bool gimple_range_calc_op1 (irange &r, const gimple *s, - const irange &lhs_range); -extern bool gimple_range_calc_op1 (irange &r, const gimple *s, - const irange &lhs_range, - const irange &op2_range); -extern bool gimple_range_calc_op2 (irange &r, const gimple *s, - const irange &lhs_range, - const irange &op1_range); - - -// Return the range_operator pointer for this statement. This routine -// can also be used to gate whether a routine is range-ops enabled. - -static inline range_operator * -gimple_range_handler (const gimple *s) -{ - if (const gassign *ass = dyn_cast<const gassign *> (s)) - return range_op_handler (gimple_assign_rhs_code (ass), - TREE_TYPE (gimple_assign_lhs (ass))); - if (const gcond *cond = dyn_cast<const gcond *> (s)) - return range_op_handler (gimple_cond_code (cond), - TREE_TYPE (gimple_cond_lhs (cond))); - return NULL; -} - -// Return EXP if it is an SSA_NAME with a type supported by gimple ranges. - -static inline tree -gimple_range_ssa_p (tree exp) -{ - if (exp && TREE_CODE (exp) == SSA_NAME && - !SSA_NAME_IS_VIRTUAL_OPERAND (exp) && - irange::supports_type_p (TREE_TYPE (exp))) - return exp; - return NULL_TREE; -} - -// Return true if TYPE1 and TYPE2 are compatible range types. - -static inline bool -range_compatible_p (tree type1, tree type2) -{ - // types_compatible_p requires conversion in both directions to be useless. - // GIMPLE only requires a cast one way in order to be compatible. - // Ranges really only need the sign and precision to be the same. - return (TYPE_PRECISION (type1) == TYPE_PRECISION (type2) - && TYPE_SIGN (type1) == TYPE_SIGN (type2)); -} // This class overloads the ranger routines to provide tracing facilties // Entry and exit values to each of the APIs is placed in the dumpfile. 
@@ -227,4 +91,4 @@ private: extern gimple_ranger *enable_ranger (struct function *); extern void disable_ranger (struct function *); -#endif // GCC_GIMPLE_RANGE_STMT_H +#endif // GCC_GIMPLE_RANGE_H diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 21e7a6c..4be2feb 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -9566,8 +9566,116 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, OMP_CLAUSE_SET_MAP_KIND (c, k); } - if (gimplify_expr (pd, pre_p, NULL, is_gimple_lvalue, fb_lvalue) - == GS_ERROR) + if (code == OMP_TARGET && OMP_CLAUSE_MAP_IN_REDUCTION (c)) + { + /* Don't gimplify *pd fully at this point, as the base + will need to be adjusted during omp lowering. */ + auto_vec<tree, 10> expr_stack; + tree *p = pd; + while (handled_component_p (*p) + || TREE_CODE (*p) == INDIRECT_REF + || TREE_CODE (*p) == ADDR_EXPR + || TREE_CODE (*p) == MEM_REF + || TREE_CODE (*p) == NON_LVALUE_EXPR) + { + expr_stack.safe_push (*p); + p = &TREE_OPERAND (*p, 0); + } + for (int i = expr_stack.length () - 1; i >= 0; i--) + { + tree t = expr_stack[i]; + if (TREE_CODE (t) == ARRAY_REF + || TREE_CODE (t) == ARRAY_RANGE_REF) + { + if (TREE_OPERAND (t, 2) == NULL_TREE) + { + tree low = unshare_expr (array_ref_low_bound (t)); + if (!is_gimple_min_invariant (low)) + { + TREE_OPERAND (t, 2) = low; + if (gimplify_expr (&TREE_OPERAND (t, 2), + pre_p, NULL, + is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + } + } + else if (gimplify_expr (&TREE_OPERAND (t, 2), pre_p, + NULL, is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + if (TREE_OPERAND (t, 3) == NULL_TREE) + { + tree elmt_size = array_ref_element_size (t); + if (!is_gimple_min_invariant (elmt_size)) + { + elmt_size = unshare_expr (elmt_size); + tree elmt_type + = TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, + 0))); + tree factor + = size_int (TYPE_ALIGN_UNIT (elmt_type)); + elmt_size + = size_binop (EXACT_DIV_EXPR, elmt_size, + factor); + TREE_OPERAND (t, 3) = elmt_size; + if (gimplify_expr (&TREE_OPERAND (t, 
3), + pre_p, NULL, + is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + } + } + else if (gimplify_expr (&TREE_OPERAND (t, 3), pre_p, + NULL, is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + } + else if (TREE_CODE (t) == COMPONENT_REF) + { + if (TREE_OPERAND (t, 2) == NULL_TREE) + { + tree offset = component_ref_field_offset (t); + if (!is_gimple_min_invariant (offset)) + { + offset = unshare_expr (offset); + tree field = TREE_OPERAND (t, 1); + tree factor + = size_int (DECL_OFFSET_ALIGN (field) + / BITS_PER_UNIT); + offset = size_binop (EXACT_DIV_EXPR, offset, + factor); + TREE_OPERAND (t, 2) = offset; + if (gimplify_expr (&TREE_OPERAND (t, 2), + pre_p, NULL, + is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + } + } + else if (gimplify_expr (&TREE_OPERAND (t, 2), pre_p, + NULL, is_gimple_reg, + fb_rvalue) == GS_ERROR) + remove = true; + } + } + for (; expr_stack.length () > 0; ) + { + tree t = expr_stack.pop (); + + if (TREE_CODE (t) == ARRAY_REF + || TREE_CODE (t) == ARRAY_RANGE_REF) + { + if (!is_gimple_min_invariant (TREE_OPERAND (t, 1)) + && gimplify_expr (&TREE_OPERAND (t, 1), pre_p, + NULL, is_gimple_val, + fb_rvalue) == GS_ERROR) + remove = true; + } + } + } + else if (gimplify_expr (pd, pre_p, NULL, is_gimple_lvalue, + fb_lvalue) == GS_ERROR) { remove = true; break; @@ -9764,17 +9872,21 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TASK_REDUCTION) && OMP_CLAUSE_REDUCTION_PLACEHOLDER (c)) { - omp_add_variable (ctx, OMP_CLAUSE_REDUCTION_PLACEHOLDER (c), - GOVD_LOCAL | GOVD_SEEN); - if (OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (c) + struct gimplify_omp_ctx *pctx + = code == OMP_TARGET ? 
outer_ctx : ctx; + if (pctx) + omp_add_variable (pctx, OMP_CLAUSE_REDUCTION_PLACEHOLDER (c), + GOVD_LOCAL | GOVD_SEEN); + if (pctx + && OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (c) && walk_tree (&OMP_CLAUSE_REDUCTION_INIT (c), find_decl_expr, OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (c), NULL) == NULL_TREE) - omp_add_variable (ctx, + omp_add_variable (pctx, OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (c), GOVD_LOCAL | GOVD_SEEN); - gimplify_omp_ctxp = ctx; + gimplify_omp_ctxp = pctx; push_gimplify_context (); OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c) = NULL; diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index b2f414d..c3b8e73 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -281,6 +281,7 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) +DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) /* FP scales. */ diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c index f8b1558..5009279 100644 --- a/gcc/omp-expand.c +++ b/gcc/omp-expand.c @@ -9615,6 +9615,10 @@ expand_omp_target (struct omp_region *region) } c = omp_find_clause (clauses, OMP_CLAUSE_NOWAIT); + /* FIXME: in_reduction(...) nowait is unimplemented yet, pretend + nowait doesn't appear. */ + if (c && omp_find_clause (clauses, OMP_CLAUSE_IN_REDUCTION)) + c = NULL; if (c) flags_i |= GOMP_TARGET_FLAG_NOWAIT; } diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 2cc2a18..503754b 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -1240,6 +1240,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) && ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION && (OMP_CLAUSE_REDUCTION_INSCAN (c) || OMP_CLAUSE_REDUCTION_TASK (c))) + || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION || is_task_ctx (ctx))) { /* For now. 
*/ @@ -1254,6 +1255,29 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) if (TREE_CODE (t) == INDIRECT_REF || TREE_CODE (t) == ADDR_EXPR) t = TREE_OPERAND (t, 0); + if (is_omp_target (ctx->stmt)) + { + if (is_variable_sized (t)) + { + gcc_assert (DECL_HAS_VALUE_EXPR_P (t)); + t = DECL_VALUE_EXPR (t); + gcc_assert (TREE_CODE (t) == INDIRECT_REF); + t = TREE_OPERAND (t, 0); + gcc_assert (DECL_P (t)); + } + tree at = t; + if (ctx->outer) + scan_omp_op (&at, ctx->outer); + tree nt = omp_copy_decl_1 (at, ctx); + splay_tree_insert (ctx->field_map, + (splay_tree_key) &DECL_CONTEXT (t), + (splay_tree_value) nt); + if (at != t) + splay_tree_insert (ctx->field_map, + (splay_tree_key) &DECL_CONTEXT (at), + (splay_tree_value) nt); + break; + } install_var_local (t, ctx); if (is_taskreg_ctx (ctx) && (!is_global_var (maybe_lookup_decl_in_outer_ctx (t, ctx)) @@ -1280,6 +1304,21 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) } break; } + if (is_omp_target (ctx->stmt)) + { + tree at = decl; + if (ctx->outer) + scan_omp_op (&at, ctx->outer); + tree nt = omp_copy_decl_1 (at, ctx); + splay_tree_insert (ctx->field_map, + (splay_tree_key) &DECL_CONTEXT (decl), + (splay_tree_value) nt); + if (at != decl) + splay_tree_insert (ctx->field_map, + (splay_tree_key) &DECL_CONTEXT (at), + (splay_tree_value) nt); + break; + } if (is_task_ctx (ctx) || (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION && OMP_CLAUSE_REDUCTION_TASK (c) @@ -1546,7 +1585,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) else install_var_field (decl, true, 3, ctx); if (is_gimple_omp_offloaded (ctx->stmt) - && !OMP_CLAUSE_MAP_IN_REDUCTION (c)) + && !(is_gimple_omp_oacc (ctx->stmt) + && OMP_CLAUSE_MAP_IN_REDUCTION (c))) install_var_local (decl, ctx); } } @@ -1692,7 +1732,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_REDUCTION: case OMP_CLAUSE_IN_REDUCTION: decl = OMP_CLAUSE_DECL (c); - if (TREE_CODE (decl) != MEM_REF) + if (TREE_CODE (decl) != MEM_REF && !is_omp_target 
(ctx->stmt)) { if (is_variable_sized (decl)) install_var_local (decl, ctx); @@ -1844,8 +1884,11 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TASK_REDUCTION) && OMP_CLAUSE_REDUCTION_PLACEHOLDER (c)) { - scan_omp (&OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c), ctx); - scan_omp (&OMP_CLAUSE_REDUCTION_GIMPLE_MERGE (c), ctx); + omp_context *rctx = ctx; + if (is_omp_target (ctx->stmt)) + rctx = ctx->outer; + scan_omp (&OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c), rctx); + scan_omp (&OMP_CLAUSE_REDUCTION_GIMPLE_MERGE (c), rctx); } else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE && OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c)) @@ -4828,7 +4871,9 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, break; case OMP_CLAUSE_REDUCTION: case OMP_CLAUSE_IN_REDUCTION: - if (is_task_ctx (ctx) || OMP_CLAUSE_REDUCTION_TASK (c)) + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IN_REDUCTION + || is_task_ctx (ctx) + || OMP_CLAUSE_REDUCTION_TASK (c)) { task_reduction_p = true; if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION) @@ -4958,7 +5003,12 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, } new_var = var; } - if (c_kind != OMP_CLAUSE_COPYIN) + if (c_kind == OMP_CLAUSE_IN_REDUCTION && is_omp_target (ctx->stmt)) + { + splay_tree_key key = (splay_tree_key) &DECL_CONTEXT (var); + new_var = (tree) splay_tree_lookup (ctx->field_map, key)->value; + } + else if (c_kind != OMP_CLAUSE_COPYIN) new_var = lookup_decl (var, ctx); if (c_kind == OMP_CLAUSE_SHARED || c_kind == OMP_CLAUSE_COPYIN) @@ -4980,7 +5030,10 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, if (TREE_CODE (orig_var) == POINTER_PLUS_EXPR) { tree b = TREE_OPERAND (orig_var, 1); - b = maybe_lookup_decl (b, ctx); + if (is_omp_target (ctx->stmt)) + b = NULL_TREE; + else + b = maybe_lookup_decl (b, ctx); if (b == NULL) { b = TREE_OPERAND (orig_var, 1); @@ -5006,6 +5059,8 @@ lower_rec_input_clauses (tree clauses, gimple_seq 
*ilist, gimple_seq *dlist, || (TREE_CODE (TREE_TYPE (TREE_TYPE (out))) != POINTER_TYPE))) x = var; + else if (is_omp_target (ctx->stmt)) + x = out; else { bool by_ref = use_pointer_for_field (var, NULL); @@ -5049,7 +5104,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, const char *name = get_name (orig_var); if (pass != 3 && !TREE_CONSTANT (v)) { - tree t = maybe_lookup_decl (v, ctx); + tree t; + if (is_omp_target (ctx->stmt)) + t = NULL_TREE; + else + t = maybe_lookup_decl (v, ctx); if (t) v = t; else @@ -5100,7 +5159,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, TYPE_SIZE_UNIT (type)); else { - tree t = maybe_lookup_decl (v, ctx); + tree t; + if (is_omp_target (ctx->stmt)) + t = NULL_TREE; + else + t = maybe_lookup_decl (v, ctx); if (t) v = t; else @@ -5410,8 +5473,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, } else if (pass == 2) { - if (is_global_var (maybe_lookup_decl_in_outer_ctx (var, ctx))) + tree out = maybe_lookup_decl_in_outer_ctx (var, ctx); + if (is_global_var (out)) x = var; + else if (is_omp_target (ctx->stmt)) + x = out; else { bool by_ref = use_pointer_for_field (var, ctx); @@ -6345,7 +6411,27 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, if (OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c)) { tseq = OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c); - lower_omp (&tseq, ctx); + if (c_kind == OMP_CLAUSE_IN_REDUCTION + && is_omp_target (ctx->stmt)) + { + tree d = maybe_lookup_decl_in_outer_ctx (var, ctx); + tree oldv = NULL_TREE; + gcc_assert (d); + if (DECL_HAS_VALUE_EXPR_P (d)) + oldv = DECL_VALUE_EXPR (d); + SET_DECL_VALUE_EXPR (d, new_vard); + DECL_HAS_VALUE_EXPR_P (d) = 1; + lower_omp (&tseq, ctx); + if (oldv) + SET_DECL_VALUE_EXPR (d, oldv); + else + { + SET_DECL_VALUE_EXPR (d, NULL_TREE); + DECL_HAS_VALUE_EXPR_P (d) = 0; + } + } + else + lower_omp (&tseq, ctx); gimple_seq_add_seq (ilist, tseq); } 
OMP_CLAUSE_REDUCTION_GIMPLE_INIT (c) = NULL; @@ -12184,11 +12270,26 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) location_t loc = gimple_location (stmt); bool offloaded, data_region; unsigned int map_cnt = 0; + tree in_reduction_clauses = NULL_TREE; offloaded = is_gimple_omp_offloaded (stmt); switch (gimple_omp_target_kind (stmt)) { case GF_OMP_TARGET_KIND_REGION: + tree *p, *q; + q = &in_reduction_clauses; + for (p = gimple_omp_target_clauses_ptr (stmt); *p; ) + if (OMP_CLAUSE_CODE (*p) == OMP_CLAUSE_IN_REDUCTION) + { + *q = *p; + q = &OMP_CLAUSE_CHAIN (*q); + *p = OMP_CLAUSE_CHAIN (*p); + } + else + p = &OMP_CLAUSE_CHAIN (*p); + *q = NULL_TREE; + *p = in_reduction_clauses; + /* FALLTHRU */ case GF_OMP_TARGET_KIND_UPDATE: case GF_OMP_TARGET_KIND_ENTER_DATA: case GF_OMP_TARGET_KIND_EXIT_DATA: @@ -12217,12 +12318,17 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) gimple_seq dep_ilist = NULL; gimple_seq dep_olist = NULL; - if (omp_find_clause (clauses, OMP_CLAUSE_DEPEND)) + bool has_depend = omp_find_clause (clauses, OMP_CLAUSE_DEPEND) != NULL_TREE; + if (has_depend || in_reduction_clauses) { push_gimplify_context (); dep_bind = gimple_build_bind (NULL, NULL, make_node (BLOCK)); - lower_depend_clauses (gimple_omp_target_clauses_ptr (stmt), - &dep_ilist, &dep_olist); + if (has_depend) + lower_depend_clauses (gimple_omp_target_clauses_ptr (stmt), + &dep_ilist, &dep_olist); + if (in_reduction_clauses) + lower_rec_input_clauses (in_reduction_clauses, &dep_ilist, &dep_olist, + ctx, NULL); } tgt_bind = NULL; @@ -12348,6 +12454,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) /* Don't remap compute constructs' reduction variables, because the intermediate result must be local to each gang. 
*/ if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && is_gimple_omp_oacc (ctx->stmt) && OMP_CLAUSE_MAP_IN_REDUCTION (c))) { x = build_receiver_ref (var, true, ctx); @@ -12565,16 +12672,46 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP && OMP_CLAUSE_MAP_ZERO_BIAS_ARRAY_SECTION (c)) { - gcc_checking_assert (OMP_CLAUSE_DECL (OMP_CLAUSE_CHAIN (c)) - == get_base_address (ovar)); nc = OMP_CLAUSE_CHAIN (c); + gcc_checking_assert (OMP_CLAUSE_DECL (nc) + == get_base_address (ovar)); ovar = OMP_CLAUSE_DECL (nc); } else { tree x = build_sender_ref (ovar, ctx); - tree v - = build_fold_addr_expr_with_type (ovar, ptr_type_node); + tree v = ovar; + if (in_reduction_clauses + && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && OMP_CLAUSE_MAP_IN_REDUCTION (c)) + { + v = unshare_expr (v); + tree *p = &v; + while (handled_component_p (*p) + || TREE_CODE (*p) == INDIRECT_REF + || TREE_CODE (*p) == ADDR_EXPR + || TREE_CODE (*p) == MEM_REF + || TREE_CODE (*p) == NON_LVALUE_EXPR) + p = &TREE_OPERAND (*p, 0); + tree d = *p; + if (is_variable_sized (d)) + { + gcc_assert (DECL_HAS_VALUE_EXPR_P (d)); + d = DECL_VALUE_EXPR (d); + gcc_assert (TREE_CODE (d) == INDIRECT_REF); + d = TREE_OPERAND (d, 0); + gcc_assert (DECL_P (d)); + } + splay_tree_key key + = (splay_tree_key) &DECL_CONTEXT (d); + tree nd = (tree) splay_tree_lookup (ctx->field_map, + key)->value; + if (d == *p) + *p = nd; + else + *p = build_fold_indirect_ref (nd); + } + v = build_fold_addr_expr_with_type (v, ptr_type_node); gimplify_assign (x, v, &ilist); nc = NULL_TREE; } @@ -12601,19 +12738,45 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign) talign = DECL_ALIGN_UNIT (ovar); + var = NULL_TREE; + if (nc) + { + if (in_reduction_clauses + && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && OMP_CLAUSE_MAP_IN_REDUCTION (c)) + { + tree d = ovar; + if (is_variable_sized (d)) + { + gcc_assert 
(DECL_HAS_VALUE_EXPR_P (d)); + d = DECL_VALUE_EXPR (d); + gcc_assert (TREE_CODE (d) == INDIRECT_REF); + d = TREE_OPERAND (d, 0); + gcc_assert (DECL_P (d)); + } + splay_tree_key key + = (splay_tree_key) &DECL_CONTEXT (d); + tree nd = (tree) splay_tree_lookup (ctx->field_map, + key)->value; + if (d == ovar) + var = nd; + else + var = build_fold_indirect_ref (nd); + } + else + var = lookup_decl_in_outer_ctx (ovar, ctx); + } if (nc && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) && is_omp_target (stmt)) { - var = lookup_decl_in_outer_ctx (ovar, ctx); x = build_sender_ref (c, ctx); gimplify_assign (x, build_fold_addr_expr (var), &ilist); } else if (nc) { - var = lookup_decl_in_outer_ctx (ovar, ctx); x = build_sender_ref (ovar, ctx); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP @@ -13539,7 +13702,9 @@ lower_omp_regimplify_p (tree *tp, int *walk_subtrees, tree t = *tp; /* Any variable with DECL_VALUE_EXPR needs to be regimplified. 
*/ - if (VAR_P (t) && data == NULL && DECL_HAS_VALUE_EXPR_P (t)) + if ((VAR_P (t) || TREE_CODE (t) == PARM_DECL || TREE_CODE (t) == RESULT_DECL) + && data == NULL + && DECL_HAS_VALUE_EXPR_P (t)) return t; if (task_shared_vars diff --git a/gcc/optabs.def b/gcc/optabs.def index b192a9d..41ab259 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -407,6 +407,7 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a") OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a") OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a") OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a") +OPTAB_D (vec_addsub_optab, "vec_addsub$a3") OPTAB_D (sync_add_optab, "sync_add$I$a") OPTAB_D (sync_and_optab, "sync_and$I$a") diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk index e2a9a49..e363ac7 100644 --- a/gcc/optc-save-gen.awk +++ b/gcc/optc-save-gen.awk @@ -1442,6 +1442,8 @@ checked_options["flag_omit_frame_pointer"]++ checked_options["TARGET_ALIGN_CALL"]++ checked_options["TARGET_CASE_VECTOR_PC_RELATIVE"]++ checked_options["arc_size_opt_level"]++ +# arm exceptions +checked_options["arm_fp16_format"]++ for (i = 0; i < n_opts; i++) { diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 9f850c3..242452f 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -2086,7 +2086,10 @@ finish_bitfield_representative (tree repr, tree field) /* If there was an error, the field may be not laid out correctly. Don't bother to do anything. */ if (TREE_TYPE (nextf) == error_mark_node) - return; + { + TREE_TYPE (repr) = error_mark_node; + return; + } maxsize = size_diffop (DECL_FIELD_OFFSET (nextf), DECL_FIELD_OFFSET (repr)); if (tree_fits_uhwi_p (maxsize)) diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index b2de840..9ad2094 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,51 @@ +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/101174 + * g++.dg/cpp1z/class-deduction-access3.C: New test. 
+ * g++.dg/cpp1z/class-deduction91.C: New test. + +2021-06-23 Aaron Sawdey <acsawdey@linux.ibm.com> + + * gcc.target/powerpc/pcrel-opt-inc-di.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-df.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-di.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-hi.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-qi.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-sf.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-si.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-ld-vector.c: Enable -mpcrel-opt to + test it. + * gcc.target/powerpc/pcrel-opt-st-df.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-di.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-hi.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-qi.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-sf.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-si.c: Enable -mpcrel-opt to test it. + * gcc.target/powerpc/pcrel-opt-st-vector.c: Enable -mpcrel-opt to + test it. + +2021-06-23 Xi Ruoyao <xry111@mengyan1223.wang> + + * gcc.c-torture/execute/950704-1.c: Add -fwrapv to avoid + undefined behavior. + +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/86439 + * g++.dg/cpp1z/class-deduction88.C: New test. + * g++.dg/cpp1z/class-deduction89.C: New test. + * g++.dg/cpp1z/class-deduction90.C: New test. + +2021-06-23 Uroš Bizjak <ubizjak@gmail.com> + + PR target/101175 + * gcc.target/i386/pr101175.c: New test. + +2021-06-23 Andre Vehreschild <vehre@gcc.gnu.org> + + PR fortran/100337 + * gfortran.dg/coarray_collectives_17.f90: New test.
+ 2021-06-22 Sandra Loosemore <sandra@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> diff --git a/gcc/testsuite/c-c++-common/gomp/clauses-1.c b/gcc/testsuite/c-c++-common/gomp/clauses-1.c index 105288e..682442af 100644 --- a/gcc/testsuite/c-c++-common/gomp/clauses-1.c +++ b/gcc/testsuite/c-c++-common/gomp/clauses-1.c @@ -125,20 +125,20 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, #pragma omp target parallel \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \ - nowait depend(inout: dd[0]) allocate (omp_default_mem_alloc:f) + nowait depend(inout: dd[0]) allocate (omp_default_mem_alloc:f) in_reduction(+:r2) ; #pragma omp target parallel for \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \ lastprivate (l) linear (ll:1) ordered schedule(static, 4) collapse(1) nowait depend(inout: dd[0]) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target parallel for \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \ lastprivate (l) linear (ll:1) schedule(static, 4) collapse(1) nowait depend(inout: dd[0]) order(concurrent) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target parallel for simd \ @@ -146,18 +146,18 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) 
proc_bind(spread) \ lastprivate (l) linear (ll:1) schedule(static, 4) collapse(1) \ safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0]) nontemporal(ntm) if (simd: i3) order(concurrent) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target teams \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) nowait depend(inout: dd[0]) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) ; #pragma omp target teams distribute \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \ - collapse(1) dist_schedule(static, 16) nowait depend(inout: dd[0]) allocate (omp_default_mem_alloc:f) + collapse(1) dist_schedule(static, 16) nowait depend(inout: dd[0]) allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ; #pragma omp target teams distribute parallel for \ @@ -166,7 +166,7 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, collapse(1) dist_schedule(static, 16) \ if (parallel: i2) num_threads (nth) proc_bind(spread) \ lastprivate (l) schedule(static, 4) nowait depend(inout: dd[0]) order(concurrent) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target teams distribute parallel for simd \ @@ -176,7 +176,7 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, if (parallel: i2) num_threads (nth) proc_bind(spread) \ lastprivate (l) schedule(static, 4) order(concurrent) \ safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0]) nontemporal(ntm) if (simd: i3) \ - allocate (omp_default_mem_alloc:f) + 
allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target teams distribute simd \ @@ -184,14 +184,14 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \ collapse(1) dist_schedule(static, 16) order(concurrent) \ safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0]) nontemporal(ntm) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp target simd \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ safelen(8) simdlen(4) lastprivate (l) linear(ll: 1) aligned(q: 32) reduction(+:r) \ nowait depend(inout: dd[0]) nontemporal(ntm) if(simd:i3) order(concurrent) \ - allocate (omp_default_mem_alloc:f) + allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 64; i++) ll++; #pragma omp taskgroup task_reduction(+:r2) allocate (r2) @@ -215,7 +215,7 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, order(concurrent) allocate (f) for (int i = 0; i < 64; i++) ll++; - #pragma omp target nowait depend(inout: dd[0]) + #pragma omp target nowait depend(inout: dd[0]) in_reduction(+:r2) #pragma omp teams distribute \ private(p) firstprivate (f) shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \ collapse(1) dist_schedule(static, 16) allocate (omp_default_mem_alloc: f) @@ -349,28 +349,28 @@ bar (int d, int m, int i1, int i2, int i3, int p, int *idp, int s, device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \ nowait depend(inout: dd[0]) lastprivate (l) bind(parallel) order(concurrent) collapse(1) \ - allocate (omp_default_mem_alloc: f) + allocate 
(omp_default_mem_alloc: f) in_reduction(+:r2) for (l = 0; l < 64; ++l) ; #pragma omp target parallel loop \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \ nowait depend(inout: dd[0]) lastprivate (l) order(concurrent) collapse(1) \ - allocate (omp_default_mem_alloc: f) + allocate (omp_default_mem_alloc: f) in_reduction(+:r2) for (l = 0; l < 64; ++l) ; #pragma omp target teams loop \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) nowait depend(inout: dd[0]) \ lastprivate (l) bind(teams) collapse(1) \ - allocate (omp_default_mem_alloc: f) + allocate (omp_default_mem_alloc: f) in_reduction(+:r2) for (l = 0; l < 64; ++l) ; #pragma omp target teams loop \ device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \ shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) nowait depend(inout: dd[0]) \ lastprivate (l) order(concurrent) collapse(1) \ - allocate (omp_default_mem_alloc: f) + allocate (omp_default_mem_alloc: f) in_reduction(+:r2) for (l = 0; l < 64; ++l) ; } diff --git a/gcc/testsuite/c-c++-common/gomp/target-in-reduction-1.c b/gcc/testsuite/c-c++-common/gomp/target-in-reduction-1.c new file mode 100644 index 0000000..23ed300 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/target-in-reduction-1.c @@ -0,0 +1,12 @@ +void +foo (int i, int j, int k) +{ + #pragma omp target in_reduction (+:i) private (i) /* { dg-error "'i' appears more than once in data-sharing clauses" } */ + ; + #pragma omp target private (i) in_reduction (+:i) /* { dg-error "'i' appears both in data and map clauses" } */ + ; + #pragma omp target in_reduction (+:i) firstprivate (i) /* { dg-error "'i' appears 
more than once in data-sharing clauses" } */ + ; /* { dg-error "'i' appears both in data and map clauses" "" { target *-*-* } .-1 } */ + #pragma omp target firstprivate (i) in_reduction (+:i) /* { dg-error "'i' appears both in data and map clauses" } */ + ; +} diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-access3.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access3.C new file mode 100644 index 0000000..9df9480 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-access3.C @@ -0,0 +1,20 @@ +// { dg-do compile { target c++17 } } + +template<class> +struct Cont; + +template<class T> +class Base +{ + using type = T; + friend Cont<T>; +}; + +template<class T> +struct Cont +{ + using argument_type = typename Base<T>::type; + Cont(T, argument_type); +}; + +Cont c(1, 1); diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction88.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction88.C new file mode 100644 index 0000000..be38fed --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction88.C @@ -0,0 +1,18 @@ +// PR c++/86439 +// { dg-do compile { target c++17 } } + +struct NC { + NC() = default; + NC(NC const&) = delete; + NC& operator=(NC const&) = delete; +}; + +template <int> +struct C { + C(NC const&); +}; + +C(NC) -> C<0>; + +NC nc; +C c(nc); diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction89.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction89.C new file mode 100644 index 0000000..dd89857 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction89.C @@ -0,0 +1,15 @@ +// PR c++/86439 +// { dg-do compile { target c++17 } } + +struct B { }; +struct C { }; + +template<class T> +struct A { + A(T, B); +}; + +template<class T> +A(T, C) -> A<T>; + +A a(0, {}); diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction90.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction90.C new file mode 100644 index 0000000..8b93193 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction90.C @@ -0,0 +1,16 @@ +// PR c++/86439 +// { dg-do compile { target c++17 } } + 
+struct less { }; +struct allocator { }; + +template<class T, class U = less, class V = allocator> +struct A { + A(T, U); + A(T, V); +}; + +template<class T, class U = less> +A(T, U) -> A<T>; + +A a(0, {}); // { dg-error "ambiguous" } diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction91.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction91.C new file mode 100644 index 0000000..f474c8e --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction91.C @@ -0,0 +1,16 @@ +// PR c++/101174 +// { dg-do compile { target c++17 } } + +struct S { using type = int; }; + +template<class T = int, class U = S> +struct multiset { + using type = typename U::type; + multiset(T); + multiset(U); +}; + +template<class T> +multiset(T) -> multiset<T>; + +multiset c(42); diff --git a/gcc/testsuite/gcc.c-torture/execute/950704-1.c b/gcc/testsuite/gcc.c-torture/execute/950704-1.c index f11aff8..67fe088 100644 --- a/gcc/testsuite/gcc.c-torture/execute/950704-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/950704-1.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-fwrapv" } */ int errflag; long long diff --git a/gcc/testsuite/gcc.dg/pr101170.c b/gcc/testsuite/gcc.dg/pr101170.c new file mode 100644 index 0000000..fc81062 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr101170.c @@ -0,0 +1,37 @@ +/* PR middle-end/101170 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -g" } */ + +#include <stdarg.h> + +struct S { int a; int b[4]; } s; +va_list ap; +int i; +long long l; + +struct S +foo (int x) +{ + struct S a = {}; + do + if (x) + return a; + while (1); +} + +__attribute__((noipa)) void +bar (void) +{ + for (; i; i++) + l |= va_arg (ap, long long) << s.b[i]; + if (l) + foo (l); +} + +void +baz (int v, ...) 
+{ + va_start (ap, v); + bar (); + va_end (ap); +} diff --git a/gcc/testsuite/gcc.dg/pr101172.c b/gcc/testsuite/gcc.dg/pr101172.c new file mode 100644 index 0000000..b9d098b --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr101172.c @@ -0,0 +1,20 @@ +/* PR middle-end/101172 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +union U +{ + int a[3]; + struct + { + int a : 3; + struct this_struct var; /* { dg-error "field 'var' has incomplete type" } */ + } b; +}; + +const union U hello = {.a = {1, 2, 3}}; + +void foo() +{ + int x = hello.b.a; +} diff --git a/gcc/testsuite/gcc.dg/torture/pr101105.c b/gcc/testsuite/gcc.dg/torture/pr101105.c new file mode 100644 index 0000000..9222351 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr101105.c @@ -0,0 +1,19 @@ +/* { dg-do run } */ + +short a; +int b[5][4] = {2, 2}; +int d; +short e(int f) { return f == 0 || a && f == 1 ? 0 : a; } +int main() { + int g, h; + g = 3; + for (; g >= 0; g--) { + h = 3; + for (; h >= 0; h--) + b[g][h] = b[0][1] && e(1); + } + d = b[0][1]; + if (d != 0) + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr95488-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr95488-1.c index b3674fb..dc684a1 100644 --- a/gcc/testsuite/gcc.target/i386/avx512vl-pr95488-1.c +++ b/gcc/testsuite/gcc.target/i386/avx512vl-pr95488-1.c @@ -1,10 +1,10 @@ /* PR target/pr95488 */ /* { dg-do compile } */ /* { dg-options "-O2 -mavx512bw -mavx512vl" } */ -/* { dg-final { scan-assembler-times "vpmovzxbw" 8 } } */ +/* { dg-final { scan-assembler-times "vpmovzxbw" 8 { target { ! ia32 } } } } */ /* { dg-final { scan-assembler-times "vpmullw\[^\n\]*ymm" 2 } } */ -/* { dg-final { scan-assembler-times "vpmullw\[^\n\]*xmm" 2 } } */ -/* { dg-final { scan-assembler-times "vpmovwb" 4 } } */ +/* { dg-final { scan-assembler-times "vpmullw\[^\n\]*xmm" 2 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "vpmovwb" 4 { target { ! 
ia32 } } } } */ typedef char v16qi __attribute__ ((vector_size (16))); typedef char v8qi __attribute__ ((vector_size (8))); diff --git a/gcc/testsuite/gcc.target/i386/pr101175.c b/gcc/testsuite/gcc.target/i386/pr101175.c new file mode 100644 index 0000000..ed7a081 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101175.c @@ -0,0 +1,28 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mlzcnt" } */ +/* { dg-require-effective-target lzcnt } */ + +#include "lzcnt-check.h" + +static int +foo (unsigned int v) +{ + return v ? __builtin_clz (v) : 32; +} + +/* returns -1 if x == 0 */ +int +__attribute__ ((noinline, noclone)) +bar (unsigned int x) +{ + return 31 - foo (x); +} + +static void +lzcnt_test () +{ + int r = bar (0); + + if (r != -1) + abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/pr98434-1.c b/gcc/testsuite/gcc.target/i386/pr98434-1.c new file mode 100644 index 0000000..773d3b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr98434-1.c @@ -0,0 +1,64 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -mavx512vl -O2 -mprefer-vector-width=512" } */ +/* { dg-final { scan-assembler-times {vpsravw[\t ]*%xmm} 2 { target { ! 
ia32 } } } } */ +/* { dg-final { scan-assembler-times {vpsrlvw[\t ]*%ymm} 2 } } */ +/* { dg-final { scan-assembler-times {vpsllvw[\t ]*%zmm} 2 } } */ +/* { dg-final { scan-assembler-times {vpsllvq[\t ]*%xmm} 1 } } */ +/* { dg-final { scan-assembler-times {vpsravq[\t ]*%ymm} 1 } } */ +/* { dg-final { scan-assembler-times {vpsrlvq[\t ]*%zmm} 1 } } */ + +int n; + +typedef char v8qi __attribute__((vector_size (8))); +typedef char v16qi __attribute__((vector_size (16))); +typedef char v32qi __attribute__((vector_size (32))); +typedef short v8hi __attribute__((vector_size (16))); +typedef short v16hi __attribute__((vector_size (32))); +typedef short v32hi __attribute__((vector_size (64))); +typedef long long v2di __attribute__((vector_size (16))); +typedef long long v4di __attribute__((vector_size (32))); +typedef long long v8di __attribute__((vector_size (64))); +typedef unsigned char v8uqi __attribute__((vector_size (8))); +typedef unsigned char v16uqi __attribute__((vector_size (16))); +typedef unsigned char v32uqi __attribute__((vector_size (32))); +typedef unsigned short v8uhi __attribute__((vector_size (16))); +typedef unsigned short v16uhi __attribute__((vector_size (32))); +typedef unsigned short v32uhi __attribute__((vector_size (64))); +typedef unsigned long long v2udi __attribute__((vector_size (16))); +typedef unsigned long long v4udi __attribute__((vector_size (32))); +typedef unsigned long long v8udi __attribute__((vector_size (64))); + +#define FOO(TYPE, OP, NAME) \ + __attribute__((noipa)) TYPE \ + foo_##TYPE##_##NAME (TYPE a, TYPE b) \ + { \ + return a OP b; \ + } \ + +FOO (v8qi, <<, vashl); +FOO (v8qi, >>, vashr); +FOO (v8uqi, >>, vlshr); +FOO (v16qi, <<, vashl); +FOO (v16qi, >>, vashr); +FOO (v16uqi, >>, vlshr); +FOO (v32qi, <<, vashl); +FOO (v32qi, >>, vashr); +FOO (v32uqi, >>, vlshr); +FOO (v8hi, <<, vashl); +FOO (v8hi, >>, vashr); +FOO (v8uhi, >>, vlshr); +FOO (v16hi, <<, vashl); +FOO (v16hi, >>, vashr); +FOO (v16uhi, >>, vlshr); +FOO (v32hi, <<, 
vashl); +FOO (v32hi, >>, vashr); +FOO (v32uhi, >>, vlshr); +FOO (v2di, <<, vashl); +FOO (v2di, >>, vashr); +FOO (v2udi, >>, vlshr); +FOO (v4di, <<, vashl); +FOO (v4di, >>, vashr); +FOO (v4udi, >>, vlshr); +FOO (v8di, <<, vashl); +FOO (v8di, >>, vashr); +FOO (v8udi, >>, vlshr); diff --git a/gcc/testsuite/gcc.target/i386/pr98434-2.c b/gcc/testsuite/gcc.target/i386/pr98434-2.c new file mode 100644 index 0000000..4878e70 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr98434-2.c @@ -0,0 +1,129 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mprefer-vector-width=512 -mavx512vl -mavx512bw" } */ +/* { dg-require-effective-target avx512bw } */ +/* { dg-require-effective-target avx512vl } */ + +#include "pr98434-1.c" +void test (void); +#define DO_TEST test +#define AVX512VL +#define AVX512BW +#include "avx512-check.h" + + +typedef char int8; +typedef unsigned char uint8; +typedef short int16; +typedef unsigned short uint16; +typedef long long int64; +typedef unsigned long long uint64; + +#define F_EMULATE(TYPE, SIZE, OP, NAME) \ + __attribute__((noipa, optimize("-fno-tree-vectorize"))) void \ + emulate_##SIZE##_##TYPE##_##NAME (TYPE *a, \ + TYPE *b, \ + TYPE *c) \ + { \ + int i; \ + for (i = 0; i < SIZE; i++) \ + { \ + a[i] = b[i] OP c[i]; \ + } \ + } + +F_EMULATE (int8, 8, <<, vashl); +F_EMULATE (int8, 8, >>, vashr); +F_EMULATE (uint8, 8, >>, vlshr); +F_EMULATE (int8, 16, <<, vashl); +F_EMULATE (int8, 16, >>, vashr); +F_EMULATE (uint8, 16, >>, vlshr); +F_EMULATE (int8, 32, <<, vashl); +F_EMULATE (int8, 32, >>, vashr); +F_EMULATE (uint8, 32, >>, vlshr); +F_EMULATE (int16, 8, <<, vashl); +F_EMULATE (int16, 8, >>, vashr); +F_EMULATE (uint16, 8, >>, vlshr); +F_EMULATE (int16, 16, <<, vashl); +F_EMULATE (int16, 16, >>, vashr); +F_EMULATE (uint16, 16, >>, vlshr); +F_EMULATE (int16, 32, <<, vashl); +F_EMULATE (int16, 32, >>, vashr); +F_EMULATE (uint16, 32, >>, vlshr); +F_EMULATE (int64, 2, <<, vashl); +F_EMULATE (int64, 2, >>, vashr); +F_EMULATE (uint64, 2, >>, vlshr); 
+F_EMULATE (int64, 4, <<, vashl); +F_EMULATE (int64, 4, >>, vashr); +F_EMULATE (uint64, 4, >>, vlshr); +F_EMULATE (int64, 8, <<, vashl); +F_EMULATE (int64, 8, >>, vashr); +F_EMULATE (uint64, 8, >>, vlshr); + +#define VSHIFT(VTYPE, NAME, src1, src2) \ + foo_##VTYPE##_##NAME (src1, src2) + +#define EMULATE(SIZE, TYPE, NAME, dst, src1, src2) \ + emulate_##SIZE##_##TYPE##_##NAME (dst, src1, src2) + +#define F_TEST_SHIFT(VTYPE, VTYPEU, TYPE, TYPEU, SIZE) \ + __attribute__((noipa, optimize("-fno-tree-vectorize"))) void \ + test_##VTYPE ()\ + {\ + TYPE src1[SIZE], src2[SIZE], ref[SIZE]; \ + TYPEU usrc1[SIZE], usrc2[SIZE], uref[SIZE]; \ + VTYPE dst; \ + VTYPEU udst; \ + int i;\ + for (i = 0; i < SIZE; i++)\ + {\ + dst[i] = ref[i] = -i; \ + src1[i] = -(i + SIZE); \ + src2[i] = i % 8; \ + udst[i] = uref[i] = i; \ + usrc1[i] = (i + SIZE); \ + usrc2[i] = (i % 8); \ + }\ + EMULATE(SIZE, TYPE, vashl, ref, src1, src2); \ + dst = VSHIFT(VTYPE, vashl, *((VTYPE* )&src1[0]), *((VTYPE*) &src2[0])); \ + for (i = 0; i < SIZE; i++)\ + {\ + if(dst[i] != ref[i]) __builtin_abort();\ + }\ + EMULATE(SIZE, TYPE, vashr, ref, src1, src2); \ + dst = VSHIFT(VTYPE, vashr, *((VTYPE* )&src1[0]), *((VTYPE*) &src2[0])); \ + for (i = 0; i < SIZE; i++)\ + {\ + if(dst[i] != ref[i]) __builtin_abort();\ + }\ + EMULATE(SIZE, TYPEU, vlshr, uref, usrc1, usrc2); \ + udst = VSHIFT(VTYPEU, vlshr, *((VTYPEU* )&usrc1[0]), *((VTYPEU*) &usrc2[0])); \ + for (i = 0; i < SIZE; i++)\ + {\ + if(udst[i] != uref[i]) __builtin_abort();\ + }\ + } + +F_TEST_SHIFT (v8qi, v8uqi, int8, uint8, 8); +F_TEST_SHIFT (v16qi, v16uqi, int8, uint8, 16); +F_TEST_SHIFT (v32qi, v32uqi, int8, uint8, 32); +F_TEST_SHIFT (v8hi, v8uhi, int16, uint16, 8); +F_TEST_SHIFT (v16hi, v16uhi, int16, uint16, 16); +F_TEST_SHIFT (v32hi, v32uhi, int16, uint16, 32); +F_TEST_SHIFT (v2di, v2udi, int64, uint64, 2); +F_TEST_SHIFT (v4di, v4udi, int64, uint64, 4); +F_TEST_SHIFT (v8di, v8udi, int64, uint64, 8); + + +void +test (void) +{ + test_v8qi (); + test_v16qi 
(); + test_v32qi (); + test_v8hi (); + test_v16hi (); + test_v32hi (); + test_v2di (); + test_v4di (); + test_v8di (); +} diff --git a/gcc/testsuite/gcc.target/i386/vect-addsub-2.c b/gcc/testsuite/gcc.target/i386/vect-addsub-2.c new file mode 100644 index 0000000..a6b9414 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/vect-addsub-2.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target sse3 } */ +/* { dg-options "-O3 -msse3" } */ + +float a[1024], b[1024]; + +void foo() +{ + for (int i = 0; i < 256; i++) + { + a[4*i+0] = a[4*i+0] - b[4*i+0]; + a[4*i+1] = a[4*i+1] + b[4*i+1]; + a[4*i+2] = a[4*i+2] - b[4*i+2]; + a[4*i+3] = a[4*i+3] + b[4*i+3]; + } +} + +/* We should be able to vectorize this with SLP using the addsub + SLP pattern. */ +/* { dg-final { scan-assembler "addsubps" } } */ +/* { dg-final { scan-assembler-not "shuf" } } */ diff --git a/gcc/testsuite/gcc.target/i386/vect-addsub-3.c b/gcc/testsuite/gcc.target/i386/vect-addsub-3.c new file mode 100644 index 0000000..b27ee56 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/vect-addsub-3.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ +/* { dg-require-effective-target sse3 } */ +/* { dg-options "-O3 -msse3" } */ + +#ifndef CHECK_H +#define CHECK_H "sse3-check.h" +#endif + +#ifndef TEST +#define TEST sse3_test +#endif + +#include CHECK_H + +double a[2], b[2], c[2]; + +void __attribute__((noipa)) +foo () +{ + /* When we want to use addsubpd we have to keep permuting both + loads, if instead we blend the result of an add and a sub we + can combine the blend with the permute. Both are similar in cost, + verify we did not wrongly apply both. */ + double tem0 = a[1] - b[1]; + double tem1 = a[0] + b[0]; + c[0] = tem0; + c[1] = tem1; +} + +static void +TEST (void) +{ + a[0] = 1.; a[1] = 2.; + b[0] = 2.; b[1] = 4.; + foo (); + if (c[0] != -2. || c[1] != 3.) 
+ __builtin_abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/vect-addsubv2df.c b/gcc/testsuite/gcc.target/i386/vect-addsubv2df.c new file mode 100644 index 0000000..547485d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/vect-addsubv2df.c @@ -0,0 +1,42 @@ +/* { dg-do run } */ +/* { dg-require-effective-target sse3 } */ +/* { dg-options "-O3 -msse3 -fdump-tree-slp2" } */ + +#ifndef CHECK_H +#define CHECK_H "sse3-check.h" +#endif + +#ifndef TEST +#define TEST sse3_test +#endif + +#include CHECK_H + +double x[2], y[2], z[2]; +void __attribute__((noipa)) foo () +{ + x[0] = y[0] - z[0]; + x[1] = y[1] + z[1]; +} +void __attribute__((noipa)) bar () +{ + x[0] = y[0] + z[0]; + x[1] = y[1] - z[1]; +} +static void +TEST (void) +{ + for (int i = 0; i < 2; ++i) + { + y[i] = i + 1; + z[i] = 2 * i + 1; + } + foo (); + if (x[0] != 0 || x[1] != 5) + __builtin_abort (); + bar (); + if (x[0] != 2 || x[1] != -1) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-times "ADDSUB" 1 "slp2" } } */ diff --git a/gcc/testsuite/gcc.target/i386/vect-addsubv4df.c b/gcc/testsuite/gcc.target/i386/vect-addsubv4df.c new file mode 100644 index 0000000..e0a1b3d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/vect-addsubv4df.c @@ -0,0 +1,36 @@ +/* { dg-do run { target avx_runtime } } */ +/* { dg-require-effective-target avx } */ +/* { dg-options "-O3 -mavx -fdump-tree-slp2" } */ + +double x[4], y[4], z[4]; +void __attribute__((noipa)) foo () +{ + x[0] = y[0] - z[0]; + x[1] = y[1] + z[1]; + x[2] = y[2] - z[2]; + x[3] = y[3] + z[3]; +} +void __attribute__((noipa)) bar () +{ + x[0] = y[0] + z[0]; + x[1] = y[1] - z[1]; + x[2] = y[2] + z[2]; + x[3] = y[3] - z[3]; +} +int main() +{ + for (int i = 0; i < 4; ++i) + { + y[i] = i + 1; + z[i] = 2 * i + 1; + } + foo (); + if (x[0] != 0 || x[1] != 5 || x[2] != -2 || x[3] != 11) + __builtin_abort (); + bar (); + if (x[0] != 2 || x[1] != -1 || x[2] != 8 || x[3] != -3) + __builtin_abort (); + return 0; +} + +/* { dg-final { scan-tree-dump-times 
"ADDSUB" 1 "slp2" } } */
diff --git a/gcc/testsuite/gcc.target/i386/vect-addsubv4sf.c b/gcc/testsuite/gcc.target/i386/vect-addsubv4sf.c
new file mode 100644
index 0000000..b524f0c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vect-addsubv4sf.c
@@ -0,0 +1,46 @@
+/* { dg-do run } */
+/* { dg-require-effective-target sse3 } */
+/* { dg-options "-O3 -msse3 -fdump-tree-slp2" } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse3-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse3_test
+#endif
+
+#include CHECK_H
+
+float x[4], y[4], z[4];
+void __attribute__((noipa)) foo ()
+{
+  x[0] = y[0] - z[0];
+  x[1] = y[1] + z[1];
+  x[2] = y[2] - z[2];
+  x[3] = y[3] + z[3];
+}
+void __attribute__((noipa)) bar ()
+{
+  x[0] = y[0] + z[0];
+  x[1] = y[1] - z[1];
+  x[2] = y[2] + z[2];
+  x[3] = y[3] - z[3];
+}
+static void
+TEST (void)
+{
+  for (int i = 0; i < 4; ++i)
+    {
+      y[i] = i + 1;
+      z[i] = 2 * i + 1;
+    }
+  foo ();
+  if (x[0] != 0 || x[1] != 5 || x[2] != -2 || x[3] != 11)
+    __builtin_abort ();
+  bar ();
+  if (x[0] != 2 || x[1] != -1 || x[2] != 8 || x[3] != -3)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-times "ADDSUB" 1 "slp2" } } */
diff --git a/gcc/testsuite/gcc.target/i386/vect-addsubv8sf.c b/gcc/testsuite/gcc.target/i386/vect-addsubv8sf.c
new file mode 100644
index 0000000..0eed33b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vect-addsubv8sf.c
@@ -0,0 +1,46 @@
+/* { dg-do run { target avx_runtime } } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O3 -mavx -fdump-tree-slp2" } */
+
+float x[8], y[8], z[8];
+void __attribute__((noipa)) foo ()
+{
+  x[0] = y[0] - z[0];
+  x[1] = y[1] + z[1];
+  x[2] = y[2] - z[2];
+  x[3] = y[3] + z[3];
+  x[4] = y[4] - z[4];
+  x[5] = y[5] + z[5];
+  x[6] = y[6] - z[6];
+  x[7] = y[7] + z[7];
+}
+void __attribute__((noipa)) bar ()
+{
+  x[0] = y[0] + z[0];
+  x[1] = y[1] - z[1];
+  x[2] = y[2] + z[2];
+  x[3] = y[3] - z[3];
+  x[4] = y[4] + z[4];
+  x[5] = y[5] - z[5];
+  x[6] = y[6] + z[6];
+  x[7] = y[7] - z[7];
+}
+int main()
+{
+  for (int i = 0; i < 8; ++i)
+    {
+      y[i] = i + 1;
+      z[i] = 2 * i + 1;
+    }
+  foo ();
+  if (x[0] != 0 || x[1] != 5 || x[2] != -2 || x[3] != 11
+      || x[4] != -4 || x[5] != 17 || x[6] != -6 || x[7] != 23)
+    __builtin_abort ();
+  bar ();
+  if (x[0] != 2 || x[1] != -1 || x[2] != 8 || x[3] != -3
+      || x[4] != 14 || x[5] != -5 || x[6] != 20 || x[7] != -7)
+    __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "ADDSUB" 1 "slp2" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
index c82041c..6272f5c 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE unsigned int
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
index d35862f..0dcab31 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE double
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
index 7e1ff99..95b60f3 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE long long
 #define LARGE 0x20000
diff --git
a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
index 4143aeb..4a62dfbd 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE unsigned short
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
index 30d3236..3a7aad4 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE unsigned char
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
index 9d1e2a1..cb76bed 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE float
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
index 17be6fa..ad011d6 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE int
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
index 8c12aea..c8f70b5 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE vector double
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
index d795d35..686a376 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE double
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-di.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-di.c
index bf57de4..fefe6d3 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-di.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-di.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE long long
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-hi.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-hi.c
index 8822e76..d285686 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-hi.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-hi.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE unsigned short
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-qi.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-qi.c
index 2f75683..f617ffc 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-qi.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-qi.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE unsigned char
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-sf.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-sf.c
index 3dd88aa..6eb20a3 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-sf.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-sf.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE float
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-si.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-si.c
index 78dc812..0cc0b30 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-si.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-si.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE int
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-vector.c b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-vector.c
index 2c602eb..d760819 100644
--- a/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-vector.c
+++ b/gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-vector.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_pcrel } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel-opt" } */
 
 #define TYPE vector double
 #define LARGE 0x20000
diff --git a/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c b/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c
index b2ad9f5b..874ceb9 100644
--- a/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c
+++ b/gcc/testsuite/gcc.target/s390/mnop-mcount-m31-mzarch.c
@@ -4,5 +4,5 @@
 void profileme (void)
 {
-  /* { dg-final { scan-assembler "NOPs for -mnop-mcount \\(10 halfwords\\)\n.*brcl\t0,0\n.*brcl\t0,0\n.*brcl\t0,0\n.*bcr\t0,0" } } */
+  /* { dg-final { scan-assembler "NOPs for -mnop-mcount \\(7 halfwords\\)\n.*brcl\t0,0\n.*brcl\t0,0\n.*bcr\t0,0" } } */
 }
diff --git a/gcc/testsuite/gcc.target/s390/mnop-mcount-m64.c b/gcc/testsuite/gcc.target/s390/mnop-mcount-m64.c
index c0e3c4e..0d45834 100644
--- a/gcc/testsuite/gcc.target/s390/mnop-mcount-m64.c
+++ b/gcc/testsuite/gcc.target/s390/mnop-mcount-m64.c
@@ -4,5 +4,5 @@
 void profileme (void)
 {
-  /* { dg-final { scan-assembler "NOPs for -mnop-mcount \\(12 halfwords\\)\n.*brcl\t0,0\n.*brcl\t0,0\n.*brcl\t0,0\n.*brcl\t0,0" } } */
+  /* { dg-final { scan-assembler "NOPs for -mnop-mcount \\(9 halfwords\\)\n.*brcl\t0,0\n.*brcl\t0,0\n.*brcl\t0,0" } } */
 }
diff --git a/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90 b/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90
new file mode 100644
index 0000000..84a6645
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_collectives_17.f90
@@ -0,0 +1,42 @@
+! { dg-do run }
+! { dg-options "-fcoarray=single" }
+!
+! PR 100337
+! Test case inspired by code submitted by Brad Richardson
+
+program main
+    implicit none
+
+    integer, parameter :: MESSAGE = 42
+    integer :: result
+
+    call myco_broadcast(MESSAGE, result, 1)
+
+    if (result /= MESSAGE) error stop 1
+contains
+    subroutine myco_broadcast(m, r, source_image, stat, errmsg)
+        integer, intent(in) :: m
+        integer, intent(out) :: r
+        integer, intent(in) :: source_image
+        integer, intent(out), optional :: stat
+        character(len=*), intent(inout), optional :: errmsg
+
+        integer :: data_length
+
+        data_length = 1
+
+        call co_broadcast(data_length, source_image, stat, errmsg)
+
+        if (present(stat)) then
+            if (stat /= 0) return
+        end if
+
+        if (this_image() == source_image) then
+            r = m
+        end if
+
+        call co_broadcast(r, source_image, stat, errmsg)
+    end subroutine
+
+end program
+
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4a0dc3b..c5f6ba5 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3491,17 +3491,11 @@ setup_one_parameter (copy_body_data *id, tree p, tree value, tree fn,
      automatically replaced by the VAR_DECL.  */
   insert_decl_map (id, p, var);
 
-  /* Even if P was TREE_READONLY, the new VAR should not be.
-     In the original code, we would have constructed a
-     temporary, and then the function body would have never
-     changed the value of P.  However, now, we will be
-     constructing VAR directly.  The constructor body may
-     change its value multiple times as it is being
-     constructed.  Therefore, it must not be TREE_READONLY;
-     the back-end assumes that TREE_READONLY variable is
-     assigned to only once.  */
-  if (TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (p)))
-    TREE_READONLY (var) = 0;
+  /* Even if P was TREE_READONLY, the new VAR should not be.  In the original
+     code, we would have constructed a temporary, and then the function body
+     would have never changed the value of P.  However, now, we will be
+     constructing VAR directly.  Therefore, it must not be TREE_READONLY.  */
+  TREE_READONLY (var) = 0;
 
   tree rhs = value;
   if (value
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index ac1674d..a4ebf22 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -375,13 +375,156 @@ struct component
   struct component *next;
 };
 
-/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */
+/* A class to encapsulate the global states used for predictive
+   commoning work on top of one given LOOP.  */
 
-static bitmap looparound_phis;
+class pcom_worker
+{
+public:
+  pcom_worker (loop_p l) : loop (l), chains (vNULL), cache (NULL)
+  {
+    dependences.create (10);
+    datarefs.create (10);
+  }
+
+  ~pcom_worker ()
+  {
+    free_data_refs (datarefs);
+    free_dependence_relations (dependences);
+    free_affine_expand_cache (&cache);
+    release_chains ();
+  }
+
+  pcom_worker (const pcom_worker &) = delete;
+  pcom_worker &operator= (const pcom_worker &) = delete;
+
+  /* Performs predictive commoning.  */
+  unsigned tree_predictive_commoning_loop (bool allow_unroll_p);
+
+  /* Perform the predictive commoning optimization for chains, make this
+     public for being called in callback execute_pred_commoning_cbck.  */
+  void execute_pred_commoning (bitmap tmp_vars);
+
+private:
+  /* The pointer to the given loop.  */
+  loop_p loop;
+
+  /* All data references.  */
+  vec<data_reference_p> datarefs;
+
+  /* All data dependences.  */
+  vec<ddr_p> dependences;
+
+  /* All chains.  */
+  vec<chain_p> chains;
+
+  /* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */
+  auto_bitmap looparound_phis;
+
+  typedef hash_map<tree, name_expansion *> tree_expand_map_t;
+  /* Cache used by tree_to_aff_combination_expand.  */
+  tree_expand_map_t *cache;
+
+  /* Splits dependence graph to components.  */
+  struct component *split_data_refs_to_components ();
+
+  /* Check the conditions on references inside each of components COMPS,
+     and remove the unsuitable components from the list.
*/ + struct component *filter_suitable_components (struct component *comps); + + /* Find roots of the values and determine distances in components COMPS, + and separates the references to chains. */ + void determine_roots (struct component *comps); + + /* Prepare initializers for chains, and free chains that cannot + be used because the initializers might trap. */ + void prepare_initializers (); + + /* Generates finalizer memory reference for chains. Returns true if + finalizer code generation for chains breaks loop closed ssa form. */ + bool prepare_finalizers (); + + /* Try to combine the chains. */ + void try_combine_chains (); + + /* Frees CHAINS. */ + void release_chains (); + + /* Frees a chain CHAIN. */ + void release_chain (chain_p chain); -/* Cache used by tree_to_aff_combination_expand. */ + /* Prepare initializers for CHAIN. Returns false if this is impossible + because one of these initializers may trap, true otherwise. */ + bool prepare_initializers_chain (chain_p chain); -static hash_map<tree, name_expansion *> *name_expansions; + /* Generates finalizer memory references for CHAIN. Returns true + if finalizer code for CHAIN can be generated, otherwise false. */ + bool prepare_finalizers_chain (chain_p chain); + + /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */ + void aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset); + + /* Determines number of iterations of the innermost enclosing loop before + B refers to exactly the same location as A and stores it to OFF. */ + bool determine_offset (struct data_reference *a, struct data_reference *b, + poly_widest_int *off); + + /* Returns true if the component COMP satisfies the conditions + described in 2) at the beginning of this file. */ + bool suitable_component_p (struct component *comp); + + /* Returns true if REF is a valid initializer for ROOT with given + DISTANCE (in iterations of the innermost enclosing loop). 
*/ + bool valid_initializer_p (struct data_reference *ref, unsigned distance, + struct data_reference *root); + + /* Finds looparound phi node of loop that copies the value of REF. */ + gphi *find_looparound_phi (dref ref, dref root); + + /* Find roots of the values and determine distances in the component + COMP. The references are redistributed into chains. */ + void determine_roots_comp (struct component *comp); + + /* For references in CHAIN that are copied around the loop, add the + results of such copies to the chain. */ + void add_looparound_copies (chain_p chain); + + /* Returns the single statement in that NAME is used, excepting + the looparound phi nodes contained in one of the chains. */ + gimple *single_nonlooparound_use (tree name); + + /* Remove statement STMT, as well as the chain of assignments in that + it is used. */ + void remove_stmt (gimple *stmt); + + /* Perform the predictive commoning optimization for a chain CHAIN. */ + void execute_pred_commoning_chain (chain_p chain, bitmap tmp_vars); + + /* Returns the modify statement that uses NAME. */ + gimple *find_use_stmt (tree *name); + + /* If the operation used in STMT is associative and commutative, go + through the tree of the same operations and returns its root. */ + gimple *find_associative_operation_root (gimple *stmt, unsigned *distance); + + /* Returns the common statement in that NAME1 and NAME2 have a use. */ + gimple *find_common_use_stmt (tree *name1, tree *name2); + + /* Checks whether R1 and R2 are combined together using CODE, with the + result in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order + R2 CODE R1 if it is true. */ + bool combinable_refs_p (dref r1, dref r2, enum tree_code *code, bool *swap, + tree *rslt_type); + + /* Reassociates the expression in that NAME1 and NAME2 are used so that + they are combined in a single statement, and returns this statement. 
*/ + gimple *reassociate_to_the_same_stmt (tree name1, tree name2); + + /* Returns the statement that combines references R1 and R2. */ + gimple *stmt_combining_refs (dref r1, dref r2); + + /* Tries to combine chains CH1 and CH2 together. */ + chain_p combine_chains (chain_p ch1, chain_p ch2); +}; /* Dumps data reference REF to FILE. */ @@ -540,8 +683,8 @@ dump_components (FILE *file, struct component *comps) /* Frees a chain CHAIN. */ -static void -release_chain (chain_p chain) +void +pcom_worker::release_chain (chain_p chain) { dref ref; unsigned i; @@ -567,8 +710,8 @@ release_chain (chain_p chain) /* Frees CHAINS. */ -static void -release_chains (vec<chain_p> chains) +void +pcom_worker::release_chains () { unsigned i; chain_p chain; @@ -672,14 +815,14 @@ suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step) /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */ -static void -aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset) +void +pcom_worker::aff_combination_dr_offset (struct data_reference *dr, + aff_tree *offset) { tree type = TREE_TYPE (DR_OFFSET (dr)); aff_tree delta; - tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset, - &name_expansions); + tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset, &cache); aff_combination_const (&delta, type, wi::to_poly_widest (DR_INIT (dr))); aff_combination_add (offset, &delta); } @@ -690,9 +833,9 @@ aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset) returns false, otherwise returns true. Both A and B are assumed to satisfy suitable_reference_p. 
*/ -static bool -determine_offset (struct data_reference *a, struct data_reference *b, - poly_widest_int *off) +bool +pcom_worker::determine_offset (struct data_reference *a, + struct data_reference *b, poly_widest_int *off) { aff_tree diff, baseb, step; tree typea, typeb; @@ -726,7 +869,7 @@ determine_offset (struct data_reference *a, struct data_reference *b, aff_combination_add (&diff, &baseb); tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)), - &step, &name_expansions); + &step, &cache); return aff_combination_constant_multiple_p (&diff, &step, off); } @@ -747,17 +890,39 @@ last_always_executed_block (class loop *loop) return last; } -/* Splits dependence graph on DATAREFS described by DEPENDS to components. */ +/* RAII class for comp_father and comp_size usage. */ + +class comp_ptrs +{ +public: + unsigned *comp_father; + unsigned *comp_size; + + comp_ptrs (unsigned n) + { + comp_father = XNEWVEC (unsigned, n + 1); + comp_size = XNEWVEC (unsigned, n + 1); + } + + ~comp_ptrs () + { + free (comp_father); + free (comp_size); + } + + comp_ptrs (const comp_ptrs &) = delete; + comp_ptrs &operator= (const comp_ptrs &) = delete; +}; + +/* Splits dependence graph on DATAREFS described by DEPENDENCES to + components. */ -static struct component * -split_data_refs_to_components (class loop *loop, - vec<data_reference_p> datarefs, - vec<ddr_p> depends) +struct component * +pcom_worker::split_data_refs_to_components () { unsigned i, n = datarefs.length (); unsigned ca, ia, ib, bad; - unsigned *comp_father = XNEWVEC (unsigned, n + 1); - unsigned *comp_size = XNEWVEC (unsigned, n + 1); + comp_ptrs ptrs (n); struct component **comps; struct data_reference *dr, *dra, *drb; struct data_dependence_relation *ddr; @@ -771,22 +936,20 @@ split_data_refs_to_components (class loop *loop, FOR_EACH_VEC_ELT (datarefs, i, dr) { if (!DR_REF (dr)) - { /* A fake reference for call or asm_expr that may clobber memory; just fail. 
*/ - goto end; - } + return NULL; /* predcom pass isn't prepared to handle calls with data references. */ if (is_gimple_call (DR_STMT (dr))) - goto end; + return NULL; dr->aux = (void *) (size_t) i; - comp_father[i] = i; - comp_size[i] = 1; + ptrs.comp_father[i] = i; + ptrs.comp_size[i] = 1; } /* A component reserved for the "bad" data references. */ - comp_father[n] = n; - comp_size[n] = 1; + ptrs.comp_father[n] = n; + ptrs.comp_size[n] = 1; FOR_EACH_VEC_ELT (datarefs, i, dr) { @@ -795,11 +958,11 @@ split_data_refs_to_components (class loop *loop, if (!suitable_reference_p (dr, &dummy)) { ia = (unsigned) (size_t) dr->aux; - merge_comps (comp_father, comp_size, n, ia); + merge_comps (ptrs.comp_father, ptrs.comp_size, n, ia); } } - FOR_EACH_VEC_ELT (depends, i, ddr) + FOR_EACH_VEC_ELT (dependences, i, ddr) { poly_widest_int dummy_off; @@ -816,12 +979,12 @@ split_data_refs_to_components (class loop *loop, || DDR_NUM_DIST_VECTS (ddr) == 0)) eliminate_store_p = false; - ia = component_of (comp_father, (unsigned) (size_t) dra->aux); - ib = component_of (comp_father, (unsigned) (size_t) drb->aux); + ia = component_of (ptrs.comp_father, (unsigned) (size_t) dra->aux); + ib = component_of (ptrs.comp_father, (unsigned) (size_t) drb->aux); if (ia == ib) continue; - bad = component_of (comp_father, n); + bad = component_of (ptrs.comp_father, n); /* If both A and B are reads, we may ignore unsuitable dependences. 
*/ if (DR_IS_READ (dra) && DR_IS_READ (drb)) @@ -845,7 +1008,7 @@ split_data_refs_to_components (class loop *loop, else if (!determine_offset (dra, drb, &dummy_off)) { bitmap_set_bit (no_store_store_comps, ib); - merge_comps (comp_father, comp_size, bad, ia); + merge_comps (ptrs.comp_father, ptrs.comp_size, bad, ia); continue; } } @@ -859,7 +1022,7 @@ split_data_refs_to_components (class loop *loop, else if (!determine_offset (dra, drb, &dummy_off)) { bitmap_set_bit (no_store_store_comps, ia); - merge_comps (comp_father, comp_size, bad, ib); + merge_comps (ptrs.comp_father, ptrs.comp_size, bad, ib); continue; } } @@ -867,12 +1030,12 @@ split_data_refs_to_components (class loop *loop, && ia != bad && ib != bad && !determine_offset (dra, drb, &dummy_off)) { - merge_comps (comp_father, comp_size, bad, ia); - merge_comps (comp_father, comp_size, bad, ib); + merge_comps (ptrs.comp_father, ptrs.comp_size, bad, ia); + merge_comps (ptrs.comp_father, ptrs.comp_size, bad, ib); continue; } - merge_comps (comp_father, comp_size, ia, ib); + merge_comps (ptrs.comp_father, ptrs.comp_size, ia, ib); } if (eliminate_store_p) @@ -886,11 +1049,11 @@ split_data_refs_to_components (class loop *loop, } comps = XCNEWVEC (struct component *, n); - bad = component_of (comp_father, n); + bad = component_of (ptrs.comp_father, n); FOR_EACH_VEC_ELT (datarefs, i, dr) { ia = (unsigned) (size_t) dr->aux; - ca = component_of (comp_father, ia); + ca = component_of (ptrs.comp_father, ia); if (ca == bad) continue; @@ -898,7 +1061,7 @@ split_data_refs_to_components (class loop *loop, if (!comp) { comp = XCNEW (struct component); - comp->refs.create (comp_size[ca]); + comp->refs.create (ptrs.comp_size[ca]); comp->eliminate_store_p = eliminate_store_p; comps[ca] = comp; } @@ -921,7 +1084,7 @@ split_data_refs_to_components (class loop *loop, bitmap_iterator bi; EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi) { - ca = component_of (comp_father, ia); + ca = component_of (ptrs.comp_father, ia); if 
(ca != bad) comps[ca]->eliminate_store_p = false; } @@ -937,19 +1100,14 @@ split_data_refs_to_components (class loop *loop, } } free (comps); - -end: - free (comp_father); - free (comp_size); return comp_list; } /* Returns true if the component COMP satisfies the conditions - described in 2) at the beginning of this file. LOOP is the current - loop. */ + described in 2) at the beginning of this file. */ -static bool -suitable_component_p (class loop *loop, struct component *comp) +bool +pcom_worker::suitable_component_p (struct component *comp) { unsigned i; dref a, first; @@ -1002,17 +1160,17 @@ suitable_component_p (class loop *loop, struct component *comp) /* Check the conditions on references inside each of components COMPS, and remove the unsuitable components from the list. The new list of components is returned. The conditions are described in 2) at - the beginning of this file. LOOP is the current loop. */ + the beginning of this file. */ -static struct component * -filter_suitable_components (class loop *loop, struct component *comps) +struct component * +pcom_worker::filter_suitable_components (struct component *comps) { struct component **comp, *act; for (comp = &comps; *comp; ) { act = *comp; - if (suitable_component_p (loop, act)) + if (suitable_component_p (act)) comp = &act->next; else { @@ -1205,9 +1363,9 @@ name_for_ref (dref ref) /* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in iterations of the innermost enclosing loop). 
*/ -static bool -valid_initializer_p (struct data_reference *ref, - unsigned distance, struct data_reference *root) +bool +pcom_worker::valid_initializer_p (struct data_reference *ref, unsigned distance, + struct data_reference *root) { aff_tree diff, base, step; poly_widest_int off; @@ -1234,7 +1392,7 @@ valid_initializer_p (struct data_reference *ref, aff_combination_add (&diff, &base); tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)), - &step, &name_expansions); + &step, &cache); if (!aff_combination_constant_multiple_p (&diff, &step, &off)) return false; @@ -1244,13 +1402,13 @@ valid_initializer_p (struct data_reference *ref, return true; } -/* Finds looparound phi node of LOOP that copies the value of REF, and if its +/* Finds looparound phi node of loop that copies the value of REF, and if its initial value is correct (equal to initial value of REF shifted by one iteration), returns the phi node. Otherwise, NULL_TREE is returned. ROOT is the root of the current chain. */ -static gphi * -find_looparound_phi (class loop *loop, dref ref, dref root) +gphi * +pcom_worker::find_looparound_phi (dref ref, dref root) { tree name, init, init_ref; gphi *phi = NULL; @@ -1333,13 +1491,13 @@ insert_looparound_copy (chain_p chain, dref ref, gphi *phi) } } -/* For references in CHAIN that are copied around the LOOP (created previously +/* For references in CHAIN that are copied around the loop (created previously by PRE, or by user), add the results of such copies to the chain. This enables us to remove the copies by unrolling, and may need less registers (also, it may allow us to combine chains together). 
*/ -static void -add_looparound_copies (class loop *loop, chain_p chain) +void +pcom_worker::add_looparound_copies (chain_p chain) { unsigned i; dref ref, root = get_chain_root (chain); @@ -1350,7 +1508,7 @@ add_looparound_copies (class loop *loop, chain_p chain) FOR_EACH_VEC_ELT (chain->refs, i, ref) { - phi = find_looparound_phi (loop, ref, root); + phi = find_looparound_phi (ref, root); if (!phi) continue; @@ -1360,13 +1518,10 @@ add_looparound_copies (class loop *loop, chain_p chain) } /* Find roots of the values and determine distances in the component COMP. - The references are redistributed into CHAINS. LOOP is the current - loop. */ + The references are redistributed into chains. */ -static void -determine_roots_comp (class loop *loop, - struct component *comp, - vec<chain_p> *chains) +void +pcom_worker::determine_roots_comp (struct component *comp) { unsigned i; dref a; @@ -1378,7 +1533,7 @@ determine_roots_comp (class loop *loop, if (comp->comp_step == RS_INVARIANT) { chain = make_invariant_chain (comp); - chains->safe_push (chain); + chains.safe_push (chain); return; } @@ -1422,8 +1577,8 @@ determine_roots_comp (class loop *loop, { if (nontrivial_chain_p (chain)) { - add_looparound_copies (loop, chain); - chains->safe_push (chain); + add_looparound_copies (chain); + chains.safe_push (chain); } else release_chain (chain); @@ -1443,24 +1598,23 @@ determine_roots_comp (class loop *loop, if (nontrivial_chain_p (chain)) { - add_looparound_copies (loop, chain); - chains->safe_push (chain); + add_looparound_copies (chain); + chains.safe_push (chain); } else release_chain (chain); } /* Find roots of the values and determine distances in components COMPS, and - separates the references to CHAINS. LOOP is the current loop. */ + separates the references to chains. 
 */
-static void
-determine_roots (class loop *loop,
-                 struct component *comps, vec<chain_p> *chains)
+void
+pcom_worker::determine_roots (struct component *comps)
 {
   struct component *comp;

   for (comp = comps; comp; comp = comp->next)
-    determine_roots_comp (loop, comp, chains);
+    determine_roots_comp (comp);
 }

 /* Replace the reference in statement STMT with temporary variable
@@ -2027,8 +2181,8 @@ execute_load_motion (class loop *loop, chain_p chain, bitmap tmp_vars)
    the looparound phi nodes contained in one of the chains.  If there is no
    such statement, or more statements, NULL is returned.  */

-static gimple *
-single_nonlooparound_use (tree name)
+gimple *
+pcom_worker::single_nonlooparound_use (tree name)
 {
   use_operand_p use;
   imm_use_iterator it;
@@ -2062,8 +2216,8 @@ single_nonlooparound_use (tree name)
 /* Remove statement STMT, as well as the chain of assignments in that it is
    used.  */

-static void
-remove_stmt (gimple *stmt)
+void
+pcom_worker::remove_stmt (gimple *stmt)
 {
   tree name;
   gimple *next;
@@ -2120,8 +2274,8 @@ remove_stmt (gimple *stmt)
 /* Perform the predictive commoning optimization for a chain CHAIN.
    Uids of the newly created temporary variables are marked in TMP_VARS.*/

-static void
-execute_pred_commoning_chain (class loop *loop, chain_p chain,
+void
+pcom_worker::execute_pred_commoning_chain (chain_p chain,
                              bitmap tmp_vars)
 {
   unsigned i;
@@ -2248,12 +2402,11 @@ determine_unroll_factor (vec<chain_p> chains)
   return factor;
 }

-/* Perform the predictive commoning optimization for CHAINS.
+/* Perform the predictive commoning optimization for chains.
    Uids of the newly created temporary variables are marked in TMP_VARS.  */

-static void
-execute_pred_commoning (class loop *loop, vec<chain_p> chains,
-                        bitmap tmp_vars)
+void
+pcom_worker::execute_pred_commoning (bitmap tmp_vars)
 {
   chain_p chain;
   unsigned i;
@@ -2263,7 +2416,7 @@ execute_pred_commoning (class loop *loop, vec<chain_p> chains,
       if (chain->type == CT_INVARIANT)
        execute_load_motion (loop, chain, tmp_vars);
       else
-       execute_pred_commoning_chain (loop, chain, tmp_vars);
+       execute_pred_commoning_chain (chain, tmp_vars);
     }

   FOR_EACH_VEC_ELT (chains, i, chain)
@@ -2330,18 +2483,20 @@ struct epcc_data
 {
   vec<chain_p> chains;
   bitmap tmp_vars;
+  pcom_worker *worker;
 };

 static void
-execute_pred_commoning_cbck (class loop *loop, void *data)
+execute_pred_commoning_cbck (class loop *loop ATTRIBUTE_UNUSED, void *data)
 {
   struct epcc_data *const dta = (struct epcc_data *) data;
+  pcom_worker *worker = dta->worker;

   /* Restore phi nodes that were replaced by ssa names before
      tree_transform_and_unroll_loop (see detailed description in
      tree_predictive_commoning_loop).  */
   replace_names_by_phis (dta->chains);
-  execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
+  worker->execute_pred_commoning (dta->tmp_vars);
 }

 /* Base NAME and all the names in the chain of phi nodes that use it
@@ -2433,8 +2588,8 @@ chain_can_be_combined_p (chain_p chain)
    statements, NAME is replaced with the actual name used in the returned
    statement.  */

-static gimple *
-find_use_stmt (tree *name)
+gimple *
+pcom_worker::find_use_stmt (tree *name)
 {
   gimple *stmt;
   tree rhs, lhs;
@@ -2486,8 +2641,8 @@ may_reassociate_p (tree type, enum tree_code code)
    tree of the same operations and returns its root.  Distance to the root
    is stored in DISTANCE.  */

-static gimple *
-find_associative_operation_root (gimple *stmt, unsigned *distance)
+gimple *
+pcom_worker::find_associative_operation_root (gimple *stmt, unsigned *distance)
 {
   tree lhs;
   gimple *next;
@@ -2523,8 +2678,8 @@ find_associative_operation_root (gimple *stmt, unsigned *distance)
    tree formed by this operation instead of the statement that uses NAME1
    or NAME2.  */

-static gimple *
-find_common_use_stmt (tree *name1, tree *name2)
+gimple *
+pcom_worker::find_common_use_stmt (tree *name1, tree *name2)
 {
   gimple *stmt1, *stmt2;
@@ -2553,8 +2708,8 @@ find_common_use_stmt (tree *name1, tree *name2)
    in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
    if it is true.  If CODE is ERROR_MARK, set these values instead.  */

-static bool
-combinable_refs_p (dref r1, dref r2,
+bool
+pcom_worker::combinable_refs_p (dref r1, dref r2,
                   enum tree_code *code, bool *swap, tree *rslt_type)
 {
   enum tree_code acode;
@@ -2622,8 +2777,8 @@ remove_name_from_operation (gimple *stmt, tree op)
 /* Reassociates the expression in that NAME1 and NAME2 are used so that they
    are combined in a single statement, and returns this statement.  */

-static gimple *
-reassociate_to_the_same_stmt (tree name1, tree name2)
+gimple *
+pcom_worker::reassociate_to_the_same_stmt (tree name1, tree name2)
 {
   gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
   gassign *new_stmt, *tmp_stmt;
@@ -2707,8 +2862,8 @@ reassociate_to_the_same_stmt (tree name1, tree name2)
    associative and commutative operation in the same expression, reassociate
    the expression so that they are used in the same statement.  */

-static gimple *
-stmt_combining_refs (dref r1, dref r2)
+gimple *
+pcom_worker::stmt_combining_refs (dref r1, dref r2)
 {
   gimple *stmt1, *stmt2;
   tree name1 = name_for_ref (r1);
@@ -2725,8 +2880,8 @@ stmt_combining_refs (dref r1, dref r2)
 /* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
    description of the new chain is returned, otherwise we return NULL.  */

-static chain_p
-combine_chains (chain_p ch1, chain_p ch2)
+chain_p
+pcom_worker::combine_chains (chain_p ch1, chain_p ch2)
 {
   dref r1, r2, nw;
   enum tree_code op = ERROR_MARK;
@@ -2814,17 +2969,17 @@ pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
   return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
 }

-/* Try to combine the CHAINS in LOOP.  */
+/* Try to combine the chains.  */

-static void
-try_combine_chains (class loop *loop, vec<chain_p> *chains)
+void
+pcom_worker::try_combine_chains ()
 {
   unsigned i, j;
   chain_p ch1, ch2, cch;
   auto_vec<chain_p> worklist;
   bool combined_p = false;

-  FOR_EACH_VEC_ELT (*chains, i, ch1)
+  FOR_EACH_VEC_ELT (chains, i, ch1)
     if (chain_can_be_combined_p (ch1))
       worklist.safe_push (ch1);

@@ -2834,7 +2989,7 @@ try_combine_chains (class loop *loop, vec<chain_p> *chains)
       if (!chain_can_be_combined_p (ch1))
        continue;

-      FOR_EACH_VEC_ELT (*chains, j, ch2)
+      FOR_EACH_VEC_ELT (chains, j, ch2)
        {
          if (!chain_can_be_combined_p (ch2))
            continue;
@@ -2843,7 +2998,7 @@ try_combine_chains (class loop *loop, vec<chain_p> *chains)
          if (cch)
            {
              worklist.safe_push (cch);
-             chains->safe_push (cch);
+             chains.safe_push (cch);
              combined_p = true;
              break;
            }
@@ -2867,7 +3022,7 @@ try_combine_chains (class loop *loop, vec<chain_p> *chains)
         We first update position information for all combined chains.  */
      dref ref;
-      for (i = 0; chains->iterate (i, &ch1); ++i)
+      for (i = 0; chains.iterate (i, &ch1); ++i)
        {
          if (ch1->type != CT_COMBINATION || ch1->combined)
            continue;
@@ -2878,7 +3033,7 @@ try_combine_chains (class loop *loop, vec<chain_p> *chains)
          update_pos_for_combined_chains (ch1);
        }
      /* Then sort references according to newly updated position information.  */
-      for (i = 0; chains->iterate (i, &ch1); ++i)
+      for (i = 0; chains.iterate (i, &ch1); ++i)
        {
          if (ch1->type != CT_COMBINATION && !ch1->combined)
            continue;
@@ -2990,11 +3145,11 @@ prepare_initializers_chain_store_elim (class loop *loop, chain_p chain)
   return true;
 }

-/* Prepare initializers for CHAIN in LOOP.  Returns false if this is
-   impossible because one of these initializers may trap, true otherwise.  */
+/* Prepare initializers for CHAIN.  Returns false if this is impossible
+   because one of these initializers may trap, true otherwise.  */

-static bool
-prepare_initializers_chain (class loop *loop, chain_p chain)
+bool
+pcom_worker::prepare_initializers_chain (chain_p chain)
 {
   unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
   struct data_reference *dr = get_chain_root (chain)->ref;
@@ -3046,11 +3201,11 @@ prepare_initializers_chain (class loop *loop, chain_p chain)
   return true;
 }

-/* Prepare initializers for CHAINS in LOOP, and free chains that cannot
+/* Prepare initializers for chains, and free chains that cannot
    be used because the initializers might trap.  */

-static void
-prepare_initializers (class loop *loop, vec<chain_p> chains)
+void
+pcom_worker::prepare_initializers ()
 {
   chain_p chain;
   unsigned i;
@@ -3058,7 +3213,7 @@ prepare_initializers (class loop *loop, vec<chain_p> chains)
   for (i = 0; i < chains.length (); )
     {
       chain = chains[i];
-      if (prepare_initializers_chain (loop, chain))
+      if (prepare_initializers_chain (chain))
        i++;
       else
        {
@@ -3068,11 +3223,11 @@ prepare_initializers (class loop *loop, vec<chain_p> chains)
     }
 }

-/* Generates finalizer memory references for CHAIN in LOOP.  Returns true
+/* Generates finalizer memory references for CHAIN.  Returns true
    if finalizer code for CHAIN can be generated, otherwise false.  */

-static bool
-prepare_finalizers_chain (class loop *loop, chain_p chain)
+bool
+pcom_worker::prepare_finalizers_chain (chain_p chain)
 {
   unsigned i, n = chain->length;
   struct data_reference *dr = get_chain_root (chain)->ref;
@@ -3116,11 +3271,11 @@ prepare_finalizers_chain (class loop *loop, chain_p chain)
   return true;
 }

-/* Generates finalizer memory reference for CHAINS in LOOP.  Returns true
-   if finalizer code generation for CHAINS breaks loop closed ssa form.  */
+/* Generates finalizer memory reference for chains.  Returns true if
+   finalizer code generation for chains breaks loop closed ssa form.  */

-static bool
-prepare_finalizers (class loop *loop, vec<chain_p> chains)
+bool
+pcom_worker::prepare_finalizers ()
 {
   chain_p chain;
   unsigned i;
@@ -3138,7 +3293,7 @@ prepare_finalizers (class loop *loop, vec<chain_p> chains)
          continue;
        }

-      if (prepare_finalizers_chain (loop, chain))
+      if (prepare_finalizers_chain (chain))
        {
          i++;
          /* Be conservative, assume loop closed ssa form is corrupted
@@ -3156,7 +3311,7 @@ prepare_finalizers (class loop *loop, vec<chain_p> chains)
   return loop_closed_ssa;
 }

-/* Insert all initializing gimple stmts into loop's entry edge.  */
+/* Insert all initializing gimple stmts into LOOP's entry edge.  */

 static void
 insert_init_seqs (class loop *loop, vec<chain_p> chains)
@@ -3177,19 +3332,16 @@ insert_init_seqs (class loop *loop, vec<chain_p> chains)
    form was corrupted.  Non-zero return value indicates some changes were
    applied to this loop.  */

-static unsigned
-tree_predictive_commoning_loop (class loop *loop, bool allow_unroll_p)
+unsigned
+pcom_worker::tree_predictive_commoning_loop (bool allow_unroll_p)
 {
-  vec<data_reference_p> datarefs;
-  vec<ddr_p> dependences;
   struct component *components;
-  vec<chain_p> chains = vNULL;
   unsigned unroll_factor = 0;
   class tree_niter_desc desc;
   bool unroll = false, loop_closed_ssa = false;

   if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Processing loop %d\n",  loop->num);
+    fprintf (dump_file, "Processing loop %d\n", loop->num);

   /* Nothing for predicitive commoning if loop only iterates 1 time.  */
   if (get_max_loop_iterations_int (loop) == 0)
@@ -3203,30 +3355,22 @@ tree_predictive_commoning_loop (class loop *loop, bool allow_unroll_p)
   /* Find the data references and split them into components according to their
      dependence relations.  */
   auto_vec<loop_p, 3> loop_nest;
-  dependences.create (10);
-  datarefs.create (10);
-  if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
-                                           &dependences))
+  if (!compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
+                                          &dependences))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Cannot analyze data dependencies\n");
-      free_data_refs (datarefs);
-      free_dependence_relations (dependences);
       return 0;
     }

   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_data_dependence_relations (dump_file, dependences);

-  components = split_data_refs_to_components (loop, datarefs, dependences);
+  components = split_data_refs_to_components ();
+
   loop_nest.release ();
-  free_dependence_relations (dependences);
   if (!components)
-    {
-      free_data_refs (datarefs);
-      free_affine_expand_cache (&name_expansions);
-      return 0;
-    }
+    return 0;

   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -3235,34 +3379,25 @@ tree_predictive_commoning_loop (class loop *loop, bool allow_unroll_p)
     }

   /* Find the suitable components and split them into chains.  */
-  components = filter_suitable_components (loop, components);
+  components = filter_suitable_components (components);

   auto_bitmap tmp_vars;
-  looparound_phis = BITMAP_ALLOC (NULL);
-  determine_roots (loop, components, &chains);
+  determine_roots (components);
   release_components (components);

-  auto cleanup = [&]() {
-    release_chains (chains);
-    free_data_refs (datarefs);
-    BITMAP_FREE (looparound_phis);
-    free_affine_expand_cache (&name_expansions);
-  };
-
   if (!chains.exists ())
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Predictive commoning failed: no suitable chains\n");
-      cleanup ();
       return 0;
     }

-  prepare_initializers (loop, chains);
-  loop_closed_ssa = prepare_finalizers (loop, chains);
+  prepare_initializers ();
+  loop_closed_ssa = prepare_finalizers ();

   /* Try to combine the chains that are always worked with together.  */
-  try_combine_chains (loop, &chains);
+  try_combine_chains ();

   insert_init_seqs (loop, chains);

@@ -3289,8 +3424,9 @@ tree_predictive_commoning_loop (class loop *loop, bool allow_unroll_p)
       if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

-      dta.chains = chains;
       dta.tmp_vars = tmp_vars;
+      dta.chains = chains;
+      dta.worker = this;

       /* Cfg manipulations performed in tree_transform_and_unroll_loop before
         execute_pred_commoning_cbck is called may cause phi nodes to be
@@ -3310,11 +3446,9 @@ tree_predictive_commoning_loop (class loop *loop, bool allow_unroll_p)
       if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "Executing predictive commoning without unrolling.\n");
-      execute_pred_commoning (loop, chains, tmp_vars);
+      execute_pred_commoning (tmp_vars);
     }

-  cleanup ();
-
   return (unroll ? 2 : 1) | (loop_closed_ssa ? 4 : 1);
 }

@@ -3330,7 +3464,8 @@ tree_predictive_commoning (bool allow_unroll_p)
   FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
     if (optimize_loop_for_speed_p (loop))
       {
-       changed |= tree_predictive_commoning_loop (loop, allow_unroll_p);
+       pcom_worker w(loop);
+       changed |= w.tree_predictive_commoning_loop (allow_unroll_p);
       }

   free_original_copy_tables ();
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index be067c8..579149d 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3479,9 +3479,9 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
   /* Step values are irrelevant for aliasing if the number of vector
      iterations is equal to the number of scalar iterations (which can
      happen for fully-SLP loops).  */
-  bool ignore_step_p = known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 1U);
+  bool vf_one_p = known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 1U);

-  if (!ignore_step_p)
+  if (!vf_one_p)
     {
       /* Convert the checks for nonzero steps into bound tests.
 */
       tree value;
@@ -3534,6 +3534,11 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)

       bool preserves_scalar_order_p
        = vect_preserves_scalar_order_p (dr_info_a, dr_info_b);
+      bool ignore_step_p
+       = (vf_one_p
+          && (preserves_scalar_order_p
+              || operand_equal_p (DR_STEP (dr_info_a->dr),
+                                  DR_STEP (dr_info_b->dr))));

       /* Skip the pair if inter-iteration dependencies are irrelevant
         and intra-iteration dependencies are guaranteed to be honored.  */
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 2ed49cd..d536494 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -1490,6 +1490,105 @@ complex_operations_pattern::build (vec_info * /* vinfo */)
   gcc_unreachable ();
 }

+
+/* The addsub_pattern.  */
+
+class addsub_pattern : public vect_pattern
+{
+  public:
+    addsub_pattern (slp_tree *node)
+       : vect_pattern (node, NULL, IFN_VEC_ADDSUB) {};
+
+    void build (vec_info *);
+
+    static vect_pattern*
+    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+};
+
+vect_pattern *
+addsub_pattern::recognize (slp_tree_to_load_perm_map_t *, slp_tree *node_)
+{
+  slp_tree node = *node_;
+  if (SLP_TREE_CODE (node) != VEC_PERM_EXPR
+      || SLP_TREE_CHILDREN (node).length () != 2)
+    return NULL;
+
+  /* Match a blend of a plus and a minus op with the same number of plus and
+     minus lanes on the same operands.  */
+  slp_tree sub = SLP_TREE_CHILDREN (node)[0];
+  slp_tree add = SLP_TREE_CHILDREN (node)[1];
+  bool swapped_p = false;
+  if (vect_match_expression_p (sub, PLUS_EXPR))
+    {
+      std::swap (add, sub);
+      swapped_p = true;
+    }
+  if (!(vect_match_expression_p (add, PLUS_EXPR)
+       && vect_match_expression_p (sub, MINUS_EXPR)))
+    return NULL;
+  if (!((SLP_TREE_CHILDREN (sub)[0] == SLP_TREE_CHILDREN (add)[0]
+        && SLP_TREE_CHILDREN (sub)[1] == SLP_TREE_CHILDREN (add)[1])
+       || (SLP_TREE_CHILDREN (sub)[0] == SLP_TREE_CHILDREN (add)[1]
+           && SLP_TREE_CHILDREN (sub)[1] == SLP_TREE_CHILDREN (add)[0])))
+    return NULL;
+
+  for (unsigned i = 0; i < SLP_TREE_LANE_PERMUTATION (node).length (); ++i)
+    {
+      std::pair<unsigned, unsigned> perm = SLP_TREE_LANE_PERMUTATION (node)[i];
+      if (swapped_p)
+       perm.first = perm.first == 0 ? 1 : 0;
+      /* It has to be alternating -, +, -, ...
+        While we could permute the .ADDSUB inputs and the .ADDSUB output
+        that's only profitable over the add + sub + blend if at least
+        one of the permute is optimized which we can't determine here.  */
+      if (perm.first != (i & 1)
+         || perm.second != i)
+       return NULL;
+    }
+
+  if (!vect_pattern_validate_optab (IFN_VEC_ADDSUB, node))
+    return NULL;
+
+  return new addsub_pattern (node_);
+}
+
+void
+addsub_pattern::build (vec_info *vinfo)
+{
+  slp_tree node = *m_node;
+
+  slp_tree sub = SLP_TREE_CHILDREN (node)[0];
+  slp_tree add = SLP_TREE_CHILDREN (node)[1];
+  if (vect_match_expression_p (sub, PLUS_EXPR))
+    std::swap (add, sub);
+
+  /* Modify the blend node in-place.  */
+  SLP_TREE_CHILDREN (node)[0] = SLP_TREE_CHILDREN (sub)[0];
+  SLP_TREE_CHILDREN (node)[1] = SLP_TREE_CHILDREN (sub)[1];
+  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (node)[0])++;
+  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (node)[1])++;
+
+  /* Build IFN_VEC_ADDSUB from the sub representative operands.  */
+  stmt_vec_info rep = SLP_TREE_REPRESENTATIVE (sub);
+  gcall *call = gimple_build_call_internal (IFN_VEC_ADDSUB, 2,
+                                           gimple_assign_rhs1 (rep->stmt),
+                                           gimple_assign_rhs2 (rep->stmt));
+  gimple_call_set_lhs (call, make_ssa_name
+                              (TREE_TYPE (gimple_assign_lhs (rep->stmt))));
+  gimple_call_set_nothrow (call, true);
+  gimple_set_bb (call, gimple_bb (rep->stmt));
+  SLP_TREE_REPRESENTATIVE (node) = vinfo->add_pattern_stmt (call, rep);
+  STMT_VINFO_RELEVANT (SLP_TREE_REPRESENTATIVE (node)) = vect_used_in_scope;
+  STMT_SLP_TYPE (SLP_TREE_REPRESENTATIVE (node)) = pure_slp;
+  STMT_VINFO_VECTYPE (SLP_TREE_REPRESENTATIVE (node)) = SLP_TREE_VECTYPE (node);
+  STMT_VINFO_SLP_VECT_ONLY_PATTERN (SLP_TREE_REPRESENTATIVE (node)) = true;
+  SLP_TREE_CODE (node) = ERROR_MARK;
+  SLP_TREE_LANE_PERMUTATION (node).release ();
+
+  vect_free_slp_tree (sub);
+  vect_free_slp_tree (add);
+}
+
 /*******************************************************************************
  * Pattern matching definitions
  ******************************************************************************/
@@ -1502,6 +1601,7 @@ vect_pattern_decl_t slp_patterns[]
      overlap in what they can detect.  */

   SLP_PATTERN (complex_operations_pattern),
+  SLP_PATTERN (addsub_pattern)
 };
 #undef SLP_PATTERN

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 6c98acb..227d6aa 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3467,11 +3467,27 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
   return opt_result::success ();
 }

+struct slpg_vertex
+{
+  slpg_vertex (slp_tree node_)
+    : node (node_), visited (0), perm_out (0), materialize (0) {}
+
+  int get_perm_in () const { return materialize ? materialize : perm_out; }
+
+  slp_tree node;
+  unsigned visited : 1;
+  /* The permutation on the outgoing lanes (towards SLP parents).  */
+  int perm_out;
+  /* The permutation that is applied by this node.  perm_out is
+     relative to this.  */
+  int materialize;
+};
+
 /* Fill the vertices and leafs vector with all nodes in the SLP graph.  */

 static void
 vect_slp_build_vertices (hash_set<slp_tree> &visited, slp_tree node,
-                        vec<slp_tree> &vertices, vec<int> &leafs)
+                        vec<slpg_vertex> &vertices, vec<int> &leafs)
 {
   unsigned i;
   slp_tree child;
@@ -3480,7 +3496,7 @@ vect_slp_build_vertices (hash_set<slp_tree> &visited, slp_tree node,
     return;

   node->vertex = vertices.length ();
-  vertices.safe_push (node);
+  vertices.safe_push (slpg_vertex (node));

   bool leaf = true;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
@@ -3496,7 +3512,7 @@ vect_slp_build_vertices (hash_set<slp_tree> &visited, slp_tree node,
 /* Fill the vertices and leafs vector with all nodes in the SLP graph.  */

 static void
-vect_slp_build_vertices (vec_info *info, vec<slp_tree> &vertices,
+vect_slp_build_vertices (vec_info *info, vec<slpg_vertex> &vertices,
                         vec<int> &leafs)
 {
   hash_set<slp_tree> visited;
@@ -3568,39 +3584,28 @@ vect_optimize_slp (vec_info *vinfo)
   slp_tree node;
   unsigned i;

-  auto_vec<slp_tree> vertices;
+  auto_vec<slpg_vertex> vertices;
   auto_vec<int> leafs;
   vect_slp_build_vertices (vinfo, vertices, leafs);

   struct graph *slpg = new_graph (vertices.length ());
-  FOR_EACH_VEC_ELT (vertices, i, node)
-    {
-      unsigned j;
-      slp_tree child;
-      FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
-       if (child)
-         add_edge (slpg, i, child->vertex);
-    }
+  for (slpg_vertex &v : vertices)
+    for (slp_tree child : SLP_TREE_CHILDREN (v.node))
+      if (child)
+       add_edge (slpg, v.node->vertex, child->vertex);

   /* Compute (reverse) postorder on the inverted graph.  */
   auto_vec<int> ipo;
   graphds_dfs (slpg, &leafs[0], leafs.length (), &ipo, false, NULL, NULL);

-  auto_sbitmap n_visited (vertices.length ());
-  auto_sbitmap n_materialize (vertices.length ());
-  auto_vec<int> n_perm (vertices.length ());
   auto_vec<vec<unsigned> > perms;
-
-  bitmap_clear (n_visited);
-  bitmap_clear (n_materialize);
-  n_perm.quick_grow_cleared (vertices.length ());
   perms.safe_push (vNULL); /* zero is no permute */

   /* Produce initial permutations.  */
   for (i = 0; i < leafs.length (); ++i)
     {
       int idx = leafs[i];
-      slp_tree node = vertices[idx];
+      slp_tree node = vertices[idx].node;

       /* Handle externals and constants optimistically throughout the
         iteration.  */
@@ -3611,7 +3616,7 @@ vect_optimize_slp (vec_info *vinfo)
       /* Leafs do not change across iterations.  Note leafs also double
         as entries to the reverse graph.  */
       if (!slpg->vertices[idx].succ)
-       bitmap_set_bit (n_visited, idx);
+       vertices[idx].visited = 1;
       /* Loads are the only thing generating permutes.  */
       if (!SLP_TREE_LOAD_PERMUTATION (node).exists ())
        continue;
@@ -3660,7 +3665,7 @@ vect_optimize_slp (vec_info *vinfo)
       for (unsigned j = 0; j < SLP_TREE_LANES (node); ++j)
        perm[j] = SLP_TREE_LOAD_PERMUTATION (node)[j] - imin;
       perms.safe_push (perm);
-      n_perm[idx] = perms.length () - 1;
+      vertices[idx].perm_out = perms.length () - 1;
     }

   /* Propagate permutes along the graph and compute materialization points.  */
@@ -3674,19 +3679,36 @@ vect_optimize_slp (vec_info *vinfo)
       for (i = vertices.length (); i > 0 ; --i)
        {
          int idx = ipo[i-1];
-         slp_tree node = vertices[idx];
-         /* For leafs there's nothing to do - we've seeded permutes
-            on those above.  */
-         if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
+         slp_tree node = vertices[idx].node;
+
+         /* Handle externals and constants optimistically throughout the
+            iteration.  */
+         if (SLP_TREE_DEF_TYPE (node) == vect_external_def
+             || SLP_TREE_DEF_TYPE (node) == vect_constant_def)
            continue;

-         bitmap_set_bit (n_visited, idx);
+         vertices[idx].visited = 1;

-         /* We cannot move a permute across a store.  */
-         if (STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (node))
-             && DR_IS_WRITE
-                  (STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (node))))
+         /* We do not handle stores with a permutation.  */
+         stmt_vec_info rep = SLP_TREE_REPRESENTATIVE (node);
+         if (STMT_VINFO_DATA_REF (rep)
+             && DR_IS_WRITE (STMT_VINFO_DATA_REF (rep)))
            continue;
+         /* We cannot move a permute across an operation that is
+            not independent on lanes.  Note this is an explicit
+            negative list since that's much shorter than the respective
+            positive one but it's critical to keep maintaining it.  */
+         if (is_gimple_call (STMT_VINFO_STMT (rep)))
+           switch (gimple_call_combined_fn (STMT_VINFO_STMT (rep)))
+             {
+             case CFN_COMPLEX_ADD_ROT90:
+             case CFN_COMPLEX_ADD_ROT270:
+             case CFN_COMPLEX_MUL:
+             case CFN_COMPLEX_MUL_CONJ:
+             case CFN_VEC_ADDSUB:
+               continue;
+             default:;
+             }

          int perm = -1;
          for (graph_edge *succ = slpg->vertices[idx].succ;
@@ -3698,13 +3720,9 @@ vect_optimize_slp (vec_info *vinfo)
                permutes we have to verify the permute, when unifying lanes,
                will not unify different constants.  For example see
                gcc.dg/vect/bb-slp-14.c for a case that would break.  */
-             if (!bitmap_bit_p (n_visited, succ_idx))
+             if (!vertices[succ_idx].visited)
               continue;
-             int succ_perm = n_perm[succ_idx];
-             /* Once we materialize succs permutation its output lanes
-                appear unpermuted to us.  */
-             if (bitmap_bit_p (n_materialize, succ_idx))
-               succ_perm = 0;
+             int succ_perm = vertices[succ_idx].perm_out;
              if (perm == -1)
               perm = succ_perm;
              else if (succ_perm == 0)
@@ -3721,15 +3739,20 @@ vect_optimize_slp (vec_info *vinfo)

          if (perm == -1)
            /* Pick up pre-computed leaf values.  */
-           perm = n_perm[idx];
-         else if (!vect_slp_perms_eq (perms, perm, n_perm[idx]))
+           perm = vertices[idx].perm_out;
+         else if (!vect_slp_perms_eq (perms, perm,
+                                      vertices[idx].get_perm_in ()))
            {
              if (iteration > 1)
               /* Make sure we eventually converge.  */
               gcc_checking_assert (perm == 0);
-             n_perm[idx] = perm;
              if (perm == 0)
-               bitmap_clear_bit (n_materialize, idx);
+               {
+                 vertices[idx].perm_out = 0;
+                 vertices[idx].materialize = 0;
+               }
+             if (!vertices[idx].materialize)
+               vertices[idx].perm_out = perm;
              changed = true;
            }

@@ -3756,8 +3779,8 @@ vect_optimize_slp (vec_info *vinfo)
          for (graph_edge *pred = slpg->vertices[idx].pred;
              pred; pred = pred->pred_next)
            {
-             gcc_checking_assert (bitmap_bit_p (n_visited, pred->src));
-             int pred_perm = n_perm[pred->src];
+             gcc_checking_assert (vertices[pred->src].visited);
+             int pred_perm = vertices[pred->src].get_perm_in ();
              if (!vect_slp_perms_eq (perms, perm, pred_perm))
               {
                 all_preds_permuted = false;
@@ -3766,9 +3789,10 @@ vect_optimize_slp (vec_info *vinfo)
            }
          if (!all_preds_permuted)
            {
-             if (!bitmap_bit_p (n_materialize, idx))
+             if (!vertices[idx].materialize)
               changed = true;
-             bitmap_set_bit (n_materialize, idx);
+             vertices[idx].materialize = perm;
+             vertices[idx].perm_out = 0;
            }
        }
    }
@@ -3777,11 +3801,11 @@ vect_optimize_slp (vec_info *vinfo)
   /* Materialize.  */
   for (i = 0; i < vertices.length (); ++i)
     {
-      int perm = n_perm[i];
+      int perm = vertices[i].get_perm_in ();
       if (perm <= 0)
        continue;

-      slp_tree node = vertices[i];
+      slp_tree node = vertices[i].node;

       /* First permute invariant/external original successors.  */
       unsigned j;
@@ -3808,7 +3832,7 @@ vect_optimize_slp (vec_info *vinfo)
          SLP_TREE_SCALAR_OPS (child), true);
       }

-      if (bitmap_bit_p (n_materialize, i))
+      if (vertices[i].materialize)
        {
          if (SLP_TREE_LOAD_PERMUTATION (node).exists ())
            /* For loads simply drop the permutation, the load permutation
@@ -3897,7 +3921,7 @@ vect_optimize_slp (vec_info *vinfo)
   /* Now elide load permutations that are not necessary.  */
   for (i = 0; i < leafs.length (); ++i)
     {
-      node = vertices[leafs[i]];
+      node = vertices[leafs[i]].node;
       if (!SLP_TREE_LOAD_PERMUTATION (node).exists ())
        continue;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 5c71fbc..fa28336 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2100,7 +2100,8 @@ class vect_pattern
     this->m_ifn = ifn;
     this->m_node = node;
     this->m_ops.create (0);
-    this->m_ops.safe_splice (*m_ops);
+    if (m_ops)
+      this->m_ops.safe_splice (*m_ops);
   }

 public:
@@ -1651,7 +1651,8 @@ class auto_suppress_location_wrappers
 #define OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION(NODE) \
   TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
 /* Nonzero if this map clause is for an OpenACC compute construct's reduction
-   variable.  */
+   variable or OpenMP map clause mentioned also in in_reduction clause on the
+   same construct.  */
 #define OMP_CLAUSE_MAP_IN_REDUCTION(NODE) \
   TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
 /* Nonzero on map clauses added implicitly for reduction clauses on combined
diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index dc52ff5..d453f87 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,25 @@
+2021-06-23  Kewen Lin  <linkw@linux.ibm.com>
+
+	* configure: Regenerate.
+	* configure.ac (test for libgcc_cv_powerpc_3_1_float128_hw): Fix
+	typos among the name, CFLAGS and the test.
+	* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs, fp128_3_1_hw_src,
+	fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj, fp128_3_1_hw_obj):
+	Remove.
+	* config/rs6000/t-float128-p10-hw (FLOAT128_HW_INSNS): Append
+	macro FLOAT128_HW_INSNS_ISA3_1.
+	(FP128_3_1_CFLAGS_HW): Fix option typo.
+	* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): Guard this with
+	FLOAT128_HW_INSNS_ISA3_1.
+	(__floattikf_resolve): Likewise.
+	(__floatuntikf_resolve): Likewise.
+	(__fixkfti_resolve): Likewise.
+	(__fixunskfti_resolve): Likewise.
+	(__floattikf): Likewise.
+	(__floatuntikf): Likewise.
+	(__fixkfti): Likewise.
+	(__fixunskfti): Likewise.
+
 2021-06-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	PR target/99939
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 98b85a0..6a87abb 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2021-06-23  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/101167
+	* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
+
 2021-06-17  Chung-Lin Tang  <cltang@codesourcery.com>

	* hashtab.h (htab_clear): New function with initialization code
diff --git a/libgomp/testsuite/libgomp.c++/target-in-reduction-1.C b/libgomp/testsuite/libgomp.c++/target-in-reduction-1.C
new file mode 100644
index 0000000..21130f5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-in-reduction-1.C
@@ -0,0 +1,113 @@
+void
+foo (int &x, int *&y, int n, int v)
+{
+  int zu[3] = { 45, 46, 47 };
+  int uu[n], wu[n], i;
+  int (&z)[3] = zu;
+  int (&u)[n] = uu;
+  int (&w)[n] = wu;
+  for (i = 0; i < n; i++)
+    w[i] = u[i] = n + i;
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x++;
+      y[0] += 2;
+      y[1] += 3;
+      z[1] += 4;
+      u[0] += 5;
+      w[1] += 6;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x += 4;
+      y[0] += 5;
+      y[1] += 6;
+      z[2] += 7;
+      u[1] += 8;
+      w[2] += 7;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x += 9;
+      y[0] += 10;
+      y[1] += 11;
+      z[1] += 12;
+      u[2] += 13;
+      w[1] += 14;
+    }
+  }
+  if (x != 56 || y[0] != 60 || y[1] != 64)
+    __builtin_abort ();
+  if (z[0] != 45 || z[1] != 62 || z[2] != 54)
+    __builtin_abort ();
+  if (u[0] != 8 || u[1] != 12 || u[2] != 18)
+    __builtin_abort ();
+  if (w[0] != 3 || w[1] != 24 || w[2] != 12)
+    __builtin_abort ();
+}
+
+void
+bar (int &x, int *&y, int n, int v)
+{
+  int zu[3] = { 45, 46, 47 };
+  int uu[n], wu[n], i;
+  int (&z)[3] = zu;
+  int (&u)[n] = uu;
+  int (&w)[n] = wu;
+  for (i = 0; i < n; i++)
+    w[i] = u[i] = n + i;
+  #pragma omp parallel master
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x++;
+      y[0] += 2;
+      y[1] += 3;
+      z[1] += 4;
+      u[0] += 5;
+      w[1] += 6;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x += 4;
+      y[0] += 5;
+      y[1] += 6;
+      z[2] += 7;
+      u[1] += 8;
+      w[2] += 7;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x += 9;
+      y[0] += 10;
+      y[1] += 11;
+      z[1] += 12;
+      u[2] += 13;
+      w[1] += 14;
+    }
+  }
+  if (x != 56 || y[0] != 77 || y[1] != 84)
+    __builtin_abort ();
+  if (z[0] != 45 || z[1] != 62 || z[2] != 54)
+    __builtin_abort ();
+  if (u[0] != 8 || u[1] != 12 || u[2] != 18)
+    __builtin_abort ();
+  if (w[0] != 3 || w[1] != 24 || w[2] != 12)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  int x = 42;
+  int yu[2] = { 43, 44 };
+  int *y = yu;
+  #pragma omp parallel master
+  foo (x, y, 3, 2);
+  x = 42;
+  bar (x, y, 3, 2);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c++/target-in-reduction-2.C b/libgomp/testsuite/libgomp.c++/target-in-reduction-2.C
new file mode 100644
index 0000000..5da0e90
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-in-reduction-2.C
@@ -0,0 +1,182 @@
+struct S { int a, b, c[2]; };
+#pragma omp declare reduction (+: S : (omp_out.a += omp_in.a, omp_out.b += omp_in.b)) \
+		    initializer (omp_priv = { 0, 0, { 0, 0 } })
+
+void
+foo (S &x, S *&y, int n, int v)
+{
+  S zu[3] = { { 45, 47, {} }, { 46, 48, {} }, { 47, 49, {} } };
+  S uu[n], wu[n];
+  S (&z)[3] = zu;
+  S (&u)[n] = uu;
+  S (&w)[n] = wu;
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      w[i].a = u[i].a = n + i;
+      w[i].b = u[i].b = n - i;
+      w[i].c[0] = u[i].c[0] = 0;
+      w[i].c[1] = u[i].c[1] = 0;
+    }
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x.a++;
+      x.b++;
+      y[0].a += 2;
+      y[0].b += 12;
+      y[1].a += 3;
+      y[1].b += 13;
+      z[1].a += 4;
+      z[1].b += 14;
+      u[0].a += 5;
+      u[0].b += 15;
+      w[1].a += 6;
+      w[1].b += 16;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x.a += 4;
+      x.b += 14;
+      y[0].a += 5;
+      y[0].b += 15;
+      y[1].a += 6;
+      y[1].b += 16;
+      z[2].a += 7;
+      z[2].b += 17;
+      u[1].a += 8;
+      u[1].b += 18;
+      w[2].a += 7;
+      w[2].b += 17;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x.a += 9;
+      x.b += 19;
+      y[0].a += 10;
+      y[0].b += 20;
+      y[1].a += 11;
+      y[1].b += 21;
+      z[1].a += 12;
+      z[1].b += 22;
+      u[2].a += 13;
+      u[2].b += 23;
+      w[1].a += 14;
+      w[1].b += 24;
+    }
+  }
+  if (x.a != 56 || y[0].a != 60 || y[1].a != 64)
+    __builtin_abort ();
+  if (x.b != 86 || y[0].b != 100 || y[1].b != 104)
+    __builtin_abort ();
+  if (z[0].a != 45 || z[1].a != 62 || z[2].a != 54)
+    __builtin_abort ();
+  if (z[0].b != 47 || z[1].b != 84 || z[2].b != 66)
+    __builtin_abort ();
+  if (u[0].a != 8 || u[1].a != 12 || u[2].a != 18)
+    __builtin_abort ();
+  if (u[0].b != 18 || u[1].b != 20 || u[2].b != 24)
+    __builtin_abort ();
+  if (w[0].a != 3 || w[1].a != 24 || w[2].a != 12)
+    __builtin_abort ();
+  if (w[0].b != 3 || w[1].b != 42 || w[2].b != 18)
+    __builtin_abort ();
+}
+
+void
+bar (S &x, S *&y, int n, int v)
+{
+  S zu[3] = { { 45, 47, {} }, { 46, 48, {} }, { 47, 49, {} } };
+  S uu[n], wu[n];
+  S (&z)[3] = zu;
+  S (&u)[n] = uu;
+  S (&w)[n] = wu;
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      w[i].a = u[i].a = n + i;
+      w[i].b = u[i].b = n - i;
+      w[i].c[0] = u[i].c[0] = 0;
+      w[i].c[1] = u[i].c[1] = 0;
+    }
+  #pragma omp parallel master
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x.a++;
+      x.b++;
+      y[0].a += 2;
+      y[0].b += 12;
+      y[1].a += 3;
+      y[1].b += 13;
+      z[1].a += 4;
+      z[1].b += 14;
+      u[0].a += 5;
+      u[0].b += 15;
+      w[1].a += 6;
+      w[1].b += 16;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x.a += 4;
+      x.b += 14;
+      y[0].a += 5;
+      y[0].b += 15;
+      y[1].a += 6;
+      y[1].b += 16;
+      z[2].a += 7;
+      z[2].b += 17;
+      u[1].a += 8;
+      u[1].b += 18;
+      w[2].a += 7;
+      w[2].b += 17;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x.a += 9;
+      x.b += 19;
+      y[0].a += 10;
+      y[0].b += 20;
+      y[1].a += 11;
+      y[1].b += 21;
+      z[1].a += 12;
+      z[1].b += 22;
+      u[2].a += 13;
+      u[2].b += 23;
+      w[1].a += 14;
+      w[1].b += 24;
+    }
+  }
+  if (x.a != 56 || y[0].a != 77 || y[1].a != 84)
+    __builtin_abort ();
+  if (x.b != 86 || y[0].b != 147 || y[1].b != 154)
+    __builtin_abort ();
+  if (z[0].a != 45 || z[1].a != 62 || z[2].a != 54)
+    __builtin_abort ();
+  if (z[0].b != 47 || z[1].b != 84 || z[2].b != 66)
+    __builtin_abort ();
+  if (u[0].a != 8 || u[1].a != 12 || u[2].a != 18)
+    __builtin_abort ();
+  if (u[0].b != 18 || u[1].b != 20 || u[2].b != 24)
+    __builtin_abort ();
+  if (w[0].a != 3 || w[1].a != 24 || w[2].a != 12)
+    __builtin_abort ();
+  if (w[0].b != 3 || w[1].b != 42 || w[2].b != 18)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  S x = { 42, 52 };
+  S yu[2] = { { 43, 53 }, { 44, 54 } };
+  S *y = yu;
+  #pragma omp parallel master
+  foo (x, y, 3, 2);
+  x.a = 42;
+  x.b = 52;
+  bar (x, y, 3, 2);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-1.c
new file mode 100644
index 0000000..813b5d9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-1.c
@@ -0,0 +1,104 @@
+void
+foo (int x, int *y, int n, int v)
+{
+  int z[3] = { 45, 46, 47 };
+  int u[n], w[n], i;
+  for (i = 0; i < n; i++)
+    w[i] = u[i] = n + i;
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x++;
+      y[0] += 2;
+      y[1] += 3;
+      z[1] += 4;
+      u[0] += 5;
+      w[1] += 6;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x += 4;
+      y[0] += 5;
+      y[1] += 6;
+      z[2] += 7;
+      u[1] += 8;
+      w[2] += 7;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x += 9;
+      y[0] += 10;
+      y[1] += 11;
+      z[1] += 12;
+      u[2] += 13;
+      w[1] += 14;
+    }
+  }
+  if (x != 56 || y[0] != 60 || y[1] != 64)
+    __builtin_abort ();
+  if (z[0] != 45 || z[1] != 62 || z[2] != 54)
+    __builtin_abort ();
+  if (u[0] != 8 || u[1] != 12 || u[2] != 18)
+    __builtin_abort ();
+  if (w[0] != 3 || w[1] != 24 || w[2] != 12)
+    __builtin_abort ();
+}
+
+void
+bar (int x, int *y, int n, int v)
+{
+  int z[3] = { 45, 46, 47 };
+  int u[n], w[n], i;
+  for (i = 0; i < n; i++)
+    w[i] = u[i] = n + i;
+  #pragma omp parallel master
+  #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+  {
+    #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x++;
+      y[0] += 2;
+      y[1] += 3;
+      z[1] += 4;
+      u[0] += 5;
+      w[1] += 6;
+    }
+    #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v])
+    {
+      x += 4;
+      y[0] += 5;
+      y[1] += 6;
+      z[2] += 7;
+      u[1] += 8;
+      w[2] += 7;
+    }
+    #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2])
+    {
+      x += 9;
+      y[0] += 10;
+      y[1] += 11;
+      z[1] += 12;
+      u[2] += 13;
+      w[1] += 14;
+    }
+  }
+  if (x != 56 || y[0] != 77 || y[1] != 84)
+    __builtin_abort ();
+  if (z[0] != 45 || z[1] != 62 || z[2] != 54)
+    __builtin_abort ();
+  if (u[0] != 8 || u[1] != 12 || u[2] != 18)
+    __builtin_abort ();
+  if (w[0] != 3 || w[1] != 24 || w[2] != 12)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  int y[2] = { 43, 44 };
+  #pragma omp parallel master
+  foo (42, y, 3, 2);
+  bar (42, y, 3, 2);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-2.c b/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-2.c
new file mode 100644
index 0000000..dd56965
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-in-reduction-2.c
@@ -0,0 +1,173 @@
+struct S { int a, b, c[2]; };
+#pragma omp declare reduction (+: struct S : (omp_out.a += omp_in.a, omp_out.b += omp_in.b)) \
+		    initializer (omp_priv = { 0, 0, { 0, 0 } })
+
+void
+foo (struct S x, struct S *y, int n, int v) +{ + struct S z[3] = { { 45, 47, {} }, { 46, 48, {} }, { 47, 49, {} } }; + struct S u[n], w[n]; + int i; + for (i = 0; i < n; i++) + { + w[i].a = u[i].a = n + i; + w[i].b = u[i].b = n - i; + w[i].c[0] = u[i].c[0] = 0; + w[i].c[1] = u[i].c[1] = 0; + } + #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v]) + { + #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v]) + { + x.a++; + x.b++; + y[0].a += 2; + y[0].b += 12; + y[1].a += 3; + y[1].b += 13; + z[1].a += 4; + z[1].b += 14; + u[0].a += 5; + u[0].b += 15; + w[1].a += 6; + w[1].b += 16; + } + #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v]) map(tofrom: x.a, x.b, x.c[:2]) + { + x.a += 4; + x.b += 14; + y[0].a += 5; + y[0].b += 15; + y[1].a += 6; + y[1].b += 16; + z[2].a += 7; + z[2].b += 17; + u[1].a += 8; + u[1].b += 18; + w[2].a += 7; + w[2].b += 17; + } + #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2]) + { + x.a += 9; + x.b += 19; + y[0].a += 10; + y[0].b += 20; + y[1].a += 11; + y[1].b += 21; + z[1].a += 12; + z[1].b += 22; + u[2].a += 13; + u[2].b += 23; + w[1].a += 14; + w[1].b += 24; + } + } + if (x.a != 56 || y[0].a != 60 || y[1].a != 64) + __builtin_abort (); + if (x.b != 86 || y[0].b != 100 || y[1].b != 104) + __builtin_abort (); + if (z[0].a != 45 || z[1].a != 62 || z[2].a != 54) + __builtin_abort (); + if (z[0].b != 47 || z[1].b != 84 || z[2].b != 66) + __builtin_abort (); + if (u[0].a != 8 || u[1].a != 12 || u[2].a != 18) + __builtin_abort (); + if (u[0].b != 18 || u[1].b != 20 || u[2].b != 24) + __builtin_abort (); + if (w[0].a != 3 || w[1].a != 24 || w[2].a != 12) + __builtin_abort (); + if (w[0].b != 3 || w[1].b != 42 || w[2].b != 18) + __builtin_abort (); +} + +void +bar (struct S x, struct S *y, int n, int v) +{ + struct S z[3] = { { 45, 47, {} }, { 46, 48, {} }, { 47, 49, {} } }; + struct S u[n], w[n]; + int i; + for (i = 0; i < n; i++) + { + w[i].a = u[i].a = n + i; + w[i].b = u[i].b = n - i; + 
w[i].c[0] = u[i].c[0] = 0; + w[i].c[1] = u[i].c[1] = 0; + } + #pragma omp parallel master + #pragma omp taskgroup task_reduction (+: x, y[:2], z[1:2], u, w[1:v]) + { + #pragma omp task in_reduction (+: x, y[:2], z[1:2], u, w[1:v]) + { + x.a++; + x.b++; + y[0].a += 2; + y[0].b += 12; + y[1].a += 3; + y[1].b += 13; + z[1].a += 4; + z[1].b += 14; + u[0].a += 5; + u[0].b += 15; + w[1].a += 6; + w[1].b += 16; + } + #pragma omp target in_reduction (+: x, y[:2], z[1:2], u, w[1:v]) map(tofrom: x.a, x.b, x.c[:2]) + { + x.a += 4; + x.b += 14; + y[0].a += 5; + y[0].b += 15; + y[1].a += 6; + y[1].b += 16; + z[2].a += 7; + z[2].b += 17; + u[1].a += 8; + u[1].b += 18; + w[2].a += 7; + w[2].b += 17; + } + #pragma omp target in_reduction (+: x, y[:v], z[1:v], u, w[1:2]) + { + x.a += 9; + x.b += 19; + y[0].a += 10; + y[0].b += 20; + y[1].a += 11; + y[1].b += 21; + z[1].a += 12; + z[1].b += 22; + u[2].a += 13; + u[2].b += 23; + w[1].a += 14; + w[1].b += 24; + } + } + if (x.a != 56 || y[0].a != 77 || y[1].a != 84) + __builtin_abort (); + if (x.b != 86 || y[0].b != 147 || y[1].b != 154) + __builtin_abort (); + if (z[0].a != 45 || z[1].a != 62 || z[2].a != 54) + __builtin_abort (); + if (z[0].b != 47 || z[1].b != 84 || z[2].b != 66) + __builtin_abort (); + if (u[0].a != 8 || u[1].a != 12 || u[2].a != 18) + __builtin_abort (); + if (u[0].b != 18 || u[1].b != 20 || u[2].b != 24) + __builtin_abort (); + if (w[0].a != 3 || w[1].a != 24 || w[2].a != 12) + __builtin_abort (); + if (w[0].b != 3 || w[1].b != 42 || w[2].b != 18) + __builtin_abort (); +} + +int +main () +{ + struct S x = { 42, 52 }; + struct S y[2] = { { 43, 53 }, { 44, 54 } }; + #pragma omp parallel master + foo (x, y, 3, 2); + bar (x, y, 3, 2); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-reduction-15.c b/libgomp/testsuite/libgomp.c-c++-common/task-reduction-15.c new file mode 100644 index 0000000..5e87139 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/task-reduction-15.c @@ -0,0 +1,61 @@ 
+/* PR middle-end/101167 */ + +extern +#ifdef __cplusplus +"C" +#endif +void abort (void); + +struct S { int a, b, c[2]; }; + +void +init (struct S *x) +{ + x->a = 0; + x->b = 0; + x->c[0] = 0; + x->c[1] = 0; +} + +void +merge (struct S *x, struct S *y) +{ + x->a += y->a; + x->b += y->b; +} + +#pragma omp declare reduction (+: struct S : merge (&omp_out, &omp_in)) initializer (init (&omp_priv)) + +void +foo (struct S x) +{ + #pragma omp taskgroup task_reduction (+: x) + { + #pragma omp task in_reduction (+: x) + { + x.a++; + x.b++; + } + #pragma omp task in_reduction (+: x) + { + x.a += 4; + x.b += 14; + } + #pragma omp task in_reduction (+: x) + { + x.a += 9; + x.b += 19; + } + } + if (x.a != 56 || x.b != 86) + abort (); +} + +int +main () +{ + struct S x = { 42, 52 }; + #pragma omp parallel master num_threads(3) + foo (x); + return 0; +} diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index c1c04be..0835510 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,70 @@ +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/101174 + * testsuite/23_containers/multiset/cons/deduction.cc: + Uncomment CTAD example that was rejected by this bug. + * testsuite/23_containers/set/cons/deduction.cc: Likewise. + +2021-06-23 Jonathan Wakely <jwakely@redhat.com> + + * include/std/chrono (chrono::year::is_leap()): Fix incorrect + logic in comment. + +2021-06-23 Matthias Kretz <m.kretz@gsi.de> + + * testsuite/experimental/simd/README.md: New file. + +2021-06-23 Matthias Kretz <m.kretz@gsi.de> + + * testsuite/experimental/simd/driver.sh: Rewrite output + verbosity logic. Add -p/--percentage option. Allow -v/--verbose + to be used twice. Add -x and -o short options. Parse long + options with = instead of separating space generically. Parce + contracted short options. Make unrecognized options an error. + If same-line output is active, trap on EXIT to increment the + progress (only with --percentage), erase the line and print the + current status. 
+ * testsuite/experimental/simd/generate_makefile.sh: Initialize + helper files for progress account keeping. Update help target + for changes to DRIVEROPTS. + +2021-06-23 Matthias Kretz <m.kretz@gsi.de> + + * testsuite/Makefile.am (check-simd): Remove -fno-tree-vrp flag + and associated warning. + * testsuite/Makefile.in: Regenerate. + +2021-06-23 Cassio Neri <cassio.neri@gmail.com> + Jonathan Wakely <jwakely@redhat.com> + Ulrich Drepper <drepper@redhat.com> + + * include/std/chrono (chrono::year::is_leap()): Optimize. + +2021-06-23 Patrick Palka <ppalka@redhat.com> + + PR c++/86439 + * testsuite/23_containers/map/cons/deduction.cc: Replace ambiguous + CTAD examples. + * testsuite/23_containers/multimap/cons/deduction.cc: Likewise. + * testsuite/23_containers/multiset/cons/deduction.cc: Likewise. + Mention one of the replaced examples is broken due to PR101174. + * testsuite/23_containers/set/cons/deduction.cc: Likewise. + * testsuite/23_containers/unordered_map/cons/deduction.cc: Replace + ambiguous CTAD examples. + * testsuite/23_containers/unordered_multimap/cons/deduction.cc: + Likewise. + * testsuite/23_containers/unordered_multiset/cons/deduction.cc: + Likewise. + * testsuite/23_containers/unordered_set/cons/deduction.cc: Likewise. + +2021-06-23 Jonathan Wakely <jwakely@redhat.com> + + * include/std/mutex (__detail::__try_lock_impl): Rename + parameter to avoid clashing with newlib's __lockable macro. + (try_lock): Add 'inline' specifier. + * testsuite/17_intro/names.cc: Add check for __lockable. + * testsuite/30_threads/try_lock/5.cc: Add options for pthreads. 
+ 2021-06-22 Jonathan Wakely <jwakely@redhat.com> Matthias Kretz <m.kretz@gsi.de> diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index eb9ad23..5be935d 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -1035,7 +1035,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #endif element_type& - operator[](ptrdiff_t __i) const + operator[](ptrdiff_t __i) const noexcept { __glibcxx_assert(_M_get() != nullptr); __glibcxx_assert(!extent<_Tp>::value || __i < extent<_Tp>::value); diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h index 6e55375..1781fe1 100644 --- a/libstdc++-v3/include/bits/unique_ptr.h +++ b/libstdc++-v3/include/bits/unique_ptr.h @@ -402,7 +402,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Dereference the stored pointer. typename add_lvalue_reference<element_type>::type - operator*() const + operator*() const noexcept(noexcept(*std::declval<pointer>())) { __glibcxx_assert(get() != pointer()); return *get(); @@ -655,6 +655,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Access an element of owned array. typename std::add_lvalue_reference<element_type>::type operator[](size_t __i) const + noexcept(noexcept(std::declval<pointer>()[std::declval<size_t&>()])) { __glibcxx_assert(get() != pointer()); return get()[__i]; diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 59ddf3c..c396ebd 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -234,7 +234,8 @@ namespace __detail // unrolled/pack execution helpers // __execute_n_times{{{ template <typename _Fp, size_t... 
_I> - _GLIBCXX_SIMD_INTRINSIC constexpr void + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + void __execute_on_index_sequence(_Fp&& __f, index_sequence<_I...>) { ((void)__f(_SizeConstant<_I>()), ...); } @@ -254,7 +255,8 @@ template <size_t _Np, typename _Fp> // }}} // __generate_from_n_evaluations{{{ template <typename _R, typename _Fp, size_t... _I> - _GLIBCXX_SIMD_INTRINSIC constexpr _R + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + _R __execute_on_index_sequence_with_return(_Fp&& __f, index_sequence<_I...>) { return _R{__f(_SizeConstant<_I>())...}; } @@ -269,7 +271,8 @@ template <size_t _Np, typename _R, typename _Fp> // }}} // __call_with_n_evaluations{{{ template <size_t... _I, typename _F0, typename _FArgs> - _GLIBCXX_SIMD_INTRINSIC constexpr auto + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + auto __call_with_n_evaluations(index_sequence<_I...>, _F0&& __f0, _FArgs&& __fargs) { return __f0(__fargs(_SizeConstant<_I>())...); } @@ -285,7 +288,8 @@ template <size_t _Np, typename _F0, typename _FArgs> // }}} // __call_with_subscripts{{{ template <size_t _First = 0, size_t... 
_It, typename _Tp, typename _Fp> - _GLIBCXX_SIMD_INTRINSIC constexpr auto + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + auto __call_with_subscripts(_Tp&& __x, index_sequence<_It...>, _Fp&& __fun) { return __fun(__x[_First + _It]...); } @@ -5189,6 +5193,12 @@ template <typename _Tp, typename _Ap> return {__private_init, _Ap::_SimdImpl::_S_bit_and(__data(__a), __data(__b))}; } + +template <typename _Tp, typename _Ap> + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + enable_if_t<is_floating_point_v<_Tp>, simd<_Tp, _Ap>> + operator~(const simd<_Tp, _Ap>& __a) + { return {__private_init, _Ap::_SimdImpl::_S_complement(__data(__a))}; } } // namespace __float_bitwise_operators }}} _GLIBCXX_SIMD_END_NAMESPACE diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h index e986ee9..8cd338e 100644 --- a/libstdc++-v3/include/experimental/bits/simd_builtin.h +++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h @@ -1632,7 +1632,12 @@ template <typename _Abi> template <typename _Tp, size_t _Np> _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np> _S_complement(_SimdWrapper<_Tp, _Np> __x) noexcept - { return ~__x._M_data; } + { + if constexpr (is_floating_point_v<_Tp>) + return __vector_bitcast<_Tp>(~__vector_bitcast<__int_for_sizeof_t<_Tp>>(__x)); + else + return ~__x._M_data; + } // _S_unary_minus {{{2 template <typename _Tp, size_t _Np> diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h index 9c8bf38..11999df 100644 --- a/libstdc++-v3/include/experimental/bits/simd_converter.h +++ b/libstdc++-v3/include/experimental/bits/simd_converter.h @@ -316,7 +316,7 @@ template <typename _From, int _Np, typename _To, typename _Ap> _GLIBCXX_SIMD_INTRINSIC constexpr typename _SimdTraits<_To, _Ap>::_SimdMember - operator()(_Arg __x) const noexcept + operator()(const _Arg& __x) const noexcept { if constexpr 
(_Arg::_S_tuple_size == 1) return __vector_convert<__vector_type_t<_To, _Np>>(__x.first); diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h index 2722055..dc2fb90 100644 --- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h +++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h @@ -1026,55 +1026,6 @@ template <typename _Tp, int _Np, typename... _As, typename _Next, int _Remain> }; // }}} -// _AbisInSimdTuple {{{ -template <typename _Tp> - struct _SeqOp; - -template <size_t _I0, size_t... _Is> - struct _SeqOp<index_sequence<_I0, _Is...>> - { - using _FirstPlusOne = index_sequence<_I0 + 1, _Is...>; - using _NotFirstPlusOne = index_sequence<_I0, (_Is + 1)...>; - template <size_t _First, size_t _Add> - using _Prepend = index_sequence<_First, _I0 + _Add, (_Is + _Add)...>; - }; - -template <typename _Tp> - struct _AbisInSimdTuple; - -template <typename _Tp> - struct _AbisInSimdTuple<_SimdTuple<_Tp>> - { - using _Counts = index_sequence<0>; - using _Begins = index_sequence<0>; - }; - -template <typename _Tp, typename _Ap> - struct _AbisInSimdTuple<_SimdTuple<_Tp, _Ap>> - { - using _Counts = index_sequence<1>; - using _Begins = index_sequence<0>; - }; - -template <typename _Tp, typename _A0, typename... _As> - struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A0, _As...>> - { - using _Counts = typename _SeqOp<typename _AbisInSimdTuple< - _SimdTuple<_Tp, _A0, _As...>>::_Counts>::_FirstPlusOne; - using _Begins = typename _SeqOp<typename _AbisInSimdTuple< - _SimdTuple<_Tp, _A0, _As...>>::_Begins>::_NotFirstPlusOne; - }; - -template <typename _Tp, typename _A0, typename _A1, typename... 
_As> - struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A1, _As...>> - { - using _Counts = typename _SeqOp<typename _AbisInSimdTuple< - _SimdTuple<_Tp, _A1, _As...>>::_Counts>::template _Prepend<1, 0>; - using _Begins = typename _SeqOp<typename _AbisInSimdTuple< - _SimdTuple<_Tp, _A1, _As...>>::_Begins>::template _Prepend<0, 1>; - }; - -// }}} // __autocvt_to_simd {{{ template <typename _Tp, bool = is_arithmetic_v<__remove_cvref_t<_Tp>>> struct __autocvt_to_simd @@ -1529,7 +1480,7 @@ template <int _Np> #define _GLIBCXX_SIMD_FIXED_OP(name_, op_) \ template <typename _Tp, typename... _As> \ static inline constexpr _SimdTuple<_Tp, _As...> name_( \ - const _SimdTuple<_Tp, _As...> __x, const _SimdTuple<_Tp, _As...> __y) \ + const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y)\ { \ return __x._M_apply_per_chunk( \ [](auto __impl, auto __xx, auto __yy) constexpr { \ @@ -1663,7 +1614,7 @@ template <int _Np> _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ldexp) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmod) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, remainder) - // copysign in simd_math.h + _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, copysign) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nextafter) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fdim) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmax) @@ -1829,8 +1780,7 @@ template <int _Np> // _S_masked_unary {{{2 template <template <typename> class _Op, typename _Tp, typename... _As> static inline _SimdTuple<_Tp, _As...> - _S_masked_unary(const _MaskMember __bits, - const _SimdTuple<_Tp, _As...> __v) // TODO: const-ref __v? 
+ _S_masked_unary(const _MaskMember __bits, const _SimdTuple<_Tp, _As...>& __v) { return __v._M_apply_wrapped([&__bits](auto __meta, auto __native) constexpr { diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index 4799803..dd9aaa3 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -119,10 +119,10 @@ template <typename _Up, typename _Tp, typename _Abi> //}}} // _GLIBCXX_SIMD_MATH_CALL2_ {{{ -#define _GLIBCXX_SIMD_MATH_CALL2_(__name, arg2_) \ +#define _GLIBCXX_SIMD_MATH_CALL2_(__name, __arg2) \ template < \ typename _Tp, typename _Abi, typename..., \ - typename _Arg2 = _Extra_argument_type<arg2_, _Tp, _Abi>, \ + typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>, \ typename _R = _Math_return_type_t< \ decltype(std::__name(declval<double>(), _Arg2::declval())), _Tp, _Abi>> \ enable_if_t<is_floating_point_v<_Tp>, _R> \ @@ -137,7 +137,7 @@ template <typename _Up, typename _Tp, typename _Abi> \ declval<double>(), \ declval<enable_if_t< \ conjunction_v< \ - is_same<arg2_, _Tp>, \ + is_same<__arg2, _Tp>, \ negation<is_same<__remove_cvref_t<_Up>, simd<_Tp, _Abi>>>, \ is_convertible<_Up, simd<_Tp, _Abi>>, is_floating_point<_Tp>>, \ double>>())), \ @@ -147,10 +147,10 @@ template <typename _Up, typename _Tp, typename _Abi> \ // }}} // _GLIBCXX_SIMD_MATH_CALL3_ {{{ -#define _GLIBCXX_SIMD_MATH_CALL3_(__name, arg2_, arg3_) \ +#define _GLIBCXX_SIMD_MATH_CALL3_(__name, __arg2, __arg3) \ template <typename _Tp, typename _Abi, typename..., \ - typename _Arg2 = _Extra_argument_type<arg2_, _Tp, _Abi>, \ - typename _Arg3 = _Extra_argument_type<arg3_, _Tp, _Abi>, \ + typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>, \ + typename _Arg3 = _Extra_argument_type<__arg3, _Tp, _Abi>, \ typename _R = _Math_return_type_t< \ decltype(std::__name(declval<double>(), _Arg2::declval(), \ _Arg3::declval())), \ @@ -645,11 +645,8 @@ template <typename 
_Tp, typename _Abi> return __r; } else if constexpr (__is_fixed_size_abi_v<_Abi>) - { - return {__private_init, - _Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))}; + return {__private_init, _Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))}; #if _GLIBCXX_SIMD_X86INTRIN - } else if constexpr (__have_avx512f) { constexpr size_t _Np = simd_size_v<_Tp, _Abi>; @@ -667,8 +664,8 @@ template <typename _Tp, typename _Abi> _Abi::_CommonImpl::_S_blend(_SimdWrapper<bool, _Np>( __isnonzero), __v, __getmant_avx512(__v))}; -#endif // _GLIBCXX_SIMD_X86INTRIN } +#endif // _GLIBCXX_SIMD_X86INTRIN else { // fallback implementation @@ -751,14 +748,7 @@ template <typename _Tp, typename _Abi> if constexpr (_Np == 1) return std::logb(__x[0]); else if constexpr (__is_fixed_size_abi_v<_Abi>) - { - return {__private_init, - __data(__x)._M_apply_per_chunk([](auto __impl, auto __xx) { - using _V = typename decltype(__impl)::simd_type; - return __data( - std::experimental::logb(_V(__private_init, __xx))); - })}; - } + return {__private_init, _Abi::_SimdImpl::_S_logb(__data(__x))}; #if _GLIBCXX_SIMD_X86INTRIN // {{{ else if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>()) return {__private_init, @@ -829,9 +819,7 @@ template <typename _Tp, typename _Abi> enable_if_t<is_floating_point_v<_Tp>, simd<_Tp, _Abi>> modf(const simd<_Tp, _Abi>& __x, simd<_Tp, _Abi>* __iptr) { - if constexpr (__is_scalar_abi<_Abi>() - || (__is_fixed_size_abi_v< - _Abi> && simd_size_v<_Tp, _Abi> == 1)) + if constexpr (simd_size_v<_Tp, _Abi> == 1) { _Tp __tmp; _Tp __r = std::modf(__x[0], &__tmp); @@ -865,22 +853,6 @@ template <typename _Tp, typename _Abi> abs(const simd<_Tp, _Abi>& __x) { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; } -template <typename _Tp, typename _Abi> - enable_if_t<!is_floating_point_v<_Tp> && is_signed_v<_Tp>, simd<_Tp, _Abi>> - fabs(const simd<_Tp, _Abi>& __x) - { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; } - -// the following are overloads 
for functions in <cstdlib> and not covered by -// [parallel.simd.math]. I don't see much value in making them work, though -/* -template <typename _Abi> simd<long, _Abi> labs(const simd<long, _Abi> &__x) -{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; } - -template <typename _Abi> simd<long long, _Abi> llabs(const simd<long long, _Abi> -&__x) -{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; } -*/ - #define _GLIBCXX_SIMD_CVTING2(_NAME) \ template <typename _Tp, typename _Abi> \ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \ @@ -1304,6 +1276,8 @@ template <typename _Tp, typename _Abi> { if constexpr (simd_size_v<_Tp, _Abi> == 1) return std::copysign(__x[0], __y[0]); + else if constexpr (__is_fixed_size_abi_v<_Abi>) + return {__private_init, _Abi::_SimdImpl::_S_copysign(__data(__x), __data(__y))}; else if constexpr (is_same_v<_Tp, long double> && sizeof(_Tp) == 12) // Remove this case once __bit_cast is implemented via __builtin_bit_cast. // It is necessary, because __signmask below cannot be computed at compile @@ -1315,7 +1289,7 @@ template <typename _Tp, typename _Abi> using _V = simd<_Tp, _Abi>; using namespace std::experimental::__float_bitwise_operators; _GLIBCXX_SIMD_USE_CONSTEXPR_API auto __signmask = _V(1) ^ _V(-1); - return (__x & (__x ^ __signmask)) | (__y & __signmask); + return (__x & ~__signmask) | (__y & __signmask); } } @@ -1488,6 +1462,8 @@ template <typename _Tp, typename _Abi> } // }}} +#undef _GLIBCXX_SIMD_CVTING2 +#undef _GLIBCXX_SIMD_CVTING3 #undef _GLIBCXX_SIMD_MATH_CALL_ #undef _GLIBCXX_SIMD_MATH_CALL2_ #undef _GLIBCXX_SIMD_MATH_CALL3_ diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h index 305d7a9..34633c0 100644 --- a/libstdc++-v3/include/experimental/bits/simd_x86.h +++ b/libstdc++-v3/include/experimental/bits/simd_x86.h @@ -2611,13 +2611,14 @@ template <typename _Abi> _S_ldexp(_SimdWrapper<_Tp, _Np> __x, __fixed_size_storage_t<int, _Np> __exp) { 
- if constexpr (__is_avx512_abi<_Abi>()) + if constexpr (sizeof(__x) == 64 || __have_avx512vl) { const auto __xi = __to_intrin(__x); constexpr _SimdConverter<int, simd_abi::fixed_size<_Np>, _Tp, _Abi> __cvt; const auto __expi = __to_intrin(__cvt(__exp)); - constexpr auto __k1 = _Abi::template _S_implicit_mask_intrin<_Tp>(); + using _Up = __bool_storage_member_type_t<_Np>; + constexpr _Up __k1 = _Np < sizeof(_Up) * __CHAR_BIT__ ? _Up((1ULL << _Np) - 1) : ~_Up(); if constexpr (sizeof(__xi) == 16) { if constexpr (sizeof(_Tp) == 8) @@ -2656,13 +2657,13 @@ template <typename _Abi> else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x0b); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x3); + return _mm256_round_ps(__x, 0xb); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x3); + return _mm256_round_pd(__x, 0xb); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x3)); + return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xb)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_round_pd(__x, 0x3); + return _mm_round_pd(__x, 0xb); else if constexpr (__is_sse_ps<_Tp, _Np>()) { auto __truncated @@ -2785,13 +2786,13 @@ template <typename _Abi> else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x09); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x1); + return _mm256_round_ps(__x, 0x9); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x1); + return _mm256_round_pd(__x, 0x9); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_floor_ps(__to_intrin(__x))); + return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x9)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_floor_pd(__x); + return _mm_round_pd(__x, 0x9); else return _Base::_S_floor(__x); } @@ -2807,13 
+2808,13 @@ template <typename _Abi> else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x0a); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x2); + return _mm256_round_ps(__x, 0xa); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x2); + return _mm256_round_pd(__x, 0xa); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_ceil_ps(__to_intrin(__x))); + return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xa)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_ceil_pd(__x); + return _mm_round_pd(__x, 0xa); else return _Base::_S_ceil(__x); } diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono index 4631a72..0b597be 100644 --- a/libstdc++-v3/include/std/chrono +++ b/libstdc++-v3/include/std/chrono @@ -1606,13 +1606,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // [1] https://github.com/cassioneri/calendar // [2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16 + // Furthermore, if y%100 == 0, then y%400==0 is equivalent to y%16==0, + // so we can simplify it to (!mult_100 && y % 4 == 0) || y % 16 == 0, + // which is equivalent to (y & (mult_100 ? 15 : 3)) == 0. + // See https://gcc.gnu.org/pipermail/libstdc++/2021-June/052815.html + constexpr uint32_t __multiplier = 42949673; constexpr uint32_t __bound = 42949669; constexpr uint32_t __max_dividend = 1073741799; constexpr uint32_t __offset = __max_dividend / 2 / 100 * 100; const bool __is_multiple_of_100 = __multiplier * (_M_y + __offset) < __bound; - return (!__is_multiple_of_100 || _M_y % 400 == 0) && _M_y % 4 == 0; + return (_M_y & (__is_multiple_of_100 ? 
15 : 3)) == 0; } explicit constexpr diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex index c18ca1a..eeb51fd 100644 --- a/libstdc++-v3/include/std/mutex +++ b/libstdc++-v3/include/std/mutex @@ -517,9 +517,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Lock the last lockable, after all previous ones are locked. template<typename _Lockable> inline int - __try_lock_impl(_Lockable& __lockable) + __try_lock_impl(_Lockable& __l) { - if (unique_lock<_Lockable> __lock{__lockable, try_to_lock}) + if (unique_lock<_Lockable> __lock{__l, try_to_lock}) { __lock.release(); return -1; @@ -585,7 +585,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * Sequentially calls try_lock() on each argument. */ template<typename _L1, typename _L2, typename... _L3> - int + inline int try_lock(_L1& __l1, _L2& __l2, _L3&... __l3) { return __detail::__try_lock_impl(__l1, __l2, __l3...); diff --git a/libstdc++-v3/testsuite/17_intro/names.cc b/libstdc++-v3/testsuite/17_intro/names.cc index 624e3ed..534dab7 100644 --- a/libstdc++-v3/testsuite/17_intro/names.cc +++ b/libstdc++-v3/testsuite/17_intro/names.cc @@ -16,6 +16,7 @@ // <http://www.gnu.org/licenses/>. // { dg-do compile } +// { dg-add-options no_pch } // Define macros for some common variables names that we must not use for // naming variables, parameters etc. in the library. @@ -216,6 +217,11 @@ #undef y #endif +#if ! __has_include(<newlib.h>) +// newlib's <sys/cdefs.h> defines __lockable as a macro, so we can't use it. 
+# define __lockable cannot be used as an identifier +#endif + #ifdef __sun__ // See https://gcc.gnu.org/ml/libstdc++/2019-05/msg00175.html #undef ptr diff --git a/libstdc++-v3/testsuite/20_util/optional/observers/lwg2762.cc b/libstdc++-v3/testsuite/20_util/optional/observers/lwg2762.cc new file mode 100644 index 0000000..a0cf0bc --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/optional/observers/lwg2762.cc @@ -0,0 +1,21 @@ +// { dg-do compile { target c++17 } } + +// LWG 2762 adds noexcept to operator-> and operator* +#include <optional> + +struct S +{ + void can_throw(); + void cannot_throw() noexcept; +}; + +static_assert( ! noexcept(std::declval<std::optional<S>&>()->can_throw()) ); +static_assert( noexcept(std::declval<std::optional<S>&>()->cannot_throw()) ); + +static_assert( noexcept(std::declval<std::optional<S>&>().operator->()) ); +static_assert( noexcept(std::declval<std::optional<int>&>().operator->()) ); + +static_assert( noexcept(*std::declval<std::optional<int>&>()) ); +static_assert( noexcept(*std::declval<const std::optional<int>&>()) ); +static_assert( noexcept(*std::declval<std::optional<int>&&>()) ); +static_assert( noexcept(*std::declval<const std::optional<int>&&>()) ); diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/array.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/array.cc index f6acb1f..7fd6c01 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/array.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/array.cc @@ -34,6 +34,7 @@ test01() A * const a = new A[2]; const std::shared_ptr<A[2]> p(a); VERIFY( p.get() == a ); + static_assert( noexcept(p.get()), "non-throwing" ); } // get @@ -43,6 +44,7 @@ test02() A * const a = new A[2]; const std::shared_ptr<A[]> p(a); VERIFY( p.get() == a ); + static_assert( noexcept(p.get()), "non-throwing" ); } // operator[] @@ -52,6 +54,7 @@ test03() A * const a = new A[2]; const std::shared_ptr<A[2]> p(a); VERIFY( &p[0] == a ); + static_assert( 
noexcept(p[0]), "non-throwing" ); } // operator[] @@ -61,6 +64,7 @@ test04() A * const a = new A[2]; const std::shared_ptr<A[]> p(a); VERIFY( &p[0] == a ); + static_assert( noexcept(p[0]), "non-throwing" ); } int diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc index cd1282b..6f2cb9f 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc @@ -37,6 +37,7 @@ test01() A * const a = new A; const std::shared_ptr<A> p(a); VERIFY( p.get() == a ); + static_assert( noexcept(p.get()), "non-throwing" ); } // operator* @@ -46,6 +47,7 @@ test02() A * const a = new A; const std::shared_ptr<A> p(a); VERIFY( &*p == a ); + static_assert( noexcept(*p), "non-throwing" ); } // operator-> @@ -55,6 +57,7 @@ test03() A * const a = new A; const std::shared_ptr<A> p(a); VERIFY( &p->i == &a->i ); + static_assert( noexcept(p->i), "non-throwing" ); } void @@ -67,7 +70,7 @@ test04() #endif } -int +int main() { test01(); diff --git a/libstdc++-v3/testsuite/20_util/unique_ptr/lwg2762.cc b/libstdc++-v3/testsuite/20_util/unique_ptr/lwg2762.cc new file mode 100644 index 0000000..3cc2ea6 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/unique_ptr/lwg2762.cc @@ -0,0 +1,43 @@ +// { dg-do compile { target c++11 } } +#include <memory> + +// 2762. 
unique_ptr operator*() should be noexcept +static_assert( noexcept(*std::declval<std::unique_ptr<long>>()), "LWG 2762" ); + +template<bool B> +struct deleter +{ + struct pointer + { + int& operator*() && noexcept(B); // this is used by unique_ptr + int& operator*() const& = delete; // this should not be + + int& operator[](std::size_t) && noexcept(B); // this is used by unique_ptr + int& operator[](std::size_t) const& = delete; // should not be used + int& operator[](int) && = delete; // should not be used + int& operator[](double) && = delete; // should not be used + + int* operator->() noexcept(false); // noexcept here doesn't affect anything + + // Needed for NullablePointer requirements + pointer(int* = nullptr); + bool operator==(const pointer&) const noexcept; + bool operator!=(const pointer&) const noexcept; + }; + + void operator()(pointer) const noexcept { } +}; + +template<typename T, bool Nothrow> + using UPtr = std::unique_ptr<T, deleter<Nothrow>>; + +// noexcept-specifier depends on the pointer type +static_assert( noexcept(*std::declval<UPtr<int, true>&>()), "" ); +static_assert( ! noexcept(*std::declval<UPtr<int, false>&>()), "" ); + +// This has always been required, even in C++11. +static_assert( noexcept(std::declval<UPtr<int, false>&>().operator->()), "" ); + +// This is not required by the standard +static_assert( noexcept(std::declval<UPtr<int[], true>&>()[0]), "" ); +static_assert( ! 
noexcept(std::declval<UPtr<int[], false>&>()[0]), "" ); diff --git a/libstdc++-v3/testsuite/23_containers/map/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/map/cons/deduction.cc index e9628c4..e72033c 100644 --- a/libstdc++-v3/testsuite/23_containers/map/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/map/cons/deduction.cc @@ -40,7 +40,7 @@ static_assert(std::is_same_v< */ static_assert(std::is_same_v< - decltype(std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, {}}), + decltype(std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, std::less<int>{}}), std::map<int, double>>); /* This is not deducible, ambiguous candidates: @@ -92,7 +92,7 @@ void f() static_assert(std::is_same_v< decltype(std::map(x.begin(), x.end(), - {})), + std::less<int>{})), std::map<int, double>>); static_assert(std::is_same_v< @@ -145,7 +145,7 @@ void g() static_assert(std::is_same_v< decltype(std::map(x.begin(), x.end(), - {})), + std::less<int>{})), std::map<int, double>>); static_assert(std::is_same_v< @@ -195,7 +195,7 @@ void h() static_assert(std::is_same_v< decltype(std::map(x.begin(), x.end(), - {})), + std::less<int>{})), std::map<int, double>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/23_containers/multimap/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/multimap/cons/deduction.cc index 791cc96..ffc7502 100644 --- a/libstdc++-v3/testsuite/23_containers/multimap/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/multimap/cons/deduction.cc @@ -42,7 +42,7 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::multimap{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, - {}}), + std::less<int>{}}), std::multimap<int, double>>); static_assert(std::is_same_v< @@ -77,7 +77,7 @@ void f() static_assert(std::is_same_v< decltype(std::multimap(x.begin(), x.end(), - {})), + std::less<int>{})), std::multimap<int, double>>); static_assert(std::is_same_v< @@ -119,7 +119,7 @@ void g() static_assert(std::is_same_v< 
decltype(std::multimap(x.begin(), x.end(), - {})), + std::less<int>{})), std::multimap<int, double>>); static_assert(std::is_same_v< @@ -158,7 +158,7 @@ void h() static_assert(std::is_same_v< decltype(std::multimap(x.begin(), x.end(), - {})), + std::less<int>{})), std::multimap<int, double>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/23_containers/multiset/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/multiset/cons/deduction.cc index ad12755..8b7a160 100644 --- a/libstdc++-v3/testsuite/23_containers/multiset/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/multiset/cons/deduction.cc @@ -20,7 +20,7 @@ static_assert(std::is_same_v< std::multiset<int>>); static_assert(std::is_same_v< - decltype(std::multiset{{1, 2, 3}, {}}), + decltype(std::multiset{{1, 2, 3}, std::less<int>{}}), std::multiset<int>>); static_assert(std::is_same_v< @@ -52,7 +52,7 @@ void f() static_assert(std::is_same_v< decltype(std::multiset(x.begin(), x.end(), - {})), + std::less<int>{})), std::multiset<int>>); static_assert(std::is_same_v< @@ -103,7 +103,7 @@ void g() static_assert(std::is_same_v< decltype(std::multiset(x.begin(), x.end(), - {})), + std::less<int>{})), std::multiset<int>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/23_containers/set/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/set/cons/deduction.cc index 89a2c43..14f36b7 100644 --- a/libstdc++-v3/testsuite/23_containers/set/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/set/cons/deduction.cc @@ -22,7 +22,7 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::set{{1, 2, 3}, - {}}), + std::less<int>{}}), std::set<int>>); static_assert(std::is_same_v< @@ -58,7 +58,7 @@ void f() static_assert(std::is_same_v< decltype(std::set(x.begin(), x.end(), - {})), + std::less<int>{})), std::set<int>>); static_assert(std::is_same_v< @@ -104,7 +104,7 @@ void g() static_assert(std::is_same_v< decltype(std::set(x.begin(), x.end(), 
- {})), + std::less<int>{})), std::set<int>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/unordered_map/cons/deduction.cc index d8489b2..0785447 100644 --- a/libstdc++-v3/testsuite/23_containers/unordered_map/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/unordered_map/cons/deduction.cc @@ -24,7 +24,13 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::unordered_map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, std::equal_to<int>{}}), + std::unordered_map<int, double>>); + +static_assert(std::is_same_v< + decltype(std::unordered_map{{std::pair{1, 2.0}, + {2, 3.0}, {3, 4.0}}, + {}, std::hash<int>{}, std::allocator<std::pair<const int, double>>{}}), std::unordered_map<int, double>>); static_assert(std::is_same_v< @@ -59,9 +65,14 @@ void f() static_assert(std::is_same_v< decltype(std::unordered_map{x.begin(), x.end(), - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, std::equal_to<int>{}}), std::unordered_map<int, double>>); - + + static_assert(std::is_same_v< + decltype(std::unordered_map{x.begin(), x.end(), + {}, std::hash<int>{}, std::allocator<std::pair<const int, double>>{}}), + std::unordered_map<int, double>>); + static_assert(std::is_same_v< decltype(std::unordered_map(x.begin(), x.end(), {})), diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multimap/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/unordered_multimap/cons/deduction.cc index 13f54d4..d8a6f51 100644 --- a/libstdc++-v3/testsuite/23_containers/unordered_multimap/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/unordered_multimap/cons/deduction.cc @@ -18,7 +18,13 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::unordered_multimap{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, 
std::equal_to<int>{}}), + std::unordered_multimap<int, double>>); + +static_assert(std::is_same_v< + decltype(std::unordered_multimap{{std::pair{1, 2.0}, + {2, 3.0}, {3, 4.0}}, + {}, std::hash<int>{}, std::allocator<std::pair<const int, double>>{}}), std::unordered_multimap<int, double>>); static_assert(std::is_same_v< @@ -68,9 +74,14 @@ void f() static_assert(std::is_same_v< decltype(std::unordered_multimap{x.begin(), x.end(), - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, std::equal_to<int>{}}), std::unordered_multimap<int, double>>); - + + static_assert(std::is_same_v< + decltype(std::unordered_multimap{x.begin(), x.end(), + {}, std::hash<int>{}, std::allocator<std::pair<const int, double>>{}}), + std::unordered_multimap<int, double>>); + static_assert(std::is_same_v< decltype(std::unordered_multimap(x.begin(), x.end(), {})), diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/unordered_multiset/cons/deduction.cc index 1850237..25c2715 100644 --- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/cons/deduction.cc @@ -11,7 +11,12 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::unordered_multiset{{1, 2, 3}, - 0, std::hash<int>{}, {}}), + 0, std::hash<int>{}, std::equal_to<int>{}}), + std::unordered_multiset<int>>); + +static_assert(std::is_same_v< + decltype(std::unordered_multiset{{1, 2, 3}, + 0, std::hash<int>{}, std::allocator<int>{}}), std::unordered_multiset<int>>); static_assert(std::is_same_v< @@ -78,7 +83,12 @@ void f() static_assert(std::is_same_v< decltype(std::unordered_multiset{x.begin(), x.end(), - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, std::equal_to<int>{}}), + std::unordered_multiset<int>>); + + static_assert(std::is_same_v< + decltype(std::unordered_multiset{x.begin(), x.end(), + {}, std::hash<int>{}, std::allocator<int>{}}), 
std::unordered_multiset<int>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/cons/deduction.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/cons/deduction.cc index a745dce..b8c45d2 100644 --- a/libstdc++-v3/testsuite/23_containers/unordered_set/cons/deduction.cc +++ b/libstdc++-v3/testsuite/23_containers/unordered_set/cons/deduction.cc @@ -11,7 +11,12 @@ static_assert(std::is_same_v< static_assert(std::is_same_v< decltype(std::unordered_set{{1, 2, 3}, - 0, std::hash<int>{}, {}}), + 0, std::hash<int>{}, std::equal_to<int>{}}), + std::unordered_set<int>>); + +static_assert(std::is_same_v< + decltype(std::unordered_set{{1, 2, 3}, + 0, std::hash<int>{}, std::allocator<int>{}}), std::unordered_set<int>>); static_assert(std::is_same_v< @@ -73,7 +78,12 @@ void f() static_assert(std::is_same_v< decltype(std::unordered_set{x.begin(), x.end(), - {}, std::hash<int>{}, {}}), + {}, std::hash<int>{}, std::equal_to<int>{}}), + std::unordered_set<int>>); + + static_assert(std::is_same_v< + decltype(std::unordered_set{x.begin(), x.end(), + {}, std::hash<int>{}, std::allocator<int>{}}), std::unordered_set<int>>); static_assert(std::is_same_v< diff --git a/libstdc++-v3/testsuite/30_threads/try_lock/5.cc b/libstdc++-v3/testsuite/30_threads/try_lock/5.cc index a5574ff..b9ce1cc 100644 --- a/libstdc++-v3/testsuite/30_threads/try_lock/5.cc +++ b/libstdc++-v3/testsuite/30_threads/try_lock/5.cc @@ -1,4 +1,7 @@ -// { dg-do run { target c++11 } } +// { dg-do run } +// { dg-additional-options "-pthread" { target pthread } } +// { dg-require-effective-target c++11 } +// { dg-require-gthreads "" } #include <mutex> #include <testsuite_hooks.h> diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am index ba5023a..d2011f0 100644 --- a/libstdc++-v3/testsuite/Makefile.am +++ b/libstdc++-v3/testsuite/Makefile.am @@ -191,10 +191,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \ 
${glibcxx_srcdir}/scripts/check_simd \ testsuite_files_simd \ ${glibcxx_builddir}/scripts/testsuite_flags - @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834." @rm -f .simd.summary @echo "Generating simd testsuite subdirs and Makefiles ..." - @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \ + @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \ while read subdir; do \ $(MAKE) -C "$${subdir}"; \ tail -n20 $${subdir}/simd_testsuite.sum | \ diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in index c9dd7f5..c65cdaf 100644 --- a/libstdc++-v3/testsuite/Makefile.in +++ b/libstdc++-v3/testsuite/Makefile.in @@ -716,10 +716,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \ ${glibcxx_srcdir}/scripts/check_simd \ testsuite_files_simd \ ${glibcxx_builddir}/scripts/testsuite_flags - @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834." @rm -f .simd.summary @echo "Generating simd testsuite subdirs and Makefiles ..." - @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \ + @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \ while read subdir; do \ $(MAKE) -C "$${subdir}"; \ tail -n20 $${subdir}/simd_testsuite.sum | \ diff --git a/libstdc++-v3/testsuite/experimental/simd/README.md b/libstdc++-v3/testsuite/experimental/simd/README.md new file mode 100644 index 0000000..b82453d --- /dev/null +++ b/libstdc++-v3/testsuite/experimental/simd/README.md @@ -0,0 +1,257 @@ +# SIMD Tests + +To execute the simd testsuite, call `make check-simd`, typically with `-j N` +argument. + +For more control over verbosity, compiler flags, and use of a simulator, use +the environment variables documented below. + +## Environment variables + +### `target_list` + +Similar to dejagnu target lists: E.g. 
+`target_list="unix{-march=sandybridge,-march=native/-ffast-math,-march=native/-ffinite-math-only}"` +would create three subdirs in `testsuite/simd/` to run the complete simd +testsuite first with `-march=sandybridge`, then with `-march=native +-ffast-math`, and finally with `-march=native -ffinite-math-only`. + + +### `CHECK_SIMD_CONFIG` + +This variable can be set to a path to a file which is equivalent to a dejagnu +board. The file needs to be a valid `sh` script since it is sourced from the +`scripts/check_simd` script. Its purpose is to set the `target_list` variable +depending on `$target_triplet` (or whatever else makes sense for you). Example: + +```sh +case "$target_triplet" in +x86_64-*) + target_list="unix{-march=sandybridge,-march=skylake-avx512,-march=native/-ffast-math,-march=athlon64,-march=core2,-march=nehalem,-march=skylake,-march=native/-ffinite-math-only,-march=knl}" + ;; + +powerpc64le-*) + define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112" + define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc112" + define_target power9 "-mcpu=power9 -static" "$HOME/bin/run_on_gccfarm gcc135" + target_list="power7 power8 power9{,-ffast-math}" + ;; + +powerpc64-*) + define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc110" + define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc110" + target_list="power7 power8{,-ffast-math}" + ;; +esac +``` + +The `unix` target is pre-defined to have no initial flags and no simulator. Use +the `define_target(name, flags, sim)` function to define your own targets for +the `target_list` variable. In the example above `define_target power7 +"-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112"` defines the target +`power7` which always uses the flags `-mcpu=power7` and `-static` when +compiling tests and prepends `$HOME/bin/run_on_gccfarm gcc112` to test +executables. In `target_list` you can now use the name `power7`. E.g. 
+`target_list="power7 power7/-ffast-math"` or its shorthand +`target_list="power7{,-ffast-math}"`. + + +### `DRIVEROPTS` + +This variable affects the `Makefile`s generated per target (as defined above). +It's a string of flags that are prepended to the `driver.sh` invocation which +builds and runs the tests. You `cd` into a simd test subdir and use `make help` +to see possible options and a list of all valid targets. + +``` +use DRIVEROPTS=<options> to pass the following options: +-q, --quiet Disable same-line progress output (default if stdout is + not a tty). +-p, --percentage Add percentage to default same-line progress output. +-v, --verbose Print one line per test and minimal extra information on + failure. +-vv Print all compiler and test output. +-k, --keep-failed Keep executables of failed tests. +--sim <executable> Path to an executable that is prepended to the test + execution binary (default: the value of + GCC_TEST_SIMULATOR). +--timeout-factor <x> + Multiply the default timeout with x. +-x, --run-expensive Compile and run tests marked as expensive (default: + true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise). +-o <pattern>, --only <pattern> + Compile and run only tests matching the given pattern. +``` + + +### `TESTFLAGS` + +This variable also affects the `Makefile`s generated per target. It's a list of +compiler flags that are appended to `CXXFLAGS`. + + +### `GCC_TEST_SIMULATOR` + +If `--sim` is not passed via `DRIVEROPTS`, then this variable is prepended to +test invocations. If a simulator was defined via the `CHECK_SIMD_CONFIG` +script, then the generated `Makefile` sets the `GCC_TEST_SIMULATOR` variable. + + +### `GCC_TEST_RUN_EXPENSIVE` + +If set to any non-empty string, run tests marked as expensive, otherwise treat +these tests as `UNSUPPORTED`. 
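Taken together, a run of the simd testsuite might combine these variables as follows. This is only a sketch; the simulator path, job count, and option values are hypothetical placeholders, not recommendations:

```shell
# Hypothetical combination of the environment variables documented above;
# every value here is a placeholder.
export GCC_TEST_RUN_EXPENSIVE=yes                 # also run expensive tests
export GCC_TEST_SIMULATOR="$HOME/bin/run_on_gccfarm gcc135"  # prepended to test binaries
export DRIVEROPTS="-p --timeout-factor 2"         # forwarded to driver.sh
export TESTFLAGS="-O2"                            # appended to CXXFLAGS
# make -j8 check-simd    # run from a configured libstdc++ build tree
echo "expensive=$GCC_TEST_RUN_EXPENSIVE driveropts=$DRIVEROPTS"
```

The `make check-simd` line is commented out because it only works inside a configured libstdc++ build directory.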
+ + +## Writing new tests + +A test starts with the copyright header, directly followed by directives +influencing the set of tests to generate and whether the test driver should +expect a failure. + +Then the test must at least `#include "bits/verify.h"`, which provides `main` +and declares a `template <typename V> void test()` function, which the test has +to define. The template parameter is set to a `simd<T, Abi>` type where `T` and +`Abi` are determined by the type and ABI subset dimensions. + +The `test()` functions are typically implemented using the `COMPARE(x, +reference)`, `VERIFY(boolean)`, and `ULP_COMPARE(x, reference, +allowed_distance)` macros. + +### Directives + +* `// skip: <type pattern> <ABI subset pattern> <target triplet pattern> + <CXXFLAGS pattern>` + If all patterns match, the test is silently skipped. + +* `// only: <type pattern> <ABI subset pattern> <target triplet pattern> + <CXXFLAGS pattern>` + If any pattern doesn't match, the test is silently skipped. + +* `// expensive: <type pattern> <ABI subset pattern> <target triplet pattern> + <CXXFLAGS pattern>` + If all patterns match, the test is `UNSUPPORTED` unless expensive tests are + enabled. + +* `// xfail: run|compile <type pattern> <ABI subset pattern> <target triplet + pattern> <CXXFLAGS pattern>` + If all patterns match, test compilation or execution is expected to fail. The + test then shows as "XFAIL: ...". If the test passes, the test shows "XPASS: + ...". + +All patterns are matched via +```sh +case '<test context>' in + <pattern>) + # treat as match + ;; +esac +``` +The `<CXXFLAGS pattern>` implicitly adds a `*` wildcard before and after the +pattern. Thus, the `CXXFLAGS` pattern matches a substring and all other +patterns require a full match. 
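As a sketch of the matching rules above, the hypothetical `matches` helper below (not part of the testsuite) reproduces the `case`-based semantics: the first three patterns must match their value in full, while the `CXXFLAGS` pattern is wrapped in implicit `*` wildcards and therefore matches a substring:

```shell
# Hypothetical sketch of directive pattern matching (not testsuite code).
# Patterns are expanded unquoted in the case statement, so glob
# characters like '*' and '[1-9]' keep their pattern meaning.
matches() {
  type_pat=$1 abi_pat=$2 triplet_pat=$3 flags_pat=$4
  type=$5 abi=$6 triplet=$7 flags=$8
  case "$type"    in $type_pat)    ;; *) return 1 ;; esac
  case "$abi"     in $abi_pat)     ;; *) return 1 ;; esac
  case "$triplet" in $triplet_pat) ;; *) return 1 ;; esac
  # CXXFLAGS pattern matches a substring, hence the surrounding wildcards:
  case "$flags"   in *$flags_pat*) ;; *) return 1 ;; esac
}

# "skip: ldouble * powerpc64* *" applied to an ldouble run on powerpc64le:
if matches 'ldouble' '*' 'powerpc64*' '*' ldouble 3 powerpc64le-linux-gnu '-O2'
then
  echo "directive matches: test skipped"
fi
```

Note that alternation such as `float|double|ldouble` is `case` syntax rather than pattern syntax, so the real driver cannot match it through a plain variable expansion like this helper does; the sketch sticks to single glob patterns.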
+ +Examples: +```cpp +// The test is only valid for floating-point types: +// only: float|double|ldouble * * * + +// Skip the test for long double for all powerpc64* targets: +// skip: ldouble * powerpc64* * + +// The test is expected to unconditionally fail on execution: +// xfail: run * * * * + +// ABI subsets 1-9 are considered expensive: +// expensive: * [1-9] * * +``` + + +## Implementation sketch + +* `scripts/create_testsuite_files` collects all `*.c` and `*.cc` files with + `simd/tests/` in their path into the file `testsuite_files_simd` (and at the + same time removes them from `testsuite_files`). + +* The `check-simd` target in `testsuite/Makefile.am` calls + `scripts/check_simd`. This script calls + `testsuite/experimental/simd/generate_makefile.sh` to generate `Makefile`s in + all requested subdirectories. The subdirectories are communicated back to the + make target via a `stdout` pipe. The `check-simd` rule then spawns sub-make + in these subdirectories. Finally, it collects all summaries + (`simd_testsuite.sum`) to present them at the end of the rule. + +* The generated Makefiles define targets for each file in `testsuite_files_simd` + (you can edit this file after it was generated, though that's not + recommended) while adding two test dimensions: type and ABI subset. The type + is a list of all arithmetic types, potentially reduced via `only` and/or + `skip` directives in the test's source file. The ABI subset is a number + between 0 and 9 (inclusive) mapping to a set of `simd_abi`s in + `testsuite/experimental/simd/tests/bits/verify.h` (`iterate_abis()`). The + tests are thus potentially compiled 170 (17 arithmetic types * 10 ABI + subsets) times. This is necessary to limit the memory usage of GCC to + reasonable numbers and keep the compile time below 1 minute (per compiler + invocation). 
+ +* When `make` executes in the generated subdir, the `all` target depends on + building and running all tests via `testsuite/experimental/simd/driver.sh` + and collecting their logs into a `simd_testsuite.log` and then extracting + `simd_testsuite.sum` from it. + +* The `driver.sh` script builds and runs the test, parses the compiler and test + output, and prints progress information to the terminal. + +## Appendix + +### `run_on_gccfarm` script + +```sh +#!/bin/sh +usage() { + cat <<EOF +Usage $0 <hostname> <executable> [arguments] + +Copies <executable> to $host, executes it and cleans up again. +EOF +} + +[ $# -lt 2 ] && usage && exit 1 +case "$1" in + -h|--help) + usage + exit + ;; +esac + +host="$1" +exe="$2" +shift 2 + +# Copy executable locally to strip it before scp to remote host +local_tmpdir=$(mktemp -d) +cp "$exe" $local_tmpdir +cd $local_tmpdir +exe="${exe##*/}" +powerpc64le-linux-gnu-strip "$exe" + +ssh_controlpath=~/.local/run_on_gccfarm/$host +if [ ! -S $ssh_controlpath ]; then + mkdir -p ~/.local/run_on_gccfarm + ( + flock -n 9 + if [ ! -S $ssh_controlpath ]; then + ssh -o ControlMaster=yes -o ControlPath=$ssh_controlpath -o ControlPersist=10m $host.fsffrance.org true + fi + ) 9> ~/.local/run_on_gccfarm/lockfile +fi +opts="-o ControlPath=$ssh_controlpath" + +remote_tmpdir=$(ssh $opts $host.fsffrance.org mktemp -d -p .) +scp $opts -C -q "$exe" $host.fsffrance.org:$remote_tmpdir/ +cd +rm -r "$local_tmpdir" & +ssh $opts $host.fsffrance.org $remote_tmpdir/$exe "$@" +ret=$? 
+ssh $opts $host.fsffrance.org rm -r $remote_tmpdir & +exit $ret +``` diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh index f2d31c7..5ae9905 100755 --- a/libstdc++-v3/testsuite/experimental/simd/driver.sh +++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh @@ -5,8 +5,22 @@ abi=0 name= srcdir="$(cd "${0%/*}" && pwd)/tests" sim="$GCC_TEST_SIMULATOR" -quiet=false -verbose=false + +# output_mode values: +# print only failures with minimal context +readonly really_quiet=0 +# as above plus same-line output of last successful test +readonly same_line=1 +# as above plus percentage +readonly percentage=2 +# print one line per finished test with minimal context on failure +readonly verbose=3 +# print one line per finished test with full output of the compiler and test +readonly really_verbose=4 + +output_mode=$really_quiet +[ -t 1 ] && output_mode=$same_line + timeout=180 run_expensive=false if [ -n "$GCC_TEST_RUN_EXPENSIVE" ]; then @@ -21,8 +35,12 @@ Usage: $0 [Options] <g++ invocation> Options: -h, --help Print this message and exit. - -q, --quiet Only print failures. - -v, --verbose Print compiler and test output on failure. + -q, --quiet Disable same-line progress output (default if stdout is + not a tty). + -p, --percentage Add percentage to default same-line progress output. + -v, --verbose Print one line per test and minimal extra information on + failure. + -vv Print all compiler and test output. -t <type>, --type <type> The value_type to test (default: $type). -a [0-9], --abi [0-9] @@ -36,9 +54,10 @@ Options: GCC_TEST_SIMULATOR). --timeout-factor <x> Multiply the default timeout with x. - --run-expensive Compile and run tests marked as expensive (default: + -x, --run-expensive Compile and run tests marked as expensive (default: true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise). - --only <pattern> Compile and run only tests matching the given pattern. 
+ -o <pattern>, --only <pattern> + Compile and run only tests matching the given pattern. EOF } @@ -49,71 +68,74 @@ while [ $# -gt 0 ]; do exit ;; -q|--quiet) - quiet=true + output_mode=$really_quiet + ;; + -p|--percentage) + output_mode=$percentage ;; -v|--verbose) - verbose=true + if [ $output_mode -lt $verbose ]; then + output_mode=$verbose + else + output_mode=$really_verbose + fi ;; - --run-expensive) + -x|--run-expensive) run_expensive=true ;; -k|--keep-failed) keep_failed=true ;; - --only) + -o|--only) only="$2" shift ;; - --only=*) - only="${1#--only=}" - ;; -t|--type) type="$2" shift ;; - --type=*) - type="${1#--type=}" - ;; -a|--abi) abi="$2" shift ;; - --abi=*) - abi="${1#--abi=}" - ;; -n|--name) name="$2" shift ;; - --name=*) - name="${1#--name=}" - ;; --srcdir) srcdir="$2" shift ;; - --srcdir=*) - srcdir="${1#--srcdir=}" - ;; --sim) sim="$2" shift ;; - --sim=*) - sim="${1#--sim=}" - ;; --timeout-factor) timeout=$(awk "BEGIN { print int($timeout * $2) }") shift ;; - --timeout-factor=*) - x=${1#--timeout-factor=} - timeout=$(awk "BEGIN { print int($timeout * $x) }") - ;; --) shift break ;; + --*=*) + opt="$1" + shift + value=${opt#*=} + set -- ${opt%=$value} "$value" ${1+"$@"} + continue + ;; + -[ahknopqtvx][ahknopqtvx]*) + opt="$1" + shift + next=${opt#??} + set -- ${opt%$next} "-$next" ${1+"$@"} + continue + ;; + -*) + echo "Error: Unrecognized option '$1'" >&2 + exit 1 + ;; *) break ;; @@ -121,6 +143,17 @@ while [ $# -gt 0 ]; do shift done +if [ $output_mode = $percentage ]; then + inc_progress() { + { + flock -n 9 + n=$(($(cat .progress) + 1)) + echo $n >&9 + echo $n + } 9<>.progress + } +fi + CXX="$1" shift CXXFLAGS="$@" @@ -133,6 +166,7 @@ sum="${testname}.sum" if [ -n "$only" ]; then if echo "$testname"|awk "{ exit /$only/ }"; then touch "$log" "$sum" + [ $output_mode = $percentage ] && inc_progress >/dev/null exit 0 fi fi @@ -146,35 +180,58 @@ else exit 1 fi +if [ $output_mode = $percentage ]; then + show_progress() { + n=$(inc_progress) + read 
total < .progress_total + total=${total}0 + printf "\e[1G\e[K[%3d %%] ${src##*/} $type $abiflag" \ + $((n * 1005 / total)) + } + trap 'show_progress' EXIT + prefix="\e[1G\e[K" +elif [ $output_mode = $same_line ]; then + show_progress() { + printf "\e[1G\e[K${src##*/} $type $abiflag" + } + trap 'show_progress' EXIT + prefix="\e[1G\e[K" +else + prefix="" +fi + fail() { + printf "$prefix" echo "FAIL: $src $type $abiflag ($*)" | tee -a "$sum" "$log" } xpass() { + printf "$prefix" echo "XPASS: $src $type $abiflag ($*)" | tee -a "$sum" "$log" } xfail() { - $quiet || echo "XFAIL: $src $type $abiflag ($*)" + [ $output_mode -ge $verbose ] && echo "XFAIL: $src $type $abiflag ($*)" echo "XFAIL: $src $type $abiflag ($*)" >> "$sum" echo "XFAIL: $src $type $abiflag ($*)" >> "$log" } pass() { - $quiet || echo "PASS: $src $type $abiflag ($*)" + [ $output_mode -ge $verbose ] && echo "PASS: $src $type $abiflag ($*)" echo "PASS: $src $type $abiflag ($*)" >> "$sum" echo "PASS: $src $type $abiflag ($*)" >> "$log" } unsupported() { - $quiet || echo "UNSUPPORTED: $src $type $abiflag ($*)" + test + [ $output_mode -ge $verbose ] && echo "UNSUPPORTED: $src $type $abiflag ($*)" echo "UNSUPPORTED: $src $type $abiflag ($*)" >> "$sum" echo "UNSUPPORTED: $src $type $abiflag ($*)" >> "$log" } write_log_and_verbose() { echo "$*" >> "$log" - if $verbose; then + if [ $output_mode = $really_verbose ]; then if [ -z "$COLUMNS" ] || ! type fmt>/dev/null; then echo "$*" else @@ -265,7 +322,7 @@ if read_src_option timeout-factor factor; then fi log_output() { - if $verbose; then + if [ $output_mode = $really_verbose ]; then maxcol=${1:-1024} awk " BEGIN { count = 0 } @@ -323,7 +380,7 @@ verify_compilation() { warnings=$(grep -ic 'warning:' "$log") if [ $warnings -gt 0 ]; then fail "excess warnings:" $warnings - if ! $verbose && ! 
$quiet; then + if [ $output_mode = $verbose ]; then grep -i 'warning:' "$log" | head -n5 fi elif [ "$xfail" = "compile" ]; then @@ -344,7 +401,7 @@ verify_compilation() { fail "excess errors:" $errors fi fi - if ! $verbose && ! $quiet; then + if [ $output_mode = $verbose ]; then grep -i 'error:' "$log" | head -n5 fi return 1 @@ -365,7 +422,7 @@ verify_test() { return 0 else $keep_failed || rm "$exe" - if ! $verbose && ! $quiet; then + if [ $output_mode = $verbose ]; then grep -i fail "$log" | head -n5 fi if [ $exitstatus -eq 124 ]; then diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh index 4fb710c..ce5162a 100755 --- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh +++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh @@ -97,7 +97,7 @@ driveroptions := \$(DRIVEROPTS) all: simd_testsuite.sum -simd_testsuite.sum: simd_testsuite.log +simd_testsuite.sum: .progress .progress_total simd_testsuite.log @printf "\n\t\t=== simd_testsuite \$(test_flags) Summary ===\n\n"\\ "# of expected passes:\t\t\$(shell grep -c '^PASS:' \$@)\n"\\ "# of unexpected passes:\t\t\$(shell grep -c '^XPASS:' \$@)\n"\\ @@ -255,7 +255,7 @@ EOF done cat <<EOF run-%: export GCC_TEST_RUN_EXPENSIVE=yes -run-%: driveroptions += -v +run-%: driveroptions += -vv run-%: %.log @rm \$^ \$(^:log=sum) @@ -266,17 +266,22 @@ EOF dsthelp="${dst%Makefile}.make_help.txt" cat <<EOF > "$dsthelp" use DRIVEROPTS=<options> to pass the following options: --q, --quiet Only print failures. --v, --verbose Print compiler and test output on failure. +-q, --quiet Disable same-line progress output (default if stdout is + not a tty). +-p, --percentage Add percentage to default same-line progress output. +-v, --verbose Print one line per test and minimal extra information on + failure. +-vv Print all compiler and test output. -k, --keep-failed Keep executables of failed tests. 
--sim <executable> Path to an executable that is prepended to the test execution binary (default: the value of GCC_TEST_SIMULATOR). --timeout-factor <x> Multiply the default timeout with x. ---run-expensive Compile and run tests marked as expensive (default: +-x, --run-expensive Compile and run tests marked as expensive (default: true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise). ---only <pattern> Compile and run only tests matching the given pattern. +-o <pattern>, --only <pattern> + Compile and run only tests matching the given pattern. use TESTFLAGS=<flags> to pass additional compiler flags @@ -285,9 +290,13 @@ The following are some of the valid targets for this Makefile: ... clean ... help" EOF + N=$(((0$( + all_tests | while read file && read name; do + all_types "$file" | printf " + %d" $(wc -l) + done) ) * 5)) all_tests | while read file && read name; do echo "... run-${name}" - all_types | while read t && read type; do + all_types "$file" | while read t && read type; do echo "... run-${name}-${type}" for i in $(seq 0 9); do echo "... run-${name}-${type}-$i" @@ -296,10 +305,16 @@ EOF done >> "$dsthelp" cat <<EOF +.progress: + @echo 0 > .progress + +.progress_total: + @echo $N > .progress_total + clean: - rm -f -- *.sum *.log *.exe + rm -f -- *.sum *.log *.exe .progress .progress_total -.PHONY: clean help +.PHONY: all clean help .progress .progress_total .PRECIOUS: %.log %.sum EOF |