aboutsummaryrefslogtreecommitdiff
path: root/gcc/internal-fn.c
AgeCommit message (Collapse)AuthorFilesLines
2018-08-03Handle SLP of call pattern statementsRichard Sandiford1-0/+36
We couldn't vectorise: for (int j = 0; j < n; ++j) { for (int i = 0; i < 16; ++i) a[i] = (b[i] + c[i]) >> 1; a += step; b += step; c += step; } at -O3 because cunrolli unrolled the inner loop and SLP couldn't handle AVG_FLOOR patterns (see also PR86504). The problem was some overly strict checking of pattern statements compared to normal statements in vect_get_and_check_slp_defs: switch (gimple_code (def_stmt)) { case GIMPLE_PHI: case GIMPLE_ASSIGN: break; default: if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, "unsupported defining stmt:\n"); return -1; } The easy fix would have been to add GIMPLE_CALL to the switch, but I don't think the switch is doing anything useful. We only create pattern statements that the rest of the vectoriser can handle, and the other checks in this function and elsewhere check whether SLP is possible. I'm also not sure why: if (!first && !oprnd_info->first_pattern /* Allow different pattern state for the defs of the first stmt in reduction chains. */ && (oprnd_info->first_dt != vect_reduction_def is necessary. All that should matter is that the statements in the node are "similar enough". It turned out to be quite hard to find a convincing example that used a mixture of pattern and non-pattern statements, so bb-slp-pow-1.c is the best I could come up with. But it does show that the combination of "xi * xi" statements and "pow (xj, 2) -> xj * xj" patterns are handled correctly. The patch therefore just removes the whole if block. The loop also needed commutative swapping to be extended to at least AVG_FLOOR. This gives +3.9% on 525.x264_r at -O3. 2018-08-03 Richard Sandiford <richard.sandiford@arm.com> gcc/ * internal-fn.h (first_commutative_argument): Declare. * internal-fn.c (first_commutative_argument): New function. * tree-vect-slp.c (vect_get_and_check_slp_defs): Remove extra restrictions for pattern statements. Use first_commutative_argument to look for commutative operands in calls to internal functions. gcc/testsuite/ * gcc.dg/vect/bb-slp-over-widen-1.c: Expect AVG_FLOOR to be used on vect_avg_qi targets. * gcc.dg/vect/bb-slp-over-widen-2.c: Likewise. * gcc.dg/vect/bb-slp-pow-1.c: New test. * gcc.dg/vect/vect-avg-15.c: Likewise. From-SVN: r263290
2018-07-12Use conditional internal functions in if-conversionRichard Sandiford1-1/+22
This patch uses IFN_COND_* to vectorise conditionally-executed, potentially-trapping arithmetic, such as most floating-point ops with -ftrapping-math. E.g.: if (cond) { ... x = a + b; ... } becomes: ... x = .COND_ADD (cond, a, b, else_value); ... When this transformation is done on its own, the value of x for !cond isn't important, so else_value is simply the target's preferred_else_value (i.e. the value it can handle the most efficiently). However, the patch also looks for the equivalent of: y = cond ? x : c; in which the "then" value is the result of the conditionally-executed operation and the "else" value "c" is some value that is available at x. In that case we can instead use: x = .COND_ADD (cond, a, b, c); and replace uses of y with uses of x. The patch also looks for: y = !cond ? c : x; which can be transformed in the same way. This involved adding a new utility function inverse_conditions_p, which was already open-coded in a more limited way in match.pd. 2018-07-12 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * fold-const.h (inverse_conditions_p): Declare. * fold-const.c (inverse_conditions_p): New function. * match.pd: Use inverse_conditions_p. Add folds of view_converts that test the inverse condition of a conditional internal function. * internal-fn.h (vectorized_internal_fn_supported_p): Declare. * internal-fn.c (internal_fn_mask_index): Handle conditional internal functions. (vectorized_internal_fn_supported_p): New function. * tree-if-conv.c: Include internal-fn.h and fold-const.h. (any_pred_load_store): Replace with... (need_to_predicate): ...this new variable. (redundant_ssa_names): New variable. (ifcvt_can_use_mask_load_store): Move initial checks to... (ifcvt_can_predicate): ...this new function. Handle tree codes for which a conditional internal function exists. (if_convertible_gimple_assign_stmt_p): Use ifcvt_can_predicate instead of ifcvt_can_use_mask_load_store. Update after variable name change. (predicate_load_or_store): New function, split out from predicate_mem_writes. (check_redundant_cond_expr): New function. (value_available_p): Likewise. (predicate_rhs_code): Likewise. (predicate_mem_writes): Rename to... (predicate_statements): ...this. Use predicate_load_or_store and predicate_rhs_code. (combine_blocks, tree_if_conversion): Update after above name changes. (ifcvt_local_dce): Handle redundant_ssa_names. * tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Handle general conditional functions. * tree-vect-stmts.c (vectorizable_call): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-cond-arith-4.c: New test. * gcc.dg/vect/vect-cond-arith-5.c: Likewise. * gcc.target/aarch64/sve/cond_arith_1.c: Likewise. * gcc.target/aarch64/sve/cond_arith_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_arith_2.c: Likewise. * gcc.target/aarch64/sve/cond_arith_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_arith_3.c: Likewise. * gcc.target/aarch64/sve/cond_arith_3_run.c: Likewise. From-SVN: r262589
2018-07-12Support fused multiply-adds in fully-masked reductionsRichard Sandiford1-0/+56
This patch adds support for fusing a conditional add or subtract with a multiplication, so that we can use fused multiply-add and multiply-subtract operations for fully-masked reductions. E.g. for SVE we vectorise: double res = 0.0; for (int i = 0; i < n; ++i) res += x[i] * y[i]; using a fully-masked loop in which the loop body has the form: res_1 = PHI<0(preheader), res_2(latch)>; avec = .MASK_LOAD (loop_mask, a) bvec = .MASK_LOAD (loop_mask, b) prod = avec * bvec; res_2 = .COND_ADD (loop_mask, res_1, prod, res_1); where the last statement does the equivalent of: res_2 = loop_mask ? res_1 + prod : res_1; (operating elementwise). The point of the patch is to convert the last two statements into: res_s = .COND_FMA (loop_mask, avec, bvec, res_1, res_1); which is equivalent to: res_2 = loop_mask ? fma (avec, bvec, res_1) : res_1; (again operating elementwise). 2018-07-12 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * internal-fn.h (can_interpret_as_conditional_op_p): Declare. * internal-fn.c (can_interpret_as_conditional_op_p): New function. * tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional plus and minus and convert them into IFN_COND_FMA-based sequences. (convert_mult_to_fma): Handle conditional plus and minus. gcc/testsuite/ * gcc.dg/vect/vect-fma-2.c: New test. * gcc.target/aarch64/sve/reduc_4.c: Likewise. * gcc.target/aarch64/sve/reduc_6.c: Likewise. * gcc.target/aarch64/sve/reduc_7.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r262588
2018-07-12Add IFN_COND_FMA functionsRichard Sandiford1-0/+56
This patch adds conditional equivalents of the IFN_FMA built-in functions. Most of it is just a mechanical extension of the binary stuff. 2018-07-12 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * doc/md.texi (cond_fma, cond_fms, cond_fnma, cond_fnms): Document. * optabs.def (cond_fma_optab, cond_fms_optab, cond_fnma_optab) (cond_fnms_optab): New optabs. * internal-fn.def (COND_FMA, COND_FMS, COND_FNMA, COND_FNMS): New internal functions. (FMA): Use DEF_INTERNAL_FLT_FN rather than DEF_INTERNAL_FLT_FLOATN_FN. * internal-fn.h (get_conditional_internal_fn): Declare. (get_unconditional_internal_fn): Likewise. * internal-fn.c (cond_ternary_direct): New macro. (expand_cond_ternary_optab_fn): Likewise. (direct_cond_ternary_optab_supported_p): Likewise. (FOR_EACH_COND_FN_PAIR): Likewise. (get_conditional_internal_fn): New function. (get_unconditional_internal_fn): Likewise. * gimple-match.h (gimple_match_op::MAX_NUM_OPS): Bump to 5. (gimple_match_op::gimple_match_op): Add a new overload for 5 operands. (gimple_match_op::set_op): Likewise. (gimple_resimplify5): Declare. * genmatch.c (decision_tree::gen): Generate simplifications for 5 operands. * gimple-match-head.c (gimple_simplify): Define an overload for 5 operands. Handle calls with 5 arguments in the top-level overload. (convert_conditional_op): Handle conversions from unconditional internal functions to conditional ones. (gimple_resimplify5): New function. (build_call_internal): Pass a fifth operand. (maybe_push_res_to_seq): Likewise. (try_conditional_simplification): Try converting conditional internal functions to unconditional internal functions. Handle 3-operand unconditional forms. * match.pd (UNCOND_TERNARY, COND_TERNARY): Operator lists. Define ternary equivalents of the current rules for binary conditional internal functions. * config/aarch64/aarch64.c (aarch64_preferred_else_value): Handle ternary operations. * config/aarch64/iterators.md (UNSPEC_COND_FMLA, UNSPEC_COND_FMLS) (UNSPEC_COND_FNMLA, UNSPEC_COND_FNMLS): New unspecs. (optab): Handle them. (SVE_COND_FP_TERNARY): New int iterator. (sve_fmla_op, sve_fmad_op): New int attributes. * config/aarch64/aarch64-sve.md (cond_<optab><mode>) (*cond_<optab><mode>_2, *cond_<optab><mode_4) (*cond_<optab><mode>_any): New SVE_COND_FP_TERNARY patterns. gcc/testsuite/ * gcc.dg/vect/vect-cond-arith-3.c: New test. * gcc.target/aarch64/sve/vcond_13.c: Likewise. * gcc.target/aarch64/sve/vcond_13_run.c: Likewise. * gcc.target/aarch64/sve/vcond_14.c: Likewise. * gcc.target/aarch64/sve/vcond_14_run.c: Likewise. * gcc.target/aarch64/sve/vcond_15.c: Likewise. * gcc.target/aarch64/sve/vcond_15_run.c: Likewise. * gcc.target/aarch64/sve/vcond_16.c: Likewise. * gcc.target/aarch64/sve/vcond_16_run.c: Likewise. From-SVN: r262587
2018-07-12Extend tree code folds to IFN_COND_*Richard Sandiford1-20/+34
This patch adds match.pd support for applying normal folds to their IFN_COND_* forms. E.g. the rule: (plus @0 (negate @1)) -> (minus @0 @1) also allows the fold: (IFN_COND_ADD @0 @1 (negate @2) @3) -> (IFN_COND_SUB @0 @1 @2 @3) Actually doing this by direct matches in gimple-match.c would probably lead to combinatorial explosion, so instead, the patch makes gimple_match_op carry a condition under which the operation happens ("cond"), and the value to use when the condition is false ("else_value"). Thus in the example above we'd do the following (a) convert: cond:NULL_TREE (IFN_COND_ADD @0 @1 @4 @3) else_value:NULL_TREE to: cond:@0 (plus @1 @4) else_value:@3 (b) apply gimple_resimplify to (plus @1 @4) (c) reintroduce cond and else_value when constructing the result. Nested operations inherit the condition of the outer operation (so that we don't introduce extra faults) but have a null else_value. If we try to build such an operation, the target gets to choose what else_value it can handle efficiently: obvious choices include one of the operands or a zero constant. (The alternative would be to have some representation for an undefined value, but that seems a bit invasive, and isn't likely to be useful here.) I've made the condition a mandatory part of the gimple_match_op constructor so that it doesn't accidentally get dropped. 2018-07-12 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * target.def (preferred_else_value): New target hook. * doc/tm.texi.in (TARGET_PREFERRED_ELSE_VALUE): New hook. * doc/tm.texi: Regenerate. * targhooks.h (default_preferred_else_value): Declare. * targhooks.c (default_preferred_else_value): New function. * internal-fn.h (conditional_internal_fn_code): Declare. * internal-fn.c (FOR_EACH_CODE_MAPPING): New macro. (get_conditional_internal_fn): Use it. (conditional_internal_fn_code): New function. * gimple-match.h (gimple_match_cond): New struct. (gimple_match_op): Add a cond member function. (gimple_match_op::gimple_match_op): Update all forms to take a gimple_match_cond. * genmatch.c (expr::gen_transform): Use the same condition as res_op for the suboperation, but don't specify a particular else_value. * tree-ssa-sccvn.c (vn_nary_simplify, vn_reference_lookup_3) (visit_nary_op, visit_reference_op_load): Pass gimple_match_cond::UNCOND to the gimple_match_op constructor. * gimple-match-head.c: Include tree-eh.h (convert_conditional_op): New function. (maybe_resimplify_conditional_op): Likewise. (gimple_resimplify1): Call maybe_resimplify_conditional_op. (gimple_resimplify2): Likewise. (gimple_resimplify3): Likewise. (gimple_resimplify4): Likewise. (maybe_push_res_to_seq): Return null for conditional operations. (try_conditional_simplification): New function. (gimple_simplify): Call it. Pass conditions to the gimple_match_op constructor. * match.pd: Fold VEC_COND_EXPRs of an IFN_COND_* call to a new IFN_COND_* call. * config/aarch64/aarch64.c (aarch64_preferred_else_value): New function. (TARGET_PREFERRED_ELSE_VALUE): Redefine. gcc/testsuite/ * gcc.dg/vect/vect-cond-arith-2.c: New test. * gcc.target/aarch64/sve/loop_add_6.c: Likewise. From-SVN: r262586
2018-05-25Add IFN_COND_{MUL,DIV,MOD,RDIV}Richard Sandiford1-0/+6
This patch adds support for conditional multiplication and division. It's mostly mechanical, but a few notes: * The *_optab name and the .md names are the same as the unconditional forms, just with "cond_" added to the front. This means we still have the awkward difference between sdiv and div, etc. * It was easier to retain the difference between integer and FP division in the function names, given that they map to different tree codes (TRUNC_DIV_EXPR and RDIV_EXPR). * SVE has no direct support for IFN_COND_MOD, but it seemed more consistent to add it anyway. * Adding IFN_COND_MUL enables an extra fully-masked reduction in gcc.dg/vect/pr53773.c. * In practice we don't actually use the integer division forms without if-conversion support (added by a later patch). 2018-05-25 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * doc/sourcebuild.texi (vect_double_cond_arith): Include multiplication and division. * doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m}) (cond_udiv@var{m}, cond_umod@var{m}): Document. * optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab) (cond_udiv_optab, cond_umod_optab): New optabs. * internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD) (IFN_COND_RDIV): New internal functions. * internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR, TRUNC_MOD_EXPR and RDIV_EXPR. * match.pd (UNCOND_BINARY, COND_BINARY): Handle them. * config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV): New unspecs. (SVE_INT_BINARY): Include mult. (SVE_COND_FP_BINARY): Include UNSPEC_MUL and UNSPEC_DIV. (optab, sve_int_op): Handle mult. (optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and UNSPEC_COND_DIV. * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern for SVE_INT_BINARY_SD. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_double_cond_arith): Include multiplication and division. * gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using fully-masked loops with a fixed vector length. * gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division tests. * gcc.target/aarch64/sve/vcond_8.c: Likewise. * gcc.target/aarch64/sve/vcond_9.c: Likewise. * gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests. From-SVN: r260713
2018-05-25Add an "else" argument to IFN_COND_* functionsRichard Sandiford1-6/+13
As suggested by Richard B, this patch changes the IFN_COND_* functions so that they take the else value of the ?: operation as a final argument, rather than always using argument 1. All current callers will still use the equivalent of argument 1, so this patch makes the SVE code assert that for now. Later patches add the general case. 2018-05-25 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * doc/md.texi: Update the documentation of the cond_* optabs to mention the new final operand. Fix GET_MODE_NUNITS call. Describe the scalar case too. * internal-fn.def (IFN_EXTRACT_LAST): Change type to fold_left. * internal-fn.c (expand_cond_unary_optab_fn): Expect 3 operands instead of 2. (expand_cond_binary_optab_fn): Expect 4 operands instead of 3. (get_conditional_internal_fn): Update comment. * tree-vect-loop.c (vectorizable_reduction): Pass the original accumulator value as a final argument to conditional functions. * config/aarch64/aarch64-sve.md (cond_<optab><mode>): Turn into a define_expand and add an "else" operand. Assert for now that the else operand is equal to operand 2. Use SVE_INT_BINARY and SVE_COND_FP_BINARY instead of SVE_COND_INT_OP and SVE_COND_FP_OP. (*cond_<optab><mode>): New patterns. * config/aarch64/iterators.md (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX) (UNSPEC_COND_SMIN, UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR) (UNSPEC_COND_EOR): Delete. (optab): Remove associated mappings. (SVE_INT_BINARY): New code iterator. (sve_int_op): Remove int attribute and add "minus" to the code attribute. (SVE_COND_INT_OP): Delete. (SVE_COND_FP_OP): Rename to... (SVE_COND_FP_BINARY): ...this. From-SVN: r260707
2018-05-22Handle a null lhs in expand_direct_optab_fn (PR85862)Richard Sandiford1-6/+7
This PR showed that the normal function for expanding directly-mapped internal functions didn't handle the case in which the call was only being kept for its side-effects. 2018-05-22 Richard Sandiford <richard.sandiford@linaro.org> gcc/ PR middle-end/85862 * internal-fn.c (expand_direct_optab_fn): Cope with a null lhs. gcc/testsuite/ PR middle-end/85862 * gcc.dg/torture/pr85862.c: New test. From-SVN: r260504
2018-05-18Replace FMA_EXPR with one internal fn per optabRichard Sandiford1-0/+5
There are four optabs for various forms of fused multiply-add: fma, fms, fnma and fnms. Of these, only fma had a direct gimple representation. For the other three we relied on special pattern- matching during expand, although tree-ssa-math-opts.c did have some code to try to second-guess what expand would do. This patch removes the old FMA_EXPR representation of fma and introduces four new internal functions, one for each optab. IFN_FMA is tied to BUILT_IN_FMA* while the other three are independent directly-mapped internal functions. It's then possible to do the pattern-matching in match.pd and tree-ssa-math-opts.c (via folding) can select the exact FMA-based operation. The BRIG & HSA parts are a best guess, but seem relatively simple. 2018-05-18 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * doc/sourcebuild.texi (scalar_all_fma): Document. * tree.def (FMA_EXPR): Delete. * internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions. * internal-fn.c (ternary_direct): New macro. (expand_ternary_optab_fn): Likewise. (direct_ternary_optab_supported_p): Likewise. * Makefile.in (build/genmatch.o): Depend on case-fn-macros.h. * builtins.c (fold_builtin_fma): Delete. (fold_builtin_3): Don't call it. * cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling. * expr.c (expand_expr_real_2): Likewise. * fold-const.c (operand_equal_p): Likewise. (fold_ternary_loc): Likewise. * gimple-pretty-print.c (dump_ternary_rhs): Likewise. * gimple.c (DEFTREECODE): Likewise. * gimplify.c (gimplify_expr): Likewise. * optabs-tree.c (optab_for_tree_code): Likewise. * tree-cfg.c (verify_gimple_assign_ternary): Likewise. * tree-eh.c (operation_could_trap_p): Likewise. (stmt_could_throw_1_p): Likewise. * tree-inline.c (estimate_operator_cost): Likewise. * tree-pretty-print.c (dump_generic_node): Likewise. (op_code_prio): Likewise. * tree-ssa-loop-im.c (stmt_cost): Likewise. * tree-ssa-operands.c (get_expr_operands): Likewise. * tree.c (commutative_ternary_tree_code, add_expr): Likewise. * fold-const-call.h (fold_fma): Delete. * fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS, CFN_FNMA and CFN_FNMS. (fold_fma): Delete. * genmatch.c (combined_fn): New enum. (commutative_ternary_tree_code): Remove FMA_EXPR handling. (commutative_op): New function. (commutate): Use it. Handle more than 2 operands. (dt_operand::gen_gimple_expr): Use commutative_op. (parser::parse_expr): Allow :c to be used with non-binary operators if the commutative operand is known. * gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle CFN_FMS, CFN_FNMA and CFN_FNMS. (backprop::process_assign_use): Remove FMA_EXPR handling. * hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise. (gen_hsa_fma): New function. (gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS, IFN_FNMA and IFN_FNMS. * match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS. * gimple-fold.h (follow_all_ssa_edges): Declare. * gimple-fold.c (follow_all_ssa_edges): New function. * tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the gimple_build interface and use follow_all_ssa_edges to fold the result. (convert_mult_to_fma): Use direct_internal_fn_suppoerted_p instead of checking for optabs directly. * config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls rather than FMA_EXPRs. * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a call to IFN_FMA instead of an FMA_EXPR. gcc/brig/ * brigfrontend/brig-function.cc (brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA for BRIG_OPCODE_FMA. (brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA as a call. gcc/c/ * gimple-parser.c (c_parser_gimple_postfix_expression): Remove __FMA_EXPR handlng. gcc/cp/ * constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR handling. (potential_constant_expression_1): Likewise. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_scalar_all_fma): New proc. * gcc.dg/fma-1.c: New test. * gcc.dg/fma-2.c: Likewise. * gcc.dg/fma-3.c: Likewise. * gcc.dg/fma-4.c: Likewise. * gcc.dg/fma-5.c: Likewise. * gcc.dg/fma-6.c: Likewise. * gcc.dg/fma-7.c: Likewise. * gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require scalar_all_fma. * gfortran.dg/reassoc_7.f: Pass -ffp-contract=off. * gfortran.dg/reassoc_8.f: Likewise. * gfortran.dg/reassoc_9.f: Likewise. * gfortran.dg/reassoc_10.f: Likewise. From-SVN: r260348
2018-05-17Gimple FE support for internal functionsRichard Sandiford1-0/+20
This patch gets the gimple FE to parse calls to internal functions. The only non-obvious thing was how the functions should be written to avoid clashes with real function names. One option would be to go the magic number of underscores route, but we already do that for built-in functions, and it would be good to keep them visually distinct. In the end I borrowed the local/internal label convention from asm and used: x = .SQRT (y); 2018-05-17 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * internal-fn.h (lookup_internal_fn): Declare * internal-fn.c (lookup_internal_fn): New function. * gimple.c (gimple_build_call_from_tree): Handle calls to internal functions. * gimple-pretty-print.c (dump_gimple_call): Print "." before internal function names. * tree-pretty-print.c (dump_generic_node): Likewise. * tree-ssa-scopedtables.c (expr_hash_elt::print): Likewise. gcc/c/ * gimple-parser.c: Include internal-fn.h. (c_parser_gimple_statement): Treat a leading CPP_DOT as a call. (c_parser_gimple_call_internal): New function. (c_parser_gimple_postfix_expression): Use it to handle CPP_DOT. Fix typos in comment. gcc/testsuite/ * gcc.dg/gimplefe-28.c: New test. * gcc.dg/asan/use-after-scope-9.c: Adjust expected output for internal function calls. * gcc.dg/goacc/loop-processing-1.c: Likewise. From-SVN: r260316
2018-02-20Fix incorrect TARGET_MEM_REF alignment (PR 84419)Richard Sandiford1-1/+4
expand_call_mem_ref checks for TARGET_MEM_REFs that have compatible type, but it didn't then go on to install the specific type we need, which might have different alignment due to: if (TYPE_ALIGN (type) != align) type = build_aligned_type (type, align); This was causing masked stores to be incorrectly marked as aligned on AVX512. 2018-02-20 Richard Sandiford <richard.sandiford@linaro.org> gcc/ PR tree-optimization/84419 * internal-fn.c (expand_call_mem_ref): Create a TARGET_MEM_REF with the required type if its current type is compatible but different. gcc/testsuite/ PR tree-optimization/84419 * gcc.dg/vect/pr84419.c: New test. From-SVN: r257847
2018-01-13Add support for SVE scatter storesRichard Sandiford1-2/+85
This is mostly a mechanical extension of the previous gather load support to scatter stores. The internal functions in this case are: IFN_SCATTER_STORE (base, offsets, scale, values) IFN_MASK_SCATTER_STORE (base, offsets, scale, values, mask) However, one nonobvious change is to vect_analyze_data_ref_access. If we're treating an access as a gather load or scatter store (i.e. if STMT_VINFO_GATHER_SCATTER_P is true), the existing code would create a dummy data_reference whose step is 0. There's not really much else it could do, since the whole point is that the step isn't predictable from iteration to iteration. We then went into this code in vect_analyze_data_ref_access: /* Allow loads with zero step in inner-loop vectorization. */ if (loop_vinfo && integer_zerop (step)) { GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL; if (!nested_in_vect_loop_p (loop, stmt)) return DR_IS_READ (dr); I.e. we'd take the step literally and assume that this is a load or store to an invariant address. Loads from invariant addresses are supported but stores to them aren't. The code therefore had the effect of disabling all scatter stores. AFAICT this is true of AVX too: although tests like avx512f-scatter-1.c test for the correctness of a scatter-like loop, they don't seem to check whether a scatter instruction is actually used. The patch therefore makes vect_analyze_data_ref_access return true for scatters. We do seem to handle the aliasing correctly; that's tested by other functions, and is symmetrical to the already-working gather case. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/sourcebuild.texi (vect_scatter_store): Document. * optabs.def (scatter_store_optab, mask_scatter_store_optab): New optabs. * doc/md.texi (scatter_store@var{m}, mask_scatter_store@var{m}): Document. * genopinit.c (main): Add supports_vec_scatter_store and supports_vec_scatter_store_cached to target_optabs. * gimple.h (gimple_expr_type): Handle IFN_SCATTER_STORE and IFN_MASK_SCATTER_STORE. * internal-fn.def (SCATTER_STORE, MASK_SCATTER_STORE): New internal functions. * internal-fn.h (internal_store_fn_p): Declare. (internal_fn_stored_value_index): Likewise. * internal-fn.c (scatter_store_direct): New macro. (expand_scatter_store_optab_fn): New function. (direct_scatter_store_optab_supported_p): New macro. (internal_store_fn_p): New function. (internal_gather_scatter_fn_p): Handle IFN_SCATTER_STORE and IFN_MASK_SCATTER_STORE. (internal_fn_mask_index): Likewise. (internal_fn_stored_value_index): New function. (internal_gather_scatter_fn_supported_p): Adjust operand numbers for scatter stores. * optabs-query.h (supports_vec_scatter_store_p): Declare. * optabs-query.c (supports_vec_scatter_store_p): New function. * tree-vectorizer.h (vect_get_store_rhs): Declare. * tree-vect-data-refs.c (vect_analyze_data_ref_access): Return true for scatter stores. (vect_gather_scatter_fn_p): Handle scatter stores too. (vect_check_gather_scatter): Consider using scatter stores if supports_vec_scatter_store_p. * tree-vect-patterns.c (vect_try_gather_scatter_pattern): Handle scatter stores too. * tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use internal_fn_stored_value_index. (check_load_store_masking): Handle scatter stores too. (vect_get_store_rhs): Make public. (vectorizable_call): Use internal_store_fn_p. (vectorizable_store): Handle scatter store internal functions. (vect_transform_stmt): Compare GROUP_STORE_COUNT with GROUP_SIZE when deciding whether the end of the group has been reached. * config/aarch64/aarch64.md (UNSPEC_ST1_SCATTER): New unspec. * config/aarch64/aarch64-sve.md (scatter_store<mode>): New expander. (mask_scatter_store<mode>): New insns. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_scatter_store): New proc. * gcc.dg/vect/pr25413a.c: Expect both loops to be optimized on targets with scatter stores. * gcc.dg/vect/vect-71.c: Restrict XFAIL to targets without scatter stores. * gcc.target/aarch64/sve/mask_scatter_store_1.c: New test. * gcc.target/aarch64/sve/mask_scatter_store_2.c: Likewise. * gcc.target/aarch64/sve/scatter_store_1.c: Likewise. * gcc.target/aarch64/sve/scatter_store_2.c: Likewise. * gcc.target/aarch64/sve/scatter_store_3.c: Likewise. * gcc.target/aarch64/sve/scatter_store_4.c: Likewise. * gcc.target/aarch64/sve/scatter_store_5.c: Likewise. * gcc.target/aarch64/sve/scatter_store_6.c: Likewise. * gcc.target/aarch64/sve/scatter_store_7.c: Likewise. * gcc.target/aarch64/sve/strided_store_1.c: Likewise. * gcc.target/aarch64/sve/strided_store_2.c: Likewise. * gcc.target/aarch64/sve/strided_store_3.c: Likewise. * gcc.target/aarch64/sve/strided_store_4.c: Likewise. * gcc.target/aarch64/sve/strided_store_5.c: Likewise. * gcc.target/aarch64/sve/strided_store_6.c: Likewise. * gcc.target/aarch64/sve/strided_store_7.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256643
2018-01-13Add support for SVE gather loadsRichard Sandiford1-0/+134
This patch adds support for SVE gather loads. It uses the basically the same analysis code as the AVX gather support, but after that there are two major differences: - It uses new internal functions rather than target built-ins. The interface is: IFN_GATHER_LOAD (base, offsets scale) IFN_MASK_GATHER_LOAD (base, offsets scale, mask) which should be reasonably generic. One of the advantages of using internal functions is that other passes can understand what the functions do, but a more immediate advantage is that we can query the underlying target pattern to see which scales it supports. - It uses pattern recognition to convert the offset to the right width, if it was originally narrower than that. This avoids having to do a widening operation as part of the gather expansion itself. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (gather_load@var{m}): Document. (mask_gather_load@var{m}): Likewise. * genopinit.c (main): Add supports_vec_gather_load and supports_vec_gather_load_cached to target_optabs. * optabs-tree.c (init_tree_optimization_optabs): Use ggc_cleared_alloc to allocate target_optabs. * optabs.def (gather_load_optab, mask_gather_laod_optab): New optabs. * internal-fn.def (GATHER_LOAD, MASK_GATHER_LOAD): New internal functions. * internal-fn.h (internal_load_fn_p): Declare. (internal_gather_scatter_fn_p): Likewise. (internal_fn_mask_index): Likewise. (internal_gather_scatter_fn_supported_p): Likewise. * internal-fn.c (gather_load_direct): New macro. (expand_gather_load_optab_fn): New function. (direct_gather_load_optab_supported_p): New macro. (direct_internal_fn_optab): New function. (internal_load_fn_p): Likewise. (internal_gather_scatter_fn_p): Likewise. (internal_fn_mask_index): Likewise. (internal_gather_scatter_fn_supported_p): Likewise. * optabs-query.c (supports_at_least_one_mode_p): New function. (supports_vec_gather_load_p): Likewise. * optabs-query.h (supports_vec_gather_load_p): Declare. * tree-vectorizer.h (gather_scatter_info): Add ifn, element_type and memory_type field. (NUM_PATTERNS): Bump to 15. * tree-vect-data-refs.c: Include internal-fn.h. (vect_gather_scatter_fn_p): New function. (vect_describe_gather_scatter_call): Likewise. (vect_check_gather_scatter): Try using internal functions for gather loads. Recognize existing calls to a gather load function. (vect_analyze_data_refs): Consider using gather loads if supports_vec_gather_load_p. * tree-vect-patterns.c (vect_get_load_store_mask): New function. (vect_get_gather_scatter_offset_type): Likewise. (vect_convert_mask_for_vectype): Likewise. (vect_add_conversion_to_patterm): Likewise. (vect_try_gather_scatter_pattern): Likewise. (vect_recog_gather_scatter_pattern): New pattern recognizer. (vect_vect_recog_func_ptrs): Add it. * tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use internal_fn_mask_index and internal_gather_scatter_fn_p. (check_load_store_masking): Take the gather_scatter_info as an argument and handle gather loads. (vect_get_gather_scatter_ops): New function. (vectorizable_call): Check internal_load_fn_p. (vectorizable_load): Likewise. Handle gather load internal functions. (vectorizable_store): Update call to check_load_store_masking. * config/aarch64/aarch64.md (UNSPEC_LD1_GATHER): New unspec. * config/aarch64/iterators.md (SVE_S, SVE_D): New mode iterators. * config/aarch64/predicates.md (aarch64_gather_scale_operand_w) (aarch64_gather_scale_operand_d): New predicates. * config/aarch64/aarch64-sve.md (gather_load<mode>): New expander. (mask_gather_load<mode>): New insns. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_1.c: New test. * gcc.target/aarch64/sve/gather_load_2.c: Likewise. * gcc.target/aarch64/sve/gather_load_3.c: Likewise. * gcc.target/aarch64/sve/gather_load_4.c: Likewise. * gcc.target/aarch64/sve/gather_load_5.c: Likewise. * gcc.target/aarch64/sve/gather_load_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_7.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_1.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_5.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256640
2018-01-13Add support for in-order addition reduction using SVE FADDARichard Sandiford1-0/+5
This patch adds support for in-order floating-point addition reductions, which are suitable even in strict IEEE mode. Previously vect_is_simple_reduction would reject any cases that forbid reassociation. The idea is instead to tentatively accept them as "FOLD_LEFT_REDUCTIONs" and only fail later if there is no support for them. Although this patch only handles the particular case of plus and minus on floating-point types, there's no reason in principle why we couldn't handle other cases. The reductions use a new fold_left_plus_optab if available, otherwise they fall back to elementwise additions or subtractions. The vect_force_simple_reduction change makes it easier for parloops to read the type of reduction. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * optabs.def (fold_left_plus_optab): New optab. * doc/md.texi (fold_left_plus_@var{m}): Document. * internal-fn.def (IFN_FOLD_LEFT_PLUS): New internal function. * internal-fn.c (fold_left_direct): Define. (expand_fold_left_optab_fn): Likewise. (direct_fold_left_optab_supported_p): Likewise. * fold-const-call.c (fold_const_fold_left): New function. (fold_const_call): Use it to fold CFN_FOLD_LEFT_PLUS. * tree-parloops.c (valid_reduction_p): New function. (gather_scalar_reductions): Use it. * tree-vectorizer.h (FOLD_LEFT_REDUCTION): New vect_reduction_type. (vect_finish_replace_stmt): Declare. * tree-vect-loop.c (fold_left_reduction_fn): New function. (needs_fold_left_reduction_p): New function, split out from... (vect_is_simple_reduction): ...here. Accept reductions that forbid reassociation, but give them type FOLD_LEFT_REDUCTION. (vect_force_simple_reduction): Also store the reduction type in the assignment's STMT_VINFO_REDUC_TYPE. (vect_model_reduction_cost): Handle FOLD_LEFT_REDUCTION. (merge_with_identity): New function. (vect_expand_fold_left): Likewise. (vectorize_fold_left_reduction): Likewise. (vectorizable_reduction): Handle FOLD_LEFT_REDUCTION. Leave the scalar phi in place for it. Check for target support and reject cases that would reassociate the operation. Defer the transform phase to vectorize_fold_left_reduction. * config/aarch64/aarch64.md (UNSPEC_FADDA): New unspec. * config/aarch64/aarch64-sve.md (fold_left_plus_<mode>): New expander. (*fold_left_plus_<mode>, *pred_fold_left_plus_<mode>): New insns. gcc/testsuite/ * gcc.dg/vect/no-fast-math-vect16.c: Expect the test to pass and check for a message about using in-order reductions. * gcc.dg/vect/pr79920.c: Expect both loops to be vectorized and check for a message about using in-order reductions. * gcc.dg/vect/trapv-vect-reduc-4.c: Expect all three loops to be vectorized and check for a message about using in-order reductions. Expect targets with variable-length vectors to fall back to the fixed-length mininum. * gcc.dg/vect/vect-reduc-6.c: Expect the loop to be vectorized and check for a message about using in-order reductions. * gcc.dg/vect/vect-reduc-in-order-1.c: New test. * gcc.dg/vect/vect-reduc-in-order-2.c: Likewise. * gcc.dg/vect/vect-reduc-in-order-3.c: Likewise. * gcc.dg/vect/vect-reduc-in-order-4.c: Likewise. * gcc.target/aarch64/sve/reduc_strict_1.c: New test. * gcc.target/aarch64/sve/reduc_strict_1_run.c: Likewise. * gcc.target/aarch64/sve/reduc_strict_2.c: Likewise. * gcc.target/aarch64/sve/reduc_strict_2_run.c: Likewise. * gcc.target/aarch64/sve/reduc_strict_3.c: Likewise. * gcc.target/aarch64/sve/slp_13.c: Add floating-point types. * gfortran.dg/vect/vect-8.f90: Expect 22 loops to be vectorized if vect_fold_left_plus. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256639
2018-01-13Add support for conditional reductions using SVE CLASTBRichard Sandiford1-0/+5
This patch uses SVE CLASTB to optimise conditional reductions. It means that we no longer need to maintain a separate index vector to record the most recent valid value, and no longer need to worry about overflow cases. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (fold_extract_last_@var{m}): Document. * doc/sourcebuild.texi (vect_fold_extract_last): Likewise. * optabs.def (fold_extract_last_optab): New optab. * internal-fn.def (FOLD_EXTRACT_LAST): New internal function. * internal-fn.c (fold_extract_direct): New macro. (expand_fold_extract_optab_fn): Likewise. (direct_fold_extract_optab_supported_p): Likewise. * tree-vectorizer.h (EXTRACT_LAST_REDUCTION): New vect_reduction_type. * tree-vect-loop.c (vect_model_reduction_cost): Handle EXTRACT_LAST_REDUCTION. (get_initial_def_for_reduction): Do not create an initial vector for EXTRACT_LAST_REDUCTION reductions. (vectorizable_reduction): Leave the scalar phi in place for EXTRACT_LAST_REDUCTIONs. Try using EXTRACT_LAST_REDUCTION ahead of INTEGER_INDUC_COND_REDUCTION. Do not check for an epilogue code for EXTRACT_LAST_REDUCTION and defer the transform phase to vectorizable_condition. * tree-vect-stmts.c (vect_finish_stmt_generation_1): New function, split out from... (vect_finish_stmt_generation): ...here. (vect_finish_replace_stmt): New function. (vectorizable_condition): Handle EXTRACT_LAST_REDUCTION. * config/aarch64/aarch64-sve.md (fold_extract_last_<mode>): New pattern. * config/aarch64/aarch64.md (UNSPEC_CLASTB): New unspec. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_fold_extract_last): New proc. * gcc.dg/vect/pr65947-1.c: Update dump messages. Add markup for fold_extract_last. * gcc.dg/vect/pr65947-2.c: Likewise. * gcc.dg/vect/pr65947-3.c: Likewise. * gcc.dg/vect/pr65947-4.c: Likewise. * gcc.dg/vect/pr65947-5.c: Likewise. * gcc.dg/vect/pr65947-6.c: Likewise. * gcc.dg/vect/pr65947-9.c: Likewise. * gcc.dg/vect/pr65947-10.c: Likewise. * gcc.dg/vect/pr65947-12.c: Likewise. * gcc.dg/vect/pr65947-14.c: Likewise. * gcc.dg/vect/pr80631-1.c: Likewise. * gcc.target/aarch64/sve/clastb_1.c: New test. * gcc.target/aarch64/sve/clastb_1_run.c: Likewise. * gcc.target/aarch64/sve/clastb_2.c: Likewise. * gcc.target/aarch64/sve/clastb_2_run.c: Likewise. * gcc.target/aarch64/sve/clastb_3.c: Likewise. * gcc.target/aarch64/sve/clastb_3_run.c: Likewise. * gcc.target/aarch64/sve/clastb_4.c: Likewise. * gcc.target/aarch64/sve/clastb_4_run.c: Likewise. * gcc.target/aarch64/sve/clastb_5.c: Likewise. * gcc.target/aarch64/sve/clastb_5_run.c: Likewise. * gcc.target/aarch64/sve/clastb_6.c: Likewise. * gcc.target/aarch64/sve/clastb_6_run.c: Likewise. * gcc.target/aarch64/sve/clastb_7.c: Likewise. * gcc.target/aarch64/sve/clastb_7_run.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256633
2018-01-13Add support for vectorising live-out values using SVE LASTBRichard Sandiford1-0/+5
This patch uses the SVE LASTB instruction to optimise cases in which a value produced by the final scalar iteration of a vectorised loop is live outside the loop. Previously this situation would stop us from using a fully-masked loop. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (extract_last_@var{m}): Document. * optabs.def (extract_last_optab): New optab. * internal-fn.def (EXTRACT_LAST): New internal function. * internal-fn.c (cond_unary_direct): New macro. (expand_cond_unary_optab_fn): Likewise. (direct_cond_unary_optab_supported_p): Likewise. * tree-vect-loop.c (vectorizable_live_operation): Allow fully-masked loops using EXTRACT_LAST. * config/aarch64/aarch64-sve.md (aarch64_sve_lastb<mode>): Rename to... (extract_last_<mode>): ...this optab. (vec_extract<mode><Vel>): Update accordingly. gcc/testsuite/ * gcc.target/aarch64/sve/live_1.c: New test. * gcc.target/aarch64/sve/live_1_run.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256632
2018-01-13Allow ADDR_EXPRs of TARGET_MEM_REFsRichard Sandiford1-14/+44
This patch allows ADDR_EXPR <TARGET_MEM_REF ...>, which is useful when calling internal functions that take pointers to memory that is conditionally loaded or stored. This is a prerequisite to the following ivopts patch. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of TARGET_MEM_REFs. * gimple-expr.h (is_gimple_addressable: Likewise. * gimple-expr.c (is_gimple_address): Likewise. * internal-fn.c (expand_call_mem_ref): New function. (expand_mask_load_optab_fn): Use it. (expand_mask_store_optab_fn): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256627
2018-01-13Add support for reductions in fully-masked loopsRichard Sandiford1-0/+36
This patch removes the restriction that fully-masked loops cannot have reductions. The key thing here is to make sure that the reduction accumulator doesn't include any values associated with inactive lanes; the patch adds a bunch of conditional binary operations for doing that. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (cond_add@var{mode}, cond_sub@var{mode}) (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode}) (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode}) (cond_umax@var{mode}): Document. * optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab) (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab) (cond_umin_optab, cond_umax_optab): New optabs. * internal-fn.def (COND_ADD, COND_SUB, COND_MIN, COND_MAX, COND_AND) (COND_IOR, COND_XOR): New internal functions. * internal-fn.h (get_conditional_internal_fn): Declare. * internal-fn.c (cond_binary_direct): New macro. (expand_cond_binary_optab_fn): Likewise. (direct_cond_binary_optab_supported_p): Likewise. (get_conditional_internal_fn): New function. * tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops. Cope with reduction statements that are vectorized as calls rather than assignments. * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New insns. * config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB) (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN) (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR) (UNSPEC_COND_EOR): New unspecs. (optab): Add mappings for them. (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators. (sve_int_op, sve_fp_op): New int attributes. gcc/testsuite/ * gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors. * gcc.target/aarch64/sve/reduc_1.c: Expect the loop operations to be predicated. * gcc.target/aarch64/sve/slp_5.c: Check for a fully-masked loop. * gcc.target/aarch64/sve/slp_7.c: Likewise. * gcc.target/aarch64/sve/reduc_5.c: New test. * gcc.target/aarch64/sve/slp_13.c: Likewise. * gcc.target/aarch64/sve/slp_13_run.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256626
2018-01-13Add support for fully-predicated loopsRichard Sandiford1-0/+44
This patch adds support for using a single fully-predicated loop instead of a vector loop and a scalar tail. An SVE WHILELO instruction generates the predicate for each iteration of the loop, given the current scalar iv value and the loop bound. This operation is wrapped up in a new internal function called WHILE_ULT. E.g.: WHILE_ULT (0, 3, { 0, 0, 0, 0}) -> { 1, 1, 1, 0 } WHILE_ULT (UINT_MAX - 1, UINT_MAX, { 0, 0, 0, 0 }) -> { 1, 0, 0, 0 } The third WHILE_ULT argument is needed to make the operation unambiguous: without it, WHILE_ULT (0, 3) for one vector type would seem equivalent to WHILE_ULT (0, 3) for another, even if the types have different numbers of elements. Note that the patch uses "mask" and "fully-masked" instead of "predicate" and "fully-predicated", to follow existing GCC terminology. This patch just handles the simple cases, punting for things like reductions and live-out values. Later patches remove most of these restrictions. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * optabs.def (while_ult_optab): New optab. * doc/md.texi (while_ult@var{m}@var{n}): Document. * internal-fn.def (WHILE_ULT): New internal function. * internal-fn.h (direct_internal_fn_supported_p): New override that takes two types as argument. * internal-fn.c (while_direct): New macro. (expand_while_optab_fn): New function. (convert_optab_supported_p): Likewise. (direct_while_optab_supported_p): New macro. * wide-int.h (wi::udiv_ceil): New function. * tree-vectorizer.h (rgroup_masks): New structure. (vec_loop_masks): New typedef. (_loop_vec_info): Add masks, mask_compare_type, can_fully_mask_p and fully_masked_p. (LOOP_VINFO_CAN_FULLY_MASK_P, LOOP_VINFO_FULLY_MASKED_P) (LOOP_VINFO_MASKS, LOOP_VINFO_MASK_COMPARE_TYPE): New macros. (vect_max_vf): New function. (slpeel_make_loop_iterate_ntimes): Delete. (vect_set_loop_condition, vect_get_loop_mask_type, vect_gen_while) (vect_halve_mask_nunits, vect_double_mask_nunits): Declare. (vect_record_loop_mask, vect_get_loop_mask): Likewise. * tree-vect-loop-manip.c: Include tree-ssa-loop-niter.h, internal-fn.h, stor-layout.h and optabs-query.h. (vect_set_loop_mask): New function. (add_preheader_seq): Likewise. (add_header_seq): Likewise. (interleave_supported_p): Likewise. (vect_maybe_permute_loop_masks): Likewise. (vect_set_loop_masks_directly): Likewise. (vect_set_loop_condition_masked): Likewise. (vect_set_loop_condition_unmasked): New function, split out from slpeel_make_loop_iterate_ntimes. (slpeel_make_loop_iterate_ntimes): Rename to.. (vect_set_loop_condition): ...this. Use vect_set_loop_condition_masked for fully-masked loops and vect_set_loop_condition_unmasked otherwise. (vect_do_peeling): Update call accordingly. (vect_gen_vector_loop_niters): Use VF as the step for fully-masked loops. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize mask_compare_type, can_fully_mask_p and fully_masked_p. (release_vec_loop_masks): New function. (_loop_vec_info): Use it to free the loop masks. (can_produce_all_loop_masks_p): New function. (vect_get_max_nscalars_per_iter): Likewise. (vect_verify_full_masking): Likewise. (vect_analyze_loop_2): Save LOOP_VINFO_CAN_FULLY_MASK_P around retries, and free the mask rgroups before retrying. Check loop-wide reasons for disallowing fully-masked loops. Make the final decision about whether use a fully-masked loop or not. (vect_estimate_min_profitable_iters): Do not assume that peeling for the number of iterations will be needed for fully-masked loops. (vectorizable_reduction): Disable fully-masked loops. (vectorizable_live_operation): Likewise. (vect_halve_mask_nunits): New function. (vect_double_mask_nunits): Likewise. (vect_record_loop_mask): Likewise. (vect_get_loop_mask): Likewise. (vect_transform_loop): Handle the case in which the final loop iteration might handle a partial vector. Call vect_set_loop_condition instead of slpeel_make_loop_iterate_ntimes. * tree-vect-stmts.c: Include tree-ssa-loop-niter.h and gimple-fold.h. (check_load_store_masking): New function. (prepare_load_store_mask): Likewise. (vectorizable_store): Handle fully-masked loops. (vectorizable_load): Likewise. (supportable_widening_operation): Use vect_halve_mask_nunits for booleans. (supportable_narrowing_operation): Likewise vect_double_mask_nunits. (vect_gen_while): New function. * config/aarch64/aarch64.md (umax<mode>3): New expander. (aarch64_uqdec<mode>): New insn. gcc/testsuite/ * gcc.dg/tree-ssa/cunroll-10.c: Disable vectorization. * gcc.dg/tree-ssa/peel1.c: Likewise. * gcc.dg/vect/vect-load-lanes-peeling-1.c: Remove XFAIL for variable-length vectors. * gcc.target/aarch64/sve/vcond_6.c: XFAIL test for AND. * gcc.target/aarch64/sve/vec_bool_cmp_1.c: Expect BIC instead of NOT. * gcc.target/aarch64/sve/slp_1.c: Check for a fully-masked loop. * gcc.target/aarch64/sve/slp_2.c: Likewise. * gcc.target/aarch64/sve/slp_3.c: Likewise. * gcc.target/aarch64/sve/slp_4.c: Likewise. * gcc.target/aarch64/sve/slp_6.c: Likewise. * gcc.target/aarch64/sve/slp_8.c: New test. * gcc.target/aarch64/sve/slp_8_run.c: Likewise. * gcc.target/aarch64/sve/slp_9.c: Likewise. * gcc.target/aarch64/sve/slp_9_run.c: Likewise. * gcc.target/aarch64/sve/slp_10.c: Likewise. * gcc.target/aarch64/sve/slp_10_run.c: Likewise. * gcc.target/aarch64/sve/slp_11.c: Likewise. * gcc.target/aarch64/sve/slp_11_run.c: Likewise. * gcc.target/aarch64/sve/slp_12.c: Likewise. * gcc.target/aarch64/sve/slp_12_run.c: Likewise. * gcc.target/aarch64/sve/ld1r_2.c: Likewise. * gcc.target/aarch64/sve/ld1r_2_run.c: Likewise. * gcc.target/aarch64/sve/while_1.c: Likewise. * gcc.target/aarch64/sve/while_2.c: Likewise. * gcc.target/aarch64/sve/while_3.c: Likewise. * gcc.target/aarch64/sve/while_4.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256625
2018-01-13Add support for masked load/store_lanesRichard Sandiford1-8/+26
This patch adds support for vectorising groups of IFN_MASK_LOADs and IFN_MASK_STOREs using conditional load/store-lanes instructions. This requires new internal functions to represent the result (IFN_MASK_{LOAD,STORE}_LANES), as well as associated optabs. The normal IFN_{LOAD,STORE}_LANES functions are const operations that logically just perform the permute: the load or store is encoded as a MEM operand to the call statement. In contrast, the IFN_MASK_{LOAD,STORE}_LANES functions use the same kind of interface as IFN_MASK_{LOAD,STORE}, since the memory is only conditionally accessed. The AArch64 patterns were added as part of the main LD[234]/ST[234] patch. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (vec_mask_load_lanes@var{m}@var{n}): Document. (vec_mask_store_lanes@var{m}@var{n}): Likewise. * optabs.def (vec_mask_load_lanes_optab): New optab. (vec_mask_store_lanes_optab): Likewise. * internal-fn.def (MASK_LOAD_LANES): New internal function. (MASK_STORE_LANES): Likewise. * internal-fn.c (mask_load_lanes_direct): New macro. (mask_store_lanes_direct): Likewise. (expand_mask_load_optab_fn): Handle masked operations. (expand_mask_load_lanes_optab_fn): New macro. (expand_mask_store_optab_fn): Handle masked operations. (expand_mask_store_lanes_optab_fn): New macro. (direct_mask_load_lanes_optab_supported_p): Likewise. (direct_mask_store_lanes_optab_supported_p): Likewise. * tree-vectorizer.h (vect_store_lanes_supported): Take a masked_p parameter. (vect_load_lanes_supported): Likewise. * tree-vect-data-refs.c (strip_conversion): New function. (can_group_stmts_p): Likewise. (vect_analyze_data_ref_accesses): Use it instead of checking for a pair of assignments. (vect_store_lanes_supported): Take a masked_p parameter. (vect_load_lanes_supported): Likewise. * tree-vect-loop.c (vect_analyze_loop_2): Update calls to vect_store_lanes_supported and vect_load_lanes_supported. * tree-vect-slp.c (vect_analyze_slp_instance): Likewise. * tree-vect-stmts.c (get_group_load_store_type): Take a masked_p parameter. Don't allow gaps for masked accesses. Use vect_get_store_rhs. Update calls to vect_store_lanes_supported and vect_load_lanes_supported. (get_load_store_type): Take a masked_p parameter and update call to get_group_load_store_type. (vectorizable_store): Update call to get_load_store_type. Handle IFN_MASK_STORE_LANES. (vectorizable_load): Update call to get_load_store_type. Handle IFN_MASK_LOAD_LANES. gcc/testsuite/ * gcc.dg/vect/vect-ooo-group-1.c: New test. * gcc.target/aarch64/sve/mask_struct_load_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_2_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_6.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_7.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_8.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256620
2018-01-03Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r256169
2018-01-03poly_int: expand_vector_ubsan_overflowRichard Sandiford1-7/+10
This patch makes expand_vector_ubsan_overflow cope with a polynomial number of elements. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial numbers of elements. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256148
2017-12-15re PR sanitizer/83388 (reference statement index not found error with ↵Richard Biener1-0/+8
-fsanitize=null) 2017-12-15 Richard Biener <rguenther@suse.de> PR lto/83388 * internal-fn.def (IFN_NOP): Add. * internal-fn.c (expand_NOP): Do nothing. * lto-streamer-in.c (input_function): Instead of removing sanitizer calls replace them with IFN_NOP calls. * gcc.dg/lto/pr83388_0.c: New testcase. From-SVN: r255694
2017-11-30re PR target/83210 (__builtin_mul_overflow() generates suboptimal code when ↵Jakub Jelinek1-0/+43
exactly one argument is the constant 2) PR target/83210 * internal-fn.c (expand_mul_overflow): Optimize unsigned multiplication by power of 2 constant into two shifts + comparison. * gcc.target/i386/pr83210.c: New test. From-SVN: r255269
2017-11-22Replace REDUC_*_EXPRs with internal functions.Richard Sandiford1-0/+45
This patch replaces the REDUC_*_EXPR tree codes with internal functions. This is needed so that the upcoming in-order reductions can also use internal functions without too much complication. 2017-11-22 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): Delete. * cfgexpand.c (expand_debug_expr): Remove handling for them. * expr.c (expand_expr_real_2): Likewise. * fold-const.c (const_unop): Likewise. * optabs-tree.c (optab_for_tree_code): Likewise. * tree-cfg.c (verify_gimple_assign_unary): Likewise. * tree-inline.c (estimate_operator_cost): Likewise. * tree-pretty-print.c (dump_generic_node): Likewise. (op_code_prio): Likewise. (op_symbol_code): Likewise. * internal-fn.def (DEF_INTERNAL_SIGNED_OPTAB_FN): Define. (IFN_REDUC_PLUS, IFN_REDUC_MAX, IFN_REDUC_MIN): New internal functions. * internal-fn.c (direct_internal_fn_optab): New function. (direct_internal_fn_array, direct_internal_fn_supported_p (internal_fn_expanders): Handle DEF_INTERNAL_SIGNED_OPTAB_FN. * fold-const-call.c (fold_const_reduction): New function. (fold_const_call): Handle CFN_REDUC_PLUS, CFN_REDUC_MAX and CFN_REDUC_MIN. * tree-vect-loop.c: Include internal-fn.h. (reduction_code_for_scalar_code): Rename to... (reduction_fn_for_scalar_code): ...this and return an internal function. (vect_model_reduction_cost): Take an internal_fn rather than a tree_code. (vect_create_epilog_for_reduction): Likewise. Build calls rather than assignments. (vectorizable_reduction): Use internal functions rather than tree codes for the reduction operation. Update calls to the functions above. * config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin): Use calls to internal functions rather than REDUC tree codes. * config/aarch64/aarch64-simd.md: Update comment accordingly. From-SVN: r255073
2017-11-21re PR target/82981 (unnecessary __multi3 call for mips64r6 linux kernel)Jakub Jelinek1-6/+6
PR target/82981 * internal-fn.c (expand_mul_overflow): Use OPTAB_WIDEN instead of OPTAB_DIRECT in calls to expand_simple_binop. From-SVN: r254986
2017-11-15re PR target/82981 (unnecessary __multi3 call for mips64r6 linux kernel)Jakub Jelinek1-2/+88
PR target/82981 * internal-fn.c: Include gimple-ssa.h, tree-phinodes.h and ssa-iterators.h. (can_widen_mult_without_libcall): New function. (expand_mul_overflow): If only checking unsigned mul overflow, not result, and can do efficiently MULT_HIGHPART_EXPR, emit that. Don't use WIDEN_MULT_EXPR if it would involve a libcall, unless no other way works. Add MULT_HIGHPART_EXPR + MULT_EXPR support. (expand_DIVMOD): Formatting fix. * expmed.h (expand_mult): Add NO_LIBCALL argument. * expmed.c (expand_mult): Likewise. Use OPTAB_WIDEN rather than OPTAB_LIB_WIDEN if NO_LIBCALL is true, and allow it to fail. * gcc.target/mips/pr82981.c: New test. From-SVN: r254758
2017-10-22SUBREG_PROMOTED_VAR_P handling in expand_direct_optab_fnRichard Sandiford1-1/+9
This is needed by the later SVE LAST reductions, where an 8-bit or 16-bit result is zero- rather than sign-extended to 32 bits. I think it could occur in other situations too. 2017-09-19 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * internal-fn.c (expand_direct_optab_fn): Don't assign directly to a SUBREG_PROMOTED_VAR. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253992
2017-10-13re PR target/82274 (__builtin_mul_overflow fails to detect overflow for ↵Jakub Jelinek1-3/+3
int64_t when compiled with -m32) PR target/82274 * internal-fn.c (expand_mul_overflow): If both operands have the same highpart of -1 or 0 and the topmost bit of lowpart is different, overflow is if res <= 0 rather than res < 0. * libgcc2.c (__mulvDI3): If both operands have the same highpart of -1 and the topmost bit of lowpart is 0, multiplication overflows even if both lowparts are 0. * gcc.dg/pr82274-1.c: New test. * gcc.dg/pr82274-2.c: New test. From-SVN: r253734
2017-10-10Require wi::to_wide for treesRichard Sandiford1-1/+1
The wide_int routines allow things like: wi::add (t, 1) to add 1 to an INTEGER_CST T in its native precision. But we also have: wi::to_offset (t) // Treat T as an offset_int wi::to_widest (t) // Treat T as a widest_int Recently we also gained: wi::to_wide (t, prec) // Treat T as a wide_int in preccision PREC This patch therefore requires: wi::to_wide (t) when operating on INTEGER_CSTs in their native precision. This is just as efficient, and makes it clearer that a deliberate choice is being made to treat the tree as a wide_int in its native precision. This also removes the inconsistency that a) INTEGER_CSTs in their native precision can be used without an accessor but must use wi:: functions instead of C++ operators b) the other forms need an explicit accessor but the result can be used with C++ operators. It also helps with SVE, where there's the additional possibility that the tree could be a runtime value. 2017-10-10 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * wide-int.h (wide_int_ref_storage): Make host_dependent_precision a template parameter. (WIDE_INT_REF_FOR): Update accordingly. * tree.h (wi::int_traits <const_tree>): Delete. (wi::tree_to_widest_ref, wi::tree_to_offset_ref): New typedefs. (wi::to_widest, wi::to_offset): Use them. Expand commentary. (wi::tree_to_wide_ref): New typedef. (wi::to_wide): New function. * calls.c (get_size_range): Use wi::to_wide when operating on trees as wide_ints. * cgraph.c (cgraph_node::create_thunk): Likewise. * config/i386/i386.c (ix86_data_alignment): Likewise. (ix86_local_alignment): Likewise. * dbxout.c (stabstr_O): Likewise. * dwarf2out.c (add_scalar_info, gen_enumeration_type_die): Likewise. * expr.c (const_vector_from_tree): Likewise. * fold-const-call.c (host_size_t_cst_p, fold_const_call_1): Likewise. * fold-const.c (may_negate_without_overflow_p, negate_expr_p) (fold_negate_expr_1, int_const_binop_1, const_binop) (fold_convert_const_int_from_real, optimize_bit_field_compare) (all_ones_mask_p, sign_bit_p, unextend, extract_muldiv_1) (fold_div_compare, fold_single_bit_test, fold_plusminus_mult_expr) (pointer_may_wrap_p, expr_not_equal_to, fold_binary_loc) (fold_ternary_loc, multiple_of_p, fold_negate_const, fold_abs_const) (fold_not_const, round_up_loc): Likewise. * gimple-fold.c (gimple_fold_indirect_ref): Likewise. * gimple-ssa-warn-alloca.c (alloca_call_type_by_arg): Likewise. (alloca_call_type): Likewise. * gimple.c (preprocess_case_label_vec_for_gimple): Likewise. * godump.c (go_output_typedef): Likewise. * graphite-sese-to-poly.c (tree_int_to_gmp): Likewise. * internal-fn.c (get_min_precision): Likewise. * ipa-cp.c (ipcp_store_vr_results): Likewise. * ipa-polymorphic-call.c (ipa_polymorphic_call_context::ipa_polymorphic_call_context): Likewise. * ipa-prop.c (ipa_print_node_jump_functions_for_edge): Likewise. (ipa_modify_call_arguments): Likewise. * match.pd: Likewise. * omp-low.c (scan_omp_1_op, lower_omp_ordered_clauses): Likewise. * print-tree.c (print_node_brief, print_node): Likewise. * stmt.c (expand_case): Likewise. * stor-layout.c (layout_type): Likewise. * tree-affine.c (tree_to_aff_combination): Likewise. * tree-cfg.c (group_case_labels_stmt): Likewise. * tree-data-ref.c (dr_analyze_indices): Likewise. (prune_runtime_alias_test_list): Likewise. * tree-dump.c (dequeue_and_dump): Likewise. * tree-inline.c (remap_gimple_op_r, copy_tree_body_r): Likewise. * tree-predcom.c (is_inv_store_elimination_chain): Likewise. * tree-pretty-print.c (dump_generic_node): Likewise. * tree-scalar-evolution.c (iv_can_overflow_p): Likewise. (simple_iv_with_niters): Likewise. * tree-ssa-address.c (addr_for_mem_ref): Likewise. * tree-ssa-ccp.c (ccp_finalize, evaluate_stmt): Likewise. * tree-ssa-loop-ivopts.c (constant_multiple_of): Likewise. * tree-ssa-loop-niter.c (split_to_var_and_offset) (refine_value_range_using_guard, number_of_iterations_ne_max) (number_of_iterations_lt_to_ne, number_of_iterations_lt) (get_cst_init_from_scev, record_nonwrapping_iv) (scev_var_range_cant_overflow): Likewise. * tree-ssa-phiopt.c (minmax_replacement): Likewise. * tree-ssa-pre.c (compute_avail): Likewise. * tree-ssa-sccvn.c (vn_reference_fold_indirect): Likewise. (vn_reference_maybe_forwprop_address, valueized_wider_op): Likewise. * tree-ssa-structalias.c (get_constraint_for_ptr_offset): Likewise. * tree-ssa-uninit.c (is_pred_expr_subset_of): Likewise. * tree-ssanames.c (set_nonzero_bits, get_nonzero_bits): Likewise. * tree-switch-conversion.c (collect_switch_conv_info, array_value_type) (dump_case_nodes, try_switch_expansion): Likewise. * tree-vect-loop-manip.c (vect_gen_vector_loop_niters): Likewise. (vect_do_peeling): Likewise. * tree-vect-patterns.c (vect_recog_bool_pattern): Likewise. * tree-vect-stmts.c (vectorizable_load): Likewise. * tree-vrp.c (compare_values_warnv, vrp_int_const_binop): Likewise. (zero_nonzero_bits_from_vr, ranges_from_anti_range): Likewise. (extract_range_from_binary_expr_1, adjust_range_with_scev): Likewise. (overflow_comparison_p_1, register_edge_assert_for_2): Likewise. (is_masked_range_test, find_switch_asserts, maybe_set_nonzero_bits) (vrp_evaluate_conditional_warnv_with_ops, intersect_ranges): Likewise. (range_fits_type_p, two_valued_val_range_p, vrp_finalize): Likewise. (evrp_dom_walker::before_dom_children): Likewise. * tree.c (cache_integer_cst, real_value_from_int_cst, integer_zerop) (integer_all_onesp, integer_pow2p, integer_nonzerop, tree_log2) (tree_floor_log2, tree_ctz, mem_ref_offset, tree_int_cst_sign_bit) (tree_int_cst_sgn, get_unwidened, int_fits_type_p): Likewise. (get_type_static_bounds, num_ending_zeros, drop_tree_overflow) (get_range_pos_neg): Likewise. * ubsan.c (ubsan_expand_ptr_ifn): Likewise. * config/darwin.c (darwin_mergeable_constant_section): Likewise. * config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise. * config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise. * config/avr/avr.c (avr_fold_builtin): Likewise. * config/bfin/bfin.c (bfin_local_alignment): Likewise. * config/msp430/msp430.c (msp430_attr): Likewise. * config/nds32/nds32.c (nds32_insert_attributes): Likewise. * config/powerpcspe/powerpcspe-c.c (altivec_resolve_overloaded_builtin): Likewise. * config/powerpcspe/powerpcspe.c (rs6000_aggregate_candidate) (rs6000_expand_ternop_builtin): Likewise. * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Likewise. * config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise. (rs6000_expand_ternop_builtin): Likewise. * config/s390/s390.c (s390_handle_hotpatch_attribute): Likewise. gcc/ada/ * gcc-interface/decl.c (annotate_value): Use wi::to_wide when operating on trees as wide_ints. gcc/c/ * c-parser.c (c_parser_cilk_clause_vectorlength): Use wi::to_wide when operating on trees as wide_ints. * c-typeck.c (build_c_cast, c_finish_omp_clauses): Likewise. (c_tree_equal): Likewise. gcc/c-family/ * c-ada-spec.c (dump_generic_ada_node): Use wi::to_wide when operating on trees as wide_ints. * c-common.c (pointer_int_sum): Likewise. * c-pretty-print.c (pp_c_integer_constant): Likewise. * c-warn.c (match_case_to_enum_1): Likewise. (c_do_switch_warnings): Likewise. (maybe_warn_shift_overflow): Likewise. gcc/cp/ * cvt.c (ignore_overflows): Use wi::to_wide when operating on trees as wide_ints. * decl.c (check_array_designated_initializer): Likewise. * mangle.c (write_integer_cst): Likewise. * semantics.c (cp_finish_omp_clause_depend_sink): Likewise. gcc/fortran/ * target-memory.c (gfc_interpret_logical): Use wi::to_wide when operating on trees as wide_ints. * trans-const.c (gfc_conv_tree_to_mpz): Likewise. * trans-expr.c (gfc_conv_cst_int_power): Likewise. * trans-intrinsic.c (trans_this_image): Likewise. (gfc_conv_intrinsic_bound): Likewise. (conv_intrinsic_cobound): Likewise. gcc/lto/ * lto.c (compare_tree_sccs_1): Use wi::to_wide when operating on trees as wide_ints. gcc/objc/ * objc-act.c (objc_decl_method_attributes): Use wi::to_wide when operating on trees as wide_ints. From-SVN: r253595
2017-08-30[62/77] Big machine_mode to scalar_int_mode replacementRichard Sandiford1-3/+2
This patch changes the types of various things from machine_mode to scalar_int_mode, in cases where (after previous patches) simply changing the type is enough on its own. The patch does nothing other than that. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * builtins.h (builtin_strncpy_read_str): Take a scalar_int_mode instead of a machine_mode. (builtin_memset_read_str): Likewise. * builtins.c (c_readstr): Likewise. (builtin_memcpy_read_str): Likewise. (builtin_strncpy_read_str): Likewise. (builtin_memset_read_str): Likewise. (builtin_memset_gen_str): Likewise. (expand_builtin_signbit): Use scalar_int_mode for local variables. * cfgexpand.c (convert_debug_memory_address): Take a scalar_int_mode instead of a machine_mode. * combine.c (simplify_if_then_else): Use scalar_int_mode for local variables. (make_extraction): Likewise. (try_widen_shift_mode): Take and return scalar_int_modes instead of machine_modes. * config/aarch64/aarch64.c (aarch64_libgcc_cmp_return_mode): Return a scalar_int_mode instead of a machine_mode. * config/avr/avr.c (avr_addr_space_address_mode): Likewise. (avr_addr_space_pointer_mode): Likewise. * config/cr16/cr16.c (cr16_unwind_word_mode): Likewise. * config/msp430/msp430.c (msp430_addr_space_pointer_mode): Likewise. (msp430_unwind_word_mode): Likewise. * config/spu/spu.c (spu_unwind_word_mode): Likewise. (spu_addr_space_pointer_mode): Likewise. (spu_addr_space_address_mode): Likewise. (spu_libgcc_cmp_return_mode): Likewise. (spu_libgcc_shift_count_mode): Likewise. * config/rl78/rl78.c (rl78_addr_space_address_mode): Likewise. (rl78_addr_space_pointer_mode): Likewise. (fl78_unwind_word_mode): Likewise. (rl78_valid_pointer_mode): Take a scalar_int_mode instead of a machine_mode. * config/alpha/alpha.c (vms_valid_pointer_mode): Likewise. * config/ia64/ia64.c (ia64_vms_valid_pointer_mode): Likewise. * config/mips/mips.c (mips_mode_rep_extended): Likewise. (mips_valid_pointer_mode): Likewise. * config/tilegx/tilegx.c (tilegx_mode_rep_extended): Likewise. * config/ft32/ft32.c (ft32_valid_pointer_mode): Likewise. (ft32_addr_space_pointer_mode): Return a scalar_int_mode instead of a machine_mode. (ft32_addr_space_address_mode): Likewise. * config/m32c/m32c.c (m32c_valid_pointer_mode): Take a scalar_int_mode instead of a machine_mode. (m32c_addr_space_pointer_mode): Return a scalar_int_mode instead of a machine_mode. (m32c_addr_space_address_mode): Likewise. * config/powerpcspe/powerpcspe.c (rs6000_abi_word_mode): Likewise. (rs6000_eh_return_filter_mode): Likewise. * config/rs6000/rs6000.c (rs6000_abi_word_mode): Likewise. (rs6000_eh_return_filter_mode): Likewise. * config/s390/s390.c (s390_libgcc_cmp_return_mode): Likewise. (s390_libgcc_shift_count_mode): Likewise. (s390_unwind_word_mode): Likewise. (s390_valid_pointer_mode): Take a scalar_int_mode rather than a machine_mode. * target.def (mode_rep_extended): Likewise. (valid_pointer_mode): Likewise. (addr_space.valid_pointer_mode): Likewise. (eh_return_filter_mode): Return a scalar_int_mode rather than a machine_mode. (libgcc_cmp_return_mode): Likewise. (libgcc_shift_count_mode): Likewise. (unwind_word_mode): Likewise. (addr_space.pointer_mode): Likewise. (addr_space.address_mode): Likewise. * doc/tm.texi: Regenerate. * dojump.c (prefer_and_bit_test): Take a scalar_int_mode rather than a machine_mode. (do_jump): Use scalar_int_mode for local variables. * dwarf2cfi.c (init_return_column_size): Take a scalar_int_mode rather than a machine_mode. * dwarf2out.c (convert_descriptor_to_mode): Likewise. (scompare_loc_descriptor_wide): Likewise. (scompare_loc_descriptor_narrow): Likewise. * emit-rtl.c (adjust_address_1): Use scalar_int_mode for local variables. * except.c (sjlj_emit_dispatch_table): Likewise. (expand_builtin_eh_copy_values): Likewise. * explow.c (convert_memory_address_addr_space_1): Likewise. Take a scalar_int_mode rather than a machine_mode. (convert_memory_address_addr_space): Take a scalar_int_mode rather than a machine_mode. (memory_address_addr_space): Use scalar_int_mode for local variables. * expmed.h (expand_mult_highpart_adjust): Take a scalar_int_mode rather than a machine_mode. * expmed.c (mask_rtx): Likewise. (init_expmed_one_conv): Likewise. (expand_mult_highpart_adjust): Likewise. (extract_high_half): Likewise. (expmed_mult_highpart_optab): Likewise. (expmed_mult_highpart): Likewise. (expand_smod_pow2): Likewise. (expand_sdiv_pow2): Likewise. (emit_store_flag_int): Likewise. (adjust_bit_field_mem_for_reg): Use scalar_int_mode for local variables. (extract_low_bits): Likewise. * expr.h (by_pieces_constfn): Take a scalar_int_mode rather than a machine_mode. * expr.c (pieces_addr::adjust): Likewise. (can_store_by_pieces): Likewise. (store_by_pieces): Likewise. (clear_by_pieces_1): Likewise. (expand_expr_addr_expr_1): Likewise. (expand_expr_addr_expr): Use scalar_int_mode for local variables. (expand_expr_real_1): Likewise. (try_casesi): Likewise. * final.c (shorten_branches): Likewise. * fold-const.c (fold_convert_const_int_from_fixed): Change the type of "mode" to machine_mode. * internal-fn.c (expand_arith_overflow_result_store): Take a scalar_int_mode rather than a machine_mode. (expand_mul_overflow): Use scalar_int_mode for local variables. * loop-doloop.c (doloop_modify): Likewise. (doloop_optimize): Likewise. * optabs.c (expand_subword_shift): Take a scalar_int_mode rather than a machine_mode. (expand_doubleword_shift_condmove): Likewise. (expand_doubleword_shift): Likewise. (expand_doubleword_clz): Likewise. (expand_doubleword_popcount): Likewise. (expand_doubleword_parity): Likewise. (expand_absneg_bit): Use scalar_int_mode for local variables. (prepare_float_lib_cmp): Likewise. * rtl.h (convert_memory_address_addr_space_1): Take a scalar_int_mode rather than a machine_mode. (convert_memory_address_addr_space): Likewise. (get_mode_bounds): Likewise. (get_address_mode): Return a scalar_int_mode rather than a machine_mode. * rtlanal.c (get_address_mode): Likewise. * stor-layout.c (get_mode_bounds): Take a scalar_int_mode rather than a machine_mode. * targhooks.c (default_mode_rep_extended): Likewise. (default_valid_pointer_mode): Likewise. (default_addr_space_valid_pointer_mode): Likewise. (default_eh_return_filter_mode): Return a scalar_int_mode rather than a machine_mode. (default_libgcc_cmp_return_mode): Likewise. (default_libgcc_shift_count_mode): Likewise. (default_unwind_word_mode): Likewise. (default_addr_space_pointer_mode): Likewise. (default_addr_space_address_mode): Likewise. * targhooks.h (default_eh_return_filter_mode): Likewise. (default_libgcc_cmp_return_mode): Likewise. (default_libgcc_shift_count_mode): Likewise. (default_unwind_word_mode): Likewise. (default_addr_space_pointer_mode): Likewise. (default_addr_space_address_mode): Likewise. (default_mode_rep_extended): Take a scalar_int_mode rather than a machine_mode. (default_valid_pointer_mode): Likewise. (default_addr_space_valid_pointer_mode): Likewise. * tree-ssa-address.c (addr_for_mem_ref): Use scalar_int_mode for local variables. * tree-ssa-loop-ivopts.c (get_shiftadd_cost): Take a scalar_int_mode rather than a machine_mode. * tree-switch-conversion.c (array_value_type): Use scalar_int_mode for local variables. * tree-vrp.c (simplify_float_conversion_using_ranges): Likewise. * var-tracking.c (use_narrower_mode): Take a scalar_int_mode rather than a machine_mode. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251513
2017-08-30[35/77] Add uses of as_a <scalar_int_mode>Richard Sandiford1-1/+2
This patch adds asserting as_a <scalar_int_mode> conversions to contexts in which the input is known to be a scalar integer mode. In expand_divmod, op1 is always a scalar_int_mode if op1_is_constant (but might not be otherwise). In expand_binop, the patch reverses a < comparison in order to avoid splitting a long line. gcc/ 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> * cfgexpand.c (convert_debug_memory_address): Use as_a <scalar_int_mode>. * combine.c (expand_compound_operation): Likewise. (make_extraction): Likewise. (change_zero_ext): Likewise. (simplify_comparison): Likewise. * cse.c (cse_insn): Likewise. * dwarf2out.c (minmax_loc_descriptor): Likewise. (mem_loc_descriptor): Likewise. (loc_descriptor): Likewise. * expmed.c (init_expmed_one_mode): Likewise. (synth_mult): Likewise. (emit_store_flag_1): Likewise. (expand_divmod): Likewise. Use HWI_COMPUTABLE_MODE_P instead of a comparison with size. * expr.c (expand_assignment): Use as_a <scalar_int_mode>. (reduce_to_bit_field_precision): Likewise. * function.c (expand_function_end): Likewise. * internal-fn.c (expand_arith_overflow_result_store): Likewise. * loop-doloop.c (doloop_modify): Likewise. * optabs.c (expand_binop): Likewise. (expand_unop): Likewise. (expand_copysign_absneg): Likewise. (prepare_cmp_insn): Likewise. (maybe_legitimize_operand): Likewise. * recog.c (const_scalar_int_operand): Likewise. * rtlanal.c (get_address_mode): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_cond_clz_ctz): Likewise. * tree-nested.c (get_nl_goto_field): Likewise. * tree.c (build_vector_type_for_mode): Likewise. * var-tracking.c (use_narrower_mode): Likewise. gcc/c-family/ 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> * c-common.c (c_common_type_for_mode): Use as_a <scalar_int_mode>. gcc/lto/ 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> * lto-lang.c (lto_type_for_mode): Use as_a <scalar_int_mode>. From-SVN: r251487
2017-08-30[34/77] Add a SCALAR_INT_TYPE_MODE macroRichard Sandiford1-4/+4
This patch adds a SCALAR_INT_TYPE_MODE macro that asserts that the type has a scalar integer mode and returns it as a scalar_int_mode. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * tree.h (SCALAR_INT_TYPE_MODE): New macro. * builtins.c (expand_builtin_signbit): Use it. * cfgexpand.c (expand_debug_expr): Likewise. * dojump.c (do_jump): Likewise. (do_compare_and_jump): Likewise. * dwarf2cfi.c (expand_builtin_init_dwarf_reg_sizes): Likewise. * expmed.c (make_tree): Likewise. * expr.c (expand_expr_real_2): Likewise. (expand_expr_real_1): Likewise. (try_casesi): Likewise. * fold-const-call.c (fold_const_call_ss): Likewise. * fold-const.c (unextend): Likewise. (extract_muldiv_1): Likewise. (fold_single_bit_test): Likewise. (native_encode_int): Likewise. (native_encode_string): Likewise. (native_interpret_int): Likewise. * gimple-fold.c (gimple_fold_builtin_memset): Likewise. * internal-fn.c (expand_addsub_overflow): Likewise. (expand_neg_overflow): Likewise. (expand_mul_overflow): Likewise. (expand_arith_overflow): Likewise. * match.pd: Likewise. * stor-layout.c (layout_type): Likewise. * tree-cfg.c (verify_gimple_assign_ternary): Likewise. * tree-ssa-math-opts.c (convert_mult_to_widen): Likewise. * tree-ssanames.c (get_range_info): Likewise. * tree-switch-conversion.c (array_value_type) Likewise. * tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise. (vect_recog_divmod_pattern): Likewise. (vect_recog_mixed_size_cond_pattern): Likewise. * tree-vrp.c (extract_range_basic): Likewise. (simplify_float_conversion_using_ranges): Likewise. * tree.c (int_fits_type_p): Likewise. * ubsan.c (instrument_bool_enum_load): Likewise. * varasm.c (mergeable_string_section): Likewise. (narrowing_initializer_constant_valid_p): Likewise. (output_constant): Likewise. gcc/cp/ * cvt.c (cp_convert_to_pointer): Use SCALAR_INT_TYPE_MODE. gcc/fortran/ * target-memory.c (size_integer): Use SCALAR_INT_TYPE_MODE. (size_logical): Likewise. gcc/objc/ * objc-encoding.c (encode_type): Use SCALAR_INT_TYPE_MODE. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251486
2017-08-30[19/77] Add a smallest_int_mode_for_size helper functionRichard Sandiford1-2/+2
This patch adds a wrapper around smallest_mode_for_size for cases in which the mode class is MODE_INT. Unlike (int_)mode_for_size, smallest_mode_for_size always returns a mode of the specified class, asserting if no such mode exists. smallest_int_mode_for_size therefore returns a scalar_int_mode rather than an opt_scalar_int_mode. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (smallest_mode_for_size): Fix formatting. (smallest_int_mode_for_size): New function. * cfgexpand.c (expand_debug_expr): Use smallest_int_mode_for_size instead of smallest_mode_for_size. * combine.c (make_extraction): Likewise. * config/arc/arc.c (arc_expand_movmem): Likewise. * config/arm/arm.c (arm_expand_divmod_libfunc): Likewise. * config/i386/i386.c (ix86_get_mask_mode): Likewise. * config/s390/s390.c (s390_expand_insv): Likewise. * config/sparc/sparc.c (assign_int_registers): Likewise. * config/spu/spu.c (spu_function_value): Likewise. (spu_function_arg): Likewise. * coverage.c (get_gcov_type): Likewise. (get_gcov_unsigned_t): Likewise. * dse.c (find_shift_sequence): Likewise. * expmed.c (store_bit_field_1): Likewise. * expr.c (convert_move): Likewise. (store_field): Likewise. * internal-fn.c (expand_arith_overflow): Likewise. * optabs-query.c (get_best_extraction_insn): Likewise. * optabs.c (expand_twoval_binop_libfunc): Likewise. * stor-layout.c (layout_type): Likewise. (initialize_sizetypes): Likewise. * targhooks.c (default_get_mask_mode): Likewise. * tree-ssa-loop-manip.c (canonicalize_loop_ivs): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251471
2017-08-30[17/77] Add an int_mode_for_size helper functionRichard Sandiford1-2/+3
This patch adds a wrapper around mode_for_size for cases in which the mode class is MODE_INT (the commonest case). The return type can then be an opt_scalar_int_mode instead of a machine_mode. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (int_mode_for_size): New function. * builtins.c (set_builtin_user_assembler_name): Use int_mode_for_size instead of mode_for_size. * calls.c (save_fixed_argument_area): Likewise. Make use of BLKmode explicit. * combine.c (expand_field_assignment): Use int_mode_for_size instead of mode_for_size. (make_extraction): Likewise. (simplify_shift_const_1): Likewise. (simplify_comparison): Likewise. * dojump.c (do_jump): Likewise. * dwarf2out.c (mem_loc_descriptor): Likewise. * emit-rtl.c (init_derived_machine_modes): Likewise. * expmed.c (flip_storage_order): Likewise. (convert_extracted_bit_field): Likewise. * expr.c (copy_blkmode_from_reg): Likewise. * graphite-isl-ast-to-gimple.c (max_mode_int_precision): Likewise. * internal-fn.c (expand_mul_overflow): Likewise. * lower-subreg.c (simple_move): Likewise. * optabs-libfuncs.c (init_optabs): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. * tree.c (vector_type_mode): Likewise. * tree-ssa-strlen.c (handle_builtin_memcmp): Likewise. * tree-vect-data-refs.c (vect_lanes_optab_supported_p): Likewise. * tree-vect-generic.c (expand_vector_parallel): Likewise. * tree-vect-stmts.c (vectorizable_load): Likewise. (vectorizable_store): Likewise. gcc/ada/ * gcc-interface/decl.c (gnat_to_gnu_entity): Use int_mode_for_size instead of mode_for_size. (gnat_to_gnu_subprog_type): Likewise. * gcc-interface/utils.c (make_type_from_size): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251469
2017-08-30[6/77] Make GET_MODE_WIDER return an opt_modeRichard Sandiford1-3/+3
GET_MODE_WIDER previously returned VOIDmode if no wider mode existed. That would cause problems with stricter mode classes, since VOIDmode isn't for example a valid scalar integer or floating-point mode. This patch instead makes it return a new opt_mode<T> class, which holds either a T or nothing. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * coretypes.h (opt_mode): New class. * machmode.h (opt_mode): Likewise. (opt_mode::else_void): New function. (opt_mode::require): Likewise. (opt_mode::exists): Likewise. (GET_MODE_WIDER_MODE): Turn into a function and return an opt_mode. (GET_MODE_2XWIDER_MODE): Likewise. (mode_iterator::get_wider): Update accordingly. (mode_iterator::get_2xwider): Likewise. (mode_iterator::get_known_wider): Likewise, turning into a template. * combine.c (make_extraction): Update use of GET_MODE_WIDER_MODE, forcing a wider mode to exist. * config/cr16/cr16.h (LONG_REG_P): Likewise. * rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise. * config/c6x/c6x.c (c6x_rtx_costs): Update use of GET_MODE_2XWIDER_MODE, forcing a wider mode to exist. * lower-subreg.c (init_lower_subreg): Likewise. * optabs-libfuncs.c (init_sync_libfuncs_1): Likewise, but not on the final iteration. * config/i386/i386.c (ix86_expand_set_or_movmem): Check whether a wider mode exists before asking for a move pattern. (get_mode_wider_vector): Update use of GET_MODE_WIDER_MODE, forcing a wider mode to exist. (expand_vselect_vconcat): Update use of GET_MODE_2XWIDER_MODE, returning false if no such mode exists. * config/ia64/ia64.c (expand_vselect_vconcat): Likewise. * config/mips/mips.c (mips_expand_vselect_vconcat): Likewise. * expmed.c (init_expmed_one_mode): Update use of GET_MODE_WIDER_MODE. Avoid checking for a MODE_INT if we already know the mode is not a SCALAR_INT_MODE_P. (extract_high_half): Update use of GET_MODE_WIDER_MODE, forcing a wider mode to exist. (expmed_mult_highpart_optab): Likewise. (expmed_mult_highpart): Likewise. * expr.c (expand_expr_real_2): Update use of GET_MODE_WIDER_MODE, using else_void. * lto-streamer-in.c (lto_input_mode_table): Likewise. * optabs-query.c (find_widening_optab_handler_and_mode): Likewise. * stor-layout.c (bit_field_mode_iterator::next_mode): Likewise. * internal-fn.c (expand_mul_overflow): Update use of GET_MODE_2XWIDER_MODE. * omp-low.c (omp_clause_aligned_alignment): Likewise. * tree-ssa-math-opts.c (convert_mult_to_widen): Update use of GET_MODE_WIDER_MODE. (convert_plusminus_to_widen): Likewise. * tree-switch-conversion.c (array_value_type): Likewise. * var-tracking.c (emit_note_insn_var_location): Likewise. * tree-vrp.c (simplify_float_conversion_using_ranges): Likewise. Return false inside rather than outside the loop if no wider mode exists * optabs.c (expand_binop): Update use of GET_MODE_WIDER_MODE and GET_MODE_2XWIDER_MODE (can_compare_p): Use else_void. * gdbhooks.py (OptMachineModePrinter): New class. (build_pretty_printer): Use it for opt_mode. gcc/ada/ * gcc-interface/decl.c (validate_size): Update use of GET_MODE_WIDER_MODE, forcing a wider mode to exist. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251457
2017-08-08trans.c: Include header files.Martin Liska1-0/+2
. 2017-08-08 Martin Liska <mliska@suse.cz> * gcc-interface/trans.c: Include header files. 2017-08-08 Martin Liska <mliska@suse.cz> * objc-gnu-runtime-abi-01.c: Include header files. * objc-next-runtime-abi-01.c: Likewise. * objc-next-runtime-abi-02.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * asan.c: Include header files. * attribs.c (build_decl_attribute_variant): New function moved from tree.[ch]. (build_type_attribute_qual_variant): Likewise. (cmp_attrib_identifiers): Likewise. (simple_cst_list_equal): Likewise. (omp_declare_simd_clauses_equal): Likewise. (attribute_value_equal): Likewise. (comp_type_attributes): Likewise. (build_type_attribute_variant): Likewise. (lookup_ident_attribute): Likewise. (remove_attribute): Likewise. (merge_attributes): Likewise. (merge_type_attributes): Likewise. (merge_decl_attributes): Likewise. (merge_dllimport_decl_attributes): Likewise. (handle_dll_attribute): Likewise. (attribute_list_equal): Likewise. (attribute_list_contained): Likewise. * attribs.h (lookup_attribute): New function moved from tree.[ch]. (lookup_attribute_by_prefix): Likewise. * bb-reorder.c: Include header files. * builtins.c: Likewise. * calls.c: Likewise. * cfgexpand.c: Likewise. * cgraph.c: Likewise. * cgraphunit.c: Likewise. * convert.c: Likewise. * dwarf2out.c: Likewise. * final.c: Likewise. * fold-const.c: Likewise. * function.c: Likewise. * gimple-expr.c: Likewise. * gimple-fold.c: Likewise. * gimple-pretty-print.c: Likewise. * gimple.c: Likewise. * gimplify.c: Likewise. * hsa-common.c: Likewise. * hsa-gen.c: Likewise. * internal-fn.c: Likewise. * ipa-chkp.c: Likewise. * ipa-cp.c: Likewise. * ipa-devirt.c: Likewise. * ipa-fnsummary.c: Likewise. * ipa-inline.c: Likewise. * ipa-visibility.c: Likewise. * ipa.c: Likewise. * lto-cgraph.c: Likewise. * omp-expand.c: Likewise. * omp-general.c: Likewise. * omp-low.c: Likewise. * omp-offload.c: Likewise. * omp-simd-clone.c: Likewise. * opts-global.c: Likewise. * passes.c: Likewise. * predict.c: Likewise. * sancov.c: Likewise. * sanopt.c: Likewise. * symtab.c: Likewise. * toplev.c: Likewise. * trans-mem.c: Likewise. * tree-chkp.c: Likewise. * tree-eh.c: Likewise. * tree-into-ssa.c: Likewise. * tree-object-size.c: Likewise. * tree-parloops.c: Likewise. * tree-profile.c: Likewise. * tree-ssa-ccp.c: Likewise. * tree-ssa-live.c: Likewise. * tree-ssa-loop.c: Likewise. * tree-ssa-sccvn.c: Likewise. * tree-ssa-structalias.c: Likewise. * tree-ssa.c: Likewise. * tree-streamer-in.c: Likewise. * tree-vectorizer.c: Likewise. * tree-vrp.c: Likewise. * tsan.c: Likewise. * ubsan.c: Likewise. * varasm.c: Likewise. * varpool.c: Likewise. * tree.c: Remove functions moved to attribs.[ch]. * tree.h: Likewise. * config/aarch64/aarch64.c: Add attrs.h header file. * config/alpha/alpha.c: Likewise. * config/arc/arc.c: Likewise. * config/arm/arm.c: Likewise. * config/avr/avr.c: Likewise. * config/bfin/bfin.c: Likewise. * config/c6x/c6x.c: Likewise. * config/cr16/cr16.c: Likewise. * config/cris/cris.c: Likewise. * config/darwin.c: Likewise. * config/epiphany/epiphany.c: Likewise. * config/fr30/fr30.c: Likewise. * config/frv/frv.c: Likewise. * config/ft32/ft32.c: Likewise. * config/h8300/h8300.c: Likewise. * config/i386/winnt.c: Likewise. * config/ia64/ia64.c: Likewise. * config/iq2000/iq2000.c: Likewise. * config/lm32/lm32.c: Likewise. * config/m32c/m32c.c: Likewise. * config/m32r/m32r.c: Likewise. * config/m68k/m68k.c: Likewise. * config/mcore/mcore.c: Likewise. * config/microblaze/microblaze.c: Likewise. * config/mips/mips.c: Likewise. * config/mmix/mmix.c: Likewise. * config/mn10300/mn10300.c: Likewise. * config/moxie/moxie.c: Likewise. * config/msp430/msp430.c: Likewise. * config/nds32/nds32-isr.c: Likewise. * config/nds32/nds32.c: Likewise. * config/nios2/nios2.c: Likewise. * config/nvptx/nvptx.c: Likewise. * config/pa/pa.c: Likewise. * config/pdp11/pdp11.c: Likewise. * config/powerpcspe/powerpcspe.c: Likewise. * config/riscv/riscv.c: Likewise. * config/rl78/rl78.c: Likewise. * config/rx/rx.c: Likewise. * config/s390/s390.c: Likewise. * config/sh/sh.c: Likewise. * config/sol2.c: Likewise. * config/sparc/sparc.c: Likewise. * config/spu/spu.c: Likewise. * config/stormy16/stormy16.c: Likewise. * config/tilegx/tilegx.c: Likewise. * config/tilepro/tilepro.c: Likewise. * config/v850/v850.c: Likewise. * config/vax/vax.c: Likewise. * config/visium/visium.c: Likewise. * config/xtensa/xtensa.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * call.c: Include header files. * cp-gimplify.c: Likewise. * cp-ubsan.c: Likewise. * cvt.c: Likewise. * init.c: Likewise. * search.c: Likewise. * semantics.c: Likewise. * typeck.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * lto-lang.c: Include header files. * lto-symtab.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * c-convert.c: Include header files. * c-typeck.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * c-ada-spec.c: Include header files. * c-ubsan.c: Likewise. * c-warn.c: Likewise. 2017-08-08 Martin Liska <mliska@suse.cz> * trans-types.c: Include header files. From-SVN: r250946
2017-07-28re PR sanitizer/80998 (Implement -fsanitize=pointer-overflow)Jakub Jelinek1-0/+8
PR sanitizer/80998 * sanopt.c (pass_sanopt::execute): Handle IFN_UBSAN_PTR. * tree-ssa-alias.c (call_may_clobber_ref_p_1): Likewise. * flag-types.h (enum sanitize_code): Add SANITIZER_POINTER_OVERFLOW. Or it into SANITIZER_UNDEFINED. * ubsan.c: Include gimple-fold.h and varasm.h. (ubsan_expand_ptr_ifn): New function. (instrument_pointer_overflow): New function. (maybe_instrument_pointer_overflow): New function. (instrument_object_size): Formatting fix. (pass_ubsan::execute): Call instrument_pointer_overflow and maybe_instrument_pointer_overflow. * internal-fn.c (expand_UBSAN_PTR): New function. * ubsan.h (ubsan_expand_ptr_ifn): Declare. * sanitizer.def (__ubsan_handle_pointer_overflow, __ubsan_handle_pointer_overflow_abort): New builtins. * tree-ssa-tail-merge.c (merge_stmts_p): Handle IFN_UBSAN_PTR. * internal-fn.def (UBSAN_PTR): New internal function. * opts.c (sanitizer_opts): Add pointer-overflow. * lto-streamer-in.c (input_function): Handle IFN_UBSAN_PTR. * fold-const.c (build_range_check): Compute pointer range check in integral type if pointer arithmetics would be needed. Formatting fixes. gcc/testsuite/ * c-c++-common/ubsan/ptr-overflow-1.c: New test. * c-c++-common/ubsan/ptr-overflow-2.c: New test. libsanitizer/ * ubsan/ubsan_handlers.cc: Cherry-pick upstream r304461. * ubsan/ubsan_checks.inc: Likewise. * ubsan/ubsan_handlers.h: Likewise. From-SVN: r250656
2017-07-16profile-count.h (profile_probability::from_reg_br_prob_note, [...]): New ↵Jan Hubicka1-4/+8
functions. * profile-count.h (profile_probability::from_reg_br_prob_note, profile_probability::to_reg_br_prob_note): New functions. * doc/rtl.texi (REG_BR_PROB_NOTE): Update documentation. * reg-notes.h (REG_BR_PROB, REG_BR_PRED): Update docs. * predict.c (probability_reliable_p): Update. (edge_probability_reliable_p): Update. (br_prob_note_reliable_p): Update. (invert_br_probabilities): Update. (add_reg_br_prob_note): New function. (combine_predictions_for_insn): Update. * asan.c (asan_clear_shadow): Update. * cfgbuild.c (compute_outgoing_frequencies): Update. * cfgrtl.c (force_nonfallthru_and_redirect): Update. (update_br_prob_note): Update. (rtl_verify_edges): Update. (purge_dead_edges): Update. (fixup_reorder_chain): Update. * emit-rtl.c (try_split): Update. * ifcvt.c (cond_exec_process_insns): Update. (cond_exec_process_if_block): Update. (dead_or_predicable): Update. * internal-fn.c (expand_addsub_overflow): Update. (expand_neg_overflow): Update. (expand_mul_overflow): Update. * loop-doloop.c (doloop_modify): Update. * loop-unroll.c (compare_and_jump_seq): Update. * optabs.c (emit_cmp_and_jump_insn_1): Update. * predict.h: Update. * reorg.c (mostly_true_jump): Update. * rtl.h: Update. * config/aarch64/aarch64.c (aarch64_emit_unlikely_jump): Update. * config/alpha/alpha.c (emit_unlikely_jump): Update. * config/arc/arc.c: (emit_unlikely_jump): Update. * config/arm/arm.c: (emit_unlikely_jump): Update. * config/bfin/bfin.c (cbranch_predicted_taken_p): Update. * config/frv/frv.c (frv_print_operand_jump_hint): Update. * config/i386/i386.c (ix86_expand_split_stack_prologue): Update. (ix86_print_operand): Update. (ix86_split_fp_branch): Update. (predict_jump): Update. * config/ia64/ia64.c (ia64_print_operand): Update. * config/mmix/mmix.c (mmix_print_operand): Update. * config/powerpcspe/powerpcspe.c (output_cbranch): Update. (rs6000_expand_split_stack_prologue): Update. * config/rs6000/rs6000.c: Update. * config/s390/s390.c (s390_expand_vec_strlen): Update. (s390_expand_vec_movstr): Update. (s390_expand_cs_tdsi): Update. (s390_expand_split_stack_prologue): Update. * config/sh/sh.c (sh_print_operand): Update. (expand_cbranchsi4): Update. (expand_cbranchdi4): Update. * config/sparc/sparc.c (output_v9branch): Update. * config/spu/spu.c (get_branch_target): Update. (ea_load_store_inline): Update. * config/tilegx/tilegx.c (cbranch_predicted_p): Update. * config/tilepro/tilepro.c: Update. * gcc.dg/predict-8.c: Update. From-SVN: r250239
2017-07-06ASAN: Implement dynamic allocas/VLAs sanitization.Maxim Ostapenko1-0/+1
gcc/ * asan.c: Include gimple-fold.h. (get_last_alloca_addr): New function. (handle_builtin_stackrestore): Likewise. (handle_builtin_alloca): Likewise. (asan_emit_allocas_unpoison): Likewise. (get_mem_refs_of_builtin_call): Add new parameter, remove const quallifier from first paramerer. Handle BUILT_IN_ALLOCA, BUILT_IN_ALLOCA_WITH_ALIGN and BUILT_IN_STACK_RESTORE builtins. (instrument_builtin_call): Pass gimple iterator to get_mem_refs_of_builtin_call. (last_alloca_addr): New global. * asan.h (asan_emit_allocas_unpoison): Declare. * builtins.c (expand_asan_emit_allocas_unpoison): New function. (expand_builtin): Handle BUILT_IN_ASAN_ALLOCAS_UNPOISON. * cfgexpand.c (expand_used_vars): Call asan_emit_allocas_unpoison if function calls alloca. * gimple-fold.c (replace_call_with_value): Remove static keyword. * gimple-fold.h (replace_call_with_value): Declare. * internal-fn.c: Include asan.h. * sanitizer.def (BUILT_IN_ASAN_ALLOCA_POISON, BUILT_IN_ASAN_ALLOCAS_UNPOISON): New builtins. gcc/testsuite/ * c-c++-common/asan/alloca_big_alignment.c: New test. * c-c++-common/asan/alloca_detect_custom_size.c: Likewise. * c-c++-common/asan/alloca_instruments_all_paddings.c: Likewise. * c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise. * c-c++-common/asan/alloca_overflow_partial.c: Likewise. * c-c++-common/asan/alloca_overflow_right.c: Likewise. * c-c++-common/asan/alloca_safe_access.c: Likewise. * c-c++-common/asan/alloca_underflow_left.c: Likewise. From-SVN: r250031
2017-07-05Remove enum before machine_modeRichard Sandiford1-3/+3
r216834 did a mass removal of "enum" before "machine_mode". This patch removes some new uses that have been added since then. 2017-07-05 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (simplify_if_then_else): Remove "enum" before "machine_mode". * compare-elim.c (can_eliminate_compare): Likewise. * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_std_type): Likewise. (aarch64_lookup_simd_builtin_type): Likewise. (aarch64_simd_builtin_type): Likewise. (aarch64_init_simd_builtin_types): Likewise. (aarch64_simd_expand_args): Likewise. * config/aarch64/aarch64-protos.h (aarch64_simd_attr_length_rglist): Likewise. (aarch64_reverse_mask): Likewise. (aarch64_simd_emit_reg_reg_move): Likewise. (aarch64_gen_adjusted_ldpstp): Likewise. (aarch64_ccmp_mode_to_code): Likewise. (aarch64_operands_ok_for_ldpstp): Likewise. (aarch64_operands_adjust_ok_for_ldpstp): Likewise. * config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class): Likewise. (aarch64_min_divisions_for_recip_mul): Likewise. (aarch64_reassociation_width): Likewise. (aarch64_get_condition_code_1): Likewise. (aarch64_simd_emit_reg_reg_move): Likewise. (aarch64_simd_attr_length_rglist): Likewise. (aarch64_reverse_mask): Likewise. (aarch64_operands_ok_for_ldpstp): Likewise. (aarch64_operands_adjust_ok_for_ldpstp): Likewise. (aarch64_gen_adjusted_ldpstp): Likewise. * config/aarch64/cortex-a57-fma-steering.c (fma_node::rename): Likewise. * config/arc/arc.c (legitimate_offset_address_p): Likewise. * config/arm/arm-builtins.c (arm_simd_builtin_std_type): Likewise. (arm_lookup_simd_builtin_type): Likewise. (arm_simd_builtin_type): Likewise. (arm_init_simd_builtin_types): Likewise. (arm_expand_builtin_args): Likewise. * config/arm/arm-protos.h (arm_expand_builtin): Likewise. * config/ft32/ft32.c (ft32_libcall_value): Likewise. (ft32_setup_incoming_varargs): Likewise. (ft32_function_arg): Likewise. (ft32_function_arg_advance): Likewise. (ft32_pass_by_reference): Likewise. (ft32_arg_partial_bytes): Likewise. (ft32_valid_pointer_mode): Likewise. (ft32_addr_space_pointer_mode): Likewise. (ft32_addr_space_legitimate_address_p): Likewise. * config/i386/i386-protos.h (ix86_operands_ok_for_move_multiple): Likewise. * config/i386/i386.c (ix86_setup_incoming_vararg_bounds): Likewise. (ix86_emit_outlined_ms2sysv_restore): Likewise. (iamcu_alignment): Likewise. (canonicalize_vector_int_perm): Likewise. (ix86_noce_conversion_profitable_p): Likewise. (ix86_mpx_bound_mode): Likewise. (ix86_operands_ok_for_move_multiple): Likewise. * config/microblaze/microblaze-protos.h (microblaze_expand_conditional_branch_reg): Likewise. * config/microblaze/microblaze.c (microblaze_expand_conditional_branch_reg): Likewise. * config/powerpcspe/powerpcspe.c (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_reassociation_width): Likewise. (rs6000_invalid_binary_op): Likewise. (fusion_p9_p): Likewise. (emit_fusion_p9_load): Likewise. (emit_fusion_p9_store): Likewise. * config/riscv/riscv-protos.h (riscv_regno_mode_ok_for_base_p): Likewise. (riscv_hard_regno_mode_ok_p): Likewise. (riscv_address_insns): Likewise. (riscv_split_symbol): Likewise. (riscv_legitimize_move): Likewise. (riscv_function_value): Likewise. (riscv_hard_regno_nregs): Likewise. (riscv_expand_builtin): Likewise. * config/riscv/riscv.c (riscv_build_integer_1): Likewise. (riscv_build_integer): Likewise. (riscv_split_integer): Likewise. (riscv_legitimate_constant_p): Likewise. (riscv_cannot_force_const_mem): Likewise. (riscv_regno_mode_ok_for_base_p): Likewise. (riscv_valid_base_register_p): Likewise. (riscv_valid_offset_p): Likewise. (riscv_valid_lo_sum_p): Likewise. (riscv_classify_address): Likewise. (riscv_legitimate_address_p): Likewise. (riscv_address_insns): Likewise. (riscv_load_store_insns): Likewise. (riscv_force_binary): Likewise. (riscv_split_symbol): Likewise. (riscv_force_address): Likewise. (riscv_legitimize_address): Likewise. (riscv_move_integer): Likewise. (riscv_legitimize_const_move): Likewise. (riscv_legitimize_move): Likewise. (riscv_address_cost): Likewise. (riscv_subword): Likewise. (riscv_output_move): Likewise. (riscv_canonicalize_int_order_test): Likewise. (riscv_emit_int_order_test): Likewise. (riscv_function_arg_boundary): Likewise. (riscv_pass_mode_in_fpr_p): Likewise. (riscv_pass_fpr_single): Likewise. (riscv_pass_fpr_pair): Likewise. (riscv_get_arg_info): Likewise. (riscv_function_arg): Likewise. (riscv_function_arg_advance): Likewise. (riscv_arg_partial_bytes): Likewise. (riscv_function_value): Likewise. (riscv_pass_by_reference): Likewise. (riscv_setup_incoming_varargs): Likewise. (riscv_print_operand): Likewise. (riscv_elf_select_rtx_section): Likewise. (riscv_save_restore_reg): Likewise. (riscv_for_each_saved_reg): Likewise. (riscv_register_move_cost): Likewise. (riscv_hard_regno_mode_ok_p): Likewise. (riscv_hard_regno_nregs): Likewise. (riscv_class_max_nregs): Likewise. (riscv_memory_move_cost): Likewise. * config/rl78/rl78-protos.h (rl78_split_movsi): Likewise. * config/rl78/rl78.c (rl78_split_movsi): Likewise. (rl78_addr_space_address_mode): Likewise. * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Likewise. * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_reassociation_width): Likewise. (rs6000_invalid_binary_op): Likewise. (fusion_p9_p): Likewise. (emit_fusion_p9_load): Likewise. (emit_fusion_p9_store): Likewise. * config/visium/visium-protos.h (prepare_move_operands): Likewise. (ok_for_simple_move_operands): Likewise. (ok_for_simple_move_strict_operands): Likewise. (ok_for_simple_arith_logic_operands): Likewise. (visium_legitimize_reload_address): Likewise. (visium_select_cc_mode): Likewise. (output_cbranch): Likewise. (visium_split_double_move): Likewise. (visium_expand_copysign): Likewise. (visium_expand_int_cstore): Likewise. (visium_expand_fp_cstore): Likewise. * config/visium/visium.c (visium_pass_by_reference): Likewise. (visium_function_arg): Likewise. (visium_function_arg_advance): Likewise. (visium_libcall_value): Likewise. (visium_setup_incoming_varargs): Likewise. (visium_legitimate_constant_p): Likewise. (visium_legitimate_address_p): Likewise. (visium_legitimize_address): Likewise. (visium_secondary_reload): Likewise. (visium_register_move_cost): Likewise. (visium_memory_move_cost): Likewise. (prepare_move_operands): Likewise. (ok_for_simple_move_operands): Likewise. (ok_for_simple_move_strict_operands): Likewise. (ok_for_simple_arith_logic_operands): Likewise. (visium_function_value_1): Likewise. (rtx_ok_for_offset_p): Likewise. (visium_legitimize_reload_address): Likewise. (visium_split_double_move): Likewise. (visium_expand_copysign): Likewise. (visium_expand_int_cstore): Likewise. (visium_expand_fp_cstore): Likewise. (visium_split_cstore): Likewise. (visium_select_cc_mode): Likewise. (visium_split_cbranch): Likewise. (output_cbranch): Likewise. (visium_print_operand_address): Likewise. * expmed.c (flip_storage_order): Likewise. * expmed.h (emit_cstore): Likewise. (flip_storage_order): Likewise. * genrecog.c (validate_pattern): Likewise. * hsa-gen.c (gen_hsa_addr): Likewise. * internal-fn.c (expand_arith_overflow): Likewise. * ira-color.c (allocno_copy_cost_saving): Likewise. * lra-assigns.c (find_hard_regno_for_1): Likewise. * lra-constraints.c (prohibited_class_reg_set_mode_p): Likewise. (process_invariant_for_inheritance): Likewise. * lra-eliminations.c (move_plus_up): Likewise. * omp-low.c (lower_oacc_reductions): Likewise. * simplify-rtx.c (simplify_subreg): Likewise. * target.def (TARGET_SETUP_INCOMING_VARARG_BOUNDS): Likewise. (TARGET_CHKP_BOUND_MODE): Likewise.. * targhooks.c (default_chkp_bound_mode): Likewise. (default_setup_incoming_vararg_bounds): Likewise. * targhooks.h (default_chkp_bound_mode): Likewise. (default_setup_incoming_vararg_bounds): Likewise. * tree-ssa-math-opts.c (divmod_candidate_p): Likewise. * tree-vect-loop.c (calc_vec_perm_mask_for_shift): Likewise. (have_whole_vector_shift): Likewise. * tree-vect-stmts.c (vectorizable_load): Likewise. * doc/tm.texi: Regenerate. gcc/brig/ * brig-c.h (brig_type_for_mode): Remove "enum" before "machine_mode". * brig-lang.c (brig_langhook_type_for_mode): Likewise. gcc/jit/ * dummy-frontend.c (jit_langhook_type_for_mode): Remove "enum" before "machine_mode". Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r250003
2017-07-05cfgloop.h (struct loop): Add comment.Bin Cheng1-0/+8
* cfgloop.h (struct loop): Add comment. New field orig_loop_num. * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change. * internal-fn.c (expand_LOOP_DIST_ALIAS): New function. * internal-fn.def (LOOP_DIST_ALIAS): New. * tree-vectorizer.c (fold_loop_vectorized_call): Rename to ... (fold_loop_internal_call): ... this. (vect_loop_dist_alias_call): New function. (set_uid_loop_bbs): Call fold_loop_internal_call. (vectorize_loops): Fold IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS internal calls. From-SVN: r249983
2017-06-29asan.c (asan_emit_stack_protection): Update.Jan Hubicka1-49/+49
* asan.c (asan_emit_stack_protection): Update. (create_cond_insert_point): Update. * auto-profile.c (afdo_propagate_circuit): Update. * basic-block.h (struct edge_def): Turn probability to profile_probability. (EDGE_FREQUENCY): Update. * bb-reorder.c (find_traces_1_round): Update. (better_edge_p): Update. (sanitize_hot_paths): Update. * cfg.c (unchecked_make_edge): Initialize probability to uninitialized. (make_single_succ_edge): Update. (check_bb_profile): Update. (dump_edge_info): Update. (update_bb_profile_for_threading): Update. * cfganal.c (connect_infinite_loops_to_exit): Initialize new edge probabilitycount to 0. * cfgbuild.c (compute_outgoing_frequencies): Update. * cfgcleanup.c (try_forward_edges): Update. (outgoing_edges_match): Update. (try_crossjump_to_edge): Update. * cfgexpand.c (expand_gimple_cond): Update make_single_succ_edge. (expand_gimple_tailcall): Update. (construct_init_block): Use make_single_succ_edge. (construct_exit_block): Use make_single_succ_edge. * cfghooks.c (verify_flow_info): Update. (redirect_edge_succ_nodup): Update. (split_edge): Update. (account_profile_record): Update. * cfgloopanal.c (single_likely_exit): Update. * cfgloopmanip.c (scale_loop_profile): Update. (set_zero_probability): Remove. (duplicate_loop_to_header_edge): Update. * cfgloopmanip.h (loop_version): Update prototype. * cfgrtl.c (try_redirect_by_replacing_jump): Update. (force_nonfallthru_and_redirect): Update. (update_br_prob_note): Update. (rtl_verify_edges): Update. (purge_dead_edges): Update. (rtl_lv_add_condition_to_bb): Update. * cgraph.c: (cgraph_edge::redirect_call_stmt_to_calle): Update. * cgraphunit.c (init_lowered_empty_function): Update. (cgraph_node::expand_thunk): Update. * cilk-common.c: Include profile-count.h * dojump.c (inv): Remove. (jumpifnot): Update. (jumpifnot_1): Update. (do_jump_1): Update. (do_jump): Update. (do_jump_by_parts_greater_rtx): Update. (do_compare_rtx_and_jump): Update. * dojump.h (jumpifnot, jumpifnot_1, jumpif_1, jumpif, do_jump, do_jump_1. do_compare_rtx_and_jump): Update prototype. * dwarf2cfi.c: Include profile-count.h * except.c (dw2_build_landing_pads): Use make_single_succ_edge. (sjlj_emit_dispatch_table): Likewise. * explow.c: Include profile-count.h * expmed.c (emit_store_flag_force): Update. (do_cmp_and_jump): Update. * expr.c (compare_by_pieces_d::generate): Update. (compare_by_pieces_d::finish_mode): Update. (emit_block_move_via_loop): Update. (store_expr_with_bounds): Update. (store_constructor): Update. (expand_expr_real_2): Update. (expand_expr_real_1): Update. * expr.h (try_casesi, try_tablejump): Update prototypes. * gimple-pretty-print.c (dump_probability): Update. (dump_profile): New. (dump_gimple_label): Update. (dump_gimple_bb_header): Update. * graph.c (draw_cfg_node_succ_edges): Update. * hsa-gen.c (convert_switch_statements): Update. * ifcvt.c (cheap_bb_rtx_cost_p): Update. (find_if_case_1): Update. (find_if_case_2): Update. * internal-fn.c (expand_arith_overflow_result_store): Update. (expand_addsub_overflow): Update. (expand_neg_overflow): Update. (expand_mul_overflow): Update. (expand_vector_ubsan_overflow): Update. * ipa-cp.c (good_cloning_opportunity_p): Update. * ipa-split.c (split_function): Use make_single_succ_edge. * ipa-utils.c (ipa_merge_profiles): Update. * loop-doloop.c (add_test): Update. (doloop_modify): Update. * loop-unroll.c (compare_and_jump_seq): Update. (unroll_loop_runtime_iterations): Update. * lra-constraints.c (lra_inheritance): Update. * lto-streamer-in.c (input_cfg): Update. * lto-streamer-out.c (output_cfg): Update. * mcf.c (adjust_cfg_counts): Update. * modulo-sched.c (sms_schedule): Update. * omp-expand.c (expand_omp_for_init_counts): Update. (extract_omp_for_update_vars): Update. (expand_omp_ordered_sink): Update. (expand_omp_for_ordered_loops): Update. (expand_omp_for_generic): Update. (expand_omp_for_static_nochunk): Update. (expand_omp_for_static_chunk): Update. (expand_cilk_for): Update. (expand_omp_simd): Update. (expand_omp_taskloop_for_outer): Update. (expand_omp_taskloop_for_inner): Update. * omp-simd-clone.c (simd_clone_adjust): Update. * optabs.c (expand_doubleword_shift): Update. (expand_abs): Update. (emit_cmp_and_jump_insn_1): Update. (expand_compare_and_swap_loop): Update. * optabs.h (emit_cmp_and_jump_insns): Update prototype. * predict.c (predictable_edge_p): Update. (edge_probability_reliable_p): Update. (set_even_probabilities): Update. (combine_predictions_for_insn): Update. (combine_predictions_for_bb): Update. (propagate_freq): Update. (estimate_bb_frequencies): Update. (force_edge_cold): Update. * profile-count.c (profile_count::dump): Add missing space into dump. (profile_count::debug): Add newline. (profile_count::differs_from_p): Explicitly convert to unsigned. (profile_count::stream_in): Update. (profile_probability::dump): New member function. (profile_probability::debug): New member function. (profile_probability::differs_from_p): New member function. (profile_probability::differs_lot_from_p): New member function. (profile_probability::stream_in): New member function. (profile_probability::stream_out): New member function. * profile-count.h (profile_count_quality): Rename to ... (profile_quality): ... this one. (profile_probability): New. (profile_count): Update. * profile.c (compute_branch_probabilities): Update. * recog.c (peep2_attempt): Update. * sched-ebb.c (schedule_ebbs): Update. * sched-rgn.c (find_single_block_region): Update. (compute_dom_prob_ps): Update. (schedule_region): Update. * sel-sched-ir.c (compute_succs_info): Update. * stmt.c (struct case_node): Update. (do_jump_if_equal): Update. (get_outgoing_edge_probs): Update. (conditional_probability): Update. (emit_case_dispatch_table): Update. (expand_case): Update. (expand_sjlj_dispatch_table): Update. (emit_case_nodes): Update. * targhooks.c: Update. * tracer.c (better_p): Update. (find_best_successor): Update. * trans-mem.c (expand_transaction): Update. * tree-call-cdce.c: Update. * tree-cfg.c (gimple_split_edge): Upate. (move_sese_region_to_fn): Upate. * tree-cfgcleanup.c (cleanup_control_expr_graph): Upate. * tree-eh.c (lower_resx): Upate. (cleanup_empty_eh_move_lp): Upate. * tree-if-conv.c (version_loop_for_if_conversion): Update. * tree-inline.c (copy_edges_for_bb): Update. (copy_cfg_body): Update. * tree-parloops.c (gen_parallel_loop): Update. * tree-profile.c (gimple_gen_ic_func_profiler): Update. (gimple_gen_time_profiler): Update. * tree-ssa-dce.c (remove_dead_stmt): Update. * tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update. * tree-ssa-loop-im.c (execute_sm_if_changed): Update. * tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Update. (unloop_loops): Update. (try_peel_loop): Update. * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update. * tree-ssa-loop-split.c (connect_loops): Update. (split_loop): Update. * tree-ssa-loop-unswitch.c (tree_unswitch_loop): Update. (hoist_guard): Update. * tree-ssa-phionlycprop.c (propagate_rhs_into_lhs): Update. * tree-ssa-phiopt.c (replace_phi_edge_with_variable): Update. (value_replacement): Update. * tree-ssa-reassoc.c (branch_fixup): Update. * tree-ssa-tail-merge.c (replace_block_by): Update. * tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Update. (create_edge_and_update_destination_phis): Update. (compute_path_counts): Update. (recompute_probabilities): Update. (update_joiner_offpath_counts): Update. (freqs_to_counts_path): Update. (duplicate_thread_path): Update. * tree-switch-conversion.c (hoist_edge_and_branch_if_true): Update. (struct switch_conv_info): Update. (gen_inbound_check): Update. * tree-vect-loop-manip.c (slpeel_add_loop_guard): Update. (vect_do_peeling): Update. (vect_loop_versioning): Update. * tree-vect-loop.c (scale_profile_for_vect_loop): Update. (optimize_mask_stores): Update. * ubsan.c (ubsan_expand_null_ifn): Update. * value-prof.c (gimple_divmod_fixed_value): Update. (gimple_divmod_fixed_value_transform): Update. (gimple_mod_pow2): Update. (gimple_mod_pow2_value_transform): Update. (gimple_mod_subtract): Update. (gimple_mod_subtract_transform): Update. (gimple_ic): Update. (gimple_stringop_fixed_value): Update. (gimple_stringops_transform): Update. * value-prof.h: Update. From-SVN: r249800
2017-03-28OpenMP/PTX privatization in SIMD regionsAlexander Monakov1-0/+42
* config/nvptx/nvptx-protos.h (nvptx_output_simt_enter): Declare. (nvptx_output_simt_exit): Declare. * config/nvptx/nvptx.c (nvptx_init_unisimt_predicate): Use cfun->machine->unisimt_location. Handle NULL unisimt_predicate. (init_softstack_frame): Move initialization of crtl->is_leaf to... (nvptx_declare_function_name): ...here. Emit declaration of local memory space buffer for omp_simt_enter insn. (nvptx_output_unisimt_switch): New. (nvptx_output_softstack_switch): New. (nvptx_output_simt_enter): New. (nvptx_output_simt_exit): New. * config/nvptx/nvptx.h (struct machine_function): New fields has_simtreg, unisimt_location, simt_stack_size, simt_stack_align. * config/nvptx/nvptx.md (UNSPECV_SIMT_ENTER): New unspec. (UNSPECV_SIMT_EXIT): Ditto. (omp_simt_enter_insn): New insn. (omp_simt_enter): New expansion. (omp_simt_exit): New insn. * config/nvptx/nvptx.opt (msoft-stack-reserve-local): New option. * internal-fn.c (expand_GOMP_SIMT_ENTER): New. (expand_GOMP_SIMT_ENTER_ALLOC): New. (expand_GOMP_SIMT_EXIT): New. * internal-fn.def (GOMP_SIMT_ENTER): New internal function. (GOMP_SIMT_ENTER_ALLOC): Ditto. (GOMP_SIMT_EXIT): Ditto. * target-insns.def (omp_simt_enter): New insn. (omp_simt_exit): Ditto. * omp-low.c (struct omplow_simd_context): New fields simt_eargs, simt_dlist. (lower_rec_simd_input_clauses): Implement SIMT privatization. (lower_rec_input_clauses): Likewise. (lower_lastprivate_clauses): Handle SIMT privatization. * omp-offload.c: Include langhooks.h, tree-nested.h, stor-layout.h. (ompdevlow_adjust_simt_enter): New. (find_simtpriv_var_op): New. (execute_omp_device_lower): Handle IFN_GOMP_SIMT_ENTER, IFN_GOMP_SIMT_ENTER_ALLOC, IFN_GOMP_SIMT_EXIT. * tree-inline.h (struct copy_body_data): New field dst_simt_vars. * tree-inline.c (expand_call_inline): Handle SIMT privatization. (copy_decl_for_dup_finish): Ditto. * tree-ssa.c (execute_update_addresses_taken): Handle GOMP_SIMT_ENTER. From-SVN: r246550
2017-03-08re PR target/79904 (ICE in annotate_constant_pool_refs, at ↵Jakub Jelinek1-6/+14
config/s390/s390.c:7909) PR sanitizer/79904 * internal-fn.c (expand_vector_ubsan_overflow): If arg0 or arg1 is a uniform vector, use uniform_vector_p return value instead of building ARRAY_REF on folded VIEW_CONVERT_EXPR to array type. * gcc.dg/ubsan/pr79904.c: New test. From-SVN: r245967
2017-02-23re PR middle-end/79665 (gcc's signed (x*x)/200 is slower than clang's)Jakub Jelinek1-80/+0
PR middle-end/79665 * internal-fn.c (get_range_pos_neg): Moved to ... * tree.c (get_range_pos_neg): ... here. No longer static. * tree.h (get_range_pos_neg): New prototype. * expr.c (expand_expr_real_2) <case TRUNC_DIV_EXPR>: If both arguments are known to be in between 0 and signed maximum inclusive, try to expand both unsigned and signed divmod and use the cheaper one from those. From-SVN: r245676
2017-02-11re PR middle-end/79454 (c-c++-common/ubsan/overflow-vec-*.c FAILs on some ↵Jakub Jelinek1-1/+1
64-bit BE targets) PR middle-end/79454 * internal-fn.c (expand_vector_ubsan_overflow): Use piece-wise result computation whenever lhs doesn't have vector mode, not just when it has BLKmode. From-SVN: r245354
2017-02-09gimplify.c (gimplify_scan_omp_clauses): No special handling for OMP_CLAUSE_TILE.Chung-Lin Tang1-0/+8
2017-02-09 Nathan Sidwell <nathan@codesourcery.com> Cesar Philippidis <cesar@codesourcery.com> Joseph Myers <joseph@codesourcery.com> Chung-Lin Tang <cltang@codesourcery.com> gcc/ * gimplify.c (gimplify_scan_omp_clauses): No special handling for OMP_CLAUSE_TILE. (gimplify_adjust_omp_clauses): Don't delete TILE. (gimplify_omp_for): Deal with TILE. * internal-fn.c (expand_GOACC_TILE): New function. * internal-fn.def (GOACC_DIM_POS): Comment may be overly conservative. (GOACC_TILE): New. * omp-expand.c (struct oacc_collapse): Add tile and outer fields. (expand_oacc_collapse_init): Add LOC paramter. Initialize tile element fields. (expand_oacc_collapse_vars): Add INNER parm, adjust for tiling, avoid DIV for outermost collapse var. (expand_oacc_for): Insert tile element loop as needed. Adjust. Remove out of date comments, fix whitespace. * omp-general.c (omp_extract_for_data): Deal with tiling. * omp-general.h (enum oacc_loop_flags): Add OLF_TILE flag, adjust OLF_DIM_BASE value. (struct omp_for_data): Add tiling field. * omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_TILE. (lower_oacc_head_mark): Add OLF_TILE as appropriate. Ensure 2 levels for auto loops. Remove default auto determining, moved to oacc_loop_fixed_partitions. * omp-offload.c (struct oacc_loop): Change 'ifns' to vector of call stmts, add e_mask field. (oacc_dim_call): New function, abstracted out from oacc_thread_numbers. (oacc_thread_numbers): Use oacc_dim_call. (oacc_xform_tile): New. (new_oacc_loop_raw): Initialize e_mask, adjust for ifns vector. (finish_oacc_loop): Adjust for ifns vector. (oacc_loop_discover_walk): Append loop abstraction sites to list, add case for GOACC_TILE fns. (oacc_loop_xform_loop): Delete. (oacc_loop_process): Iterate over call list directly, and add handling for GOACC_TILE fns. (oacc_loop_fixed_partitions): Determine default auto, deal with TILE, dump partitioning. (oacc_loop_auto_partitions): Add outer_assign parm. Assign all but vector partitioning to outer loops. Assign 2 partitions to loops when available. Add TILE handling. (oacc_loop_partition): Adjust oacc_loop_auto_partitions call. (execite_oacc_device_lower): Process GOACC_TILE fns, ignore unknown specs. * tree-nested.c (convert_nonlocal_omp_clauses): Allow OMP_CLAUSE_TILE. * tree.c (omp_clause_num_ops): Adjust TILE ops. * tree.h (OMP_CLAUSE_TILE_ITERVAR, OMP_CLAUSE_TILE_COUNT): New. gcc/c/ * c-parser.c (c_parser_omp_clause_collapse): Disallow tile. (c_parser_oacc_clause_tile): Disallow collapse. Fix parsing and semantic checking. * c-parser.c (c_parser_omp_for_loop): Accept tiling constructs. gcc/cp/ * parser.c (cp_parser_oacc_clause_tile): Disallow collapse. Fix parsing. Parse constant expression. Remove semantic checking. (cp_parser_omp_clause_collapse): Disallow tile. (cp_parser_omp_for_loop): Deal with tile clause. Don't emit a parse error about missing for after already emitting one. Use more conventional for idiom for unbounded loop. * pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE. * semantics.c (finish_omp_clauses): Correct TILE semantic check. (finish_omp_for): Deal with tile clause. gcc/fortran/ * openmp.c (resolve_omp_clauses): Error on directives containing both tile and collapse clauses. (resolve_oacc_loop_blocks): Represent '*' tile arguments as zero. * trans-openmp.c (gfc_trans_omp_do): Lower tiled loops like collapsed loops. gcc/testsuite/ * c-c++-common/goacc/combined-directives.c: Remove xfail. * c-c++-common/goacc/loop-auto-1.c: Adjust and add additional case. * c-c++-common/goacc/loop-auto-2.c: New. * c-c++-common/goacc/tile.c: Include stdbool, fix expected errors. * c-c++-common/goacc/tile-2.c: New. * g++.dg/goacc/template.C: Test tile subst. Adjust erroneous uses. * g++.dg/goacc/tile-1.C: New, check tile subst. * gcc.dg/goacc/loop-processing-1.c: Adjust dg-final pattern. * gfortran.dg/goacc/combined-directives.f90: Remove xfail. * gfortran.dg/goacc/tile-1.f90: New test. * gfortran.dg/goacc/tile-2.f90: New test. * gfortran.dg/goacc/tile-lowering.f95: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/tile-1.c: New. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust and add additional case. * testsuite/libgomp.oacc-c-c++-common/vprop.c: XFAIL under "openacc_nvidia_accel_selected". * libgomp.oacc-fortran/nested-function-1.f90 (test2): Add num_workers(8) clause. From-SVN: r245300
2017-01-23use-after-scope: handle writes to a poisoned variableMartin Liska1-0/+8
2017-01-23 Martin Liska <mliska@suse.cz> * gcc.dg/asan/use-after-scope-10.c: New test. * gcc.dg/asan/use-after-scope-11.c: New test. * g++.dg/asan/use-after-scope-5.C: New test. 2017-01-23 Jakub Jelinek <jakub@redhat.com> Martin Liska <mliska@suse.cz> * asan.h: Define ASAN_USE_AFTER_SCOPE_ATTRIBUTE. * asan.c (asan_expand_poison_ifn): Support stores and use appropriate ASAN report function. * internal-fn.c (expand_ASAN_POISON_USE): New function. * internal-fn.def (ASAN_POISON_USE): Declare. * tree-into-ssa.c (maybe_add_asan_poison_write): New function. (maybe_register_def): Create ASAN_POISON_USE when sanitizing. * tree-ssa-dce.c (eliminate_unnecessary_stmts): Remove ASAN_POISON calls w/o LHS. * tree-ssa.c (execute_update_addresses_taken): Create clobber for ASAN_MARK (UNPOISON, &x, ...) in order to prevent usage of a LHS from ASAN_MARK (POISON, &x, ...) coming to a PHI node. * gimplify.c (asan_poison_variables): Add attribute use_after_scope_memory to variables that really needs to live in memory. * tree-ssa.c (is_asan_mark_p): Do not rewrite into SSA when having the attribute. From-SVN: r244793
2017-01-23Speed up use-after-scope (v2): rewrite into SSAMartin Liska1-0/+7
2017-01-23 Martin Liska <mliska@suse.cz> * asan.c (create_asan_shadow_var): New function. (asan_expand_poison_ifn): Likewise. * asan.h (asan_expand_poison_ifn): New declaration. * internal-fn.c (expand_ASAN_POISON): Likewise. * internal-fn.def (ASAN_POISON): New builtin. * sanopt.c (pass_sanopt::execute): Expand asan_expand_poison_ifn. * tree-inline.c (copy_decl_for_dup_finish): Make function external. * tree-inline.h (copy_decl_for_dup_finish): Likewise. * tree-ssa.c (is_asan_mark_p): New function. (execute_update_addresses_taken): Rewrite local variables (identified just by use-after-scope as addressable) into SSA. 2017-01-23 Martin Liska <mliska@suse.cz> * gcc.dg/asan/use-after-scope-3.c: Add additional flags. * gcc.dg/asan/use-after-scope-9.c: Likewise and grep for sanopt optimization for ASAN_POISON. From-SVN: r244791