Age | Commit message (Collapse) | Author | Files | Lines |
|
We couldn't vectorise:
for (int j = 0; j < n; ++j)
{
for (int i = 0; i < 16; ++i)
a[i] = (b[i] + c[i]) >> 1;
a += step;
b += step;
c += step;
}
at -O3 because cunrolli unrolled the inner loop and SLP couldn't handle
AVG_FLOOR patterns (see also PR86504). The problem was some overly
strict checking of pattern statements compared to normal statements
in vect_get_and_check_slp_defs:
switch (gimple_code (def_stmt))
{
case GIMPLE_PHI:
case GIMPLE_ASSIGN:
break;
default:
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
"unsupported defining stmt:\n");
return -1;
}
The easy fix would have been to add GIMPLE_CALL to the switch,
but I don't think the switch is doing anything useful. We only create
pattern statements that the rest of the vectoriser can handle, and the
other checks in this function and elsewhere check whether SLP is possible.
I'm also not sure why:
if (!first && !oprnd_info->first_pattern
/* Allow different pattern state for the defs of the
first stmt in reduction chains. */
&& (oprnd_info->first_dt != vect_reduction_def
is necessary. All that should matter is that the statements in the
node are "similar enough". It turned out to be quite hard to find a
convincing example that used a mixture of pattern and non-pattern
statements, so bb-slp-pow-1.c is the best I could come up with.
But it does show that the combination of "xi * xi" statements and
"pow (xj, 2) -> xj * xj" patterns are handled correctly.
The patch therefore just removes the whole if block.
The loop also needed commutative swapping to be extended to at least
AVG_FLOOR.
This gives +3.9% on 525.x264_r at -O3.
2018-08-03 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* internal-fn.h (first_commutative_argument): Declare.
* internal-fn.c (first_commutative_argument): New function.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Remove extra
restrictions for pattern statements. Use first_commutative_argument
to look for commutative operands in calls to internal functions.
gcc/testsuite/
* gcc.dg/vect/bb-slp-over-widen-1.c: Expect AVG_FLOOR to be used
on vect_avg_qi targets.
* gcc.dg/vect/bb-slp-over-widen-2.c: Likewise.
* gcc.dg/vect/bb-slp-pow-1.c: New test.
* gcc.dg/vect/vect-avg-15.c: Likewise.
From-SVN: r263290
|
|
This patch uses IFN_COND_* to vectorise conditionally-executed,
potentially-trapping arithmetic, such as most floating-point
ops with -ftrapping-math. E.g.:
if (cond) { ... x = a + b; ... }
becomes:
...
x = .COND_ADD (cond, a, b, else_value);
...
When this transformation is done on its own, the value of x for
!cond isn't important, so else_value is simply the target's
preferred_else_value (i.e. the value it can handle the most
efficiently).
However, the patch also looks for the equivalent of:
y = cond ? x : c;
in which the "then" value is the result of the conditionally-executed
operation and the "else" value "c" is some value that is available at x.
In that case we can instead use:
x = .COND_ADD (cond, a, b, c);
and replace uses of y with uses of x.
The patch also looks for:
y = !cond ? c : x;
which can be transformed in the same way. This involved adding a new
utility function inverse_conditions_p, which was already open-coded
in a more limited way in match.pd.
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* fold-const.h (inverse_conditions_p): Declare.
* fold-const.c (inverse_conditions_p): New function.
* match.pd: Use inverse_conditions_p. Add folds of view_converts
that test the inverse condition of a conditional internal function.
* internal-fn.h (vectorized_internal_fn_supported_p): Declare.
* internal-fn.c (internal_fn_mask_index): Handle conditional
internal functions.
(vectorized_internal_fn_supported_p): New function.
* tree-if-conv.c: Include internal-fn.h and fold-const.h.
(any_pred_load_store): Replace with...
(need_to_predicate): ...this new variable.
(redundant_ssa_names): New variable.
(ifcvt_can_use_mask_load_store): Move initial checks to...
(ifcvt_can_predicate): ...this new function. Handle tree codes
for which a conditional internal function exists.
(if_convertible_gimple_assign_stmt_p): Use ifcvt_can_predicate
instead of ifcvt_can_use_mask_load_store. Update after variable
name change.
(predicate_load_or_store): New function, split out from
predicate_mem_writes.
(check_redundant_cond_expr): New function.
(value_available_p): Likewise.
(predicate_rhs_code): Likewise.
(predicate_mem_writes): Rename to...
(predicate_statements): ...this. Use predicate_load_or_store
and predicate_rhs_code.
(combine_blocks, tree_if_conversion): Update after above name changes.
(ifcvt_local_dce): Handle redundant_ssa_names.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Handle
general conditional functions.
* tree-vect-stmts.c (vectorizable_call): Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-cond-arith-4.c: New test.
* gcc.dg/vect/vect-cond-arith-5.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_1.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_2.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_3.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_3_run.c: Likewise.
From-SVN: r262589
|
|
This patch adds support for fusing a conditional add or subtract
with a multiplication, so that we can use fused multiply-add and
multiply-subtract operations for fully-masked reductions. E.g.
for SVE we vectorise:
double res = 0.0;
for (int i = 0; i < n; ++i)
res += x[i] * y[i];
using a fully-masked loop in which the loop body has the form:
res_1 = PHI<0(preheader), res_2(latch)>;
avec = .MASK_LOAD (loop_mask, a)
bvec = .MASK_LOAD (loop_mask, b)
prod = avec * bvec;
res_2 = .COND_ADD (loop_mask, res_1, prod, res_1);
where the last statement does the equivalent of:
res_2 = loop_mask ? res_1 + prod : res_1;
(operating elementwise). The point of the patch is to convert the last
two statements into:
res_s = .COND_FMA (loop_mask, avec, bvec, res_1, res_1);
which is equivalent to:
res_2 = loop_mask ? fma (avec, bvec, res_1) : res_1;
(again operating elementwise).
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.h (can_interpret_as_conditional_op_p): Declare.
* internal-fn.c (can_interpret_as_conditional_op_p): New function.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional
plus and minus and convert them into IFN_COND_FMA-based sequences.
(convert_mult_to_fma): Handle conditional plus and minus.
gcc/testsuite/
* gcc.dg/vect/vect-fma-2.c: New test.
* gcc.target/aarch64/sve/reduc_4.c: Likewise.
* gcc.target/aarch64/sve/reduc_6.c: Likewise.
* gcc.target/aarch64/sve/reduc_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r262588
|
|
This patch adds conditional equivalents of the IFN_FMA built-in functions.
Most of it is just a mechanical extension of the binary stuff.
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/md.texi (cond_fma, cond_fms, cond_fnma, cond_fnms): Document.
* optabs.def (cond_fma_optab, cond_fms_optab, cond_fnma_optab)
(cond_fnms_optab): New optabs.
* internal-fn.def (COND_FMA, COND_FMS, COND_FNMA, COND_FNMS): New
internal functions.
(FMA): Use DEF_INTERNAL_FLT_FN rather than DEF_INTERNAL_FLT_FLOATN_FN.
* internal-fn.h (get_conditional_internal_fn): Declare.
(get_unconditional_internal_fn): Likewise.
* internal-fn.c (cond_ternary_direct): New macro.
(expand_cond_ternary_optab_fn): Likewise.
(direct_cond_ternary_optab_supported_p): Likewise.
(FOR_EACH_COND_FN_PAIR): Likewise.
(get_conditional_internal_fn): New function.
(get_unconditional_internal_fn): Likewise.
* gimple-match.h (gimple_match_op::MAX_NUM_OPS): Bump to 5.
(gimple_match_op::gimple_match_op): Add a new overload for 5
operands.
(gimple_match_op::set_op): Likewise.
(gimple_resimplify5): Declare.
* genmatch.c (decision_tree::gen): Generate simplifications for
5 operands.
* gimple-match-head.c (gimple_simplify): Define an overload for
5 operands. Handle calls with 5 arguments in the top-level overload.
(convert_conditional_op): Handle conversions from unconditional
internal functions to conditional ones.
(gimple_resimplify5): New function.
(build_call_internal): Pass a fifth operand.
(maybe_push_res_to_seq): Likewise.
(try_conditional_simplification): Try converting conditional
internal functions to unconditional internal functions.
Handle 3-operand unconditional forms.
* match.pd (UNCOND_TERNARY, COND_TERNARY): Operator lists.
Define ternary equivalents of the current rules for binary conditional
internal functions.
* config/aarch64/aarch64.c (aarch64_preferred_else_value): Handle
ternary operations.
* config/aarch64/iterators.md (UNSPEC_COND_FMLA, UNSPEC_COND_FMLS)
(UNSPEC_COND_FNMLA, UNSPEC_COND_FNMLS): New unspecs.
(optab): Handle them.
(SVE_COND_FP_TERNARY): New int iterator.
(sve_fmla_op, sve_fmad_op): New int attributes.
* config/aarch64/aarch64-sve.md (cond_<optab><mode>)
(*cond_<optab><mode>_2, *cond_<optab><mode_4)
(*cond_<optab><mode>_any): New SVE_COND_FP_TERNARY patterns.
gcc/testsuite/
* gcc.dg/vect/vect-cond-arith-3.c: New test.
* gcc.target/aarch64/sve/vcond_13.c: Likewise.
* gcc.target/aarch64/sve/vcond_13_run.c: Likewise.
* gcc.target/aarch64/sve/vcond_14.c: Likewise.
* gcc.target/aarch64/sve/vcond_14_run.c: Likewise.
* gcc.target/aarch64/sve/vcond_15.c: Likewise.
* gcc.target/aarch64/sve/vcond_15_run.c: Likewise.
* gcc.target/aarch64/sve/vcond_16.c: Likewise.
* gcc.target/aarch64/sve/vcond_16_run.c: Likewise.
From-SVN: r262587
|
|
This patch adds match.pd support for applying normal folds to their
IFN_COND_* forms. E.g. the rule:
(plus @0 (negate @1)) -> (minus @0 @1)
also allows the fold:
(IFN_COND_ADD @0 @1 (negate @2) @3) -> (IFN_COND_SUB @0 @1 @2 @3)
Actually doing this by direct matches in gimple-match.c would
probably lead to combinatorial explosion, so instead, the patch
makes gimple_match_op carry a condition under which the operation
happens ("cond"), and the value to use when the condition is false
("else_value"). Thus in the example above we'd do the following
(a) convert:
cond:NULL_TREE (IFN_COND_ADD @0 @1 @4 @3) else_value:NULL_TREE
to:
cond:@0 (plus @1 @4) else_value:@3
(b) apply gimple_resimplify to (plus @1 @4)
(c) reintroduce cond and else_value when constructing the result.
Nested operations inherit the condition of the outer operation
(so that we don't introduce extra faults) but have a null else_value.
If we try to build such an operation, the target gets to choose what
else_value it can handle efficiently: obvious choices include one of
the operands or a zero constant. (The alternative would be to have some
representation for an undefined value, but that seems a bit invasive,
and isn't likely to be useful here.)
I've made the condition a mandatory part of the gimple_match_op
constructor so that it doesn't accidentally get dropped.
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* target.def (preferred_else_value): New target hook.
* doc/tm.texi.in (TARGET_PREFERRED_ELSE_VALUE): New hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_preferred_else_value): Declare.
* targhooks.c (default_preferred_else_value): New function.
* internal-fn.h (conditional_internal_fn_code): Declare.
* internal-fn.c (FOR_EACH_CODE_MAPPING): New macro.
(get_conditional_internal_fn): Use it.
(conditional_internal_fn_code): New function.
* gimple-match.h (gimple_match_cond): New struct.
(gimple_match_op): Add a cond member function.
(gimple_match_op::gimple_match_op): Update all forms to take a
gimple_match_cond.
* genmatch.c (expr::gen_transform): Use the same condition as res_op
for the suboperation, but don't specify a particular else_value.
* tree-ssa-sccvn.c (vn_nary_simplify, vn_reference_lookup_3)
(visit_nary_op, visit_reference_op_load): Pass
gimple_match_cond::UNCOND to the gimple_match_op constructor.
* gimple-match-head.c: Include tree-eh.h
(convert_conditional_op): New function.
(maybe_resimplify_conditional_op): Likewise.
(gimple_resimplify1): Call maybe_resimplify_conditional_op.
(gimple_resimplify2): Likewise.
(gimple_resimplify3): Likewise.
(gimple_resimplify4): Likewise.
(maybe_push_res_to_seq): Return null for conditional operations.
(try_conditional_simplification): New function.
(gimple_simplify): Call it. Pass conditions to the gimple_match_op
constructor.
* match.pd: Fold VEC_COND_EXPRs of an IFN_COND_* call to a new
IFN_COND_* call.
* config/aarch64/aarch64.c (aarch64_preferred_else_value): New
function.
(TARGET_PREFERRED_ELSE_VALUE): Redefine.
gcc/testsuite/
* gcc.dg/vect/vect-cond-arith-2.c: New test.
* gcc.target/aarch64/sve/loop_add_6.c: Likewise.
From-SVN: r262586
|
|
This patch adds support for conditional multiplication and division.
It's mostly mechanical, but a few notes:
* The *_optab name and the .md names are the same as the unconditional
forms, just with "cond_" added to the front. This means we still
have the awkward difference between sdiv and div, etc.
* It was easier to retain the difference between integer and FP
division in the function names, given that they map to different
tree codes (TRUNC_DIV_EXPR and RDIV_EXPR).
* SVE has no direct support for IFN_COND_MOD, but it seemed more
consistent to add it anyway.
* Adding IFN_COND_MUL enables an extra fully-masked reduction
in gcc.dg/vect/pr53773.c.
* In practice we don't actually use the integer division forms without
if-conversion support (added by a later patch).
2018-05-25 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/sourcebuild.texi (vect_double_cond_arith): Include
multiplication and division.
* doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m})
(cond_udiv@var{m}, cond_umod@var{m}): Document.
* optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab)
(cond_udiv_optab, cond_umod_optab): New optabs.
* internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD)
(IFN_COND_RDIV): New internal functions.
* internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR,
TRUNC_MOD_EXPR and RDIV_EXPR.
* match.pd (UNCOND_BINARY, COND_BINARY): Handle them.
* config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV):
New unspecs.
(SVE_INT_BINARY): Include mult.
(SVE_COND_FP_BINARY): Include UNSPEC_MUL and UNSPEC_DIV.
(optab, sve_int_op): Handle mult.
(optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and
UNSPEC_COND_DIV.
* config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern
for SVE_INT_BINARY_SD.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_double_cond_arith): Include
multiplication and division.
* gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using
fully-masked loops with a fixed vector length.
* gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division
tests.
* gcc.target/aarch64/sve/vcond_8.c: Likewise.
* gcc.target/aarch64/sve/vcond_9.c: Likewise.
* gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests.
From-SVN: r260713
|
|
As suggested by Richard B, this patch changes the IFN_COND_*
functions so that they take the else value of the ?: operation
as a final argument, rather than always using argument 1.
All current callers will still use the equivalent of argument 1,
so this patch makes the SVE code assert that for now. Later patches
add the general case.
2018-05-25 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/md.texi: Update the documentation of the cond_* optabs
to mention the new final operand. Fix GET_MODE_NUNITS call.
Describe the scalar case too.
* internal-fn.def (IFN_EXTRACT_LAST): Change type to fold_left.
* internal-fn.c (expand_cond_unary_optab_fn): Expect 3 operands
instead of 2.
(expand_cond_binary_optab_fn): Expect 4 operands instead of 3.
(get_conditional_internal_fn): Update comment.
* tree-vect-loop.c (vectorizable_reduction): Pass the original
accumulator value as a final argument to conditional functions.
* config/aarch64/aarch64-sve.md (cond_<optab><mode>): Turn into
a define_expand and add an "else" operand. Assert for now that
the else operand is equal to operand 2. Use SVE_INT_BINARY and
SVE_COND_FP_BINARY instead of SVE_COND_INT_OP and SVE_COND_FP_OP.
(*cond_<optab><mode>): New patterns.
* config/aarch64/iterators.md (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX)
(UNSPEC_COND_SMIN, UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
(UNSPEC_COND_EOR): Delete.
(optab): Remove associated mappings.
(SVE_INT_BINARY): New code iterator.
(sve_int_op): Remove int attribute and add "minus" to the code
attribute.
(SVE_COND_INT_OP): Delete.
(SVE_COND_FP_OP): Rename to...
(SVE_COND_FP_BINARY): ...this.
From-SVN: r260707
|
|
This PR showed that the normal function for expanding directly-mapped
internal functions didn't handle the case in which the call was only
being kept for its side-effects.
2018-05-22 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR middle-end/85862
* internal-fn.c (expand_direct_optab_fn): Cope with a null lhs.
gcc/testsuite/
PR middle-end/85862
* gcc.dg/torture/pr85862.c: New test.
From-SVN: r260504
|
|
There are four optabs for various forms of fused multiply-add:
fma, fms, fnma and fnms. Of these, only fma had a direct gimple
representation. For the other three we relied on special pattern-
matching during expand, although tree-ssa-math-opts.c did have
some code to try to second-guess what expand would do.
This patch removes the old FMA_EXPR representation of fma and
introduces four new internal functions, one for each optab.
IFN_FMA is tied to BUILT_IN_FMA* while the other three are
independent directly-mapped internal functions. It's then
possible to do the pattern-matching in match.pd and
tree-ssa-math-opts.c (via folding) can select the exact
FMA-based operation.
The BRIG & HSA parts are a best guess, but seem relatively simple.
2018-05-18 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/sourcebuild.texi (scalar_all_fma): Document.
* tree.def (FMA_EXPR): Delete.
* internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
* internal-fn.c (ternary_direct): New macro.
(expand_ternary_optab_fn): Likewise.
(direct_ternary_optab_supported_p): Likewise.
* Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
* builtins.c (fold_builtin_fma): Delete.
(fold_builtin_3): Don't call it.
* cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
* expr.c (expand_expr_real_2): Likewise.
* fold-const.c (operand_equal_p): Likewise.
(fold_ternary_loc): Likewise.
* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
* gimple.c (DEFTREECODE): Likewise.
* gimplify.c (gimplify_expr): Likewise.
* optabs-tree.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-eh.c (operation_could_trap_p): Likewise.
(stmt_could_throw_1_p): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
(op_code_prio): Likewise.
* tree-ssa-loop-im.c (stmt_cost): Likewise.
* tree-ssa-operands.c (get_expr_operands): Likewise.
* tree.c (commutative_ternary_tree_code, add_expr): Likewise.
* fold-const-call.h (fold_fma): Delete.
* fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS,
CFN_FNMA and CFN_FNMS.
(fold_fma): Delete.
* genmatch.c (combined_fn): New enum.
(commutative_ternary_tree_code): Remove FMA_EXPR handling.
(commutative_op): New function.
(commutate): Use it. Handle more than 2 operands.
(dt_operand::gen_gimple_expr): Use commutative_op.
(parser::parse_expr): Allow :c to be used with non-binary
operators if the commutative operand is known.
* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle
CFN_FMS, CFN_FNMA and CFN_FNMS.
(backprop::process_assign_use): Remove FMA_EXPR handling.
* hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise.
(gen_hsa_fma): New function.
(gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS,
IFN_FNMA and IFN_FNMS.
* match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS.
* gimple-fold.h (follow_all_ssa_edges): Declare.
* gimple-fold.c (follow_all_ssa_edges): New function.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the
gimple_build interface and use follow_all_ssa_edges to fold the result.
(convert_mult_to_fma): Use direct_internal_fn_suppoerted_p
instead of checking for optabs directly.
* config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls
rather than FMA_EXPRs.
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a
call to IFN_FMA instead of an FMA_EXPR.
gcc/brig/
* brigfrontend/brig-function.cc
(brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA
for BRIG_OPCODE_FMA.
(brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA
as a call.
gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Remove
__FMA_EXPR handlng.
gcc/cp/
* constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR handling.
(potential_constant_expression_1): Likewise.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_scalar_all_fma):
New proc.
* gcc.dg/fma-1.c: New test.
* gcc.dg/fma-2.c: Likewise.
* gcc.dg/fma-3.c: Likewise.
* gcc.dg/fma-4.c: Likewise.
* gcc.dg/fma-5.c: Likewise.
* gcc.dg/fma-6.c: Likewise.
* gcc.dg/fma-7.c: Likewise.
* gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require
scalar_all_fma.
* gfortran.dg/reassoc_7.f: Pass -ffp-contract=off.
* gfortran.dg/reassoc_8.f: Likewise.
* gfortran.dg/reassoc_9.f: Likewise.
* gfortran.dg/reassoc_10.f: Likewise.
From-SVN: r260348
|
|
This patch gets the gimple FE to parse calls to internal functions.
The only non-obvious thing was how the functions should be written
to avoid clashes with real function names. One option would be to
go the magic number of underscores route, but we already do that for
built-in functions, and it would be good to keep them visually
distinct. In the end I borrowed the local/internal label convention
from asm and used:
x = .SQRT (y);
2018-05-17 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* internal-fn.h (lookup_internal_fn): Declare
* internal-fn.c (lookup_internal_fn): New function.
* gimple.c (gimple_build_call_from_tree): Handle calls to
internal functions.
* gimple-pretty-print.c (dump_gimple_call): Print "." before
internal function names.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-ssa-scopedtables.c (expr_hash_elt::print): Likewise.
gcc/c/
* gimple-parser.c: Include internal-fn.h.
(c_parser_gimple_statement): Treat a leading CPP_DOT as a call.
(c_parser_gimple_call_internal): New function.
(c_parser_gimple_postfix_expression): Use it to handle CPP_DOT.
Fix typos in comment.
gcc/testsuite/
* gcc.dg/gimplefe-28.c: New test.
* gcc.dg/asan/use-after-scope-9.c: Adjust expected output for
internal function calls.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
From-SVN: r260316
|
|
expand_call_mem_ref checks for TARGET_MEM_REFs that have compatible
type, but it didn't then go on to install the specific type we need,
which might have different alignment due to:
if (TYPE_ALIGN (type) != align)
type = build_aligned_type (type, align);
This was causing masked stores to be incorrectly marked as
aligned on AVX512.
2018-02-20 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/84419
* internal-fn.c (expand_call_mem_ref): Create a TARGET_MEM_REF
with the required type if its current type is compatible but
different.
gcc/testsuite/
PR tree-optimization/84419
* gcc.dg/vect/pr84419.c: New test.
From-SVN: r257847
|
|
This is mostly a mechanical extension of the previous gather load
support to scatter stores. The internal functions in this case are:
IFN_SCATTER_STORE (base, offsets, scale, values)
IFN_MASK_SCATTER_STORE (base, offsets, scale, values, mask)
However, one nonobvious change is to vect_analyze_data_ref_access.
If we're treating an access as a gather load or scatter store
(i.e. if STMT_VINFO_GATHER_SCATTER_P is true), the existing code
would create a dummy data_reference whose step is 0. There's not
really much else it could do, since the whole point is that the
step isn't predictable from iteration to iteration. We then
went into this code in vect_analyze_data_ref_access:
/* Allow loads with zero step in inner-loop vectorization. */
if (loop_vinfo && integer_zerop (step))
{
GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL;
if (!nested_in_vect_loop_p (loop, stmt))
return DR_IS_READ (dr);
I.e. we'd take the step literally and assume that this is a load
or store to an invariant address. Loads from invariant addresses
are supported but stores to them aren't.
The code therefore had the effect of disabling all scatter stores.
AFAICT this is true of AVX too: although tests like avx512f-scatter-1.c
test for the correctness of a scatter-like loop, they don't seem to
check whether a scatter instruction is actually used.
The patch therefore makes vect_analyze_data_ref_access return true
for scatters. We do seem to handle the aliasing correctly;
that's tested by other functions, and is symmetrical to the
already-working gather case.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_scatter_store): Document.
* optabs.def (scatter_store_optab, mask_scatter_store_optab): New
optabs.
* doc/md.texi (scatter_store@var{m}, mask_scatter_store@var{m}):
Document.
* genopinit.c (main): Add supports_vec_scatter_store and
supports_vec_scatter_store_cached to target_optabs.
* gimple.h (gimple_expr_type): Handle IFN_SCATTER_STORE and
IFN_MASK_SCATTER_STORE.
* internal-fn.def (SCATTER_STORE, MASK_SCATTER_STORE): New internal
functions.
* internal-fn.h (internal_store_fn_p): Declare.
(internal_fn_stored_value_index): Likewise.
* internal-fn.c (scatter_store_direct): New macro.
(expand_scatter_store_optab_fn): New function.
(direct_scatter_store_optab_supported_p): New macro.
(internal_store_fn_p): New function.
(internal_gather_scatter_fn_p): Handle IFN_SCATTER_STORE and
IFN_MASK_SCATTER_STORE.
(internal_fn_mask_index): Likewise.
(internal_fn_stored_value_index): New function.
(internal_gather_scatter_fn_supported_p): Adjust operand numbers
for scatter stores.
* optabs-query.h (supports_vec_scatter_store_p): Declare.
* optabs-query.c (supports_vec_scatter_store_p): New function.
* tree-vectorizer.h (vect_get_store_rhs): Declare.
* tree-vect-data-refs.c (vect_analyze_data_ref_access): Return
true for scatter stores.
(vect_gather_scatter_fn_p): Handle scatter stores too.
(vect_check_gather_scatter): Consider using scatter stores if
supports_vec_scatter_store_p.
* tree-vect-patterns.c (vect_try_gather_scatter_pattern): Handle
scatter stores too.
* tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
internal_fn_stored_value_index.
(check_load_store_masking): Handle scatter stores too.
(vect_get_store_rhs): Make public.
(vectorizable_call): Use internal_store_fn_p.
(vectorizable_store): Handle scatter store internal functions.
(vect_transform_stmt): Compare GROUP_STORE_COUNT with GROUP_SIZE
when deciding whether the end of the group has been reached.
* config/aarch64/aarch64.md (UNSPEC_ST1_SCATTER): New unspec.
* config/aarch64/aarch64-sve.md (scatter_store<mode>): New expander.
(mask_scatter_store<mode>): New insns.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_scatter_store):
New proc.
* gcc.dg/vect/pr25413a.c: Expect both loops to be optimized on
targets with scatter stores.
* gcc.dg/vect/vect-71.c: Restrict XFAIL to targets without scatter
stores.
* gcc.target/aarch64/sve/mask_scatter_store_1.c: New test.
* gcc.target/aarch64/sve/mask_scatter_store_2.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_1.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_2.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_3.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_4.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_5.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_6.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_7.c: Likewise.
* gcc.target/aarch64/sve/strided_store_1.c: Likewise.
* gcc.target/aarch64/sve/strided_store_2.c: Likewise.
* gcc.target/aarch64/sve/strided_store_3.c: Likewise.
* gcc.target/aarch64/sve/strided_store_4.c: Likewise.
* gcc.target/aarch64/sve/strided_store_5.c: Likewise.
* gcc.target/aarch64/sve/strided_store_6.c: Likewise.
* gcc.target/aarch64/sve/strided_store_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256643
|
|
This patch adds support for SVE gather loads. It uses the basically
the same analysis code as the AVX gather support, but after that
there are two major differences:
- It uses new internal functions rather than target built-ins.
The interface is:
IFN_GATHER_LOAD (base, offsets scale)
IFN_MASK_GATHER_LOAD (base, offsets scale, mask)
which should be reasonably generic. One of the advantages of
using internal functions is that other passes can understand what
the functions do, but a more immediate advantage is that we can
query the underlying target pattern to see which scales it supports.
- It uses pattern recognition to convert the offset to the right width,
if it was originally narrower than that. This avoids having to do
a widening operation as part of the gather expansion itself.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (gather_load@var{m}): Document.
(mask_gather_load@var{m}): Likewise.
* genopinit.c (main): Add supports_vec_gather_load and
supports_vec_gather_load_cached to target_optabs.
* optabs-tree.c (init_tree_optimization_optabs): Use
ggc_cleared_alloc to allocate target_optabs.
* optabs.def (gather_load_optab, mask_gather_laod_optab): New optabs.
* internal-fn.def (GATHER_LOAD, MASK_GATHER_LOAD): New internal
functions.
* internal-fn.h (internal_load_fn_p): Declare.
(internal_gather_scatter_fn_p): Likewise.
(internal_fn_mask_index): Likewise.
(internal_gather_scatter_fn_supported_p): Likewise.
* internal-fn.c (gather_load_direct): New macro.
(expand_gather_load_optab_fn): New function.
(direct_gather_load_optab_supported_p): New macro.
(direct_internal_fn_optab): New function.
(internal_load_fn_p): Likewise.
(internal_gather_scatter_fn_p): Likewise.
(internal_fn_mask_index): Likewise.
(internal_gather_scatter_fn_supported_p): Likewise.
* optabs-query.c (supports_at_least_one_mode_p): New function.
(supports_vec_gather_load_p): Likewise.
* optabs-query.h (supports_vec_gather_load_p): Declare.
* tree-vectorizer.h (gather_scatter_info): Add ifn, element_type
and memory_type field.
(NUM_PATTERNS): Bump to 15.
* tree-vect-data-refs.c: Include internal-fn.h.
(vect_gather_scatter_fn_p): New function.
(vect_describe_gather_scatter_call): Likewise.
(vect_check_gather_scatter): Try using internal functions for
gather loads. Recognize existing calls to a gather load function.
(vect_analyze_data_refs): Consider using gather loads if
supports_vec_gather_load_p.
* tree-vect-patterns.c (vect_get_load_store_mask): New function.
(vect_get_gather_scatter_offset_type): Likewise.
(vect_convert_mask_for_vectype): Likewise.
(vect_add_conversion_to_patterm): Likewise.
(vect_try_gather_scatter_pattern): Likewise.
(vect_recog_gather_scatter_pattern): New pattern recognizer.
(vect_vect_recog_func_ptrs): Add it.
* tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
internal_fn_mask_index and internal_gather_scatter_fn_p.
(check_load_store_masking): Take the gather_scatter_info as an
argument and handle gather loads.
(vect_get_gather_scatter_ops): New function.
(vectorizable_call): Check internal_load_fn_p.
(vectorizable_load): Likewise. Handle gather load internal
functions.
(vectorizable_store): Update call to check_load_store_masking.
* config/aarch64/aarch64.md (UNSPEC_LD1_GATHER): New unspec.
* config/aarch64/iterators.md (SVE_S, SVE_D): New mode iterators.
* config/aarch64/predicates.md (aarch64_gather_scale_operand_w)
(aarch64_gather_scale_operand_d): New predicates.
* config/aarch64/aarch64-sve.md (gather_load<mode>): New expander.
(mask_gather_load<mode>): New insns.
gcc/testsuite/
* gcc.target/aarch64/sve/gather_load_1.c: New test.
* gcc.target/aarch64/sve/gather_load_2.c: Likewise.
* gcc.target/aarch64/sve/gather_load_3.c: Likewise.
* gcc.target/aarch64/sve/gather_load_4.c: Likewise.
* gcc.target/aarch64/sve/gather_load_5.c: Likewise.
* gcc.target/aarch64/sve/gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/gather_load_7.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_1.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_2.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_4.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_5.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256640
|
|
This patch adds support for in-order floating-point addition reductions,
which are suitable even in strict IEEE mode.
Previously vect_is_simple_reduction would reject any cases that forbid
reassociation. The idea is instead to tentatively accept them as
"FOLD_LEFT_REDUCTIONs" and only fail later if there is no support
for them. Although this patch only handles the particular case of plus
and minus on floating-point types, there's no reason in principle why
we couldn't handle other cases.
The reductions use a new fold_left_plus_optab if available, otherwise
they fall back to elementwise additions or subtractions.
The vect_force_simple_reduction change makes it easier for parloops
to read the type of reduction.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.def (fold_left_plus_optab): New optab.
* doc/md.texi (fold_left_plus_@var{m}): Document.
* internal-fn.def (IFN_FOLD_LEFT_PLUS): New internal function.
* internal-fn.c (fold_left_direct): Define.
(expand_fold_left_optab_fn): Likewise.
(direct_fold_left_optab_supported_p): Likewise.
* fold-const-call.c (fold_const_fold_left): New function.
(fold_const_call): Use it to fold CFN_FOLD_LEFT_PLUS.
* tree-parloops.c (valid_reduction_p): New function.
(gather_scalar_reductions): Use it.
* tree-vectorizer.h (FOLD_LEFT_REDUCTION): New vect_reduction_type.
(vect_finish_replace_stmt): Declare.
* tree-vect-loop.c (fold_left_reduction_fn): New function.
(needs_fold_left_reduction_p): New function, split out from...
(vect_is_simple_reduction): ...here. Accept reductions that
forbid reassociation, but give them type FOLD_LEFT_REDUCTION.
(vect_force_simple_reduction): Also store the reduction type in
the assignment's STMT_VINFO_REDUC_TYPE.
(vect_model_reduction_cost): Handle FOLD_LEFT_REDUCTION.
(merge_with_identity): New function.
(vect_expand_fold_left): Likewise.
(vectorize_fold_left_reduction): Likewise.
(vectorizable_reduction): Handle FOLD_LEFT_REDUCTION. Leave the
scalar phi in place for it. Check for target support and reject
cases that would reassociate the operation. Defer the transform
phase to vectorize_fold_left_reduction.
* config/aarch64/aarch64.md (UNSPEC_FADDA): New unspec.
* config/aarch64/aarch64-sve.md (fold_left_plus_<mode>): New expander.
(*fold_left_plus_<mode>, *pred_fold_left_plus_<mode>): New insns.
gcc/testsuite/
* gcc.dg/vect/no-fast-math-vect16.c: Expect the test to pass and
check for a message about using in-order reductions.
* gcc.dg/vect/pr79920.c: Expect both loops to be vectorized and
check for a message about using in-order reductions.
* gcc.dg/vect/trapv-vect-reduc-4.c: Expect all three loops to be
vectorized and check for a message about using in-order reductions.
Expect targets with variable-length vectors to fall back to the
fixed-length mininum.
* gcc.dg/vect/vect-reduc-6.c: Expect the loop to be vectorized and
check for a message about using in-order reductions.
* gcc.dg/vect/vect-reduc-in-order-1.c: New test.
* gcc.dg/vect/vect-reduc-in-order-2.c: Likewise.
* gcc.dg/vect/vect-reduc-in-order-3.c: Likewise.
* gcc.dg/vect/vect-reduc-in-order-4.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_1.c: New test.
* gcc.target/aarch64/sve/reduc_strict_1_run.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_2.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_2_run.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_3.c: Likewise.
* gcc.target/aarch64/sve/slp_13.c: Add floating-point types.
* gfortran.dg/vect/vect-8.f90: Expect 22 loops to be vectorized if
vect_fold_left_plus.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256639
|
|
This patch uses SVE CLASTB to optimise conditional reductions. It means
that we no longer need to maintain a separate index vector to record
the most recent valid value, and no longer need to worry about overflow
cases.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (fold_extract_last_@var{m}): Document.
* doc/sourcebuild.texi (vect_fold_extract_last): Likewise.
* optabs.def (fold_extract_last_optab): New optab.
* internal-fn.def (FOLD_EXTRACT_LAST): New internal function.
* internal-fn.c (fold_extract_direct): New macro.
(expand_fold_extract_optab_fn): Likewise.
(direct_fold_extract_optab_supported_p): Likewise.
* tree-vectorizer.h (EXTRACT_LAST_REDUCTION): New vect_reduction_type.
* tree-vect-loop.c (vect_model_reduction_cost): Handle
EXTRACT_LAST_REDUCTION.
(get_initial_def_for_reduction): Do not create an initial vector
for EXTRACT_LAST_REDUCTION reductions.
(vectorizable_reduction): Leave the scalar phi in place for
EXTRACT_LAST_REDUCTIONs. Try using EXTRACT_LAST_REDUCTION
ahead of INTEGER_INDUC_COND_REDUCTION. Do not check for an
epilogue code for EXTRACT_LAST_REDUCTION and defer the
transform phase to vectorizable_condition.
* tree-vect-stmts.c (vect_finish_stmt_generation_1): New function,
split out from...
(vect_finish_stmt_generation): ...here.
(vect_finish_replace_stmt): New function.
(vectorizable_condition): Handle EXTRACT_LAST_REDUCTION.
* config/aarch64/aarch64-sve.md (fold_extract_last_<mode>): New
pattern.
* config/aarch64/aarch64.md (UNSPEC_CLASTB): New unspec.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_fold_extract_last): New proc.
* gcc.dg/vect/pr65947-1.c: Update dump messages. Add markup
for fold_extract_last.
* gcc.dg/vect/pr65947-2.c: Likewise.
* gcc.dg/vect/pr65947-3.c: Likewise.
* gcc.dg/vect/pr65947-4.c: Likewise.
* gcc.dg/vect/pr65947-5.c: Likewise.
* gcc.dg/vect/pr65947-6.c: Likewise.
* gcc.dg/vect/pr65947-9.c: Likewise.
* gcc.dg/vect/pr65947-10.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-14.c: Likewise.
* gcc.dg/vect/pr80631-1.c: Likewise.
* gcc.target/aarch64/sve/clastb_1.c: New test.
* gcc.target/aarch64/sve/clastb_1_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_2.c: Likewise.
* gcc.target/aarch64/sve/clastb_2_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_3.c: Likewise.
* gcc.target/aarch64/sve/clastb_3_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_4.c: Likewise.
* gcc.target/aarch64/sve/clastb_4_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_5.c: Likewise.
* gcc.target/aarch64/sve/clastb_5_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_6.c: Likewise.
* gcc.target/aarch64/sve/clastb_6_run.c: Likewise.
* gcc.target/aarch64/sve/clastb_7.c: Likewise.
* gcc.target/aarch64/sve/clastb_7_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256633
|
|
This patch uses the SVE LASTB instruction to optimise cases in which
a value produced by the final scalar iteration of a vectorised loop is
live outside the loop. Previously this situation would stop us from
using a fully-masked loop.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (extract_last_@var{m}): Document.
* optabs.def (extract_last_optab): New optab.
* internal-fn.def (EXTRACT_LAST): New internal function.
* internal-fn.c (cond_unary_direct): New macro.
(expand_cond_unary_optab_fn): Likewise.
(direct_cond_unary_optab_supported_p): Likewise.
* tree-vect-loop.c (vectorizable_live_operation): Allow fully-masked
loops using EXTRACT_LAST.
* config/aarch64/aarch64-sve.md (aarch64_sve_lastb<mode>): Rename to...
(extract_last_<mode>): ...this optab.
(vec_extract<mode><Vel>): Update accordingly.
gcc/testsuite/
* gcc.target/aarch64/sve/live_1.c: New test.
* gcc.target/aarch64/sve/live_1_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256632
|
|
This patch allows ADDR_EXPR <TARGET_MEM_REF ...>, which is useful
when calling internal functions that take pointers to memory that
is conditionally loaded or stored. This is a prerequisite to the
following ivopts patch.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
TARGET_MEM_REFs.
* gimple-expr.h (is_gimple_addressable: Likewise.
* gimple-expr.c (is_gimple_address): Likewise.
* internal-fn.c (expand_call_mem_ref): New function.
(expand_mask_load_optab_fn): Use it.
(expand_mask_store_optab_fn): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256627
|
|
This patch removes the restriction that fully-masked loops cannot
have reductions. The key thing here is to make sure that the
reduction accumulator doesn't include any values associated with
inactive lanes; the patch adds a bunch of conditional binary
operations for doing that.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (cond_add@var{mode}, cond_sub@var{mode})
(cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode})
(cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode})
(cond_umax@var{mode}): Document.
* optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab)
(cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab)
(cond_umin_optab, cond_umax_optab): New optabs.
* internal-fn.def (COND_ADD, COND_SUB, COND_MIN, COND_MAX, COND_AND)
(COND_IOR, COND_XOR): New internal functions.
* internal-fn.h (get_conditional_internal_fn): Declare.
* internal-fn.c (cond_binary_direct): New macro.
(expand_cond_binary_optab_fn): Likewise.
(direct_cond_binary_optab_supported_p): Likewise.
(get_conditional_internal_fn): New function.
* tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops.
Cope with reduction statements that are vectorized as calls rather
than assignments.
* config/aarch64/aarch64-sve.md (cond_<optab><mode>): New insns.
* config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB)
(UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN)
(UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
(UNSPEC_COND_EOR): New unspecs.
(optab): Add mappings for them.
(SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators.
(sve_int_op, sve_fp_op): New int attributes.
gcc/testsuite/
* gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
* gcc.target/aarch64/sve/reduc_1.c: Expect the loop operations
to be predicated.
* gcc.target/aarch64/sve/slp_5.c: Check for a fully-masked loop.
* gcc.target/aarch64/sve/slp_7.c: Likewise.
* gcc.target/aarch64/sve/reduc_5.c: New test.
* gcc.target/aarch64/sve/slp_13.c: Likewise.
* gcc.target/aarch64/sve/slp_13_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256626
|
|
This patch adds support for using a single fully-predicated loop instead
of a vector loop and a scalar tail. An SVE WHILELO instruction generates
the predicate for each iteration of the loop, given the current scalar
iv value and the loop bound. This operation is wrapped up in a new internal
function called WHILE_ULT. E.g.:
WHILE_ULT (0, 3, { 0, 0, 0, 0}) -> { 1, 1, 1, 0 }
WHILE_ULT (UINT_MAX - 1, UINT_MAX, { 0, 0, 0, 0 }) -> { 1, 0, 0, 0 }
The third WHILE_ULT argument is needed to make the operation
unambiguous: without it, WHILE_ULT (0, 3) for one vector type would
seem equivalent to WHILE_ULT (0, 3) for another, even if the types have
different numbers of elements.
Note that the patch uses "mask" and "fully-masked" instead of
"predicate" and "fully-predicated", to follow existing GCC terminology.
This patch just handles the simple cases, punting for things like
reductions and live-out values. Later patches remove most of these
restrictions.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.def (while_ult_optab): New optab.
* doc/md.texi (while_ult@var{m}@var{n}): Document.
* internal-fn.def (WHILE_ULT): New internal function.
* internal-fn.h (direct_internal_fn_supported_p): New override
that takes two types as argument.
* internal-fn.c (while_direct): New macro.
(expand_while_optab_fn): New function.
(convert_optab_supported_p): Likewise.
(direct_while_optab_supported_p): New macro.
* wide-int.h (wi::udiv_ceil): New function.
* tree-vectorizer.h (rgroup_masks): New structure.
(vec_loop_masks): New typedef.
(_loop_vec_info): Add masks, mask_compare_type, can_fully_mask_p
and fully_masked_p.
(LOOP_VINFO_CAN_FULLY_MASK_P, LOOP_VINFO_FULLY_MASKED_P)
(LOOP_VINFO_MASKS, LOOP_VINFO_MASK_COMPARE_TYPE): New macros.
(vect_max_vf): New function.
(slpeel_make_loop_iterate_ntimes): Delete.
(vect_set_loop_condition, vect_get_loop_mask_type, vect_gen_while)
(vect_halve_mask_nunits, vect_double_mask_nunits): Declare.
(vect_record_loop_mask, vect_get_loop_mask): Likewise.
* tree-vect-loop-manip.c: Include tree-ssa-loop-niter.h,
internal-fn.h, stor-layout.h and optabs-query.h.
(vect_set_loop_mask): New function.
(add_preheader_seq): Likewise.
(add_header_seq): Likewise.
(interleave_supported_p): Likewise.
(vect_maybe_permute_loop_masks): Likewise.
(vect_set_loop_masks_directly): Likewise.
(vect_set_loop_condition_masked): Likewise.
(vect_set_loop_condition_unmasked): New function, split out from
slpeel_make_loop_iterate_ntimes.
(slpeel_make_loop_iterate_ntimes): Rename to..
(vect_set_loop_condition): ...this. Use vect_set_loop_condition_masked
for fully-masked loops and vect_set_loop_condition_unmasked otherwise.
(vect_do_peeling): Update call accordingly.
(vect_gen_vector_loop_niters): Use VF as the step for fully-masked
loops.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
mask_compare_type, can_fully_mask_p and fully_masked_p.
(release_vec_loop_masks): New function.
(_loop_vec_info): Use it to free the loop masks.
(can_produce_all_loop_masks_p): New function.
(vect_get_max_nscalars_per_iter): Likewise.
(vect_verify_full_masking): Likewise.
(vect_analyze_loop_2): Save LOOP_VINFO_CAN_FULLY_MASK_P around
retries, and free the mask rgroups before retrying. Check loop-wide
reasons for disallowing fully-masked loops. Make the final decision
about whether use a fully-masked loop or not.
(vect_estimate_min_profitable_iters): Do not assume that peeling
for the number of iterations will be needed for fully-masked loops.
(vectorizable_reduction): Disable fully-masked loops.
(vectorizable_live_operation): Likewise.
(vect_halve_mask_nunits): New function.
(vect_double_mask_nunits): Likewise.
(vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_transform_loop): Handle the case in which the final loop
iteration might handle a partial vector. Call vect_set_loop_condition
instead of slpeel_make_loop_iterate_ntimes.
* tree-vect-stmts.c: Include tree-ssa-loop-niter.h and gimple-fold.h.
(check_load_store_masking): New function.
(prepare_load_store_mask): Likewise.
(vectorizable_store): Handle fully-masked loops.
(vectorizable_load): Likewise.
(supportable_widening_operation): Use vect_halve_mask_nunits for
booleans.
(supportable_narrowing_operation): Likewise vect_double_mask_nunits.
(vect_gen_while): New function.
* config/aarch64/aarch64.md (umax<mode>3): New expander.
(aarch64_uqdec<mode>): New insn.
gcc/testsuite/
* gcc.dg/tree-ssa/cunroll-10.c: Disable vectorization.
* gcc.dg/tree-ssa/peel1.c: Likewise.
* gcc.dg/vect/vect-load-lanes-peeling-1.c: Remove XFAIL for
variable-length vectors.
* gcc.target/aarch64/sve/vcond_6.c: XFAIL test for AND.
* gcc.target/aarch64/sve/vec_bool_cmp_1.c: Expect BIC instead of NOT.
* gcc.target/aarch64/sve/slp_1.c: Check for a fully-masked loop.
* gcc.target/aarch64/sve/slp_2.c: Likewise.
* gcc.target/aarch64/sve/slp_3.c: Likewise.
* gcc.target/aarch64/sve/slp_4.c: Likewise.
* gcc.target/aarch64/sve/slp_6.c: Likewise.
* gcc.target/aarch64/sve/slp_8.c: New test.
* gcc.target/aarch64/sve/slp_8_run.c: Likewise.
* gcc.target/aarch64/sve/slp_9.c: Likewise.
* gcc.target/aarch64/sve/slp_9_run.c: Likewise.
* gcc.target/aarch64/sve/slp_10.c: Likewise.
* gcc.target/aarch64/sve/slp_10_run.c: Likewise.
* gcc.target/aarch64/sve/slp_11.c: Likewise.
* gcc.target/aarch64/sve/slp_11_run.c: Likewise.
* gcc.target/aarch64/sve/slp_12.c: Likewise.
* gcc.target/aarch64/sve/slp_12_run.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2_run.c: Likewise.
* gcc.target/aarch64/sve/while_1.c: Likewise.
* gcc.target/aarch64/sve/while_2.c: Likewise.
* gcc.target/aarch64/sve/while_3.c: Likewise.
* gcc.target/aarch64/sve/while_4.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256625
|
|
This patch adds support for vectorising groups of IFN_MASK_LOADs
and IFN_MASK_STOREs using conditional load/store-lanes instructions.
This requires new internal functions to represent the result
(IFN_MASK_{LOAD,STORE}_LANES), as well as associated optabs.
The normal IFN_{LOAD,STORE}_LANES functions are const operations
that logically just perform the permute: the load or store is
encoded as a MEM operand to the call statement. In contrast,
the IFN_MASK_{LOAD,STORE}_LANES functions use the same kind of
interface as IFN_MASK_{LOAD,STORE}, since the memory is only
conditionally accessed.
The AArch64 patterns were added as part of the main LD[234]/ST[234] patch.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (vec_mask_load_lanes@var{m}@var{n}): Document.
(vec_mask_store_lanes@var{m}@var{n}): Likewise.
* optabs.def (vec_mask_load_lanes_optab): New optab.
(vec_mask_store_lanes_optab): Likewise.
* internal-fn.def (MASK_LOAD_LANES): New internal function.
(MASK_STORE_LANES): Likewise.
* internal-fn.c (mask_load_lanes_direct): New macro.
(mask_store_lanes_direct): Likewise.
(expand_mask_load_optab_fn): Handle masked operations.
(expand_mask_load_lanes_optab_fn): New macro.
(expand_mask_store_optab_fn): Handle masked operations.
(expand_mask_store_lanes_optab_fn): New macro.
(direct_mask_load_lanes_optab_supported_p): Likewise.
(direct_mask_store_lanes_optab_supported_p): Likewise.
* tree-vectorizer.h (vect_store_lanes_supported): Take a masked_p
parameter.
(vect_load_lanes_supported): Likewise.
* tree-vect-data-refs.c (strip_conversion): New function.
(can_group_stmts_p): Likewise.
(vect_analyze_data_ref_accesses): Use it instead of checking
for a pair of assignments.
(vect_store_lanes_supported): Take a masked_p parameter.
(vect_load_lanes_supported): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Update calls to
vect_store_lanes_supported and vect_load_lanes_supported.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
* tree-vect-stmts.c (get_group_load_store_type): Take a masked_p
parameter. Don't allow gaps for masked accesses.
Use vect_get_store_rhs. Update calls to vect_store_lanes_supported
and vect_load_lanes_supported.
(get_load_store_type): Take a masked_p parameter and update
call to get_group_load_store_type.
(vectorizable_store): Update call to get_load_store_type.
Handle IFN_MASK_STORE_LANES.
(vectorizable_load): Update call to get_load_store_type.
Handle IFN_MASK_LOAD_LANES.
gcc/testsuite/
* gcc.dg/vect/vect-ooo-group-1.c: New test.
* gcc.target/aarch64/sve/mask_struct_load_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_1_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_2_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_3_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_6.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_7.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_8.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256620
|
|
From-SVN: r256169
|
|
This patch makes expand_vector_ubsan_overflow cope with a polynomial
number of elements.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial
numbers of elements.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256148
|
|
-fsanitize=null)
2017-12-15 Richard Biener <rguenther@suse.de>
PR lto/83388
* internal-fn.def (IFN_NOP): Add.
* internal-fn.c (expand_NOP): Do nothing.
* lto-streamer-in.c (input_function): Instead of removing
sanitizer calls replace them with IFN_NOP calls.
* gcc.dg/lto/pr83388_0.c: New testcase.
From-SVN: r255694
|
|
exactly one argument is the constant 2)
PR target/83210
* internal-fn.c (expand_mul_overflow): Optimize unsigned
multiplication by power of 2 constant into two shifts + comparison.
* gcc.target/i386/pr83210.c: New test.
From-SVN: r255269
|
|
This patch replaces the REDUC_*_EXPR tree codes with internal functions.
This is needed so that the upcoming in-order reductions can also use
internal functions without too much complication.
2017-11-22 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): Delete.
* cfgexpand.c (expand_debug_expr): Remove handling for them.
* expr.c (expand_expr_real_2): Likewise.
* fold-const.c (const_unop): Likewise.
* optabs-tree.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
(op_code_prio): Likewise.
(op_symbol_code): Likewise.
* internal-fn.def (DEF_INTERNAL_SIGNED_OPTAB_FN): Define.
(IFN_REDUC_PLUS, IFN_REDUC_MAX, IFN_REDUC_MIN): New internal functions.
* internal-fn.c (direct_internal_fn_optab): New function.
(direct_internal_fn_array, direct_internal_fn_supported_p
(internal_fn_expanders): Handle DEF_INTERNAL_SIGNED_OPTAB_FN.
* fold-const-call.c (fold_const_reduction): New function.
(fold_const_call): Handle CFN_REDUC_PLUS, CFN_REDUC_MAX and
CFN_REDUC_MIN.
* tree-vect-loop.c: Include internal-fn.h.
(reduction_code_for_scalar_code): Rename to...
(reduction_fn_for_scalar_code): ...this and return an internal
function.
(vect_model_reduction_cost): Take an internal_fn rather than
a tree_code.
(vect_create_epilog_for_reduction): Likewise. Build calls rather
than assignments.
(vectorizable_reduction): Use internal functions rather than tree
codes for the reduction operation. Update calls to the functions
above.
* config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin):
Use calls to internal functions rather than REDUC tree codes.
* config/aarch64/aarch64-simd.md: Update comment accordingly.
From-SVN: r255073
|
|
PR target/82981
* internal-fn.c (expand_mul_overflow): Use OPTAB_WIDEN instead of
OPTAB_DIRECT in calls to expand_simple_binop.
From-SVN: r254986
|
|
PR target/82981
* internal-fn.c: Include gimple-ssa.h, tree-phinodes.h and
ssa-iterators.h.
(can_widen_mult_without_libcall): New function.
(expand_mul_overflow): If only checking unsigned mul overflow,
not result, and can do efficiently MULT_HIGHPART_EXPR, emit that.
Don't use WIDEN_MULT_EXPR if it would involve a libcall, unless
no other way works. Add MULT_HIGHPART_EXPR + MULT_EXPR support.
(expand_DIVMOD): Formatting fix.
* expmed.h (expand_mult): Add NO_LIBCALL argument.
* expmed.c (expand_mult): Likewise. Use OPTAB_WIDEN rather
than OPTAB_LIB_WIDEN if NO_LIBCALL is true, and allow it to fail.
* gcc.target/mips/pr82981.c: New test.
From-SVN: r254758
|
|
This is needed by the later SVE LAST reductions, where an 8-bit
or 16-bit result is zero- rather than sign-extended to 32 bits.
I think it could occur in other situations too.
2017-09-19 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.c (expand_direct_optab_fn): Don't assign directly
to a SUBREG_PROMOTED_VAR.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r253992
|
|
int64_t when compiled with -m32)
PR target/82274
* internal-fn.c (expand_mul_overflow): If both operands have
the same highpart of -1 or 0 and the topmost bit of lowpart
is different, overflow is if res <= 0 rather than res < 0.
* libgcc2.c (__mulvDI3): If both operands have
the same highpart of -1 and the topmost bit of lowpart is 0,
multiplication overflows even if both lowparts are 0.
* gcc.dg/pr82274-1.c: New test.
* gcc.dg/pr82274-2.c: New test.
From-SVN: r253734
|
|
The wide_int routines allow things like:
wi::add (t, 1)
to add 1 to an INTEGER_CST T in its native precision. But we also have:
wi::to_offset (t) // Treat T as an offset_int
wi::to_widest (t) // Treat T as a widest_int
Recently we also gained:
wi::to_wide (t, prec) // Treat T as a wide_int in preccision PREC
This patch therefore requires:
wi::to_wide (t)
when operating on INTEGER_CSTs in their native precision. This is
just as efficient, and makes it clearer that a deliberate choice is
being made to treat the tree as a wide_int in its native precision.
This also removes the inconsistency that
a) INTEGER_CSTs in their native precision can be used without an accessor
but must use wi:: functions instead of C++ operators
b) the other forms need an explicit accessor but the result can be used
with C++ operators.
It also helps with SVE, where there's the additional possibility
that the tree could be a runtime value.
2017-10-10 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* wide-int.h (wide_int_ref_storage): Make host_dependent_precision
a template parameter.
(WIDE_INT_REF_FOR): Update accordingly.
* tree.h (wi::int_traits <const_tree>): Delete.
(wi::tree_to_widest_ref, wi::tree_to_offset_ref): New typedefs.
(wi::to_widest, wi::to_offset): Use them. Expand commentary.
(wi::tree_to_wide_ref): New typedef.
(wi::to_wide): New function.
* calls.c (get_size_range): Use wi::to_wide when operating on
trees as wide_ints.
* cgraph.c (cgraph_node::create_thunk): Likewise.
* config/i386/i386.c (ix86_data_alignment): Likewise.
(ix86_local_alignment): Likewise.
* dbxout.c (stabstr_O): Likewise.
* dwarf2out.c (add_scalar_info, gen_enumeration_type_die): Likewise.
* expr.c (const_vector_from_tree): Likewise.
* fold-const-call.c (host_size_t_cst_p, fold_const_call_1): Likewise.
* fold-const.c (may_negate_without_overflow_p, negate_expr_p)
(fold_negate_expr_1, int_const_binop_1, const_binop)
(fold_convert_const_int_from_real, optimize_bit_field_compare)
(all_ones_mask_p, sign_bit_p, unextend, extract_muldiv_1)
(fold_div_compare, fold_single_bit_test, fold_plusminus_mult_expr)
(pointer_may_wrap_p, expr_not_equal_to, fold_binary_loc)
(fold_ternary_loc, multiple_of_p, fold_negate_const, fold_abs_const)
(fold_not_const, round_up_loc): Likewise.
* gimple-fold.c (gimple_fold_indirect_ref): Likewise.
* gimple-ssa-warn-alloca.c (alloca_call_type_by_arg): Likewise.
(alloca_call_type): Likewise.
* gimple.c (preprocess_case_label_vec_for_gimple): Likewise.
* godump.c (go_output_typedef): Likewise.
* graphite-sese-to-poly.c (tree_int_to_gmp): Likewise.
* internal-fn.c (get_min_precision): Likewise.
* ipa-cp.c (ipcp_store_vr_results): Likewise.
* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Likewise.
* ipa-prop.c (ipa_print_node_jump_functions_for_edge): Likewise.
(ipa_modify_call_arguments): Likewise.
* match.pd: Likewise.
* omp-low.c (scan_omp_1_op, lower_omp_ordered_clauses): Likewise.
* print-tree.c (print_node_brief, print_node): Likewise.
* stmt.c (expand_case): Likewise.
* stor-layout.c (layout_type): Likewise.
* tree-affine.c (tree_to_aff_combination): Likewise.
* tree-cfg.c (group_case_labels_stmt): Likewise.
* tree-data-ref.c (dr_analyze_indices): Likewise.
(prune_runtime_alias_test_list): Likewise.
* tree-dump.c (dequeue_and_dump): Likewise.
* tree-inline.c (remap_gimple_op_r, copy_tree_body_r): Likewise.
* tree-predcom.c (is_inv_store_elimination_chain): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-scalar-evolution.c (iv_can_overflow_p): Likewise.
(simple_iv_with_niters): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
* tree-ssa-ccp.c (ccp_finalize, evaluate_stmt): Likewise.
* tree-ssa-loop-ivopts.c (constant_multiple_of): Likewise.
* tree-ssa-loop-niter.c (split_to_var_and_offset)
(refine_value_range_using_guard, number_of_iterations_ne_max)
(number_of_iterations_lt_to_ne, number_of_iterations_lt)
(get_cst_init_from_scev, record_nonwrapping_iv)
(scev_var_range_cant_overflow): Likewise.
* tree-ssa-phiopt.c (minmax_replacement): Likewise.
* tree-ssa-pre.c (compute_avail): Likewise.
* tree-ssa-sccvn.c (vn_reference_fold_indirect): Likewise.
(vn_reference_maybe_forwprop_address, valueized_wider_op): Likewise.
* tree-ssa-structalias.c (get_constraint_for_ptr_offset): Likewise.
* tree-ssa-uninit.c (is_pred_expr_subset_of): Likewise.
* tree-ssanames.c (set_nonzero_bits, get_nonzero_bits): Likewise.
* tree-switch-conversion.c (collect_switch_conv_info, array_value_type)
(dump_case_nodes, try_switch_expansion): Likewise.
* tree-vect-loop-manip.c (vect_gen_vector_loop_niters): Likewise.
(vect_do_peeling): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
* tree-vrp.c (compare_values_warnv, vrp_int_const_binop): Likewise.
(zero_nonzero_bits_from_vr, ranges_from_anti_range): Likewise.
(extract_range_from_binary_expr_1, adjust_range_with_scev): Likewise.
(overflow_comparison_p_1, register_edge_assert_for_2): Likewise.
(is_masked_range_test, find_switch_asserts, maybe_set_nonzero_bits)
(vrp_evaluate_conditional_warnv_with_ops, intersect_ranges): Likewise.
(range_fits_type_p, two_valued_val_range_p, vrp_finalize): Likewise.
(evrp_dom_walker::before_dom_children): Likewise.
* tree.c (cache_integer_cst, real_value_from_int_cst, integer_zerop)
(integer_all_onesp, integer_pow2p, integer_nonzerop, tree_log2)
(tree_floor_log2, tree_ctz, mem_ref_offset, tree_int_cst_sign_bit)
(tree_int_cst_sgn, get_unwidened, int_fits_type_p): Likewise.
(get_type_static_bounds, num_ending_zeros, drop_tree_overflow)
(get_range_pos_neg): Likewise.
* ubsan.c (ubsan_expand_ptr_ifn): Likewise.
* config/darwin.c (darwin_mergeable_constant_section): Likewise.
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise.
* config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise.
* config/avr/avr.c (avr_fold_builtin): Likewise.
* config/bfin/bfin.c (bfin_local_alignment): Likewise.
* config/msp430/msp430.c (msp430_attr): Likewise.
* config/nds32/nds32.c (nds32_insert_attributes): Likewise.
* config/powerpcspe/powerpcspe-c.c
(altivec_resolve_overloaded_builtin): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_aggregate_candidate)
(rs6000_expand_ternop_builtin): Likewise.
* config/rs6000/rs6000-c.c
(altivec_resolve_overloaded_builtin): Likewise.
* config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise.
(rs6000_expand_ternop_builtin): Likewise.
* config/s390/s390.c (s390_handle_hotpatch_attribute): Likewise.
gcc/ada/
* gcc-interface/decl.c (annotate_value): Use wi::to_wide when
operating on trees as wide_ints.
gcc/c/
* c-parser.c (c_parser_cilk_clause_vectorlength): Use wi::to_wide when
operating on trees as wide_ints.
* c-typeck.c (build_c_cast, c_finish_omp_clauses): Likewise.
(c_tree_equal): Likewise.
gcc/c-family/
* c-ada-spec.c (dump_generic_ada_node): Use wi::to_wide when
operating on trees as wide_ints.
* c-common.c (pointer_int_sum): Likewise.
* c-pretty-print.c (pp_c_integer_constant): Likewise.
* c-warn.c (match_case_to_enum_1): Likewise.
(c_do_switch_warnings): Likewise.
(maybe_warn_shift_overflow): Likewise.
gcc/cp/
* cvt.c (ignore_overflows): Use wi::to_wide when
operating on trees as wide_ints.
* decl.c (check_array_designated_initializer): Likewise.
* mangle.c (write_integer_cst): Likewise.
* semantics.c (cp_finish_omp_clause_depend_sink): Likewise.
gcc/fortran/
* target-memory.c (gfc_interpret_logical): Use wi::to_wide when
operating on trees as wide_ints.
* trans-const.c (gfc_conv_tree_to_mpz): Likewise.
* trans-expr.c (gfc_conv_cst_int_power): Likewise.
* trans-intrinsic.c (trans_this_image): Likewise.
(gfc_conv_intrinsic_bound): Likewise.
(conv_intrinsic_cobound): Likewise.
gcc/lto/
* lto.c (compare_tree_sccs_1): Use wi::to_wide when
operating on trees as wide_ints.
gcc/objc/
* objc-act.c (objc_decl_method_attributes): Use wi::to_wide when
operating on trees as wide_ints.
From-SVN: r253595
|
|
This patch changes the types of various things from machine_mode
to scalar_int_mode, in cases where (after previous patches)
simply changing the type is enough on its own. The patch does
nothing other than that.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* builtins.h (builtin_strncpy_read_str): Take a scalar_int_mode
instead of a machine_mode.
(builtin_memset_read_str): Likewise.
* builtins.c (c_readstr): Likewise.
(builtin_memcpy_read_str): Likewise.
(builtin_strncpy_read_str): Likewise.
(builtin_memset_read_str): Likewise.
(builtin_memset_gen_str): Likewise.
(expand_builtin_signbit): Use scalar_int_mode for local variables.
* cfgexpand.c (convert_debug_memory_address): Take a scalar_int_mode
instead of a machine_mode.
* combine.c (simplify_if_then_else): Use scalar_int_mode for local
variables.
(make_extraction): Likewise.
(try_widen_shift_mode): Take and return scalar_int_modes instead
of machine_modes.
* config/aarch64/aarch64.c (aarch64_libgcc_cmp_return_mode): Return
a scalar_int_mode instead of a machine_mode.
* config/avr/avr.c (avr_addr_space_address_mode): Likewise.
(avr_addr_space_pointer_mode): Likewise.
* config/cr16/cr16.c (cr16_unwind_word_mode): Likewise.
* config/msp430/msp430.c (msp430_addr_space_pointer_mode): Likewise.
(msp430_unwind_word_mode): Likewise.
* config/spu/spu.c (spu_unwind_word_mode): Likewise.
(spu_addr_space_pointer_mode): Likewise.
(spu_addr_space_address_mode): Likewise.
(spu_libgcc_cmp_return_mode): Likewise.
(spu_libgcc_shift_count_mode): Likewise.
* config/rl78/rl78.c (rl78_addr_space_address_mode): Likewise.
(rl78_addr_space_pointer_mode): Likewise.
(fl78_unwind_word_mode): Likewise.
(rl78_valid_pointer_mode): Take a scalar_int_mode instead of a
machine_mode.
* config/alpha/alpha.c (vms_valid_pointer_mode): Likewise.
* config/ia64/ia64.c (ia64_vms_valid_pointer_mode): Likewise.
* config/mips/mips.c (mips_mode_rep_extended): Likewise.
(mips_valid_pointer_mode): Likewise.
* config/tilegx/tilegx.c (tilegx_mode_rep_extended): Likewise.
* config/ft32/ft32.c (ft32_valid_pointer_mode): Likewise.
(ft32_addr_space_pointer_mode): Return a scalar_int_mode instead
of a machine_mode.
(ft32_addr_space_address_mode): Likewise.
* config/m32c/m32c.c (m32c_valid_pointer_mode): Take a
scalar_int_mode instead of a machine_mode.
(m32c_addr_space_pointer_mode): Return a scalar_int_mode instead
of a machine_mode.
(m32c_addr_space_address_mode): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_abi_word_mode): Likewise.
(rs6000_eh_return_filter_mode): Likewise.
* config/rs6000/rs6000.c (rs6000_abi_word_mode): Likewise.
(rs6000_eh_return_filter_mode): Likewise.
* config/s390/s390.c (s390_libgcc_cmp_return_mode): Likewise.
(s390_libgcc_shift_count_mode): Likewise.
(s390_unwind_word_mode): Likewise.
(s390_valid_pointer_mode): Take a scalar_int_mode rather than a
machine_mode.
* target.def (mode_rep_extended): Likewise.
(valid_pointer_mode): Likewise.
(addr_space.valid_pointer_mode): Likewise.
(eh_return_filter_mode): Return a scalar_int_mode rather than
a machine_mode.
(libgcc_cmp_return_mode): Likewise.
(libgcc_shift_count_mode): Likewise.
(unwind_word_mode): Likewise.
(addr_space.pointer_mode): Likewise.
(addr_space.address_mode): Likewise.
* doc/tm.texi: Regenerate.
* dojump.c (prefer_and_bit_test): Take a scalar_int_mode rather than
a machine_mode.
(do_jump): Use scalar_int_mode for local variables.
* dwarf2cfi.c (init_return_column_size): Take a scalar_int_mode
rather than a machine_mode.
* dwarf2out.c (convert_descriptor_to_mode): Likewise.
(scompare_loc_descriptor_wide): Likewise.
(scompare_loc_descriptor_narrow): Likewise.
* emit-rtl.c (adjust_address_1): Use scalar_int_mode for local
variables.
* except.c (sjlj_emit_dispatch_table): Likewise.
(expand_builtin_eh_copy_values): Likewise.
* explow.c (convert_memory_address_addr_space_1): Likewise.
Take a scalar_int_mode rather than a machine_mode.
(convert_memory_address_addr_space): Take a scalar_int_mode rather
than a machine_mode.
(memory_address_addr_space): Use scalar_int_mode for local variables.
* expmed.h (expand_mult_highpart_adjust): Take a scalar_int_mode
rather than a machine_mode.
* expmed.c (mask_rtx): Likewise.
(init_expmed_one_conv): Likewise.
(expand_mult_highpart_adjust): Likewise.
(extract_high_half): Likewise.
(expmed_mult_highpart_optab): Likewise.
(expmed_mult_highpart): Likewise.
(expand_smod_pow2): Likewise.
(expand_sdiv_pow2): Likewise.
(emit_store_flag_int): Likewise.
(adjust_bit_field_mem_for_reg): Use scalar_int_mode for local
variables.
(extract_low_bits): Likewise.
* expr.h (by_pieces_constfn): Take a scalar_int_mode rather than
a machine_mode.
* expr.c (pieces_addr::adjust): Likewise.
(can_store_by_pieces): Likewise.
(store_by_pieces): Likewise.
(clear_by_pieces_1): Likewise.
(expand_expr_addr_expr_1): Likewise.
(expand_expr_addr_expr): Use scalar_int_mode for local variables.
(expand_expr_real_1): Likewise.
(try_casesi): Likewise.
* final.c (shorten_branches): Likewise.
* fold-const.c (fold_convert_const_int_from_fixed): Change the
type of "mode" to machine_mode.
* internal-fn.c (expand_arith_overflow_result_store): Take a
scalar_int_mode rather than a machine_mode.
(expand_mul_overflow): Use scalar_int_mode for local variables.
* loop-doloop.c (doloop_modify): Likewise.
(doloop_optimize): Likewise.
* optabs.c (expand_subword_shift): Take a scalar_int_mode rather
than a machine_mode.
(expand_doubleword_shift_condmove): Likewise.
(expand_doubleword_shift): Likewise.
(expand_doubleword_clz): Likewise.
(expand_doubleword_popcount): Likewise.
(expand_doubleword_parity): Likewise.
(expand_absneg_bit): Use scalar_int_mode for local variables.
(prepare_float_lib_cmp): Likewise.
* rtl.h (convert_memory_address_addr_space_1): Take a scalar_int_mode
rather than a machine_mode.
(convert_memory_address_addr_space): Likewise.
(get_mode_bounds): Likewise.
(get_address_mode): Return a scalar_int_mode rather than a
machine_mode.
* rtlanal.c (get_address_mode): Likewise.
* stor-layout.c (get_mode_bounds): Take a scalar_int_mode rather
than a machine_mode.
* targhooks.c (default_mode_rep_extended): Likewise.
(default_valid_pointer_mode): Likewise.
(default_addr_space_valid_pointer_mode): Likewise.
(default_eh_return_filter_mode): Return a scalar_int_mode rather
than a machine_mode.
(default_libgcc_cmp_return_mode): Likewise.
(default_libgcc_shift_count_mode): Likewise.
(default_unwind_word_mode): Likewise.
(default_addr_space_pointer_mode): Likewise.
(default_addr_space_address_mode): Likewise.
* targhooks.h (default_eh_return_filter_mode): Likewise.
(default_libgcc_cmp_return_mode): Likewise.
(default_libgcc_shift_count_mode): Likewise.
(default_unwind_word_mode): Likewise.
(default_addr_space_pointer_mode): Likewise.
(default_addr_space_address_mode): Likewise.
(default_mode_rep_extended): Take a scalar_int_mode rather than
a machine_mode.
(default_valid_pointer_mode): Likewise.
(default_addr_space_valid_pointer_mode): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Use scalar_int_mode for
local variables.
* tree-ssa-loop-ivopts.c (get_shiftadd_cost): Take a scalar_int_mode
rather than a machine_mode.
* tree-switch-conversion.c (array_value_type): Use scalar_int_mode
for local variables.
* tree-vrp.c (simplify_float_conversion_using_ranges): Likewise.
* var-tracking.c (use_narrower_mode): Take a scalar_int_mode rather
than a machine_mode.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251513
|
|
This patch adds asserting as_a <scalar_int_mode> conversions
to contexts in which the input is known to be a scalar integer mode.
In expand_divmod, op1 is always a scalar_int_mode if
op1_is_constant (but might not be otherwise).
In expand_binop, the patch reverses a < comparison in order to
avoid splitting a long line.
gcc/
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* cfgexpand.c (convert_debug_memory_address): Use
as_a <scalar_int_mode>.
* combine.c (expand_compound_operation): Likewise.
(make_extraction): Likewise.
(change_zero_ext): Likewise.
(simplify_comparison): Likewise.
* cse.c (cse_insn): Likewise.
* dwarf2out.c (minmax_loc_descriptor): Likewise.
(mem_loc_descriptor): Likewise.
(loc_descriptor): Likewise.
* expmed.c (init_expmed_one_mode): Likewise.
(synth_mult): Likewise.
(emit_store_flag_1): Likewise.
(expand_divmod): Likewise. Use HWI_COMPUTABLE_MODE_P instead
of a comparison with size.
* expr.c (expand_assignment): Use as_a <scalar_int_mode>.
(reduce_to_bit_field_precision): Likewise.
* function.c (expand_function_end): Likewise.
* internal-fn.c (expand_arith_overflow_result_store): Likewise.
* loop-doloop.c (doloop_modify): Likewise.
* optabs.c (expand_binop): Likewise.
(expand_unop): Likewise.
(expand_copysign_absneg): Likewise.
(prepare_cmp_insn): Likewise.
(maybe_legitimize_operand): Likewise.
* recog.c (const_scalar_int_operand): Likewise.
* rtlanal.c (get_address_mode): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
(simplify_cond_clz_ctz): Likewise.
* tree-nested.c (get_nl_goto_field): Likewise.
* tree.c (build_vector_type_for_mode): Likewise.
* var-tracking.c (use_narrower_mode): Likewise.
gcc/c-family/
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* c-common.c (c_common_type_for_mode): Use as_a <scalar_int_mode>.
gcc/lto/
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* lto-lang.c (lto_type_for_mode): Use as_a <scalar_int_mode>.
From-SVN: r251487
|
|
This patch adds a SCALAR_INT_TYPE_MODE macro that asserts
that the type has a scalar integer mode and returns it as
a scalar_int_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (SCALAR_INT_TYPE_MODE): New macro.
* builtins.c (expand_builtin_signbit): Use it.
* cfgexpand.c (expand_debug_expr): Likewise.
* dojump.c (do_jump): Likewise.
(do_compare_and_jump): Likewise.
* dwarf2cfi.c (expand_builtin_init_dwarf_reg_sizes): Likewise.
* expmed.c (make_tree): Likewise.
* expr.c (expand_expr_real_2): Likewise.
(expand_expr_real_1): Likewise.
(try_casesi): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
* fold-const.c (unextend): Likewise.
(extract_muldiv_1): Likewise.
(fold_single_bit_test): Likewise.
(native_encode_int): Likewise.
(native_encode_string): Likewise.
(native_interpret_int): Likewise.
* gimple-fold.c (gimple_fold_builtin_memset): Likewise.
* internal-fn.c (expand_addsub_overflow): Likewise.
(expand_neg_overflow): Likewise.
(expand_mul_overflow): Likewise.
(expand_arith_overflow): Likewise.
* match.pd: Likewise.
* stor-layout.c (layout_type): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_widen): Likewise.
* tree-ssanames.c (get_range_info): Likewise.
* tree-switch-conversion.c (array_value_type) Likewise.
* tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise.
(vect_recog_divmod_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
* tree-vrp.c (extract_range_basic): Likewise.
(simplify_float_conversion_using_ranges): Likewise.
* tree.c (int_fits_type_p): Likewise.
* ubsan.c (instrument_bool_enum_load): Likewise.
* varasm.c (mergeable_string_section): Likewise.
(narrowing_initializer_constant_valid_p): Likewise.
(output_constant): Likewise.
gcc/cp/
* cvt.c (cp_convert_to_pointer): Use SCALAR_INT_TYPE_MODE.
gcc/fortran/
* target-memory.c (size_integer): Use SCALAR_INT_TYPE_MODE.
(size_logical): Likewise.
gcc/objc/
* objc-encoding.c (encode_type): Use SCALAR_INT_TYPE_MODE.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251486
|
|
This patch adds a wrapper around smallest_mode_for_size
for cases in which the mode class is MODE_INT. Unlike
(int_)mode_for_size, smallest_mode_for_size always returns
a mode of the specified class, asserting if no such mode exists.
smallest_int_mode_for_size therefore returns a scalar_int_mode
rather than an opt_scalar_int_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (smallest_mode_for_size): Fix formatting.
(smallest_int_mode_for_size): New function.
* cfgexpand.c (expand_debug_expr): Use smallest_int_mode_for_size
instead of smallest_mode_for_size.
* combine.c (make_extraction): Likewise.
* config/arc/arc.c (arc_expand_movmem): Likewise.
* config/arm/arm.c (arm_expand_divmod_libfunc): Likewise.
* config/i386/i386.c (ix86_get_mask_mode): Likewise.
* config/s390/s390.c (s390_expand_insv): Likewise.
* config/sparc/sparc.c (assign_int_registers): Likewise.
* config/spu/spu.c (spu_function_value): Likewise.
(spu_function_arg): Likewise.
* coverage.c (get_gcov_type): Likewise.
(get_gcov_unsigned_t): Likewise.
* dse.c (find_shift_sequence): Likewise.
* expmed.c (store_bit_field_1): Likewise.
* expr.c (convert_move): Likewise.
(store_field): Likewise.
* internal-fn.c (expand_arith_overflow): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs.c (expand_twoval_binop_libfunc): Likewise.
* stor-layout.c (layout_type): Likewise.
(initialize_sizetypes): Likewise.
* targhooks.c (default_get_mask_mode): Likewise.
* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251471
|
|
This patch adds a wrapper around mode_for_size for cases in which
the mode class is MODE_INT (the commonest case). The return type
can then be an opt_scalar_int_mode instead of a machine_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (int_mode_for_size): New function.
* builtins.c (set_builtin_user_assembler_name): Use int_mode_for_size
instead of mode_for_size.
* calls.c (save_fixed_argument_area): Likewise. Make use of BLKmode
explicit.
* combine.c (expand_field_assignment): Use int_mode_for_size
instead of mode_for_size.
(make_extraction): Likewise.
(simplify_shift_const_1): Likewise.
(simplify_comparison): Likewise.
* dojump.c (do_jump): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
* expmed.c (flip_storage_order): Likewise.
(convert_extracted_bit_field): Likewise.
* expr.c (copy_blkmode_from_reg): Likewise.
* graphite-isl-ast-to-gimple.c (max_mode_int_precision): Likewise.
* internal-fn.c (expand_mul_overflow): Likewise.
* lower-subreg.c (simple_move): Likewise.
* optabs-libfuncs.c (init_optabs): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
* tree.c (vector_type_mode): Likewise.
* tree-ssa-strlen.c (handle_builtin_memcmp): Likewise.
* tree-vect-data-refs.c (vect_lanes_optab_supported_p): Likewise.
* tree-vect-generic.c (expand_vector_parallel): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
(vectorizable_store): Likewise.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity): Use int_mode_for_size
instead of mode_for_size.
(gnat_to_gnu_subprog_type): Likewise.
* gcc-interface/utils.c (make_type_from_size): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251469
|
|
GET_MODE_WIDER previously returned VOIDmode if no wider mode existed.
That would cause problems with stricter mode classes, since VOIDmode
isn't for example a valid scalar integer or floating-point mode.
This patch instead makes it return a new opt_mode<T> class, which
holds either a T or nothing.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* coretypes.h (opt_mode): New class.
* machmode.h (opt_mode): Likewise.
(opt_mode::else_void): New function.
(opt_mode::require): Likewise.
(opt_mode::exists): Likewise.
(GET_MODE_WIDER_MODE): Turn into a function and return an opt_mode.
(GET_MODE_2XWIDER_MODE): Likewise.
(mode_iterator::get_wider): Update accordingly.
(mode_iterator::get_2xwider): Likewise.
(mode_iterator::get_known_wider): Likewise, turning into a template.
* combine.c (make_extraction): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
* config/cr16/cr16.h (LONG_REG_P): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* config/c6x/c6x.c (c6x_rtx_costs): Update use of
GET_MODE_2XWIDER_MODE, forcing a wider mode to exist.
* lower-subreg.c (init_lower_subreg): Likewise.
* optabs-libfuncs.c (init_sync_libfuncs_1): Likewise, but not
on the final iteration.
* config/i386/i386.c (ix86_expand_set_or_movmem): Check whether
a wider mode exists before asking for a move pattern.
(get_mode_wider_vector): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expand_vselect_vconcat): Update use of GET_MODE_2XWIDER_MODE,
returning false if no such mode exists.
* config/ia64/ia64.c (expand_vselect_vconcat): Likewise.
* config/mips/mips.c (mips_expand_vselect_vconcat): Likewise.
* expmed.c (init_expmed_one_mode): Update use of GET_MODE_WIDER_MODE.
Avoid checking for a MODE_INT if we already know the mode is not a
SCALAR_INT_MODE_P.
(extract_high_half): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expmed_mult_highpart_optab): Likewise.
(expmed_mult_highpart): Likewise.
* expr.c (expand_expr_real_2): Update use of GET_MODE_WIDER_MODE,
using else_void.
* lto-streamer-in.c (lto_input_mode_table): Likewise.
* optabs-query.c (find_widening_optab_handler_and_mode): Likewise.
* stor-layout.c (bit_field_mode_iterator::next_mode): Likewise.
* internal-fn.c (expand_mul_overflow): Update use of
GET_MODE_2XWIDER_MODE.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_widen): Update use of
GET_MODE_WIDER_MODE.
(convert_plusminus_to_widen): Likewise.
* tree-switch-conversion.c (array_value_type): Likewise.
* var-tracking.c (emit_note_insn_var_location): Likewise.
* tree-vrp.c (simplify_float_conversion_using_ranges): Likewise.
Return false inside rather than outside the loop if no wider mode
exists
* optabs.c (expand_binop): Update use of GET_MODE_WIDER_MODE
and GET_MODE_2XWIDER_MODE
(can_compare_p): Use else_void.
* gdbhooks.py (OptMachineModePrinter): New class.
(build_pretty_printer): Use it for opt_mode.
gcc/ada/
* gcc-interface/decl.c (validate_size): Update use of
GET_MODE_WIDER_MODE, forcing a wider mode to exist.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251457
|
|
.
2017-08-08 Martin Liska <mliska@suse.cz>
* gcc-interface/trans.c: Include header files.
2017-08-08 Martin Liska <mliska@suse.cz>
* objc-gnu-runtime-abi-01.c: Include header files.
* objc-next-runtime-abi-01.c: Likewise.
* objc-next-runtime-abi-02.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* asan.c: Include header files.
* attribs.c (build_decl_attribute_variant): New function moved
from tree.[ch].
(build_type_attribute_qual_variant): Likewise.
(cmp_attrib_identifiers): Likewise.
(simple_cst_list_equal): Likewise.
(omp_declare_simd_clauses_equal): Likewise.
(attribute_value_equal): Likewise.
(comp_type_attributes): Likewise.
(build_type_attribute_variant): Likewise.
(lookup_ident_attribute): Likewise.
(remove_attribute): Likewise.
(merge_attributes): Likewise.
(merge_type_attributes): Likewise.
(merge_decl_attributes): Likewise.
(merge_dllimport_decl_attributes): Likewise.
(handle_dll_attribute): Likewise.
(attribute_list_equal): Likewise.
(attribute_list_contained): Likewise.
* attribs.h (lookup_attribute): New function moved from tree.[ch].
(lookup_attribute_by_prefix): Likewise.
* bb-reorder.c: Include header files.
* builtins.c: Likewise.
* calls.c: Likewise.
* cfgexpand.c: Likewise.
* cgraph.c: Likewise.
* cgraphunit.c: Likewise.
* convert.c: Likewise.
* dwarf2out.c: Likewise.
* final.c: Likewise.
* fold-const.c: Likewise.
* function.c: Likewise.
* gimple-expr.c: Likewise.
* gimple-fold.c: Likewise.
* gimple-pretty-print.c: Likewise.
* gimple.c: Likewise.
* gimplify.c: Likewise.
* hsa-common.c: Likewise.
* hsa-gen.c: Likewise.
* internal-fn.c: Likewise.
* ipa-chkp.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-fnsummary.c: Likewise.
* ipa-inline.c: Likewise.
* ipa-visibility.c: Likewise.
* ipa.c: Likewise.
* lto-cgraph.c: Likewise.
* omp-expand.c: Likewise.
* omp-general.c: Likewise.
* omp-low.c: Likewise.
* omp-offload.c: Likewise.
* omp-simd-clone.c: Likewise.
* opts-global.c: Likewise.
* passes.c: Likewise.
* predict.c: Likewise.
* sancov.c: Likewise.
* sanopt.c: Likewise.
* symtab.c: Likewise.
* toplev.c: Likewise.
* trans-mem.c: Likewise.
* tree-chkp.c: Likewise.
* tree-eh.c: Likewise.
* tree-into-ssa.c: Likewise.
* tree-object-size.c: Likewise.
* tree-parloops.c: Likewise.
* tree-profile.c: Likewise.
* tree-ssa-ccp.c: Likewise.
* tree-ssa-live.c: Likewise.
* tree-ssa-loop.c: Likewise.
* tree-ssa-sccvn.c: Likewise.
* tree-ssa-structalias.c: Likewise.
* tree-ssa.c: Likewise.
* tree-streamer-in.c: Likewise.
* tree-vectorizer.c: Likewise.
* tree-vrp.c: Likewise.
* tsan.c: Likewise.
* ubsan.c: Likewise.
* varasm.c: Likewise.
* varpool.c: Likewise.
* tree.c: Remove functions moved to attribs.[ch].
* tree.h: Likewise.
* config/aarch64/aarch64.c: Add attrs.h header file.
* config/alpha/alpha.c: Likewise.
* config/arc/arc.c: Likewise.
* config/arm/arm.c: Likewise.
* config/avr/avr.c: Likewise.
* config/bfin/bfin.c: Likewise.
* config/c6x/c6x.c: Likewise.
* config/cr16/cr16.c: Likewise.
* config/cris/cris.c: Likewise.
* config/darwin.c: Likewise.
* config/epiphany/epiphany.c: Likewise.
* config/fr30/fr30.c: Likewise.
* config/frv/frv.c: Likewise.
* config/ft32/ft32.c: Likewise.
* config/h8300/h8300.c: Likewise.
* config/i386/winnt.c: Likewise.
* config/ia64/ia64.c: Likewise.
* config/iq2000/iq2000.c: Likewise.
* config/lm32/lm32.c: Likewise.
* config/m32c/m32c.c: Likewise.
* config/m32r/m32r.c: Likewise.
* config/m68k/m68k.c: Likewise.
* config/mcore/mcore.c: Likewise.
* config/microblaze/microblaze.c: Likewise.
* config/mips/mips.c: Likewise.
* config/mmix/mmix.c: Likewise.
* config/mn10300/mn10300.c: Likewise.
* config/moxie/moxie.c: Likewise.
* config/msp430/msp430.c: Likewise.
* config/nds32/nds32-isr.c: Likewise.
* config/nds32/nds32.c: Likewise.
* config/nios2/nios2.c: Likewise.
* config/nvptx/nvptx.c: Likewise.
* config/pa/pa.c: Likewise.
* config/pdp11/pdp11.c: Likewise.
* config/powerpcspe/powerpcspe.c: Likewise.
* config/riscv/riscv.c: Likewise.
* config/rl78/rl78.c: Likewise.
* config/rx/rx.c: Likewise.
* config/s390/s390.c: Likewise.
* config/sh/sh.c: Likewise.
* config/sol2.c: Likewise.
* config/sparc/sparc.c: Likewise.
* config/spu/spu.c: Likewise.
* config/stormy16/stormy16.c: Likewise.
* config/tilegx/tilegx.c: Likewise.
* config/tilepro/tilepro.c: Likewise.
* config/v850/v850.c: Likewise.
* config/vax/vax.c: Likewise.
* config/visium/visium.c: Likewise.
* config/xtensa/xtensa.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* call.c: Include header files.
* cp-gimplify.c: Likewise.
* cp-ubsan.c: Likewise.
* cvt.c: Likewise.
* init.c: Likewise.
* search.c: Likewise.
* semantics.c: Likewise.
* typeck.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* lto-lang.c: Include header files.
* lto-symtab.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* c-convert.c: Include header files.
* c-typeck.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* c-ada-spec.c: Include header files.
* c-ubsan.c: Likewise.
* c-warn.c: Likewise.
2017-08-08 Martin Liska <mliska@suse.cz>
* trans-types.c: Include header files.
From-SVN: r250946
|
|
PR sanitizer/80998
* sanopt.c (pass_sanopt::execute): Handle IFN_UBSAN_PTR.
* tree-ssa-alias.c (call_may_clobber_ref_p_1): Likewise.
* flag-types.h (enum sanitize_code): Add SANITIZER_POINTER_OVERFLOW.
Or it into SANITIZER_UNDEFINED.
* ubsan.c: Include gimple-fold.h and varasm.h.
(ubsan_expand_ptr_ifn): New function.
(instrument_pointer_overflow): New function.
(maybe_instrument_pointer_overflow): New function.
(instrument_object_size): Formatting fix.
(pass_ubsan::execute): Call instrument_pointer_overflow
and maybe_instrument_pointer_overflow.
* internal-fn.c (expand_UBSAN_PTR): New function.
* ubsan.h (ubsan_expand_ptr_ifn): Declare.
* sanitizer.def (__ubsan_handle_pointer_overflow,
__ubsan_handle_pointer_overflow_abort): New builtins.
* tree-ssa-tail-merge.c (merge_stmts_p): Handle IFN_UBSAN_PTR.
* internal-fn.def (UBSAN_PTR): New internal function.
* opts.c (sanitizer_opts): Add pointer-overflow.
* lto-streamer-in.c (input_function): Handle IFN_UBSAN_PTR.
* fold-const.c (build_range_check): Compute pointer range check in
integral type if pointer arithmetics would be needed. Formatting
fixes.
gcc/testsuite/
* c-c++-common/ubsan/ptr-overflow-1.c: New test.
* c-c++-common/ubsan/ptr-overflow-2.c: New test.
libsanitizer/
* ubsan/ubsan_handlers.cc: Cherry-pick upstream r304461.
* ubsan/ubsan_checks.inc: Likewise.
* ubsan/ubsan_handlers.h: Likewise.
From-SVN: r250656
|
|
functions.
* profile-count.h (profile_probability::from_reg_br_prob_note,
profile_probability::to_reg_br_prob_note): New functions.
* doc/rtl.texi (REG_BR_PROB_NOTE): Update documentation.
* reg-notes.h (REG_BR_PROB, REG_BR_PRED): Update docs.
* predict.c (probability_reliable_p): Update.
(edge_probability_reliable_p): Update.
(br_prob_note_reliable_p): Update.
(invert_br_probabilities): Update.
(add_reg_br_prob_note): New function.
(combine_predictions_for_insn): Update.
* asan.c (asan_clear_shadow): Update.
* cfgbuild.c (compute_outgoing_frequencies): Update.
* cfgrtl.c (force_nonfallthru_and_redirect): Update.
(update_br_prob_note): Update.
(rtl_verify_edges): Update.
(purge_dead_edges): Update.
(fixup_reorder_chain): Update.
* emit-rtl.c (try_split): Update.
* ifcvt.c (cond_exec_process_insns): Update.
(cond_exec_process_if_block): Update.
(dead_or_predicable): Update.
* internal-fn.c (expand_addsub_overflow): Update.
(expand_neg_overflow): Update.
(expand_mul_overflow): Update.
* loop-doloop.c (doloop_modify): Update.
* loop-unroll.c (compare_and_jump_seq): Update.
* optabs.c (emit_cmp_and_jump_insn_1): Update.
* predict.h: Update.
* reorg.c (mostly_true_jump): Update.
* rtl.h: Update.
* config/aarch64/aarch64.c (aarch64_emit_unlikely_jump): Update.
* config/alpha/alpha.c (emit_unlikely_jump): Update.
* config/arc/arc.c: (emit_unlikely_jump): Update.
* config/arm/arm.c: (emit_unlikely_jump): Update.
* config/bfin/bfin.c (cbranch_predicted_taken_p): Update.
* config/frv/frv.c (frv_print_operand_jump_hint): Update.
* config/i386/i386.c (ix86_expand_split_stack_prologue): Update.
(ix86_print_operand): Update.
(ix86_split_fp_branch): Update.
(predict_jump): Update.
* config/ia64/ia64.c (ia64_print_operand): Update.
* config/mmix/mmix.c (mmix_print_operand): Update.
* config/powerpcspe/powerpcspe.c (output_cbranch): Update.
(rs6000_expand_split_stack_prologue): Update.
* config/rs6000/rs6000.c: Update.
* config/s390/s390.c (s390_expand_vec_strlen): Update.
(s390_expand_vec_movstr): Update.
(s390_expand_cs_tdsi): Update.
(s390_expand_split_stack_prologue): Update.
* config/sh/sh.c (sh_print_operand): Update.
(expand_cbranchsi4): Update.
(expand_cbranchdi4): Update.
* config/sparc/sparc.c (output_v9branch): Update.
* config/spu/spu.c (get_branch_target): Update.
(ea_load_store_inline): Update.
* config/tilegx/tilegx.c (cbranch_predicted_p): Update.
* config/tilepro/tilepro.c: Update.
* gcc.dg/predict-8.c: Update.
From-SVN: r250239
|
|
gcc/
* asan.c: Include gimple-fold.h.
(get_last_alloca_addr): New function.
(handle_builtin_stackrestore): Likewise.
(handle_builtin_alloca): Likewise.
(asan_emit_allocas_unpoison): Likewise.
(get_mem_refs_of_builtin_call): Add new parameter, remove const
quallifier from first paramerer. Handle BUILT_IN_ALLOCA,
BUILT_IN_ALLOCA_WITH_ALIGN and BUILT_IN_STACK_RESTORE builtins.
(instrument_builtin_call): Pass gimple iterator to
get_mem_refs_of_builtin_call.
(last_alloca_addr): New global.
* asan.h (asan_emit_allocas_unpoison): Declare.
* builtins.c (expand_asan_emit_allocas_unpoison): New function.
(expand_builtin): Handle BUILT_IN_ASAN_ALLOCAS_UNPOISON.
* cfgexpand.c (expand_used_vars): Call asan_emit_allocas_unpoison
if function calls alloca.
* gimple-fold.c (replace_call_with_value): Remove static keyword.
* gimple-fold.h (replace_call_with_value): Declare.
* internal-fn.c: Include asan.h.
* sanitizer.def (BUILT_IN_ASAN_ALLOCA_POISON,
BUILT_IN_ASAN_ALLOCAS_UNPOISON): New builtins.
gcc/testsuite/
* c-c++-common/asan/alloca_big_alignment.c: New test.
* c-c++-common/asan/alloca_detect_custom_size.c: Likewise.
* c-c++-common/asan/alloca_instruments_all_paddings.c: Likewise.
* c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise.
* c-c++-common/asan/alloca_overflow_partial.c: Likewise.
* c-c++-common/asan/alloca_overflow_right.c: Likewise.
* c-c++-common/asan/alloca_safe_access.c: Likewise.
* c-c++-common/asan/alloca_underflow_left.c: Likewise.
From-SVN: r250031
|
|
r216834 did a mass removal of "enum" before "machine_mode". This patch
removes some new uses that have been added since then.
2017-07-05 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* combine.c (simplify_if_then_else): Remove "enum" before
"machine_mode".
* compare-elim.c (can_eliminate_compare): Likewise.
* config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_std_type):
Likewise.
(aarch64_lookup_simd_builtin_type): Likewise.
(aarch64_simd_builtin_type): Likewise.
(aarch64_init_simd_builtin_types): Likewise.
(aarch64_simd_expand_args): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_simd_attr_length_rglist):
Likewise.
(aarch64_reverse_mask): Likewise.
(aarch64_simd_emit_reg_reg_move): Likewise.
(aarch64_gen_adjusted_ldpstp): Likewise.
(aarch64_ccmp_mode_to_code): Likewise.
(aarch64_operands_ok_for_ldpstp): Likewise.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.
* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
Likewise.
(aarch64_min_divisions_for_recip_mul): Likewise.
(aarch64_reassociation_width): Likewise.
(aarch64_get_condition_code_1): Likewise.
(aarch64_simd_emit_reg_reg_move): Likewise.
(aarch64_simd_attr_length_rglist): Likewise.
(aarch64_reverse_mask): Likewise.
(aarch64_operands_ok_for_ldpstp): Likewise.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.
(aarch64_gen_adjusted_ldpstp): Likewise.
* config/aarch64/cortex-a57-fma-steering.c (fma_node::rename):
Likewise.
* config/arc/arc.c (legitimate_offset_address_p): Likewise.
* config/arm/arm-builtins.c (arm_simd_builtin_std_type): Likewise.
(arm_lookup_simd_builtin_type): Likewise.
(arm_simd_builtin_type): Likewise.
(arm_init_simd_builtin_types): Likewise.
(arm_expand_builtin_args): Likewise.
* config/arm/arm-protos.h (arm_expand_builtin): Likewise.
* config/ft32/ft32.c (ft32_libcall_value): Likewise.
(ft32_setup_incoming_varargs): Likewise.
(ft32_function_arg): Likewise.
(ft32_function_arg_advance): Likewise.
(ft32_pass_by_reference): Likewise.
(ft32_arg_partial_bytes): Likewise.
(ft32_valid_pointer_mode): Likewise.
(ft32_addr_space_pointer_mode): Likewise.
(ft32_addr_space_legitimate_address_p): Likewise.
* config/i386/i386-protos.h (ix86_operands_ok_for_move_multiple):
Likewise.
* config/i386/i386.c (ix86_setup_incoming_vararg_bounds): Likewise.
(ix86_emit_outlined_ms2sysv_restore): Likewise.
(iamcu_alignment): Likewise.
(canonicalize_vector_int_perm): Likewise.
(ix86_noce_conversion_profitable_p): Likewise.
(ix86_mpx_bound_mode): Likewise.
(ix86_operands_ok_for_move_multiple): Likewise.
* config/microblaze/microblaze-protos.h
(microblaze_expand_conditional_branch_reg): Likewise.
* config/microblaze/microblaze.c
(microblaze_expand_conditional_branch_reg): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_init_hard_regno_mode_ok):
Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_invalid_binary_op): Likewise.
(fusion_p9_p): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/riscv/riscv-protos.h (riscv_regno_mode_ok_for_base_p):
Likewise.
(riscv_hard_regno_mode_ok_p): Likewise.
(riscv_address_insns): Likewise.
(riscv_split_symbol): Likewise.
(riscv_legitimize_move): Likewise.
(riscv_function_value): Likewise.
(riscv_hard_regno_nregs): Likewise.
(riscv_expand_builtin): Likewise.
* config/riscv/riscv.c (riscv_build_integer_1): Likewise.
(riscv_build_integer): Likewise.
(riscv_split_integer): Likewise.
(riscv_legitimate_constant_p): Likewise.
(riscv_cannot_force_const_mem): Likewise.
(riscv_regno_mode_ok_for_base_p): Likewise.
(riscv_valid_base_register_p): Likewise.
(riscv_valid_offset_p): Likewise.
(riscv_valid_lo_sum_p): Likewise.
(riscv_classify_address): Likewise.
(riscv_legitimate_address_p): Likewise.
(riscv_address_insns): Likewise.
(riscv_load_store_insns): Likewise.
(riscv_force_binary): Likewise.
(riscv_split_symbol): Likewise.
(riscv_force_address): Likewise.
(riscv_legitimize_address): Likewise.
(riscv_move_integer): Likewise.
(riscv_legitimize_const_move): Likewise.
(riscv_legitimize_move): Likewise.
(riscv_address_cost): Likewise.
(riscv_subword): Likewise.
(riscv_output_move): Likewise.
(riscv_canonicalize_int_order_test): Likewise.
(riscv_emit_int_order_test): Likewise.
(riscv_function_arg_boundary): Likewise.
(riscv_pass_mode_in_fpr_p): Likewise.
(riscv_pass_fpr_single): Likewise.
(riscv_pass_fpr_pair): Likewise.
(riscv_get_arg_info): Likewise.
(riscv_function_arg): Likewise.
(riscv_function_arg_advance): Likewise.
(riscv_arg_partial_bytes): Likewise.
(riscv_function_value): Likewise.
(riscv_pass_by_reference): Likewise.
(riscv_setup_incoming_varargs): Likewise.
(riscv_print_operand): Likewise.
(riscv_elf_select_rtx_section): Likewise.
(riscv_save_restore_reg): Likewise.
(riscv_for_each_saved_reg): Likewise.
(riscv_register_move_cost): Likewise.
(riscv_hard_regno_mode_ok_p): Likewise.
(riscv_hard_regno_nregs): Likewise.
(riscv_class_max_nregs): Likewise.
(riscv_memory_move_cost): Likewise.
* config/rl78/rl78-protos.h (rl78_split_movsi): Likewise.
* config/rl78/rl78.c (rl78_split_movsi): Likewise.
(rl78_addr_space_address_mode): Likewise.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Likewise.
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_invalid_binary_op): Likewise.
(fusion_p9_p): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/visium/visium-protos.h (prepare_move_operands): Likewise.
(ok_for_simple_move_operands): Likewise.
(ok_for_simple_move_strict_operands): Likewise.
(ok_for_simple_arith_logic_operands): Likewise.
(visium_legitimize_reload_address): Likewise.
(visium_select_cc_mode): Likewise.
(output_cbranch): Likewise.
(visium_split_double_move): Likewise.
(visium_expand_copysign): Likewise.
(visium_expand_int_cstore): Likewise.
(visium_expand_fp_cstore): Likewise.
* config/visium/visium.c (visium_pass_by_reference): Likewise.
(visium_function_arg): Likewise.
(visium_function_arg_advance): Likewise.
(visium_libcall_value): Likewise.
(visium_setup_incoming_varargs): Likewise.
(visium_legitimate_constant_p): Likewise.
(visium_legitimate_address_p): Likewise.
(visium_legitimize_address): Likewise.
(visium_secondary_reload): Likewise.
(visium_register_move_cost): Likewise.
(visium_memory_move_cost): Likewise.
(prepare_move_operands): Likewise.
(ok_for_simple_move_operands): Likewise.
(ok_for_simple_move_strict_operands): Likewise.
(ok_for_simple_arith_logic_operands): Likewise.
(visium_function_value_1): Likewise.
(rtx_ok_for_offset_p): Likewise.
(visium_legitimize_reload_address): Likewise.
(visium_split_double_move): Likewise.
(visium_expand_copysign): Likewise.
(visium_expand_int_cstore): Likewise.
(visium_expand_fp_cstore): Likewise.
(visium_split_cstore): Likewise.
(visium_select_cc_mode): Likewise.
(visium_split_cbranch): Likewise.
(output_cbranch): Likewise.
(visium_print_operand_address): Likewise.
* expmed.c (flip_storage_order): Likewise.
* expmed.h (emit_cstore): Likewise.
(flip_storage_order): Likewise.
* genrecog.c (validate_pattern): Likewise.
* hsa-gen.c (gen_hsa_addr): Likewise.
* internal-fn.c (expand_arith_overflow): Likewise.
* ira-color.c (allocno_copy_cost_saving): Likewise.
* lra-assigns.c (find_hard_regno_for_1): Likewise.
* lra-constraints.c (prohibited_class_reg_set_mode_p): Likewise.
(process_invariant_for_inheritance): Likewise.
* lra-eliminations.c (move_plus_up): Likewise.
* omp-low.c (lower_oacc_reductions): Likewise.
* simplify-rtx.c (simplify_subreg): Likewise.
* target.def (TARGET_SETUP_INCOMING_VARARG_BOUNDS): Likewise.
(TARGET_CHKP_BOUND_MODE): Likewise..
* targhooks.c (default_chkp_bound_mode): Likewise.
(default_setup_incoming_vararg_bounds): Likewise.
* targhooks.h (default_chkp_bound_mode): Likewise.
(default_setup_incoming_vararg_bounds): Likewise.
* tree-ssa-math-opts.c (divmod_candidate_p): Likewise.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Likewise.
(have_whole_vector_shift): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
* doc/tm.texi: Regenerate.
gcc/brig/
* brig-c.h (brig_type_for_mode): Remove "enum" before "machine_mode".
* brig-lang.c (brig_langhook_type_for_mode): Likewise.
gcc/jit/
* dummy-frontend.c (jit_langhook_type_for_mode): Remove "enum" before
"machine_mode".
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r250003
|
|
* cfgloop.h (struct loop): Add comment. New field orig_loop_num.
* cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
* internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
* internal-fn.def (LOOP_DIST_ALIAS): New.
* tree-vectorizer.c (fold_loop_vectorized_call): Rename to ...
(fold_loop_internal_call): ... this.
(vect_loop_dist_alias_call): New function.
(set_uid_loop_bbs): Call fold_loop_internal_call.
(vectorize_loops): Fold IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS
internal calls.
From-SVN: r249983
|
|
* asan.c (asan_emit_stack_protection): Update.
(create_cond_insert_point): Update.
* auto-profile.c (afdo_propagate_circuit): Update.
* basic-block.h (struct edge_def): Turn probability to
profile_probability.
(EDGE_FREQUENCY): Update.
* bb-reorder.c (find_traces_1_round): Update.
(better_edge_p): Update.
(sanitize_hot_paths): Update.
* cfg.c (unchecked_make_edge): Initialize probability to uninitialized.
(make_single_succ_edge): Update.
(check_bb_profile): Update.
(dump_edge_info): Update.
(update_bb_profile_for_threading): Update.
* cfganal.c (connect_infinite_loops_to_exit): Initialize new edge
probabilitycount to 0.
* cfgbuild.c (compute_outgoing_frequencies): Update.
* cfgcleanup.c (try_forward_edges): Update.
(outgoing_edges_match): Update.
(try_crossjump_to_edge): Update.
* cfgexpand.c (expand_gimple_cond): Update make_single_succ_edge.
(expand_gimple_tailcall): Update.
(construct_init_block): Use make_single_succ_edge.
(construct_exit_block): Use make_single_succ_edge.
* cfghooks.c (verify_flow_info): Update.
(redirect_edge_succ_nodup): Update.
(split_edge): Update.
(account_profile_record): Update.
* cfgloopanal.c (single_likely_exit): Update.
* cfgloopmanip.c (scale_loop_profile): Update.
(set_zero_probability): Remove.
(duplicate_loop_to_header_edge): Update.
* cfgloopmanip.h (loop_version): Update prototype.
* cfgrtl.c (try_redirect_by_replacing_jump): Update.
(force_nonfallthru_and_redirect): Update.
(update_br_prob_note): Update.
(rtl_verify_edges): Update.
(purge_dead_edges): Update.
(rtl_lv_add_condition_to_bb): Update.
* cgraph.c: (cgraph_edge::redirect_call_stmt_to_calle): Update.
* cgraphunit.c (init_lowered_empty_function): Update.
(cgraph_node::expand_thunk): Update.
* cilk-common.c: Include profile-count.h
* dojump.c (inv): Remove.
(jumpifnot): Update.
(jumpifnot_1): Update.
(do_jump_1): Update.
(do_jump): Update.
(do_jump_by_parts_greater_rtx): Update.
(do_compare_rtx_and_jump): Update.
* dojump.h (jumpifnot, jumpifnot_1, jumpif_1, jumpif, do_jump,
do_jump_1. do_compare_rtx_and_jump): Update prototype.
* dwarf2cfi.c: Include profile-count.h
* except.c (dw2_build_landing_pads): Use make_single_succ_edge.
(sjlj_emit_dispatch_table): Likewise.
* explow.c: Include profile-count.h
* expmed.c (emit_store_flag_force): Update.
(do_cmp_and_jump): Update.
* expr.c (compare_by_pieces_d::generate): Update.
(compare_by_pieces_d::finish_mode): Update.
(emit_block_move_via_loop): Update.
(store_expr_with_bounds): Update.
(store_constructor): Update.
(expand_expr_real_2): Update.
(expand_expr_real_1): Update.
* expr.h (try_casesi, try_tablejump): Update prototypes.
* gimple-pretty-print.c (dump_probability): Update.
(dump_profile): New.
(dump_gimple_label): Update.
(dump_gimple_bb_header): Update.
* graph.c (draw_cfg_node_succ_edges): Update.
* hsa-gen.c (convert_switch_statements): Update.
* ifcvt.c (cheap_bb_rtx_cost_p): Update.
(find_if_case_1): Update.
(find_if_case_2): Update.
* internal-fn.c (expand_arith_overflow_result_store): Update.
(expand_addsub_overflow): Update.
(expand_neg_overflow): Update.
(expand_mul_overflow): Update.
(expand_vector_ubsan_overflow): Update.
* ipa-cp.c (good_cloning_opportunity_p): Update.
* ipa-split.c (split_function): Use make_single_succ_edge.
* ipa-utils.c (ipa_merge_profiles): Update.
* loop-doloop.c (add_test): Update.
(doloop_modify): Update.
* loop-unroll.c (compare_and_jump_seq): Update.
(unroll_loop_runtime_iterations): Update.
* lra-constraints.c (lra_inheritance): Update.
* lto-streamer-in.c (input_cfg): Update.
* lto-streamer-out.c (output_cfg): Update.
* mcf.c (adjust_cfg_counts): Update.
* modulo-sched.c (sms_schedule): Update.
* omp-expand.c (expand_omp_for_init_counts): Update.
(extract_omp_for_update_vars): Update.
(expand_omp_ordered_sink): Update.
(expand_omp_for_ordered_loops): Update.
(expand_omp_for_generic): Update.
(expand_omp_for_static_nochunk): Update.
(expand_omp_for_static_chunk): Update.
(expand_cilk_for): Update.
(expand_omp_simd): Update.
(expand_omp_taskloop_for_outer): Update.
(expand_omp_taskloop_for_inner): Update.
* omp-simd-clone.c (simd_clone_adjust): Update.
* optabs.c (expand_doubleword_shift): Update.
(expand_abs): Update.
(emit_cmp_and_jump_insn_1): Update.
(expand_compare_and_swap_loop): Update.
* optabs.h (emit_cmp_and_jump_insns): Update prototype.
* predict.c (predictable_edge_p): Update.
(edge_probability_reliable_p): Update.
(set_even_probabilities): Update.
(combine_predictions_for_insn): Update.
(combine_predictions_for_bb): Update.
(propagate_freq): Update.
(estimate_bb_frequencies): Update.
(force_edge_cold): Update.
* profile-count.c (profile_count::dump): Add missing space into dump.
(profile_count::debug): Add newline.
(profile_count::differs_from_p): Explicitly convert to unsigned.
(profile_count::stream_in): Update.
(profile_probability::dump): New member function.
(profile_probability::debug): New member function.
(profile_probability::differs_from_p): New member function.
(profile_probability::differs_lot_from_p): New member function.
(profile_probability::stream_in): New member function.
(profile_probability::stream_out): New member function.
* profile-count.h (profile_count_quality): Rename to ...
(profile_quality): ... this one.
(profile_probability): New.
(profile_count): Update.
* profile.c (compute_branch_probabilities): Update.
* recog.c (peep2_attempt): Update.
* sched-ebb.c (schedule_ebbs): Update.
* sched-rgn.c (find_single_block_region): Update.
(compute_dom_prob_ps): Update.
(schedule_region): Update.
* sel-sched-ir.c (compute_succs_info): Update.
* stmt.c (struct case_node): Update.
(do_jump_if_equal): Update.
(get_outgoing_edge_probs): Update.
(conditional_probability): Update.
(emit_case_dispatch_table): Update.
(expand_case): Update.
(expand_sjlj_dispatch_table): Update.
(emit_case_nodes): Update.
* targhooks.c: Update.
* tracer.c (better_p): Update.
(find_best_successor): Update.
* trans-mem.c (expand_transaction): Update.
* tree-call-cdce.c: Update.
* tree-cfg.c (gimple_split_edge): Upate.
(move_sese_region_to_fn): Upate.
* tree-cfgcleanup.c (cleanup_control_expr_graph): Upate.
* tree-eh.c (lower_resx): Upate.
(cleanup_empty_eh_move_lp): Upate.
* tree-if-conv.c (version_loop_for_if_conversion): Update.
* tree-inline.c (copy_edges_for_bb): Update.
(copy_cfg_body): Update.
* tree-parloops.c (gen_parallel_loop): Update.
* tree-profile.c (gimple_gen_ic_func_profiler): Update.
(gimple_gen_time_profiler): Update.
* tree-ssa-dce.c (remove_dead_stmt): Update.
* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update.
* tree-ssa-loop-im.c (execute_sm_if_changed): Update.
* tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Update.
(unloop_loops): Update.
(try_peel_loop): Update.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
* tree-ssa-loop-split.c (connect_loops): Update.
(split_loop): Update.
* tree-ssa-loop-unswitch.c (tree_unswitch_loop): Update.
(hoist_guard): Update.
* tree-ssa-phionlycprop.c (propagate_rhs_into_lhs): Update.
* tree-ssa-phiopt.c (replace_phi_edge_with_variable): Update.
(value_replacement): Update.
* tree-ssa-reassoc.c (branch_fixup): Update.
* tree-ssa-tail-merge.c (replace_block_by): Update.
* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Update.
(create_edge_and_update_destination_phis): Update.
(compute_path_counts): Update.
(recompute_probabilities): Update.
(update_joiner_offpath_counts): Update.
(freqs_to_counts_path): Update.
(duplicate_thread_path): Update.
* tree-switch-conversion.c (hoist_edge_and_branch_if_true): Update.
(struct switch_conv_info): Update.
(gen_inbound_check): Update.
* tree-vect-loop-manip.c (slpeel_add_loop_guard): Update.
(vect_do_peeling): Update.
(vect_loop_versioning): Update.
* tree-vect-loop.c (scale_profile_for_vect_loop): Update.
(optimize_mask_stores): Update.
* ubsan.c (ubsan_expand_null_ifn): Update.
* value-prof.c (gimple_divmod_fixed_value): Update.
(gimple_divmod_fixed_value_transform): Update.
(gimple_mod_pow2): Update.
(gimple_mod_pow2_value_transform): Update.
(gimple_mod_subtract): Update.
(gimple_mod_subtract_transform): Update.
(gimple_ic): Update.
(gimple_stringop_fixed_value): Update.
(gimple_stringops_transform): Update.
* value-prof.h: Update.
From-SVN: r249800
|
|
* config/nvptx/nvptx-protos.h (nvptx_output_simt_enter): Declare.
(nvptx_output_simt_exit): Declare.
* config/nvptx/nvptx.c (nvptx_init_unisimt_predicate): Use
cfun->machine->unisimt_location. Handle NULL unisimt_predicate.
(init_softstack_frame): Move initialization of crtl->is_leaf to...
(nvptx_declare_function_name): ...here. Emit declaration of local
memory space buffer for omp_simt_enter insn.
(nvptx_output_unisimt_switch): New.
(nvptx_output_softstack_switch): New.
(nvptx_output_simt_enter): New.
(nvptx_output_simt_exit): New.
* config/nvptx/nvptx.h (struct machine_function): New fields
has_simtreg, unisimt_location, simt_stack_size, simt_stack_align.
* config/nvptx/nvptx.md (UNSPECV_SIMT_ENTER): New unspec.
(UNSPECV_SIMT_EXIT): Ditto.
(omp_simt_enter_insn): New insn.
(omp_simt_enter): New expansion.
(omp_simt_exit): New insn.
* config/nvptx/nvptx.opt (msoft-stack-reserve-local): New option.
* internal-fn.c (expand_GOMP_SIMT_ENTER): New.
(expand_GOMP_SIMT_ENTER_ALLOC): New.
(expand_GOMP_SIMT_EXIT): New.
* internal-fn.def (GOMP_SIMT_ENTER): New internal function.
(GOMP_SIMT_ENTER_ALLOC): Ditto.
(GOMP_SIMT_EXIT): Ditto.
* target-insns.def (omp_simt_enter): New insn.
(omp_simt_exit): Ditto.
* omp-low.c (struct omplow_simd_context): New fields simt_eargs,
simt_dlist.
(lower_rec_simd_input_clauses): Implement SIMT privatization.
(lower_rec_input_clauses): Likewise.
(lower_lastprivate_clauses): Handle SIMT privatization.
* omp-offload.c: Include langhooks.h, tree-nested.h, stor-layout.h.
(ompdevlow_adjust_simt_enter): New.
(find_simtpriv_var_op): New.
(execute_omp_device_lower): Handle IFN_GOMP_SIMT_ENTER,
IFN_GOMP_SIMT_ENTER_ALLOC, IFN_GOMP_SIMT_EXIT.
* tree-inline.h (struct copy_body_data): New field dst_simt_vars.
* tree-inline.c (expand_call_inline): Handle SIMT privatization.
(copy_decl_for_dup_finish): Ditto.
* tree-ssa.c (execute_update_addresses_taken): Handle GOMP_SIMT_ENTER.
From-SVN: r246550
|
|
config/s390/s390.c:7909)
PR sanitizer/79904
* internal-fn.c (expand_vector_ubsan_overflow): If arg0 or arg1
is a uniform vector, use uniform_vector_p return value instead of
building ARRAY_REF on folded VIEW_CONVERT_EXPR to array type.
* gcc.dg/ubsan/pr79904.c: New test.
From-SVN: r245967
|
|
PR middle-end/79665
* internal-fn.c (get_range_pos_neg): Moved to ...
* tree.c (get_range_pos_neg): ... here. No longer static.
* tree.h (get_range_pos_neg): New prototype.
* expr.c (expand_expr_real_2) <case TRUNC_DIV_EXPR>: If both arguments
are known to be in between 0 and signed maximum inclusive, try to
expand both unsigned and signed divmod and use the cheaper one from
those.
From-SVN: r245676
|
|
64-bit BE targets)
PR middle-end/79454
* internal-fn.c (expand_vector_ubsan_overflow): Use piece-wise
result computation whenever lhs doesn't have vector mode, not
just when it has BLKmode.
From-SVN: r245354
|
|
2017-02-09 Nathan Sidwell <nathan@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
Joseph Myers <joseph@codesourcery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/
* gimplify.c (gimplify_scan_omp_clauses): No special handling for
OMP_CLAUSE_TILE.
(gimplify_adjust_omp_clauses): Don't delete TILE.
(gimplify_omp_for): Deal with TILE.
* internal-fn.c (expand_GOACC_TILE): New function.
* internal-fn.def (GOACC_DIM_POS): Comment may be overly conservative.
(GOACC_TILE): New.
* omp-expand.c (struct oacc_collapse): Add tile and outer fields.
(expand_oacc_collapse_init): Add LOC paramter. Initialize tile
element fields.
(expand_oacc_collapse_vars): Add INNER parm, adjust for tiling,
avoid DIV for outermost collapse var.
(expand_oacc_for): Insert tile element loop as needed. Adjust.
Remove out of date comments, fix whitespace.
* omp-general.c (omp_extract_for_data): Deal with tiling.
* omp-general.h (enum oacc_loop_flags): Add OLF_TILE flag,
adjust OLF_DIM_BASE value.
(struct omp_for_data): Add tiling field.
* omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_TILE.
(lower_oacc_head_mark): Add OLF_TILE as appropriate. Ensure 2 levels
for auto loops. Remove default auto determining, moved to
oacc_loop_fixed_partitions.
* omp-offload.c (struct oacc_loop): Change 'ifns' to vector of call
stmts, add e_mask field.
(oacc_dim_call): New function, abstracted out from oacc_thread_numbers.
(oacc_thread_numbers): Use oacc_dim_call.
(oacc_xform_tile): New.
(new_oacc_loop_raw): Initialize e_mask, adjust for ifns vector.
(finish_oacc_loop): Adjust for ifns vector.
(oacc_loop_discover_walk): Append loop abstraction sites to list,
add case for GOACC_TILE fns.
(oacc_loop_xform_loop): Delete.
(oacc_loop_process): Iterate over call list directly, and add
handling for GOACC_TILE fns.
(oacc_loop_fixed_partitions): Determine default auto, deal with TILE,
dump partitioning.
(oacc_loop_auto_partitions): Add outer_assign parm. Assign all but
vector partitioning to outer loops. Assign 2 partitions to loops
when available. Add TILE handling.
(oacc_loop_partition): Adjust oacc_loop_auto_partitions call.
(execite_oacc_device_lower): Process GOACC_TILE fns, ignore unknown specs.
* tree-nested.c (convert_nonlocal_omp_clauses): Allow OMP_CLAUSE_TILE.
* tree.c (omp_clause_num_ops): Adjust TILE ops.
* tree.h (OMP_CLAUSE_TILE_ITERVAR, OMP_CLAUSE_TILE_COUNT): New.
gcc/c/
* c-parser.c (c_parser_omp_clause_collapse): Disallow tile.
(c_parser_oacc_clause_tile): Disallow collapse. Fix parsing and
semantic checking.
* c-parser.c (c_parser_omp_for_loop): Accept tiling constructs.
gcc/cp/
* parser.c (cp_parser_oacc_clause_tile): Disallow collapse. Fix
parsing. Parse constant expression. Remove semantic checking.
(cp_parser_omp_clause_collapse): Disallow tile.
(cp_parser_omp_for_loop): Deal with tile clause. Don't emit a parse
error about missing for after already emitting one. Use more
conventional for idiom for unbounded loop.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE.
* semantics.c (finish_omp_clauses): Correct TILE semantic check.
(finish_omp_for): Deal with tile clause.
gcc/fortran/
* openmp.c (resolve_omp_clauses): Error on directives
containing both tile and collapse clauses.
(resolve_oacc_loop_blocks): Represent '*' tile arguments as zero.
* trans-openmp.c (gfc_trans_omp_do): Lower tiled loops like
collapsed loops.
gcc/testsuite/
* c-c++-common/goacc/combined-directives.c: Remove xfail.
* c-c++-common/goacc/loop-auto-1.c: Adjust and add additional case.
* c-c++-common/goacc/loop-auto-2.c: New.
* c-c++-common/goacc/tile.c: Include stdbool, fix expected errors.
* c-c++-common/goacc/tile-2.c: New.
* g++.dg/goacc/template.C: Test tile subst. Adjust erroneous uses.
* g++.dg/goacc/tile-1.C: New, check tile subst.
* gcc.dg/goacc/loop-processing-1.c: Adjust dg-final pattern.
* gfortran.dg/goacc/combined-directives.f90: Remove xfail.
* gfortran.dg/goacc/tile-1.f90: New test.
* gfortran.dg/goacc/tile-2.f90: New test.
* gfortran.dg/goacc/tile-lowering.f95: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/tile-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust and
add additional case.
* testsuite/libgomp.oacc-c-c++-common/vprop.c: XFAIL under
"openacc_nvidia_accel_selected".
* libgomp.oacc-fortran/nested-function-1.f90 (test2):
Add num_workers(8) clause.
From-SVN: r245300
|
|
2017-01-23 Martin Liska <mliska@suse.cz>
* gcc.dg/asan/use-after-scope-10.c: New test.
* gcc.dg/asan/use-after-scope-11.c: New test.
* g++.dg/asan/use-after-scope-5.C: New test.
2017-01-23 Jakub Jelinek <jakub@redhat.com>
Martin Liska <mliska@suse.cz>
* asan.h: Define ASAN_USE_AFTER_SCOPE_ATTRIBUTE.
* asan.c (asan_expand_poison_ifn): Support stores and use
appropriate ASAN report function.
* internal-fn.c (expand_ASAN_POISON_USE): New function.
* internal-fn.def (ASAN_POISON_USE): Declare.
* tree-into-ssa.c (maybe_add_asan_poison_write): New function.
(maybe_register_def): Create ASAN_POISON_USE when sanitizing.
* tree-ssa-dce.c (eliminate_unnecessary_stmts): Remove
ASAN_POISON calls w/o LHS.
* tree-ssa.c (execute_update_addresses_taken): Create clobber
for ASAN_MARK (UNPOISON, &x, ...) in order to prevent usage of a LHS
from ASAN_MARK (POISON, &x, ...) coming to a PHI node.
* gimplify.c (asan_poison_variables): Add attribute
use_after_scope_memory to variables that really needs to live
in memory.
* tree-ssa.c (is_asan_mark_p): Do not rewrite into SSA when
having the attribute.
From-SVN: r244793
|
|
2017-01-23 Martin Liska <mliska@suse.cz>
* asan.c (create_asan_shadow_var): New function.
(asan_expand_poison_ifn): Likewise.
* asan.h (asan_expand_poison_ifn): New declaration.
* internal-fn.c (expand_ASAN_POISON): Likewise.
* internal-fn.def (ASAN_POISON): New builtin.
* sanopt.c (pass_sanopt::execute): Expand
asan_expand_poison_ifn.
* tree-inline.c (copy_decl_for_dup_finish): Make function
external.
* tree-inline.h (copy_decl_for_dup_finish): Likewise.
* tree-ssa.c (is_asan_mark_p): New function.
(execute_update_addresses_taken): Rewrite local variables
(identified just by use-after-scope as addressable) into SSA.
2017-01-23 Martin Liska <mliska@suse.cz>
* gcc.dg/asan/use-after-scope-3.c: Add additional flags.
* gcc.dg/asan/use-after-scope-9.c: Likewise and grep for
sanopt optimization for ASAN_POISON.
From-SVN: r244791
|