Age | Commit message (Collapse) | Author | Files | Lines |
|
[PR95853]
The following patch recognizes some further forms of additions with overflow
checks as shown in the testcase, in particular where the unsigned addition is
performed in a wider mode just to catch overflow with a > narrower_utype_max
check.
2020-11-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95853
* tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add maxval
argument, if non-NULL, instead look for r > maxval or r <= maxval
comparisons.
(match_uaddsub_overflow): Pattern recognize even other forms of
__builtin_add_overflow, in particular when addition is performed
in a wider type and result compared to maximum of the narrower
type.
* gcc.dg/pr95853.c: New test.
|
|
On platforms in which Aux_[Real_Type] involves non-NOP conversions
(e.g., between single- and double-precision, or between short float
and float), the conversions before the calls are CSEd too late for
sincos to combine calls.
This patch enables the sincos pass to CSE type casts used as arguments
to eligible calls before looking for other calls using the same
operand.
for gcc/ChangeLog
* tree-ssa-math-opts.c (sincos_stats): Add conv_removed.
(execute_cse_conv_1): New.
(execute_cse_sincos_1): Call it. Fix return within
FOR_EACH_IMM_USE_STMT.
(pass_cse_sincos::execute): Report conv_inserted.
for gcc/testsuite/ChangeLog
* gnat.dg/sin_cos.ads: New.
* gnat.dg/sin_cos.adb: New.
* gcc.dg/sin_cos.c: New.
|
|
This is a first step towards enabling the sincos optimization in Ada.
The issue this patch solves is that sincos takes the type to be looked
up with mathfn_built_in from variables or temporaries passed as
arguments to SIN and COS intrinsics. In Ada, different float types
may be used but, despite their representation equivalence, their
distinctness causes the optimization to be skipped, because they are
not the types that mathfn_built_in expects.
This patch introduces a function that maps intrinsics to the type
they're associated with, and uses that type, obtained from the
intrinsics used in calls to be optimized, to look up the correspoding
CEXPI intrinsic.
For the sake of defensive programming, when using the type obtained
from the intrinsic, it now checks that, if different types are found
for the used argument, or for other calls that use it, that the types
are interchangeable.
for gcc/ChangeLog
* builtins.c (mathfn_built_in_type): New.
* builtins.h (mathfn_built_in_type): Declare.
* tree-ssa-math-opts.c (execute_cse_sincos_1): Use it to
obtain the type expected by the intrinsic.
|
|
As written in the comment, tree-ssa-math-opts.c wouldn't create a DIVMOD
ifn call for division + modulo by constant for the fear that during
expansion we could generate better code for those cases.
If the divisoris a power of two, that is certainly the case always,
but otherwise expand_divmod can punt in many cases, e.g. if the division
type's precision is above HOST_BITS_PER_WIDE_INT, we don't even call
choose_multiplier, because it works on HOST_WIDE_INTs (true, something
we should fix eventually now that we have wide_ints), or if pre/post shift
is larger than BITS_PER_WORD.
So, the following patch recognizes DIVMOD with constant last argument even
when it is unclear if expand_divmod will be able to optimize it, and then
during DIVMOD expansion if the divisor is constant attempts to expand it as
division + modulo and if they actually don't contain any libcalls or
division/modulo, they are kept as is, otherwise that sequence is thrown away
and divmod optab or libcall is used.
2020-10-06 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/97282
* tree-ssa-math-opts.c (divmod_candidate_p): Don't return false for
constant op2 if it is not a power of two and the type has precision
larger than HOST_BITS_PER_WIDE_INT or BITS_PER_WORD.
* internal-fn.c (contains_call_div_mod): New function.
(expand_DIVMOD): If last argument is a constant, try to expand it as
TRUNC_DIV_EXPR followed by TRUNC_MOD_EXPR, but if the sequence
contains any calls or {,U}{DIV,MOD} rtxes, throw it away and use
divmod optab or divmod libfunc.
* gcc.target/i386/pr97282.c: New test.
|
|
GCC has a target hook TARGET_LIBC_HAS_FUNCTION, which tells the compiler
which functions it can expect to be present in libc.
The default target hook does not include the sincos functions.
The nvptx port of newlib does include sincos and sincosf, but not sincosl.
The target hook TARGET_LIBC_HAS_FUNCTION does not distinguish between sincos,
sincosf and sincosl, so if we enable it for the sincos functions, then for
test.c:
...
long double x, a, b;
int main (void) {
x = 0.5;
a = sinl (x);
b = cosl (x);
printf ("a: %f\n", (double)a);
printf ("b: %f\n", (double)b);
return 0;
}
...
we introduce a regression:
...
$ gcc test.c -lm -O2
unresolved symbol sincosl
collect2: error: ld returned 1 exit status
...
Add a type argument to target hook TARGET_LIBC_HAS_FUNCTION_TYPE, and use it
in nvptx_libc_has_function_type to enable sincos and sincosf, but not sincosl.
Build and reg-tested on x86_64-linux.
Build and tested on nvptx.
gcc/ChangeLog:
2020-09-28 Tobias Burnus <tobias@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* builtins.c (expand_builtin_cexpi, fold_builtin_sincos): Update
targetm.libc_has_function call.
* builtins.def (DEF_C94_BUILTIN, DEF_C99_BUILTIN, DEF_C11_BUILTIN):
(DEF_C2X_BUILTIN, DEF_C99_COMPL_BUILTIN, DEF_C99_C90RES_BUILTIN):
Same.
* config/darwin-protos.h (darwin_libc_has_function): Update prototype.
* config/darwin.c (darwin_libc_has_function): Add arg.
* config/linux-protos.h (linux_libc_has_function): Update prototype.
* config/linux.c (linux_libc_has_function): Add arg.
* config/i386/i386.c (ix86_libc_has_function): Update
targetm.libc_has_function call.
* config/nvptx/nvptx.c (nvptx_libc_has_function): New function.
(TARGET_LIBC_HAS_FUNCTION): Redefine to nvptx_libc_has_function.
* convert.c (convert_to_integer_1): Update targetm.libc_has_function
call.
* match.pd: Same.
* target.def (libc_has_function): Add arg.
* doc/tm.texi: Regenerate.
* targhooks.c (default_libc_has_function, gnu_libc_has_function)
(no_c99_libc_has_function): Add arg.
* targhooks.h (default_libc_has_function, no_c99_libc_has_function)
(gnu_libc_has_function): Update prototype.
* tree-ssa-math-opts.c (pass_cse_sincos::execute): Update
targetm.libc_has_function call.
gcc/fortran/ChangeLog:
2020-09-30 Tom de Vries <tdevries@suse.de>
* f95-lang.c (gfc_init_builtin_functions): Update
targetm.libc_has_function call.
|
|
gcc/ChangeLog:
* tree-ssa-ccp.c (gsi_prev_dom_bb_nondebug): Use gsi_bb
instead of gimple_stmt_iterator::bb.
* tree-ssa-math-opts.c (insert_reciprocals): Likewise.
* tree-vectorizer.h: Likewise.
|
|
This adds an example how to use new/delete operators to pool
allocated objects.
2020-06-04 Jonathan Wakely <jwakely@redhat.com>
* alloc-pool.h (object_allocator::remove_raw): New.
* tree-ssa-math-opts.c (struct occurrence): Use NSMDI.
(occurrence::occurrence): Add.
(occurrence::~occurrence): Likewise.
(occurrence::new): Likewise.
(occurrence::delete): Likewise.
(occ_new): Remove.
(insert_bb): Use new occurence (...) instead of occ_new.
(register_division_in): Likewise.
(free_bb): Use delete occ instead of manually removing
from the pool.
|
|
match.pd already has simplifications for negation of a FMA (FMS, FNMA, FNMS)
call if it is single use, but when the widening_mul pass discovers FMAs,
nothing folds the statements anymore.
So, the following patch adjusts the widening_mul pass to handle that.
I had to adjust quite a lot of tests, because they have in them nested FMAs
(one FMA feeding another one) and the patch results in some (equivalent) changes
in the chosen instructions, previously the negation of one FMA's result
would result in the dependent FMA being adjusted for the negation, but now
instead the first FMA is adjusted.
2020-05-13 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95060
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
if it is the single use of the FMA internal builtin.
* gcc.target/i386/avx512f-pr95060.c: New test.
* gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
* gcc.target/i386/fma_double_2.c: Likewise.
* gcc.target/i386/fma_double_3.c: Likewise.
* gcc.target/i386/fma_double_4.c: Likewise.
* gcc.target/i386/fma_double_5.c: Likewise.
* gcc.target/i386/fma_double_6.c: Likewise.
* gcc.target/i386/fma_float_1.c: Likewise.
* gcc.target/i386/fma_float_2.c: Likewise.
* gcc.target/i386/fma_float_3.c: Likewise.
* gcc.target/i386/fma_float_4.c: Likewise.
* gcc.target/i386/fma_float_5.c: Likewise.
* gcc.target/i386/fma_float_6.c: Likewise.
* gcc.target/i386/l_fma_double_1.c: Likewise.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.
|
|
convert plusminus to widen
In the testcase for PR94269, widening_mul moves two multiply
instructions from outside the loop to inside
the loop, merging with two add instructions separately. This
increases the cost of the loop. Like FMA detection
in the same pass, simply restrict ops to be defined in the same
basic-block to avoid possibly moving multiply
to a different block with a higher execution frequency.
2020-03-26 Felix Yang <felix.yang@huawei.com>
PR tree-optimization/94269
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Restrict
this
operation to single basic block.
* gcc.dg/pr94269.c: New test.
|
|
In the r10-7197-gbae7b38cf8a21e068ad5c0bab089dedb78af3346 commit I've
noticed duplicated word in a message, which lead me to grep for those and
we have a tons of them.
I've used
grep -v 'long long\|optab optab\|template template\|double double' *.[chS] */*.[chS] *.def config/*/* 2>/dev/null | grep ' \([a-zA-Z]\+\) \1 '
Note, the command will not detect the doubled words at the start or end of
line or when one of the words is at the end of line and the next one at the
start of another one.
Some of it is fairly obvious, e.g. all the "the the" cases which is
something I've posted and committed patch for already e.g. in 2016,
other cases are often valid, e.g. "that that" seems to look mostly ok to me.
Some cases are quite hard to figure out, I've left out some of them from the
patch (e.g. "and and" in some cases isn't talking about bitwise/logical and
and so looks incorrect, but in other cases it is talking about those
operations).
In most cases the right solution seems to be to remove one of the duplicated
words, but not always.
I think most important are the ones with user visible messages (in the patch
3 of the first 4 hunks), the rest is just comments (and internal
documentation; for that see the doc/tm.texi changes).
2020-03-17 Jakub Jelinek <jakub@redhat.com>
* lra-spills.c (remove_pseudos): Fix up duplicated word issue in
a dump message.
* tree-sra.c (create_access_replacement): Fix up duplicated word issue
in a comment.
* read-rtl-function.c (find_param_by_name,
function_reader::parse_enum_value, function_reader::get_insn_by_uid):
Likewise.
* spellcheck.c (get_edit_distance_cutoff): Likewise.
* tree-data-ref.c (create_ifn_alias_checks): Likewise.
* tree.def (SWITCH_EXPR): Likewise.
* selftest.c (assert_str_contains): Likewise.
* ipa-param-manipulation.h (class ipa_param_body_adjustments):
Likewise.
* tree-ssa-math-opts.c (convert_expand_mult_copysign): Likewise.
* tree-ssa-loop-split.c (find_vdef_in_loop): Likewise.
* langhooks.h (struct lang_hooks_for_decls): Likewise.
* ipa-prop.h (struct ipa_param_descriptor): Likewise.
* tree-ssa-strlen.c (handle_builtin_string_cmp, handle_store):
Likewise.
* tree-ssa-dom.c (simplify_stmt_for_jump_threading): Likewise.
* tree-ssa-reassoc.c (reassociate_bb): Likewise.
* tree.c (component_ref_size): Likewise.
* hsa-common.c (hsa_init_compilation_unit_data): Likewise.
* gimple-ssa-sprintf.c (get_string_length, format_string,
format_directive): Likewise.
* omp-grid.c (grid_process_kernel_body_copy): Likewise.
* input.c (string_concat_db::get_string_concatenation,
test_lexer_string_locations_ucn4): Likewise.
* cfgexpand.c (pass_expand::execute): Likewise.
* gimple-ssa-warn-restrict.c (builtin_memref::offset_out_of_bounds,
maybe_diag_overlap): Likewise.
* rtl.c (RTX_CODE_HWINT_P_1): Likewise.
* shrink-wrap.c (spread_components): Likewise.
* tree-ssa-dse.c (initialize_ao_ref_for_dse, valid_ao_ref_for_dse):
Likewise.
* tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds):
Likewise.
* dwarf2out.c (dwarf2out_early_finish): Likewise.
* gimple-ssa-store-merging.c: Likewise.
* ira-costs.c (record_operand_costs): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
* target.def (dispatch): Likewise.
(validate_dims, gen_ccmp_first): Fix up duplicated word issue
in documentation text.
* doc/tm.texi: Regenerated.
* config/i386/x86-tune.def (X86_TUNE_PARTIAL_FLAG_REG_STALL): Fix up
duplicated word issue in a comment.
* config/i386/i386.c (ix86_test_loading_unspec): Likewise.
* config/i386/i386-features.c (remove_partial_avx_dependency):
Likewise.
* config/msp430/msp430.c (msp430_select_section): Likewise.
* config/gcn/gcn-run.c (load_image): Likewise.
* config/aarch64/aarch64-sve.md (sve_ld1r<mode>): Likewise.
* config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): Likewise.
* config/aarch64/falkor-tag-collision-avoidance.c
(single_dest_per_chain): Likewise.
* config/nvptx/nvptx.c (nvptx_record_fndecl): Likewise.
* config/fr30/fr30.c (fr30_arg_partial_bytes): Likewise.
* config/rs6000/rs6000-string.c (expand_cmp_vec_sequence): Likewise.
* config/rs6000/rs6000-p8swap.c (replace_swapped_load_constant):
Likewise.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise.
* config/rs6000/rs6000-logue.c
(rs6000_emit_probe_stack_range_stack_clash): Likewise.
* config/nds32/nds32-md-auxiliary.c (nds32_split_ashiftdi3): Likewise.
Fix various other issues in the comment.
c-family/
* c-common.c (resolve_overloaded_builtin): Fix up duplicated word
issue in a diagnostic message.
cp/
* pt.c (tsubst): Fix up duplicated word issue in a diagnostic message.
(lookup_template_class_1, tsubst_expr): Fix up duplicated word issue
in a comment.
* parser.c (cp_parser_statement, cp_parser_linkage_specification,
cp_parser_placeholder_type_specifier,
cp_parser_constraint_requires_parens): Likewise.
* name-lookup.c (suggest_alternative_in_explicit_scope): Likewise.
fortran/
* array.c (gfc_check_iter_variable): Fix up duplicated word issue
in a comment.
* arith.c (gfc_arith_concat): Likewise.
* resolve.c (gfc_resolve_ref): Likewise.
* frontend-passes.c (matmul_lhs_realloc): Likewise.
* module.c (gfc_match_submodule, load_needed): Likewise.
* trans-expr.c (gfc_init_se): Likewise.
|
|
From-SVN: r279813
|
|
2019-11-12 Martin Liska <mliska@suse.cz>
* Makefile.in: Remove PARAMS_H and params.list
and params.options.
* params-enum.h: Remove.
* params-list.h: Remove.
* params-options.h: Remove.
* params.c: Remove.
* params.def: Remove.
* params.h: Remove.
* asan.c: Do not include params.h.
* auto-profile.c: Likewise.
* bb-reorder.c: Likewise.
* builtins.c: Likewise.
* cfgcleanup.c: Likewise.
* cfgexpand.c: Likewise.
* cfgloopanal.c: Likewise.
* cgraph.c: Likewise.
* combine.c: Likewise.
* common/config/aarch64/aarch64-common.c: Likewise.
* common/config/gcn/gcn-common.c: Likewise.
* common/config/ia64/ia64-common.c: Likewise.
* common/config/powerpcspe/powerpcspe-common.c: Likewise.
* common/config/rs6000/rs6000-common.c: Likewise.
* common/config/sh/sh-common.c: Likewise.
* config/aarch64/aarch64.c: Likewise.
* config/alpha/alpha.c: Likewise.
* config/arm/arm.c: Likewise.
* config/avr/avr.c: Likewise.
* config/csky/csky.c: Likewise.
* config/i386/i386-builtins.c: Likewise.
* config/i386/i386-expand.c: Likewise.
* config/i386/i386-features.c: Likewise.
* config/i386/i386-options.c: Likewise.
* config/i386/i386.c: Likewise.
* config/ia64/ia64.c: Likewise.
* config/rs6000/rs6000-logue.c: Likewise.
* config/rs6000/rs6000.c: Likewise.
* config/s390/s390.c: Likewise.
* config/sparc/sparc.c: Likewise.
* config/visium/visium.c: Likewise.
* coverage.c: Likewise.
* cprop.c: Likewise.
* cse.c: Likewise.
* cselib.c: Likewise.
* dse.c: Likewise.
* emit-rtl.c: Likewise.
* explow.c: Likewise.
* final.c: Likewise.
* fold-const.c: Likewise.
* gcc.c: Likewise.
* gcse.c: Likewise.
* ggc-common.c: Likewise.
* ggc-page.c: Likewise.
* gimple-loop-interchange.cc: Likewise.
* gimple-loop-jam.c: Likewise.
* gimple-loop-versioning.cc: Likewise.
* gimple-ssa-split-paths.c: Likewise.
* gimple-ssa-sprintf.c: Likewise.
* gimple-ssa-store-merging.c: Likewise.
* gimple-ssa-strength-reduction.c: Likewise.
* gimple-ssa-warn-alloca.c: Likewise.
* gimple-ssa-warn-restrict.c: Likewise.
* graphite-isl-ast-to-gimple.c: Likewise.
* graphite-optimize-isl.c: Likewise.
* graphite-scop-detection.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c: Likewise.
* haifa-sched.c: Likewise.
* hsa-gen.c: Likewise.
* ifcvt.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-fnsummary.c: Likewise.
* ipa-inline-analysis.c: Likewise.
* ipa-inline.c: Likewise.
* ipa-polymorphic-call.c: Likewise.
* ipa-profile.c: Likewise.
* ipa-prop.c: Likewise.
* ipa-split.c: Likewise.
* ipa-sra.c: Likewise.
* ira-build.c: Likewise.
* ira-conflicts.c: Likewise.
* loop-doloop.c: Likewise.
* loop-invariant.c: Likewise.
* loop-unroll.c: Likewise.
* lra-assigns.c: Likewise.
* lra-constraints.c: Likewise.
* modulo-sched.c: Likewise.
* opt-suggestions.c: Likewise.
* opts.c: Likewise.
* postreload-gcse.c: Likewise.
* predict.c: Likewise.
* reload.c: Likewise.
* reorg.c: Likewise.
* resource.c: Likewise.
* sanopt.c: Likewise.
* sched-deps.c: Likewise.
* sched-ebb.c: Likewise.
* sched-rgn.c: Likewise.
* sel-sched-ir.c: Likewise.
* sel-sched.c: Likewise.
* shrink-wrap.c: Likewise.
* stmt.c: Likewise.
* targhooks.c: Likewise.
* toplev.c: Likewise.
* tracer.c: Likewise.
* trans-mem.c: Likewise.
* tree-chrec.c: Likewise.
* tree-data-ref.c: Likewise.
* tree-if-conv.c: Likewise.
* tree-inline.c: Likewise.
* tree-loop-distribution.c: Likewise.
* tree-parloops.c: Likewise.
* tree-predcom.c: Likewise.
* tree-profile.c: Likewise.
* tree-scalar-evolution.c: Likewise.
* tree-sra.c: Likewise.
* tree-ssa-ccp.c: Likewise.
* tree-ssa-dom.c: Likewise.
* tree-ssa-dse.c: Likewise.
* tree-ssa-ifcombine.c: Likewise.
* tree-ssa-loop-ch.c: Likewise.
* tree-ssa-loop-im.c: Likewise.
* tree-ssa-loop-ivcanon.c: Likewise.
* tree-ssa-loop-ivopts.c: Likewise.
* tree-ssa-loop-manip.c: Likewise.
* tree-ssa-loop-niter.c: Likewise.
* tree-ssa-loop-prefetch.c: Likewise.
* tree-ssa-loop-unswitch.c: Likewise.
* tree-ssa-math-opts.c: Likewise.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-pre.c: Likewise.
* tree-ssa-reassoc.c: Likewise.
* tree-ssa-sccvn.c: Likewise.
* tree-ssa-scopedtables.c: Likewise.
* tree-ssa-sink.c: Likewise.
* tree-ssa-strlen.c: Likewise.
* tree-ssa-structalias.c: Likewise.
* tree-ssa-tail-merge.c: Likewise.
* tree-ssa-threadbackward.c: Likewise.
* tree-ssa-threadedge.c: Likewise.
* tree-ssa-uninit.c: Likewise.
* tree-switch-conversion.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vrp.c: Likewise.
* tree.c: Likewise.
* value-prof.c: Likewise.
* var-tracking.c: Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* gimple-parser.c: Do not include params.h.
2019-11-12 Martin Liska <mliska@suse.cz>
* name-lookup.c: Do not include params.h.
* typeck.c: Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* lto-common.c: Do not include params.h.
* lto-partition.c: Likewise.
* lto.c: Likewise.
From-SVN: r278086
|
|
2019-11-12 Martin Liska <mliska@suse.cz>
* asan.c (asan_sanitize_stack_p): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
(asan_sanitize_allocas_p): Likewise.
(asan_emit_stack_protection): Likewise.
(asan_protect_global): Likewise.
(instrument_derefs): Likewise.
(instrument_builtin_call): Likewise.
(asan_expand_mark_ifn): Likewise.
* auto-profile.c (auto_profile): Likewise.
* bb-reorder.c (copy_bb_p): Likewise.
(duplicate_computed_gotos): Likewise.
* builtins.c (inline_expand_builtin_string_cmp): Likewise.
* cfgcleanup.c (try_crossjump_to_edge): Likewise.
(try_crossjump_bb): Likewise.
* cfgexpand.c (defer_stack_allocation): Likewise.
(stack_protect_classify_type): Likewise.
(pass_expand::execute): Likewise.
* cfgloopanal.c (expected_loop_iterations_unbounded): Likewise.
(estimate_reg_pressure_cost): Likewise.
* cgraph.c (cgraph_edge::maybe_hot_p): Likewise.
* combine.c (combine_instructions): Likewise.
(record_value_for_reg): Likewise.
* common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise.
(aarch64_option_default_params): Likewise.
* common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise.
* common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise.
* common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise.
* common/config/sh/sh-common.c (sh_option_default_params): Likewise.
* config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise.
(aarch64_allocate_and_probe_stack_space): Likewise.
(aarch64_expand_epilogue): Likewise.
(aarch64_override_options_internal): Likewise.
* config/alpha/alpha.c (alpha_option_override): Likewise.
* config/arm/arm.c (arm_option_override): Likewise.
(arm_valid_target_attribute_p): Likewise.
* config/i386/i386-options.c (ix86_option_override_internal): Likewise.
* config/i386/i386.c (get_probe_interval): Likewise.
(ix86_adjust_stack_and_probe_stack_clash): Likewise.
(ix86_max_noce_ifcvt_seq_cost): Likewise.
* config/ia64/ia64.c (ia64_adjust_cost): Likewise.
* config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise.
(get_stack_clash_protection_guard_size): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise.
* config/s390/s390.c (allocate_stack_space): Likewise.
(s390_emit_prologue): Likewise.
(s390_option_override_internal): Likewise.
* config/sparc/sparc.c (sparc_option_override): Likewise.
* config/visium/visium.c (visium_option_override): Likewise.
* coverage.c (get_coverage_counts): Likewise.
(coverage_compute_profile_id): Likewise.
(coverage_begin_function): Likewise.
(coverage_end_function): Likewise.
* cse.c (cse_find_path): Likewise.
(cse_extended_basic_block): Likewise.
(cse_main): Likewise.
* cselib.c (cselib_invalidate_mem): Likewise.
* dse.c (dse_step1): Likewise.
* emit-rtl.c (set_new_first_and_last_insn): Likewise.
(get_max_insn_count): Likewise.
(make_debug_insn_raw): Likewise.
(init_emit): Likewise.
* explow.c (compute_stack_clash_protection_loop_data): Likewise.
* final.c (compute_alignments): Likewise.
* fold-const.c (fold_range_test): Likewise.
(fold_truth_andor): Likewise.
(tree_single_nonnegative_warnv_p): Likewise.
(integer_valued_real_single_p): Likewise.
* gcse.c (want_to_gcse_p): Likewise.
(prune_insertions_deletions): Likewise.
(hoist_code): Likewise.
(gcse_or_cprop_is_too_expensive): Likewise.
* ggc-common.c: Likewise.
* ggc-page.c (ggc_collect): Likewise.
* gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise.
(MAX_DATAREFS): Likewise.
(OUTER_STRIDE_RATIO): Likewise.
* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
* gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise.
* gimple-ssa-split-paths.c (is_feasible_trace): Likewise.
* gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise.
(imm_store_chain_info::coalesce_immediate_stores): Likewise.
(imm_store_chain_info::output_merged_store): Likewise.
(pass_store_merging::process_store): Likewise.
* gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise.
* graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise.
(scop_to_isl_ast): Likewise.
* graphite-optimize-isl.c (get_schedule_for_node_st): Likewise.
(optimize_isl): Likewise.
* graphite-scop-detection.c (build_scops): Likewise.
* haifa-sched.c (set_modulo_params): Likewise.
(rank_for_schedule): Likewise.
(model_add_to_worklist): Likewise.
(model_promote_insn): Likewise.
(model_choose_insn): Likewise.
(queue_to_ready): Likewise.
(autopref_multipass_dfa_lookahead_guard): Likewise.
(schedule_block): Likewise.
(sched_init): Likewise.
* hsa-gen.c (init_prologue): Likewise.
* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise.
(cond_move_process_if_block): Likewise.
* ipa-cp.c (ipcp_lattice::add_value): Likewise.
(merge_agg_lats_step): Likewise.
(devirtualization_time_bonus): Likewise.
(hint_time_bonus): Likewise.
(incorporate_penalties): Likewise.
(good_cloning_opportunity_p): Likewise.
(ipcp_propagate_stage): Likewise.
* ipa-fnsummary.c (decompose_param_expr): Likewise.
(set_switch_stmt_execution_predicate): Likewise.
(analyze_function_body): Likewise.
(compute_fn_summary): Likewise.
* ipa-inline-analysis.c (estimate_growth): Likewise.
* ipa-inline.c (caller_growth_limits): Likewise.
(inline_insns_single): Likewise.
(inline_insns_auto): Likewise.
(can_inline_edge_by_limits_p): Likewise.
(want_early_inline_function_p): Likewise.
(big_speedup_p): Likewise.
(want_inline_small_function_p): Likewise.
(want_inline_self_recursive_call_p): Likewise.
(edge_badness): Likewise.
(recursive_inlining): Likewise.
(compute_max_insns): Likewise.
(early_inliner): Likewise.
* ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise.
* ipa-profile.c (ipa_profile): Likewise.
* ipa-prop.c (determine_known_aggregate_parts): Likewise.
(ipa_analyze_node): Likewise.
(ipcp_transform_function): Likewise.
* ipa-split.c (consider_split): Likewise.
* ipa-sra.c (allocate_access): Likewise.
(process_scan_results): Likewise.
(ipa_sra_summarize_function): Likewise.
(pull_accesses_from_callee): Likewise.
* ira-build.c (loop_compare_func): Likewise.
(mark_loops_for_removal): Likewise.
* ira-conflicts.c (build_conflict_bit_table): Likewise.
* loop-doloop.c (doloop_optimize): Likewise.
* loop-invariant.c (gain_for_invariant): Likewise.
(move_loop_invariants): Likewise.
* loop-unroll.c (decide_unroll_constant_iterations): Likewise.
(decide_unroll_runtime_iterations): Likewise.
(decide_unroll_stupid): Likewise.
(expand_var_during_unrolling): Likewise.
* lra-assigns.c (spill_for): Likewise.
* lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise.
* modulo-sched.c (sms_schedule): Likewise.
(DFA_HISTORY): Likewise.
* opts.c (default_options_optimization): Likewise.
(finish_options): Likewise.
(common_handle_option): Likewise.
* postreload-gcse.c (eliminate_partially_redundant_load): Likewise.
(if): Likewise.
* predict.c (get_hot_bb_threshold): Likewise.
(maybe_hot_count_p): Likewise.
(probably_never_executed): Likewise.
(predictable_edge_p): Likewise.
(predict_loops): Likewise.
(expr_expected_value_1): Likewise.
(tree_predict_by_opcode): Likewise.
(handle_missing_profiles): Likewise.
* reload.c (find_equiv_reg): Likewise.
* reorg.c (redundant_insn): Likewise.
* resource.c (mark_target_live_regs): Likewise.
(incr_ticks_for_insn): Likewise.
* sanopt.c (pass_sanopt::execute): Likewise.
* sched-deps.c (sched_analyze_1): Likewise.
(sched_analyze_2): Likewise.
(sched_analyze_insn): Likewise.
(deps_analyze_insn): Likewise.
* sched-ebb.c (schedule_ebbs): Likewise.
* sched-rgn.c (find_single_block_region): Likewise.
(too_large): Likewise.
(haifa_find_rgns): Likewise.
(extend_rgns): Likewise.
(new_ready): Likewise.
(schedule_region): Likewise.
(sched_rgn_init): Likewise.
* sel-sched-ir.c (make_region_from_loop): Likewise.
* sel-sched-ir.h (MAX_WS): Likewise.
* sel-sched.c (process_pipelined_exprs): Likewise.
(sel_setup_region_sched_flags): Likewise.
* shrink-wrap.c (try_shrink_wrapping): Likewise.
* targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise.
* toplev.c (print_version): Likewise.
(process_options): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_log_add): Likewise.
* tree-chrec.c (chrec_fold_plus_1): Likewise.
* tree-data-ref.c (split_constant_offset): Likewise.
(compute_all_dependences): Likewise.
* tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise.
* tree-inline.c (remap_gimple_stmt): Likewise.
* tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise.
* tree-parloops.c (MIN_PER_THREAD): Likewise.
(create_parallel_loop): Likewise.
* tree-predcom.c (determine_unroll_factor): Likewise.
* tree-scalar-evolution.c (instantiate_scev_r): Likewise.
* tree-sra.c (analyze_all_variable_accesses): Likewise.
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise.
* tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise.
(dse_optimize_redundant_stores): Likewise.
(dse_classify_store): Likewise.
* tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise.
(try_peel_loop): Likewise.
(tree_unroll_loops_completely): Likewise.
* tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise.
(CONSIDER_ALL_CANDIDATES_BOUND): Likewise.
(MAX_CONSIDERED_GROUPS): Likewise.
(ALWAYS_PRUNE_CAND_SET_BOUND): Likewise.
* tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise.
* tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise.
* tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise.
(L1_CACHE_SIZE_BYTES): Likewise.
(L2_CACHE_SIZE_BYTES): Likewise.
(should_issue_prefetch_p): Likewise.
(schedule_prefetches): Likewise.
(determine_unroll_factor): Likewise.
(volume_of_references): Likewise.
(add_subscript_strides): Likewise.
(self_reuse_distance): Likewise.
(mem_ref_count_reasonable_p): Likewise.
(insn_to_prefetch_ratio_too_small_p): Likewise.
(loop_prefetch_arrays): Likewise.
(tree_ssa_prefetch_arrays): Likewise.
* tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise.
* tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise.
(convert_mult_to_fma): Likewise.
(math_opts_dom_walker::after_dom_children): Likewise.
* tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise.
(hoist_adjacent_loads): Likewise.
(gate_hoist_loads): Likewise.
* tree-ssa-pre.c (translate_vuse_through_block): Likewise.
(compute_partial_antic_aux): Likewise.
* tree-ssa-reassoc.c (get_reassociation_width): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise.
(vn_reference_lookup): Likewise.
(do_rpo_vn): Likewise.
* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise.
* tree-ssa-sink.c (select_best_block): Likewise.
* tree-ssa-strlen.c (new_stridx): Likewise.
(new_addr_stridx): Likewise.
(get_range_strlen_dynamic): Likewise.
(class ssa_name_limit_t): Likewise.
* tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise.
(create_variable_info_for_1): Likewise.
(init_alias_vars): Likewise.
* tree-ssa-tail-merge.c (find_clusters_1): Likewise.
(tail_merge_optimize): Likewise.
* tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise.
(thread_jumps::fsm_find_control_statement_thread_paths): Likewise.
(thread_jumps::find_jump_threads_backwards): Likewise.
* tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise.
* tree-ssa-uninit.c (compute_control_dep_chain): Likewise.
* tree-switch-conversion.c (switch_conversion::check_range): Likewise.
(jump_table_cluster::can_be_handled): Likewise.
* tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise.
(SWITCH_CONVERSION_BRANCH_RATIO): Likewise.
(param_switch_conversion_branch_ratio): Likewise.
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
* tree-vect-loop.c (vect_analyze_loop_costing): Likewise.
(vect_get_datarefs_in_loop): Likewise.
(vect_analyze_loop): Likewise.
* tree-vect-slp.c (vect_slp_bb): Likewise.
* tree-vectorizer.h: Likewise.
* tree-vrp.c (find_switch_asserts): Likewise.
(vrp_prop::check_mem_ref): Likewise.
* tree.c (wide_int_to_tree_1): Likewise.
(cache_integer_cst): Likewise.
* var-tracking.c (EXPR_USE_DEPTH): Likewise.
(reverse_op): Likewise.
(vt_find_locations): Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
2019-11-12 Martin Liska <mliska@suse.cz>
* name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
* typeck.c (comptypes): Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* lto-partition.c (lto_balanced_map): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
* lto.c (do_whole_program_analysis): Likewise.
From-SVN: r278085
|
|
* gimple-streamer-out.c (output_gimple_stmt): Add explicit function
parameter.
* lto-streamer-out.c: Include tree-dfa.h.
(output_cfg): Do not use cfun.
(lto_prepare_function_for_streaming): New.
(output_function): Do not push cfun; do not initialize loop optimizer.
* lto-streamer.h (lto_prepare_function_for_streaming): Declare.
* passes.c (ipa_write_summaries): Use it.
(ipa_write_optimization_summaries): Do not modify bodies.
* tree-dfa.c (renumber_gimple_stmt_uids): Add function parameter.
* tree.dfa.h (renumber_gimple_stmt_uids): Update prototype.
* tree-ssa-dse.c (pass_dse::execute): Update use of
renumber_gimple_stmt_uids.
* tree-ssa-math-opts.c (pass_optimize_widening_mul::execute): Likewise.
* lto.c (lto_wpa_write_files): Prepare all bodies for streaming.
From-SVN: r276870
|
|
I needed to add another instance of this idiom, so thought it'd
be worth having a helper function.
2019-08-05 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* gimple.h (gimple_move_vops): Declare.
* gimple.c (gimple_move_vops): New function
* gimple-fold.c (replace_call_with_call_and_fold)
(gimple_fold_builtin_memory_op, gimple_fold_builtin_memset)
(gimple_fold_builtin_stpcpy, fold_builtin_atomic_compare_exchange)
(gimple_fold_call): Use it.
* ipa-param-manipulation.c (ipa_modify_call_arguments): Likewise.
* tree-call-cdce.c (use_internal_fn): Likewise.
* tree-if-conv.c (predicate_load_or_store): Likewise.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise.
* tree-ssa-propagate.c (finish_update_gimple_call): Likewise.
(update_call_from_tree): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
* tree-vectorizer.c (adjust_simduid_builtins): Likewise.
From-SVN: r274117
|
|
This patch extends the FMA handling in tree-ssa-math-opts.c so
that it can cope with conditional multiplications as well as
unconditional multiplications. The addition or subtraction must then
have the same condition as the multiplication (at least for now).
E.g. we can currently fold:
(IFN_COND_ADD cond (mul x y) z fallback)
-> (IFN_COND_FMA cond x y z fallback)
This patch also allows:
(IFN_COND_ADD cond (IFN_COND_MUL cond x y <whatever>) z fallback)
-> (IFN_COND_FMA cond x y z fallback)
2019-07-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_fma): Add a mul_cond
parameter. When nonnull, make sure that the addition or subtraction
has the same condition.
(math_opts_dom_walker::after_dom_children): Try convert_mult_to_fma
for CFN_COND_MUL too.
gcc/testsuite/
* gcc.dg/vect/vect-cond-arith-7.c: New test.
From-SVN: r273905
|
|
tree-eh.c:3938 since r219202)
PR tree-optimization/90090
* tree-ssa-math-opts.c (is_division_by): Ignore divisions that can
throw internally.
(is_division_by_square): Likewise. Formatting fix.
* g++.dg/opt/pr90090.C: New test.
From-SVN: r270379
|
|
This patch fixes a case in which, due to forced missed optimisations
in earlier passes, we have:
_1 = a * b
_2 = -_1
_3 = -_1
_4 = _2 + _3
and treated _4 as two FNMA candidates, once via _2 and once via _3.
2019-04-05 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/89956
* tree-ssa-math-opts.c (convert_mult_to_fma): Protect against
multiple negates of the same value.
gcc/testsuite/
PR tree-optimization/89956
* gfortran.dg/pr89956.f90: New test.
From-SVN: r270162
|
|
(error: dead STMT in EH table))
2019-03-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/89802
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Properly
move EH data to folded stmt.
* g++.dg/tree-ssa/pr89802.C: New testcase.
From-SVN: r269913
|
|
2019-03-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/89664
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Properly
free the occurance tree after the early out.
* gfortran.dg/pr89664.f90: New testcase.
From-SVN: r269604
|
|
From-SVN: r267494
|
|
block 4 follows the use))
PR tree-optimization/87977
* tree-ssa-math-opts.c (optimize_recip_sqrt): Don't reuse division
stmt, build a new one and replace the old one with it. Formatting fix.
Call release_ssa_name (x) if !has_other_use and !delete_div.
(pass_cse_reciprocals::execute): Before calling optimize_recip_sqrt
verify lhs of stmt is still def.
* gcc.dg/recip_sqrt_mult_1.c: Add -fcompare-debug to dg-options.
* gcc.dg/recip_sqrt_mult_2.c: Likewise.
* gcc.dg/recip_sqrt_mult_3.c: Likewise.
* gcc.dg/recip_sqrt_mult_4.c: Likewise.
* gcc.dg/recip_sqrt_mult_5.c: Likewise.
From-SVN: r266098
|
|
This long patch only does one simple thing, adds an explicit function
parameter to predicates stmt_could_throw_p, stmt_can_throw_external
and stmt_can_throw_internal.
My motivation was ability to use stmt_can_throw_external in IPA
analysis phase without the need to push cfun. As I have discovered,
we were already doing that in cgraph.c, which this patch avoids as
well. In the process, I had to add a struct function parameter to
stmt_could_throw_p and decided to also change the interface of
stmt_can_throw_internal just for the sake of some minimal consistency.
In the process I have discovered that calling method
cgraph_node::create_version_clone_with_body (used by ipa-split,
ipa-sra, OMP simd and multiple_target) leads to calls of
stmt_can_throw_external with NULL cfun. I have worked around this by
making stmt_can_throw_external and stmt_could_throw_p gracefully
accept NULL and just be pessimistic in that case. The problem with
fixing this in a better way is that struct function for the clone is
created after cloning edges where we attempt to push the yet not
existing cfun, and moving it before would require a bit of surgery in
tree-inline.c. A slightly hackish but simpler fix might be to
explicitely pass the "old" function to symbol_table::create_edge
because it should be just as good at that moment. In any event, that
is a topic for another patch.
I believe that currently we incorrectly use cfun in
maybe_clean_eh_stmt_fn and maybe_duplicate_eh_stmt_fn, both in
tree-eh.c, and so I have fixed these cases too. The bulk of other
changes is just mechanical adding of cfun to all users.
Bootstrapped and tested on x86_64-linux (also with extra NULLing and
restoring cfun to double check it is not used in a place I missed), OK
for trunk?
Thanks,
Martin
2018-10-22 Martin Jambor <mjambor@suse.cz>
* tree-eh.h (stmt_could_throw_p): Add function parameter.
(stmt_can_throw_external): Likewise.
(stmt_can_throw_internal): Likewise.
* tree-eh.c (lower_eh_constructs_2): Pass cfun to stmt_could_throw_p.
(lower_eh_constructs_2): Likewise.
(stmt_could_throw_p): Add fun parameter, use it instead of cfun.
(stmt_can_throw_external): Likewise.
(stmt_can_throw_internal): Likewise.
(maybe_clean_eh_stmt_fn): Pass cfun to stmt_could_throw_p.
(maybe_clean_or_replace_eh_stmt): Pass cfun to stmt_could_throw_p.
(maybe_duplicate_eh_stmt_fn): Pass new_fun to stmt_could_throw_p.
(maybe_duplicate_eh_stmt): Pass cfun to stmt_could_throw_p.
(pass_lower_eh_dispatch::execute): Pass cfun to
stmt_can_throw_external.
(cleanup_empty_eh): Likewise.
(verify_eh_edges): Pass cfun to stmt_could_throw_p.
* cgraph.c (cgraph_edge::set_call_stmt): Pass a function to
stmt_can_throw_external instead of pushing it to cfun.
(symbol_table::create_edge): Likewise.
* gimple-fold.c (fold_builtin_atomic_compare_exchange): Pass cfun to
stmt_can_throw_internal.
* gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children): Pass cfun
to stmt_could_throw_p.
* gimple-ssa-store-merging.c (handled_load): Pass cfun to
stmt_can_throw_internal.
(pass_store_merging::execute): Likewise.
* gimple-ssa-strength-reduction.c
(find_candidates_dom_walker::before_dom_children): Pass cfun to
stmt_could_throw_p.
* gimplify-me.c (gimple_regimplify_operands): Pass cfun to
stmt_can_throw_internal.
* ipa-pure-const.c (check_call): Pass cfun to stmt_could_throw_p and
to stmt_can_throw_external.
(check_stmt): Pass cfun to stmt_could_throw_p.
(check_stmt): Pass cfun to stmt_can_throw_external.
(pass_nothrow::execute): Likewise.
* trans-mem.c (expand_call_tm): Pass cfun to stmt_can_throw_internal.
* tree-cfg.c (is_ctrl_altering_stmt): Pass cfun to
stmt_can_throw_internal.
(verify_gimple_in_cfg): Pass cfun to stmt_could_throw_p.
(stmt_can_terminate_bb_p): Pass cfun to stmt_can_throw_external.
(gimple_purge_dead_eh_edges): Pass cfun to stmt_can_throw_internal.
* tree-complex.c (expand_complex_libcall): Pass cfun to
stmt_could_throw_p and to stmt_can_throw_internal.
(expand_complex_multiplication): Pass cfun to stmt_can_throw_internal.
* tree-inline.c (copy_edges_for_bb): Likewise.
(maybe_move_debug_stmts_to_successors): Likewise.
* tree-outof-ssa.c (ssa_is_replaceable_p): Pass cfun to
stmt_could_throw_p.
* tree-parloops.c (oacc_entry_exit_ok_1): Likewise.
* tree-sra.c (scan_function): Pass cfun to stmt_can_throw_external.
* tree-ssa-alias.c (stmt_kills_ref_p): Pass cfun to
stmt_can_throw_internal.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Pass cfun to
stmt_could_throw_p.
(mark_aliased_reaching_defs_necessary_1): Pass cfun to
stmt_can_throw_internal.
* tree-ssa-forwprop.c (pass_forwprop::execute): Likewise.
* tree-ssa-loop-im.c (movement_possibility): Pass cfun to
stmt_could_throw_p.
* tree-ssa-loop-ivopts.c (find_givs_in_stmt_scev): Likewise.
(add_autoinc_candidates): Pass cfun to stmt_can_throw_internal.
* tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise.
(convert_mult_to_fma_1): Likewise.
(convert_to_divmod): Likewise.
* tree-ssa-phiprop.c (propagate_with_phi): Likewise.
* tree-ssa-pre.c (compute_avail): Pass cfun to stmt_could_throw_p.
* tree-ssa-propagate.c
(substitute_and_fold_dom_walker::before_dom_children): Likewise.
* tree-ssa-reassoc.c (suitable_cond_bb): Likewise.
(maybe_optimize_range_tests): Likewise.
(linearize_expr_tree): Likewise.
(reassociate_bb): Likewise.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Likewise.
* tree-ssa-scopedtables.c (hashable_expr_equal_p): Likewise.
* tree-ssa-strlen.c (adjust_last_stmt): Likewise.
(handle_char_store): Likewise.
* tree-vect-data-refs.c (vect_find_stmt_data_reference): Pass cfun to
stmt_can_throw_internal.
* tree-vect-patterns.c (check_bool_pattern): Pass cfun to
stmt_could_throw_p.
* tree-vect-stmts.c (vect_finish_stmt_generation_1): Likewise.
(vectorizable_call): Pass cfun to stmt_can_throw_internal.
(vectorizable_simd_clone_call): Likewise.
* value-prof.c (gimple_ic): Pass cfun to stmt_could_throw_p.
(gimple_stringop_fixed_value): Likewise.
From-SVN: r265372
|
|
execute_cse_reciprocals_1 before trying optimize_recip_sqrt
PR tree-optimization/87259
PR lto/87283
(pass_cse_reciprocals::execute): Run optimize_recip_sqrt after
execute_cse_reciprocals_1 has tried transforming.
PR tree-optimization/87259
* gcc.dg/pr87259.c: New test.
From-SVN: r264312
|
|
This patch aims to optimise sequences involving uses of 1.0 / sqrt (a) under -freciprocal-math and -funsafe-math-optimizations.
In particular consider:
x = 1.0 / sqrt (a);
r1 = x * x; // same as 1.0 / a
r2 = a * x; // same as sqrt (a)
If x, r1 and r2 are all used further on in the code, this can be transformed into:
tmp1 = 1.0 / a
tmp2 = sqrt (a)
tmp3 = tmp1 * tmp2
x = tmp3
r1 = tmp1
r2 = tmp2
A bit convoluted, but this saves us one multiplication and, more importantly, the sqrt and division are now independent.
This also allows optimisation of a subset of these expressions.
For example:
x = 1.0 / sqrt (a)
r1 = x * x
can be transformed to r1 = 1.0 / a, eliminating the sqrt if x is not used anywhere else.
And similarly:
x = 1.0 / sqrt (a)
r1 = a * x
can be transformed to sqrt (a) eliminating the division.
For the testcase:
double res, res2, tmp;
void
foo (double a, double b)
{
tmp = 1.0 / __builtin_sqrt (a);
res = tmp * tmp;
res2 = a * tmp;
}
We now generate for aarch64 with -Ofast:
foo:
fmov d2, 1.0e+0
adrp x2, res2
fsqrt d1, d0
adrp x1, res
fdiv d0, d2, d0
adrp x0, tmp
str d1, [x2, #:lo12:res2]
fmul d1, d1, d0
str d0, [x1, #:lo12:res]
str d1, [x0, #:lo12:tmp]
ret
where before it generated:
foo:
fsqrt d2, d0
fmov d1, 1.0e+0
adrp x1, res2
adrp x2, tmp
adrp x0, res
fdiv d1, d1, d2
fmul d0, d1, d0
fmul d2, d1, d1
str d1, [x2, #:lo12:tmp]
str d0, [x1, #:lo12:res2]
str d2, [x0, #:lo12:res]
ret
As you can see, the new sequence has one fewer multiply and the fsqrt and fdiv are independent.
* tree-ssa-math-opts.c (is_mult_by): New function.
(is_square_of): Use the above.
(optimize_recip_sqrt): New function.
(pass_cse_reciprocals::execute): Use the above.
* gcc.dg/recip_sqrt_mult_1.c: New test.
* gcc.dg/recip_sqrt_mult_2.c: Likewise.
* gcc.dg/recip_sqrt_mult_3.c: Likewise.
* gcc.dg/recip_sqrt_mult_4.c: Likewise.
* gcc.dg/recip_sqrt_mult_5.c: Likewise.
* g++.dg/recip_sqrt_mult_1.C: Likewise.
* g++.dg/recip_sqrt_mult_2.C: Likewise.
From-SVN: r264126
|
|
2018-08-27 Martin Liska <mliska@suse.cz>
* builtins.h (is_builtin_fn): Remove and fndecl_built_in_p.
* builtins.c (is_builtin_fn): Likewise.
* attribs.c (diag_attr_exclusions): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
(builtin_mathfn_code): Likewise.
(fold_builtin_expect): Likewise.
(fold_call_expr): Likewise.
(fold_builtin_call_array): Likewise.
(fold_call_stmt): Likewise.
(set_builtin_user_assembler_name): Likewise.
(is_simple_builtin): Likewise.
* calls.c (gimple_alloca_call_p): Likewise.
(maybe_warn_nonstring_arg): Likewise.
* cfgexpand.c (expand_call_stmt): Likewise.
* cgraph.c (cgraph_update_edges_for_call_stmt_node): Likewise.
(cgraph_edge::verify_corresponds_to_fndecl): Likewise.
(cgraph_node::verify_node): Likewise.
* cgraphclones.c (build_function_decl_skip_args): Likewise.
(cgraph_node::create_clone): Likewise.
* config/arm/arm.c (arm_insert_attributes): Likewise.
* config/i386/i386.c (ix86_gimple_fold_builtin): Likewise.
* dse.c (scan_insn): Likewise.
* expr.c (expand_expr_real_1): Likewise.
* fold-const.c (operand_equal_p): Likewise.
(fold_binary_loc): Likewise.
* gimple-fold.c (gimple_fold_stmt_to_constant_1): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimple-pretty-print.c (dump_gimple_call): Likewise.
* gimple-ssa-warn-restrict.c (wrestrict_dom_walker::check_call): Likewise.
* gimple.c (gimple_build_call_from_tree): Likewise.
(gimple_call_builtin_p): Likewise.
(gimple_call_combined_fn): Likewise.
* gimplify.c (gimplify_call_expr): Likewise.
(gimple_boolify): Likewise.
(gimplify_modify_expr): Likewise.
(gimplify_addr_expr): Likewise.
* hsa-gen.c (gen_hsa_insns_for_call): Likewise.
* ipa-cp.c (determine_versionability): Likewise.
* ipa-fnsummary.c (compute_fn_summary): Likewise.
* ipa-param-manipulation.c (ipa_modify_formal_parameters): Likewise.
* ipa-split.c (visit_bb): Likewise.
(split_function): Likewise.
* ipa-visibility.c (cgraph_externally_visible_p): Likewise.
* lto-cgraph.c (input_node): Likewise.
* lto-streamer-out.c (write_symbol): Likewise.
* omp-low.c (setjmp_or_longjmp_p): Likewise.
(lower_omp_1): Likewise.
* predict.c (strip_predict_hints): Likewise.
* print-tree.c (print_node): Likewise.
* symtab.c (symtab_node::output_to_lto_symbol_table_p): Likewise.
* trans-mem.c (is_tm_irrevocable): Likewise.
(is_tm_load): Likewise.
(is_tm_simple_load): Likewise.
(is_tm_store): Likewise.
(is_tm_simple_store): Likewise.
(is_tm_abort): Likewise.
(tm_region_init_1): Likewise.
* tree-call-cdce.c (gen_shrink_wrap_conditions): Likewise.
* tree-cfg.c (verify_gimple_call): Likewise.
(move_stmt_r): Likewise.
(stmt_can_terminate_bb_p): Likewise.
* tree-eh.c (lower_eh_constructs_2): Likewise.
* tree-if-conv.c (if_convertible_stmt_p): Likewise.
* tree-inline.c (remap_gimple_stmt): Likewise.
(copy_bb): Likewise.
(estimate_num_insns): Likewise.
(fold_marked_statements): Likewise.
* tree-sra.c (scan_function): Likewise.
* tree-ssa-ccp.c (surely_varying_stmt_p): Likewise.
(optimize_stack_restore): Likewise.
(pass_fold_builtins::execute): Likewise.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise.
(mark_all_reaching_defs_necessary_1): Likewise.
* tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Likewise.
* tree-ssa-forwprop.c (simplify_builtin_call): Likewise.
(pass_forwprop::execute): Likewise.
* tree-ssa-loop-im.c (stmt_cost): Likewise.
* tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise.
* tree-ssa-sccvn.c (fully_constant_vn_reference_p): Likewise.
* tree-ssa-strlen.c (get_string_length): Likewise.
* tree-ssa-structalias.c (handle_lhs_call): Likewise.
(find_func_aliases_for_call): Likewise.
* tree-ssa-ter.c (find_replaceable_in_bb): Likewise.
* tree-stdarg.c (optimize_va_list_gpr_fpr_size): Likewise.
* tree-tailcall.c (find_tail_calls): Likewise.
* tree.c (need_assembler_name_p): Likewise.
(free_lang_data_in_decl): Likewise.
(get_call_combined_fn): Likewise.
* ubsan.c (is_ubsan_builtin_p): Likewise.
* varasm.c (incorporeal_function_p): Likewise.
* tree.h (DECL_BUILT_IN): Remove and replace with
fndecl_built_in_p.
(DECL_BUILT_IN_P): Transfort to fndecl_built_in_p.
(fndecl_built_in_p): New.
2018-08-27 Martin Liska <mliska@suse.cz>
* gcc-interface/decl.c (update_profile): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
* gcc-interface/gigi.h (call_is_atomic_load): Likewise.
* gcc-interface/utils.c (gnat_pushdecl): Likewise.
2018-08-27 Martin Liska <mliska@suse.cz>
* c-common.c (check_function_restrict): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
(check_builtin_function_arguments): Likewise.
(reject_gcc_builtin): Likewise.
* c-warn.c (sizeof_pointer_memaccess_warning): Likewise.
2018-08-27 Martin Liska <mliska@suse.cz>
* c-decl.c (locate_old_decl): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
(diagnose_mismatched_decls): Likewise.
(merge_decls): Likewise.
(warn_if_shadowing): Likewise.
(pushdecl): Likewise.
(implicitly_declare): Likewise.
* c-parser.c (c_parser_postfix_expression_after_primary): Likewise.
* c-tree.h (C_DECL_ISNT_PROTOTYPE): Likewise.
* c-typeck.c (build_function_call_vec): Likewise.
(convert_arguments): Likewise.
2018-08-27 Martin Liska <mliska@suse.cz>
* call.c (build_call_a): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
(build_cxx_call): Likewise.
* constexpr.c (constexpr_fn_retval): Likewise.
(cxx_eval_builtin_function_call): Likewise.
(cxx_eval_call_expression): Likewise.
(potential_constant_expression_1): Likewise.
* cp-gimplify.c (cp_gimplify_expr): Likewise.
(cp_fold): Likewise.
* decl.c (decls_match): Likewise.
(validate_constexpr_redeclaration): Likewise.
(duplicate_decls): Likewise.
(make_rtl_for_nonlocal_decl): Likewise.
* name-lookup.c (consider_binding_level): Likewise.
(cp_emit_debug_info_for_using): Likewise.
* semantics.c (finish_call_expr): Likewise.
* tree.c (builtin_valid_in_constant_expr_p): Likewise.
2018-08-27 Martin Liska <mliska@suse.cz>
* go-gcc.cc (Gcc_backend::call_expression): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
2018-08-27 Martin Liska <mliska@suse.cz>
* lto-lang.c (handle_const_attribute): Use new function
fndecl_built_in_p and remove check for FUNCTION_DECL if
possible.
* lto-symtab.c (lto_symtab_merge_p): Likewise.
(lto_symtab_merge_decls_1): Likewise.
(lto_symtab_merge_symbols): Likewise.
* lto.c (lto_maybe_register_decl): Likewise.
(read_cgraph_and_symbols): Likewise.
From-SVN: r263880
|
|
-ffast-math)
PR tree-optimization/86835
* tree-ssa-math-opts.c (insert_reciprocals): Even when inserting
new_stmt after def_gsi, make sure to insert new_square_stmt after
that stmt, not 2 stmts before it.
* gcc.dg/pr86835.c: New test.
From-SVN: r263487
|
|
This patch adds support for fusing a conditional add or subtract
with a multiplication, so that we can use fused multiply-add and
multiply-subtract operations for fully-masked reductions. E.g.
for SVE we vectorise:
double res = 0.0;
for (int i = 0; i < n; ++i)
res += x[i] * y[i];
using a fully-masked loop in which the loop body has the form:
res_1 = PHI<0(preheader), res_2(latch)>;
avec = .MASK_LOAD (loop_mask, a)
bvec = .MASK_LOAD (loop_mask, b)
prod = avec * bvec;
res_2 = .COND_ADD (loop_mask, res_1, prod, res_1);
where the last statement does the equivalent of:
res_2 = loop_mask ? res_1 + prod : res_1;
(operating elementwise). The point of the patch is to convert the last
two statements into:
res_s = .COND_FMA (loop_mask, avec, bvec, res_1, res_1);
which is equivalent to:
res_2 = loop_mask ? fma (avec, bvec, res_1) : res_1;
(again operating elementwise).
2018-07-12 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.h (can_interpret_as_conditional_op_p): Declare.
* internal-fn.c (can_interpret_as_conditional_op_p): New function.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional
plus and minus and convert them into IFN_COND_FMA-based sequences.
(convert_mult_to_fma): Handle conditional plus and minus.
gcc/testsuite/
* gcc.dg/vect/vect-fma-2.c: New test.
* gcc.target/aarch64/sve/reduc_4.c: Likewise.
* gcc.target/aarch64/sve/reduc_6.c: Likewise.
* gcc.target/aarch64/sve/reduc_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r262588
|
|
gcc/brig/ChangeLog:
* brigfrontend/brig-to-generic.cc
(brig_to_generic::write_globals): Use TDF_NONE rather than 0.
(dump_function): Likewise.
gcc/c-family/ChangeLog:
* c-pretty-print.c (c_pretty_printer::statement): Use TDF_NONE
rather than 0.
gcc/ChangeLog:
* cfg.c (debug): Use TDF_NONE rather than 0.
* cfghooks.c (debug): Likewise.
* dumpfile.c (DUMP_FILE_INFO): Likewise; also for OPTGROUP.
(struct dump_option_value_info): Convert to...
(struct kv_pair): ...this template type.
(dump_options): Convert to kv_pair<dump_flags_t>; use TDF_NONE
rather than 0.
(optinfo_verbosity_options): Likewise.
(optgroup_options): Convert to kv_pair<optgroup_flags_t>; use
OPTGROUP_NONE.
(gcc::dump_manager::dump_register): Use optgroup_flags_t rather
than int for "optgroup_flags" param.
(dump_generic_expr_loc): Use dump_flags_t rather than int for
"dump_kind" param.
(dump_dec): Likewise.
(dump_finish): Use TDF_NONE rather than 0.
(gcc::dump_manager::opt_info_enable_passes): Use optgroup_flags_t
rather than int for "optgroup_flags" param. Use TDF_NONE rather
than 0. Update for change to option_ptr.
(opt_info_switch_p_1): Convert "optgroup_flags" param from int *
to optgroup_flags_t *. Use TDF_NONE and OPTGROUP_NONE rather than
0. Update for changes to optinfo_verbosity_options and
optgroup_options.
(opt_info_switch_p): Convert optgroup_flags from int to
optgroup_flags_t.
(dump_basic_block): Use dump_flags_t rather than int
for "dump_kind" param.
* dumpfile.h (TDF_ADDRESS, TDF_SLIM, TDF_RAW, TDF_DETAILS,
TDF_STATS, TDF_BLOCKS, TDF_VOPS, TDF_LINENO, TDF_UID)
TDF_STMTADDR, TDF_GRAPH, TDF_MEMSYMS, TDF_RHS_ONLY, TDF_ASMNAME,
TDF_EH, TDF_NOUID, TDF_ALIAS, TDF_ENUMERATE_LOCALS, TDF_CSELIB,
TDF_SCEV, TDF_GIMPLE, TDF_FOLDING, MSG_OPTIMIZED_LOCATIONS,
MSG_MISSED_OPTIMIZATION, MSG_NOTE, MSG_ALL, TDF_COMPARE_DEBUG,
TDF_NONE): Convert from macros to...
(enum dump_flag): ...this new enum.
(dump_flags_t): Update to use enum.
(operator|, operator&, operator~, operator|=, operator&=):
Implement for dump_flags_t.
(OPTGROUP_NONE, OPTGROUP_IPA, OPTGROUP_LOOP, OPTGROUP_INLINE,
OPTGROUP_OMP, OPTGROUP_VEC, OPTGROUP_OTHER, OPTGROUP_ALL):
Convert from macros to...
(enum optgroup_flag): ...this new enum.
(optgroup_flags_t): New typedef.
(operator|, operator|=): Implement for optgroup_flags_t.
(struct dump_file_info): Convert field "alt_flags" to
dump_flags_t. Convert field "optgroup_flags" to
optgroup_flags_t.
(dump_basic_block): Use dump_flags_t rather than int for param.
(dump_generic_expr_loc): Likewise.
(dump_dec): Likewise.
(dump_register): Convert param "optgroup_flags" to
optgroup_flags_t.
(opt_info_enable_passes): Likewise.
* early-remat.c (early_remat::dump_edge_list): Use TDF_NONE rather
than 0.
* gimple-pretty-print.c (debug): Likewise.
* gimple-ssa-store-merging.c (bswap_replace): Likewise.
(merged_store_group::apply_stores): Likewise.
* gimple-ssa-strength-reduction.c (insert_initializers): Likewise.
* gimple.c (verify_gimple_pp): Likewise.
* graphite-poly.c (print_pbb_body): Likewise.
* passes.c (pass_manager::register_one_dump_file): Convert
local "optgroup_flags" to optgroup_flags_t.
* print-tree.c (print_node): Use TDF_NONE rather than 0.
(debug): Likewise.
(debug_body): Likewise.
* tree-pass.h (struct pass_data): Convert field "optgroup_flags"
to optgroup_flags_t.
* tree-pretty-print.c (print_struct_decl): Use TDF_NONE rather
than 0.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Likewise.
(convert_mult_to_fma): Likewise.
* tree-ssa-reassoc.c (undistribute_ops_list): Likewise.
* tree-ssa-sccvn.c (vn_eliminate): Likewise.
* tree-vect-data-refs.c (dump_lower_bound): Convert param
"dump_kind" to dump_flags_t.
From-SVN: r261325
|
|
There are four optabs for various forms of fused multiply-add:
fma, fms, fnma and fnms. Of these, only fma had a direct gimple
representation. For the other three we relied on special pattern-
matching during expand, although tree-ssa-math-opts.c did have
some code to try to second-guess what expand would do.
This patch removes the old FMA_EXPR representation of fma and
introduces four new internal functions, one for each optab.
IFN_FMA is tied to BUILT_IN_FMA* while the other three are
independent directly-mapped internal functions. It's then
possible to do the pattern-matching in match.pd and
tree-ssa-math-opts.c (via folding) can select the exact
FMA-based operation.
The BRIG & HSA parts are a best guess, but seem relatively simple.
2018-05-18 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/sourcebuild.texi (scalar_all_fma): Document.
* tree.def (FMA_EXPR): Delete.
* internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
* internal-fn.c (ternary_direct): New macro.
(expand_ternary_optab_fn): Likewise.
(direct_ternary_optab_supported_p): Likewise.
* Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
* builtins.c (fold_builtin_fma): Delete.
(fold_builtin_3): Don't call it.
* cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
* expr.c (expand_expr_real_2): Likewise.
* fold-const.c (operand_equal_p): Likewise.
(fold_ternary_loc): Likewise.
* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
* gimple.c (DEFTREECODE): Likewise.
* gimplify.c (gimplify_expr): Likewise.
* optabs-tree.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-eh.c (operation_could_trap_p): Likewise.
(stmt_could_throw_1_p): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
(op_code_prio): Likewise.
* tree-ssa-loop-im.c (stmt_cost): Likewise.
* tree-ssa-operands.c (get_expr_operands): Likewise.
* tree.c (commutative_ternary_tree_code, add_expr): Likewise.
* fold-const-call.h (fold_fma): Delete.
* fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS,
CFN_FNMA and CFN_FNMS.
(fold_fma): Delete.
* genmatch.c (combined_fn): New enum.
(commutative_ternary_tree_code): Remove FMA_EXPR handling.
(commutative_op): New function.
(commutate): Use it. Handle more than 2 operands.
(dt_operand::gen_gimple_expr): Use commutative_op.
(parser::parse_expr): Allow :c to be used with non-binary
operators if the commutative operand is known.
* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle
CFN_FMS, CFN_FNMA and CFN_FNMS.
(backprop::process_assign_use): Remove FMA_EXPR handling.
* hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise.
(gen_hsa_fma): New function.
(gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS,
IFN_FNMA and IFN_FNMS.
* match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS.
* gimple-fold.h (follow_all_ssa_edges): Declare.
* gimple-fold.c (follow_all_ssa_edges): New function.
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the
gimple_build interface and use follow_all_ssa_edges to fold the result.
(convert_mult_to_fma): Use direct_internal_fn_suppoerted_p
instead of checking for optabs directly.
* config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls
rather than FMA_EXPRs.
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a
call to IFN_FMA instead of an FMA_EXPR.
gcc/brig/
* brigfrontend/brig-function.cc
(brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA
for BRIG_OPCODE_FMA.
(brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA
as a call.
gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Remove
__FMA_EXPR handlng.
gcc/cp/
* constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR handling.
(potential_constant_expression_1): Likewise.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_scalar_all_fma):
New proc.
* gcc.dg/fma-1.c: New test.
* gcc.dg/fma-2.c: Likewise.
* gcc.dg/fma-3.c: Likewise.
* gcc.dg/fma-4.c: Likewise.
* gcc.dg/fma-5.c: Likewise.
* gcc.dg/fma-6.c: Likewise.
* gcc.dg/fma-7.c: Likewise.
* gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require
scalar_all_fma.
* gfortran.dg/reassoc_7.f: Pass -ffp-contract=off.
* gfortran.dg/reassoc_8.f: Likewise.
* gfortran.dg/reassoc_9.f: Likewise.
* gfortran.dg/reassoc_10.f: Likewise.
From-SVN: r260348
|
|
2018-01-12 Martin Jambor <mjambor@suse.cz>
PR target/81616
* params.def: New parameter PARAM_AVOID_FMA_MAX_BITS.
* tree-ssa-math-opts.c: Include domwalk.h.
(convert_mult_to_fma_1): New function.
(fma_transformation_info): New type.
(fma_deferring_state): Likewise.
(cancel_fma_deferring): New function.
(result_of_phi): Likewise.
(last_fma_candidate_feeds_initial_phi): Likewise.
(convert_mult_to_fma): Added deferring logic, split actual
transformation to convert_mult_to_fma_1.
(math_opts_dom_walker): New type.
(math_opts_dom_walker::after_dom_children): New method, body moved
here from pass_optimize_widening_mul::execute, added deferring logic
bits.
(pass_optimize_widening_mul::execute): Moved most of code to
math_opts_dom_walker::after_dom_children.
* config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS): New.
* config/i386/i386.c (ix86_option_override_internal): Added
maybe_setting of PARAM_AVOID_FMA_MAX_BITS.
From-SVN: r256581
|
|
assertion.
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Remove
redundant test in assertion.
From-SVN: r256260
|
|
From-SVN: r256169
|
|
marked for throw, but doesn't))
PR tree-optimization/83523
* tree-ssa-math-opts.c (is_widening_mult_p): Return false if
for INTEGER_TYPE TYPE_OVERFLOW_TRAPS.
(convert_mult_to_fma): Likewise.
* g++.dg/tree-ssa/pr83523.C: New test.
From-SVN: r255953
|
|
gcc/tree-ssa-math-opts.c:585)
PR tree-optimization/83491
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME
before walking uses. Improve coding style and comments.
PR tree-optimization/83491
* gcc.dg/pr83491.c: Add new test.
From-SVN: r255906
|
|
2017-12-05 Richard Biener <rguenther@suse.de>
* timevar.def (TV_TREE_RECIP, TV_TREE_SINCOS, TV_TREE_WIDEN_MUL):
Add.
* tree-ssa-math-opts.c (pass_data_cse_reciprocal): Use TV_TREE_RECIP.
(pass_data_cse_sincos): Use TV_TREE_SINCOS.
(pass_data_optimize_widening_mul): Use TV_TREE_WIDEN_MUL.
From-SVN: r255415
|
|
This patch implements the some of the division optimizations discussed in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026.
The division reciprocal optimization now handles divisions by squares:
x / (y * y) -> x * (1 / y) * (1 / y)
This requires at least one more division by y before it triggers - the
3 divisions of (1/ y) are then CSEd into a single division. Overall
this changes 1 division into 1 multiply, which is generally much faster.
2017-11-24 Jackson Woodruff <jackson.woodruff@arm.com>
gcc/
PR tree-optimization/71026
* tree-ssa-math-opts (is_division_by_square, is_square_of): New.
(insert_reciprocals): Change to insert reciprocals before a division
by a square and to insert the square of a reciprocal.
(execute_cse_reciprocals_1): Change to consider division by a square.
(register_division_in): Add importance parameter.
testsuite/
PR tree-optimization/71026
* gfortran.dg/extract_recip_1.f: New test.
* gcc.dg/extract_recip_3.c: New test.
* gcc.dg/extract_recip_4.c: New test.
From-SVN: r255141
|
|
* tree-ssa-math-opts.c (nop_stats, bswap_stats, struct symbolic_number,
BITS_PER_MARKER, MARKER_MASK, MARKER_BYTE_UNKNOWN, HEAD_MARKER, CMPNOP,
CMPXCHG, do_shift_rotate, verify_symbolic_number_p,
init_symbolic_number, find_bswap_or_nop_load, perform_symbolic_merge,
find_bswap_or_nop_1, find_bswap_or_nop, pass_data_optimize_bswap,
class pass_optimize_bswap, bswap_replace,
pass_optimize_bswap::execute): Moved to ...
* gimple-ssa-store-merging.c: ... this file.
Include optabs-tree.h.
(nop_stats, bswap_stats, do_shift_rotate, verify_symbolic_number_p,
init_symbolic_number, find_bswap_or_nop_load, perform_symbolic_merge,
find_bswap_or_nop_1, find_bswap_or_nop, bswap_replace): Put into
anonymous namespace, remove static keywords.
(pass_optimize_bswap::gate): Test BITS_PER_UNIT == 8 here...
(pass_optimize_bswap::execute): ... rather than here. Formatting fix.
From-SVN: r254947
|
|
In this PR we tried to create a widening multiply of two 3-bit numbers,
but that isn't a widening multiply at the optab/rtl level, since both
the input and output still have the same mode.
We could trap this either in is_widening_mult_p or (as the patch does)
in the routines that actually ask for an optab. The latter seemed
more natural since is_widening_mult_p doesn't otherwise care about modes.
2017-11-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
PR tree-optimization/82816
* tree-ssa-math-opts.c (convert_mult_to_widen): Return false
if the modes of the two types are the same.
(convert_plusminus_to_widen): Likewise.
gcc/testsuite/
* gcc.c-torture/compile/pr82816.c: New test.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254454
|
|
widening_optab_handler had the comment:
/* ??? Why does find_widening_optab_handler_and_mode attempt to
widen things that can't be widened? E.g. add_optab... */
if (op > LAST_CONV_OPTAB)
return CODE_FOR_nothing;
I think it comes from expand_binop using
find_widening_optab_handler_and_mode for two things: to test whether
a "normal" optab like add_optab is supported for a standard binary
operation and to test whether a "convert" optab is supported for a
widening operation like umul_widen_optab. In the former case from_mode
and to_mode must be the same, in the latter from_mode must be narrower
than to_mode.
For the former case, find_widening_optab_handler_and_mode is only really
testing the modes that are passed in. permit_non_widening must be true
here.
For the latter case, find_widening_optab_handler_and_mode should only
really consider new from_modes that are wider than the original
from_mode and narrower than the original to_mode. Logically
permit_non_widening should be false, since widening optabs aren't
supposed to take operands that are the same width as the destination.
We get away with permit_non_widening being true because no target
would/should define a widening .md pattern with matching modes.
But really, it seems better for expand_binop to handle these two
cases itself rather than pushing them down. With that change,
find_widening_optab_handler_and_mode is only ever called with
permit_non_widening set to false and is only ever called with
a "proper" convert optab. We then no longer need widening_optab_handler,
we can just use convert_optab_handler directly.
The patch also passes the instruction code down to expand_binop_directly.
This should be more efficient and removes an extra call to
find_widening_optab_handler_and_mode.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs-query.h (convert_optab_p): New function, split out from...
(convert_optab_handler): ...here.
(widening_optab_handler): Delete.
(find_widening_optab_handler): Remove permit_non_widening parameter.
(find_widening_optab_handler_and_mode): Likewise. Provide an
override that operates on mode class wrappers.
* optabs-query.c (widening_optab_handler): Delete.
(find_widening_optab_handler_and_mode): Remove permit_non_widening
parameter. Assert that the two modes are the same class and that
the "from" mode is narrower than the "to" mode. Use
convert_optab_handler instead of widening_optab_handler.
* expmed.c (expmed_mult_highpart_optab): Use convert_optab_handler
instead of widening_optab_handler.
* expr.c (expand_expr_real_2): Update calls to
find_widening_optab_handler.
* optabs.c (expand_widen_pattern_expr): Likewise.
(expand_binop_directly): Take the insn_code as a parameter.
(expand_binop): Only call find_widening_optab_handler for
conversion optabs; use optab_handler otherwise. Update calls
to find_widening_optab_handler and expand_binop_directly.
Use convert_optab_handler instead of widening_optab_handler.
* tree-ssa-math-opts.c (convert_mult_to_widen): Update calls to
find_widening_optab_handler and use scalar_mode rather than
machine_mode.
(convert_plusminus_to_widen): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254302
|
|
functions that have _Float<N> and...
[gcc]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* builtins.c (CASE_MATHFN_FLOATN): New helper macro to add cases
for math functions that have _Float<N> and _Float<N>X variants.
(mathfn_built_in_2): Add support for math functions that have
_Float<N> and _Float<N>X variants.
(DEF_INTERNAL_FLT_FLOATN_FN): New helper macro.
(expand_builtin_mathfn_ternary): Add support for fma with
_Float<N> and _Float<N>X variants.
(expand_builtin): Likewise.
(fold_builtin_3): Likewise.
* builtins.def (DEF_EXT_LIB_FLOATN_NX_BUILTINS): New macro to
create math function _Float<N> and _Float<N>X variants as external
library builtins.
(BUILT_IN_COPYSIGN _Float<N> and _Float<N>X variants) Use
DEF_EXT_LIB_FLOATN_NX_BUILTINS to make built-in functions using
the __builtin_ prefix and if not strict ansi, without the prefix.
(BUILT_IN_FABS _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMA _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMAX _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_FMIN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_NAN _Float<N> and _Float<N>X variants): Likewise.
(BUILT_IN_SQRT _Float<N> and _Float<N>X variants): Likewise.
* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16): New
function signatures for fma _Float<N> and _Float<N>X variants.
(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
* gencfn-macros.c (print_case_cfn): Add support for math functions
that have _Float<N> and _Float<N>X variants.
(print_define_operator_list): Likewise.
(fltfn_suffixes): Likewise.
(main): Likewise.
* internal-fn.def (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro
for math functions that have _Float<N> and _Float<N>X variants.
(SQRT): Add support for sqrt, copysign, fmin and fmax _Float<N>
and _Float<N>X variants.
(COPYSIGN): Likewise.
(FMIN): Likewise.
(FMAX): Likewise.
* fold-const.c (tree_call_nonnegative_warnv_p): Add support for
copysign, fma, fmax, fmin, and sqrt _Float<N> and _Float<N>X
variants.
(integer_valued_read_call_p): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
(fold_const_call_sss): Add support for copysign, fmin, and fmax
_Float<N> and _Float<N>X variants.
(fold_const_call_ssss): Add support for fma _Float<N> and
_Float<N>X variants.
* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Add
support for copysign and fma _Float<N> and _Float<N>X variants.
(backprop::process_builtin_call_use): Likewise.
* tree-call-cdce.c (can_test_argument_range); Add support for
sqrt _Float<N> and _Float<N>X variants.
(edom_only_function): Likewise.
(get_no_error_domain): Likewise.
* tree-ssa-math-opts.c (internal_fn_reciprocal): Likewise.
* tree-ssa-reassoc.c (attempt_builtin_copysign): Add support for
copysign _Float<N> and _Float<N>X variants.
* config/rs6000/rs6000-builtin.def (SQRTF128): Delete, this is now
handled by machine independent code.
(FMAF128): Likewise.
* doc/cpp.texi (Common Predefined Macros): Document defining
__FP_FAST_FMAF<N> and __FP_FAST_FMAF<N>X if the backend supports
fma _Float<N> and _Float<N>X variants.
[gcc/c]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-decl.c (header_for_builtin_fn): Add support for copysign, fma,
fmax, fmin, and sqrt _Float<N> and _Float<N>X variants.
[gcc/c-family]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC KFmode.
(c_cpp_builtins): If a machine has a fast fma _Float<N> and
_Float<N>X variant, define __FP_FAST_FMA<N> and/or
__FP_FAST_FMA<N>X.
[gcc/testsuite]
2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/float128-hw.c: Add support for all 4 FMA
variants. Check various conversions to/from float128. Check
negation. Use {\m...\M} in the tests.
* gcc.target/powerpc/float128-hw2.c: New test for implicit
_Float128 math functions.
* gcc.target/powerpc/float128-hw3.c: New test for strict ansi mode
not implicitly adding the _Float128 math functions.
* gcc.target/powerpc/float128-fma2.c: Delete, test is no longer
valid.
* gcc.target/powerpc/float128-sqrt2.c: Likewise.
From-SVN: r254168
|
|
This patch uses global rather than member operators for wide-int.h,
so that the first operand can be a non-wide-int type.
The patch also removes the and_not and or_not member functions.
It was already inconsistent to have member functions for these
two operations (one of which was never used) and not other wi::
ones like udiv. After the operator change, we'd have the additional
inconsistency that "non-wi & wi" would work but "non-wi.and_not (wi)"
wouldn't.
2017-10-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* wide-int.h (WI_BINARY_OPERATOR_RESULT): New macro.
(WI_BINARY_PREDICATE_RESULT): Likewise.
(wi::binary_traits::operator_result): New type.
(wi::binary_traits::predicate_result): Likewise.
(generic_wide_int::operator~, unary generic_wide_int::operator-)
(generic_wide_int::operator==, generic_wide_int::operator!=)
(generic_wide_int::operator&, generic_wide_int::and_not)
(generic_wide_int::operator|, generic_wide_int::or_not)
(generic_wide_int::operator^, generic_wide_int::operator+
(binary generic_wide_int::operator-, generic_wide_int::operator*):
Delete.
(operator~, unary operator-, operator==, operator!=, operator&)
(operator|, operator^, operator+, binary operator-, operator*): New
functions.
* expr.c (get_inner_reference): Use wi::bit_and_not.
* fold-const.c (fold_binary_loc): Likewise.
* ipa-prop.c (ipa_compute_jump_functions_for_edge): Likewise.
* tree-ssa-ccp.c (get_value_from_alignment): Likewise.
(bit_value_binop): Likewise.
* tree-ssa-math-opts.c (find_bswap_or_nop_load): Likewise.
* tree-vrp.c (zero_nonzero_bits_from_vr): Likewise.
(extract_range_from_binary_expr_1): Likewise.
(masked_increment): Likewise.
(simplify_bit_ops_using_ranges): Likewise.
From-SVN: r253539
|
|
This patch adds a SCALAR_TYPE_MODE macro, along the same lines as
SCALAR_INT_TYPE_MODE and SCALAR_FLOAT_TYPE_MODE. It also adds
two instances of as_a <scalar_mode> to c_common_type, when converting
an unsigned fixed-point SCALAR_TYPE_MODE to the equivalent signed mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (SCALAR_TYPE_MODE): New macro.
* expr.c (expand_expr_addr_expr_1): Use it.
(expand_expr_real_2): Likewise.
* fold-const.c (fold_convert_const_fixed_from_fixed): Likeise.
(fold_convert_const_fixed_from_int): Likewise.
(fold_convert_const_fixed_from_real): Likewise.
(native_encode_fixed): Likewise
(native_encode_complex): Likewise
(native_encode_vector): Likewise.
(native_interpret_fixed): Likewise.
(native_interpret_real): Likewise.
(native_interpret_complex): Likewise.
(native_interpret_vector): Likewise.
* omp-simd-clone.c (simd_clone_adjust_return_type): Likewise.
(simd_clone_adjust_argument_types): Likewise.
(simd_clone_init_simd_arrays): Likewise.
(simd_clone_adjust): Likewise.
* stor-layout.c (layout_type): Likewise.
* tree.c (build_minus_one_cst): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-inline.c (estimate_move_cost): Likewise.
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Likewise.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise.
(vectorizable_reduction): Likewise.
* tree-vect-patterns.c (vect_recog_widen_mult_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
(check_bool_pattern): Likewise.
(adjust_bool_pattern): Likewise.
(search_type_for_mask_1): Likewise.
* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
(vectorizable_load): Likewise.
(vectorizable_store): Likewise.
* ubsan.c (ubsan_encode_value): Likewise.
* varasm.c (output_constant): Likewise.
gcc/c-family/
* c-lex.c (interpret_fixed): Use SCALAR_TYPE_MODE.
* c-common.c (c_build_vec_perm_expr): Likewise.
gcc/c/
* c-typeck.c (build_binary_op): Use SCALAR_TYPE_MODE.
(c_common_type): Likewise. Use as_a <scalar_mode> when setting
m1 and m2 to the signed equivalent of a fixed-point
SCALAR_TYPE_MODE.
gcc/cp/
* typeck.c (cp_build_binary_op): Use SCALAR_TYPE_MODE.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251516
|
|
This patch adds a SCALAR_INT_TYPE_MODE macro that asserts
that the type has a scalar integer mode and returns it as
a scalar_int_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (SCALAR_INT_TYPE_MODE): New macro.
* builtins.c (expand_builtin_signbit): Use it.
* cfgexpand.c (expand_debug_expr): Likewise.
* dojump.c (do_jump): Likewise.
(do_compare_and_jump): Likewise.
* dwarf2cfi.c (expand_builtin_init_dwarf_reg_sizes): Likewise.
* expmed.c (make_tree): Likewise.
* expr.c (expand_expr_real_2): Likewise.
(expand_expr_real_1): Likewise.
(try_casesi): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
* fold-const.c (unextend): Likewise.
(extract_muldiv_1): Likewise.
(fold_single_bit_test): Likewise.
(native_encode_int): Likewise.
(native_encode_string): Likewise.
(native_interpret_int): Likewise.
* gimple-fold.c (gimple_fold_builtin_memset): Likewise.
* internal-fn.c (expand_addsub_overflow): Likewise.
(expand_neg_overflow): Likewise.
(expand_mul_overflow): Likewise.
(expand_arith_overflow): Likewise.
* match.pd: Likewise.
* stor-layout.c (layout_type): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_widen): Likewise.
* tree-ssanames.c (get_range_info): Likewise.
* tree-switch-conversion.c (array_value_type) Likewise.
* tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise.
(vect_recog_divmod_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
* tree-vrp.c (extract_range_basic): Likewise.
(simplify_float_conversion_using_ranges): Likewise.
* tree.c (int_fits_type_p): Likewise.
* ubsan.c (instrument_bool_enum_load): Likewise.
* varasm.c (mergeable_string_section): Likewise.
(narrowing_initializer_constant_valid_p): Likewise.
(output_constant): Likewise.
gcc/cp/
* cvt.c (cp_convert_to_pointer): Use SCALAR_INT_TYPE_MODE.
gcc/fortran/
* target-memory.c (size_integer): Use SCALAR_INT_TYPE_MODE.
(size_logical): Likewise.
gcc/objc/
* objc-encoding.c (encode_type): Use SCALAR_INT_TYPE_MODE.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251486
|
|
GET_MODE_WIDER previously returned VOIDmode if no wider mode existed.
That would cause problems with stricter mode classes, since VOIDmode
isn't for example a valid scalar integer or floating-point mode.
This patch instead makes it return a new opt_mode<T> class, which
holds either a T or nothing.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* coretypes.h (opt_mode): New class.
* machmode.h (opt_mode): Likewise.
(opt_mode::else_void): New function.
(opt_mode::require): Likewise.
(opt_mode::exists): Likewise.
(GET_MODE_WIDER_MODE): Turn into a function and return an opt_mode.
(GET_MODE_2XWIDER_MODE): Likewise.
(mode_iterator::get_wider): Update accordingly.
(mode_iterator::get_2xwider): Likewise.
(mode_iterator::get_known_wider): Likewise, turning into a template.
* combine.c (make_extraction): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
* config/cr16/cr16.h (LONG_REG_P): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* config/c6x/c6x.c (c6x_rtx_costs): Update use of
GET_MODE_2XWIDER_MODE, forcing a wider mode to exist.
* lower-subreg.c (init_lower_subreg): Likewise.
* optabs-libfuncs.c (init_sync_libfuncs_1): Likewise, but not
on the final iteration.
* config/i386/i386.c (ix86_expand_set_or_movmem): Check whether
a wider mode exists before asking for a move pattern.
(get_mode_wider_vector): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expand_vselect_vconcat): Update use of GET_MODE_2XWIDER_MODE,
returning false if no such mode exists.
* config/ia64/ia64.c (expand_vselect_vconcat): Likewise.
* config/mips/mips.c (mips_expand_vselect_vconcat): Likewise.
* expmed.c (init_expmed_one_mode): Update use of GET_MODE_WIDER_MODE.
Avoid checking for a MODE_INT if we already know the mode is not a
SCALAR_INT_MODE_P.
(extract_high_half): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expmed_mult_highpart_optab): Likewise.
(expmed_mult_highpart): Likewise.
* expr.c (expand_expr_real_2): Update use of GET_MODE_WIDER_MODE,
using else_void.
* lto-streamer-in.c (lto_input_mode_table): Likewise.
* optabs-query.c (find_widening_optab_handler_and_mode): Likewise.
* stor-layout.c (bit_field_mode_iterator::next_mode): Likewise.
* internal-fn.c (expand_mul_overflow): Update use of
GET_MODE_2XWIDER_MODE.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_widen): Update use of
GET_MODE_WIDER_MODE.
(convert_plusminus_to_widen): Likewise.
* tree-switch-conversion.c (array_value_type): Likewise.
* var-tracking.c (emit_note_insn_var_location): Likewise.
* tree-vrp.c (simplify_float_conversion_using_ranges): Likewise.
Return false inside rather than outside the loop if no wider mode
exists
* optabs.c (expand_binop): Update use of GET_MODE_WIDER_MODE
and GET_MODE_2XWIDER_MODE
(can_compare_p): Use else_void.
* gdbhooks.py (OptMachineModePrinter): New class.
(build_pretty_printer): Use it for opt_mode.
gcc/ada/
* gcc-interface/decl.c (validate_size): Update use of
GET_MODE_WIDER_MODE, forcing a wider mode to exist.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251457
|
|
The new iterators are:
- FOR_EACH_MODE_IN_CLASS: iterate over all the modes in a mode class.
- FOR_EACH_MODE_FROM: iterate over all the modes in a class,
starting at a given mode.
- FOR_EACH_WIDER_MODE: iterate over all the modes in a class,
starting at the next widest mode after a given mode.
- FOR_EACH_2XWIDER_MODE: same, but considering only modes that
are two times wider than the previous mode.
- FOR_EACH_MODE_UNTIL: iterate over all the modes in a class until
a given mode is reached.
- FOR_EACH_MODE: iterate over all the modes in a class between
two given modes, inclusive of the first but not the second.
These help with the stronger type checking added by later patches,
since every new mode will be in the same class as the previous one.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_traits): New structure.
(get_narrowest_mode): New function.
(mode_iterator::start): Likewise.
(mode_iterator::iterate_p): Likewise.
(mode_iterator::get_wider): Likewise.
(mode_iterator::get_known_wider): Likewise.
(mode_iterator::get_2xwider): Likewise.
(FOR_EACH_MODE_IN_CLASS): New mode iterator.
(FOR_EACH_MODE): Likewise.
(FOR_EACH_MODE_FROM): Likewise.
(FOR_EACH_MODE_UNTIL): Likewise.
(FOR_EACH_WIDER_MODE): Likewise.
(FOR_EACH_2XWIDER_MODE): Likewise.
* builtins.c (expand_builtin_strlen): Use new mode iterators.
* combine.c (simplify_comparison): Likewise
* config/i386/i386.c (type_natural_mode): Likewise.
* cse.c (cse_insn): Likewise.
* dse.c (find_shift_sequence): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
(init_emit_once): Likewise.
* explow.c (hard_function_value): Likewise.
* expmed.c (extract_fixed_bit_field_1): Likewise.
(extract_bit_field_1): Likewise.
(expand_divmod): Likewise.
(emit_store_flag_1): Likewise.
* expr.c (init_expr_target): Likewise.
(convert_move): Likewise.
(alignment_for_piecewise_move): Likewise.
(widest_int_mode_for_size): Likewise.
(emit_block_move_via_movmem): Likewise.
(copy_blkmode_to_reg): Likewise.
(set_storage_via_setmem): Likewise.
(compress_float_constant): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs.c (expand_binop): Likewise.
(expand_twoval_unop): Likewise.
(expand_twoval_binop): Likewise.
(widen_leading): Likewise.
(widen_bswap): Likewise.
(expand_parity): Likewise.
(expand_unop): Likewise.
(prepare_cmp_insn): Likewise.
(prepare_float_lib_cmp): Likewise.
(expand_float): Likewise.
(expand_fix): Likewise.
(expand_sfix_optab): Likewise.
* postreload.c (move2add_use_add2_insn): Likewise.
* reg-stack.c (reg_to_stack): Likewise.
* reginfo.c (choose_hard_reg_mode): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* stor-layout.c (mode_for_size): Likewise.
(smallest_mode_for_size): Likewise.
(mode_for_vector): Likewise.
(finish_bitfield_representative): Likewise.
* tree-ssa-math-opts.c (target_supports_divmod_p): Likewise.
* tree-vect-generic.c (type_for_widest_vector_mode): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
* var-tracking.c (prepare_call_arguments): Likewise.
gcc/ada/
* gcc-interface/misc.c (fp_prec_to_size): Use new mode iterators.
(fp_size_to_prec): Likewise.
gcc/c-family/
* c-common.c (c_common_fixed_point_type_for_size): Use new mode
iterators.
* c-cppbuiltin.c (c_cpp_builtins): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251455
|
|
This patch sets the nothrow flag for various calls to internal functions
that are not inherently NOTHROW (and so can't be declared that way in
internal-fn.def) but that are used in contexts that can guarantee
NOTHROWness.
2017-08-29 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* gimplify.c (gimplify_call_expr): Copy the nothrow flag to
calls to internal functions.
(gimplify_modify_expr): Likewise.
* tree-call-cdce.c (use_internal_fn): Likewise.
* tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise.
(convert_to_divmod): Set the nothrow flag.
* tree-if-conv.c (predicate_mem_writes): Likewise.
* tree-vect-stmts.c (vectorizable_mask_load_store): Likewise.
(vectorizable_call): Likewise.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vect-patterns.c (vect_recog_pow_pattern): Likewise.
(vect_recog_mask_conversion_pattern): Likewise.
From-SVN: r251401
|
|
2017-08-23 Tamar Christina <tamar.christina@arm.com>
PR middle-end/19706
* tree-ssa-math-opts.c (convert_expand_mult_copysign):
Fix single-use check.
From-SVN: r251303
|
|
...to replace instances of:
TYPE_PRECISION (t) == GET_MODE_PRECISION (TYPE_MODE (t))
These conditions would need to be rewritten with variable-sized
modes anyway.
2017-08-21 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree.h (type_has_mode_precision_p): New function.
* convert.c (convert_to_integer_1): Use it.
* expr.c (expand_expr_real_2): Likewise.
(expand_expr_real_1): Likewise.
* fold-const.c (fold_single_bit_test_into_sign_test): Likewise.
* match.pd: Likewise.
* tree-ssa-forwprop.c (simplify_rotate): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_fma): Likewise.
* tree-tailcall.c (process_assignment): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
* tree-vect-patterns.c (vect_recog_vector_vector_shift_pattern)
(vect_recog_mult_pattern, vect_recog_divmod_pattern): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
* tree-vrp.c (register_edge_assert_for_2): Likewise.
From-SVN: r251231
|
|
2017-08-08 Tamar Christina <tamar.christina@arm.com>
Andrew Pinski <pinskia@gmail.com>
PR middle-end/19706
* internal-fn.def (XORSIGN): New.
* optabs.def (xorsign_optab): New.
* tree-ssa-math-opts.c (is_copysign_call_with_1): New.
(convert_expand_mult_copysign): New.
(pass_optimize_widening_mul::execute): Call convert_expand_mult_copysign.
Co-Authored-By: Andrew Pinski <pinskia@gmail.com>
From-SVN: r250956
|