diff options
author | Sebastian Pop <sebastian.pop@amd.com> | 2010-08-24 23:35:48 +0000 |
---|---|---|
committer | Sebastian Pop <spop@gcc.gnu.org> | 2010-08-24 23:35:48 +0000 |
commit | bd544141e09187ded0b02cac4ee5ce56ca38622c (patch) | |
tree | 98e3e8412290ce628f63d66548ba9b9cdb10c1ca /gcc/doc/invoke.texi | |
parent | 3a9abc98c235676b9268f9bce3ee37a50266557d (diff) | |
download | gcc-bd544141e09187ded0b02cac4ee5ce56ca38622c.zip gcc-bd544141e09187ded0b02cac4ee5ce56ca38622c.tar.gz gcc-bd544141e09187ded0b02cac4ee5ce56ca38622c.tar.bz2 |
Add flag -ftree-loop-if-convert-stores.
This patch adds a flag that controls the replacement of the memory
writes that are in predicated basic blocks with a full write:
for (...)
if (cond)
A[i] = foo
is replaced with:
for (...)
A[i] = cond ? foo : A[i]
In order to do this, we have to call gimple_could_trap_p instead of
gimple_assign_rhs_could_trap_p, as we have to also check that the LHS
of assign stmts does not trap.
* common.opt (ftree-loop-if-convert-stores): New flag.
* doc/invoke.texi (ftree-loop-if-convert-stores): Documented.
* tree-if-conv.c (ifc_temp_var): Pass an extra parameter GSI. Insert
the created statement before GSI.
(if_convertible_phi_p): Allow virtual phi nodes when
flag_loop_if_convert_stores is set.
(if_convertible_gimple_assign_stmt_p): Allow memory reads and writes
Do not handle types that do not match is_gimple_reg_type.
Remove loop and bb parameters. Call gimple_could_trap_p instead of
when flag_loop_if_convert_stores is set, as LHS can contain
memory refs.
(if_convertible_stmt_p): Remove loop and bb parameters. Update calls
to if_convertible_gimple_assign_stmt_p.
(if_convertible_loop_p): Update call to if_convertible_stmt_p.
(replace_phi_with_cond_gimple_assign_stmt): Renamed
predicate_scalar_phi. Do not handle virtual phi nodes.
(ifconvert_phi_nodes): Renamed predicate_all_scalar_phis.
Call predicate_scalar_phi.
(insert_gimplified_predicates): Insert the gimplified predicate of a BB
just after the labels for flag_loop_if_convert_stores, otherwise
insert the predicate in the end of the BB.
(predicate_mem_writes): New.
(combine_blocks): Call predicate_all_scalar_phis. When
flag_loop_if_convert_stores is set, call predicate_mem_writes.
(tree_if_conversion): Call mark_sym_for_renaming when
flag_loop_if_convert_stores is set.
(main_tree_if_conversion): Return TODO_update_ssa_only_virtuals when
flag_loop_if_convert_stores is set.
* gcc.dg/tree-ssa/ifc-4.c: New.
* gcc.dg/tree-ssa/ifc-7.c: New.
From-SVN: r163530
Diffstat (limited to 'gcc/doc/invoke.texi')
-rw-r--r-- | gcc/doc/invoke.texi | 20 |
1 files changed, 19 insertions, 1 deletions
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 148117d..7e069f0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -383,7 +383,8 @@ Objective-C and Objective-C++ Dialects}. -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol --ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol +-ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol +-ftree-loop-if-convert-memory-writes -ftree-loop-im @gol -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol @@ -6931,6 +6932,23 @@ the innermost loops in order to improve the ability of the vectorization pass to handle these loops. This is enabled by default if vectorization is enabled. +@item -ftree-loop-if-convert-stores +Attempt to also if-convert conditional jumps containing memory writes. +This transformation can be unsafe for multi-threaded programs as it +transforms conditional memory writes into unconditional memory writes. +For example, +@smallexample +for (i = 0; i < N; i++) + if (cond) + A[i] = expr; +@end smallexample +would be transformed to +@smallexample +for (i = 0; i < N; i++) + A[i] = cond ? expr : A[i]; +@end smallexample +potentially producing data races. + @item -ftree-loop-distribution Perform loop distribution. This flag can improve cache performance on big loop bodies and allow further loop optimizations, like |