author Bin Cheng <bin.cheng@arm.com> 2017-12-07 18:03:53 +0000
committer Bin Cheng <amker@gcc.gnu.org> 2017-12-07 18:03:53 +0000
commit fbdec14e80e9399cd301ed30340268bdc5b5c2eb
tree 6174dc33b68cbeea645678d9d23bde9cd4603643 /gcc/doc
parent 75214935bee043e659ca7172a84451ded10e8987
re PR tree-optimization/81303 (410.bwaves regression caused by r249919)

PR tree-optimization/81303
* Makefile.in (gimple-loop-interchange.o): New object file.
* common.opt (floop-interchange): Reuse the option from graphite.
* doc/invoke.texi (-floop-interchange): Ditto. New document for
-floop-interchange and mention it for -O3.
* opts.c (default_options_table): Enable -floop-interchange at -O3.
* gimple-loop-interchange.cc: New file.
* params.def (PARAM_LOOP_INTERCHANGE_MAX_NUM_STMTS): New parameter.
(PARAM_LOOP_INTERCHANGE_STRIDE_RATIO): New parameter.
* passes.def (pass_linterchange): New pass.
* timevar.def (TV_LINTERCHANGE): New time var.
* tree-pass.h (make_pass_linterchange): New declaration.
* tree-ssa-loop-ivcanon.c (create_canonical_iv): Change to external
interchange. Record IV before/after increment in new parameters.
* tree-ssa-loop-ivopts.h (create_canonical_iv): New declaration.
* tree-vect-loop.c (vect_is_simple_reduction): Factor out reduction
path check into...
(check_reduction_path): ...New function here.
* tree-vectorizer.h (check_reduction_path): New declaration.

gcc/testsuite
* gcc.dg/tree-ssa/loop-interchange-1.c: New test.
* gcc.dg/tree-ssa/loop-interchange-1b.c: New test.
* gcc.dg/tree-ssa/loop-interchange-2.c: New test.
* gcc.dg/tree-ssa/loop-interchange-3.c: New test.
* gcc.dg/tree-ssa/loop-interchange-4.c: New test.
* gcc.dg/tree-ssa/loop-interchange-5.c: New test.
* gcc.dg/tree-ssa/loop-interchange-6.c: New test.
* gcc.dg/tree-ssa/loop-interchange-7.c: New test.
* gcc.dg/tree-ssa/loop-interchange-8.c: New test.
* gcc.dg/tree-ssa/loop-interchange-9.c: New test.
* gcc.dg/tree-ssa/loop-interchange-10.c: New test.
* gcc.dg/tree-ssa/loop-interchange-11.c: New test.
* gcc.dg/tree-ssa/loop-interchange-12.c: New test.
* gcc.dg/tree-ssa/loop-interchange-13.c: New test.

Co-Authored-By: Richard Biener <rguenther@suse.de>

From-SVN: r255472

Diffstat (limited to 'gcc/doc')
-rw-r--r-- gcc/doc/invoke.texi 28
1 files changed, 26 insertions, 2 deletions
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 447b66a..50740c5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7409,6 +7409,7 @@ by @option{-O2} and also turns on the following optimization flags:
-ftree-loop-vectorize @gol
-ftree-loop-distribution @gol
-ftree-loop-distribute-patterns @gol
+-floop-interchange @gol
-fsplit-paths @gol
-ftree-slp-vectorize @gol
-fvect-cost-model @gol
@@ -8508,12 +8509,10 @@ Perform loop optimizations on trees. This flag is enabled by default
at @option{-O} and higher.
@item -ftree-loop-linear
-@itemx -floop-interchange
@itemx -floop-strip-mine
@itemx -floop-block
@itemx -floop-unroll-and-jam
@opindex ftree-loop-linear
-@opindex floop-interchange
@opindex floop-strip-mine
@opindex floop-block
@opindex floop-unroll-and-jam
@@ -8608,6 +8607,25 @@ ENDDO
@end smallexample
and the initialization loop is transformed into a call to memset zero.
+@item -floop-interchange
+@opindex floop-interchange
+Perform loop interchange outside of graphite. This flag can improve cache
+performance on loop nest and allow further loop optimizations, like
+vectorization, to take place. For example, the loop
+@smallexample
+for (int i = 0; i < N; i++)
+ for (int j = 0; j < N; j++)
+ for (int k = 0; k < N; k++)
+ c[i][j] = c[i][j] + a[i][k]*b[k][j];
+@end smallexample
+is transformed to
+@smallexample
+for (int i = 0; i < N; i++)
+ for (int k = 0; k < N; k++)
+ for (int j = 0; j < N; j++)
+ c[i][j] = c[i][j] + a[i][k]*b[k][j];
+@end smallexample
+
@item -ftree-loop-im
@opindex ftree-loop-im
Perform loop invariant motion on trees. This pass moves only invariants that
@@ -10479,6 +10497,12 @@ The size of L1 cache, in kilobytes.
@item l2-cache-size
The size of L2 cache, in kilobytes.
+@item loop-interchange-max-num-stmts
+The maximum number of stmts in a loop to be interchanged.
+
+@item loop-interchange-stride-ratio
+The minimum ratio between stride of two loops for interchange to be profitable.
+
@item min-insn-to-prefetch-ratio
The minimum ratio between the number of instructions and the
number of prefetches to enable prefetching in a loop.