aboutsummaryrefslogtreecommitdiff
path: root/gcc/doc/invoke.texi
diff options
context:
space:
mode:
Diffstat (limited to 'gcc/doc/invoke.texi')
-rw-r--r--gcc/doc/invoke.texi76
1 files changed, 76 insertions, 0 deletions
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index aa73f82..5768f08 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -338,6 +338,7 @@ Objective-C and Objective-C++ Dialects}.
-fira-coalesce -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
-fivopts -fkeep-inline-functions -fkeep-static-consts @gol
+-floop-block -floop-interchange -floop-strip-mine @gol
-fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
-fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
-fmudflapir -fmudflapth -fno-branch-count-reg -fno-default-inline @gol
@@ -6012,6 +6013,81 @@ at @option{-O} and higher.
Perform linear loop transformations on tree. This flag can improve cache
performance and allow further loop optimizations to take place.
+@item -floop-interchange
+Perform loop interchange transformations on loops. Interchanging two
+nested loops switches the inner and outer loops. For example, given a
+loop like:
+@smallexample
+DO J = 1, M
+ DO I = 1, N
+ A(J, I) = A(J, I) * C
+ ENDDO
+ENDDO
+@end smallexample
+loop interchange will transform the loop as if the user had written:
+@smallexample
+DO I = 1, N
+ DO J = 1, M
+ A(J, I) = A(J, I) * C
+ ENDDO
+ENDDO
+@end smallexample
+which can be beneficial when @code{N} is larger than the caches,
+because in Fortran, the elements of an array are stored in memory
+contiguously by column, and the original loop iterates over rows,
+potentially creating at each access a cache miss. This optimization
+applies to all the languages supported by GCC and is not limited to
+Fortran.
+
+@item -floop-strip-mine
+Perform loop strip mining transformations on loops. Strip mining
+splits a loop into two nested loops. The outer loop has strides
+equal to the strip size and the inner loop has strides of the
+original loop within a strip. For example, given a loop like:
+@smallexample
+DO I = 1, N
+ A(I) = A(I) + C
+ENDDO
+@end smallexample
+loop strip mining will transform the loop as if the user had written:
+@smallexample
+DO II = 1, N, 4
+ DO I = II, min (II + 4, N)
+ A(I) = A(I) + C
+ ENDDO
+ENDDO
+@end smallexample
+This optimization applies to all the languages supported by GCC and is
+not limited to Fortran.
+
+@item -floop-block
+Perform loop blocking transformations on loops. Blocking strip mines
+each loop in the loop nest such that the memory accesses of the
+element loops fit inside caches. For example, given a loop like:
+@smallexample
+DO I = 1, N
+ DO J = 1, M
+ A(J, I) = B(I) + C(J)
+ ENDDO
+ENDDO
+@end smallexample
+loop blocking will transform the loop as if the user had written:
+@smallexample
+DO II = 1, N, 64
+ DO JJ = 1, M, 64
+ DO I = II, min (II + 64, N)
+ DO J = JJ, min (JJ + 64, M)
+ A(J, I) = B(I) + C(J)
+ ENDDO
+ ENDDO
+ ENDDO
+ENDDO
+@end smallexample
+which can be beneficial when @code{M} is larger than the caches,
+because the innermost loop will iterate over a smaller amount of data
+that can be kept in the caches. This optimization applies to all the
+languages supported by GCC and is not limited to Fortran.
+
@item -fcheck-data-deps
@opindex fcheck-data-deps
Compare the results of several data dependence analyzers. This option