Diffstat (limited to 'gcc/doc/invoke.texi')
-rw-r--r-- | gcc/doc/invoke.texi | 76 |
1 files changed, 76 insertions, 0 deletions
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index aa73f82..5768f08 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -338,6 +338,7 @@ Objective-C and Objective-C++ Dialects}.
 -fira-coalesce -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
+-floop-block -floop-interchange -floop-strip-mine @gol
 -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
 -fmudflapir -fmudflapth -fno-branch-count-reg -fno-default-inline @gol
@@ -6012,6 +6013,81 @@ at @option{-O} and higher.
 Perform linear loop transformations on tree. This flag can improve cache
 performance and allow further loop optimizations to take place.
 
+@item -floop-interchange
+Perform loop interchange transformations on loops. Interchanging two
+nested loops switches the inner and outer loops. For example, given a
+loop like:
+@smallexample
+DO J = 1, M
+  DO I = 1, N
+    A(J, I) = A(J, I) * C
+  ENDDO
+ENDDO
+@end smallexample
+loop interchange will transform the loop as if the user had written:
+@smallexample
+DO I = 1, N
+  DO J = 1, M
+    A(J, I) = A(J, I) * C
+  ENDDO
+ENDDO
+@end smallexample
+which can be beneficial when @code{N} is larger than the caches,
+because in Fortran, the elements of an array are stored in memory
+contiguously by column, and the original loop iterates over rows,
+potentially creating at each access a cache miss. This optimization
+applies to all the languages supported by GCC and is not limited to
+Fortran.
+
+@item -floop-strip-mine
+Perform loop strip mining transformations on loops. Strip mining
+splits a loop into two nested loops. The outer loop has strides
+equal to the strip size and the inner loop has strides of the
+original loop within a strip. For example, given a loop like:
+@smallexample
+DO I = 1, N
+  A(I) = A(I) + C
+ENDDO
+@end smallexample
+loop strip mining will transform the loop as if the user had written:
+@smallexample
+DO II = 1, N, 4
+  DO I = II, min (II + 4, N)
+    A(I) = A(I) + C
+  ENDDO
+ENDDO
+@end smallexample
+This optimization applies to all the languages supported by GCC and is
+not limited to Fortran.
+
+@item -floop-block
+Perform loop blocking transformations on loops. Blocking strip mines
+each loop in the loop nest such that the memory accesses of the
+element loops fit inside caches. For example, given a loop like:
+@smallexample
+DO I = 1, N
+  DO J = 1, M
+    A(J, I) = B(I) + C(J)
+  ENDDO
+ENDDO
+@end smallexample
+loop blocking will transform the loop as if the user had written:
+@smallexample
+DO II = 1, N, 64
+  DO JJ = 1, M, 64
+    DO I = II, min (II + 64, N)
+      DO J = JJ, min (JJ + 64, M)
+        A(J, I) = B(I) + C(J)
+      ENDDO
+    ENDDO
+  ENDDO
+ENDDO
+@end smallexample
+which can be beneficial when @code{M} is larger than the caches,
+because the innermost loop will iterate over a smaller amount of data
+that can be kept in the caches. This optimization applies to all the
+languages supported by GCC and is not limited to Fortran.
+
 @item -fcheck-data-deps
 @opindex fcheck-data-deps
 Compare the results of several data dependence analyzers. This option
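
Since the patch text says the three transformations are not limited to Fortran, the following is a rough C sketch of what each one does to a loop nest. It is not taken from the patch or from GCC itself. All function names, the `STRIP` constant, and the `check_transforms` helper are hypothetical names chosen for illustration. Two things differ from the Fortran examples: C arrays are row-major, so the contiguous index is the last one, and C loops use half-open ranges, so a strip starting at `ii` is written as `[ii, min(ii + STRIP, n))`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strip/block size chosen for this sketch. */
enum { STRIP = 4 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Interchange, before: the inner loop strides by m elements. */
void scale_strided(double *a, size_t n, size_t m, double c)
{
    for (size_t j = 0; j < m; j++)
        for (size_t i = 0; i < n; i++)
            a[i * m + j] *= c;
}

/* Interchange, after: the inner loop walks contiguous elements. */
void scale_interchanged(double *a, size_t n, size_t m, double c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            a[i * m + j] *= c;
}

/* Strip mining, before: a single loop over [0, n). */
void add_plain(double *a, size_t n, double c)
{
    for (size_t i = 0; i < n; i++)
        a[i] += c;
}

/* Strip mining, after: the outer loop steps by STRIP, the inner loop
   covers the half-open strip [ii, min(ii + STRIP, n)). */
void add_strip_mined(double *a, size_t n, double c)
{
    for (size_t ii = 0; ii < n; ii += STRIP)
        for (size_t i = ii; i < min_sz(ii + STRIP, n); i++)
            a[i] += c;
}

/* Blocking, before: a plain two-deep nest. */
void fill_plain(double *a, const double *b, const double *c,
                size_t n, size_t m)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            a[i * m + j] = b[i] + c[j];
}

/* Blocking, after: both loops are strip mined and the two strip loops
   are moved outward, so the element loops touch one tile at a time. */
void fill_blocked(double *a, const double *b, const double *c,
                  size_t n, size_t m)
{
    for (size_t ii = 0; ii < n; ii += STRIP)
        for (size_t jj = 0; jj < m; jj += STRIP)
            for (size_t i = ii; i < min_sz(ii + STRIP, n); i++)
                for (size_t j = jj; j < min_sz(jj + STRIP, m); j++)
                    a[i * m + j] = b[i] + c[j];
}

/* Returns 1 when the first len elements of x and y are identical. */
int same(const double *x, const double *y, size_t len)
{
    for (size_t k = 0; k < len; k++)
        if (x[k] != y[k])
            return 0;
    return 1;
}

/* Runs each original/transformed pair on equal inputs and compares. */
void check_transforms(void)
{
    enum { N = 7, M = 10 };      /* deliberately not multiples of STRIP */
    const size_t len = (size_t)N * M;
    double x[N * M], y[N * M], b[N], c[M];

    for (size_t k = 0; k < len; k++)
        x[k] = y[k] = (double)k;
    scale_strided(x, N, M, 3.0);
    scale_interchanged(y, N, M, 3.0);
    assert(same(x, y, len));

    add_plain(x, len, 1.5);
    add_strip_mined(y, len, 1.5);
    assert(same(x, y, len));

    for (size_t k = 0; k < N; k++) b[k] = (double)k;
    for (size_t k = 0; k < M; k++) c[k] = 0.5 * (double)k;
    fill_plain(x, b, c, N, M);
    fill_blocked(y, b, c, N, M);
    assert(same(x, y, len));
}
```

Calling `check_transforms()` from a `main` function exercises all three pairs. Each transformed loop performs the same per-element operations as its original, only in a different order, so the results should match exactly. The sizes 7 and 10 are picked so that the last strip in each dimension is partial, which is the case the `min` bound exists to handle.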