author: Luis Machado <luis.machado@linaro.org> 2018-05-07 14:08:55 +0000
committer: Luis Machado <luisgpm@gcc.gnu.org> 2018-05-07 14:08:55 +0000
commit: 57e2d1175cf972b0a352be46903d87633304ce4e
tree: 1b3038b599d5bdaa534d28e0a828211ef7f2acb6 /gcc/params.def
parent: 4826f48ec9fcb7e068df047110ea74d795b6bb04
Introduce prefetch-minimum stride option
This patch adds a new option to control the minimum stride, for a memory reference, above which the loop prefetch pass may issue software prefetch hints.

There are two motivations:

* Make the pass less aggressive, only issuing prefetch hints for bigger strides that are more likely to benefit from prefetching. I've noticed a case in cpu2017 where we were issuing thousands of hints, for example.

* For processors that have a hardware prefetcher, like Falkor, it allows the loop prefetch pass to defer prefetching of smaller (less than the threshold) strides to the hardware prefetcher instead. This prevents conflicts between the software prefetcher and the hardware prefetcher.

I've noticed a considerable reduction in the number of prefetch hints and slightly positive performance numbers. This aligns GCC and LLVM in terms of prefetch behavior for Falkor.

The default settings should guarantee no changes for existing targets. Those are free to tweak the settings as necessary.

2018-05-07  Luis Machado  <luis.machado@linaro.org>

Introduce option to limit software prefetching to known constant strides above a specific threshold with the goal of preventing conflicts with a hardware prefetcher.

gcc/
* config/aarch64/aarch64-protos.h (cpu_prefetch_tune) <minimum_stride>: New const int field.
* config/aarch64/aarch64.c (generic_prefetch_tune): Update to include minimum_stride field.
(exynosm1_prefetch_tune): Likewise.
(thunderxt88_prefetch_tune): Likewise.
(thunderx_prefetch_tune): Likewise.
(thunderx2t99_prefetch_tune): Likewise.
(qdf24xx_prefetch_tune): Likewise. Set minimum_stride to 2048.
(aarch64_override_options_internal): Update to set PARAM_PREFETCH_MINIMUM_STRIDE.
* doc/invoke.texi (prefetch-minimum-stride): Document new option.
* params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New.
* params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define.
* tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return false if stride is constant and is below the minimum stride threshold.
From-SVN: r259995
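
The threshold test described in the ChangeLog entry for should_issue_prefetch_p can be illustrated with a small standalone sketch. This is not the code from tree-ssa-loop-prefetch.c: the struct `mem_ref_sketch` and the function `passes_minimum_stride` are hypothetical names made up for illustration, and comparing the stride's absolute value against the threshold is an assumption about how negative strides are treated.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory reference inside a loop.
   GCC's real data structures are different.  */
struct mem_ref_sketch
{
  bool stride_is_constant;  /* Is the stride known at compile time?  */
  long long stride;         /* Stride in bytes; may be negative.  */
};

/* Return false when the stride is a known constant whose magnitude is
   below MINIMUM_STRIDE, leaving such references to a hardware
   prefetcher.  Using the absolute value is an assumption of this
   sketch.  */
static bool
passes_minimum_stride (const struct mem_ref_sketch *ref,
                       long long minimum_stride)
{
  if (ref->stride_is_constant && llabs (ref->stride) < minimum_stride)
    return false;

  return true;
}
```

With the default of -1 from params.def, no absolute stride can be smaller than the threshold, so under this sketch every reference passes, which is consistent with the statement that the default should not change behavior for existing targets. With the value 2048 that the patch sets for qdf24xx, a constant 64-byte stride would be filtered out, while a 4096-byte stride or a non-constant stride would still be eligible.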
Diffstat (limited to 'gcc/params.def')
-rw-r--r-- gcc/params.def 9
1 files changed, 9 insertions, 0 deletions
diff --git a/gcc/params.def b/gcc/params.def
index dad47ec..2166deb 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -790,6 +790,15 @@ DEFPARAM (PARAM_L2_CACHE_SIZE,
"The size of L2 cache.",
512, 0, 0)
+/* The minimum constant stride beyond which we should use prefetch hints
+ for. */
+
+DEFPARAM (PARAM_PREFETCH_MINIMUM_STRIDE,
+ "prefetch-minimum-stride",
+ "The minimum constant stride beyond which we should use prefetch "
+ "hints for.",
+ -1, 0, 0)
+
/* Maximum number of statements in loop nest for loop interchange. */
DEFPARAM (PARAM_LOOP_INTERCHANGE_MAX_NUM_STMTS,