From 59100dfc42bbe92caff61bca1560da4a30f99906 Mon Sep 17 00:00:00 2001 From: Luis Machado Date: Wed, 23 May 2018 16:20:30 +0000 Subject: [Patch 01/02] Introduce prefetch-minimum stride option This patch adds a new option to control the minimum stride, for a memory reference, after which the loop prefetch pass may issue software prefetch hints for. There are two motivations: * Make the pass less aggressive, only issuing prefetch hints for bigger strides that are more likely to benefit from prefetching. I've noticed a case in cpu2017 where we were issuing thousands of hints, for example. * For processors that have a hardware prefetcher, like Falkor, it allows the loop prefetch pass to defer prefetching of smaller (less than the threshold) strides to the hardware prefetcher instead. This prevents conflicts between the software prefetcher and the hardware prefetcher. I've noticed considerable reduction in the number of prefetch hints and slightly positive performance numbers. This aligns GCC and LLVM in terms of prefetch behavior for Falkor. The default settings should guarantee no changes for existing targets. Those are free to tweak the settings as necessary. gcc/ChangeLog: 2018-05-23 Luis Machado * config/aarch64/aarch64-protos.h (cpu_prefetch_tune) : New const int field. * config/aarch64/aarch64.c (generic_prefetch_tune): Update to include minimum_stride field defaulting to -1. (exynosm1_prefetch_tune): Likewise. (thunderxt88_prefetch_tune): Likewise. (thunderx_prefetch_tune): Likewise. (thunderx2t99_prefetch_tune): Likewise. (qdf24xx_prefetch_tune) : Set to 2048. : Set to 3. (aarch64_override_options_internal): Update to set PARAM_PREFETCH_MINIMUM_STRIDE. * doc/invoke.texi (prefetch-minimum-stride): Document new option. * params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New. * params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define. * tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return false if stride is constant and is below the minimum stride threshold. From-SVN: r260617 --- gcc/params.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'gcc/params.h') diff --git a/gcc/params.h b/gcc/params.h index 98249d2..96012db 100644 --- a/gcc/params.h +++ b/gcc/params.h @@ -196,6 +196,8 @@ extern void init_param_values (int *params); PARAM_VALUE (PARAM_L1_CACHE_LINE_SIZE) #define L2_CACHE_SIZE \ PARAM_VALUE (PARAM_L2_CACHE_SIZE) +#define PREFETCH_MINIMUM_STRIDE \ + PARAM_VALUE (PARAM_PREFETCH_MINIMUM_STRIDE) #define USE_CANONICAL_TYPES \ PARAM_VALUE (PARAM_USE_CANONICAL_TYPES) #define IRA_MAX_LOOPS_NUM \ -- cgit v1.1