aboutsummaryrefslogtreecommitdiff
path: root/gcc/gimple-array-bounds.cc
diff options
context:
space:
mode:
authorKyrylo Tkachov <kyrylo.tkachov@arm.com>2021-03-18 08:57:01 +0000
committerKyrylo Tkachov <kyrylo.tkachov@arm.com>2021-03-18 09:56:47 +0000
commit8f0c9d53ef3a9b8ba2579b53596cc2b7f5d8bf69 (patch)
tree03b01018553f987a82275f275aaadf6cb2f01792 /gcc/gimple-array-bounds.cc
parent3bcf19215d88e6ec33d283352c52005f02dbc784 (diff)
downloadgcc-8f0c9d53ef3a9b8ba2579b53596cc2b7f5d8bf69.zip
gcc-8f0c9d53ef3a9b8ba2579b53596cc2b7f5d8bf69.tar.gz
gcc-8f0c9d53ef3a9b8ba2579b53596cc2b7f5d8bf69.tar.bz2
aarch64: Improve generic SVE tuning defaults
This patch adds the recently-added tweak to split some SVE VL-based scalar operations [1] to the generic tuning used for SVE, as enabled by adding +sve to the -march flag, for example -march=armv8.2-a+sve. The recommendation for best performance on a particular CPU remains unchanged: use the -mcpu option for that CPU, where possible. -mcpu=native makes this straightforward for native compilation. The tweak to split out SVE VL-based scalar operations is a consistent win for the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of SPEC2017 on A64FX with this tweak on didn't show any non-noise differences. It is also expected to be neutral on SVE2 implementations. Therefore, the patch enables the tweak for generic +sve tuning e.g. -march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it, therefore the tweak is disabled for generic tuning when +sve2 is in -march e.g. -march=armv8.2-a+sve2. The implementation of this approach requires a bit of custom logic in aarch64_override_options_internal to handle these kinds of architecture-dependent decisions, but we do believe the user-facing principle here is important to implement. In general, for the generic target we're using a decision framework that looks like: * If all cores that are known to benefit from an optimization are of architecture X, and all other cores that implement X or above are not impacted, or have a very slight impact, we will consider it for generic tuning for architecture X. * We will not enable that optimisation for generic tuning for architecture X+1 if no known cores of architecture X+1 or above will benefit. This framework allows us to improve generic tuning for CPUs of generation X while avoiding accumulating tweaks for future CPUs of generation X+1, X+2... that do not need them, and thus avoid even the slight negative effects of these optimisations if the user is willing to tell us the desired architecture accurately. X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-a etc) or optional architecture extensions (like SVE, SVE2). [1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7 gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning): Define. (aarch64_override_options_internal): Use it. (generic_tunings): Add AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to tune_flags. gcc/testsuite/ChangeLog: * g++.target/aarch64/sve/aarch64-sve.exp: Add -moverride=tune=none to sve_flags. * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise. * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise. * gcc.target/aarch64/sve/aarch64-sve.exp: Likewise. * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise. * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
Diffstat (limited to 'gcc/gimple-array-bounds.cc')
0 files changed, 0 insertions, 0 deletions