riscv-gnu-toolchain/gcc.git

author	Richard Sandiford <richard.sandiford@arm.com>	2023-08-24 10:18:05 +0100
committer	Richard Sandiford <richard.sandiford@arm.com>	2023-08-24 10:18:05 +0100
commit	aa81e80a5ae663f169496c580ba30ae281c83940 (patch)
tree	9b640bf0d7ca569ccc8f6b0ae491fedbc0577b69 /gcc/gcc.h
parent	a28d4fce8ec2540259a257149de7081f27fb027e (diff)

aarch64: Account for different Advanced SIMD fusing options

The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean
that either side of a subtraction can start an accumulator chain.
However, Advanced SIMD doesn't have an equivalent instruction.
This means that, for Advanced SIMD, a subtraction can only be
fused if the second operand is a multiplication.

Also, if both sides of a subtraction are multiplications,
and if the second operand is used multiple times, such as:

    c * d - a * b
    e * f - a * b

then the first rather than second multiplication operand will tend
to be fused.  On Advanced SIMD, this leads to:

    tmp1 = a * b
    tmp2 = -tmp1
     ... = tmp2 + c * d   // FMLA
     ... = tmp2 + e * f   // FMLA

where one of the FMLAs also requires a MOV.

This patch tries to account for this in the vector cost model.
It improves roms performance by 2-3% on Neoverse V1.  It's also
needed to avoid a regression in fotonik for Neoverse N2 and
Neoverse V2 with the patch for PR110625.

gcc/
	* config/aarch64/aarch64.cc: Include ssa.h.
	(aarch64_multiply_add_p): Require the second operand of an
	Advanced SIMD subtraction to be a multiplication.  Assume that
	such an operation won't be fused if the second operand is used
	multiple times and if the first operand is also a multiplication.

gcc/testsuite/
	* gcc.target/aarch64/neoverse_v1_2.c: New test.
	* gcc.target/aarch64/neoverse_v1_3.c: Likewise.
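The rule described above can be sketched in standalone C. This is only an illustration of the commit message, not the code in aarch64.cc: struct sub_info, advsimd_sub_fuses, source_form, lowered_form and self_check are hypothetical names, and the real aarch64_multiply_add_p works on GIMPLE statements rather than on a summary struct. The second half models the shared-subtrahend case with plain scalar arithmetic, where each "FMLA" is written as an ordinary multiply and add.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of one vector subtraction "op0 - op1".  */
struct sub_info {
    bool op0_is_mult;   /* first operand is a multiplication */
    bool op1_is_mult;   /* second operand is a multiplication */
    unsigned op1_uses;  /* number of uses of the second operand */
};

/* Simplified model of the Advanced SIMD rule from the commit message:
   only the second operand can be fused, and the fusion is assumed not
   to happen if that operand is shared and the first operand is also
   a multiplication.  */
static bool advsimd_sub_fuses(struct sub_info s)
{
    if (!s.op1_is_mult)
        return false;
    if (s.op0_is_mult && s.op1_uses > 1)
        return false;
    return true;
}

/* The two subtractions as written in the source.  */
static void source_form(float a, float b, float c, float d,
                        float e, float f, float out[2])
{
    out[0] = c * d - a * b;
    out[1] = e * f - a * b;
}

/* The shape described for Advanced SIMD: negate the shared product
   once, then use it as the accumulator of two multiply-adds.  */
static void lowered_form(float a, float b, float c, float d,
                         float e, float f, float out[2])
{
    float tmp1 = a * b;
    float tmp2 = -tmp1;
    out[0] = tmp2 + c * d;  /* stands in for an FMLA */
    out[1] = tmp2 + e * f;  /* stands in for an FMLA */
}

/* Check both pieces with small, exactly representable inputs.  */
static void self_check(void)
{
    float x[2], y[2];
    source_form(2, 3, 4, 5, 6, 7, x);
    lowered_form(2, 3, 4, 5, 6, 7, y);
    assert(x[0] == y[0] && x[1] == y[1]);

    struct sub_info shared = { true, true, 2 };
    struct sub_info single = { true, true, 1 };
    assert(!advsimd_sub_fuses(shared));
    assert(advsimd_sub_fuses(single));
}
```

In lowered_form both multiply-adds read tmp2 as their accumulator. An accumulating instruction overwrites its accumulator register, so one of them needs a copy of tmp2 first, which is the extra MOV the commit message refers to.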

Diffstat (limited to 'gcc/gcc.h')

0 files changed, 0 insertions, 0 deletions

