author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2023-05-15 12:05:35 +0100
committer: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2023-05-15 12:05:35 +0100
commit: c4733ea2b46278974f8d78a8afb379447cc38201
tree: 3724fdd0cd98c704a6dbb6a3aa124590a2cd334f /gcc/debug.cc
parent: 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
aarch64: Cost vector comparisons more accurately

We are missing cases for combining FACGE/FACGT instructions.
In the testcase of the patch we generate:

foo:
        fabs    v3.4s, v0.4s
        fabs    v0.4s, v1.4s
        fabs    v1.4s, v2.4s
        fcmgt   v0.4s, v3.4s, v0.4s
        fcmgt   v1.4s, v3.4s, v1.4s
        b       g

This is because combine is rejecting the pattern due to costs:

Successfully matched this instruction:
(set (reg:V4SI 106)
    (neg:V4SI (lt:V4SI (abs:V4SF (reg:V4SF 113))
            (abs:V4SF (reg:V4SF 111)))))
rejecting combination of insns 8, 9 and 10
original costs 8 + 8 + 12 = 28
replacement costs 8 + 28 = 36

The cost routine is evidently recursing into the individual arms of the RTX
and costing each operator separately.
This patch teaches the aarch64 rtx costs routine that our vector comparisons
are represented as a NEG of compare operators, with the FACGE/FACGT
operations in particular having ABS on each arm.
With this patch we get the much more reasonable dump:

original costs 8 + 8 + 8 = 24
replacement costs 8 + 8 = 16

and generate the optimal assembly:

foo:
        mov     v31.16b, v0.16b
        facgt   v0.4s, v0.4s, v1.4s
        facgt   v1.4s, v31.4s, v2.4s
        b       g

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

        * config/aarch64/aarch64.cc (aarch64_rtx_costs, NEG case): Add costing
        logic for vector modes.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/facg_1.c: New test.
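
The costing issue described above can be illustrated with a toy model. This is only a sketch and not the code in aarch64_rtx_costs: the Node type, the kInsnCost constant, and the functions naive_cost, pattern_cost, combine_accepts and facgt_like_expr are all hypothetical names invented for this example, and the unit costs are arbitrary rather than GCC's real values. It shows how charging every operator separately makes a combined NEG (LT (ABS x) (ABS y)) look more expensive than the instructions it replaces, while recognising the whole shape as one vector compare does not.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for an RTX expression; NOT GCC's real rtx type.
enum class Code { Reg, Abs, Lt, Neg };

struct Node {
    Code code;
    bool vector_mode;                        // true for modes like V4SF/V4SI
    std::vector<std::shared_ptr<Node>> ops;  // operands ("arms")
};
using NodeP = std::shared_ptr<Node>;

NodeP make(Code c, bool vec, std::vector<NodeP> ops = {}) {
    assert(c != Code::Reg || ops.empty());   // registers have no operands
    return std::make_shared<Node>(Node{c, vec, std::move(ops)});
}

constexpr int kInsnCost = 4;  // arbitrary illustrative cost of one instruction

// Naive costing: recurse into every arm and charge each operator separately.
int naive_cost(const Node& n) {
    int cost = (n.code == Code::Reg) ? 0 : kInsnCost;
    for (const auto& op : n.ops) cost += naive_cost(*op);
    return cost;
}

// Pattern-aware costing: a vector NEG of a comparison is one compare
// instruction, and ABS on both arms folds into an absolute compare.
int pattern_cost(const Node& n) {
    if (n.code == Code::Neg && n.vector_mode && n.ops.size() == 1 &&
        n.ops[0]->code == Code::Lt && n.ops[0]->ops.size() == 2) {
        const Node& cmp = *n.ops[0];
        bool both_abs = true;
        for (const auto& arm : cmp.ops)
            both_abs = both_abs && arm->code == Code::Abs && arm->ops.size() == 1;
        int cost = kInsnCost;  // the single compare instruction
        for (const auto& arm : cmp.ops)
            cost += pattern_cost(both_abs ? *arm->ops[0] : *arm);
        return cost;
    }
    int cost = (n.code == Code::Reg) ? 0 : kInsnCost;
    for (const auto& op : n.ops) cost += pattern_cost(*op);
    return cost;
}

// Simplified acceptance rule: keep the combination unless it costs more.
bool combine_accepts(int original_cost, int replacement_cost) {
    return replacement_cost <= original_cost;
}

// Builds neg (lt (abs reg) (abs reg)) in a vector mode.
NodeP facgt_like_expr() {
    NodeP a = make(Code::Abs, true, {make(Code::Reg, true)});
    NodeP b = make(Code::Abs, true, {make(Code::Reg, true)});
    return make(Code::Neg, true, {make(Code::Lt, true, {a, b})});
}
```

In this model the three original instructions (two ABS and one compare) cost 12 units in total. The naive routine prices the combined expression at 16 units, so the combination is rejected, while the pattern-aware routine prices it at 4 units and the combination is accepted. The figures are not meant to match the costs printed in the combine dump.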
Diffstat (limited to 'gcc/debug.cc')
0 files changed, 0 insertions, 0 deletions