author | Richard Sandiford <richard.sandiford@arm.com> | 2023-05-09 07:40:41 +0100 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2023-05-09 07:40:41 +0100 |
commit | ba72a8d85180d0f4dbcea6eb3458ce175ce190b4 (patch) | |
tree | 0de119dfa4b6cb34387267374f4975740dc34200 /gcc | |
parent | 73f7109ffb159302e9d8f70948a5b43b046b38bc (diff) | |
ira: Don't create copies for earlyclobbered pairs
This patch follows on from g:9f635bd13fe9e85872e441b6f3618947f989909a
("the previous patch"). To start by quoting that:
If an insn requires two operands to be tied, and the input operand dies
in the insn, IRA acts as though there were a copy from the input to the
output with the same execution frequency as the insn. Allocating the
same register to the input and the output then saves the cost of a move.
If there is no such tie, but an input operand nevertheless dies
in the insn, IRA creates a similar move, but with an eighth of the
frequency. This helps to ensure that chains of instructions reuse
registers in a natural way, rather than using arbitrarily different
registers for no reason.
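As a rough illustration of the two cases quoted above, the following C++ sketch models how the frequency of the pseudo-copy might be chosen. The `OperandPair` struct and the `copy_frequency` function are hypothetical names invented for this example and are not part of IRA; the plain integer division is likewise an assumption about rounding.

```cpp
#include <cassert>

// Hypothetical model of an (output, input) operand pair in one insn.
// These fields are illustrative only and do not mirror IRA's data structures.
struct OperandPair {
  bool input_dies;  // the input operand dies in the insn
  bool tied;        // the insn requires input and output to share a register
};

// Returns the frequency given to the pseudo-copy from input to output,
// or 0 when no copy is created at all.
int copy_frequency(const OperandPair &pair, int insn_freq) {
  assert(insn_freq >= 0);
  if (!pair.input_dies)
    return 0;              // nothing to reuse: the input stays live
  if (pair.tied)
    return insn_freq;      // tied operands: same frequency as the insn
  return insn_freq / 8;    // untied but dying: an eighth of the frequency
}
```

With an insn frequency of 1000, a tied pair would get a copy of frequency 1000 and an untied pair a copy of frequency 125, so the allocator weighs the untied preference far less heavily.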
This heuristic seems to work well in the vast majority of cases.
However, the problem fixed in the previous patch was that we
could create a copy for an operand pair even if, for all relevant
alternatives, the output and input register classes did not have
any registers in common. It is then impossible for the output
operand to reuse the dying input register.
This left unfixed a further case where copies don't make sense:
there is no point trying to reuse the dying input register if,
for all relevant alternatives, the output is earlyclobbered and
the input doesn't match the output. (Matched earlyclobbers are fine.)
Handling that case fixes several existing XFAILs and helps with
a follow-on aarch64 patch.
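The check described above can be sketched in isolation as follows. This is a simplified model, not the code in ira-conflicts.cc: `OperandAlt`, `Alternative` and `can_reuse_input_reg` are hypothetical names, register classes are approximated by bitmasks, and the caller is assumed to pass only the relevant alternatives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one operand's constraints in one alternative.
struct OperandAlt {
  int matches;         // index of the operand this one must match, or -1
  bool earlyclobber;   // operand is written before all inputs are read
  std::uint32_t regs;  // bitmask of allowed registers (models a register class)
};

// One alternative holds the constraints for every operand of the insn.
using Alternative = std::vector<OperandAlt>;

// Returns true if, in at least one alternative, the output could be given
// the register of the dying input.
bool can_reuse_input_reg(const std::vector<Alternative> &alts,
                         std::size_t output, std::size_t input) {
  for (const Alternative &alt : alts) {
    assert(output < alt.size() && input < alt.size());

    // A matched pair must share a register, earlyclobber or not.
    if (alt[input].matches == static_cast<int>(output))
      return true;

    // An unmatched earlyclobber output must differ from every input,
    // so this alternative can never reuse the input register.
    if (alt[output].earlyclobber)
      continue;

    // Otherwise reuse is possible if the two classes overlap.
    if ((alt[input].regs & alt[output].regs) != 0)
      return true;
  }
  return false;
}
```

In this model a copy would only be worth creating when the function returns true. An insn whose alternatives all pair an earlyclobber output with an unmatched input returns false, which corresponds to the case this patch stops creating copies for.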
Tested on aarch64-linux-gnu and x86_64-linux-gnu. A SPEC2017 run
on aarch64 showed no differences outside the noise. Also, I tried
compiling gcc.c-torture, gcc.dg, and g++.dg for at least one target
per cpu directory, using the options -Os -fno-schedule-insns{,2}.
The results below summarise the tests that showed a difference in LOC:
Target              Tests  Good  Bad  Delta   Best  Worst  Median
======              =====  ====  ===  =====   ====  =====  ======
amdgcn-amdhsa          14     7    7      3    -18     10      -1
arm-linux-gnueabihf    16    15    1    -22     -4      2      -1
csky-elf                6     6    0    -21     -6     -2      -4
hppa64-hp-hpux11.23     5     5    0     -7     -2     -1      -1
ia64-linux-gnu         16    16    0    -70    -15     -1      -3
m32r-elf               53     1   52     64     -2      8       1
mcore-elf               2     2    0     -8     -6     -2      -6
microblaze-elf        285   283    2   -909    -68      4      -1
mmix                    7     7    0  -2101  -2091     -1      -1
msp430-elf              1     1    0     -4     -4     -4      -4
pru-elf                 8     6    2    -12     -6      2      -2
rx-elf                 22    18    4    -40     -5      6      -2
sparc-linux-gnu        15    14    1    -40     -8      1      -2
sparc-wrs-vxworks      15    14    1    -40     -8      1      -2
visium-elf              2     1    1      0     -2      2      -2
xstormy16-elf           1     1    0     -2     -2     -2      -2
with other targets showing no sensitivity to the patch. The only
target that seems to be negatively affected is m32r-elf; otherwise
the patch seems like an extremely minor but still clear improvement.
gcc/
* ira-conflicts.cc (can_use_same_reg_p): Skip over non-matching
earlyclobbers.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c: Remove XFAILs.
* gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bic_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/scale_f64.c: Likewise.
Diffstat (limited to 'gcc')
19 files changed, 21 insertions, 18 deletions
diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc
index 5aa080a..a4d93c8 100644
--- a/gcc/ira-conflicts.cc
+++ b/gcc/ira-conflicts.cc
@@ -398,6 +398,9 @@ can_use_same_reg_p (rtx_insn *insn, int output, int input)
       if (op_alt[input].matches == output)
         return true;
 
+      if (op_alt[output].earlyclobber)
+        continue;
+
       if (ira_reg_class_intersect[op_alt[input].cl][op_alt[output].cl]
           != NO_REGS)
         return true;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
index b74ae33..e40865f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s16_z_tied1, svint16_t, uint64_t,
                  z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s16_z_untied: { xfail *-*-* }
+** asr_wide_x0_s16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** asr z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
index 8698aef..06e4ca2 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s32_z_tied1, svint32_t, uint64_t,
                  z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s32_z_untied: { xfail *-*-* }
+** asr_wide_x0_s32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** asr z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
index 77b1669..1f840ca 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s8_z_tied1, svint8_t, uint64_t,
                  z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s8_z_untied: { xfail *-*-* }
+** asr_wide_x0_s8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** asr z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
index 9e388e4..e02c669 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_w0_s32_z_tied1, svint32_t, int32_t,
                  z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_w0_s32_z_untied: { xfail *-*-* }
+** bic_w0_s32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** bic z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
index bf95368..57c1e53 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_x0_s64_z_tied1, svint64_t, int64_t,
                  z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_x0_s64_z_untied: { xfail *-*-* }
+** bic_x0_s64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** bic z0\.d, p0/m, z0\.d, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
index b308b59..9f08ab4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_w0_u32_z_tied1, svuint32_t, uint32_t,
                  z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_w0_u32_z_untied: { xfail *-*-* }
+** bic_w0_u32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** bic z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
index e82db1e..de84f3a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_x0_u64_z_tied1, svuint64_t, uint64_t,
                  z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_x0_u64_z_untied: { xfail *-*-* }
+** bic_x0_u64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** bic z0\.d, p0/m, z0\.d, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
index 8d63d39..a020772 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s16_z_tied1, svint16_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s16_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsl z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
index acd813d..bd67b70 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s32_z_tied1, svint32_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s32_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsl z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
index 17e8e86..7eb8627 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s8_z_tied1, svint8_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s8_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsl z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
index cff24a8..482f8d0 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u16_z_tied1, svuint16_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u16_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsl z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
index 7b1afab..612897d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u32_z_tied1, svuint32_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u32_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsl z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
index df8b1ec..6ca2f9e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u8_z_tied1, svuint8_t, uint64_t,
                  z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u8_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsl z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
index 863b51a..9110c5a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u16_z_tied1, svuint16_t, uint64_t,
                  z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u16_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsr z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
index 73c2cf8..93af4fa 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u32_z_tied1, svuint32_t, uint64_t,
                  z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u32_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsr z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
index fe44eab..2f38139 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u8_z_tied1, svuint8_t, uint64_t,
                  z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u8_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsr z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
index 747f8a6..12a1b1d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (scale_w0_f32_z_tied1, svfloat32_t, int32_t,
                  z0 = svscale_z (p0, z0, x0))
 
 /*
-** scale_w0_f32_z_untied: { xfail *-*-* }
+** scale_w0_f32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** fscale z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
index 004cbfa..f6b1171 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (scale_x0_f64_z_tied1, svfloat64_t, int64_t,
                  z0 = svscale_z (p0, z0, x0))
 
 /*
-** scale_x0_f64_z_untied: { xfail *-*-* }
+** scale_x0_f64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** fscale z0\.d, p0/m, z0\.d, \1