author | Roger Sayle <roger@nextmovesoftware.com> | 2022-05-18 16:23:01 +0100 |
---|---|---|
committer | Roger Sayle <roger@nextmovesoftware.com> | 2022-05-18 16:23:01 +0100 |
commit | 4a9be8d51182076222d707d9d68f6eda78e8ee2c (patch) | |
tree | 4a1879256b3bc0cc67f4bdf36e6f40617dd0ccc3 | |
parent | 30405ccc143bb4b63476a329800244826a88faf3 (diff) | |
Correct ix86_rtx_cost for multi-word multiplication.
This is the i386 backend specific piece of my revised patch for
PR middle-end/98865, where Richard Biener has suggested that I perform
the desired transformation during RTL expansion where the backend can
control whether it is profitable to convert a multiplication into a
bit-wise AND and a negation. This works well for x86_64, but alas
exposes a latent bug with -m32, where a DImode multiplication incorrectly
appears to be cheaper than negdi2+anddi3(!?). The fix to ix86_rtx_costs
is to report that a DImode (multi-word) multiplication actually requires
three SImode multiplications and two SImode additions. This also corrects
the cost of TImode multiplication on TARGET_64BIT.
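The "three multiplications and two additions" figure follows from the schoolbook decomposition of a double-word product truncated to double-word width. The sketch below is only an illustration in portable C, assuming 32-bit words as with -m32; the function name `mul64_via_32` is hypothetical and is not part of GCC or of this patch.

```c
#include <stdint.h>

/* Low 64 bits of a 64x64-bit product, built only from 32-bit pieces.
   Hypothetical helper for illustration; not code from the patch.  */
uint64_t
mul64_via_32 (uint64_t a, uint64_t b)
{
  uint32_t a_lo = (uint32_t) a, a_hi = (uint32_t) (a >> 32);
  uint32_t b_lo = (uint32_t) b, b_hi = (uint32_t) (b >> 32);

  /* Multiplication 1: widening 32x32->64 product of the low words.  */
  uint64_t lo = (uint64_t) a_lo * b_lo;

  /* Multiplications 2 and 3: cross products, of which only the low
     32 bits can reach the 64-bit result.  */
  uint32_t cross1 = a_lo * b_hi;
  uint32_t cross2 = a_hi * b_lo;

  /* Additions 1 and 2: fold both cross products into the high word.
     The a_hi * b_hi term is shifted out entirely and never computed.  */
  uint32_t hi = (uint32_t) (lo >> 32) + cross1 + cross2;

  return ((uint64_t) hi << 32) | (uint32_t) lo;
}
```

For any inputs the result should equal the native `a * b` on `uint64_t`, which is one way to check the decomposition.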

2022-05-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.cc (ix86_rtx_costs) [MULT]: When mode size
	is wider than word_mode, a multiplication costs three word_mode
	multiplications and two word_mode additions.

-rw-r--r-- | gcc/config/i386/i386.cc | 12 |
1 files changed, 11 insertions, 1 deletions
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 86752a6..30a9cd0 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20634,7 +20634,17 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
 	      op0 = XEXP (op0, 0), mode = GET_MODE (op0);
 	    }
 
-	  *total = (cost->mult_init[MODE_INDEX (mode)]
+	  int mult_init;
+	  // Double word multiplication requires 3 mults and 2 adds.
+	  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+	    {
+	      mult_init = 3 * cost->mult_init[MODE_INDEX (word_mode)]
+			  + 2 * cost->add;
+	      nbits *= 3;
+	    }
+	  else mult_init = cost->mult_init[MODE_INDEX (mode)];
+
+	  *total = (mult_init
 		    + nbits * cost->mult_bit
 		    + rtx_cost (op0, mode, outer_code, opno, speed)
 		    + rtx_cost (op1, mode, outer_code, opno, speed));
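
To see the effect of the new branch in isolation, the following standalone sketch mimics the arithmetic of the patched code with a toy cost table. Everything here is an assumption made for illustration: `struct toy_costs` and `toy_mult_cost` are hypothetical names, the real code reads GCC's processor cost tables, and the operand costs added via `rtx_cost` are left out.

```c
/* Toy stand-in for the cost-table fields used by the patched code.
   Hypothetical; field names only loosely mirror the real tables.  */
struct toy_costs
{
  int mult_init_mode; /* table entry for the multiplication's own mode */
  int mult_init_word; /* table entry for word_mode */
  int mult_bit;       /* extra cost per set bit of a constant operand */
  int add;            /* cost of one word_mode addition */
};

/* Multiplication cost excluding operand costs.  MODE_SIZE and
   WORD_SIZE are in bytes, standing in for GET_MODE_SIZE (mode)
   and UNITS_PER_WORD.  */
int
toy_mult_cost (const struct toy_costs *c, int mode_size, int word_size,
               int nbits)
{
  int mult_init;
  if (mode_size > word_size)
    {
      /* Double-word case: three word multiplies plus two adds,
         and the per-bit cost is paid once per multiply.  */
      mult_init = 3 * c->mult_init_word + 2 * c->add;
      nbits *= 3;
    }
  else
    mult_init = c->mult_init_mode;

  return mult_init + nbits * c->mult_bit;
}
```

With invented values `{3, 3, 1, 1}` and `nbits = 2`, a word-sized multiply costs 3 + 2 = 5 while a double-word multiply costs 3*3 + 2*1 + 6*1 = 17, so the wider operation no longer looks as cheap as a single multiply. These numbers are not taken from any real GCC cost table.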