diff options
author | Jeff Law <jlaw@ventanamicro.com> | 2024-07-25 12:32:28 -0600 |
---|---|---|
committer | Jeff Law <jlaw@ventanamicro.com> | 2024-07-25 12:36:08 -0600 |
commit | 34fb0feca71f763b2fbe832548749666d34a4a76 (patch) | |
tree | aef1c6cba34565678b8dc776f28f366680fa59df /gcc/ext-dce.cc | |
parent | 3aeb697a2158d359d9f951f6c6c6bcd9c53c8c2f (diff) | |
download | gcc-34fb0feca71f763b2fbe832548749666d34a4a76.zip gcc-34fb0feca71f763b2fbe832548749666d34a4a76.tar.gz gcc-34fb0feca71f763b2fbe832548749666d34a4a76.tar.bz2 |
[PR rtl-optimization/116039] Fix life computation for promoted subregs
So this turned out to be a neat little test and while the fuzzer found it on
RISC-V, I wouldn't be surprised if the underlying issue is also the root cause
of the loongarch issue with ext-dce.
The key issue is that if we have something like
(set (dest) (any_extend (subreg (source))))
If the subreg object is marked with SUBREG_PROMOTED and the sign/unsigned state
matches the any_extend opcode, then combine (and I guess anything using
simplify-rtx) may simplify that to
(set (dest) (source))
That implies that bits outside the mode of the subreg are actually live and
valid. This needs to be accounted for during liveness computation.
We have to be careful here though. If we're too conservative about setting
additional bits live, then we'll inhibit the desired optimization in the
coremark examples. To do a good job we need to know the extension opcode.
I'm extremely unhappy with how the use handling works in ext-dce. It mixes
different conceptual steps and has horribly complex control flow. It only
handles a subset of the unary/binary opcodes, etc etc. It's just damn mess.
It's going to need some more noodling around.
In the mean time this is a bit hacky in that it depends on non-obvious behavior
to know it can get the extension opcode, but I don't want to leave the trunk in
a broken state while I figure out the refactoring problem.
Bootstrapped and regression tested on x86 and tested on the crosses. Pushing to the trunk.
PR rtl-optimization/116039
gcc/
* ext-dce.cc (ext_dce_process_uses): Add some comments about concerns
with current code. Mark additional bit groups as live when we have
an extension of a suitably promoted subreg.
gcc/testsuite
* gcc.dg/torture/pr116039.c: New test.
Diffstat (limited to 'gcc/ext-dce.cc')
-rw-r--r-- | gcc/ext-dce.cc | 43 |
1 files changed, 37 insertions, 6 deletions
diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc index c94d1fc..14f163a 100644 --- a/gcc/ext-dce.cc +++ b/gcc/ext-dce.cc @@ -667,6 +667,12 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, if (modify && !skipped_dest && (dst_mask & ~src_mask) == 0) ext_dce_try_optimize_insn (insn, x); + /* Stripping the extension here just seems wrong on multiple + levels. It's source side handling, so it seems like it + belongs in the loop below. Stripping here also makes it + harder than necessary to properly handle live bit groups + for (ANY_EXTEND (SUBREG)) where the SUBREG has + SUBREG_PROMOTED state. */ dst_mask &= src_mask; src = XEXP (src, 0); code = GET_CODE (src); @@ -674,8 +680,8 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, /* Optimization is done at this point. We just want to make sure everything that should get marked as live is marked - from here onward. */ - + from here onward. Shouldn't the backpropagate step happen + before optimization? */ dst_mask = carry_backpropagate (dst_mask, code, src); /* We will handle the other operand of a binary operator @@ -688,7 +694,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, /* We're inside a SET and want to process the source operands making things live. Breaking from this loop will cause the iterator to work on sub-rtxs, so it is safe to break - if we see something we don't know how to handle. */ + if we see something we don't know how to handle. + + This code is just hokey as it really just handles trivial + unary and binary cases. Otherwise the loop exits and we + continue iterating on sub-rtxs, but outside the set context. */ unsigned HOST_WIDE_INT save_mask = dst_mask; for (;;) { @@ -704,10 +714,26 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, y = XEXP (y, 0); else if (SUBREG_P (y) && SUBREG_BYTE (y).is_constant ()) { - /* For anything but (subreg (reg)), break the inner loop - and process normally (conservatively). */ - if (!REG_P (SUBREG_REG (y))) + /* We really want to know the outer code here, ie do we + have (ANY_EXTEND (SUBREG ...)) as we need to know if + the extension matches the SUBREG_PROMOTED state. In + that case optimizers can turn the extension into a + simple copy. Which means that bits outside the + SUBREG's mode are actually live. + + We don't want to mark those bits live unnecessarily + as that inhibits extension elimination in important + cases such as those in Coremark. So we need that + outer code. */ + if (!REG_P (SUBREG_REG (y)) + || (SUBREG_PROMOTED_VAR_P (y) + && ((GET_CODE (SET_SRC (x)) == SIGN_EXTEND + && SUBREG_PROMOTED_SIGNED_P (y)) + || (GET_CODE (SET_SRC (x)) == ZERO_EXTEND + && SUBREG_PROMOTED_UNSIGNED_P (y))))) break; + + /* The SUBREG's mode determine the live width. */ bit = subreg_lsb (y).to_constant (); if (dst_mask) { @@ -785,6 +811,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, HOST_WIDE_INT size = GET_MODE_BITSIZE (GET_MODE (x)).to_constant (); HOST_WIDE_INT rn = 4 * REGNO (SUBREG_REG (x)); + /* If this is a promoted subreg, then more of it may be live than + is otherwise obvious. */ + if (SUBREG_PROMOTED_VAR_P (x)) + size = GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (x))).to_constant (); + bitmap_set_bit (livenow, rn); if (size > 8) bitmap_set_bit (livenow, rn + 1); |