diff options
author | Jeff Law <jlaw@ventanamicro.com> | 2024-05-01 11:28:41 -0600 |
---|---|---|
committer | Jeff Law <jlaw@ventanamicro.com> | 2024-05-01 11:28:41 -0600 |
commit | fad93e7617ce1aafb006983a71b6edc9ae1eb2d1 (patch) | |
tree | 5089cf7e74c1b10fcfefae091295ba86ca490305 | |
parent | 1fbe1a50d86df11f434351cf62461a32747f9710 (diff) | |
download | gcc-fad93e7617ce1aafb006983a71b6edc9ae1eb2d1.zip gcc-fad93e7617ce1aafb006983a71b6edc9ae1eb2d1.tar.gz gcc-fad93e7617ce1aafb006983a71b6edc9ae1eb2d1.tar.bz2 |
[committed] [RISC-V] Fix detection of store pair fusion cases
We've got the ability to count the number of store pair fusions happening in
the front-end of the pipeline. When comparing some code from last year vs the
current trunk we saw a fairly dramatic drop.
The problem is the store pair fusion detection code was actively harmful due to
a minor bug in checking offsets. So instead of pairing up 8 byte stores such
as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.
Given uarch sensitivity I didn't try to pull together a testcase. But we could
certainly see the undesirable behavior in benchmarks as simplistic as dhrystone
up through spec2017.
Anyway, bootstrapped a while back. Also verified through our performance
counters that store pair fusion rates are back up. Regression tested with
crosses a few minutes ago.
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
tests for easier debugging in store pair fusion case. Fix offset
check in same.
-rw-r--r-- | gcc/config/riscv/riscv.cc | 57 |
1 files changed, 37 insertions, 20 deletions
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 0f62b29..24d1ead 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8874,26 +8874,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr) extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev); extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr); - /* The two stores must be contained within opposite halves of the same - 16 byte aligned block of memory. We know that the stack pointer and - the frame pointer have suitable alignment. So we just need to check - the offsets of the two stores for suitable alignment. - - Originally the thought was to check MEM_ALIGN, but that was reporting - incorrect alignments, even for SP/FP accesses, so we gave up on that - approach. */ - if (base_prev != NULL_RTX - && base_curr != NULL_RTX - && REG_P (base_prev) - && REG_P (base_curr) - && REGNO (base_prev) == REGNO (base_curr) - && (REGNO (base_prev) == STACK_POINTER_REGNUM - || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM) - && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8 - && (INTVAL (offset_prev) % 16) == 0) - || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8) - && (INTVAL (offset_curr) % 16) == 0))) - return true; + /* Fail if we did not find both bases. */ + if (base_prev == NULL_RTX || base_curr == NULL_RTX) + return false; + + /* Fail if either base is not a register. */ + if (!REG_P (base_prev) || !REG_P (base_curr)) + return false; + + /* Fail if the bases are not the same register. */ + if (REGNO (base_prev) != REGNO (base_curr)) + return false; + + /* Originally the thought was to check MEM_ALIGN, but that was + reporting incorrect alignments, even for SP/FP accesses, so we + gave up on that approach. Instead just check for stack/hfp + which we know are aligned. */ + if (REGNO (base_prev) != STACK_POINTER_REGNUM + && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM) + return false; + + /* The two stores must be contained within opposite halves of the + same 16 byte aligned block of memory. We know that the stack + pointer and the frame pointer have suitable alignment. So we + just need to check the offsets of the two stores for suitable + alignment. */ + /* Get the smaller offset into OFFSET_PREV. */ + if (INTVAL (offset_prev) > INTVAL (offset_curr)) + std::swap (offset_prev, offset_curr); + + /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned, + then fail. */ + if ((INTVAL (offset_prev) % 16) != 0) + return false; + + /* The higher offset must be 8 bytes more than the lower + offset. */ + return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr)); } } |