diff options
author | Vang Thao <Vang.Thao@amd.com> | 2021-08-12 21:39:32 -0700 |
---|---|---|
committer | Vang Thao <Vang.Thao@amd.com> | 2021-08-24 21:22:36 -0700 |
commit | 549f6a819a9a20c9f355ad214590ef68c2212842 (patch) | |
tree | 1be3da2de0ae0f8ed82454ec47896db4986ae835 /llvm/lib/CodeGen/MachineCopyPropagation.cpp | |
parent | 2a35d59b2f70c377c1aff206ad5a7105e1d387e8 (diff) | |
download | llvm-549f6a819a9a20c9f355ad214590ef68c2212842.zip llvm-549f6a819a9a20c9f355ad214590ef68c2212842.tar.gz llvm-549f6a819a9a20c9f355ad214590ef68c2212842.tar.bz2 |
[MachineCopyPropagation] Check CrossCopyRegClass for cross-class copys
On some AMDGPU subtargets, copying to and from AGPR registers using another
AGPR register is not possible. A intermediate VGPR register is needed for AGPR
to AGPR copy. This is an issue when machine copy propagation forwards a
COPY $agpr, replacing a COPY $vgpr which results in $agpr = COPY $agpr. It is
removing a cross class copy that may have been optimized by previous passes and
potentially creating an unoptimized cross class copy later on.
To avoid this issue, check CrossCopyRegClass if a different register class will
be needed for the copy. If so then avoid forwarding the copy when the
destination does not match the desired register class and if the original copy
already matches the desired register class.
Issue seen while attempting to optimize another AGPR to AGPR issue:
Live-ins: $agpr0
$vgpr0 = COPY $agpr0
$agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
$agpr2 = COPY $vgpr0
$agpr3 = COPY $vgpr0
$agpr4 = COPY $vgpr0
After machine-cp:
$vgpr0 = COPY $agpr0
$agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
$agpr2 = COPY $agpr0
$agpr3 = COPY $agpr0
$agpr4 = COPY $agpr0
Machine-cp propagated COPY $agpr0 to replace $vgpr0 creating 3 AGPR to AGPR
copys. Later this creates a cross-register copy from AGPR->VGPR->AGPR for each
copy when the prior VGPR->AGPR copy was already optimal.
Reviewed By: lkail, rampitec
Differential Revision: https://reviews.llvm.org/D108011
Diffstat (limited to 'llvm/lib/CodeGen/MachineCopyPropagation.cpp')
-rw-r--r-- | llvm/lib/CodeGen/MachineCopyPropagation.cpp | 28 |
1 files changed, 25 insertions, 3 deletions
diff --git a/llvm/lib/CodeGen/MachineCopyPropagation.cpp b/llvm/lib/CodeGen/MachineCopyPropagation.cpp index 10b74f5..2d9ada5 100644 --- a/llvm/lib/CodeGen/MachineCopyPropagation.cpp +++ b/llvm/lib/CodeGen/MachineCopyPropagation.cpp @@ -414,6 +414,31 @@ bool MachineCopyPropagation::isForwardableRegClassCopy(const MachineInstr &Copy, if (!UseI.isCopy()) return false; + const TargetRegisterClass *CopySrcRC = + TRI->getMinimalPhysRegClass(CopySrcReg); + const TargetRegisterClass *UseDstRC = + TRI->getMinimalPhysRegClass(UseI.getOperand(0).getReg()); + const TargetRegisterClass *CrossCopyRC = TRI->getCrossCopyRegClass(CopySrcRC); + + // If cross copy register class is not the same as copy source register class + // then it is not possible to copy the register directly and requires a cross + // register class copy. Fowarding this copy without checking register class of + // UseDst may create additional cross register copies when expanding the copy + // instruction in later passes. + if (CopySrcRC != CrossCopyRC) { + const TargetRegisterClass *CopyDstRC = + TRI->getMinimalPhysRegClass(Copy.getOperand(0).getReg()); + + // Check if UseDstRC matches the necessary register class to copy from + // CopySrc's register class. If so then forwarding the copy will not + // introduce any cross-class copys. Else if CopyDstRC matches then keep the + // copy and do not forward. If neither UseDstRC or CopyDstRC matches then + // we may need a cross register copy later but we do not worry about it + // here. + if (UseDstRC != CrossCopyRC && CopyDstRC == CrossCopyRC) + return false; + } + /// COPYs don't have register class constraints, so if the user instruction /// is a COPY, we just try to avoid introducing additional cross-class /// COPYs. For example: @@ -430,9 +455,6 @@ bool MachineCopyPropagation::isForwardableRegClassCopy(const MachineInstr &Copy, /// /// so we have reduced the number of cross-class COPYs and potentially /// introduced a nop COPY that can be removed. - const TargetRegisterClass *UseDstRC = - TRI->getMinimalPhysRegClass(UseI.getOperand(0).getReg()); - const TargetRegisterClass *SuperRC = UseDstRC; for (TargetRegisterClass::sc_iterator SuperRCI = UseDstRC->getSuperClasses(); SuperRC; SuperRC = *SuperRCI++) |