author | Petar Avramovic <Petar.Avramovic@amd.com> | 2025-01-24 12:12:45 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-24 12:12:45 +0100 |
commit | 0ee037b861f94604907d95d0ff0ff87805b52428 (patch) | |
tree | a5a5042dcb95bbda6bcb788b0674ce04f8aff971 /llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp | |
parent | 965ff7fa309d4408b4ccf5df7e59fec264c905c5 (diff) | |
AMDGPU/GlobalISel: AMDGPURegBankLegalize (#112864)
Lower G_ instructions that can't be inst-selected with register bank
assignment from AMDGPURegBankSelect based on uniformity analysis.
- Lower instruction to perform it on assigned register bank
- Put uniform value in vgpr because SALU instruction is not available
- Execute divergent instruction in SALU - "waterfall loop"
Given LLTs on all operands after legalizer, some register bank
assignments require lowering while others do not.
Note: cases where all register bank assignments would require lowering
are lowered in legalizer.
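
To make the "waterfall loop" idea concrete, here is a minimal lane-level model in plain C++. It is only a sketch of the general technique, not the code this patch adds: `waterfallLoop`, the per-lane vector and the `Body` callback are hypothetical names invented for illustration. Each iteration picks the value held by the first pending lane, runs the body once with that value treated as uniform, and retires every lane that held the same value.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical model: one 32-bit value per lane of a wavefront.
// Body is called once per distinct value, with the mask of lanes it covers.
// Returns the number of iterations the loop needed.
inline unsigned waterfallLoop(
    const std::vector<uint32_t> &LaneValues,
    const std::function<void(uint32_t, const std::vector<bool> &)> &Body) {
  std::vector<bool> Pending(LaneValues.size(), true);
  unsigned Iterations = 0;
  for (;;) {
    // Find the first lane that has not been handled yet.
    std::size_t First = 0;
    while (First < Pending.size() && !Pending[First])
      ++First;
    if (First == Pending.size())
      break; // every lane has been handled

    // Treat that lane's value as the uniform operand for this iteration.
    uint32_t Uniform = LaneValues[First];

    // Enable exactly the pending lanes that hold the same value.
    std::vector<bool> Mask(LaneValues.size(), false);
    for (std::size_t L = First; L < LaneValues.size(); ++L) {
      if (Pending[L] && LaneValues[L] == Uniform) {
        Mask[L] = true;
        Pending[L] = false;
      }
    }

    Body(Uniform, Mask);
    ++Iterations;
  }
  return Iterations;
}
```

In this model the iteration count equals the number of distinct values across lanes, so a value that happens to be uniform costs a single pass.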
AMDGPURegBankLegalize goals:
- Define Rules: when and how to perform lowering
- Goal of defining Rules is to provide a high-level, table-like brief
overview of how to lower generic instructions based on available
target features and uniformity info (uniform vs divergent).
- Fast search of Rules; search speed depends on how complicated
Rule.Predicate is
- For some opcodes there would be too many Rules that are essentially
all the same just for different combinations of types and banks.
Write custom function that handles all cases.
- Rules are made from enum IDs that correspond to each operand.
Names of IDs are meant to give a brief description of what the lowering
does for each operand or the whole instruction.
- AMDGPURegBankLegalizeHelper implements lowering algorithms
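
The table-like shape of the Rules can be sketched in standalone C++. This is an assumption-laden illustration, not the data structures from the patch: `Rule`, `OperandID`, `Target`, `findRule` and `exampleFAddS32Rules` are hypothetical names, and the predicate is reduced to uniformity plus one target feature. The first matching rule wins, so the feature-gated rule is listed before its fallback.

```cpp
#include <vector>

// Hypothetical inputs to a rule predicate.
enum class Uniformity { Uniform, Divergent };
struct Target {
  bool HasSaluFloat; // stands in for the salu_float feature
};

// Hypothetical per-operand IDs; the name says what happens to the operand.
enum class OperandID { Sgpr32, Vgpr32, UniformInVgpr32 };

struct Rule {
  Uniformity U;              // when the rule applies
  bool RequiresSaluFloat;    // extra target-feature condition
  OperandID Dst;             // how to treat the result
  std::vector<OperandID> Srcs; // how to treat each source
};

// Linear search; the first rule whose predicate holds is used.
inline const Rule *findRule(const std::vector<Rule> &Rules, Uniformity U,
                            const Target &T) {
  for (const Rule &R : Rules)
    if (R.U == U && (!R.RequiresSaluFloat || T.HasSaluFloat))
      return &R;
  return nullptr;
}

// Example table for a 32-bit floating-point add (illustrative only).
inline std::vector<Rule> exampleFAddS32Rules() {
  return {
      // Uniform and the target has scalar float: stay on sgprs.
      {Uniformity::Uniform, true, OperandID::Sgpr32,
       {OperandID::Sgpr32, OperandID::Sgpr32}},
      // Uniform but no scalar float: compute in vgprs.
      {Uniformity::Uniform, false, OperandID::UniformInVgpr32,
       {OperandID::Vgpr32, OperandID::Vgpr32}},
      // Divergent: plain vector instruction.
      {Uniformity::Divergent, false, OperandID::Vgpr32,
       {OperandID::Vgpr32, OperandID::Vgpr32}},
  };
}
```

A lookup such as `findRule(Rules, Uniformity::Uniform, Target{false})` falls through to the second row, which models the "put uniform value in vgpr because SALU instruction is not available" case.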
Since this is the first patch that actually enables -new-reg-bank-select,
here is a summary of the regression tests that were added earlier:
- if instruction is uniform always select SALU instruction if available
- eliminate back to back vgpr to sgpr to vgpr copies of uniform values
- fast rules: small differences for standard and vector instruction
- enabling Rule based on target feature - salu_float
- how to specify lowering algorithm - vgpr S64 AND to S32
- on G_TRUNC in reg, it is up to the user to deal with truncated bits.
G_TRUNC in reg is treated as a no-op.
- dealing with truncated high bits - ABS S16 to S32
- sgpr S1 phi lowering
- new opcodes for vcc-to-scc and scc-to-vcc copies
- lowering for vgprS1-to-vcc copy (formally this is vgpr-to-vcc G_TRUNC)
- S1 zext and sext lowering to select
- uniform and divergent S1 AND (also OR and XOR) lowering - inst-selected
into SALU instruction
- divergent phi with uniform inputs
- divergent instruction with temporal divergent use, source instruction
is defined as uniform (by AMDGPURegBankSelect) - missing temporal
divergence lowering
- uniform phi, because of undef incoming, is assigned to vgpr. Will be
fixed in AMDGPURegBankSelect via another fix in machine uniformity
analysis.
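
Two of the lowerings named in the list above can be checked with ordinary integer arithmetic. The sketch below is plain C++ on host integers, not MIR, and every function name in it is hypothetical. It shows why a 64-bit AND can be performed as two independent 32-bit ANDs (unmerge, operate per half, merge), and what value a select-based zext or sext of an S1 has to produce.

```cpp
#include <cstdint>

struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Split a 64-bit value into its low and high 32-bit halves.
inline Halves unmerge64(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

// Reassemble a 64-bit value from two 32-bit halves.
inline uint64_t merge64(Halves H) {
  return (static_cast<uint64_t>(H.Hi) << 32) | H.Lo;
}

// Bitwise AND has no carries between bits, so each half is independent.
inline uint64_t and64ViaS32(uint64_t A, uint64_t B) {
  Halves X = unmerge64(A);
  Halves Y = unmerge64(B);
  return merge64({X.Lo & Y.Lo, X.Hi & Y.Hi});
}

// zext of a 1-bit value as a select between 1 and 0.
inline uint32_t zextS1(bool C) { return C ? 1u : 0u; }

// sext of a 1-bit value as a select between -1 (all ones) and 0.
inline int32_t sextS1(bool C) { return C ? -1 : 0; }
```

The same split would not be valid for an operation with cross-half carries, such as a 64-bit add, which is why the choice of lowering is specified per opcode.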
Diffstat (limited to 'llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp')
-rw-r--r-- | llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
index be34700..db59ca1 100644
--- a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
@@ -698,6 +698,15 @@ MachineInstrBuilder MachineIRBuilder::buildUnmerge(LLT Res,
   return buildInstr(TargetOpcode::G_UNMERGE_VALUES, TmpVec, Op);
 }
 
+MachineInstrBuilder
+MachineIRBuilder::buildUnmerge(MachineRegisterInfo::VRegAttrs Attrs,
+                               const SrcOp &Op) {
+  LLT OpTy = Op.getLLTTy(*getMRI());
+  unsigned NumRegs = OpTy.getSizeInBits() / Attrs.Ty.getSizeInBits();
+  SmallVector<DstOp, 8> TmpVec(NumRegs, Attrs);
+  return buildInstr(TargetOpcode::G_UNMERGE_VALUES, TmpVec, Op);
+}
+
 MachineInstrBuilder MachineIRBuilder::buildUnmerge(ArrayRef<Register> Res,
                                                    const SrcOp &Op) {
   // Unfortunately to convert from ArrayRef<Register> to ArrayRef<DstOp>,
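
As a reading aid for the new overload, the piece-count arithmetic can be modelled without any LLVM headers. Everything here is a hypothetical stand-in: `VRegAttrsModel` only mimics the idea of "a type size plus a register class or bank", and `unmergeDsts` is not an LLVM function. The exact-division assertion is an assumption added for this sketch; the code in the diff performs a plain integer division.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for register attributes: piece size and a bank tag.
struct VRegAttrsModel {
  unsigned SizeInBits;
  int BankID;
};

// Model of the destination list: one identical entry per piece of the source.
inline std::vector<VRegAttrsModel> unmergeDsts(unsigned SrcSizeInBits,
                                               VRegAttrsModel Attrs) {
  // Assumption of this sketch: the source splits evenly into pieces.
  assert(Attrs.SizeInBits != 0 && SrcSizeInBits % Attrs.SizeInBits == 0);
  unsigned NumRegs = SrcSizeInBits / Attrs.SizeInBits;
  return std::vector<VRegAttrsModel>(NumRegs, Attrs);
}
```

For example, `unmergeDsts(64, {32, 1})` yields two entries that share the same size and bank tag, which mirrors how the overload gives every result of the unmerge the same attributes.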