author | Petar Avramovic <Petar.Avramovic@amd.com> | 2025-01-24 12:12:45 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-24 12:12:45 +0100 |
commit | 0ee037b861f94604907d95d0ff0ff87805b52428 (patch) | |
tree | a5a5042dcb95bbda6bcb788b0674ce04f8aff971 /llvm/lib/CodeGen/MachineFunction.cpp | |
parent | 965ff7fa309d4408b4ccf5df7e59fec264c905c5 (diff) | |
AMDGPU/GlobalISel: AMDGPURegBankLegalize (#112864)
Lower G_ instructions that can't be inst-selected with register bank
assignment from AMDGPURegBankSelect based on uniformity analysis.
- Lower instruction to perform it on assigned register bank
- Put uniform value in vgpr because SALU instruction is not available
- Execute divergent instruction in SALU - "waterfall loop"
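The "waterfall loop" idea can be modelled on the host as follows. This is only a conceptual sketch, not code from the patch: the function name `waterfall`, the lane-vector representation of a wave, and the `ScalarOp` callback are all hypothetical. It assumes the usual scheme of repeatedly taking the value from the first unprocessed lane, running the scalar operation once for it, and retiring every lane that holds the same value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Conceptual model only: a "wave" is a vector with one value per lane.
// ScalarOp stands for an operation that exists only as a scalar (SALU)
// instruction, so it can process one uniform value at a time.
std::vector<uint32_t>
waterfall(const std::vector<uint32_t> &LaneValues,
          const std::function<uint32_t(uint32_t)> &ScalarOp,
          unsigned *Iterations = nullptr) {
  assert(ScalarOp && "ScalarOp must be callable");
  std::vector<uint32_t> Result(LaneValues.size(), 0);
  std::vector<bool> Pending(LaneValues.size(), true);
  unsigned Count = 0;

  for (std::size_t First = 0; First < LaneValues.size(); ++First) {
    if (!Pending[First])
      continue; // this lane was retired by an earlier iteration

    // Take the value of the first pending lane and treat it as uniform.
    uint32_t UniformValue = LaneValues[First];
    uint32_t ScalarResult = ScalarOp(UniformValue); // executed once

    // Retire every pending lane that holds the same value.
    for (std::size_t Lane = First; Lane < LaneValues.size(); ++Lane) {
      if (Pending[Lane] && LaneValues[Lane] == UniformValue) {
        Result[Lane] = ScalarResult;
        Pending[Lane] = false;
      }
    }
    ++Count; // one loop iteration per distinct value
  }

  if (Iterations)
    *Iterations = Count;
  return Result;
}
```

In this model the loop runs once per distinct value in the wave, so the cost depends on how divergent the operand actually is.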
Given LLTs on all operands after the legalizer, some register bank
assignments require lowering while others do not.
Note: cases where all register bank assignments would require lowering
are lowered in legalizer.
AMDGPURegBankLegalize goals:
- Define Rules: when and how to perform lowering
- Goal of defining Rules is to provide a high-level, table-like, brief
overview of how to lower generic instructions based on available
target features and uniformity info (uniform vs divergent).
- Fast search of Rules, depends on how complicated Rule.Predicate is
- For some opcodes there would be too many Rules that are essentially
all the same, just for different combinations of types and banks.
For those, write a custom function that handles all cases.
- Rules are made from enum IDs that correspond to each operand.
Names of IDs are meant to give a brief description of what the lowering
does for each operand or for the whole instruction.
- AMDGPURegBankLegalizeHelper implements lowering algorithms
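A table of Rules in the sense described above could look roughly like the sketch below. All identifiers here (`Query`, `Rule`, `OperandID`, `LoweringMethod`, `exampleRules`, `findRule`) are hypothetical and chosen for illustration; they are not claimed to match the names or layout used in the patch. The sketch assumes a first-match linear search, where an optional predicate gates a Rule on a target feature.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names throughout; this only mirrors the ideas in the text.
enum class Opcode { And, FAdd };
enum class Uniformity { Uniform, Divergent };
enum class Ty { S32, S64 };

// Per-operand IDs: the name says where the operand should end up.
enum class OperandID { Sgpr32, Sgpr64, Vgpr32, Vgpr64, UniformInVgpr32 };

// Extra work needed for the whole instruction, if any.
enum class LoweringMethod { DoNotLower, SplitTo32 };

struct Query {
  Opcode Op;
  Uniformity U;
  Ty T;
  bool HasSaluFloat; // stand-in for a target feature
};

struct Rule {
  Opcode Op;
  Uniformity U;
  Ty T;
  bool (*Predicate)(const Query &); // nullptr means "always applies"
  OperandID Dst;
  OperandID Src; // applied to every source operand in this sketch
  LoweringMethod Method;
};

bool hasSaluFloat(const Query &Q) { return Q.HasSaluFloat; }

const std::vector<Rule> &exampleRules() {
  static const std::vector<Rule> Rules = {
      // Uniform 64-bit AND stays on the scalar side.
      {Opcode::And, Uniformity::Uniform, Ty::S64, nullptr,
       OperandID::Sgpr64, OperandID::Sgpr64, LoweringMethod::DoNotLower},
      // Divergent 64-bit AND is split into 32-bit halves.
      {Opcode::And, Uniformity::Divergent, Ty::S64, nullptr,
       OperandID::Vgpr64, OperandID::Vgpr64, LoweringMethod::SplitTo32},
      // Uniform float add uses SALU only when the feature is present...
      {Opcode::FAdd, Uniformity::Uniform, Ty::S32, hasSaluFloat,
       OperandID::Sgpr32, OperandID::Sgpr32, LoweringMethod::DoNotLower},
      // ...otherwise the uniform value is computed in a vgpr.
      {Opcode::FAdd, Uniformity::Uniform, Ty::S32, nullptr,
       OperandID::UniformInVgpr32, OperandID::Vgpr32,
       LoweringMethod::DoNotLower},
      {Opcode::FAdd, Uniformity::Divergent, Ty::S32, nullptr,
       OperandID::Vgpr32, OperandID::Vgpr32, LoweringMethod::DoNotLower},
  };
  return Rules;
}

// First matching Rule wins; lookup cost grows with predicate complexity.
const Rule *findRule(const std::vector<Rule> &Rules, const Query &Q) {
  assert(!Rules.empty() && "rule table must not be empty");
  for (const Rule &R : Rules)
    if (R.Op == Q.Op && R.U == Q.U && R.T == Q.T &&
        (!R.Predicate || R.Predicate(Q)))
      return &R;
  return nullptr; // no Rule: the caller would need a custom function
}
```

Ordering the feature-gated Rule before its fallback keeps each row readable on its own, at the cost of making the table order significant.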
Since this is the first patch that actually enables -new-reg-bank-select,
here is a summary of the regression tests that were added earlier:
- if instruction is uniform always select SALU instruction if available
- eliminate back to back vgpr to sgpr to vgpr copies of uniform values
- fast rules: small differences for standard and vector instruction
- enabling Rule based on target feature - salu_float
- how to specify lowering algorithm - vgpr S64 AND to S32
- on G_TRUNC in reg, it is up to the user to deal with truncated bits;
G_TRUNC in reg is treated as a no-op.
- dealing with truncated high bits - ABS S16 to S32
- sgpr S1 phi lowering
- new opcodes for vcc-to-scc and scc-to-vcc copies
- lowering for vgprS1-to-vcc copy (formally this is vgpr-to-vcc G_TRUNC)
- S1 zext and sext lowering to select
- uniform and divergent S1 AND (also OR and XOR) lowering - inst-selected
into SALU instruction
- divergent phi with uniform inputs
- divergent instruction with temporal divergent use, source instruction
is defined as uniform (AMDGPURegBankSelect) - missing temporal
divergence lowering
- uniform phi, because of an undef incoming value, is assigned to vgpr.
Will be fixed in AMDGPURegBankSelect via another fix in machine
uniformity analysis.
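Three of the lowerings named in the list can be illustrated with plain integer arithmetic. The sketch below is a host-side model under stated assumptions, not the code the patch emits: the function names are hypothetical, "S64 AND to S32" is read as splitting into two 32-bit ANDs, "ABS S16 to S32" is read as sign-extending the low 16 bits of a 32-bit register (whose high bits are unspecified after an in-register G_TRUNC) before taking the absolute value, and S1 zext/sext is modelled as a select between two constants.

```cpp
#include <cstdint>

// Hypothetical helpers for splitting a 64-bit value into 32-bit halves.
constexpr uint32_t lo32(uint64_t V) { return static_cast<uint32_t>(V); }
constexpr uint32_t hi32(uint64_t V) { return static_cast<uint32_t>(V >> 32); }

// 64-bit AND performed as two independent 32-bit ANDs, then merged.
constexpr uint64_t and64ViaTwo32(uint64_t A, uint64_t B) {
  return (static_cast<uint64_t>(hi32(A) & hi32(B)) << 32) |
         (lo32(A) & lo32(B));
}

// Sign-extend the low 16 bits of a 32-bit register, ignoring whatever
// the high 16 bits contain.
constexpr int32_t signExtend16(uint32_t Reg) {
  return static_cast<int32_t>((Reg & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

// 16-bit ABS computed in 32 bits; only the low 16 bits of the result
// are meaningful to a 16-bit user.
constexpr uint32_t abs16In32(uint32_t Reg) {
  return signExtend16(Reg) < 0 ? static_cast<uint32_t>(-signExtend16(Reg))
                               : static_cast<uint32_t>(signExtend16(Reg));
}

// S1 extension written as a select between two constants.
constexpr uint32_t zextS1(bool Cond) { return Cond ? 1u : 0u; }
constexpr uint32_t sextS1(bool Cond) { return Cond ? 0xFFFFFFFFu : 0u; }
```

Because every function is `constexpr`, the identities can be checked at compile time, for example with `static_assert(abs16In32(0xDEADFFFFu) == 1u, "")`.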