riscv-gnu-toolchain/llvm.git

author	Petar Avramovic <Petar.Avramovic@amd.com>	2025-01-24 12:12:45 +0100
committer	GitHub <noreply@github.com>	2025-01-24 12:12:45 +0100
commit	0ee037b861f94604907d95d0ff0ff87805b52428 (patch)
tree	a5a5042dcb95bbda6bcb788b0674ce04f8aff971 /llvm/lib/CodeGen/MachineFunction.cpp
parent	965ff7fa309d4408b4ccf5df7e59fec264c905c5 (diff)
download	llvm-0ee037b861f94604907d95d0ff0ff87805b52428.zip llvm-0ee037b861f94604907d95d0ff0ff87805b52428.tar.gz llvm-0ee037b861f94604907d95d0ff0ff87805b52428.tar.bz2

AMDGPU/GlobalISel: AMDGPURegBankLegalize (#112864)

Lower G_ instructions that can't be inst-selected with register bank
assignment from AMDGPURegBankSelect based on uniformity analysis.
- Lower instruction to perform it on assigned register bank
- Put uniform value in vgpr because SALU instruction is not available
- Execute divergent instruction in SALU - "waterfall loop"

Given LLTs on all operands after legalizer, some register bank
assignments require lowering while others do not.
Note: cases where all register bank assignments would require lowering
are lowered in legalizer.

AMDGPURegBankLegalize goals:
- Define Rules: when and how to perform lowering
- Goal of defining Rules is to provide a high-level, table-like, brief
  overview of how to lower generic instructions based on available
  target features and uniformity info (uniform vs divergent).
- Fast search of Rules; speed depends on how complicated Rule.Predicate is
- For some opcodes there would be too many Rules that are essentially
  all the same, just for different combinations of types and banks.
  Write a custom function that handles all cases.
- Rules are made from enum IDs that correspond to each operand.
  Names of IDs are meant to give a brief description of what lowering
  does for each operand or the whole instruction.
- AMDGPURegBankLegalizeHelper implements lowering algorithms

Since this is the first patch that actually enables -new-reg-bank-select,
here is the summary of regression tests that were added earlier:
- if instruction is uniform, always select SALU instruction if available
- eliminate back-to-back vgpr to sgpr to vgpr copies of uniform values
- fast rules: small differences for standard and vector instruction
- enabling Rule based on target feature - salu_float
- how to specify lowering algorithm - vgpr S64 AND to S32
- on G_TRUNC in reg, it is up to user to deal with truncated bits;
  G_TRUNC in reg is treated as no-op
- dealing with truncated high bits - ABS S16 to S32
- sgpr S1 phi lowering
- new opcodes for vcc-to-scc and scc-to-vcc copies
- lowering for vgprS1-to-vcc copy (formally this is vgpr-to-vcc G_TRUNC)
- S1 zext and sext lowering to select
- uniform and divergent S1 AND (OR and XOR) lowering - inst-selected
  into SALU instruction
- divergent phi with uniform inputs
- divergent instruction with temporal divergent use, source instruction
  is defined as uniform (AMDGPURegBankSelect) - missing temporal
  divergence lowering
- uniform phi, because of undef incoming, is assigned to vgpr. Will be
  fixed in AMDGPURegBankSelect via another fix in machine uniformity
  analysis.
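
To illustrate the rule-table idea from the goals above, here is a minimal
standalone C++ sketch. It is not code from this patch: every name in it
(Opcode, OperandID, Lowering, Rule, findRule, UniformInVgpr32, and so on)
is hypothetical, the table holds only a handful of made-up entries, and
the lookup is a plain linear scan rather than the fast search the commit
message mentions. It only shows how a predicate on uniformity, type size
and a target feature could select per-operand IDs plus a lowering method.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// All names below are hypothetical stand-ins, not the patch's API.
enum class Opcode { And, FAdd };
enum class Uniformity { Uniform, Divergent };

// Per-operand IDs: the name states the bank/type an operand should end up in.
enum class OperandID { Sgpr32, Sgpr64, Vgpr32, Vgpr64, UniformInVgpr32 };

// Whole-instruction lowering method.
enum class Lowering { None, SplitTo32 };

struct Instr {
  Opcode opcode;
  unsigned bits;          // scalar size of the result, 32 or 64 in this sketch
  Uniformity uniformity;  // as reported by uniformity analysis
};

struct Features {
  bool hasSaluFloat;      // assumed flag modelling the salu_float feature
};

using Predicate = std::function<bool(const Instr &, const Features &)>;

struct Rule {
  Opcode opcode;
  Predicate predicate;    // when the rule applies
  OperandID dst;          // how to treat the result
  OperandID src;          // how to treat every source operand (simplification)
  Lowering lowering;      // what to do with the instruction as a whole
};

// Predicate helper: match on uniformity and bit width.
Predicate is(Uniformity u, unsigned bits) {
  return [=](const Instr &mi, const Features &) {
    return mi.uniformity == u && mi.bits == bits;
  };
}

// Predicate helper: additionally require a given salu_float setting.
Predicate withSaluFloat(Predicate p, bool wanted) {
  return [=](const Instr &mi, const Features &st) {
    return st.hasSaluFloat == wanted && p(mi, st);
  };
}

const std::vector<Rule> &ruleTable() {
  using U = Uniformity;
  using Op = OperandID;
  static const std::vector<Rule> rules = {
      {Opcode::And, is(U::Uniform, 32), Op::Sgpr32, Op::Sgpr32, Lowering::None},
      {Opcode::And, is(U::Divergent, 32), Op::Vgpr32, Op::Vgpr32, Lowering::None},
      {Opcode::And, is(U::Uniform, 64), Op::Sgpr64, Op::Sgpr64, Lowering::None},
      // Divergent 64-bit AND: split into two 32-bit VALU operations.
      {Opcode::And, is(U::Divergent, 64), Op::Vgpr64, Op::Vgpr64, Lowering::SplitTo32},
      // Uniform float add: stay on SALU only when the feature is present,
      // otherwise compute the uniform value in vgpr.
      {Opcode::FAdd, withSaluFloat(is(U::Uniform, 32), true), Op::Sgpr32, Op::Sgpr32, Lowering::None},
      {Opcode::FAdd, withSaluFloat(is(U::Uniform, 32), false), Op::UniformInVgpr32, Op::Vgpr32, Lowering::None},
      {Opcode::FAdd, is(U::Divergent, 32), Op::Vgpr32, Op::Vgpr32, Lowering::None},
  };
  return rules;
}

// Returns the first matching rule, or nullptr if no rule covers the case.
const Rule *findRule(const Instr &mi, const Features &st) {
  assert((mi.bits == 32 || mi.bits == 64) && "sketch models S32/S64 only");
  for (const Rule &r : ruleTable())
    if (r.opcode == mi.opcode && r.predicate(mi, st))
      return &r;
  return nullptr;
}
```

In this sketch, findRule({Opcode::And, 64, Uniformity::Divergent}, st)
yields the SplitTo32 entry, and a uniform 32-bit FAdd resolves to Sgpr32
or UniformInVgpr32 depending on hasSaluFloat. A null result stands for a
case that would need a new rule or a custom handler.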

Diffstat (limited to 'llvm/lib/CodeGen/MachineFunction.cpp')

0 files changed, 0 insertions, 0 deletions

