path: root/llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
Age Commit message Author Files Lines
2025-05-23[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101)Rahul Joshi1-2/+1
2025-05-14[GlobalISel] Add a GISelValueTracker printing pass (#139687)David Green1-4/+5
This adds a GISelValueTrackingPrinterPass that can print the known bits and sign bit of each def in a function. It is built on the new pass manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older class to GISelValueTrackingAnalysisLegacy. The first 2 functions from the AArch64GISelMITest are ported over to an mir test to show it working. It also runs successfully on all files in llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can hopefully be used to test GlobalISel known bits analysis more directly in common cases, without jumping through the hoops that the C++ tests require.
2025-04-07[NFC][LLVM][AMDGPU] Cleanup pass initialization for AMDGPU (#134410)Rahul Joshi1-5/+1
- Remove calls to pass initialization from pass constructors. - https://github.com/llvm/llvm-project/issues/111767
2025-03-29[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (#133466)Tim Gymnich1-15/+15
- rename `GISelKnownBits` to `GISelValueTracking` to analyze more than just `KnownBits` in the future
2025-01-06[AMDGPU] [GlobalIsel] Combine Fmul with Select into ldexp instruction. (#120104)Vikash Gupta1-1/+1
This combine pattern performs the following transformations: fmul x, select(y, A, B) -> fldexp (x, select i32 (y, a, b)) and fmul x, select(y, -A, -B) -> fldexp ((fneg x), select i32 (y, a, b)), where A=2^a and B=2^b; a and b are integers. It is a follow-up PR that implements the above combine for GlobalISel, as the corresponding DAG combine has already been done for SelectionDAG ISel (#111109).
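The arithmetic identity behind this combine can be spot-checked on the host. The following is a minimal sketch in plain C++, not the LLVM implementation: `mulBySelect`, `ldexpBySelect`, and `checkLdexpIdentity` are hypothetical helper names, and `std::ldexp` stands in for the `fldexp` operation.

```cpp
#include <cassert>
#include <cmath>

// Original form: fmul x, select(y, A, B), where A = 2^a and B = 2^b.
float mulBySelect(float x, bool y, float A, float B) {
  return x * (y ? A : B);
}

// Combined form: fldexp(x, select i32 (y, a, b)).
// std::ldexp(x, e) computes x * 2^e.
float ldexpBySelect(float x, bool y, int a, int b) {
  return std::ldexp(x, y ? a : b);
}

// Spot-check both rewrite rules for A = 8 = 2^3 and B = 0.5 = 2^-1.
void checkLdexpIdentity() {
  const float x = 3.0f;
  // Positive constants: the multiply becomes an exponent adjustment.
  assert(mulBySelect(x, true, 8.0f, 0.5f) == ldexpBySelect(x, true, 3, -1));
  assert(mulBySelect(x, false, 8.0f, 0.5f) == ldexpBySelect(x, false, 3, -1));
  // Negated constants: the sign moves onto x (the fneg in the second rule).
  assert(mulBySelect(x, true, -8.0f, -0.5f) == ldexpBySelect(-x, true, 3, -1));
  assert(mulBySelect(x, false, -8.0f, -0.5f) == ldexpBySelect(-x, false, 3, -1));
}
```

The values are chosen so that every product is exactly representable; the sketch says nothing about how the real combine handles overflow, denormals, or non-power-of-two constants.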
2024-08-22[AMDGPU][GlobalISel] Disable fixed-point iteration in all Combiners (#105517)Jay Foad1-1/+5
Disable fixed-point iteration in all AMDGPU Combiners after #102163. This saves around 2% compile time in ad hoc testing on some large graphics shaders. I did not notice any regressions in the generated code, just a bunch of harmless differences in instruction selection and register allocation.
2024-06-21[NFC] Fix laod -> load typos. NFCDavid Green1-1/+1
2024-06-11[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)paperchalice1-3/+4
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-05-06[AMDGPU] Fix typo in function nameJay Foad1-3/+3
2024-05-05[AMDGPU] Improve MIR pattern for FMinFMaxLegacy combine. NFC. (#90968)Jay Foad1-10/+8
2024-05-03[AMDGPU] Use replaceOpcodeWith instead of applyCombine_s_mul_u64. NFC.Jay Foad1-7/+1
2024-05-03[AMDGPU] Remove unneeded calls to setInstrAndDebugLoc in matchers. NFC.Jay Foad1-4/+0
2024-05-03[AMDGPU] Simplify applySelectFCmpToFMinToFMaxLegacy. NFC.Jay Foad1-58/+21
2024-02-22[AMDGPU][GlobalISel] Add fdiv / sqrt to rsq combine (#78673)Nick Anderson1-0/+23
Fixes #64743
2024-01-17[AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (#77633)Jay Foad1-0/+6
2024-01-10[AMDGPU] Fix broken sign-extended subword buffer load combine (#77470)Jay Foad1-22/+24
2024-01-08[AMDGPU] Add CodeGen support for GFX12 s_mul_u64 (#75825)Jay Foad1-0/+34
2023-09-14[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)Arthur Eubanks1-1/+1
This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::
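Why a scoped enum makes call-site mistakes visible can be shown in a few lines. This is an illustrative sketch only: the types live in a hypothetical `sketch` namespace, the enumerator lists are stand-ins rather than copies of the LLVM headers, and `isAggressive` and `checkMigration` are invented for the example.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Old style: an unscoped enum inside a namespace. Values convert to int
// implicitly, so passing one in the wrong argument slot still compiles.
namespace CodeGenOpt {
enum Level { None, Less, Default, Aggressive };
} // namespace CodeGenOpt

// New style: a scoped enum with no implicit conversion to int.
enum class CodeGenOptLevel { None, Less, Default, Aggressive };

static_assert(std::is_convertible<CodeGenOpt::Level, int>::value,
              "unscoped enum converts to int implicitly");
static_assert(!std::is_convertible<CodeGenOptLevel, int>::value,
              "scoped enum needs an explicit cast");

// Call sites migrate by textual replacement:
//   CodeGenOpt::Aggressive  ->  CodeGenOptLevel::Aggressive
inline bool isAggressive(CodeGenOptLevel L) {
  return L == CodeGenOptLevel::Aggressive;
}

inline void checkMigration() {
  assert(isAggressive(CodeGenOptLevel::Aggressive));
  assert(!isAggressive(CodeGenOptLevel::None));
}

} // namespace sketch
```

With the scoped form, a call that accidentally passes an optimization level where an integer or another enum is expected fails to compile instead of silently converting.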
2023-09-05[GlobalISel] Refactor Combiner APIpvanhout1-65/+44
Remove CodeGen leftovers from the old combiner backend and adapt the API to fit the new backend better. It's now quite a bit closer to how InstructionSelector works. - `CombinerInfo` is now a simple "options" struct. - `Combiner` is now the base class of all TableGen'd combiner implementations. - Many fields have been moved from derived classes into that class. - It has been refactored to create & own the Observer and Builder. - `tryCombineAll` TableGen'd method can now be renamed, which allows targets to implement the actual `tryCombineAll` call manually and do whatever they want to do before/after it. Note: `CombinerHelper` needs to be mutable because none of its methods are const. This can be revisited later. Depends on D158710 Reviewed By: aemerson, dsanders Differential Revision: https://reviews.llvm.org/D158713
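The last bullet (a renamed generated entry point wrapped by a hand-written `tryCombineAll`) can be sketched with stand-in classes. Everything below is hypothetical and lives in a `sketch` namespace: `Instr`, the opcode strings, the `Applied` log, and the name `tryCombineAllImpl` are assumptions made for illustration, not the LLVM class definitions.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for a machine instruction; only an opcode name is modeled.
struct Instr {
  std::string Opcode;
};

// Stand-in for the shared base class. It owns state that derived
// classes previously carried themselves (reduced here to a log).
class Combiner {
public:
  virtual ~Combiner() = default;
  virtual bool tryCombineAll(Instr &MI) = 0;
  std::vector<std::string> Applied;
};

class TargetCombiner : public Combiner {
  // Plays the role of the renamed, generated match-table entry point.
  bool tryCombineAllImpl(Instr &MI) {
    if (MI.Opcode != "G_FOO")
      return false;
    Applied.push_back("generated:" + MI.Opcode);
    return true;
  }

public:
  // Hand-written wrapper: try the generated rules first, then fall back
  // to a manually coded combine for opcodes the rules do not cover.
  bool tryCombineAll(Instr &MI) override {
    if (tryCombineAllImpl(MI))
      return true;
    if (MI.Opcode == "G_BAR") {
      Applied.push_back("manual:" + MI.Opcode);
      return true;
    }
    return false;
  }
};

} // namespace sketch
```

The design point is that the target, not the generator, decides what runs before and after the generated rules.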
2023-08-23AMDGPU: Fix more unsafe rsq formationMatt Arsenault1-6/+8
Introducing rsq contract flags is wrong, and also requires some level of approximate functions. AMDGPUCodeGenPrepare already should handle the f32 cases with appropriate flags, and I don't see how new situations to handle would arise during legalization (other than cases involving the rcp intrinsic, which instcombine tries to handle). AMDGPUCodeGenPrepare does need to learn better handling of rcp/rsq for f64 though, which we never bothered to handle well. Removes another obstacle to correctly lowering sqrt. https://reviews.llvm.org/D158099
2023-07-31[GlobalISel] convergent intrinsicsSameer Sahasrabuddhe1-2/+2
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes: - G_INTRINSIC_CONVERGENT - G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154766
2023-07-27Restore "[GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IR"Sameer Sahasrabuddhe1-4/+5
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic ID is the first non-def operand to the instruction. These are now represented as a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID() is now moved to this subclass GIntrinsic. Some target-defined instructions behave like GMIR intrinsics, and have an Intrinsic::ID operand. But they should not be recognized as generic intrinsics, and should not use GIntrinsic::getIntrinsicID(). Separated these out by introducing a new AMDGPU::getIntrinsicID(). Reviewed By: arsenm, Pierre-vh Differential Revision: https://reviews.llvm.org/D155556 This restores commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa. Originally reverted in d0f7850b01cf17e50a4f4b00e3b84dded94df6b8.
2023-07-27Revert "[GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IR"Sameer Sahasrabuddhe1-5/+4
This reverts commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa. The changes did not cover all occurrences of the deleted function MachineInstr::getIntrinsicID().
2023-07-27[GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IRSameer Sahasrabuddhe1-4/+5
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic ID is the first non-def operand to the instruction. These are now represented as a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID() is now moved to this subclass GIntrinsic. Some target-defined instructions behave like GMIR intrinsics, and have an Intrinsic::ID operand. But they should not be recognized as generic intrinsics, and should not use GIntrinsic::getIntrinsicID(). Separated these out by introducing a new AMDGPU::getIntrinsicID(). Reviewed By: arsenm, Pierre-vh Differential Revision: https://reviews.llvm.org/D155556
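The "first non-def operand" convention can be modeled with a small stand-in. This is a sketch under stated assumptions, not the LLVM classes: `Operand`, `Instr`, `IntrinsicView`, and the `IsGenericIntrinsic` flag are hypothetical names in a `sketch` namespace.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Stand-in operand: either a def (result register) or a use carrying a value.
struct Operand {
  bool IsDef;
  unsigned Value;
};

// Stand-in instruction. Defs come first, as in generic MIR.
struct Instr {
  bool IsGenericIntrinsic; // true only for G_INTRINSIC-style opcodes
  std::vector<Operand> Ops;
};

// View that is valid only for generic intrinsic instructions, mirroring the
// idea of moving getIntrinsicID() from the general class onto a subclass.
class IntrinsicView {
  const Instr &MI;

public:
  explicit IntrinsicView(const Instr &I) : MI(I) {
    assert(I.IsGenericIntrinsic && "not a generic intrinsic");
  }

  // The intrinsic ID is the first operand that is not a def.
  unsigned getIntrinsicID() const {
    for (const Operand &Op : MI.Ops)
      if (!Op.IsDef)
        return Op.Value;
    assert(false && "intrinsic without an ID operand");
    return 0;
  }
};

} // namespace sketch
```

Target-defined instructions that merely carry an ID operand would fail the constructor check here, which is the separation the commit describes: they need their own accessor rather than the generic one.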
2023-07-11[AMDGPU] Use GlobalISel MatchTable Combiner Backendpvanhout1-89/+105
Use the new matchtable-based combiner backend for all AMDGPU combiners. This is a drop-in replacement from the user's perspective; there are no test changes, and the new combiner behaves exactly like the old one. Depends on D153757 NOTE: This would land iff D153757 (RFC) lands too. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153758
2023-02-22[AMDGPU] Improve the lowering of raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16} intrinsicsKonstantina Mitropoulou1-1/+49
Currently, raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16} intrinsics are lowered as buffer_load_{u8,u16}. This patch combines buffer_load_{u8,u16} and sign extension instructions in order to generate buffer_load_{i8,i16} instructions. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D144313
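The equivalence this combine relies on (an unsigned subword load followed by a sign extension yields the same value as a signed subword load) can be sketched on the host for the 8-bit case. The function names are hypothetical, and ordinary memory reads stand in for the buffer load instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two-step form: zero-extending byte load, then sign-extend from bit 7.
int32_t loadU8ThenSignExtend(const unsigned char *P) {
  uint32_t Zext = *P; // plays the role of buffer_load_u8
  // Flip the sign bit, then subtract its weight: maps 0x80..0xFF to -128..-1.
  return static_cast<int32_t>(Zext ^ 0x80u) - 0x80;
}

// One-step form: a load that sign-extends directly.
int32_t loadI8(const unsigned char *P) {
  int8_t V;
  std::memcpy(&V, P, sizeof(V)); // plays the role of buffer_load_i8
  return V;
}

// Both forms agree on every possible byte value.
void checkAllBytes() {
  for (unsigned B = 0; B < 256; ++B) {
    unsigned char Byte = static_cast<unsigned char>(B);
    assert(loadU8ThenSignExtend(&Byte) == loadI8(&Byte));
  }
}
```

Since the two forms agree for all 256 inputs, replacing the pair of instructions with the single signed load saves an instruction without changing the result.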
2023-02-02AMDGPU: Try to unfold fneg source when matching legacy fmin/fmaxMatt Arsenault1-0/+2
This is NFC as it stands, since other combines will effectively prevent this from being reachable. This will avoid regressions in a future change which tries to make better use of select source modifiers. Didn't bother with the GlobalISel part for now, since the baseline combine doesn't seem to work on the existing test.
2022-10-26[GlobalISel] Add Predicates to GICombineRulePierre van Houtryve1-12/+15
Small QoL change to allow Predicates to be used in GICombineRule. Currently only one combine in the AMDGPU backend makes use of it. The implementation is pretty simple to get started but of course we can expand this later on and optimize predicate checking better if needed. Reviewed By: dsanders Differential Revision: https://reviews.llvm.org/D136681
2022-10-07[GlobalISel] Mark mi_match as nodiscardJessica Paquette1-3/+5
Typically when you match something, you want to check the result. Fix a couple warnings in the AMDGPUPostLegalizerCombiner which appear as a result of this. Differential Revision: https://reviews.llvm.org/D135491
2022-10-03[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo.Amara Emerson1-1/+2
Before, the isPreLegalize() query in CombinerHelper only checked for the presence of a LegalizerInfo object. This is problematic when we want to have a combine actually check for legality in a pre-legalizer combine pass, since if we pass a LegalizerInfo object to the constructor it causes the combines to think that we're running *post* legalizer, which isn't true. This change fixes it to instead check an explicit bool that is passed to signal whether the pass will be run before or after legalization. Doing so exposed a bug in the extending loads combine, which tried to check for legality of candidate extending loads if LegalizerInfo was present. Since we only ran it pre-legalizer and therefore with a null LegalizerInfo, it never actually ran. Also fixes the legality checks to keep the tests passing. Differential Revision: https://reviews.llvm.org/D135044
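The difference between inferring the phase from a pointer and carrying it as an explicit flag can be reduced to a few lines. The types below are hypothetical stand-ins in a `sketch` namespace (`OldHelper`, `NewHelper`, `canCheckLegality`), not the real `CombinerHelper`.

```cpp
#include <cassert>

namespace sketch {

struct LegalizerInfo {}; // stand-in; carries no data here

// Old behavior: the phase is inferred from whether legality info exists.
struct OldHelper {
  const LegalizerInfo *LI;
  bool isPreLegalize() const { return LI == nullptr; }
};

// New behavior: the phase is an explicit flag, independent of LI.
struct NewHelper {
  bool IsPreLegalize;
  const LegalizerInfo *LI;
  bool isPreLegalize() const { return IsPreLegalize; }
  bool canCheckLegality() const { return LI != nullptr; }
};

} // namespace sketch
```

With `OldHelper`, a pre-legalizer pass that supplies legality info is misclassified as post-legalizer; with `NewHelper`, the same pass can both report the correct phase and query legality.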
2021-11-30Code quality: Combine V_RSQMateja Marjanovic1-0/+46
Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel. Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703
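As a rough host-side illustration of the pattern being matched, the sketch below models reciprocal and reciprocal square root as plain functions. The names `rcp`, `rsq`, `beforeCombine`, and `afterCombine` are hypothetical, and host arithmetic only approximates the hardware instructions, whose precision may differ.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Stand-in for a reciprocal instruction.
float rcp(float X) { return 1.0f / X; }

// Stand-in for a reciprocal-square-root instruction.
float rsq(float X) {
  return static_cast<float>(1.0 / std::sqrt(static_cast<double>(X)));
}

// Pattern before the combine: rcp(sqrt(x)), two operations.
float beforeCombine(float X) { return rcp(std::sqrt(X)); }

// Pattern after the combine: rsq(x), one operation.
float afterCombine(float X) { return rsq(X); }

} // namespace sketch
```

For inputs such as 4.0 and 0.25 both forms give exactly 0.5 and 2.0; for general inputs the two can differ in the last bits, which is why later entries in this log restrict when the rewrite is considered safe.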
2021-11-17[AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold modsMirko Brkusanin1-5/+7
If possible fold fneg into instruction above if users cannot fold mods and we know it will decrease instruction count. Follows same logic as SDAG combiner in choosing opportunities to combine. Differential Revision: https://reviews.llvm.org/D112827
2021-04-27AMDGPU/GlobalISel: Remove redundant G_FCANONICALIZEPetar Avramovic1-0/+10
Add basic version of isCanonicalized for global-isel. Copied from sdag. Add post legalizer combine that deletes G_FCANONICALIZE when its input is already Canonicalized. Differential Revision: https://reviews.llvm.org/D96605
2021-02-02Fixed includes.Thomas Symalla1-20/+0
Differential Revision: https://reviews.llvm.org/D93708
2021-02-02Reverted whitespace changes.Thomas Symalla1-1/+0
Differential Revision: https://reviews.llvm.org/D90968
2021-02-02Formatting changesThomas Symalla1-1/+1
2021-02-02Formatting changes.Thomas Symalla1-1/+1
2021-02-02Updating formatting changes.Thomas Symalla1-4/+15
2021-02-02Resolve formatting changes.Thomas Symalla1-5/+4
2021-02-02Move Combiner to PreLegalize stepThomas Symalla1-128/+0
2021-02-02Reverted unintended git-format change.Thomas Symalla1-2/+1
2021-02-02Fixed the lit tests and a bug in the implementation.Thomas Symalla1-1/+1
2021-02-02Refactored the pattern matching.Thomas Symalla1-4/+16
2021-02-02Added early exit.Thomas Symalla1-4/+16
2021-02-02Added comments.Thomas Symalla1-14/+20
2021-02-02clang-formatThomas Symalla1-21/+33
2021-02-02Added clamp i64 to i16 global isel pattern.Thomas Symalla1-0/+98
2021-01-20[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargetsdfukalov1-1/+2
... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
2021-01-07[NFC][AMDGPU] Reduce include files dependency.dfukalov1-4/+3
Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
2020-11-03AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combinerPetar Avramovic1-37/+71
Change match/apply functions into methods of a new target-specific combiner helper class. Use a reference to the MachineIRBuilder from the helper instead of constructing a new MachineIRBuilder each time a new instruction needs to be made. Allows correct tracking of newly created instructions. Differential Revision: https://reviews.llvm.org/D90623