path: root/llvm/lib/Target/AMDGPU/AMDGPULowerKernelArguments.cpp
Age | Commit message | Author | Files | Lines
2025-05-11 | [AMDGPU] Move kernarg preload logic to separate pass (#130434) | Austin Kerbow | 1 | -254/+2
Moves kernarg preload logic to its own module pass. Cloned function declarations are removed when preloading hidden arguments. The inreg attribute is now added in this pass instead of AMDGPUAttributor. The rest of the logic is copied from AMDGPULowerKernelArguments, which now only checks whether an argument is marked inreg to avoid replacing direct uses of preloaded arguments. This change requires test updates to remove inreg from lit tests with kernels that don't actually want preloading.
2025-03-31 | [IRBuilder] Add new overload for CreateIntrinsic (#131942) | Rahul Joshi | 1 | -1/+1
Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.
2025-02-17 | [AMDGPU] Remove dead function metadata after amdgpu-lower-kernel-arguments (#126147) | Scott Linder | 1 | -0/+1
The verifier ensures function !dbg metadata is unique across the module, so ensure the old nameless function we leave behind doesn't violate this invariant. Removing the function via e.g. eraseFromParent seems like a better option, but doesn't seem to be legal from a FunctionPass.
2024-12-08 | [AMDGPU] Fix hidden kernarg preload count inconsistency (#116759) | Austin Kerbow | 1 | -7/+8
It is possible that the number of hidden arguments that are selected to be preloaded in AMDGPULowerKernelArguments and isel can differ. This isn't an issue with explicit arguments since isel can lower the argument correctly either way, but with hidden arguments we may have alignment issues if we try to load these hidden arguments that were added to the kernel signature. The reason for the mismatch is that isel reserves an extra synthetic user SGPR for module LDS. Instead of teaching lowerFormalArguments how to handle these properly, it makes more sense and is less expensive to fix the mismatch and assert if we ever run into this issue again. We should never be trying to lower these in the normal way. In a future change we probably want to revise how we track "synthetic" user SGPRs and unify the handling in GCNUserSGPRUsageInfo. Sometimes synthetic SGPRs are considered user SGPRs and sometimes they are not. Until then this patch resolves the inconsistency, fixes the bug, and is otherwise an NFC.
2024-12-04 | [AMDGPU] Preserve `noundef` and `range` during kernel argument loads (#118395) | Krzysztof Drewniak | 1 | -0/+11
This commit ensures that noundef (which is frequently a prerequisite for other annotations) and range() annotations on kernel arguments are copied onto their corresponding load from the kernel argument structure.
2024-11-13 | [AMDGPU] Remove unused includes (NFC) (#116154) | Kazu Hirata | 1 | -1/+0
Identified with misc-include-cleaner.
2024-10-16 | [LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428) | Rahul Joshi | 1 | -2/+2
Add `Intrinsic::getDeclarationIfExists` to look up an existing declaration of an intrinsic in a `Module`.
2024-10-06 | [AMDGPU] Support preloading hidden kernel arguments (#98861) | Austin Kerbow | 1 | -2/+200
Adds hidden kernel arguments to the function signature and marks them inreg if they should be preloaded into user SGPRs. The normal kernarg preloading logic then takes over with some additional checks for the correct implicitarg_ptr alignment. Special care is needed so that metadata for the hidden arguments is not added twice when generating the code object.
2024-06-28 | [IR] Add getDataLayout() helpers to Function and GlobalValue (#96919) | Nikita Popov | 1 | -1/+1
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-06-24 | Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)" | Stephen Tozer | 1 | -1/+1
Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 | [IR][NFC] Update IRBuilder to use InsertPosition (#96497) | Stephen Tozer | 1 | -1/+1
Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.
2024-06-03 | [AMDGPU] Strengthen preload intrinsics to noundef and nonnull (#92801) | Krzysztof Drewniak | 1 | -1/+0
The various preloaded registers (workitem IDs, workgroup IDs, and various implicit pointers) always have a finite, invariant, well-defined value throughout a well-defined program. In cases where the compiler infers or the user declares that some implicit input will not be used (ex. via amdgcn-no-workitem-id-y), the behavior of the entire program is undefined, since that misdeclaration can cause arbitrary other preloaded-register intrinsics to access the wrong register. This case is not expected to arise in practice, but could occur when the no-implicit-argument attributes were not cleared correctly in the presence of external functions, indirect calls, or other means of executing un-analyzable code. Failure to detect that case would be a bug in the attributor. This commit updates the documentation to reflect this long-standing reality. Then, on the basis that all implicit arguments are defined in all correct programs, the intrinsics that return those values are annotated with `noundef`. Some implicit pointer arguments gain a `nonnull`, but the kernel argument segment pointer or implicit argument pointers don't necessarily have this property. This will prevent spurious calls to `freeze` in front-end optimizations that destroy user-provided ranges on built-in IDs. (While I'm here, this commit adds a test for `noundef` on kernel arguments which is currently unimplemented)
2024-02-12 | [AMDGPU] Enable kernel arg preloading with gfx90a (#81180) | Austin Kerbow | 1 | -1/+0
Add a trap instruction to the beginning of the kernel prologue to handle cases where preloading is attempted on HW loaded with incompatible firmware.
2024-01-22 | [DebugInfo][RemoveDIs] Adjust AMDGPU passes to work with DPValues (#78736) | Jeremy Morse | 1 | -1/+1
This patch tweaks two AMDGPU passes to use iterators rather than instruction pointers for expressing an insertion point. This is needed to accurately support DPValues, the non-instruction storage object for debug-info. Two tests were sensitive to this change (variable assignments were being put in the wrong place), and I've added extra run-lines with the "try new debug-info..." flag. These get tested on our public buildbot to ensure they continue to work accurately.
2024-01-17 | [AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (#77633) | Jay Foad | 1 | -0/+1
2023-09-25 | [AMDGPU] Add IR lowering changes for preloaded kernargs | Austin Kerbow | 1 | -1/+57
Preloaded kernel arguments should not be lowered in the IR pass AMDGPULowerKernelArguments. Therefore it's necessary to calculate the total number of user SGPRs that are available for preloading and how many SGPRs would be required to preload each argument to determine whether we should skip lowering i.e. the argument will be preloaded instead. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D156853
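The budget check described in this commit message can be pictured with a small standalone model. This is only a sketch under stated assumptions, not the pass's code: `KernArg`, `alignTo` and `countPreloadableArgs` are hypothetical names, and the model assumes that preloading covers a contiguous prefix of the kernarg segment with one 4-byte dword per SGPR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one kernel argument in the kernarg segment.
struct KernArg {
  uint64_t Offset; // byte offset within the kernarg segment
  uint64_t Size;   // size in bytes
};

// Round Value up to the next multiple of Align.
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Returns how many leading arguments fit in FreeUserSGPRs registers.
// Assumption: an argument needs every dword from the start of the segment
// up to its own end, because the preloaded region is a contiguous prefix.
unsigned countPreloadableArgs(const std::vector<KernArg> &Args,
                              unsigned FreeUserSGPRs) {
  unsigned Count = 0;
  for (const KernArg &Arg : Args) {
    uint64_t SGPRsNeeded = alignTo(Arg.Offset + Arg.Size, 4) / 4;
    if (SGPRsNeeded > FreeUserSGPRs)
      break; // this argument and all later ones are lowered to loads instead
    ++Count;
  }
  return Count;
}
```

In this model, arguments counted by `countPreloadableArgs` would be skipped by the IR lowering, and the remaining ones would still be rewritten into loads from the kernarg segment.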
2023-08-09 | AMDGPU: Port AMDGPULowerKernelArguments to new pass manager | Matt Arsenault | 1 | -5/+21
https://reviews.llvm.org/D157498
2023-06-22 | Revert "AMDGPU: Use generic helper for skipping over allocas" | Matt Arsenault | 1 | -1/+16
This reverts commit aa7e09ebd38c5f23f6d7d6d8394a2aea04715ba9.
2023-06-22 | AMDGPU: Use generic helper for skipping over allocas | Matt Arsenault | 1 | -16/+1
2023-06-07 | AMDGPU: Add MF independent version of getImplicitParameterOffset | Matt Arsenault | 1 | -1/+1
2023-04-29 | AMDGPU: Don't need pointer bitcast in AMDGPULowerKernelArguments | Matt Arsenault | 1 | -2/+2
2023-04-29 | AMDGPU: Don't try to create pointer bitcasts in kernarg lowering | Matt Arsenault | 1 | -3/+0
2023-01-13 | [NFC] Remove Function::getParamAlignment | Guillaume Chatelet | 1 | -6/+4
Differential Revision: https://reviews.llvm.org/D141696
2022-12-02 | [Target] Use std::nullopt instead of None (NFC) | Kazu Hirata | 1 | -1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
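As a small illustration of the spelling this migration moves towards, the following sketch uses only the standard library. `getExplicitAlign` is a hypothetical helper invented for this example and is not taken from the patch.

```cpp
#include <optional>

// Hypothetical helper: returns the alignment if one was specified
// (non-zero), otherwise an empty optional.
std::optional<unsigned> getExplicitAlign(unsigned RawAlign) {
  if (RawAlign == 0)
    return std::nullopt; // with llvm::Optional this was spelled `return None;`
  return RawAlign;
}
```

Callers can then use `has_value()`, `value()` or `value_or()` from `std::optional` directly.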
2022-06-20 | [NFC] Simplify code | Guillaume Chatelet | 1 | -3/+2
2022-02-18 | [AMDGPU][NFC] Fix typos | Sebastian Neubauer | 1 | -1/+1
Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235
2021-08-17 | [NFC] More get/removeAttribute() cleanup | Arthur Eubanks | 1 | -5/+4
2021-06-06 | [CodeGen] Add missing includes (NFC) | Nikita Popov | 1 | -0/+1
These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.
2021-01-20 | [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets | dfukalov | 1 | -1/+1
... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
2021-01-07 | [NFC][AMDGPU] Reduce include files dependency. | dfukalov | 1 | -21/+2
Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
2020-12-30 | clang-format, address warnings | Juneyoung Lee | 1 | -2/+1
2020-12-30 | Use unary CreateShuffleVector if possible | Juneyoung Lee | 1 | -1/+1
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923
2020-07-21 | AMDGPU: Start interpreting byref on kernel arguments | Matt Arsenault | 1 | -4/+21
These are treated identically to value aggregates placed in the kernel argument list. A %struct.foo or %struct.foo addrspace(4)* byref(sizeof(%struct.foo)) align(alignof(%struct.foo)) argument should produce the same offsets and argument metadata. This handles all 3 kernel ABI implementations, and the two HSA metadata emission paths.
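The claim that a byref argument and the equivalent by-value aggregate produce the same offsets can be illustrated with a toy layout routine. This is a sketch only: `ArgLayout` and `computeKernArgOffsets` are hypothetical names, and the model assumes that each argument is placed at the next offset satisfying its alignment, with size and alignment taken either from the type or from the `byref(...)`/`align(...)` attributes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-argument layout record.
struct ArgLayout {
  uint64_t Size;  // bytes occupied in the kernarg segment
  uint64_t Align; // required alignment in bytes
  bool IsByRef;   // true if Size/Align came from byref(...)/align(...)
};

// Assigns each argument the next suitably aligned offset. IsByRef is
// deliberately ignored: in this model only Size and Align affect layout.
std::vector<uint64_t> computeKernArgOffsets(const std::vector<ArgLayout> &Args) {
  std::vector<uint64_t> Offsets;
  uint64_t Offset = 0;
  for (const ArgLayout &A : Args) {
    assert(A.Align != 0 && "alignment must be non-zero");
    Offset = (Offset + A.Align - 1) / A.Align * A.Align; // round up
    Offsets.push_back(Offset);
    Offset += A.Size;
  }
  return Offsets;
}
```

Passing a 16-byte, 8-byte-aligned struct by value or as byref with the same size and alignment yields the same offsets in this model.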
2020-07-01 | [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment | Guillaume Chatelet | 1 | -1/+1
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82956
2020-05-29 | [SVE] Eliminate calls to default-false VectorType::get() from AMDGPU | Christopher Tetreault | 1 | -1/+1
Reviewers: efriedma, david-arm, fpetrogalli, arsenm Reviewed By: david-arm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80328
2020-05-13 | [SVE] Remove usages of VectorType::getNumElements() from AMDGPU | Christopher Tetreault | 1 | -1/+1
Reviewers: efriedma, arsenm, david-arm, fpetrogalli Reviewed By: efriedma Subscribers: dmgreen, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79807
2020-05-06 | AMDGPU: Insert kernarg code after allocas | Matt Arsenault | 1 | -1/+16
This produces more normal looking IR by keeping all the allocas clustered at the start of the block.
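The idea of choosing an insertion point that leaves the leading allocas clustered can be sketched on a toy instruction list. This is a simplified model, not the pass's code: the `Opcode` enum and `getInsertIndexAfterAllocas` are hypothetical, and the model treats every leading alloca the same way.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-in for instruction kinds in an entry block.
enum class Opcode { Alloca, Load, Store, Other };

// Returns the index of the first instruction that is not an alloca, which
// is where the kernarg code would be inserted in this model. Returns
// Block.size() if the block contains only allocas.
std::size_t getInsertIndexAfterAllocas(const std::vector<Opcode> &Block) {
  auto It = std::find_if(Block.begin(), Block.end(),
                         [](Opcode Op) { return Op != Opcode::Alloca; });
  return static_cast<std::size_t>(It - Block.begin());
}
```

Only the leading run of allocas is skipped; an alloca that appears after some other instruction does not move the insertion point.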
2020-04-09 | Clean up usages of asserting vector getters in Type | Christopher Tetreault | 1 | -1/+1
Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: arsenm, efriedma, sdesmalen Reviewed By: arsenm Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77268
2020-03-31 | Remove "mask" operand from shufflevector. | Eli Friedman | 1 | -1/+1
Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467
2020-01-23 | [Alignement][NFC] Deprecate untyped CreateAlignedLoad | Guillaume Chatelet | 1 | -1/+1
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260
2019-10-15 | [Alignment] Migrate Attribute::getWith(Stack)Alignment | Guillaume Chatelet | 1 | -10/+10
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Reviewed By: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68792 llvm-svn: 374884
2019-06-19 | AMDGPU: Consolidate some getGeneration checks | Matt Arsenault | 1 | -1/+1
This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for. llvm-svn: 363902
2019-02-01 | [opaque pointer types] Pass value type to GetElementPtr creation. | James Y Knight | 1 | -7/+4
This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913
2019-02-01 | [opaque pointer types] Pass value type to LoadInst creation. | James Y Knight | 1 | -7/+8
This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911
2019-01-28 | AMDGPU: Add DS append/consume intrinsics | Matt Arsenault | 1 | -1/+2
Since these pass the pointer in m0 unlike other DS instructions, these need to worry about whether the address is uniform or not. This assumes the address is dynamically uniform, and just uses readfirstlane to get a copy into an SGPR. I don't know if these have the same 16-bit add for the addressing mode offset problem on SI or not, but I've just assumed they do. Also includes some misc. changes to avoid test differences between the LDS and GDS versions. llvm-svn: 352422
2019-01-19 | Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 1 | -4/+3
to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
2018-12-07 | AMDGPU: Fix offsets for < 4-byte aggregate kernel arguments | Matt Arsenault | 1 | -4/+7
We were still using the rounded down offset and alignment even though they aren't handled because you can't trivially bitcast the loaded value. llvm-svn: 348658
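The distinction this fix draws can be sketched as follows. The model assumes that a scalar argument smaller than 4 bytes is fetched by loading the enclosing aligned dword and shifting the value out, while an aggregate of the same size must be loaded at its real offset because the loaded dword cannot be trivially bitcast to it. `LoadPlan` and `planKernArgLoad` are hypothetical names for illustration only.

```cpp
#include <cstdint>

// Hypothetical result: where to load from and how far to shift afterwards.
struct LoadPlan {
  uint64_t LoadOffset; // byte offset of the load in the kernarg segment
  unsigned ShiftBits;  // right shift needed to extract the argument
};

// Sub-dword scalars use the rounded-down dword offset plus a shift;
// aggregates and anything 4 bytes or larger keep their original offset.
LoadPlan planKernArgLoad(uint64_t ArgOffset, uint64_t SizeBytes,
                         bool IsAggregate) {
  bool WidenToDword = SizeBytes < 4 && !IsAggregate;
  if (!WidenToDword)
    return {ArgOffset, 0};
  uint64_t Aligned = ArgOffset & ~uint64_t(3); // round down to 4 bytes
  unsigned ShiftBits = static_cast<unsigned>((ArgOffset - Aligned) * 8);
  return {Aligned, ShiftBits};
}
```

For a 2-byte scalar at offset 6 the plan is a dword load at offset 4 followed by a 16-bit shift; for a 2-byte aggregate at the same offset the plan keeps offset 6 and no shift.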
2018-10-08 | [IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle. | Neil Henning | 1 | -2/+2
The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've: - Added an ArrayRef<Type *> member to both CreateIntrinsic overloads. - Used that array to pass into the Intrinsic::getDeclaration call. - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0. - Added a bunch more unit tests to test Create*Intrinsic calls that weren't being tested (including the FMF flag that wasn't checked). This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969). Differential Revision: https://reviews.llvm.org/D52087 llvm-svn: 343962
2018-08-30 | [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis | Nicolai Haehnle | 1 | -1/+0
Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
2018-07-28 | AMDGPU: Stop trying to extend arguments for clover | Matt Arsenault | 1 | -26/+0
This was trying to replace i8/i16 arguments with i32, which was broken and no longer necessary. llvm-svn: 338193