path: root/clang/lib/Basic/Cuda.cpp
Age  Commit message  Author  Files  Lines
3 days[CUDA] add support for targeting sm_103/sm_121 with CUDA-12.9 (#151587)Artem Belevich1-0/+6
2025-05-31[Basic] Remove unused includes (NFC) (#142295)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-04-25[Clang][NFC] Move OffloadArch enum to a generic location (#137070)Justin Cai1-117/+0
Currently, the OffloadArch enum is defined in Cuda.h. This PR moves the definition to a more generic location in OffloadArch.h/cpp.
2025-02-19[CUDA] Add support for sm101 and sm120 target architectures (#127187)Sebastian Jodłowski1-0/+8
Add support for sm101 and sm120 target architectures. It requires CUDA 12.8. --------- Co-authored-by: Sebastian Jodlowski <sjodlowski@nuro.ai>
2025-02-19[AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (#126762)Fabian Ritter1-2/+0
gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base. This PR removes all occurrences of gfx940/gfx941 from clang that can be removed without changes in the llvm directory. The target-invalid-cpu-note/amdgcn.c test is not included here since it tests a list of targets that is defined in llvm/lib/TargetParser/TargetParser.cpp. For SWDEV-512631
2025-01-22Remove incorrect CUDA defines (#123898)Sergey Kozub1-3/+1
Remove CUDA_127 and CUDA_129 defines incorrectly added in https://github.com/llvm/llvm-project/pull/123398
2025-01-21[NVPTX] Add support for PTX 8.6 and CUDA 12.6 (12.8) (#123398)Sergey Kozub1-2/+6
Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).
2024-11-18AMDGPU: Add gfx950 subtarget definitions (#116307)Matt Arsenault1-0/+1
Mostly a stub, but adds some baseline tests and tests for removed instructions.
2024-11-12[AMDGPU] Introduce a new generic target `gfx9-4-generic` (#115190)Shilei Tian1-0/+1
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
2024-10-23[AMDGPU] Add a new target for gfx1153 (#113138)Carl Ritson1-0/+1
2024-10-14[CUDA] Add support for CUDA-12.6 and sm_100 (#112028)Artem Belevich1-0/+5
This is a copy of #97402(with minor updates), which is now ready to land. --------- Co-authored-by: Sergey Kozub <skozub@nvidia.com>
2024-06-30[CUDA][NFC] CudaArch to OffloadArch rename (#97028)Jakub Chlanda1-56/+54
Rename `CudaArch` to `OffloadArch` to better reflect its content and the use. Apply a similar rename to helpers handling the enum.
2024-06-25[clang][Driver] Add HIPAMD Driver support for AMDGCN flavoured SPIR-V (#95061)Alex Voicu1-0/+1
This patch augments the HIPAMD driver to allow it to target AMDGCN flavoured SPIR-V compilation. It's mostly straightforward, as we re-use some of the existing SPIRV infra, however there are a few notable additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on using `generic` (this is already fairly overloaded) or simply using `spirv` or `spirv64` (we'll want to use these to denote unflavoured SPIRV, once we bring up that capability)
- initially it won't be possible to mix-in SPIR-V and concrete AMDGPU targets, as it would require some relatively intrusive surgery in the HIPAMD Toolchain and the Driver to deal with two triples (`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them available at JIT time, we rely on embedding the command line via `-fembed-bitcode=marker`, which the bitcode writer had previously not implemented for SPIRV; we only allow it conditionally for AMDGCN flavoured SPIRV, and it is handled correctly by the Translator (it ends up as a string literal)
Once the SPIRV BE is no longer experimental we'll switch to using that rather than the translator. There's some additional work that'll come via a separate PR around correctly piping through AMDGCN's implementation of `printf`; for now we merely handle its flags correctly.
2024-06-06[AMDGPU] Add a new target gfx1152 (#94534)Shilei Tian1-0/+1
2024-06-05[CUDA] Mark CUDA-12.5 as supported and introduce ptx 8.5. (#94113)Andrey Portnoy1-0/+1
This PR is based on https://github.com/llvm/llvm-project/pull/91516.
2024-06-05AMDGPU: Add missing gfx* generic targets handling in clang (NVPTX, OpenMP runtime) (#94483)Konstantin Zhuravlyov1-0/+5
2024-05-08[CUDA] Mark CUDA-12.4 as supported and introduce ptx 8.4. (#91516)Artem Belevich1-2/+3
2024-04-16[CUDA] Rename SM_32 to SM_32_ to work around AIX headers (#88779)Joseph Huber1-3/+3
Summary: AIX headers define this, so we need to work around it. In the future this will be removed but for now we should just rename it to avoid these issues.
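The collision described above can be reproduced in a few lines. This is a minimal sketch, not the code from the commit: the `#define` stands in for whatever the AIX headers provide (its value here is made up), and `GpuArch` and `archName` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for a system header that defines SM_32 as a macro. The value 32 is
// hypothetical; the commit only says that AIX headers define the name.
#define SM_32 32

// With the macro visible, an enumerator spelled SM_32 would be rewritten by
// the preprocessor to `32` and the enum would not compile. The trailing
// underscore makes SM_32_ a different token, so the macro leaves it alone.
enum class GpuArch { SM_30, SM_32_, SM_35 };

// The user-facing spelling does not change; only the C++ identifier does.
inline const char *archName(GpuArch A) {
  switch (A) {
  case GpuArch::SM_30:
    return "sm_30";
  case GpuArch::SM_32_:
    return "sm_32";
  case GpuArch::SM_35:
    return "sm_35";
  }
  return "unknown";
}

// Small usage check: the renamed enumerator still maps to "sm_32".
inline void checkRename() {
  assert(std::strcmp(archName(GpuArch::SM_32_), "sm_32") == 0);
}
```

Renaming the enumerator rather than undefining the macro leaves the system header's definition intact for any other code that relies on it.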
2023-12-11[CUDA] Add support for CUDA-12.3 and sm_90a (#74895)Artem Belevich1-0/+5
2023-11-23[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133)Jay Foad1-0/+2
Define target names and ELF numbers for new GFX12 targets gfx1200 and gfx1201. For now they behave identically to GFX11.
2023-07-17[AMDGPU] Add targets gfx1150 and gfx1151Jay Foad1-0/+2
This is the target definition only. Currently they are treated the same as GFX 11.0.x. Differential Revision: https://reviews.llvm.org/D155429
2023-06-02[CUDA] Update Kepler(sm_3*) support info.Artem Belevich1-1/+5
sm_30 and sm_32 were removed in cuda-11.0
sm_35 and sm_37 were removed in cuda-12.0
Differential Revision: https://reviews.llvm.org/D152027
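The removal data in this commit message can be written as a small lookup table. The sketch below is only an illustration under stated assumptions: the names `RemovedArch`, `removedInVersion`, and `isArchAvailable` are hypothetical, the `major * 10 + minor` version encoding is a convenience chosen here, and architectures missing from the table are simply treated as available.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical encoding: CUDA version as major * 10 + minor (11.0 -> 110).
struct RemovedArch {
  std::string_view Name;
  int RemovedIn; // first CUDA version that no longer ships the architecture
};

// Data taken from the commit message: sm_30/sm_32 were removed in CUDA 11.0,
// sm_35/sm_37 in CUDA 12.0.
inline constexpr RemovedArch KeplerRemovals[] = {
    {"sm_30", 110}, {"sm_32", 110}, {"sm_35", 120}, {"sm_37", 120}};

// Returns the version that removed Arch, or nullopt if Arch is not listed.
inline std::optional<int> removedInVersion(std::string_view Arch) {
  for (const RemovedArch &E : KeplerRemovals)
    if (E.Name == Arch)
      return E.RemovedIn;
  return std::nullopt;
}

// Simplification: an architecture that is not in the table counts as
// available in every version.
inline bool isArchAvailable(std::string_view Arch, int CudaVersion) {
  std::optional<int> Removed = removedInVersion(Arch);
  return !Removed || CudaVersion < *Removed;
}

// Usage check: sm_35 is still usable with CUDA 11.8 but not with CUDA 12.0.
inline void checkKeplerTable() {
  assert(isArchAvailable("sm_35", 118));
  assert(!isArchAvailable("sm_35", 120));
}
```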
2023-05-25[CUDA] bump supported CUDA version to 12.1/11.8Artem Belevich1-0/+2
Differential Revision: https://reviews.llvm.org/D151361
2023-05-10AMDGPU: Add basic gfx942 targetKonstantin Zhuravlyov1-0/+1
Differential Revision: https://reviews.llvm.org/D149983
2023-05-10AMDGPU: Add basic gfx941 targetKonstantin Zhuravlyov1-0/+1
Differential Revision: https://reviews.llvm.org/D149982
2022-12-14Don't include StringSwitch (NFC)Kazu Hirata1-1/+0
These files do not use llvm::StringSwitch.
2022-10-07Add support for CUDA-11.8 and sm_{87,89,90} GPUs.Artem Belevich1-0/+11
Differential Revision: https://reviews.llvm.org/D135306
2022-10-07Refactored CUDA version housekeeping to use less boilerplate.Artem Belevich1-92/+49
Differential Revision: https://reviews.llvm.org/D135328
2022-06-18[clang] Use value_or instead of getValueOr (NFC)Kazu Hirata1-2/+1
2022-04-29[AMDGPU][clang] Definition of gfx11 subtargetJoe Nash1-0/+4
Contributors: Jay Foad <jay.foad@amd.com> Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
Patch 2/N for upstreaming of AMDGPU gfx11 architecture
Depends on D124536
Reviewed By: foad, kzhuravl, #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D124537
2022-03-02[AMDGPU] Add gfx1036 targetAakanksha1-0/+1
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02[AMDGPU] Add gfx940 targetStanislav Mekhanoshin1-0/+1
This is target definition only. Differential Revision: https://reviews.llvm.org/D120688
2021-12-20[HIPSPV][3/4] Enable SPIR-V emission for HIPYaxun (Sam) Liu1-0/+1
This patch enables SPIR-V binary emission for HIP device code via the HIPSPV tool chain.
‘--offload’ option, which is envisioned in [1], is added for specifying offload targets. This option is used to override default device target (amdgcn-amd-amdhsa) for HIP compilation for emitting device code as SPIR-V binary. The option is handled in getHIPOffloadTargetTriple().
getOffloadingDeviceToolChain() function (based on the design in the SYCL repository) is added to select HIPSPVToolChain when HIP offload target is ‘spirv64’.
The HIPActionBuilder is modified to produce LLVM IR at the backend phase. HIPSPV tool chain expects to receive HIP device code as LLVM IR so it can run external LLVM passes over them. HIPSPV TC is also responsible for emitting the SPIR-V binary.
A Cuda GPU architecture ‘generic’ is added. The name is picked from the LLVM SPIR-V Backend. In the HIPSPV code path the architecture name is inserted to the bundle entry ID as target ID. Target ID is expected to be always present so a component in the target triple is not mistaken as target ID.
Tests are added for checking the HIPSPV tool chain.
[1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html
Patch by: Henry Linjamäki
Reviewed by: Yaxun Liu, Artem Belevich, Alexey Bader
Differential Revision: https://reviews.llvm.org/D110622
2021-11-09[CUDA] Bump supported CUDA version to 11.5Carlos Galvez1-0/+5
Differential Revision: https://reviews.llvm.org/D113249
2021-08-23[CUDA] Improve CUDA version detection and diagnostics.Artem Belevich1-2/+4
Always use cuda.h to detect CUDA version. It's a more universal approach compared to version.txt which is no longer present in recent CUDA versions.
Split the 'unknown CUDA version' warning in two:
* when the detected CUDA version is partially supported by clang. It's expected to work in general, at feature parity with the latest supported CUDA version, but may be missing support for the new features/instructions/GPU variants. Clang will issue a warning.
* when the detected version is new. Recent CUDA versions have been working with clang reasonably well, and will likely work similarly to the partially supported ones above. Or it may not work at all. Clang will issue a warning and proceed as if the latest known CUDA version was detected.
Differential Revision: https://reviews.llvm.org/D108247
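The two-way split described in this commit message can be sketched as follows. This is not clang's implementation: `parseCudaVersionMacro`, `VersionPolicy`, `classify`, and `effectiveVersion` are hypothetical names, and the policy thresholds are supplied by the caller rather than taken from clang. The parser assumes the header contains a line of the form `#define CUDA_VERSION 11040`, the integer encoding that cuda.h uses (major * 1000 + minor * 10).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Finds "#define CUDA_VERSION <digits>" in header text and returns the raw
// integer (for example 11040 for CUDA 11.4). Returns nullopt if it is absent.
inline std::optional<int> parseCudaVersionMacro(std::string_view Header) {
  constexpr std::string_view Key = "#define CUDA_VERSION ";
  std::size_t Pos = Header.find(Key);
  if (Pos == std::string_view::npos)
    return std::nullopt;
  Pos += Key.size();
  int Value = 0;
  bool SawDigit = false;
  while (Pos < Header.size() && Header[Pos] >= '0' && Header[Pos] <= '9') {
    Value = Value * 10 + (Header[Pos] - '0');
    SawDigit = true;
    ++Pos;
  }
  if (!SawDigit)
    return std::nullopt;
  return Value;
}

enum class VersionStatus { FullySupported, PartiallySupported, TooNew };

// Hypothetical policy: both thresholds are chosen by the caller.
struct VersionPolicy {
  int LatestFullySupported; // newest version with full support
  int LatestKnown;          // newest version the compiler knows about
};

inline VersionStatus classify(int Detected, const VersionPolicy &P) {
  if (Detected > P.LatestKnown)
    return VersionStatus::TooNew;
  if (Detected > P.LatestFullySupported)
    return VersionStatus::PartiallySupported;
  return VersionStatus::FullySupported;
}

// Both non-fully-supported cases warn; only the "too new" case is clamped.
inline bool shouldWarn(VersionStatus S) {
  return S != VersionStatus::FullySupported;
}

inline int effectiveVersion(int Detected, const VersionPolicy &P) {
  return classify(Detected, P) == VersionStatus::TooNew ? P.LatestKnown
                                                        : Detected;
}

// Usage check with made-up thresholds: an unknown newer version is clamped.
inline void checkVersionPolicy() {
  VersionPolicy P{11040, 11050};
  assert(effectiveVersion(12000, P) == 11050);
}
```

Keeping the classification separate from the clamping makes the two warning cases easy to tell apart: a partially supported version is used as detected, while a newer unknown version is replaced by the latest known one.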
2021-08-23[CUDA] Add support for CUDA-11.4Artem Belevich1-0/+12
Differential Revision: https://reviews.llvm.org/D108239
2021-06-24[AMDGPU] Add gfx1035 targetAakanksha Patil1-0/+1
Differential Revision: https://reviews.llvm.org/D104804
2021-06-08Reland "[AMDGPU] Add gfx1013 target"Brendon Cahoon1-0/+1
This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f. Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08Revert "[AMDGPU] Add gfx1013 target"Brendon Cahoon1-1/+0
This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219. A sanitizer buildbot reports an error.
2021-06-08[AMDGPU] Add gfx1013 targetBrendon Cahoon1-0/+1
Differential Revision: https://reviews.llvm.org/D103663
2021-05-13[AMDGPU] Add gfx1034 targetAakanksha Patil1-0/+1
Differential Revision: https://reviews.llvm.org/D102306
2021-05-01[Cuda] Internalize a struct and a global variableFangrui Song1-1/+3
2021-02-17[AMDGPU] gfx90a supportStanislav Mekhanoshin1-0/+1
Differential Revision: https://reviews.llvm.org/D96906
2021-02-09[CUDA, NVPTX] Allow targeting sm_86 GPUs.Artem Belevich1-1/+13
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o adding any new functionality. Differential Revision: https://reviews.llvm.org/D95974
2020-12-13[NFC][AMDGPU] Reformat AMD GPU targets in cuda.cppTony1-17/+28
Differential Revision: https://reviews.llvm.org/D93181
2020-11-03[AMDGPU] Add gfx1033 targetTim Renouf1-1/+1
Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03[AMDGPU] Add gfx90c targetTim Renouf1-1/+1
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-10[AMDGPU] Add gfx602, gfx705, gfx805 targetsTim Renouf1-1/+4
At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets:
* The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602.
* One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705.
* The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target.
V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order.
Differential Revision: https://reviews.llvm.org/D88916
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-02[CUDA][HIP] Fix bound arch for offload action for fat binaryYaxun (Sam) Liu1-0/+1
Currently the CUDA/HIP toolchain uses "unknown" as the bound arch for the offload action for the fat binary. This causes -mcpu or -march with "unknown" to be added in HIPToolChain::TranslateArgs or CUDAToolChain::TranslateArgs, which is an issue for https://reviews.llvm.org/D88377 since the HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs. The bound arch of the offload action for the fat binary is not really used, therefore set it to CudaArch::UNUSED. Differential Revision: https://reviews.llvm.org/D88524
2020-09-08[HIP] Add gfx1031 and gfx1030Yaxun (Sam) Liu1-1/+1
Differential Revision: https://reviews.llvm.org/D87324