Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
Currently, the OffloadArch enum is defined Cuda.h. This PR moves the
definition to a more generic location in OffloadArch.h/cpp.
|
|
Add support for sm101 and sm120 target architectures. It requires CUDA
12.8.
---------
Co-authored-by: Sebastian Jodlowski <sjodlowski@nuro.ai>
|
|
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.
This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.
For SWDEV-512631
|
|
Remove CUDA_127 and CUDA_129 defines incorrectly added in
https://github.com/llvm/llvm-project/pull/123398
|
|
Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).
|
|
Mostly a stub, but adds some baseline tests and
tests for removed instructions.
|
|
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
|
|
|
|
This is a copy of #97402(with minor updates), which is now ready to land.
---------
Co-authored-by: Sergey Kozub <skozub@nvidia.com>
|
|
Rename `CudaArch` to `OffloadArch` to better reflect its content and the
use.
Apply a similar rename to helpers handling the enum.
|
|
This patch augments the HIPAMD driver to allow it to target AMDGCN
flavoured SPIR-V compilation. It's mostly straightforward, as we re-use
some of the existing SPIRV infra, however there are a few notable
additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on
using `generic` (this is already fairly overloaded) or simply using
`spirv` or `spirv64` (we'll want to use these to denote unflavoured
SPIRV, once we bring up that capability)
- initially it is won't be possible to mix-in SPIR-V and concrete AMDGPU
targets, as it would require some relatively intrusive surgery in the
HIPAMD Toolchain and the Driver to deal with two triples
(`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them
available at JIT time, we rely on embedding the command line via
`-fembed-bitcode=marker`, which the bitcode writer had previously not
implemented for SPIRV; we only allow it conditionally for AMDGCN
flavoured SPIRV, and it is handled correctly by the Translator (it ends
up as a string literal)
Once the SPIRV BE is no longer experimental we'll switch to using that
rather than the translator. There's some additional work that'll come
via a separate PR around correctly piping through AMDGCN's
implementation of `printf`, for now we merely handle its flags
correctly.
|
|
|
|
This PR is based on https://github.com/llvm/llvm-project/pull/91516.
|
|
runtime) (#94483)
|
|
|
|
Summary:
AIX headers define this, so we need to work around it. In the future
this will be removed but for now we should just rename it to avoid these
issues.
|
|
|
|
Define target names and ELF numbers for new GFX12 targets gfx1200 and
gfx1201. For now they behave identically to GFX11.
|
|
This is the target definition only. Currently they are treated the same
as GFX 11.0.x.
Differential Revision: https://reviews.llvm.org/D155429
|
|
sm_30 and sm_32 were removed in cuda-11.0
sm_35 and sm_37 were removed in cuda-12.0
Differential Revision: https://reviews.llvm.org/D152027
|
|
Differential Revision: https://reviews.llvm.org/D151361
|
|
Differential Revision: https://reviews.llvm.org/D149983
|
|
Differential Revision: https://reviews.llvm.org/D149982
|
|
These files do not use llvm::StringSwitch.
|
|
Differential Revision: https://reviews.llvm.org/D135306
|
|
Differential Revision: https://reviews.llvm.org/D135328
|
|
|
|
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
Patch 2/N for upstreaming of AMDGPU gfx11 architecture
Depends on D124536
Reviewed By: foad, kzhuravl, #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D124537
|
|
Differential Revision: https://reviews.llvm.org/D120846
|
|
This is target definition only.
Differential Revision: https://reviews.llvm.org/D120688
|
|
This patch enables SPIR-V binary emission for HIP device code via the
HIPSPV tool chain.
‘--offload’ option, which is envisioned in [1], is added for specifying
offload targets. This option is used to override default device target
(amdgcn-amd-amdhsa) for HIP compilation for emitting device code as
SPIR-V binary. The option is handled in getHIPOffloadTargetTriple().
getOffloadingDeviceToolChain() function (based on the design in the
SYCL repository) is added to select HIPSPVToolChain when HIP offload
target is ‘spirv64’.
The HIPActionBuilder is modified to produce LLVM IR at the backend
phase. HIPSPV tool chain expects to receive HIP device code as LLVM
IR so it can run external LLVM passes over them. HIPSPV TC is also
responsible for emitting the SPIR-V binary.
A Cuda GPU architecture ‘generic’ is added. The name is picked from
the LLVM SPIR-V Backend. In the HIPSPV code path the architecture
name is inserted to the bundle entry ID as target ID. Target ID is
expected to be always present so a component in the target triple
is not mistaken as target ID.
Tests are added for checking the HIPSPV tool chain.
[1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html
Patch by: Henry Linjamäki
Reviewed by: Yaxun Liu, Artem Belevich, Alexey Bader
Differential Revision: https://reviews.llvm.org/D110622
|
|
Differential Revision: https://reviews.llvm.org/D113249
|
|
Always use cuda.h to detect CUDA version. It's a more universal approach
compared to version.txt which is no longer present in recent CUDA versions.
Split the 'unknown CUDA version' warning in two:
* when detected CUDA version is partially supported by clang. It's expected to
work in general, at the feature parity with the latest supported CUDA
version. and may be missing support for the new features/instructions/GPU
variants. Clang will issue a warning.
* when detected version is new. Recent CUDA versions have been working with
clang reasonably well, and will likely to work similarly to the partially
supported ones above. Or it may not work at all. Clang will issue a warning and
proceed as if the latest known CUDA version was detected.
Differential Revision: https://reviews.llvm.org/D108247
|
|
Differential Revision: https://reviews.llvm.org/D108239
|
|
Differential Revision: https://reviews.llvm.org/D104804
|
|
This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f.
Fixed a use-after-free error that caused the sanitizers to fail.
|
|
This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219.
A sanitizer buildbot reports an error.
|
|
Differential Revision: https://reviews.llvm.org/D103663
|
|
Differential Revision: https://reviews.llvm.org/D102306
|
|
|
|
Differential Revision: https://reviews.llvm.org/D96906
|
|
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o
adding any new functionality.
Differential Revision: https://reviews.llvm.org/D95974
|
|
Differential Revision: https://reviews.llvm.org/D93181
|
|
Differential Revision: https://reviews.llvm.org/D90447
Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
|
|
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
|
|
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:
* The "Oland" and "Hainan" variants of gfx601 are now split out into
gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
that to avoid using the shaderZExport workaround on gfx602.
* One variant of gfx703 is now split out into gfx705. LLPC and other
front-ends could use that to avoid using the
shaderSpiCsRegAllocFragmentation workaround on gfx705.
* The "TongaPro" variant of gfx802 is now split out into gfx805.
TongaPro has a faster 64-bit shift than its former friends in gfx802,
and a subtarget feature could be set up for that to take advantage of
it. This commit does not make that change; it just adds the target.
V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
so fix the GPUKind order.
Differential Revision: https://reviews.llvm.org/D88916
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
|
|
Currently CUDA/HIP toolchain uses "unknown" as bound arch
for offload action for fat binary. This causes -mcpu or -march
with "unknown" added in HIPToolChain::TranslateArgs or
CUDAToolChain::TranslateArgs.
This causes issue for https://reviews.llvm.org/D88377 since
HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs.
The bound arch of offload action for fat binary is not really
used, therefore set it to CudaArch::UNUSED.
Differential Revision: https://reviews.llvm.org/D88524
|
|
Differential Revision: https://reviews.llvm.org/D87324
|