Age | Commit message (Collapse) | Author | Files | Lines |
|
Closes #95219
|
|
These global flags block furthur improvements for clang, users should
always use fast-math flags
see also
https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797
Remove them incrementally, this is the clang part.
|
|
(#161773)
LLVM models ConstantPointerNull as all-zero, but some GPUs (e.g. AMDGPU
and our downstream GPU target) use a non-zero sentinel for null in
private / local address spaces.
SPIR-V is a supported input for our GPU target. This PR preserves a
canonical zero form in the generic AS while allowing later lowering to
substitute the target’s real sentinel.
|
|
|
|
This replaces the original arch check.
|
|
AMDGCN flavoured SPIR-V allows AMDGCN specific builtins, including those
for scoped fences and some specific RMWs. However, at present we don't
map syncscopes to their SPIR-V equivalents, but rather use the AMDGCN
ones. This ends up pessimising the resulting code as system scope is
used instead of device (agent) or subgroup (wavefront), so we correct
the behaviour, to ensure that we do the right thing during reverse
translation.
|
|
On new targets like `gfx1250`, the buffer resource (V#) now uses this
format:
```
base (57-bit): resource[56:0]
num_records (45-bit): resource[101:57]
reserved (6-bit): resource[107:102]
stride (14-bit): resource[121:108]
```
This PR changes the type of `num_records` from `i32` to `i64` in both
builtin and intrinsic, and also adds the support for lowering the new
format.
Fixes SWDEV-554034.
---------
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
|
|
|
|
|
|
|
|
|
|
These are defined in the user range until standard versions of them get
adopted into dwarf, which is expected in DWARF6.
Some of these amount to reservations currently as no code to use them is
included. It would be very helpful to get them committed to avoid
conflicts necessitating encoding changes while we are in the process of
upstreaming.
---------
Co-authored-by: Juan Martinez Fernandez <juamarti@amd.com>
Co-authored-by: Emma Pilkington <Emma.Pilkington@amd.com>
|
|
GCC 14 also made this an error by default, so we’re following suit.
Fixes #74605
|
|
Reverts llvm/llvm-project#157463
The PR breaks buildbots with old ROCm versions, so revert it and reapply
when buildbots are updated.
|
|
Remove definitions, test uses, and documentation of the macros, which were
deprecated in November 2024 with PR #112849 / #115507.
Where required, the wavefront size should instead be queried via means provided
by the HIP runtime: the (non-constexpr) `warpSize` variable in device code, or
`hipGetDeviceProperties` in host code.
This change passed AMD-internal testing.
Implements SWDEV-522062.
|
|
Tests exercizing TBAA metadata (both purposefully and not), and
previously generated via UTC, have been regenerated and updated
to version 6.
|
|
Co-authored-by: Ivan Kosarev <ivan.kosarev@amd.com>
|
|
|
|
- Add clang built-ins + sema/codegen
- Add IR Intrinsic + verifier
- Add DAG/GlobalISel codegen for the intrinsics
- Add lowering in SIMemoryLegalizer using a MMO flag.
|
|
|
|
|
|
Add builtins that expose the underlying llvm.amdgcn.inverse.ballot
intrinsic that we've had for a while.
This allows more explicitly writing code that selects or branches in
terms of lane masks, which can lead to better code quality.
|
|
If a wavefrontsize32 or wavefrontsize64 is the only possible value
insert it into feature list by default and use that value as an
indication that another wavefront size is not legal.
|
|
|
|
Fixes a bug on AMDGPU targets where a pointer was stored as address
space 5, but then loaded as address space 0.
Issue found as part of [Kokkos](https://github.com/kokkos/kokkos)
testing, specifically `hip.atomics` (see
[core/unit_test/TestAtomics.hpp](https://github.com/kokkos/kokkos/blob/develop/core/unit_test/TestAtomics.hpp)).
Issue was introduced by commit
[39ec9de7c230](https://github.com/llvm/llvm-project/commit/39ec9de7c230)
- [clang][CodeGen] sret args should always point to the alloca AS, so
use that (https://github.com/llvm/llvm-project/pull/114062).
|
|
|
|
Reopen #128938.
Attempt to shrink the size of vector loads where only some of the
incoming lanes are used for rebroadcasts in shufflevector instructions.
---------
Co-authored-by: Leon Clark <leoclark@amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
|
|
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
|
|
|
|
This patch exposes builtins for atomic `add`, `max`, and `min` operations that
operate over buffer resource pointers.
|
|
|
|
(#151960)
Reverts llvm/llvm-project#128938 while a crash regression is investigated
|
|
Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions.
---------
Co-authored-by: Leon Clark <leoclark@amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Fix two minor issues:
- Add double quote
- Remove unused prefix
|
|
|
|
|
|
|
|
|
|
|
|
|
|
(#148141)
Fixes: SWDEV-541399
|
|
|
|
|