Age | Commit message (Collapse) | Author | Files | Lines |
|
Fix a crash that would arise when intrinsics like llvm.masked.load.T.p7
were left in the module when AMDGPULowerBufferFatPointers was applied
and so a captures(none) annotation would be applied to a non-pointer
value, triggering a verifier failure.
---------
Co-authored-by: Shilei Tian <i@tianshilei.me>
|
|
Fixes https://github.com/iree-org/iree/issues/22001
The visitor in SplitPtrStructs would re-visit instructions if an
instruction earlier in program order caused a recursive visit() call via
getPtrParts(). This would cause instructions to be processed multiple
times.
As a consequence of this, PHI nodes could be added to the Conditionals
array multiple times, which would to a conditinoal that was already
simplified being processed multiple times. After the code moved to
InstSimplifyFolder, this re-processing, combined with more agressive
simplifications, would lead to an attempt to replace an instruction with
itself, causing an assertion failure and crash.
This commit resolves the issue and adds the reduced form of the crashing
input as a test.
|
|
Saves around 125-210 MB of compilation memory usage per source for
roughly one third of our backend sources, ~60 MB on average.
|
|
null (#154128)
Fixes #154056.
The fat buffer lowering pass was erroniously detecting that it did not
need to run on functions that only load/store to the null constant (or
other such constants). We thought this would be covered by specializing
constants out to instructions, but that doesn't account foc trivial
constants like null. Therefore, we check the operands of instructions
for buffer fat pointers in order to find such constants and ensure the
pass runs.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
Not quite NFC as it looks like the original intrinsic-handling code
never got updated to use records. This was never caught because that
code wasn't tested. I've adjusted an existing test so the behaviour is
now covered.
|
|
Reviewed By: krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139413
|
|
This is one of the final remaining debug-intrinsic specific codepaths
out there, and pieces of cross-LLVM infrastructure to do with debug
intrinsics.
|
|
It appears this wasn't handled in the initial migration a year ago,
seemingly because it didn't lead to any test failures. Find and interpret
debug records in the same way the original code handled intrinsics. Note
that we drop a call to copyMetadata: debug records can't carry additional
metadata like instructions, nothing relies on this in AMDGPU AFAIUI.
|
|
|
|
This flag was used to let us incrementally introduce debug records
into LLVM, however everything is now using records. It serves no
purpose now, so delete it.
|
|
Avoid using report_fatal_error. Many more uses that should be converted
in the pass remain.
|
|
This fixes #139614 on non-clang compilers by moving `__has_warning`
completely inside the `#if defined(__clang__)` block. This prevents a
parse failure from compilers which don't recognize `__has_warning`.
Original description:
Followup to #138741.
This adds the requested macro to silence
`-Wunnecessary-virtual-specifier` when declaring virtual anchor
functions in `final` classes, per [LLVM
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers).
It also cleans up any remaining instances of the warning, allowing us to
stop disabling it when we build LLVM.
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
This reverts commit 0954c9d487e7cb30673df9f0ac125f71320d2936.
It breaks the build when built with gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04).
|
|
Followup to #138741.
This adds the requested macro to silence
`-Wunnecessary-virtual-specifier` when declaring virtual anchor
functions in `final` classes, per [LLVM
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers).
It also cleans up any remaining instances of the warning, allowing us to
stop disabling it when we build LLVM.
|
|
Some application code operating on generic pointers (that then gete
initialized to buffer fat pointers) may perform tests against nullptr.
After address space inference, this results in comparisons against
`addrspacecast (ptr null to ptr addrspace(7))`, which were crashing.
However, while general casts to ptr addrspace(7) from generic pointers
aren't supposted, it is possible to cast null pointers to the all-zerose
bufer resource and 0 offset, which this patch adds.
It also adds a TODO for casting _out_ of buffer resources, which isn't
implemented here but could be.
|
|
This PR adds a amdgns_load_to_lds intrinsic that abstracts over loads to
LDS from global (address space 1) pointers and buffer fat pointers
(address space 7), since they use the same API and "gather from a
pointer to LDS" is something of an abstract operation.
This commit adds the intrinsic and its lowerings for addrspaces 1 and 7,
and updates the MLIR wrappers to use it (loosening up the restrictions
on loads to LDS along the way to match the ground truth from target
features).
It also plumbs the intrinsic through to clang.
|
|
instructions (#137701)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
|
|
Now that IRMover and the rest of LLVM don't allow recursive types, drop
support for them from the clone of the IRMover code used when lowering
buffer fat pointer operations.
|
|
instructions" (#137657)
Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4
|
|
(#136759)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
|
|
|
|
|
|
version (#133015)" (#134871)
This reverts commit d1a05721172272f7aab685b56d99e86814a15bff.
There was further discussion on the PR about whether the intinsics
should exist in this form.
|
|
- Remove calls to pass initialization from pass constructors.
- https://github.com/llvm/llvm-project/issues/111767
|
|
(#133015)
Add a buffer_fat_ptr_load_lds intrinsic, by analogy with
global_load_lds, which enables using `ptr addrspace(7)` to set the rsrc
and offset arguments to raw_ptr_buffer_load_lds.
|
|
This PR updates AMDGPULowerBufferFatPointers to use the
InstSimplifyFolder
when creating IR during buffer fat pointer lowering.
This shouldn't cause any large functional changes and might improve the
quality of the generated code.
|
|
Replace `undef` debug info with `poison`.
|
|
(#126621)" (#129078)
This reverts commit 1559a65efaf327f9c72e14d4bb1834f076e7fc20.
Fixed test (I suspect broken by unrelated change in the merge)
|
|
This reverts commit 469757efafebdd5772d993fca4dc0dfa7cbda17c.
Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/126621
|
|
Since LowerBufferFatPointers runs before PreISelIntrinsicLowering, which
normally handles unsupported memcpy()s,, and since you can't have a
`noalias {ptr addrspace(8), i32}` becasue it crashes later passes,
manually expand memcpy()s involving buffer fat pointers to loops.
Additionally, though they're unlikely to be used, this commit adds
support for memset().
This commit doesn't implement writing direct-to-LDS loads as the
intrinsics, but leaves the option in the future.
|
|
Attempting to pass a `ptr addrspace(7)` to functions that take `ptr`
arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7)
to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP
operations on buffer resources, which can't be GEP'd. (However, note
that, while unimplemneted, addressspacecast from ptr addrspace(7) to ptr
is legal - it's just an effective address computation)
To resolve this problem, and thus prevent illegal
`getelementptr T, ptr addrspace(8) %x, ...` s from being produces, this
commit extends amdgcn.make.buffer.rsrc to also be variadic in its result
type, auto-upgrading old manglings.
The logic for handling a make.buffer.rsrc in instruction selection
remains untouched and expects the output type to be a ptr addrspace(8),
as does the Clang lowering for its builtin (the pointer-to-pointer
version might want a different name in clang). LowerBufferFatPointers
has been updated to lower
amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p* .
This'll also make exposing buffer fat pointers in Clang easier, since
you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.
|
|
The lowering for GEP didn't properly support the case where the pointer
argument was being implicitly broadcast by a vector of indices. Fix
that.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
lowering" (#123660)
(#123657)
This reverts commit 64749fb01538fba2b56d9850497d5f3a626cabc2.
Adds a constructor to VecSlice to address the failure
|
|
(#123657)
Reverts llvm/llvm-project#110572
Seem to have broken a buildbot, not sure why
https://lab.llvm.org/buildbot/#/builders/108/builds/8346
|
|
The current lowering for ptr addrspace(7) assumed that the instruction
selector can handle arbtrary LLVM types, which is not the case. Code
generation can't deal with
- Values that aren't 8, 16, 32, 64, 96, or 128 bits long
- Aggregates (this commit only handles arrays of scalars, more may come)
- Vectors of more than one byte
- 3-word values that aren't a vector of 3 32-bit values (for axample, a
<6 x half>)
This commit adds a buffer contents type legalizer that adds the needed
bitcasts, zero-extensions, and splits into subcompnents needed to
convert a load or store operation into one that can be successfully
lowered through code generation.
In the long run, some of the involved bitcasts (though potentially not
the buffer operation splitting) ought to be handled by the instruction
legalizer, but SelectionDAG makes this difficult.
It also takes advantage of the new `nuw` flag on `getelementptr` when
lowering GEPs to offset additions.
We don't currently plumb through `nsw` on GEPs since that should likely
be a separate change and would require declaring what we mean by "the
address" in the context of the GEP guarantees.
|
|
Use typeIncompatible() to drop attributes incompatible with the new
argument/return type, instead of keeping a custom list.
|
|
amdgpu-lower-buffer-fat-pointers (#120139)
|
|
This commit usis the `nuw` flag on `getelemnetptr` to set the `nuw` flag
on buffer offset additions, and also moves from `inbounds` to the looser
`nusw` for the existing case.
|
|
This patch works around:
llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1101:13:
error: enumeration values 'USubCond' and 'USubSat' not handled in
switch [-Werror,-Wswitch]
I've notified the author in #105568.
|
|
Upstream the intrinsics `llvm.amdgcn.raw.atomic.buffer.load`
and `llvm.amdgcn.raw.atomic.ptr.buffer.load`.
These additional intrinsics mark atomic buffer loads
as atomic to LLVM by removing the `IntrReadMem`
attribute. Otherwise, it could hoist these
intrinsics out of loops in cases where LLVM marks
them as invariant. That can cause issues such as
infinite loops.
Continuation of https://reviews.llvm.org/D138786
with the additional use in the fat buffer lowering,
more test cases and the additional ptr versions
of these intrinsics.
---------
Co-authored-by: rtayl <>
Co-authored-by: Jay Foad <jay.foad@amd.com>
Co-authored-by: Mariusz Sikora <mariusz.sikora@amd.com>
|
|
|
|
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
|
|
Expand all constant expressions that use fat pointers upfront, so that
the rewriting logic only has to deal with instructions and not the
constant expression variants as well.
My primary motivation is to remove the creation of illegal constant
expressions (mul and shl) from this pass, but this also cuts down quite
a bit on the amount of duplicate logic.
|
|
For ptrtoint that truncates to the offset only, the expansion generated
a shift by the bit width, which is poison. Instead, we should return the
offset directly.
(The same problem exists for the constant expression case, but I plan to
address that separately, and more comprehensively.)
|
|
expressions
We expect all of these ConstantExpr ctors to fold away, don't try
to preserve flags, especially as the flags are not correct.
|
|
OffAccum will never be nullptr now, instead check for a zero
constant.
|
|
Use emitGEPOffset() to emit the GEP offset, which already has all the
necessary logic.
This also fixes the nuw flag incorrectly being set on the offset
calculation, while only nsw is implied by inbounds.
|
|
This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672.
The three possible flags are encapsulated in the new `GEPNoWrapFlags`
class. Currently this class has a ctor from bool, interpreted as the
InBounds flag. This ctor should be removed in the future, as code gets
migrated to handle all flags.
There are a few places annotated with `TODO(gep_nowrap)`, where I've had
to touch code but opted to not infer or precisely preserve the new
flags, so as to keep this as NFC as possible and make sure any changes
of that kind get test coverage when they are made.
|
|
|