|
This flag was used to let us incrementally introduce debug records
into LLVM; however, everything now uses records. The flag serves no
purpose any more, so delete it.
|
|
By chance, two things have prevented the autoupgrade path being
exercised much so far:
* LLParser setting the debug-info mode to "old" on seeing intrinsics,
* The test in AutoUpgrade.cpp wanting to upgrade into a "new" debug-info
block.
In practice, this appears to mean this code path hasn't seen the various
invalid inputs that can come its way. This commit does a number of
things:
* Tolerates the various illegal inputs that can be written with
debug-intrinsics, and that must be tolerated until the Verifier runs,
* Printing illegal/null DbgRecord fields must succeed,
* Verifier errors need to localise the function/block where the error
is,
* Tests that now see debug records will print debug-record errors,
Plus a few new tests for other intrinsic-to-debug-record failure modes
I found. There are also two edge cases:
* Some of the unit tests switch back and forth between intrinsic and
record modes at will; I've deleted coverage and some assertions to
tolerate this as intrinsic support is now Gone (TM),
* In sroa-extract-bits.ll, the order of debug records flips. This is
because the autoupgrader upgrades in the opposite order to the basic
block conversion routines... which doesn't change the record order, but
_does_ change the use list order in Metadata! This should (TM) have no
consequence to the correctness of LLVM, but will change the order of
various records and the order of DWARF record output too.
I tried to reduce this patch to a smaller collection of changes, but
they're all intertwined, sorry.
|
|
Currently, GlobalObject has an "alignment" property... but it's
basically nonsense: alignment doesn't mean the same thing for variables
and functions, and it's completely meaningless for ifuncs.
This "removes" (actually marking protected) the methods from
GlobalObject, adds the relevant methods to Function and GlobalVariable,
and adjusts the code appropriately.
This should make future alignment-related cleanups easier.
|
|
Add an `alloc-variant-zeroed` function attribute which can be used to
inform the folding of allocation+memset pairs. This addresses
https://github.com/rust-lang/rust/issues/104847, where LLVM does not
know how to perform this transformation for non-C languages.
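A minimal sketch of how the attribute might be used (the allocator
names are hypothetical; the attribute value names the zero-initializing
variant):
```llvm
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1 immarg)

declare ptr @my_alloc(i64) "alloc-family"="my_alloc" "alloc-variant-zeroed"="my_zalloc"
declare ptr @my_zalloc(i64) "alloc-family"="my_alloc"

define ptr @f() {
  ; a call to @my_alloc followed by a memset-to-zero of the whole
  ; allocation can now be folded into a single call to @my_zalloc
  %p = call ptr @my_alloc(i64 64)
  call void @llvm.memset.p0.i64(ptr %p, i8 0, i64 64, i1 false)
  ret ptr %p
}
```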
Co-authored-by: Jamie <jamie@osec.io>
|
|
This patch implements verifier checks for range attributes on ImmArg
operands, enabling validation of ImmArg values in intrinsic calls
whenever the intrinsic definition includes range information.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
|
|
Thread-local globals live, by default, in the default globals address
space, which may not be 0, so we need to overload @llvm.thread.pointer
to support other address spaces, and use the default globals address
space in Clang.
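A sketch of the overloaded form (the `.p1` variant assumes a target
whose default globals address space is 1):
```llvm
; previously only the AS0 form existed
declare ptr @llvm.thread.pointer.p0()
; the return type is now overloaded over the address space
declare ptr addrspace(1) @llvm.thread.pointer.p1()
```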
|
|
Unreachable is transformed to llvm.amdgcn.unreachable() during exit
unification. Make sure the verifier tolerates this.
|
|
This adds basic support for objc_claimAutoreleasedReturnValue, which is
mostly equivalent to objc_retainAutoreleasedReturnValue, with the
difference that it doesn't require the marker nop to be emitted between
it and the call it was attached to.
To achieve that, this also teaches the AArch64 attachedcall bundle
lowering to pick whether the marker should be emitted or not based on
whether the attachedcall target is claimARV or retainARV.
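Roughly, in IR (callee name hypothetical), the claim variant appears in
the same attachedcall bundle position as the retain variant:
```llvm
declare ptr @get_object()
declare ptr @objc_claimAutoreleasedReturnValue(ptr)

define ptr @caller() {
  ; with claimRV as the attachedcall target, the AArch64 lowering no
  ; longer emits the marker nop between the call and the ARC runtime call
  %obj = call ptr @get_object() [ "clang.arc.attachedcall"(ptr @objc_claimAutoreleasedReturnValue) ]
  ret ptr %obj
}
```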
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
|
|
Also fix some typos in comments
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
Currently, each variant in the variant part of a structure type can only
contain a single member. This was sufficient for Rust, where each
variant is represented as its own type.
However, this isn't really enough for Ada, where a variant can have
multiple members.
This patch adds support for this scenario. This is done by allowing the
use of DW_TAG_variant by DICompositeType, and then changing the DWARF
generator to recognize when a DIDerivedType representing a variant holds
one of these. In this case, the fields from the DW_TAG_variant are
inlined into the variant, like so:
```
<4><7d>: Abbrev Number: 9 (DW_TAG_variant)
<7e> DW_AT_discr_value : 74
<5><7f>: Abbrev Number: 7 (DW_TAG_member)
<80> DW_AT_name : (indirect string, offset: 0x43): field0
<84> DW_AT_type : <0xa7>
<88> DW_AT_alignment : 8
<89> DW_AT_data_member_location: 0
<5><8a>: Abbrev Number: 7 (DW_TAG_member)
<8b> DW_AT_name : (indirect string, offset: 0x4a): field1
<8f> DW_AT_type : <0xa7>
<93> DW_AT_alignment : 8
<94> DW_AT_data_member_location: 8
```
Note that the intermediate DIDerivedType is still needed in this
situation, because that is where the discriminants are stored.
|
|
Migrate their usage to the `AnyMem*Inst` family, and add an isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for isAtomic on load, store, and similar instructions, and the
existing isVolatile idioms on mem* routines, and allows us to more
easily share code between atomic and non-atomic variants.
As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
|
|
While external globals can be unsized, I don't think an unsized
initializer makes sense.
It seems like the backend currently ends up treating this as a zero-size
global. If people want that behavior, they should declare it as such.
|
|
When checking that string attribute values are valid, it's not
necessary to check hasFnAttr prior to querying the value.
|
|
We currently accept label arguments to inline asm calls. This support
predates both blockaddresses and callbr and is only covered by one X86
test. Remove it in favor of callbr (or at least blockaddress, though
that cannot guarantee correct codegen, just like using block labels
directly can't).
I didn't bother implementing bitcode upgrade support for this, but I can
add it if desired.
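For reference, a minimal callbr-based replacement for an asm-goto-style
label argument might look like this (the asm text is illustrative x86):
```llvm
define i32 @f(i32 %x) {
entry:
  ; callbr carries the possible asm jump targets explicitly, so no
  ; label arguments to the asm call are needed
  callbr void asm "cmp $0, 42; je ${1:l}", "r,!i"(i32 %x)
      to label %cont [label %err]

cont:
  ret i32 0

err:
  ret i32 1
}
```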
|
|
This PR updates the `Verifier` to enforce that `alloca` instructions on
AMDGPU must be in AS5. This prevents hitting a misleading backend error
like "unable to select FrameIndex," which makes it look like a backend
bug when it's actually an IR-level issue.
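For example (triple shown for illustration):
```llvm
target triple = "amdgcn-amd-amdhsa"

define void @f() {
  %ok = alloca i32, align 4, addrspace(5) ; private stack allocation: accepted
  ; %bad = alloca i32, align 4            ; AS0 alloca: now a verifier error
  ret void
}
```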
|
|
In #132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).
It turns out that the ABI does not promise that routines will preserve
ZT0 (though in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.
The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
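A sketch of the callsite marking (callee name hypothetical):
```llvm
declare void @callee()

define void @caller() "aarch64_new_zt0" {
  ; ZT0 holds no meaningful value at this point, so the callsite is
  ; marked and no ZT0 spill is emitted around it
  call void @callee() "aarch64_zt0_undef"
  ret void
}
```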
|
|
Fixes #126417.
Currently, assignment tracking recognizes allocas, stores, and mem
intrinsics as valid instructions to tag with DIAssignID, with allocas
representing the allocation for a variable and the others representing
instructions that may assign to the variable. There are, however, other
intrinsics that can perform these assignments, and if we transform a
store instruction into one of them and correctly transfer the
DIAssignID over, the result is a verifier error. The
AssignmentTrackingAnalysis pass also does not know how to handle these
intrinsics if they are untagged, as it does not know how to extract
assignment information (base address, offset, size) from them.
This patch adds _some_ support for intrinsics that may perform
assignments: masked store/scatter, and vp store/strided store/scatter.
This patch does not add support for extracting assignment information
from these, as they may store with either non-constant size or to
non-contiguous blocks of memory; instead it adds support for recognizing
untagged stores with "unknown" assignment info, for which we assume that
the memory location of the associated variable should not be used, as we
can't determine which fragments of it should or should not be used.
In principle, it should be possible to handle the more complex cases
mentioned above, but it would require more substantial changes to
AssignmentTrackingAnalysis, and it is mostly only needed as a fallback
if the DIAssignID is not preserved on these alternative stores.
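As an example of the tagging half (metadata ids illustrative), a masked
store can now carry the DIAssignID transferred from the store it
replaced:
```llvm
declare void @llvm.masked.store.v4i32.p0(<4 x i32>, ptr, i32 immarg, <4 x i1>)

define void @f(<4 x i32> %v, ptr %p, <4 x i1> %m) {
  ; the DIAssignID links this store back to the variable's dbg_assign
  call void @llvm.masked.store.v4i32.p0(<4 x i32> %v, ptr %p, i32 16, <4 x i1> %m), !DIAssignID !5
  ret void
}

!5 = distinct !DIAssignID()
```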
|
|
(#136285)
Closes #136144.
After the patch:
```llvm
; label-crash.ll
define void @invalid_arg_type(i32 %0) {
1:
call void @foo(label %1)
ret void
}
declare void @foo(label)
```
```
lambda@ubuntu22:~/test$ llc -o out.s label-crash.ll
Function argument cannot be of label type!
label %0
ptr @foo
llc: error: 'label-crash.ll': input module cannot be verified
```
|
|
There are LLVM passes that would invalidate the decision we make in
Clang.
|
|
|
|
verifier (#134910)
|
|
Relaxes the newly added verifier rule to also allow an integer argument
to dbg_declare, which is interpreted as a pointer. Adjusts CGP
(CodeGenPrepare) to deal with it gracefully.
Fixes https://github.com/llvm/llvm-project/issues/134523.
Alternative to https://github.com/llvm/llvm-project/pull/134601.
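A reduced sketch of the now-accepted form, with an integer-typed
address operand (metadata trimmed to the essentials):
```llvm
define void @f(i64 %addr) !dbg !6 {
    #dbg_declare(i64 %addr, !7, !DIExpression(), !8)
  ret void
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3}

!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, emissionKind: FullDebug)
!1 = !DIFile(filename: "t.c", directory: "")
!2 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!3 = !{i32 2, !"Debug Info Version", i32 3}
!6 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 1, unit: !0, spFlags: DISPFlagDefinition)
!7 = !DILocalVariable(name: "x", scope: !6, file: !1, line: 1, type: !2)
!8 = !DILocation(line: 1, scope: !6)
```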
|
|
This is mostly just a simplification. getCalledFunction is a best-effort
thing, so the verifier should not rely on it in most cases. The
exception is intrinsic calls, where we are guaranteed that the called
function is known, but most of those cases can be handled with
CallBase::getIntrinsicID instead.
---------
Co-authored-by: Tim Gymnich <tim@gymni.ch>
|
|
As far as I understand, the first operand of dbg_declare should be a
pointer (inside a metadata wrapper). However, using a non-pointer is
currently not rejected, and we have some tests that use non-pointer
types. As far as I can tell, these tests either meant to use dbg_value
or are just incorrect hand-crafted tests.
Ran into this while trying to `fix` #134008.
|
|
With -fpatchable-function-entry (or the patchable_function_entry
function attribute), we emit records of patchable entry locations to the
__patchable_function_entries section. Add an additional parameter to the
command line option that allows one to specify a different default
section name for the records, and an identical parameter to the function
attribute that allows one to override the section used.
The main use case for this change is the Linux kernel using prefix NOPs
for ftrace, and thus depending on __patchable_function_entries to locate
traceable functions. Functions that are not traceable currently disable
entry NOPs using the function attribute, but this creates a
compatibility issue with -fsanitize=kcfi, which expects all indirectly
callable functions to have a type hash prefix at the same offset from
the function entry.
Adding a section parameter would allow the kernel to distinguish between
traceable and non-traceable functions by adding entry records to
separate sections while maintaining a stable function prefix layout for
all functions. LKML discussion:
https://lore.kernel.org/lkml/Y1QEzk%2FA41PKLEPe@hirez.programming.kicks-ass.net/
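At the IR level this might look like the following sketch; the section
name is hypothetical, and the section-attribute spelling is assumed to
follow the existing "patchable-function-entry" string attribute:
```llvm
; entry records for this function go to the named section instead of
; the default __patchable_function_entries
define void @traced() "patchable-function-entry"="1"
                      "patchable-function-entry-section"="__traceable_fns" {
  ret void
}
```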
|
|
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.
Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
|
|
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch uses insert_range with
iterator ranges. For each case, I've verified that foos is defined as
make_range(foo_begin(), foo_end()) or in a similar manner.
|
|
pattern argument (#132026)
I initially thought starting with a more narrow definition and later
expanding would make more sense. But as pointed out in review for PR
#129220, this restriction is generating additional unnecessary work.
This patch alters the intrinsic to accept patterns of any type. Future
patches will update LoopIdiomRecognize and PreISelIntrinsicLowering to
take advantage of this. The verifier will complain if an unsized type is
used. I've additionally taken the opportunity to remove a comment from
the LangRef about some bit widths potentially not being supported by the
target. I don't think this is any more true than it is for
arbitrary-width loads/stores, which don't carry a similar warning that
I can see.
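A sketch of what the relaxed intrinsic permits (name mangling assumed;
the count is the number of times the pattern is stored):
```llvm
declare void @llvm.experimental.memset.pattern.p0.i64.i64(ptr, i64, i64, i1 immarg)

define void @f(ptr %dst, i64 %pattern, i64 %count) {
  ; any sized pattern type is now allowed, not just a fixed set of widths
  call void @llvm.experimental.memset.pattern.p0.i64.i64(ptr %dst, i64 %pattern, i64 %count, i1 false)
  ret void
}
```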
|
|
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
|
|
The `llvm.wasm.throw` intrinsic can throw, but it was not invokable. Not
sure what the rationale was when it was first written that way, but I
think at least in Emscripten's C++ exception support with the Wasm port
of libunwind, `__builtin_wasm_throw`, which is lowered down to
`llvm.wasm.throw`, is used only within `_Unwind_RaiseException`, which
is a one-liner and thus does not need an `invoke`:
https://github.com/emscripten-core/emscripten/blob/720e97f76d6f19e0c6a2d6988988cfe23f0517fb/system/lib/libunwind/src/Unwind-wasm.c#L69
(`_Unwind_RaiseException` is called by `__cxa_throw`, which is generated
by the `throw` C++ keyword)
But this does not address other direct uses of the builtin in C++, which
I'm not sure about but which are not prohibited. Also, other language
frontends may need to use the builtin in different functions, which may
have `try`-`catch`es or destructors.
This makes `llvm.wasm.throw` invokable in the backend. To do that, this
adds a custom lowering routine to `SelectionDAGBuilder::visitInvoke`,
like we did for `llvm.wasm.rethrow`.
This does not generate `invoke`s for `__builtin_wasm_throw` yet, which
will be done by a follow-up PR.
Addresses #124710.
|
|
An Ada program can have types that are subranges of other types. This
patch adds a new DIType node, DISubrangeType, to represent this concept.
I considered extending the existing DISubrange to do this, but as
DISubrange does not derive from DIType, that approach seemed more
disruptive.
A DISubrangeType can be used both as an ordinary type and as the
type of an array index. This is also important for Ada.
Ada subrange types can also be stored using a bias. Representing this in
the DWARF required the use of an extension. GCC has been emitting this
extension for years, so I've reused it here.
|
|
Check that the input register is an inreg argument to the parent
function. (See the comment in `IntrinsicsAMDGPU.td`.)
This LLVM defect was identified via the AMD Fuzzing project.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
Make sure that this intrinsic is followed by unreachable.
This LLVM defect was identified via the AMD Fuzzing project.
Thanks to @rovka for helping me solve this issue!
|
|
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators. The
call-sites updated in this patch are those where the dyn_cast_or_null cast
utility doesn't compose well with iterator insertion. It can distinguish
between nullptr and a "present" (non-null) Instruction pointer, but not
between a legal and illegal instruction iterator. This can lead to
end-iterator dereferences and thus crashes.
We can improve this in the future (as parent-pointers can now be accessed
from ilist nodes), but for the moment, add explicit tests for end()
iterators at the five call sites affected by this.
|
|
(#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; to ensure some type safety, however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
|
|
This removes some erroneous debug info from tests, which should address
the test failures that showed up when this was previously committed.
This reverts commit 6716ce8b641f0e42e2343e1694ee578b027be0c4.
|
|
Asserts on various tests/buildbots, at least one example is
DebugInfo/X86/set.ll
This reverts commit 2dc5682dacab2dbb52a771746fdede0e938fc6e9.
|
|
This came up recently with a nodebug case on CodeView that caused a
null entry in elements and crashed LLVM.
Original clang fix to avoid generating IR like this: 504dd577675e8c85cdc8525990a7c8b517a38a89
|
|
This implements the lowering of calls from agnostic-ZA functions to
non-agnostic-ZA functions, using the ABI routines
`__arm_sme_state_size`, `__arm_sme_save` and `__arm_sme_restore`.
This implements the proposal described in the following PRs:
* https://github.com/ARM-software/acle/pull/336
* https://github.com/ARM-software/abi-aa/pull/264
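Roughly, at the IR level (the attribute spelling is assumed from the
ACLE proposal; the callee is hypothetical):
```llvm
declare void @private_za_callee()

define void @agnostic_za_caller() "aarch64_za_state_agnostic" {
  ; lowered so the caller's ZA state is saved and restored around the
  ; call via __arm_sme_state_size/__arm_sme_save/__arm_sme_restore
  call void @private_za_callee()
  ret void
}
```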
|
|
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
|
|
Verify the format is valid and the type is one of the expected
i32 vectors. Verify the used vector types at least cover the
requirements of the corresponding format operand.
|
|
Add a property to allow marking target extension types that cannot be
used in an alloca instruction or byval argument, similar to CanBeGlobal
for global variables.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
For target extension types that are not allowed to be used as globals,
also disallow them nested inside array and structure types.
|
|
Improve the information printed when -memprof-report-hinted-sizes is
enabled. Now print the full context hash computed from the original
profile, similar to what we do when reporting matching statistics. This
will make it easier to correlate with the profile.
Note that the full context hash must be computed at profile match time
and saved in the metadata and summary, because we may trim the context
during matching when it isn't needed for distinguishing hotness.
Similarly, due to the context trimming, we may have more than one full
context id and total size pair per MIB in the metadata and summary,
which now get a list of these pairs.
Remove the old aggregate size from the metadata and summary support.
One other change from the prior support is that we no longer write the
size information into the combined index for the LTO backends, which
don't use this information; this reduces unnecessary bloat in
distributed index files.
|
|
Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the
test case.
Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline
As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).
|
|
This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.
Recent scheduling changes mean tests need to be re-generated. Reverting
to green while I do that.
|
|
Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline
As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).
|
|
This patch introduces an experimental intrinsic for matching the
elements of one vector against the elements of another.
For AArch64 targets that support SVE2, the intrinsic lowers to a MATCH
instruction for supported fixed and scalable vector types.
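A sketch of the intrinsic's shape (exact mangling and operand order are
assumed from the description):
```llvm
; mask-governed search of each lane of the first operand among the
; elements of the second (fixed) operand; returns a lane mask
declare <vscale x 16 x i1> @llvm.experimental.vector.match.nxv16i8.v16i8(<vscale x 16 x i8>, <16 x i8>, <vscale x 16 x i1>)
```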
|
|
Identified with misc-include-cleaner.
|
|
|