| Age | Commit message | Author | Files | Lines |
|
(#164499)
This patch adds LLVM IR intrinsics and basic codegen support for the
XSfvfexp* and XSfvfexpa* extensions.
---------
Co-authored-by: Jesse Huang <jesse.huang@sifive.com>
Co-authored-by: Craig Topper <craig.topper@sifive.com>
|
|
This patch introduces support for the Hexagon V81 architecture. It
includes instruction formats, definitions, encodings, scheduling
classes, and builtins/intrinsics.
|
|
Clean up AnalysisConsumer code from the timer-related branches that are
not used most of the time, and move this logic to Timer.cpp, which is a
more relevant place and allows for a cleaner implementation.
|
|
This patch adds HVX vgather/vscatter generation for i16, i32, and i8. It
also adds a flag to control generation of scatter/gather instructions
for HVX. The flag defaults to "disable".
Co-authored-by: Sergei Larin <slarin@codeaurora.org>
Co-authored-by: Sergei Larin <slarin@quicinc.com>
Co-authored-by: Maxime Schmitt <maxime.schmitt@qti.qualcomm.com>
|
|
Eventually this should be program state, and not part of TargetLowering
so avoid direct references to the libcall functions in it.
The usage of RuntimeLibcallsInfo here is not good though, particularly
the use through TargetTransformInfo. It would be better if the IR
attributes were directly encoded in the libcall definition (or at least made
consistent elsewhere). The parsing of the attributes should not also be
responsible for doing the libcall recognition, which is the only part pulling in
the dependency.
|
|
This has been dead since 97bfb936af4077e8cb6c75664231f27a9989d563
|
|
Add GlobalISel lowering of G_FMINIMUM and G_FMAXIMUM following the same
logic as in SDag's expandFMINIMUM_FMAXIMUM.
Update AMDGPU legalization rules: pre-GFX12 now uses the new lowering
method, and G_FMINNUM_IEEE and G_FMAXNUM_IEEE are made legal to match SDag.
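As a rough illustration of the semantics such a lowering has to preserve, here is a scalar C++ sketch of IEEE-754-2019 `minimum` (NaN propagates, and -0.0 orders below +0.0). The function name `fminimumSketch` is hypothetical, and this is not the code in expandFMINIMUM_FMAXIMUM or in the GlobalISel lowering; it only mirrors the three steps of compare-and-select, NaN propagation, and signed-zero fix-up.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar model of "minimum" semantics; not LLVM code.
inline double fminimumSketch(double A, double B) {
  // NaN in either operand propagates to the result.
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  // Ordinary compare-and-select for distinct values.
  if (A < B)
    return A;
  if (B < A)
    return B;
  // A == B here, which includes +0.0 vs -0.0: prefer the negative zero.
  return std::signbit(A) ? A : B;
}

// Usage: the properties a lowering must keep.
inline void checkFMinimumSketch() {
  assert(fminimumSketch(1.0, 2.0) == 1.0);
  assert(std::isnan(fminimumSketch(std::nan(""), 1.0)));
  assert(std::signbit(fminimumSketch(0.0, -0.0)));
}
```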
|
|
(#164133)
The predicate system is currently primitive and alternative call
predicates should be mutually exclusive.
|
|
All 3 implementations are just checking if this has the
windows check function, so merge that as the only implementation.
|
|
InstSimplifyFolder can fold binary intrinsics, so take the opportunity
to unify code with getOpcodeOrIntrinsicID, and handle the case. The
additional handling of WidenGEP is non-functional, as the GEP is
simplified before it is widened, as the included test shows.
|
|
#148410 (#164551)
This PR reapplies the changes previously introduced in #148410.
It introduces a redesigned and rebuilt Cling-based auto-loading
workaround that enables scanning libraries and resolving unresolved
symbols within those libraries.
|
|
This patch improves constant folding through `llvm.vector.insert`. It
does not change anything for fixed-length vectors (which can already be
folded to ConstantVectors for these cases), but folds scalable vectors
that otherwise would not be folded.
These folds preserve the destination vector (which could be undef or
poison), giving targets more freedom in lowering the operations.
|
|
I'm not sure whether this is the best way forward, but we repeatedly run
into issues from forgetting that shuffle_vectors can be scalar. (The
recently added known-bits code is another example.) As a scalar-dst
shuffle vector is just an extract, and a
scalar-source shuffle vector is just a build vector, this patch makes
scalar shuffle vector illegal and adjusts the irbuilder to create the
correct node as required.
Most targets do this already through lowering or combines. Making scalar
shuffles illegal simplifies gisel as a whole; it just requires that
transforms that create shuffles of new sizes account for the scalar
shuffle being illegal (mostly IRBuilder and LessElements).
|
|
We could inadvertently create new entries in the PrevailingModuleForGUID
map during lookup, which was always using operator[]. In most cases we
will have one for external symbols, but not in cases where the
prevailing copy is in a native object. Or if this happened to be looked
up for a local.
Make the map private and create and use accessors.
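The pitfall described above can be reproduced with a standard container. The sketch below uses `std::map` and a hypothetical wrapper class `PrevailingModulesSketch`; the accessor names are assumptions for illustration, not the ones added by the patch. It shows that `operator[]` inserts on lookup while a `find`-based accessor does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the class that owns the map.
class PrevailingModulesSketch {
  // Private, so callers cannot reach operator[] directly.
  std::map<uint64_t, std::string> PrevailingModuleForGUID;

public:
  void setPrevailing(uint64_t GUID, std::string Module) {
    PrevailingModuleForGUID[GUID] = std::move(Module);
  }
  // Read-only lookup: never creates an entry.
  std::optional<std::string> getPrevailing(uint64_t GUID) const {
    auto It = PrevailingModuleForGUID.find(GUID);
    if (It == PrevailingModuleForGUID.end())
      return std::nullopt;
    return It->second;
  }
  std::size_t size() const { return PrevailingModuleForGUID.size(); }
};

inline void checkLookupDoesNotInsert() {
  std::map<uint64_t, std::string> Raw;
  (void)Raw[42];           // operator[] default-constructs a value
  assert(Raw.size() == 1); // the "lookup" grew the map

  PrevailingModulesSketch P;
  P.setPrevailing(1, "a.o");
  assert(!P.getPrevailing(42).has_value());
  assert(P.size() == 1); // unchanged by the failed lookup
}
```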
|
|
This introduces the Armv9.7-A architecture version, including the
relevant command-line option for -march.
More details about the Armv9.7-A architecture version can be found at:
* https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2025
* https://developer.arm.com/documentation/109697/2025_09/2025-Architecture-Extensions
* https://developer.arm.com/documentation/ddi0602/2025-09/
Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
|
|
Print a note when the manually specified name in an intrinsic matches
the default name it would have been assigned based on the record name,
in which case the manual specification is redundant and can be
eliminated.
Also remove existing redundant manual names.
|
|
Similar to other code in ADT / STLExtras, allow `to_vector` to work with
ranges that require ADL to find the begin/end iterators.
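The usual way to make a helper ADL-friendly is the `using std::begin; begin(R)` idiom. The following is a minimal standalone sketch of that idiom, not the ADT implementation: `sketch::toVector` and the `user::Triple` type are hypothetical names, and the result is a `std::vector` rather than a `SmallVector`.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

namespace sketch {
// Hypothetical helper: copies any range into a std::vector.
template <typename Range> auto toVector(Range &&R) {
  using std::begin; // fallback for arrays and containers
  using std::end;
  auto First = begin(R); // unqualified call, so ADL can find user::begin
  auto Last = end(R);
  using T = std::decay_t<decltype(*First)>;
  return std::vector<T>(First, Last);
}
} // namespace sketch

namespace user {
// A range with no member begin()/end(); only free functions exist.
struct Triple {
  int Data[3];
};
inline const int *begin(const Triple &T) { return T.Data; }
inline const int *end(const Triple &T) { return T.Data + 3; }
} // namespace user

inline void checkToVectorSketch() {
  user::Triple T{{3, 2, 1}};
  auto V = sketch::toVector(T);
  assert(V.size() == 3 && V[0] == 3 && V[2] == 1);
}
```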
|
|
(#164329)
Add support for Machine IR (MIR) triplet and entity generation in llvm-ir2vec.
This change extends llvm-ir2vec to support Machine IR (MIR) in addition to LLVM IR, enabling the generation of training data for MIR2Vec embeddings. MIR2Vec provides machine-level code embeddings that capture target-specific instruction semantics, complementing the target-independent IR2Vec embeddings.
- Extended llvm-ir2vec to support triplet and entity generation for Machine IR (MIR)
- Added `--mode=mir` option to specify MIR mode (vs LLVM IR mode)
- Implemented MIR triplet generation with Next and Arg relationships
- Added entity mapping generation for MIR vocabulary
- Updated documentation to explain MIR-specific features and usage
(Partially addresses #162200 ; Tracking issue - #141817)
|
|
Adding Matching and Inference Functionality to Propeller. For detailed
information, please refer to the following RFC:
https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238.
This is the second PR, which includes the calculation of basic block
hashes and their emission to the ELF file. It is associated with the
previous PR at https://github.com/llvm/llvm-project/pull/160706.
co-authors: lifengxiang1025
[lifengxiang@kuaishou.com](mailto:lifengxiang@kuaishou.com); zcfh
[wuminghui03@kuaishou.com](mailto:wuminghui03@kuaishou.com)
Co-authored-by: lifengxiang1025 <lifengxiang@kuaishou.com>
Co-authored-by: zcfh <wuminghui03@kuaishou.com>
Co-authored-by: Rahman Lavaee <rahmanl@google.com>
|
|
Add DTLTO linker option `--thinlto-remote-compiler-prepend-arg` to
enable support for the multi-call LLVM driver that requires an
additional option to specify the subcommand, e.g. "llvm clang ...".
Fixes https://github.com/llvm/llvm-project/issues/159125.
|
|
(#164046)
We are scanning through every single definition of a vtable across all
translation units which is unnecessary in most cases.
If this is a local, we want to make sure there isn't another local with
the same GUID due to it having the same relative path. However, we were
always scanning through every single summary in all cases.
We can now check the new HasLocal flag added in PR164647 ahead of the
loop, instead of checking on every iteration.
This cut down a large thin link by around 6%, which was over half the
time it spent in WPD.
Note that we previously took the last conforming vtable summary, and now
we use the first. This caused a test difference in one somewhat
contrived test for vtables in comdats.
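The shape of the optimization can be sketched as follows. All types and names here (`SummarySketch`, `SummaryInfoSketch`, `pickVTableSummary`) are hypothetical simplifications, and the exact condition used in WPD is an assumption; the point is only that a per-list flag lets the ambiguity check run once before the scan, after which the first summary can be taken.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified summary record.
struct SummarySketch {
  std::string ModulePath;
  bool IsLocal;
};

// Hypothetical per-GUID list that maintains a HasLocal flag on insert.
struct SummaryInfoSketch {
  std::vector<SummarySketch> SummaryList;
  bool HasLocal = false;
  void add(SummarySketch S) {
    HasLocal |= S.IsLocal;
    SummaryList.push_back(std::move(S));
  }
};

// Returns the summary to use, or nullptr when the GUID is ambiguous.
inline const SummarySketch *pickVTableSummary(const SummaryInfoSketch &Info) {
  if (Info.SummaryList.empty())
    return nullptr;
  // One check ahead of any scan: a local sharing its GUID with another
  // summary is ambiguous (assumed bail-out condition).
  if (Info.HasLocal && Info.SummaryList.size() > 1)
    return nullptr;
  // Otherwise the first summary is taken; no full scan is needed.
  return &Info.SummaryList.front();
}

inline void checkPickVTableSummary() {
  SummaryInfoSketch Externals;
  Externals.add({"a.o", false});
  Externals.add({"b.o", false});
  assert(pickVTableSummary(Externals)->ModulePath == "a.o");
}
```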
|
|
Note that "override" makes "virtual" redundant.
Identified with modernize-use-override.
|
|
We can pass a range directly to hash_combine_range these days.
|
|
This PR bundles partial reductions inside the VPExpressionRecipe class.
Stacked PRs:
1. https://github.com/llvm/llvm-project/pull/147026
2. https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/156976
4. https://github.com/llvm/llvm-project/pull/160154
5. -> https://github.com/llvm/llvm-project/pull/147302
6. https://github.com/llvm/llvm-project/pull/162503
7. https://github.com/llvm/llvm-project/pull/147513
|
|
This patch introduces SDNodeFlags::InBounds, to show that an ISD::PTRADD SDNode
implements an inbounds getelementptr operation (i.e., the pointer operand is in
bounds wrt. an allocated object it is based on, and the arithmetic does not
change that). The flag is set in the DAG construction when lowering inbounds
GEPs.
Inbounds information is useful in the ISel when selecting memory instructions
that perform address computations whose intermediate steps must be in the same
memory region as the final result. Follow-up patches to propagate the flag in
DAGCombines and to use it when lowering AMDGPU's flat memory instructions,
where the immediate offset must not affect the memory aperture of the address
(similar to this GISel patch: #153001), are planned.
This mirrors #150900, which has introduced a similar flag in GlobalISel.
This patch supersedes #131862, which previously attempted to introduce an
SDNodeFlags::InBounds flag. The difference between this PR and #131862 is that
there is now an ISD::PTRADD opcode (PR #140017) and the InBounds flag is only
defined to apply to ISD::PTRADD DAG nodes. It is therefore unambiguous that
in-bounds-ness refers to a memory object into which the left operand of the
PTRADD node points (in contrast to #131862, where InBounds would have applied
to commutative ISD::ADD nodes, so that the semantics would be more difficult to
reason about).
For SWDEV-516125.
|
|
|
|
This patch simplifies the AddInteger overloads by introducing
AddIntegerImpl, a helper function that handles both the 32-bit and
64-bit cases.
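One way such a helper can look is a single template that branches on the operand width at compile time. The class `NodeIDSketch` below is a hypothetical stand-in, and the choice of storing 64-bit values as two 32-bit words (low word first) is an assumption made for illustration, not a statement about the real implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an ID built from 32-bit words.
class NodeIDSketch {
  std::vector<uint32_t> Bits;

  // Single helper covering every integer width up to 64 bits.
  template <typename T> void addIntegerImpl(T I) {
    static_assert(std::is_integral_v<T> && sizeof(T) <= sizeof(uint64_t),
                  "unsupported integer type");
    Bits.push_back(static_cast<uint32_t>(I)); // low 32 bits
    if constexpr (sizeof(T) > sizeof(uint32_t))
      Bits.push_back(static_cast<uint32_t>(static_cast<uint64_t>(I) >> 32));
  }

public:
  // The public overloads become one-line forwards.
  void addInteger(int I) { addIntegerImpl(I); }
  void addInteger(unsigned I) { addIntegerImpl(I); }
  void addInteger(long long I) { addIntegerImpl(I); }
  void addInteger(unsigned long long I) { addIntegerImpl(I); }

  const std::vector<uint32_t> &words() const { return Bits; }
};

inline void checkNodeIDSketch() {
  NodeIDSketch ID;
  ID.addInteger(7u);             // one word
  ID.addInteger(0x100000002ULL); // two words: low 2, high 1
  assert(ID.words().size() == 3);
}
```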
|
|
|
|
Fixes #142146
Add a nullptr check when a pass accepts `const TargetMachine &` in its
constructor; the checks are still not exhaustive.
|
|
Finds longest (almost) plain substring in the pattern.
Implementation is conservative to avoid false positives.
The result is not used to optimize `GlobPattern::match()`, so it's
calculated on request.
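To make the idea concrete, here is a deliberately simplified sketch, not the GlobPattern implementation. The function name `longestPlainSubstring` is hypothetical. It assumes `*` and `?` merely split literal runs, and it is conservative by giving up on the rest of the pattern at the first bracket, brace, or backslash, so that it never reports text that a match might not contain.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sketch: longest literal run that every match must contain.
inline std::string_view longestPlainSubstring(std::string_view Pat) {
  std::string_view Best;
  std::size_t Start = 0; // start of the current literal run
  for (std::size_t I = 0; I <= Pat.size(); ++I) {
    bool AtEnd = I == Pat.size();
    // Characters whose handling is complex: stop scanning entirely.
    bool Hard = !AtEnd && std::string_view("[]{}\\").find(Pat[I]) !=
                              std::string_view::npos;
    // Simple wildcards only end the current run.
    bool Wild = !AtEnd && (Pat[I] == '*' || Pat[I] == '?');
    if (AtEnd || Hard || Wild) {
      if (I - Start > Best.size())
        Best = Pat.substr(Start, I - Start);
      Start = I + 1;
      if (Hard)
        break; // conservative: ignore the remainder of the pattern
    }
  }
  return Best;
}

inline void checkLongestPlainSubstring() {
  assert(longestPlainSubstring("foo*barbaz?x") == "barbaz");
  assert(longestPlainSubstring("abc[d]efghij") == "abc");
}
```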
For
* https://github.com/llvm/llvm-project/pull/164545
---------
Co-authored-by: Luke Lau <luke@igalia.com>
|
|
|
|
|
|
Replace two StringRefs with one StringRef + 2 x size_t.
Prepare for:
* https://github.com/llvm/llvm-project/pull/164512
|
|
This patch moves llvm::identity to IndexedMap for two reasons:
- llvm::identity is used only by IndexedMap.
- llvm::identity is not suitable for general use as it is not quite
the same as std::identity despite the comments in identity.h.
Also, this patch renames the class to IdentityIndex and places it in
the "detail" namespace.
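For readers unfamiliar with the pattern, the sketch below shows a class-template identity functor used as the default index mapping of a small indexed container. It is an approximation written for illustration: `IndexedMapSketch` and its members are hypothetical, and the functor only resembles the one described here. Unlike `std::identity`, which is a non-template type with a templated call operator, this functor is a class template fixed to one argument type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace detail {
// Identity functor tied to a single type, exposing argument_type.
template <typename Ty> struct IdentityIndex {
  using argument_type = Ty;
  Ty &operator()(Ty &Self) const { return Self; }
  const Ty &operator()(const Ty &Self) const { return Self; }
};
} // namespace detail

// Hypothetical, minimal indexed map: keys are mapped to vector slots.
template <typename T, typename ToIndexT = detail::IdentityIndex<unsigned>>
class IndexedMapSketch {
  using IndexT = typename ToIndexT::argument_type;
  std::vector<T> Storage;
  ToIndexT ToIndex;

public:
  // Ensure that key N has a slot.
  void grow(IndexT N) {
    std::size_t NewSize = ToIndex(N) + 1;
    if (NewSize > Storage.size())
      Storage.resize(NewSize);
  }
  T &operator[](IndexT N) {
    assert(ToIndex(N) < Storage.size() && "index out of bounds");
    return Storage[ToIndex(N)];
  }
};

inline void checkIndexedMapSketch() {
  IndexedMapSketch<int> M;
  M.grow(3);
  M[3] = 42;
  assert(M[3] == 42);
}
```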
|
|
Add MIR2Vec support to the llvm-ir2vec tool, enabling embedding generation for Machine IR alongside the existing LLVM IR functionality.
(This is an initial integration; Other entity/triplet gen for vocab generation would follow as separate patches)
|
|
Give the call graph section a dedicated type instead of the generic
progbits type.
|
|
|
|
We already have a matching constructor from ArrayRef, so add support for
assigning from ArrayRef as well.
|
|
Add a flag to the GlobalValueSummaryInfo indicating whether the
associated SummaryList (all summaries with the same GUID) contains any
summaries with local linkage. This flag is set when building the index,
so it is associated with the original linkage type before
internalization and promotion. Consumers should check the
withInternalizeAndPromote() flag on the index before using it.
In most cases we expect a 1-1 mapping between a GUID and a summary with
local linkage, because for locals the GUID is computed from the hash of
"modulepath;name". However, there can be multiple locals with the same
GUID if translation units are not compiled with enough path. And in rare
but theoretically possible cases, there can be hash collisions on the
underlying MD5 computation. So to be safe when looking for local
summaries, analyses currently look through all summaries in the list.
These lists can be extremely long in the case of large binaries with
template function defs in widely used headers (i.e. linkonce_odr).
A follow on change will use this flag to reduce ThinLTO analysis time in
WPD by 5-6% for a large target (details in PR164046 which will be
reworked to use this flag).
Note that in the past we have tried to keep bits related to the GUID in
the ValueInfo (which has a pointer to the associated
GlobalValueSummaryInfo), via its PointerIntPair. However, we are out of
bits there. This change does add a byte to every GlobalValueSummaryInfo
instance, which I measured as a little under 0.90% overhead in a large
target. However, it enables adding 7 bits of other per-GUID flags in the
future without adding more overhead. Note that it was lower overhead to
add this to the GlobalValueSummaryInfo than the ValueInfo, which tends
to be copied into other maps.
|
|
Handling opcodes in embedding computation.
- Revamped MIR Vocabulary with four sections - `Opcodes`, `Common Operands`, `Physical Registers`, and `Virtual Registers`
- Operands broadly fall into 3 categories: the generic MO types that are common across architectures, physical register classes, and virtual register classes. We handle these categories separately in MIR2Vec. (Though we have the same classes for both physical and virtual registers, their embeddings vary.)
|
|
|
|
Add a new getGEPExpr variant which is independent of GEPOperator*.
To be used to construct SCEVs for VPlan recipes in
https://github.com/llvm/llvm-project/pull/161276.
PR: https://github.com/llvm/llvm-project/pull/164487
|
|
This PR is part of the LLVM IR LSP server project
([RFC](https://discourse.llvm.org/t/rfc-ir-visualization-with-vs-code-extension-using-an-lsp-server/87773))
To be able to make an LSP server, it's crucial to have location
information about the LLVM objects (Functions, BasicBlocks and
Instructions).
This PR adds:
* Position tracking to the Lexer
* A new AsmParserContext class, to hold the new position info
* Tests to check if the location is correct
The AsmParserContext can be passed as an optional parameter into the
parser, which populates it so that it can then be used by other tools,
such as the LSP server.
The AsmParserContext idea was borrowed from MLIR, as we didn't want to
store data no one else uses inside the objects themselves. But the
implementation is different: this class holds several maps of Functions,
BasicBlocks and Instructions, to map them to their location.
And some utility methods were added to get the positions of the
processed tokens.
|
|
Add an index-wide flag indicating whether index-based internalization
and promotion have completed. This will be used in a follow on change.
|
|
Note that "override" makes "virtual" redundant.
Identified with modernize-use-override.
|
|
We've switched to llvm::identity_cxx20 for SparseMultiSet, so we don't
need llvm::identity in this file.
|
|
This patch replaces std::enable_if_t with std::void_t in two type
traits. Both approaches enable the template specialization if and
only if the LLVM_BITMASK_LARGEST_ENUMERATOR enumerator exists.
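The equivalence can be seen in a standalone sketch. The trait names `HasLargestEnableIf` and `HasLargestVoidT` and the two enums are hypothetical; only the enumerator name `LLVM_BITMASK_LARGEST_ENUMERATOR` comes from the text above. Both specializations are selected exactly when `E::LLVM_BITMASK_LARGEST_ENUMERATOR` is a valid expression.

```cpp
#include <type_traits>

// Hypothetical enums: one without the marker enumerator, one with it.
enum class Plain { A, B };
enum class Flags { X = 1, Y = 2, LLVM_BITMASK_LARGEST_ENUMERATOR = Y };

// Detection with std::enable_if_t: the condition is always true when it
// is well-formed, so only the validity of the expression matters.
template <typename E, typename = void>
struct HasLargestEnableIf : std::false_type {};
template <typename E>
struct HasLargestEnableIf<
    E, std::enable_if_t<(sizeof(E::LLVM_BITMASK_LARGEST_ENUMERATOR) >= 0)>>
    : std::true_type {};

// Detection with std::void_t: states the intent directly.
template <typename E, typename = void>
struct HasLargestVoidT : std::false_type {};
template <typename E>
struct HasLargestVoidT<
    E, std::void_t<decltype(E::LLVM_BITMASK_LARGEST_ENUMERATOR)>>
    : std::true_type {};

// Both traits agree on both enums.
static_assert(HasLargestEnableIf<Flags>::value == HasLargestVoidT<Flags>::value);
static_assert(HasLargestEnableIf<Plain>::value == HasLargestVoidT<Plain>::value);
```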
|
|
`ISD::VSelect` (#164069)
Fixes #150019
|
|
Refactor the AllocToken pass to accept the mode via pass options rather
than LLVM cl::opt. This is both cleaner and required to make the
mode frontend-driven and avoid potential inconsistencies.
|
|
Refactor the stateless (hash-based) token calculation logic out of the
`AllocToken` pass and into `llvm/Support/AllocToken.h`.
This helps with making the token calculation logic available to other
parts of the codebase, which will be necessary for frontend
implementation of `__builtin_infer_alloc_token` to perform constexpr
evaluation.
The `AllocTokenMode` enum and a new `AllocTokenMetadata` struct are
moved into a shared header. The `getAllocTokenHash()` function now
provides the source of truth for calculating token IDs for `TypeHash`
and `TypeHashPointerSplit` modes.
|