Age | Commit message (Collapse) | Author | Files | Lines |
|
objects (#104778)
Whilst dealing with review comments on
https://github.com/llvm/llvm-project/pull/96752
I discovered that SCEV does not know about the dereferenceable attribute
on function arguments so I have updated getRangeRef to make use of it
by calling getPointerDereferenceableBytes.
|
|
This is a reland of (#96287). This patch attempts to reduce the reverted
patch's clang compile time by removing #includes of float128.h and
inlining convertToQuad functions instead.
|
|
Analogous to PR #104491
Issue #89287
|
|
Eventually we'll need to flatten the profile (at the end of all IPO) and lower to "vanilla" `MD_prof`. This is the first part of that.
Issue #89287
|
|
- Move raw_ostream << operators for `ModRef` and `MemoryEffects` to a
new ModRef.cpp file under llvm/Support (instead of AliasAnalysis.cpp)
- This enables calling these operators from `Core` files like
Instructions.cpp (for instance for debugging). Currently, they live in
`LLVMAnalysis` which cannot be linked with `Core`.
|
|
This on its own gives small compile-time improvements in some configs
and enables using loop guards at more places in the future while keeping
compile-time impact low.
https://llvm-compile-time-tracker.com/compare.php?from=c44202574ff9a8c0632aba30c2765b134557435f&to=55ffc3dd920fa9af439fd39f8f9cc13509531420&stat=instructions:u
|
|
This will be needed when maintaining the contextual profile for ICP or inlining - we'll need to first fetch the ID of a callsite, which is in an instrumentation instruction (intrinsic) preceding the callsite.
|
|
Analysis (#104828)
Add Validator Version to information collected by Module Metadata
Analysis pass. An earlier change (#104040) added a default hardcoded
value for validator version to be associated with DXIL module created
during HLSL source compilation.
Add tests to verify validator version info collected
- Updated existing tests
- Added a test with validator version specified in DXIL metadata
|
|
This reverts commit 4aacc60fe7e1f7b3f788bba8382ea1fa5189ef3b.
The original implementation provided a simple method to check whether the forest
of nested cycles is well-formed. This is now augmented with other methods to
check well-formedness of every cycle, either individually, or as the entire
forest. These will be used by future transforms that modify CycleInfo.
|
|
This reverts commit b432afc28406b670a58933c2fe56c73e6f85911e.
Reverted due to linker failures in expensive-checks.
|
|
Use the nuw attribute of GEPs to prove that pointers do not alias, in
cases matching the following:
    +                +                     +
    | BaseOffset     |   +<nuw> Indices    |
    ---------------->|-------------------->|
    |-->V2Size       |                     |-------> V1Size
   LHS                                    RHS
If the difference between pointers is Offset +<nuw> Indices then we know
that the addition does not wrap the pointer index type (add nuw) and the
constant Offset is a lower bound on the distance between the pointers. We
can then prove NoAlias via Offset u>= V2Size.
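The final comparison can be sketched in isolation. This is a minimal standalone illustration, not the code from the patch: `provesNoAliasViaNuw` is a hypothetical name, and plain 64-bit integers stand in for the pointer index type.

```cpp
#include <cstdint>

// Hypothetical helper. Assumes the pointers differ by Offset +<nuw> Indices,
// so the constant Offset is a lower bound on the distance between them.
// Returns true when an access of V2Size bytes at the lower pointer cannot
// reach the higher pointer.
bool provesNoAliasViaNuw(uint64_t Offset, uint64_t V2Size) {
  // Unsigned comparison, matching "Offset u>= V2Size" in the message above.
  return Offset >= V2Size;
}
```

For example, a constant offset of 16 with an 8-byte access at the lower pointer would satisfy the check, while an offset of 4 would not.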
|
|
The original implementation provided a simple method to check whether
the forest of nested cycles is well-formed. This is now augmented with
other methods to check well-formedness of all cycles, either
individually, or as the entire forest. These will be used by future
transforms that modify CycleInfo.
|
|
`isKnownNonEqual`; NFC
Downstream hit this assert. Since it doesn't really make any
difference, just change the code to return false.
|
|
X, Y` (#104698)
These patterns are found in harfbuzz/typst.
Alive2: https://alive2.llvm.org/ce/z/cxyjYV
|
|
This transformation doesn't actually use any of the internal state of
LSR and recomputes all information from SCEV. Splitting it out makes
it easier to test.
Note that long term I would like to write a version of this transform
which *is* integrated with LSR's solver, but if that happens, we'll
just delete the extra pass.
Integration-wise, I switched from using TTI to using a pass configuration
variable. This seems slightly more idiomatic, and means we don't run
the extra logic on any target other than RISCV.
|
|
Remove unused direct includes and forward declarations in ADT and
Analysis headers.
|
|
buffers" (#104517)
Some build configs allow `llvm_unreachable` in a constexpr context, but
not all, so these functions that map a fully covered enum to a string
can't be constexpr. This version fixes that by dropping constexpr from
those functions.
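A minimal sketch of the resulting shape, assuming a hypothetical enum and helper names (`ResourceKind`, `kindName`, and `unreachableStandIn` are illustrative and not taken from the patch; `std::abort` stands in for `llvm_unreachable`):

```cpp
#include <cstdlib>

// Hypothetical enum; the real patch maps DXIL resource enums.
enum class ResourceKind { TypedBuffer, RawBuffer };

// Stand-in for llvm_unreachable, which is not usable in a constexpr
// context in every build configuration.
[[noreturn]] inline void unreachableStandIn() { std::abort(); }

// Deliberately not constexpr: the fully covered switch is followed by an
// unreachable marker that may not be a constant expression.
inline const char *kindName(ResourceKind K) {
  switch (K) {
  case ResourceKind::TypedBuffer:
    return "TypedBuffer";
  case ResourceKind::RawBuffer:
    return "RawBuffer";
  }
  unreachableStandIn();
}
```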
This reverts commit fcc318ff7960d7de8cbac56eb4f32b44b5261677, reapplying
28d577ecefa1557f5dea5566bf33b885c563d14b.
Original message follows:
This implements the DXILResourceAnalysis pass for `dx.TypedBuffer` and
`dx.RawBuffer` types. This should be sufficient to lower
`dx.handle.fromBinding` for this set of types, but it leaves a number of
TODOs around for other resource types.
This also includes a straightforward `print` method in `ResourceInfo` to
make the analysis testable. This is deliberately different than the
printer in `lib/Target/DirectX/DXILResource.cpp`, which attempts to
print bindings in a format compatible with the comments `dxc` prints. We
will eventually want to make that functionality driven by this analysis
pass, but it isn't sufficient for testing so we need both.
|
|
Use computeConstantDifference() instead of casting getMinusSCEV() to
SCEVConstant. This can be much faster in some cases, because
computeConstantDifference() computes the result without creating new
SCEV expressions.
This improves LTO/ThinLTO compile-time for lencod by more than 10%.
I've verified that computeConstantDifference() does not produce worse
results than the previous code for anything in llvm-test-suite. This
required raising the iteration cutoff to 6. I ended up increasing it to
8 just to be on the safe side (for code outside llvm-test-suite), and
because this doesn't materially affect compile-time anyway (we'll almost
always bail out earlier).
|
|
|
|
buffers" (#104504)
Reverts llvm/llvm-project#100699
This broke a few bots unfortunately.
|
|
`UseCtxProfile` (#104492)
|
|
We were missing the signed flag on the negative value, so the
range was incorrectly interpreted for integers larger than 64-bit.
Split out from https://github.com/llvm/llvm-project/pull/80309.
|
|
|
|
Continuing from #102084, which introduced the analysis, we now populate
it with info about functions contained in the module.
When we update the profile due to e.g. inlined callsites, we'll
ingest the callee's counters and callsites to the caller. We'll move
those to the caller's respective index space (counter and callers), so
we need to know and maintain where those currently end.
We also don't need to keep profiles not pertinent to this module.
This patch also introduces an arguably much simpler way to track the
GUID of a function from the frontend compilation, through ThinLTO, and
into the post-thinlink compilation step, which doesn't rely on keeping
names around. A separate RFC and patches will discuss extending this to
the current PGO (instrumented and sampled) and other consumers as an
infrastructural component.
|
|
This implements the DXILResourceAnalysis pass for `dx.TypedBuffer` and
`dx.RawBuffer` types. This should be sufficient to lower
`dx.handle.fromBinding` for this set of types, but it leaves a number
of TODOs around for other resource types.
This also includes a straightforward `print` method in `ResourceInfo`
to make the analysis testable. This is deliberately different than the
printer in `lib/Target/DirectX/DXILResource.cpp`, which attempts to
print bindings in a format compatible with the comments `dxc` prints.
We will eventually want to make that functionality driven by this
analysis pass, but it isn't sufficient for testing so we need both.
Pull Request: https://github.com/llvm/llvm-project/pull/100699
|
|
Broke this out into its own commit to make the next one easier to
review.
Pull Request: https://github.com/llvm/llvm-project/pull/100700
|
|
for explicit symbol visibility (#103900)
In multiple source files, function definitions never see their
declarations because the declaring header is never included. This causes
linker errors when explicit symbol visibility macros/dllexport are added
to the declarations.
Most of these were originally found by @tstellar in
https://github.com/llvm/llvm-project/pull/67502
TargetRegistry.h is needed in MCExternalSymbolizer.cpp for
createMCSymbolizer
Analysis/Passes.h is needed in LazyValueInfo.cpp and RegionInfo.cpp for
createLazyValueInfoPass and createRegionInfoPass
Transforms/Scalar.h is needed in SpeculativeExecution.cpp for
createSpeculativeExecutionPass
|
|
|
|
Fixes #103500
|
|
Split out from #98608.
|
|
This reverts commit 3cab7c555ad6451f2b1b4dc918a4b4f4e4a3e45d.
The modified test fails on ppc64le buildbots.
|
|
This is a reland of #96287. This change makes tests in logf128.ll ignore
the sign of NaNs for negative value tests and moves an #include <cmath>
to be blocked behind #ifndef _GLIBCXX_MATH_H.
|
|
Inside computeConstantDifference(), handle the case where both sides are
of the form `C * %x`, in which case we can strip off the common
multiplication (as long as we remember to multiply by it for the
following difference calculation).
There is an obvious alternative implementation here, which would be to
directly decompose multiplies inside the "Multiplicity" accumulation.
This does work, but I've found this to be both significantly slower
(because everything has to work on APInt) and more complex in
implementation (e.g. because we now need to match back the new More/Less
with an arbitrary factor) without providing more power in practice. As
such, I went for the simpler variant here.
This is the last step to make computeConstantDifference() sufficiently
powerful to replace existing uses of
`cast<SCEVConstant>(getMinusSCEV())` with it.
|
|
Previously the return types of __size_returning_new variants were not
validated based on their members. This patch checks the members
manually and also generalizes the size_t checks to be based on the module
instead of being hardcoded.
As requested in followup comment on
https://github.com/llvm/llvm-project/pull/101564.
|
|
NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
|
|
Split out from https://github.com/llvm/llvm-project/pull/80309.
|
|
A dominance query of a block that is in a different function is
ill-defined, so assert that getNode() is only called for blocks that are
in the same function.
There are three cases where this behavior did occur. LoopFuse didn't
explicitly do this, but didn't invalidate the SCEV block dispositions,
leaving dangling pointers to freed basic blocks behind, causing
use-after-free. We do, however, want to be able to dereference basic
blocks inside the dominator tree, so that we can refer to them by a
number stored inside the basic block.
Reverts #102780
Reland #101198
Fixes #102784
Co-authored-by: Alexis Engelke <engelke@in.tum.de>
|
|
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:12009:21:
error: loop variable '[S, Mul]' creates a copy from type 'const value_type' (aka 'const llvm::detail::DenseMapPair<const llvm::SCEV *, int>') [-Werror,-Wrange-loop-construct]
for (const auto [S, Mul] : Multiplicity) {
^
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:12009:10:
note: use reference type 'const value_type &' (aka 'const llvm::detail::DenseMapPair<const llvm::SCEV *, int> &') to prevent copying
for (const auto [S, Mul] : Multiplicity) {
^~~~~~~~~~~~~~~~~~~~~
&
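The fix the note suggests is to bind the structured binding by const reference. A standalone sketch, using `std::map` as a stand-in for the `DenseMap` in the diagnostic (the function name and key type are assumptions for illustration):

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for iterating the Multiplicity map from the
// diagnostic above; std::map replaces llvm::DenseMap here.
int sumMultiplicities(const std::map<std::string, int> &Multiplicity) {
  int Sum = 0;
  // Binding by const reference avoids copying each key/value pair, which
  // is what -Wrange-loop-construct complains about.
  for (const auto &[Name, Mul] : Multiplicity) {
    (void)Name; // the key is unused in this sketch
    Sum += Mul;
  }
  return Sum;
}
```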
|
|
computeConstantDifference() can currently look through addrecs with
identical steps, and then through adds with identical operands (apart
from constants).
However, it fails to handle minor variations, such as two nested add
recs, or an outer add with an inner addrec (rather than the other way
around).
This patch supports these cases by adding a loop over the
simplifications, limited to a small number of iterations. The motivation
is the same as in #101339, to make
computeConstantDifference() powerful enough to replace existing uses of
`dyn_cast<SCEVConstant>(getMinusSCEV())` with it. Though as the IR test
diff shows, other callers may also benefit.
|
|
The constructor initializes `*this` with `M->getDataLayout()`, which
is effectively the same as calling the copy constructor.
There does not seem to be a case where a copy would be necessary.
Pull Request: https://github.com/llvm/llvm-project/pull/102841
|
|
(#101404)
The canAssumeNoSelfWrap routine in howManyLessThans was doing two subtly
inter-related things. First, it was proving no-self-wrap. This exactly
duplicates the existing logic in the caller. Second, it was establishing
the precondition for the nw->nsw/nuw inference. Specifically, we need to
know that *this* exit must be taken for the inference to be sound.
Otherwise, another (possibly abnormal) exit could be taken in the
iteration where this IV would become poison.
This change moves all of that logic into the caller, and caches the
resulting nuw/nsw flags in the AddRec. This centralizes the logic in one
place, and makes it clear that it all depends on controlling the sole
exit.
We do lose a couple of cases with SCEV predication. Specifically, if SCEV
predication was able to convert e.g. zext(addrec) into an addrec(zext)
using predication, but didn't record the nuw fact on the new addrec,
then the consuming code can no longer fix this up. I don't think this
case particularly matters.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
Add DXIL Metadata Analysis passes (one for the legacy PM and one for the
new PM) that collect the following DXIL module metadata information in a
structure:
1. Shader Model version
2. DXIL version
3. Shader Stage
Information collected using the legacy pass is verified by adding
additional test commands to existing metadata test sources.
|
|
Split out from https://github.com/llvm/llvm-project/pull/80309.
|
|
The Mul factor was zero-extended here, resulting in incorrect
results for integers larger than 64-bit.
As we currently only multiply by 1 or -1, just split this into
two cases -- there's no need for a full multiplication here.
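The failure mode can be illustrated at a smaller scale. In this sketch the widths are scaled down as an assumption (a 32-bit factor widened to 64 bits, rather than a 64-bit factor widened to a larger integer), and both function names are hypothetical:

```cpp
#include <cstdint>

// Buggy shape: the factor -1 is zero-extended, so it becomes 0xFFFFFFFF
// in the wider type instead of an all-ones value.
int64_t scaleZeroExtended(int64_t Diff, int32_t Mul) {
  uint64_t Wide = static_cast<uint32_t>(Mul); // zero-extension
  return static_cast<int64_t>(static_cast<uint64_t>(Diff) * Wide);
}

// Fixed shape: since the factor is only ever 1 or -1, handle the two
// cases directly and skip the multiplication.
int64_t scaleByUnit(int64_t Diff, int32_t Mul) {
  return Mul == 1 ? Diff : -Diff; // caller guarantees Mul is 1 or -1
}
```

With `Diff = 5` and `Mul = -1`, the zero-extended variant yields a large positive number, while the two-case variant yields -5.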
Fixes https://github.com/llvm/llvm-project/issues/102597.
|
|
|
|
Reverts llvm/llvm-project#101198
Breaks multiple bots:
https://lab.llvm.org/buildbot/#/builders/72/builds/2103
https://lab.llvm.org/buildbot/#/builders/164/builds/1909
https://lab.llvm.org/buildbot/#/builders/66/builds/2706
|
|
|
|
A dominance query of a block that is in a different function is
ill-defined, so assert that getNode() is only called for blocks that are
in the same function.
There are two cases where this behavior did occur. LoopFuse didn't
explicitly do this, but didn't invalidate the SCEV block dispositions,
leaving dangling pointers to freed basic blocks behind, causing
use-after-free. We do, however, want to be able to dereference basic
blocks inside the dominator tree, so that we can refer to them by a
number stored inside the basic block.
|
|
Without this patch, the constructor arguments come from
SmallVectorImpl, not ArrayRef. This patch switches them to ArrayRef
so that we can construct SmallVector with a single argument.
Note that LLVM Programmer’s Manual prefers ArrayRef to SmallVectorImpl
for flexibility.
|
|
If nobuiltin is set, directly return nullptr instead of using a
separate out parameter and having all callers check this.
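The interface change can be sketched as follows. This is an illustration under stated assumptions, not the patched function: `FunctionInfo` and `getLibCallee` are hypothetical names, and a plain bool stands in for the nobuiltin attribute.

```cpp
// Hypothetical stand-in for a callee with a nobuiltin marker.
struct FunctionInfo {
  const char *Name;
  bool NoBuiltin;
};

// Returns the callee only if it may be treated as a library function.
// A nobuiltin callee yields nullptr directly, so callers need no extra
// out parameter to inspect.
const FunctionInfo *getLibCallee(const FunctionInfo *Callee) {
  if (!Callee || Callee->NoBuiltin)
    return nullptr;
  return Callee;
}
```

Callers then need a single null check instead of testing both the returned pointer and a separate flag.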
|