|
Local changes were getting pretty large and complex so this is an NFC
refactor PR to simplify the upcoming changes
|
|
The "preserve input debug-info format" flag allowed some tooling to opt
into not seeing the new debug records yet, and to not autoupgrade. This
was good at the time, but is unnecessary now that we'll be ditching
intrinsics shortly.
It also hides errors now: verify-uselistorder was hardcoding this flag
to on, and as a result it hasn't seen debug records before. Thus, we
missed a uselistorder variation: constant-expressions such as GEPs can
be contained within debug records and completely isolated from the value
hierarchy, see the metadata-use-uselistorder.ll test. These Values didn't
get ordered, but were legitimate uses of constants like "i64 0", and we
now run into difficulty handling that. The AsmWriter changes in this
patch now seek out Values to order even through debug-info.
Finally there are a few intrinsics-tests relying on this flag that we
can just delete, such as one in llvm-reduce and another few in the
LocalTest unit tests. For the fast-isel test, it was added in
https://reviews.llvm.org/D67703 explicitly for checking the size of
blocks without debug-info and in 1525abb9c94 the codepath it tests moved
towards being sunsetted. It'll be totally redundant once RemoveDIs is on
permanently.
Note that there's now no explicit test for the textual-IR autoupgrade
path. I submit that we can rely on the thousands of .ll files where
we've only been bothered to update the outputs, not the inputs, to debug
records.
|
|
Non-functional change as first step in
https://github.com/llvm/wg-hlsl/pull/207
Removes `Binding` from "Resource Instance" types
|
|
This was added in https://reviews.llvm.org/D26389 to help with extremely
deep SCEV expressions.
However, this is wrong, since we may cache sub-SCEVs as equivalent when
CompareValueComplexity returned 0 only because it hit the maximum
comparison depth.
This also improves compile time in some cases:
https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u
Similar to #100721.
Fixes #130688.
|
|
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual IR could be made
independent of how the debug-info was represented inside a module, and
whether the autoupgrader ran could be customised. This was all valuable
during development, but now that the complete removal of debug
intrinsics is coming up, this patch removes those options in favour of a
single flag (experimental-debuginfo-iterators), which enables
autoupgrade, in-memory debug records, and debug-record printing to
bitcode and textual IR.
We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.
There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Take
print-non-instruction-debug-info.ll: the test suite now checks for
debug records in all tests, and we don't want to check that we can still
print as intrinsics. Or the update_test_checks tests -- these are
duplicated with write-experimental-debuginfo=false to ensure that file
writing for intrinsics is correct, but that's something we're imminently
going to delete.
A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a zero
cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing intrinsics-in-memory get
salvaged, isn't necessary now
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values; dbg-records take the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent codegen-change issues that
get fixed by RemoveDIs.
Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
|
|
- extract KnownFPClass for future use inside of GISelKnownBits
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
EphemeralValuesCache class (#132454)"
This reverts commit 4adefcfb856aa304b7b0a9de1eec1814f3820e83.
|
|
(#132454)
This is a follow-up to https://github.com/llvm/llvm-project/pull/130210.
The EphemeralValuesAnalysis pass used to return an EphemeralValuesCache
object, which held the ephemeral values, collected them lazily, and
provided invalidation via the `clear()` function.
This patch removes the EphemeralValuesCache class completely and instead
returns the SmallVector containing the ephemeral values.
|
|
the CallAnalyzer (#130210)
This patch does two things:
1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.
2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching, this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralValues()`.
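As an illustration of the lazy-collection idea, here is a simplified sketch (not the upstream class; it only assumes the existing `CodeMetrics::collectEphemeralValues` helper):
```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/CodeMetrics.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Simplified sketch of the caching idea: collect the ephemeral values of F
// once, on first request, and reuse the result across all of the
// CallAnalyzer's queries.
class EphemeralValuesCacheSketch {
  Function &F;
  AssumptionCache &AC;
  SmallPtrSet<const Value *, 32> EphValues;
  bool Collected = false;

public:
  EphemeralValuesCacheSketch(Function &F, AssumptionCache &AC) : F(F), AC(AC) {}

  const SmallPtrSetImpl<const Value *> &ephValues() {
    if (!Collected) {
      CodeMetrics::collectEphemeralValues(&F, &AC, EphValues);
      Collected = true;
    }
    return EphValues;
  }

  // Invalidate when F changes, so the next query re-collects.
  void clear() {
    EphValues.clear();
    Collected = false;
  }
};
```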
|
|
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
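As a rough sketch of what the explicit conversions look like at a call site, assuming the post-change Module API (the triple string is just an example):
```
#include "llvm/IR/Module.h"
#include "llvm/TargetParser/Triple.h"
using namespace llvm;

void example(Module &M) {
  // Setting the triple now takes a Triple, not a raw string.
  M.setTargetTriple(Triple("x86_64-unknown-linux-gnu"));

  // Querying returns the already-parsed Triple, so no re-parsing is needed.
  const Triple &T = M.getTargetTriple();
  if (T.isOSLinux()) {
    // Convert back explicitly where a string is still required.
    const std::string &TripleStr = T.str();
    (void)TripleStr;
  }
}
```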
|
|
(#125880) (#128020)
Relative to the previous attempt this includes two fixes:
* Adjust callCapturesBefore() to not skip captures(ret: address,
provenance) arguments, as these will not count as a capture
at the call-site.
* When visiting uses during stack slot optimization, don't skip
the ModRef check for passthru captures. Calls can both modref
and be passthru for captures.
------
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.
The key API changes here are:
* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and can then decide what to do with it by
returning an Action (see the sketch below), which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.
For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.
This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
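As a rough illustration of the new extension point, here is a hedged sketch of a tracker consuming a UseCaptureInfo; the hook name, field names, and helpers are assumed from the description above rather than copied from the upstream headers:
```
#include "llvm/Analysis/CaptureTracking.h"
using namespace llvm;

// Hedged sketch only: the hook signature and helper spellings are assumptions
// based on the commit description, not the exact upstream API.
struct AddressEscapeTracker : public CaptureTracker {
  bool Escaped = false;

  void tooManyUses() override { Escaped = true; }

  Action captured(const Use *U, UseCaptureInfo CI) override {
    // UseCC: components captured by this use itself.
    if (capturesAnything(CI.UseCC)) {
      Escaped = true;
      return Stop; // no need to look any further
    }
    // ResultCC: components that may be captured through the user's return
    // value; only follow the result if there is anything left to capture.
    if (capturesAnything(CI.ResultCC))
      return Continue;
    return ContinueIgnoringReturn;
  }
};
```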
|
|
The implementation doesn't use it, and is unlikely to use it in
the future.
The places that do set StoreCaptures=false do so incorrectly and
would be broken if the parameter actually did anything.
|
|
(#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
|
|
spurious ref edges between new functions." (#127285)
Reverts llvm/llvm-project#116285
Breaks expensive checks build, e.g.
https://lab.llvm.org/buildbot/#/builders/16/builds/13821
|
|
ref edges between new functions. (#116285)
The addSplitRefRecursiveFunctions LazyCallGraph helper should not
require a reference between every new function. Spurious ref edges
between the new functions are allowed, and the new functions are
considered to be a RefSCC. This change clarifies that this is the case
in the method's description and its DEBUG mode verifier.
|
|
Relative to the previous attempt, this adjusts isEscapeSource()
to not treat calls with captures(ret: address, provenance) or similar
arguments as escape sources. This addresses the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
The implementation uses a helper function on CallBase to make this
check a bit more efficient (e.g. by skipping the byval checks) as
checking attributes on all arguments is fairly expensive.
------
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.
The key API changes here are:
* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.
For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.
This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
|
|
This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a.
A miscompilation has been reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
|
|
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.
The key API changes here are:
* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.
For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.
This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
|
|
For GEPs, we have three bit widths involved: The pointer bit width, the
index bit width, and the bit width of the GEP operands.
The correct behavior here is:
* We need to sextOrTrunc the GEP operand to the index width *before*
multiplying by the scale.
* If the index width and pointer width differ, GEP only ever modifies
the low bits. Adds should not overflow into the high bits.
I'm testing this via unit tests because it's a bit tricky to test in IR
with InstCombine canonicalization getting in the way.
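A toy APInt example (made-up widths and values, not the actual ValueTracking code) shows why the operand must be extended before scaling:
```
#include "llvm/ADT/APInt.h"
using namespace llvm;

void demo() {
  // Index width 16, an i8 GEP operand of 100, element size (scale) 4.
  APInt Idx(/*numBits=*/8, 100);

  // Correct order: extend to the index width first, then scale: 400 (0x0190).
  APInt Good = Idx.sextOrTrunc(16) * APInt(16, 4);

  // Wrong order: scaling in the narrow type wraps (100 * 4 = 144 mod 256),
  // and sign-extending afterwards yields 0xFF90 (-112) instead of 400.
  APInt Bad = (Idx * APInt(8, 4)).sextOrTrunc(16);

  (void)Good;
  (void)Bad;
}
```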
|
|
These demonstrate miscompiles in the existing code.
|
|
We can take advantage of the fact that we subsequently only clone cold
allocation contexts, since notcold behavior is the default, and
significantly reduce the amount of metadata (and later ThinLTO summary
and MemProfContextDisambiguation graph nodes) by pruning unnecessary not
cold contexts when building metadata from the trie.
Specifically, we only need to keep notcold contexts that overlap the
longest with cold allocations, to know how deeply to clone those
contexts to expose the cold allocation behavior.
For a large target this reduced ThinLTO bitcode object sizes by about
35%. It reduced the ThinLTO indexing time by about half and the peak
ThinLTO indexing memory by about 20%.
|
|
While we convert hot contexts to notcold contexts during the cloning
step, their existence was greatly limiting the context trimming
performed when we add the MemProf profile to the IR. To address this,
any hot contexts are converted to notcold contexts immediately after
first checking for unambiguous allocation types, and before checking it
again and before adding metadata while performing context trimming.
Note that hot hints are now disabled by default, however, this avoids
adding unnecessary overhead if they are re-enabled.
|
|
By default we were marking some contexts as hot, and adding hot hints to
unambiguously hot allocations. However, there is not yet support for
cloning to expose hot allocation contexts, and none is planned for the
foreseeable future.
While we convert hot contexts to notcold contexts during the cloning
step, their existence was greatly limiting the context trimming
performed when we add the MemProf profile to the IR. This change simply
disables the generation of hot contexts / hints by default, as few
allocations were unambiguously hot.
A subsequent change will address the issue when hot hints are optionally
enabled. See PR124219 for details.
This change resulted in significant overhead reductions for a large
target:
~48% reduction in the per-module ThinLTO bitcode summary sizes
~72% reduction in the distributed ThinLTO bitcode combined summary sizes
~68% reduction in thin link time
~34% reduction in thin link peak memory
|
|
(#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago, but to ensure some type safety we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
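A minimal sketch of the call-site pattern, assuming the iterator-taking insertBefore overload (BB and NewInst are placeholders):
```
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

void insertAfterPHIs(BasicBlock &BB, Instruction *NewInst) {
  // Old pattern: Instruction *InsertPt = BB.getFirstNonPHI();
  //              NewInst->insertBefore(InsertPt);
  // New pattern: keep the iterator, which also carries the debug-info bit
  // needed to place NewInst correctly relative to debug records.
  BasicBlock::iterator InsertPt = BB.getFirstNonPHIIt();
  NewInst->insertBefore(InsertPt);
}
```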
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
|
|
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago, but to ensure some type safety we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
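A minimal sketch of the call-site change (I and InsertPos are placeholders):
```
#include "llvm/IR/Instruction.h"
using namespace llvm;

void hoist(Instruction *I, Instruction *InsertPos) {
  // Old pattern: I->moveBefore(InsertPos);
  // New pattern: pass a (guaranteed dereferenceable) iterator so the
  // position's debug-info bit travels with the instruction.
  I->moveBefore(InsertPos->getIterator());
}
```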
|
|
Return `poison` for zero-sized types in `isBitwiseValue`.
|
|
We need to create symbols with "the original shape of resource and
element type" to put in the resource metadata in order to generate valid
DXIL.
Note that DXC generally doesn't emit an actual symbol outside of library
shaders (it emits an undef of a pointer to the type), but since we have
to deal with opaque pointers we would need a way to smuggle the type
through to match that. Instead, we simply emit symbols for now.
Fixes #116849
|
|
This splits the DXILResourceAnalysis pass into TypeAnalysis and
BindingAnalysis passes. The type analysis pass is made immutable and
populated lazily so that it can be used earlier in the pipeline without
needing to carefully maintain the invariants of the binding analysis.
Fixes #118400
|
|
|
|
This patch introduces the LLVM components of a type sanitizer: a
sanitizer for type-based aliasing violations.
It is based on Hal Finkel's https://reviews.llvm.org/D32198.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the bytes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
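A hedged sketch of that fast-path logic in plain C++ (the real pass emits equivalent IR; shadowFor, callRuntime, TypeDesc, and AccessSize are placeholder names):
```
#include <cstdint>

// Placeholder declarations; the real instrumentation emits IR for this.
uint64_t *shadowFor(void *Addr);
void callRuntime();

void checkAccess(void *Addr, uint64_t TypeDesc, unsigned AccessSize) {
  // One 8-byte shadow slot per byte of application memory.
  uint64_t *Shadow = shadowFor(Addr);
  if (Shadow[0] == TypeDesc) {
    // Exact match: every remaining byte must be marked interior (-1).
    for (unsigned I = 1; I < AccessSize; ++I)
      if (Shadow[I] != (uint64_t)-1)
        return callRuntime();
  } else if (Shadow[0] == 0) {
    // Unknown type: the remaining bytes must also be unknown (0) ...
    for (unsigned I = 1; I < AccessSize; ++I)
      if (Shadow[I] != 0)
        return callRuntime();
    // ... then record the type and mark the rest as interior bytes.
    Shadow[0] = TypeDesc;
    for (unsigned I = 1; I < AccessSize; ++I)
      Shadow[I] = (uint64_t)-1;
  } else {
    // Neither an exact match nor unknown: defer to the runtime.
    callRuntime();
  }
}
```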
The instrumentation pass inserts calls to the memset intrinsic to set
the memory updated by memset, memcpy, and memmove, as well as
allocas/byval (and for lifetime.start/end) to reset the shadow memory to
reflect that the type is now unknown. The runtime intercepts memset,
memcpy, etc. to perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.
Clang's TBAA representation currently has a problem representing unions,
as demonstrated by the one XFAIL'd test in the runtime patch. We'll
update the TBAA representation to fix this, and at the same time, update
the sanitizer.
When the sanitizer is active, we disable actually using the TBAA
metadata for AA. This way we're less likely to use TBAA to remove memory
accesses that we'd like to verify.
As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.
It goes together with the corresponding clang changes
(https://github.com/llvm/llvm-project/pull/76260) and compiler-rt
changes (https://github.com/llvm/llvm-project/pull/76261)
PR: https://github.com/llvm/llvm-project/pull/76259
|
|
Instead of storing an auxiliary structure with the information from the
DXIL resource target extension types duplicated, access the information
that we can via the type itself.
This also means we need to handle some of the target extension types we
haven't fully defined yet, like Texture and CBuffer. For now we make an
educated guess as to what those should look like based on llvm/wg-hlsl#76,
and we can update them fairly easily when we've defined them more
thoroughly.
First part of #118400
|
|
(#119547)
This relands commit #115111.
Use the traditional way to update the post-dominator tree, i.e. break
critical edge splitting into an insert, insert, delete sequence.
When splitting critical edges, the post-dominator tree may change its
root node, and `setNewRoot` only works on a normal dominator tree...
See
https://github.com/llvm/llvm-project/blob/6c7e5827eda26990e872eb7c3f0d7866ee3c3171/llvm/include/llvm/Support/GenericDomTree.h#L684-L687
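A hedged sketch of that insert/insert/delete sequence for splitting a critical edge BB -> Succ with a new block NewBB (names are placeholders):
```
#include "llvm/Analysis/DomTreeUpdater.h"
#include "llvm/Analysis/PostDominators.h"
#include "llvm/IR/Dominators.h"
using namespace llvm;

void updateForSplitEdge(DominatorTree &DT, PostDominatorTree &PDT,
                        BasicBlock *BB, BasicBlock *Succ, BasicBlock *NewBB) {
  DomTreeUpdater DTU(DT, PDT, DomTreeUpdater::UpdateStrategy::Eager);
  // Express the critical-edge split as plain edge updates rather than a
  // dedicated "split edge" operation, so a post-dominator tree whose root
  // changes is handled by the generic update machinery.
  DTU.applyUpdates({{DominatorTree::Insert, BB, NewBB},
                    {DominatorTree::Insert, NewBB, Succ},
                    {DominatorTree::Delete, BB, Succ}});
}
```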
|
|
Reverts llvm/llvm-project#115111 Causes #119511
|
|
Support critical edge splitting in dominator tree updater. Continue the
work in #100856.
Compile time check:
https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au
|
|
(NFC). (#116715)
This commit updates the documentation for `PluginInlineAdvisorAnalysis`
based on the feedback in PR#114615 suggesting that
`registerAnalysisRegistrationCallback` should be the preferred method to
register the plugin inline advisor analysis.
|
|
(#114615)
The plugin analysis for `InlineAdvisor` and `InlineOrder` currently
relies on shared global state to keep track of whether the analysis is
available.
This causes issues when pipelines using plugins and pipelines not using
plugins are run in the same process.
The shared global state can be easily replaced by checking in the given
instance of `ModuleAnalysisManager` if the plugin analysis has been
registered.
|
|
[NFC] (#116453)
This PR replaces all `UndefValue`s that act as placeholders with `PoisonValue`
in `llvm/unittests`.
|
|
reallocarray has been available in glibc under _DEFAULT_SOURCE since
2.29, and under _GNU_SOURCE before that; let's model it appropriately.
|
|
[NFC] (#115985)
Since these `UndefValue::get` calls act as placeholders, I think it's
safe to replace them with poison values.
There are a lot of `UndefValue::get` calls in LLVM; I'll start fixing the
ones in `unittests` while fixing the regression tests.
|
|
See the following case:
```
@GlobIntONE = global i32 0, align 4
define ptr @src() {
entry:
br label %for.body.peel.begin
for.body.peel.begin: ; preds = %entry
br label %for.body.peel
for.body.peel: ; preds = %for.body.peel.begin
br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel
cleanup.loopexit.peel: ; preds = %for.body.peel
br label %cleanup.peel
cleanup.peel: ; preds = %cleanup.loopexit.peel, %for.body.peel
%retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
br i1 true, label %for.body.peel.next, label %cleanup7
for.body.peel.next: ; preds = %cleanup.peel
br label %for.body.peel.next1
for.body.peel.next1: ; preds = %for.body.peel.next
br label %entry.peel.newph
entry.peel.newph: ; preds = %for.body.peel.next1
br label %for.body
for.body: ; preds = %cleanup, %entry.peel.newph
%retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
br i1 false, label %cleanup, label %cleanup.loopexit
cleanup.loopexit: ; preds = %for.body
br label %cleanup
cleanup: ; preds = %cleanup.loopexit, %for.body
%retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
br i1 false, label %for.body, label %cleanup7.loopexit
cleanup7.loopexit: ; preds = %cleanup
%retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
br label %cleanup7
cleanup7: ; preds = %cleanup7.loopexit, %cleanup.peel
%retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
ret ptr %retval.2.lcssa
}
define ptr @tgt() {
entry:
br label %for.body.peel.begin
for.body.peel.begin: ; preds = %entry
br label %for.body.peel
for.body.peel: ; preds = %for.body.peel.begin
br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel
cleanup.loopexit.peel: ; preds = %for.body.peel
br label %cleanup.peel
cleanup.peel: ; preds = %cleanup.loopexit.peel, %for.body.peel
%retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
br i1 true, label %for.body.peel.next, label %cleanup7
for.body.peel.next: ; preds = %cleanup.peel
br label %for.body.peel.next1
for.body.peel.next1: ; preds = %for.body.peel.next
br label %entry.peel.newph
entry.peel.newph: ; preds = %for.body.peel.next1
br label %for.body
for.body: ; preds = %cleanup, %entry.peel.newph
br i1 false, label %cleanup, label %cleanup.loopexit
cleanup.loopexit: ; preds = %for.body
br label %cleanup
cleanup: ; preds = %cleanup.loopexit, %for.body
br i1 false, label %for.body, label %cleanup7.loopexit
cleanup7.loopexit: ; preds = %cleanup
%retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
br label %cleanup7
cleanup7: ; preds = %cleanup7.loopexit, %cleanup.peel
%retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`, which is incorrect.
This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes https://github.com/llvm/llvm-project/issues/114879.
|
|
This patch fixes:
third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
error: comparison of integers of different signs: 'const unsigned
int' and 'const int' [-Werror,-Wsign-compare]
|
|
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.
RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467
Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
|
|
StructType::setBody is the only mechanism that can potentially create
recursion in the type system. Add a runtime check that it is not
actually used to create recursion.
If the check fails, report an error from LLParser, BitcodeReader and
IRLinker. In all other cases assert that the check succeeds.
In the future, StructType::setBody will be removed in favor of specifying the
body when the type is created, so any performance hit from this runtime
check will be temporary.
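For illustration, a hypothetical misuse of the kind the new check is meant to catch:
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/LLVMContext.h"
using namespace llvm;

void recursiveBodyExample() {
  LLVMContext Ctx;
  // Create an identified struct, then give it a body that contains the
  // struct itself by value. setBody is the only way such recursion can be
  // formed, and it is what the new runtime check rejects (LLParser,
  // BitcodeReader and IRLinker report an error; other callers assert).
  StructType *Node = StructType::create(Ctx, "node");
  Node->setBody({Node}); // rejected: %node = type { %node }
}
```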
|
|
This patch adds basic support for `hypot`. Constant folding support will
be submitted in a subsequent patch.
Related issue: https://github.com/llvm/llvm-project/issues/113711
Note: It's my first time contributing to LLVM, with encouragement
from one of my friends, @fawdlstty. I learned a lot from
https://github.com/llvm/llvm-project/pull/99611, and thanks for that.
Note: I had created the same PR before and it was merged
(https://github.com/llvm/llvm-project/pull/113724), but it was reverted
due to a merge issue. (The CI issue happened at 3 A.M. in my timezone, so
I had to fall back asleep after replying about why the issue happened.)
So I rebased onto the latest main branch and recreated the PR, and I hope
I won't have to create the same PR a third time.
I hope @arsenm can help me review the code again. I'm sorry for that.
Kenji Mouri
|
|
Reverts llvm/llvm-project#113724
|
|
This patch adds basic support for `hypot`. Constant folding support will
be submitted in a subsequent patch.
Related issue: https://github.com/llvm/llvm-project/issues/113711
Note: It's my first time contributing to LLVM, with encouragement
from one of my friends, @fawdlstty. I learned a lot from
https://github.com/llvm/llvm-project/pull/99611, and thanks for that.
Kenji Mouri
|
|
This patch adds the `tgamma` libcall.
|
|
Enable building InlineAdvisorPlugin and InlineOrderPlugin on windows for
shared library builds.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and LLVM
plugins on Windows.
|
|
This patch adds basic support for `scalbln, scalblnf, scalblnl, scalbn,
scalbnf, scalbnl`. Constant folding support will be submitted in a
subsequent patch.
Related issue: <#112631>
|
|
This patch adds the `ilogb` libcall. Constant folding will be handled in
subsequent patches.
|