Age | Commit message (Collapse) | Author | Files | Lines |
|
Make HashRecognize a non-PassManager analysis that can be called to get
the result on-demand, creating a new getResult() entry-point. The issue
was discovered when attempting to use the analysis to perform a
transform in LoopIdiomRecognize.
|
|
This pass figures out whether inlining has exposed a constant address to
a lowered type test, and remove the test if so and the address is known
to pass the test. Unfortunately this pass ends up needing to reverse
engineer what LowerTypeTests did; this is currently inherent to the design
of ThinLTO importing where LowerTypeTests needs to run at the start.
Reviewers: teresajohnson
Reviewed By: teresajohnson
Pull Request: https://github.com/llvm/llvm-project/pull/141327
|
|
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics etc) I figured it would be good to separate these two implementations.
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
Currently if there's any memory access that AccessAnalysis couldn't
analyze then all of the runtime pointer check results are discarded.
This patch makes this able to be controlled with the AllowPartial
option, which makes it so we generate the runtime check information
for those pointers that we could analyze, as transformations may still
be able to make use of the partial information.
Of the transformations that use LoopAccessAnalysis, only
LoopVersioningLICM changes behaviour as a result of this change. This is
because the others either:
* Check canVectorizeMemory, which will return false when we have partial
pointer information as analyzeLoop() will return false.
* Examine the dependencies returned by getDepChecker(), which will be
empty as we exit analyzeLoop if we have partial pointer information
before calling areDepsSafe(), which is what fills in the dependency
information.
|
|
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (eg. X86 and RISC-V crypto extension). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized
in the absence of target-specific instructions to use a lookup table.
At the moment, we only recognize the CRC algorithm.
RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
|
|
- No need to prefix `PointerType` with `llvm::`.
- Avoid namespace block to define `PrintPipelinePasses`.
|
|
|
|
|
|
This PR introduces IR2Vec as an analysis pass. The changes include:
- Logic for generating Symbolic encodings.
- 75D learned vocabulary.
- lit tests.
Here is the link to the RFC -
https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings
Acknowledgements: contributors -
https://github.com/IITH-Compilers/IR2Vec/graphs/contributors
---------
Co-authored-by: svkeerthy <venkatakeerthy@google.com>
Co-authored-by: Mircea Trofin <mtrofin@google.com>
|
|
(#131005)
When we enable EVL-based loop vectorization w/ predicated tail-folding,
each vectorized loop has effectively two induction variables: one
calculates the step using (VF x vscale) and the other one increases the
IV by values returned from experiment.get.vector.length. The former,
also known as canonical IV, is more favorable for analyses as it's
"countable" in the sense of SCEV; the latter (EVL-based IV), however, is
more favorable to codegen, at least for those that support scalable
vectors like AArch64 SVE and RISC-V.
The idea is that we use canonical IV all the way until the end of all
vectorizers, where we replace it with EVL-based IV using EVLIVSimplify
introduced here. Such that we can have the best from both worlds.
This Pass is enabled by default in RISC-V. However, since we haven't
really vectorize loops with predicate tail-folding by default, this Pass
is no-op at this moment.
|
|
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.
The first 2 functions from the AArch64GISelMITest are ported over to an
mir test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test GlobalISel known bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
requires.
|
|
`DXILResourceBindingAnalysis` analyses explicit resource bindings in the
module and puts together lists of used virtual register spaces and
available virtual register slot ranges for each binding type. It also
stores additional information found during the analysis such as whether
the module uses implicit bindings or if any of the bindings overlap.
This information will be used in `DXILResourceImplicitBindings` pass
(coming soon) to assign register slots to resources with implicit
bindings, and in a post-optimization validation pass that will raise
diagnostic about overlapping bindings.
Part 1/2 of #136786
|
|
In this change, NVPTX AA is moved before Basic AA to potentially improve
compile time. Additionally, it introduces a flag in the
`ExternalAAWrapper` that allows other backends to run their
target-specific AA passes before Basic AA, if desired.
The change works for both New Pass Manager and Legacy Pass Manager.
Original implementation by Princeton Ferro <pferro@nvidia.com>
|
|
/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1508:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
1508 | };
| ^
1 error generated.
|
|
|
|
|
|
- Add new pass manager version of `MachineUniformityAnalysis `.
- Query `TargetTransformInfo` in new pass manager version.
- Use `printAsOperand` when printing machine function name
|
|
|
|
|
|
|
|
Didn't find a test for this (but there are tests for the `Function`
version of this pass)
|
|
|
|
This completes the PreEmitPasses.
|
|
If a hot callsite function is not inlined in the 1st build, inlining the
hot callsite in pre-link stage of SPGO 2nd build may lead to Function
Sample not found in profile file in link stage. It will miss some
profile info.
ThinLTO has already considered and dealed with it by setting
HotCallSiteThreshold to 0 to stop the inline. This patch just adds the
same processing for FullLTO.
|
|
|
|
When `llvm-extract`-ing a function, and the `CG Profile` flag is present in the original module, we end up with lots of `!{null, null, i64 1234}` entries for call edges that have disappeared as result of the removed functions.
This patch fixes that by adding a pass to `llvm-extract` that finds `CG Profile` edges with one or both operands `null` and removes them. This results in a cleaner output.
|
|
|
|
Flatten the profile pre-thinlink so that ThinLTO has something to work with for the parts of the binary that aren't covered by contextual profiles. Post-thinlink, the flattener is re-run and will actually change profile info, but just for the modules containing contextual trees ("specialized modules"). For the rest, the flattener just yanks out the instrumentation.
|
|
When coroutines are used w/ both -ffat-lto-objects and -flto=thin,
the coroutine passes are not added to the optimization pipelines.
Ensure they are added before ModuleOptimization to generate a
working ELF object.
Fixes #134409.
|
|
Non-functional change as first step in
https://github.com/llvm/wg-hlsl/pull/207
Removes `Binding` from "Resource Instance" types
|
|
Not sure why the "fold-all" option naming didn't match the
variable "FoldPreOutputs", but I've preserved the difference.
More annoyingly, the pass name "normalize" does not match the pass
name IRNormalizer and should probably be fixed one way or the other.
Also the existing test coverage for the flags is lacking. I've added
a test that shows they parse, but we should have tests that they
do something.
|
|
|
|
|
|
|
|
the CallAnalyzer (#130210)
This patch does two things:
1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.
2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralvalues()`.
|
|
For the NewPM, the merge-const option was assigned to an unused
option field. Assign it to the correct one. The merge-const-aggressive
option was not supported -- and invalid options were silently ignored.
Accept it and error on invalid options.
For the LegacyPM, the corresponding cl::opt options were ignored when
called via opt rather than llc.
|
|
|
|
-print-before-pass-number (#130983)
|
|
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
|
|
|
|
|
|
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.
This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
|
|
|
|
|
|
|
|
Reverts llvm/llvm-project#128654
Breaks FatLTO
https://github.com/llvm/llvm-project/pull/128654#issuecomment-2700053700
|
|
|
|
|
|
|