Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
the CallAnalyzer (#130210)
This patch does two things:
1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.
2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralvalues()`.
|
|
For the NewPM, the merge-const option was assigned to an unused
option field. Assign it to the correct one. The merge-const-aggressive
option was not supported -- and invalid options were silently ignored.
Accept it and error on invalid options.
For the LegacyPM, the corresponding cl::opt options were ignored when
called via opt rather than llc.
|
|
|
|
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
|
|
|
|
|
|
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.
This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Targets can set the EnableSinkAndFold option in CGPassBuilderOptions for
the NPM pipeline in buildCodeGenPipeline(... &Opts, ...)
|
|
|
|
Use `-passes="regallocgreedy<[all|sgpr|wwm|vgpr]>` to insert the greedy
RA with a filter and `-regalloc-npm=<type>` to control which RA to use
in existing pipeline.
|
|
Leaving out NPM command line support for the next patch.
|
|
There are no standalone tests for this pass for backends implementing
the NPM yet.
|
|
Making all reg alloc classes have an `::Option` class makes things nicer
to construct them.
|
|
Similar to #117309.
The advisor and logger are accessed through the provider, which is
served by the new PM. Legacy PM forwards calls to the provider.
New PM is a machine function analysis that lazily initializes the
provider.
|
|
Legacy pass used to provide the advisor, so this extracts that logic
into a provider class used by both analysis passes.
All three (Default, Release, Development) legacy passes
`*AdvisorAnalysis` are basically renamed to `*AdvisorProvider`, so the
actual legacy wrapper passes are `*AdvisorAnalysisLegacy`.
There is only one NPM analysis `RegAllocEvictionAnalysis` that switches
between the three providers in the `::run` method, to be cached by the
NPM.
Also adds `RequireAnalysis<RegAllocEvictionAnalysis>` to the optimized
target reg alloc codegen builder.
|
|
When using FatLTO, it is common to want to enable certain types of whole
program optimizations (WPD) or security transforms (CFI), so that they
can be made available when performing LTO. However, these transforms
should not be used when compiling the non-LTO object code. Since the
frontend must emit different IR, we cannot simply clone the module and
optimize the LTO section and non-LTO section differently to work around
this. Instead, we need to remove any problematic instruction sequences.
This patch adds a new pass whose responsibility is to clean up the IR
in the FatLTO pipeline after creating the bitcode section, which is
after running the pre-link pipeline but before running module
optimization. This allows us to safely drop any conflicting instructions
or IR constructs that are inappropriate for non-LTO compilation.
|
|
`RegisterClassInfo` was supposed to be kept alive between pass runs,
which wasn't being done leading to recomputations increasing the compile
time.
Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.
---------
Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
|
|
This reverts commit 5aa4979c47255770cac7b557f3e4a980d0131d69 while I
investigate what's causing the compile-time regression.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This patch implements an LLVM IR pass, named kernel-info, that reports
various statistics for codes compiled for GPUs. The ultimate goal of
these statistics to help identify bad code patterns and ways to mitigate
them. The pass operates at the LLVM IR level so that it can, in theory,
support any LLVM-based compiler for programming languages supporting
GPUs. It has been tested so far with LLVM IR generated by Clang for
OpenMP offload codes targeting NVIDIA GPUs and AMD GPUs.
By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks. Example `opt` and `clang`
command lines appear in `llvm/docs/KernelInfo.rst`. Remarks include
summary statistics (e.g., total size of static allocas) and individual
occurrences (e.g., source location of each alloca). Examples of its
output appear in tests in `llvm/test/Analysis/KernelInfo`.
|
|
LowerAllowCheckPass (#124211)
This adds and utilizes a cutoffs parameter for LowerAllowCheckPass, via the Options parameter (introduced in https://github.com/llvm/llvm-project/pull/122994).
Future work will connect -fsanitize-skip-hot-cutoff (introduced patch in https://github.com/llvm/llvm-project/pull/121619) in the clang frontend to the cutoffs parameter used here.
|
|
(#122833) (#122994)
This reverts commit 1515caf7a59dc20cb932b724b2ef5c1d1a593427
(https://github.com/llvm/llvm-project/pull/122833) i.e., relands
7d8b4eb0ead277f41ff69525ed807f9f6e227f37
(https://github.com/llvm/llvm-project/pull/122765), with LowerAllowCheckPass::Options moved inside the callback to fix a stack use-after-scope error.
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
|
|
|
|
Original commit: eb63cd62a4a1907dbd58f12660efd8244e7d81e9
Previously reverted due to non-negligible compile-time impact in
stage1-ReleaseLTO-g scenario. The issue has been addressed by
always reusing previously computed MemorySSA results, and request
new ones only when `isMemorySSAEnabled` is set.
Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>
|
|
(#122833)
Reverts llvm/llvm-project#122765
Reason: buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/46/builds/10393)
```
z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe -cc1 -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include -nostdsysteminc -triple x86_64-pc-linux-gnu -emit-llvm -o - Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\CodeGen\allow-ubsan-check-inline.c -fsanitize=signed-integer-overflow -mllvm -ubsan-guard-checks -O3 -mllvm -lower-allow-check-random-rate=1 -Rpass=lower-allow-check -Rpass-missed=lower-allow-check -fno-inline 2>&1 | z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\CodeGen\allow-ubsan-check-inline.c --check-prefixes=NOINL --implicit-check-not="remark:"
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe' -cc1 -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include' -nostdsysteminc -triple x86_64-pc-linux-gnu -emit-llvm -o - 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\CodeGen\allow-ubsan-check-inline.c' -fsanitize=signed-integer-overflow -mllvm -ubsan-guard-checks -O3 -mllvm -lower-allow-check-random-rate=1 -Rpass=lower-allow-check -Rpass-missed=lower-allow-check -fno-inline
# note: command had no output on stdout or stderr
# error: command failed with exit status: 0xc0000409
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe' 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\CodeGen\allow-ubsan-check-inline.c' --check-prefixes=NOINL --implicit-check-not=remark:
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line: z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\CodeGen\allow-ubsan-check-inline.c --check-prefixes=NOINL --implicit-check-not=remark:
# `-----------------------------
# error: command failed with exit status: 2
```
|
|
This is glue code to convert LowerAllowCheckPass from a FUNCTION_PASS to
FUNCTION_PASS_WITH_PARAMS. The parameters are currently unused.
Future work will plumb `-fsanitize-skip-hot-cutoff` (introduced in
https://github.com/llvm/llvm-project/pull/121619) to
LowerAllowCheckOptions.
|
|
And use that as an argument for
allow_ubsan_check when needed.
Other ubsan checks use SanitizerKind,
but those are known to the clang only.
So make it a parameter in LLVM.
|
|
This reverts commit eb63cd62a4a1907dbd58f12660efd8244e7d81e9.
This changes the preservation behavior for MSSA when the new flag
is not enabled.
|
|
Preparatory work to migrate from MemoryDependenceAnalysis
towards MemorySSA in GVN.
Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
|
|
|
|
Remove ReportingMode and ReportingOpts.
|
|
Required for `{start|stop}-{after-before}` cli
|
|
-fno-sanitize-merge=local-bounds) (#120682)
#120613 removed -ubsan-unique-traps and replaced it with
-fno-sanitize-merge (introduced in #120511), which allows fine-grained
control of which UBSan checks to prevent merging. This analogous patch
removes -bound-checking-unique-traps, and allows it to be controlled via
-fno-sanitize-merge=local-bounds.
Most of this patch is simply plumbing through the compiler flags into
the bounds checking pass.
Note: this patch subtly changes -fsanitize-merge (the default) to also
include -fsanitize-merge=local-bounds. This is different from the
previous behavior, where -fsanitize-merge (or the old
-ubsan-unique-traps) did not affect local-bounds (requiring the separate
-bounds-checking-unique-traps). However, we argue that the new behavior
is more intuitive.
Removing -bounds-checking-unique-traps and merging its functionality
into -fsanitize-merge breaks backwards compatibility; we hope that this
is acceptable since '-mllvm -bounds-checking-unique-traps' was an
experimental flag.
|
|
This was introduced ~5yrs ago (by me), and has never really gotten
any adoption. By now, it's significantly out of sync with new/changed
poison propoagation rules. The idea is still reasonable, but the
imagined use case is largely covered by alive2 these days anyways.
|
|
This check is a part of UBSAN, but does not support
verbose output like other UBSAN checks.
This is a step to fix that.
|
|
This patch introduces the LLVM components of a type sanitizer: a
sanitizer for type-based aliasing violations.
It is based on Hal Finkel's https://reviews.llvm.org/D32198.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the byes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
The instrumentation pass inserts calls to the memset intrinsic to set
the memory updated by memset, memcpy, and memmove, as well as
allocas/byval (and for lifetime.start/end) to reset the shadow memory to
reflect that the type is now unknown. The runtime intercepts memset,
memcpy, etc. to perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.
Clang's TBAA representation currently has a problem representing unions,
as demonstrated by the one XFAIL'd test in the runtime patch. We'll
update the TBAA representation to fix this, and at the same time, update
the sanitizer.
When the sanitizer is active, we disable actually using the TBAA
metadata for AA. This way we're less likely to use TBAA to remove memory
accesses that we'd like to verify.
As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.
It goes together with the corresponding clang changes
(https://github.com/llvm/llvm-project/pull/76260) and compiler-rt
changes (https://github.com/llvm/llvm-project/pull/76261)
PR: https://github.com/llvm/llvm-project/pull/76259
|
|
Most of the other sanitizers are now only module level passes. This
moves all functionality into the module pass, and removes the function
pass.
|