Age | Commit message | Author | Files | Lines |
|
Apply loop guards when checking if the recurrence is non-negative in
cases where runtime checks are hoisted out of an inner loop.
|
|
The icmp and fcmp constant expressions were removed in deab451e7a7f
"[IR] Remove support for icmp and fcmp constant expressions (#93038)".
Update the DXILBitcodeWriter to stop referencing them.
|
|
|
|
Reviewers: aeubanks
Reviewed By: aeubanks
Pull Request: https://github.com/llvm/llvm-project/pull/88456
|
|
Matches the cmake build.
Reviewers: aeubanks
Reviewed By: aeubanks
Pull Request: https://github.com/llvm/llvm-project/pull/88458
|
|
Matches CMake LLVM_UBSAN_FLAGS.
Reviewers: aeubanks
Reviewed By: aeubanks
Pull Request: https://github.com/llvm/llvm-project/pull/93911
|
|
Reviewers: aeubanks
Reviewed By: aeubanks
Pull Request: https://github.com/llvm/llvm-project/pull/88457
|
|
All of these instructions can be generated using regular LL intrinsics.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
|
|
## Consistent PDB GUID in `llvm-readobj`
Currently, the PDB GUID is shown as a byte array:
`PDBGUID: (D8 4C 88 D9 26 15 1F 11 4C 4C 44 20 50 44 42 2E)`
This is inconsistent with `llvm-pdbutil` (e.g. `llvm-pdbutil dump --summary`),
which shows it as a hexadecimal string.
Additionally, `yaml2obj` uses the same hexadecimal string format.
In general, the hexadecimal string is the common representation for PDB
GUIDs on Windows.
This PR changes it to be consistent as shown below:
`PDBGUID: {D9884CD8-1526-111F-4C4C-44205044422E}`
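The two forms above are related by the Windows GUID layout, in which the first three fields are stored little-endian. A minimal Python sketch of that conversion follows; it is not the `llvm-readobj` code, and the helper name `format_pdb_guid` is made up for illustration:

```python
import uuid


def format_pdb_guid(raw: bytes) -> str:
    """Render 16 raw GUID bytes in the braced, upper-case string form."""
    if len(raw) != 16:
        raise ValueError("a GUID is exactly 16 bytes")
    # bytes_le reads the first three fields as little-endian, which matches
    # the in-memory layout of a Windows GUID.
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"


# The byte array from the old llvm-readobj output quoted above.
raw = bytes.fromhex("D8 4C 88 D9 26 15 1F 11 4C 4C 44 20 50 44 42 2E")
print(format_pdb_guid(raw))  # {D9884CD8-1526-111F-4C4C-44205044422E}
```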
|
|
|
|
Simplify by setting PseudoInstr to the tablegen name of the Pseudo in
the first place.
|
|
See the following example:
```
define i1 @src(i64 %x, i1 %y) {
  %1526 = icmp ne i64 %x, 0
  %1527 = icmp eq i64 %x, 0
  %sel = select i1 %y, i1 %1526, i1 %1527
  ret i1 %sel
}

define i1 @tgt(i64 %x, i1 %y) {
  %1527 = icmp eq i64 %x, 0
  %sel = xor i1 %y, %1527
  ret i1 %sel
}
```
This pattern is common in C/C++/Rust code bases.
This patch folds `select Cond, Y, X` into `Cond ^ X` iff:
1. X has the same type as Cond
2. X is poison -> Y is poison
3. X == !Y
Alive2: https://alive2.llvm.org/ce/z/hSmkHS
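
For fully defined `i1` values, the identity behind the fold can be checked exhaustively. The Python sketch below only models plain booleans: it does not model poison, so it says nothing about condition 2, for which the Alive2 proof is the authority. The helper names are made up for illustration.

```python
from itertools import product


def select(cond: bool, if_true: bool, if_false: bool) -> bool:
    """Model of `select i1 cond, i1 if_true, i1 if_false` on defined values."""
    return if_true if cond else if_false


def fold_holds() -> bool:
    """Check select(Cond, Y, X) == Cond ^ X for every Cond and X, with Y == !X."""
    for cond, x in product([False, True], repeat=2):
        y = not x  # condition 3: X == !Y
        if select(cond, y, x) != (cond ^ x):
            return False
    return True


print(fold_holds())
```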
|
|
Any of the `zext` bits in a `zext nneg` can be converted to `sext` but
when checking if casts are compatible `BasicAA` fails to take into
account `nneg`. This change adds tracking of `nneg` to the `CastedValue`
struct and ensures that `sext` and `zext` bits are treated as
interchangeable when either `CastedValue` has a `nneg`. When
distributing casted values in `GetLinearExpression` we conservatively
discard the `nneg` from the `CastedValue`, except in the case of `shl
nsw`, where we know the sign has not changed to negative.
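
The reason `zext nneg` and `sext` are interchangeable is that the two extensions only differ when the sign bit of the narrow value is set. A small Python sketch on 8-bit values illustrates this; it models the integer arithmetic only, not the `BasicAA` data structures, and the function names are made up for illustration.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, upper bits become 0."""
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits bits to to_bits (result as unsigned)."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    signed = (value ^ sign) - sign  # reinterpret as a signed integer
    return signed & ((1 << to_bits) - 1)


def casts_agree(value: int, from_bits: int = 8, to_bits: int = 16) -> bool:
    return zext(value, from_bits) == sext(value, from_bits, to_bits)


# Sign bit clear (the nneg case): both extensions give the same bits.
non_negative_agree = all(casts_agree(v) for v in range(0, 128))
# Sign bit set: the extensions never agree.
negative_agree = any(casts_agree(v) for v in range(128, 256))
print(non_negative_agree, negative_agree)
```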
|
|
convention. (#94353)
This allows simplifying the definition itself.
Part of <https://github.com/llvm/llvm-project/issues/62629>.
|
|
lowering. (#92960)
Implement the emitPrologue/emitEpilogue methods and the
determine/spill/restore functionality for callee-saved registers, with a
test. Also implement lowering of the
DYNAMIC_STACKALLOC/STACKSAVE/STACKRESTORE stack operations, with tests.
|
|
goma is deprecated and not maintained anymore.
|
|
|
|
This adds compare and branch instructions fusion for Neoverse V2.
According to the Software Optimization Guide:
Specific AArch64 instruction pairs that can be fused are as follows:
CMP/CMN (immediate) + B.cond
CMP/CMN (register) + B.cond
Performance for SPEC2017 is neutral, but another benchmark improves
significantly.
Results for SPEC2017 on a Neoverse V2:
500.perlbench 0%
502.gcc_r 0%
505.mcf_r -0.15%
523.xalancbmk_r -0.43%
525.x264_r 0%
531.deepsjeng_r 0%
541.leela_r -0.16%
557.xz_r -0.47%
|
|
The range may no longer be valid after the select has been
optimized away.
This fixes the kernel miscompiles reported at
https://github.com/ClangBuiltLinux/linux/issues/2031.
|
|
|
|
When doing a runtimes build with LTO using ld.bfd (or ld.gold), the
build starts failing with ninja 1.12, which added a new critical path
scheduler. The reason is that LLVMgold.so is not available yet at the
point where runtimes start being built, leading to configuration
failures in the nested cmake invocation.
Fix this by adding an explicit dependency on LLVMgold.so if it is
available. (It may not always be necessary, e.g. if the used linker is
lld, but it would be hard to detect when exactly it may or may not be
needed, so always adding the dependency is safer.)
|
|
Add overloads of GetElementPtrInst::Create() that accept
GEPNoWrapFlags, and switch the bool parameters in IRBuilder to
accept it instead as well.
As a sample use, switch GEP i8 canonicalization in InstCombine to
preserve the original flags.
|
|
Store getSE result in variable to re-use and use structured bindings
when looping over bounds.
|
|
|
|
This preserves the flags if a constexpr GEP is created (at least
as long as they don't get dropped later -- the test case uses a
constexpr index to avoid that).
|
|
Add a test case with a missed simplification when hoisting runtime
checks due to not applying loop guards.
|
|
This preserves the flags during that transform, but currently they
will still end up getting dropped at a later stage.
|
|
Flags are already fully preserved for the instruction case,
but lost on constant expressions.
|
|
|
|
|
|
VPTypeAnalysis::inferScalarTypeForRecipe is missing the case for
VPInstruction::LogicalAnd, due to which the test
vplan-incomplete-cases.ll crashes. Add this missing case, and move the
test in vplan-infer-not-or-type.ll to vplan-incomplete-cases.ll, showing
correct codegen for trip-counts 2 and 3.
|
|
Since the switch to opaque pointers, zero-index GEPs will be
optimized away anyway, so there is no need to explicitly handle
them here.
|
|
|
|
|
|
|
|
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
|
|
|
|
With the change in 2fa059195bb54f422cc996db96ac549888268eae we can now
use a range for loop.
|
|
A cycle profile showed that we were spending a lot of time invoking
MapVector::erase. According to
https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h,
erasing elements one at a time is very inefficient for MapVector and it
is better to use remove_if.
This change resulted in around 7% time reduction on a large thin link.
While here remove an unused function that also invokes erase on
MapVectors.
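
The cost difference can be illustrated with a toy model of an insertion-ordered map backed by a vector. This Python sketch is not `llvm::MapVector`; `TinyMapVector` and its layout (a list of pairs plus a key-to-position index) are assumptions made for illustration only.

```python
class TinyMapVector:
    """Toy insertion-ordered map: a list of (key, value) plus a key -> position index."""

    def __init__(self):
        self.items = []
        self.index = {}

    def insert(self, key, value):
        if key not in self.index:
            self.index[key] = len(self.items)
            self.items.append((key, value))

    def erase(self, key):
        """Remove one key: shifts the tail and re-indexes it, O(n) per call."""
        pos = self.index.pop(key)
        del self.items[pos]
        for i in range(pos, len(self.items)):
            self.index[self.items[i][0]] = i

    def remove_if(self, pred):
        """Remove every matching entry in a single pass, O(n) in total."""
        self.items = [kv for kv in self.items if not pred(kv)]
        self.index = {k: i for i, (k, _) in enumerate(self.items)}


def build(n):
    mv = TinyMapVector()
    for k in range(n):
        mv.insert(k, k * k)
    return mv


# Erasing k entries one at a time costs O(k * n) ...
one_by_one = build(10)
for key in [k for k, _ in one_by_one.items if k % 2 == 0]:
    one_by_one.erase(key)

# ... while a batched remove_if reaches the same state in one pass.
batched = build(10)
batched.remove_if(lambda kv: kv[0] % 2 == 0)
```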
|
|
The old use of must-be-executed-context (MBEC) propagated
through calls even when that was not allowed. We now only propagate from
call site arguments. If there are calls/intrinsics that allow
propagation, we need to add them explicitly.
Fixes: https://github.com/llvm/llvm-project/issues/78507
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
vector type. (#93406)
FunctionStackPoisoner does not support `AllocaInst` with a scalable
vector type, but it does not filter out struct types containing scalable
vectors, introduced by c8eb535aed0368c20b25fe05bca563ab38dd91e9.
|
|
Sink vscale calls as well when indvars are not widened
(-indvars-widen-indvars=false).
|
|
(#94285)
A cycle profile of a thin link showed a lot of time spent in sort called
from the BitcodeWriter, which was being used to compute the unique
references to stack ids in the summaries emitted for each backend in a
distributed thinlto build. We were also frequently invoking lower_bound
to locate stack id indices in the resulting vector when writing out the
referencing memprof records.
Change this to use a map to uniquify the references, and to hold the
index of the corresponding stack id in the StackIds vector, which is
now populated at the same time.
This reduced the time of a large thin link by about 10%.
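
The idea of replacing sort plus `lower_bound` with a map can be sketched as follows. This is a simplified Python illustration, not the BitcodeWriter code: the record layout, the sample ids, and the name `collect_stack_ids` are assumptions, and the real patch may order the ids differently.

```python
def collect_stack_ids(records):
    """Uniquify stack ids in one pass, remembering each id's vector index."""
    stack_ids = []  # the emitted vector of unique stack ids
    index_of = {}   # stack id -> its position in stack_ids
    for record in records:
        for sid in record:
            if sid not in index_of:
                index_of[sid] = len(stack_ids)
                stack_ids.append(sid)
    return stack_ids, index_of


# Hypothetical memprof records, each a list of stack ids.
records = [[0xA1, 0xB2], [0xB2, 0xC3], [0xA1]]
stack_ids, index_of = collect_stack_ids(records)

# Writing a record is now one map lookup per id instead of a binary search.
encoded = [[index_of[sid] for sid in record] for record in records]
print(stack_ids, encoded)
```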
|
|
The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.
|
|
It should preserve more analysis results, but it happens immediately
after instruction selection.
|
|
Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would
always be enabled in this case; if it is not, `TTI` does not exist.
Introduced in 7652a59407018c057cdc1163c9f64b5b6f0954eb
|
|
To support the third parameter of the alignment directive, R_LARCH_ALIGN
relocations need a non-zero symbol index.
In many cases we don't need the third parameter and can set the symbol
index to 0.
This patch will remove a lot of .Lla-relax-align* symbols and mitigate
the size regression due to
https://github.com/llvm/llvm-project/pull/72962.
Co-authored-by: Jinyang He <hejinyang@loongson.cn>
Co-authored-by: Weining Lu <luweining@loongson.cn>
|
|
block address taken. (#94296)
These blocks usually show up in the form of branches within inline
assembly. Since it's hard to rewire them, we fully omit paths with such
blocks from path cloning.
|
|
libcxx/libcxxabi/libunwind.
-fvisibility-global-new-delete-hidden is deprecated and clang was warning
about it on every build command. These libraries are always built using
a stage2 compiler, so we can use the new build flag unconditionally.
Reviewers: aeubanks
Reviewed By: aeubanks
Pull Request: https://github.com/llvm/llvm-project/pull/88459
|
|
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly
|