Age | Commit message (Collapse) | Author | Files | Lines |
|
This reverts commit 0dbd804a690720688d8234d8bdaee8f8f4fdcddc.
|
|
DW_AT_comp_dir/DW_AT_dwo_name (#91237)
We need to update DW_AT_comp_dir/DW_AT_dwo_name TU in the
.debug_info.dwo section so that the path is correct. Refactored helper
functions to make it easier for next step.
|
|
This uses non-volatile registers for the first four (six on Windows)
registers used for `preserve_none` argument passing. This allows these
registers to stay "pinned", even if the body of the `preserve_none`
function contains calls to other "normal" functions.
Example:
```c
void boring(void);
__attribute__((preserve_none)) void (continuation)(void *, void *, void *, void *);
__attribute__((preserve_none)) void entry(void *a, void *b, void *c, void *d)
{
boring();
__attribute__((musttail)) return continuation(a, b, c, d);
}
```
Before:
```asm
pushq %rax
movq %rcx, %rbx
movq %rdx, %r14
movq %rsi, %r15
movq %rdi, %r12
callq boring@PLT
movq %r12, %rdi
movq %r15, %rsi
movq %r14, %rdx
movq %rbx, %rcx
popq %rax
jmp continuation@PLT
```
After:
```asm
pushq %rax
callq boring@PLT
popq %rax
jmp continuation@PLT
```
|
|
`emitc.subscript` (#91087)
This change updates the logic that determines whether an `emitc.expression`
result is translated into a dedicated variable assignment. Due to how
the translation of `emitc.subscript` currently works, a previously
inline-able `emitc.expression` would produce incorrect C++ if its single user
was a `emitc.subscript` operation.
|
|
This removes the GISel versions of isREVMask, isTRNMask, isUZPMask and
isZipMask. They are combined with the existing versions from SDAG into
AArch64PerfectShuffle.h.
|
|
|
|
This appears to have been missed because later cpus don't inherit from Nehalem tuning much.
Noticed while cleaning up for #90985
|
|
Need to use the last address of the vectorized stores for the strided
stores, not the first one, to correctly store the data.
|
|
This change refactors RenderFloatingPointOptions() to eliminate some
excessively complicated logic and a redundant switch statement. The
logic being simplified is an artifact of the original -ffp-model
implementation, and over time it has become unnecessary.
The handling of diagnostics related to the -ffp-contract option is
still a bit convoluted after this change. I will address that in a
subsequent patch because I think it will make sense to make some minor
changes to the driver behavior when that is cleaned up. The current
patch should not make any change to observable behavior of the driver.
|
|
I was cleaning up this portion of the code and realized these are
completely unused.
|
|
In some cases we see strings from categories being part of "data"
sections (Ex:`__objc_const`), not part of of sections marked as
`cstring_literals`. Since lld treats these sections differently we need
to explicitly implement support for reading strings from the
non-`cstring_literals` sections.
Adding a test that previously would result in an assert.
|
|
When generating categories, clang sometimes will generate references in
the `.addrsig` section to the various category data items. Since we may
erase such items after merging them, we also need to remove them from
the `.addrsig` section - otherwise this will cause runtime asserts with
the `.addrsig` section trying to access invalid data.
Implementation wise, we use a hashset to keep track of all erased
`InputSection`'s and then go through all `.addrsig` sections and remove
references to any erased `InputSection`.
|
|
Currently RISCVDeadRegisterDefinitions runs after vsetvli insertion, but
in #70549 vsetvli insertion runs after vector regalloc and as a result
we no longer convert some vsetvli a0, a0s to vsetvli x0, a0. This patch
moves it to after vector regalloc, but before scalar regalloc so we
still get the benefits of reducing register pressure.
|
|
|
|
These two are very similar to the other 'var-list' variants, except they
require that the type of the variable be a pointer. This patch
implements that restriction.
|
|
|
|
Besides duplicating code, privatizing variables in every section
causes problems when synchronization barriers are used. This
happens because each section is executed by a given thread, which
will cause the program to hang if not all running threads execute
the barrier operation.
Fixes https://github.com/llvm/llvm-project/issues/72824
|
|
Because LiveVariables has been run, we no longer need to lookup the
users in MachineRegisterInfo anymore and can instead just check for the
dead flag.
|
|
bugprone-return-const-ref-from-parameter (#91160)
|
|
Need to check that the signed operand has an extra sign bit to be sure
that we do not skip signedness, when trying to minimize bitwidth for
smin/smax intrinsics.
|
|
|
|
Fix the issue that `char` constants are converted to `uint64_t` in the
wrong way when doing the inlining.
|
|
This reverts commit 11066449d49e20f18f46757df07455c6abcedcf1.
As noted in the original patch, this was designed to reverted once
https://reviews.llvm.org/D142479 and https://reviews.llvm.org/D142660
landed, which has long since happened.
|
|
(#87649)
`SBProcess::GetMemoryRegionInfo` uses `qMemoryRegionInfo` packet to get
memory region info, but this is not supported in gdb-server and causing
downstream lldb test failures. This change ignores the the error from
`SBProcess::GetMemoryRegionInfo` .
Reported by @tedwoodward @jerinphilip.
|
|
ThinLTO summaries" (#90610)" (#91194)
Reverts llvm/llvm-project#90692
Breaking PPC buildbots. The bots are not meant to test LLD, but are
running a test that is using an old version of LLD without the change
(so is incompatible). Revert until a fix is found.
|
|
(#88816)
Adds more FP test macros for the upcoming test adds for #61092 and the
issues opened from it: #88768, #88769, #88770, #88771, #88772.
Fix bug in `{EXPECT,ASSERT}_FP_EXCEPTION`. `EXPECT_FP_EXCEPTION(0)`
seems to be used to test that an exception did not happen, but it always
does `EXPECT_GE(... & 0, 0)` which never fails.
Update and refactor tests that break after the above bug fix. An
interesting way things broke after the above change is that
`ForceRoundingMode` and `quick_get_round()` were raising the inexact
exception, breaking a lot of the `atan*` tests.
The changes for all files other than `FPMatcher.h` and
`libc/test/src/math/smoke/RoundToIntegerTest.h` should have the same
semantics as before. For `RoundToIntegerTest.h`, lines 56-58 before the
changes do not always hold since this test is used for functions with
different exception and errno behavior like `lrint` and `lround`. I've
deleted those lines for now, but tests for those cases should be added
for the different nearest int functions to account for this.
Adding @nickdesaulniers for review.
|
|
Fixes #90285.
|
|
(reland #90438) (#91172)
This relands #90348 with a fix for a [buildbot
failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446)
caused by the test being run with `-fno-rtti`.
|
|
Pre-commit tests for an upcoming patch.
|
|
An `emitc.expression` can only yield a single result, but some
operations which have the `CExpression` trait can have multiple results,
which can result in a crash when applying the `fold-expressions` pass.
This change adds a check for the single-result condition and a simple
test.
|
|
|
|
|
|
Plugins are not loaded without the -cc1 phase. Do not report them when
running on an assembly file or when linking. Many build tools add these
options to all driver invocations, including LLVM's build system.
Fixes #88173
|
|
Fixes https://github.com/llvm/llvm-project/issues/78936
|
|
Handle implicit firstprivate DSAs on task generating constructs.
Fixes https://github.com/llvm/llvm-project/issues/64480
|
|
llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:3582:13:
error: unused function 'isBlendOrUndef' [-Werror,-Wunused-function]
static bool isBlendOrUndef(ArrayRef<int> Mask) {
^
1 error generated.
|
|
llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:40081:21:
error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare]
for (int I = 0; I != NumElts; ++I) {
~ ^ ~~~~~~~
1 error generated.
|
|
or 2 args (#89624)
The destructors generated by the legacy IBM `xlclang++` compiler can
take 1 or 2 arguments and the differences were handled by type `cast`
where it is needed. Clang now treats the `cast` here as an error after
https://github.com/llvm/llvm-project/commit/999d4f840777bf8de26d45947192aa0728edc0fb
landed with `-Xextra -Werror`. The issue had been worked around by using
`#pragma GCC diagnostic push/pop`. This patch defines 2 separate
destructor types for 1 argument and 2 arguments respectively so `cast`
is not needed.
|
|
Followup to #89727
|
|
If we don't demand the same element from both single source shuffles (permutes), then attempt to blend the sources together first and then perform a merged permute.
For vXi16 blends we have to be careful as these are much more likely to involve byte/word vector shuffles that will result in the creation of additional shuffle instructions.
This fold might be worth it for VSELECT with constant masks on AVX512 targets, but I haven't investigated this yet, but I've tried to write combineBlendOfPermutes so to be prepared for this.
The PR34592 -O0 regression is an unfortunate failure to cleanup with a later pass that calls SimplifyDemandedElts like the -O3 does - I'm not sure how worried we should be tbh.
|
|
Change definition of expandBitCastI128ToF128 and expandBitCastF128ToI128
to allow for simplified use in atomic load/store.
Update logic to split 128-bit loads and stores in DAGCombine to also
handle the f128 case where appropriate. This fixes the regressions
introduced by recent atomic load/store patches.
|
|
Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA)
Alive2: https://alive2.llvm.org/ce/z/fSH-rf
|
|
|
|
|
|
|
|
Referring to RISC-V, adding an MI level pass to optimize *W instructions
for LoongArch.
First it removes unneeded sext(addi.w rd, rs, 0) instructions. Either
because the sign extended bits aren't consumed or because the input was
already sign extended by an earlier instruction.
Then:
1. Unless explicit disabled or the target prefers instructions with W
suffix, it removes the -w suffix from opw instructions whenever all
users are dependent only on the lower word of the result of the
instruction. The cases handled are:
* addi.w because it helps reduce test differences between LA32 and LA64
w/o being a pessimization.
2. Or if explicit enabled or the target prefers instructions with W
suffix, it adds the W suffix to the instruction whenever all users are
dependent only on the lower word of the result of the instruction. The
cases handled are:
* add.d/addi.d/sub.d/mul.d.
* slli.d with imm < 32.
* ld.d/ld.wu.
|
|
We need to use InitField here, not SetField.
|
|
This is really a workaround to allow control flow lowering in the
presence of convergence control tokens. Control-flow intrinsics in LLVM
IR are convergent because they indirectly represent the wave CFG, i.e.,
sets of threads that are "converged" or "execute in lock-step". But they
exist during a small window in the lowering process, inserted after the
structurizer and then translated to equivalent MIR pseudos. So rather
than create convergence tokens for these builtins, we simply mark them
as not convergent.
The corresponding MIR pseudos are marked as having side effects, which
is sufficient to prevent optimizations without having to mark them as
convergent.
|
|
Proof: https://alive2.llvm.org/ce/z/iRnJ4i
Fixes https://github.com/llvm/llvm-project/issues/91127.
|
|
lldb already mostly(*) tracks this information. This just makes it
available to the SB users.
(*) It does not do that for typedefs right now see llvm.org/pr90958
|