Age | Commit message | Author | Files | Lines |
|
This was detected by ASan after #83774.
The allocation size is divided by `__endian_factor` before it is stored. If
it is not aligned,
we cannot recover the allocation size to pass to
`__alloc_traits::deallocate`.
We have code like this:
```
// Allocation path: the requested size is __recommend(__sz) + 1.
auto __allocation = std::__allocate_at_least(__alloc(), __recommend(__sz) + 1);
__p = __allocation.ptr;
__set_long_cap(__allocation.count);

// The capacity is stored divided by __endian_factor...
void __set_long_cap(size_type __s) _NOEXCEPT {
  __r_.first().__l.__cap_ = __s / __endian_factor;
  __r_.first().__l.__is_long_ = true;
}

// ...and multiplied back when it is read.
size_type __get_long_cap() const _NOEXCEPT {
  return __r_.first().__l.__cap_ * __endian_factor;
}

// The destructor passes the recovered capacity to deallocate().
inline ~basic_string() {
  __annotate_delete();
  if (__is_long())
    __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
}
```
1. `__recommend()` -> even size
2. `std::__allocate_at_least(__alloc(), __recommend(__sz) + 1)` -> not
even size
3. `__set_long_cap()` -> loses one bit of size for `__endian_factor == 2`
(see `/ __endian_factor`)
4. `__alloc_traits::deallocate(__alloc(), __get_long_pointer(),
__get_long_cap())` -> uses even size (see `__get_long_cap`)
(cherry picked from commit d129ea8d2fa347e63deec0791faf389b84f20ce1)
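The size loss described in steps 1 to 4 can be reproduced in isolation. The sketch below is not libc++ code: `kEndianFactor`, `store_cap`, `load_cap`, and `round_trip` are hypothetical stand-ins for `__endian_factor`, `__set_long_cap`, and `__get_long_cap`, assuming a layout where the factor is 2.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for __endian_factor on a layout where it equals 2.
constexpr std::size_t kEndianFactor = 2;

// Models __set_long_cap: integer division drops the low bit of an odd size.
constexpr std::size_t store_cap(std::size_t n) { return n / kEndianFactor; }

// Models __get_long_cap: multiplication cannot restore the dropped bit.
constexpr std::size_t load_cap(std::size_t stored) { return stored * kEndianFactor; }

// Size that would reach deallocate() after a store/load round trip.
constexpr std::size_t round_trip(std::size_t n) { return load_cap(store_cap(n)); }

static_assert(round_trip(48) == 48, "an even allocation size survives");
static_assert(round_trip(49) == 48, "an odd allocation size comes back one short");

// Usage: the size handed to deallocate() no longer matches the allocated one.
inline void demo_round_trip() {
  assert(round_trip(49) != 49);
}
```

An odd allocation count is therefore deallocated with a size that is one element too small, which is the kind of mismatch a size-checking allocator can flag.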
|
|
Previously, there was a ternary conditional with a less-than comparison
appearing inside a template argument, which was really confusing because
of the <...> of the function template. This patch rewrites the same
statement on two lines for clarity.
(cherry picked from commit 382f70a877f00ab71f3cb5ba461b52e1b59cd292)
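As an illustration of the readability problem only, here is a minimal sketch. It is not the statement that was changed: `padded`, `kA`, `kB`, and `kSmaller` are hypothetical names chosen to show how a `<` comparison blends into the `<...>` of a function template.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function template taking a non-type template argument.
template <std::size_t N>
constexpr std::size_t padded() { return N + 1; }

constexpr std::size_t kA = 16;
constexpr std::size_t kB = 32;

// Hard to scan: '<' appears both as a comparison and as a template bracket.
constexpr std::size_t before = padded<(kA < kB ? kA : kB)>();

// The same statement split across two lines.
constexpr std::size_t kSmaller = kA < kB ? kA : kB;
constexpr std::size_t after = padded<kSmaller>();

static_assert(before == after, "both spellings compute the same value");
```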
|
|
|
|
Fixes #92300.
(cherry picked from commit d89f20058b45e3836527e816af7ed7372e1d554d)
|
|
(#94091)
…ing literals (#92214)
|
|
In #88846 I changed this code to use RAUW to perform the replacement
instead of manual updates -- but kept the outer loop, which means we try
to perform RAUW once per user. However, some of the users might be freed
by the RAUW operation, resulting in use-after-free.
The case where this happens is constant users where the replacement
might result in the destruction of the original constant.
Fixes https://github.com/llvm/llvm-project/issues/92991.
(cherry picked from commit 9f85bc834b07ebfec9e5e02deb9255a0f6ec5cc7)
|
|
|
|
- No indirect syscalls on OpenBSD. Instead there is a `futex` function
which issues a direct syscall.
- Monotonic clock is available despite the full POSIX suite of timers
not being available in its entirety.
See https://lists.boost.org/boost-bugs/2015/07/41690.php and
https://github.com/boostorg/log/commit/c98b1f459add14d5ce3e9e63e2469064601d7f71
for a description of an analogous problem and fix for Boost.
(cherry picked from commit af7467ce9f447d6fe977b73db1f03a18d6bbd511)
|
|
If the `/usr/lib/...` path where compiler-rt is conventionally installed
on OpenBSD does not exist, fall back to the regular logic to find it.
This is a minimal change to allow OpenBSD cross compilation from a
toolchain that doesn't adopt all of OpenBSD's monorepo's conventions.
(cherry picked from commit be10746f3a4381456eb5082a968766201c17ab5d)
|
|
Fixes #91312.
Don't perform the transform if the alias may be replaced at link time.
(cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
|
|
I accidentally left out the code to transfer sret attributes to entry
thunks, so values weren't being passed in the right registers, and the
sret pointer wasn't returned in the correct register.
Fixes #90229
|
|
In some cases, MSVC's mangling for arm64ec thunks includes the alignment
of a struct. I added some code to try to match... but it never really
worked right. The issues:
- Alignment is only mangled if it's 16 or more (I guess the default is
supposed to be 8).
- Alignment isn't mangled on return values (since the memory is
allocated by the caller).
The current patch leaves hooks to make alignment mangling work... but
doesn't actually ever mangle alignment: clang never actually encodes a
relevant alignment into the IR. Once we get clang to emit the real
size/alignment of structs, we can start emitting it.
|
|
(cherry picked from commit d06270ee00e37b247eb99268fb2f106dbeee08ff)
|
|
This is ORed with the fast-unaligned-access feature which applies
to scalar and vector together.
|
|
Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
|
|
See the following case:
```
define i32 @src1(i32 %x) {
  %dec = sub nuw i32 -2, %x
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub nsw i32 32, %ctlz
  %shl = shl i32 1, %sub
  %ugt = icmp ult i32 %x, -2
  %sel = select i1 %ugt, i32 %shl, i32 1
  ret i32 %sel
}

define i32 @tgt1(i32 %x) {
  %dec = sub nuw i32 -2, %x
  %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false)
  %sub = sub nsw i32 32, %ctlz
  %and = and i32 %sub, 31
  %shl = shl nuw i32 1, %and
  ret i32 %shl
}
```
`nuw` in `%dec` should be dropped after the select instruction is
eliminated.
Alive2: https://alive2.llvm.org/ce/z/7S9529
Fixes https://github.com/llvm/llvm-project/issues/91691.
(cherry picked from commit b5f4210e9f51f938ae517f219f04f9ab431a2684)
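The problematic input can be checked by hand with a small model. This is a sketch in C++ rather than IR, and the helper names (`ctlz32`, `dec_wraps`, `guard`, `tgt1_no_nuw`) are hypothetical. It assumes that `sub nuw` yields poison exactly when the unsigned subtraction wraps, and it models `@tgt1` with the `nuw` flag on `%dec` already dropped.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros; returns 32 for 0, like llvm.ctlz with its i1 flag false.
inline unsigned ctlz32(std::uint32_t v) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// True when `sub nuw i32 -2, %x` would wrap, i.e. be poison in IR.
inline bool dec_wraps(std::uint32_t x) { return x > 0xFFFFFFFEu; }

// The select guard in @src1: `icmp ult i32 %x, -2`.
inline bool guard(std::uint32_t x) { return x < 0xFFFFFFFEu; }

// @tgt1 with nuw dropped from %dec: plain wrapping arithmetic, no select.
inline std::uint32_t tgt1_no_nuw(std::uint32_t x) {
  std::uint32_t dec = 0xFFFFFFFEu - x;  // wraps only for x == 0xFFFFFFFF
  std::uint32_t sub = 32u - ctlz32(dec);
  return std::uint32_t{1} << (sub & 31u);
}
```

For `x == 0xFFFFFFFF` the subtraction wraps while the guard is false, so `@src1` returns the constant 1 and never uses `%dec`. With the select gone, `@tgt1` does use `%dec`, so the flag has to go for the result to stay 1 instead of poison.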
|
|
analysis."
After reconsidering the words of @nikic, I have decided to revisit the patches I suggested be backported. Upon further analysis, I think there is a high likelihood that this change added to release 18.x was referencing a crash that was caused by a PR that isn't added.
I will, however, keep the test that was added just in case.
This reverts commit 6e071cf30599e821be56b75e6041cfedb7872216.
|
|
Fixes https://github.com/llvm/llvm-project/issues/92062
(cherry picked from commit d422e90fcbdddd68749918ddd86c94188807efce)
|
|
When expanding an L128 (which is used to reload i128) it is
possible that the quadword destination register clobbers an
address register. This patch adds an assertion against the case
where both of the expanded parts clobber the address, and in the
case where one of the expanded parts does so, puts it last.
Fixes #91437
(cherry picked from commit d6ee7e8481fbaee30f37d82778ef12e135db5e67)
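The ordering rule can be sketched outside the backend. The model below is hypothetical and does not use any LLVM API: registers are plain integers, and `LoadOrder` and `order_halves` are names invented for the illustration.

```cpp
#include <cassert>

// Hypothetical model: registers are small integers, and the 128-bit reload is
// split into two loads that both read through the same address register.
struct LoadOrder {
  int first;   // destination of the load emitted first
  int second;  // destination of the load emitted last
};

inline LoadOrder order_halves(int addr_reg, int hi_dest, int lo_dest) {
  // If both halves clobbered the address, no ordering could fix it.
  assert(!(hi_dest == addr_reg && lo_dest == addr_reg));
  // A half that overwrites the address register must be loaded last.
  if (hi_dest == addr_reg)
    return LoadOrder{lo_dest, hi_dest};
  return LoadOrder{hi_dest, lo_dest};
}
```

With address register 2 and destinations 2 and 3, the load into register 3 is emitted first, so the address is still intact when the second load reads it.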
|
|
Fixes https://github.com/llvm/llvm-project/issues/91551
|
|
See the LangRef:
> All uses of a value returned by the same ‘freeze’ instruction are
guaranteed to always observe the same value, while different ‘freeze’
instructions may yield different values.
It is incorrect to replace freezes with the simplified value.
Proof:
https://alive2.llvm.org/ce/z/3Dn9Cd
https://alive2.llvm.org/ce/z/Qyh5h6
Fixes https://github.com/llvm/llvm-project/issues/91178
(cherry picked from commit d085b42cbbefe79a41113abcd2b1e1f2a203acef)
Revert "[InstSimplify] Do not simplify freeze in `simplifyWithOpReplaced` (#91215)"
This reverts commit 1c2eb18d52976fef89972e89c52d2ec5ed7e4868.
[InstSimplify] Do not simplify freeze in `simplifyWithOpReplaced` (#91215)
See the LangRef:
> All uses of a value returned by the same ‘freeze’ instruction are
guaranteed to always observe the same value, while different ‘freeze’
instructions may yield different values.
It is incorrect to replace freezes with the simplified value.
Proof:
https://alive2.llvm.org/ce/z/3Dn9Cd
https://alive2.llvm.org/ce/z/Qyh5h6
Fixes https://github.com/llvm/llvm-project/issues/91178
(cherry picked from commit d085b42cbbefe79a41113abcd2b1e1f2a203acef)
|
|
doesn't support AVX512 (#91694)
(cherry picked from commit 87f3407856e61a73798af4e41b28bc33b5bf4ce6)
|
|
(#90911)
In DAGCombiner, the `performCONDCombine` function attempts to remove AND
instructions in front of SUBS (cmp) instructions for which the AND is
transparent. The rules for that are correct, but it fails to take into
account the case where the SUBS instruction has multiple users with
different condition codes for comparison and simply removes the AND for
all of them. This causes a miscompilation in the attached test case.
(cherry picked from commit 72eaa0ed9934bfaa2449091bbc6e45648d1396d6)
|
|
Walk all the ISA strings and set the subtarget bits for any extension we
find in any string.
This allows LTO output to have ELF attributes from the union of all of
the files used to compile it.
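The union itself is simple to picture. The following is a minimal sketch, not the RISCVAsmPrinter code: it assumes each ISA string has already been split into extension names, and `ExtensionList` and `union_extensions` are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each ISA string is already split into extension names.
using ExtensionList = std::vector<std::string>;

// Returns every extension that appears in at least one of the lists.
inline std::set<std::string> union_extensions(
    const std::vector<ExtensionList>& isa_strings) {
  std::set<std::string> enabled;
  for (const ExtensionList& exts : isa_strings)
    enabled.insert(exts.begin(), exts.end());
  return enabled;
}
```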
|
|
(#83344)
Instead of caching STI in the RISCVELFTargetStreamer, store the two
flags we need from it.
My goal is to allow RISCVAsmPrinter to override these flags using IR
module metadata for LTO. So they need to be separated from the STI used
to construct the TargetStreamer.
This patch should be NFC as long as no one is changing the contents of
the STI that was used to construct the TargetStreamer between the
constructor and the use of the flags.
|
|
In an LTO build, we don't set the ELF attributes to indicate what
extensions were compiled with. The target CPU/Attrs in
RISCVTargetMachine do not get set for an LTO build. Each function gets a
target-cpu/feature attribute, but this isn't usable to set ELF attributes
since we wouldn't know what function to use. We can't just pick one, since it
might have been compiled with an attribute like target_version.
This patch adds the ISA as Module metadata so we can retrieve it in the
backend. Individual translation units can still be compiled with
different strings so we need to collect the unique set when Modules are
merged.
The backend will need to combine the unique ISA strings to produce a
single value for the ELF attributes. This will be done in a separate
patch.
|
|
STT_NOTYPE
When adding fixups for RISCV_TLSDESC_ADD_LO and RISCV_TLSDESC_LOAD_LO,
the local labels added for RISCV TLSDESC relocations have STT_TLS set,
which is incorrect. Instead, these labels should have `STT_NOTYPE`.
This patch stops adding such fixups and avoids setting STT_TLS on
these symbols. Failing to do so can cause LLD to emit an error `has an
STT_TLS symbol but doesn't have an SHF_TLS section`. We additionally
adjust how LLD services these relocations to avoid errors with
incompatible relocation and symbol types.
Reviewers: topperc, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85817
(cherry picked from commit dfe4ca9b7f4a422500d78280dc5eefd1979939e6)
|
|
String pool merging currently, for a reason that's not entirely clear to
me, tries to create GEP instructions instead of GEP constant expressions
when replacing constant references. It only uses constant expressions in
cases where this is required. However, it does not catch all cases where
such a requirement exists. For example, the landingpad catch clause has
to be a constant.
Fix this by always using the constant expression variant, which also
makes the implementation simpler.
Additionally, there are some edge cases where even replacement with a
constant GEP is not legal. The one I am aware of is the
llvm.eh.typeid.for intrinsic, so add a special case to forbid
replacements for it.
Fixes https://github.com/llvm/llvm-project/issues/88844.
(cherry picked from commit 3a3aeb8eba40e981d3a9ff92175f949c2f3d4434)
|
|
Fixes #86109.
(cherry picked from commit cceedc939a43c7c732a5888364251775bffc2dba)
|
|
templates (#91628)
This fixes a regression introduced by bee78b88f.
When we form a deduction guide for a constructor, basically, we do the
following work:
- Collect template parameters from the constructor's surrounding class
template, if present.
- Collect template parameters from the constructor.
- Splice these template parameters together into a new template
parameter list.
- Turn all the references (e.g. the function parameter list) to the
invented parameter list by applying a `TreeTransform` to the function
type.
In the previous fix, we handled cases of nested class templates by
substituting the "outer" template parameters (i.e. those not declared at
the surrounding class template or the constructor) with the
instantiating template arguments. The approach per se makes sense, but
there was a flaw in the following case:
```cpp
template <typename U, typename... Us> struct X {
  template <typename V> struct Y {
    template <typename T> Y(T) {}
  };

  template <typename T> Y(T) -> Y<T>;
};

X<int>::Y y(42);
```
While we're transforming the parameters for `Y(T)`, we first attempt to
transform all references to `V` and `T`; then, we handle the references
to outer parameters `U` and `Us` using the template arguments from
`X<int>` by transforming the same `ParamDecl`. However, the first step
results in the reference `T` being `<template-param-0-1>` because the
invented `T` is the last of the parameter list of the deduction guide,
and what we're substituting with is a corresponding parameter pack
(which is `Us`, though empty). Hence we're messing up the substitution.
I think we can resolve it by reversing the substitution order, which
means handling outer template parameters first and then the inner
parameters.
There's no release note because this is a regression in 18, and I hope
we can catch up with the last release.
Fixes https://github.com/llvm/llvm-project/issues/88142
(cherry picked from commit 8c852ab57932a5cd954cb0d050c3d2ab486428df)
|
|
(cherry picked from commit 4b4763ffebaed9f1fee94b8ad5a1a450a9726683)
|
|
memory access. (#89031)"
Original commit message: "
Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work around this, we clone the module and
pass it down.
With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.
This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not
bulletproof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.
This change fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblocks progress
on https://github.com/llvm/llvm-project/pull/84758.
"
This patch reverts adc4f6233df734fbe3793118ecc89d3584e0c90f and removes
the check of `named_metadata_empty` of the first llvm::Module because on darwin
clang inserts some harmless metadata which we can ignore.
(cherry picked from commit a3f07d36cbc9e3a0d004609d140474c1d8a25bb6)
|
|
(#91826)
We have been collecting release notes from the PRs for most of the
18.1.x releases and this just helps automate the process.
(cherry picked from commit c99d1156c28dfed67a8479dd97608d1f0d6cd593)
|
|
When a child process is forked with OpenMP already initialized, the
child process resets its affinity mask and sets proc-bind-var to false
so that the entire original affinity mask is used. This patch corrects
an issue with the affinity initialization code setting affinity to
compact instead of none for this special case of forked children.
The test trying to catch this only tested an explicit setting of
KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting.
Fixes: #91098
(cherry picked from commit 73bb8d9d92f689863c94d48517e89d35dae0ebcf)
|
|
Currently, we mistakenly mark the local labels used in RISC-V TLSDESC as
TLS symbols, when they should not be. This patch adds tests with the
current incorrect behavior, and subsequent patches will address the
issue.
Reviewers: MaskRay, topperc
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/85816
(cherry picked from commit f6f474c4ef9694a4ca8f08d59fd112c250fb9c73)
|
|
We should moreElements <3 x s1> to <4 x s1> before we try to widen the element,
otherwise we end up with a <3 x s21> nonsense type.
(cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f)
Test has been changed from original commit due to a fallback in a G_BITCAST.
Added abort=2 so we can see partial legalization and check no crash.
|
|
Vectors are always tightly packed, and non-byte-sized elements
usually do not have a well-defined (byte) offset.
Fixes https://github.com/llvm/llvm-project/issues/90695.
(cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)
|
|
(#86972) (#91580)
Fixes #86917
`FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.
|
|
For inbounds GEPs, if the source pointer is non-null, the result must
also be non-null. However, this does not hold for non-inbounds GEPs.
Fixes https://github.com/llvm/llvm-project/issues/91177.
(cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)
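A tiny model shows why the non-inbounds case differs. This is a sketch under the assumption that a non-inbounds GEP behaves like wrapping address arithmetic; `gep_no_inbounds` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a non-inbounds GEP: plain wrapping address arithmetic.
inline std::uintptr_t gep_no_inbounds(std::uintptr_t base, std::intptr_t offset) {
  return base + static_cast<std::uintptr_t>(offset);
}
```

Starting from the non-null address 8, an offset of -8 produces address 0, so a non-null source pointer alone says nothing about the result.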
|
|
Fixes #90966.
(cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2)
|
|
As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
bit 1 in the encoded simm16 operand.
(cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d)
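The two changes combine as follows. This sketch is purely illustrative: `remap_flag` is a hypothetical helper, and it assumes only what is stated above, namely a one-bit flag that moves from bit 0 to bit 1 with its meaning inverted. It does not name the instruction or say which sense each encoding uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: re-encode a flag kept in bit 0 into an encoding that
// keeps the inverted flag in bit 1. All other bits are left as zero.
inline std::uint16_t remap_flag(std::uint16_t old_simm16) {
  const std::uint16_t old_flag = static_cast<std::uint16_t>(old_simm16 & 0x1u);
  const std::uint16_t new_flag = static_cast<std::uint16_t>(old_flag ^ 0x1u);
  return static_cast<std::uint16_t>(new_flag << 1);  // flipped, now in bit 1
}
```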
|
|
This is a fix for miscompiles reported in
https://github.com/llvm/llvm-project/issues/89060
After argument copy elision, the IR value for the eliminated alloca
aliases the fixed stack object. This patch makes sure
that we mark the fixed stack object as being aliased with IR values,
so that, for example, schedulers do not reorder accesses to
the fixed stack object. This could otherwise happen when there is a
mix of MemOperands referring to the shared fixed stack slot via both
the IR value for the elided alloca and a fixed stack pseudo
source value (as would be the case when lowering the arguments).
(cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63)
|
|
AVX doesn't provide 16-bit BROADCAST instruction.
Fixes #91005
|
|
With KNL/KNC being deprecated, we don't need to care about such non-VLX
cases anymore. We may remove such patterns in the future.
Fixes #90844
(cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c)
|
|
Code to determine if a waitcnt is required before a barrier instruction
only considered S_BARRIER.
gfx12 adds barrier_signal/wait, so we need to enhance the existing code to
look for a barrier start (which is just an S_BARRIER for earlier
architectures).
|
|
This updates the release-binaries workflow so that the different build
stages are split across multiple jobs. This saves money by reducing the
time spent on the larger github runners and also makes it easier to
debug, because now it's possible to build a smaller release package
(with clang and lld) using only the free GitHub runners.
The workflow no longer uses the test-release.sh script but instead uses
the Release.cmake cache. This gives the workflow more flexibility and
ensures that the binary package will always be created even if the tests
fail.
This idea to split the stages comes from the "LLVM Precommit CI through
Github Actions" RFC:
https://discourse.llvm.org/t/rfc-llvm-precommit-ci-through-github-actions/76456
(cherry picked from commit abac98479b81cc0cc717bb6cdbae6f774e3b0232)
|
|
In aa02002491333c42060373bc84f1ff5d2c76b4ce the input name was changed
from tag to release-version, but the code was never updated.
(cherry picked from commit 8d220d109d28dac352c563ab062fb72132b7eca1)
|
|
Since aa02002491333c42060373bc84f1ff5d2c76b4ce we weren't installing the
correct dependencies, and since 2836d8edbfbcd461b25101ed58f93c862d65903a
we must pass a custom token to github-upload-release.py for verifying
permissions.
(cherry picked from commit 51207756b0692f325cf75560185cf0336239b3e0)
|
|
This patch adds repository checks to the release-binaries workflow jobs.
People were observing that the job was running on a schedule in their
forks. This only happens on old forks, but those probably exist in great
number given how prolific LLVM is. This is also good practice anyways,
on top of solving the direct problem of these jobs running with the cron
schedule on people's forks.
(cherry picked from commit 9f5be5f0092a636274953389cd5771c45ac0a568)
|
|
Set this in the cache file directly instead of via the test-release.sh
script so that the release builds can be reproduced with just the cache
file.
(cherry picked from commit 53ff002c6f7ec64a75ab0990b1314cc6b4bb67cf)
|