Age | Commit message | Author | Files | Lines |
|
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.
While AMDGPU currently disables the use of all libcalls, I've changed it
to instead disable all of them _except_ the atomic ones. Those are
already emitted by the Clang frontend, and enabling them in the
backend allows the same behavior there.
|
|
A displacement is an 8-, 16-, or 32-bit value.
LLVM integrated assembler silently encodes an out-of-range displacement.
GNU assembler checks the displacement and may report a warning or error
(error is for 64-bit addressing, done as part of
https://sourceware.org/PR10636).
```
movq 0x80000000(%rip), %rax
Error: 0x80000000 out of range of signed 32bit displacement
movq -0x080000001(%rax), %rax
Error: 0xffffffff7fffffff out of range of signed 32bit displacement
movl 0x100000001(%eax), %eax
Warning: 0x100000001 shortened to 0x1
```
For 32-bit addressing, GNU assembler gives no diagnostic when the
displacement is within `[-2**32,2**32)`. 16-bit addressing is similar.
```
movl 0xffffffff(%eax), %eax # no diagnostic
movl -0xffffffff(%eax), %eax # no diagnostic
```
GNU assembler probably supports the larger range because wraparound with
a large constant is a reasonable thing to write. E.g. Linux kernel arch/x86/kernel/head_32.S
has `leal -__PAGE_OFFSET(%ecx),%esp` where `__PAGE_OFFSET` is
0xc0000000.
This patch implements a similar behavior.
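
The GNU assembler ranges described above can be sketched as a small range check. This is only an illustration, not the integrated assembler's code: the name `displacement_ok` is hypothetical, and the 16-bit range `[-2**16, 2**16)` is an assumption extrapolated from "16-bit addressing is similar".

```python
def displacement_ok(disp: int, addr_bits: int) -> bool:
    """Return True if `disp` would be accepted without a diagnostic."""
    if addr_bits == 64:
        # 64-bit addressing: must be representable as a signed 32-bit value.
        return -2**31 <= disp < 2**31
    if addr_bits == 32:
        # 32-bit addressing: wraparound is tolerated within [-2**32, 2**32).
        return -2**32 <= disp < 2**32
    if addr_bits == 16:
        # Assumption: same idea as 32-bit addressing, scaled to 16 bits.
        return -2**16 <= disp < 2**16
    raise ValueError(f"unsupported address size: {addr_bits}")
```

With this sketch, `displacement_ok(0x80000000, 64)` is rejected while `displacement_ok(0xffffffff, 32)` and `displacement_ok(-0xffffffff, 32)` are accepted, matching the examples in the message.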
|
|
Fix #57086: when ASAN_SHADOW_OFFSET_CONST >= 0x80000000 (FreeBSD,
NetBSD, etc), `movsbl ASAN_SHADOW_OFFSET_CONST(%r10),%r10d` has an
invalid displacement (not representable as a signed 32-bit integer),
which will be diagnosed by GNU assembler.
```
% cat a.s
movsbl 0x80000000(%r10),%r10d
% as a.s
a.s: Assembler messages:
a.s:1: Error: 0x80000000 out of range of signed 32bit displacement
% clang -c a.s
```
The integrated assembler after #75747 will diagnose the invalid
displacement as well.
```
% clang -c a.s
a.s:1:19: error: displacement 2147483648 is not within [-2147483648, 2147483647]
movsbl 0x80000000(%r10),%r10d
^
```
If ASAN_SHADOW_OFFSET_CONST cannot be encoded as a displacement, switch
to `movabsq+movsbl`.
|
|
The function `pthread_mutex_clocklock` is not supported by TSAN yet,
as mentioned in
[llvm/llvm-project/issues/62623](https://github.com/llvm/llvm-project/issues/62623#issue-1701600538).
This patch adds handling for this function.
|
|
|
|
Since FatLTO now uses the UnifiedLTO pipeline, we should not set the
ThinLTO module flag to true, since it may cause an assertion failure.
See https://github.com/llvm/llvm-project/issues/70703 for context.
|
|
This is a follow-up to https://github.com/llvm/llvm-project/pull/71902.
Passing `--continue-on-cu-index-overflow` without a value returned the
error `--continue-on-cu-index-overflow: missing argument`. It now behaves
like other flags such as `-gsplit-dwarf`: a bare
`--continue-on-cu-index-overflow` defaults to continue, and the user can
set the mode with `--continue-on-cu-index-overflow=<value>`.
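
The "flag with an optional value" behavior can be illustrated with Python's `argparse`, used here as a stand-in for LLVM's own option parsing. The helper name `parse_overflow_mode` and the mode string `some-mode` are hypothetical; only the default of `continue` comes from the message above.

```python
import argparse


def parse_overflow_mode(argv):
    """Parse a flag whose value is optional; a bare flag means 'continue'."""
    parser = argparse.ArgumentParser(prog="dwp-sketch")
    parser.add_argument(
        "--continue-on-cu-index-overflow",
        nargs="?",         # the value may be omitted
        const="continue",  # used when the flag is given without a value
        default=None,      # used when the flag is absent
    )
    return parser.parse_args(argv).continue_on_cu_index_overflow
```

Note that `argparse` would also accept the value as a separate argument, which may differ from how the real tool treats it.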
|
|
Currently, when libc++'s views::take specially handles an iota_view, the
addition is done after dereferencing the beginning iterator. However, in
[range.take.overview]/2.3, the addition is done before the dereferencing,
which means that the standard requires the returned iota_view to have
the same W and Bound type in such cases.
This patch fixes that, and also fixes a test that was testing the
incorrect behavior.
Fixes #75611
|
|
This reverts commit 58a2c4e2f24ffce3966c3988d1a4ca7b04c52244 to fix the
issue detected by https://lab.llvm.org/buildbot/#/builders/233/builds/5424.
|
|
This is a followup to d01be3c63109986627c1c029d6d0130f76a63a2f.
|
|
This patch lifts aux vector related definitions to app.h. Because
startup's refactoring is in progress, this patch still contains
duplicated changes. This problem will be addressed very soon in an
upcoming patch.
|
|
The function was using the default version of ValueObject::Dump, which
has a default of using the synthetic-ness of the top-level value for
determining whether to print _all_ values as synthetic. This resulted in
some unusual behavior, where e.g. a std::vector is stringified as
synthetic if it's dumped as the top-level object, but in its raw form if
it is a member of a struct without a pretty printer.
The SBValue class already has properties which determine whether one
should be looking at the synthetic view of the object (and also whether
to use dynamic types), so it seems more natural to use that.
|
|
|
|
The existing code incorrectly assumes that `Path` can be empty. It
can't: it always contains at least `<` or `"`. On Unix, this patch fixes
an incorrect diagnostic that suggested `"Userss/blah"` instead of
`"/Users/blah"`. In assert builds, this would outright crash.
This patch also fixes a bug on Windows that would prevent the diagnostic
being triggered due to separator mismatch.
rdar://91172342
|
|
- G_MERGE_VALUES and G_UNMERGE_VALUES need type pairs instead of type.
|
|
`V` was unused, and all the other deletions follow from that
observation.
|
|
|
|
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
    find libcxx/include libcxx/src -type f \
      | grep -v 'module.modulemap.in' \
      | grep -v 'CMakeLists.txt' \
      | grep -v 'README.txt' \
      | grep -v 'libcxx.imp' \
      | grep -v '__config_site.in' \
      | xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
|
|
1) It was marked as volatile. This is not needed; it was only done
because it is both a load and a store and is handled together with
atomics. Global load to LDS was marked as volatile just because
buffer load was done that way.
2) Preserve at least LDS (store) pointer which we always have with
the intrinsics.
3) Use PoisonValue instead of nullptr for load memop as a Value.
|
|
SIInstrInfo::areMemAccessesTriviallyDisjoint does DS offset checks,
but does not account for LDS DMA instructions. Added these checks.
Without them, the code falls through and returns true, which is wrong. As a
result, mayAlias would always return false for an LDS DMA and a regular LDS
instruction, or for 2 LDS DMA instructions.
At the moment this is NFCI because we do not use this AA in a context
which may touch LDS DMA instructions. This is also unreachable now
because of the ordered memory ref checks just above in the function and
because LDS DMA is marked as volatile. This volatile marking is removed in PR
#75247, therefore I'd submit this check before #75247.
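
The hazard described above, falling through to "disjoint" for an access the check does not understand, can be illustrated with a generic offset/width overlap test. This is not the code in `SIInstrInfo`. The function name `trivially_disjoint` is hypothetical, and the sketch assumes both accesses share the same base and that an unknown offset is modeled as `None`.

```python
from typing import Optional


def trivially_disjoint(offset_a: Optional[int], width_a: int,
                       offset_b: Optional[int], width_b: int) -> bool:
    """Return True only when the two accesses provably do not overlap."""
    if offset_a is None or offset_b is None:
        # Unknown access: be conservative and report "may overlap"
        # instead of falling through to "disjoint".
        return False
    if offset_a <= offset_b:
        low, low_width, high = offset_a, width_a, offset_b
    else:
        low, low_width, high = offset_b, width_b, offset_a
    # Disjoint when the lower access ends at or before the higher one starts.
    return low + low_width <= high
```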
|
|
Rather than shepherding a type name all the way to the backend as a
string and attempting to parse it, get the element type out of the AST
and store that in the resource annotation metadata directly.
Pull Request: https://github.com/llvm/llvm-project/pull/75674
|
|
(#75665)
Add a sub diagnostic group under `-Wunsafe-buffer-usage` controlled by
`-Wunsafe-buffer-usage-in-container`. The subgroup will include warnings
on misuses of `std::span`, `std::vector`, and `std::array`.
|
|
|
|
tests. (#75552)
Profiling cmake shows that a significant share of the time spent
configuring the `libc` folder goes to running `get_object_files_for_test`
in the `test` folder (13 sec in the `libc/test` folder out of 16 sec in
the `libc` folder). By caching all needed objects for each target instead
of resolving them every time, the time cmake spends on configuring the
`libc/test` folder is reduced to ~1s.
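
The caching idea can be sketched outside CMake. The following Python illustration memoizes the transitive object-file set per target so that shared dependencies are resolved once. The dependency graph `DEPS` and the function `object_files` are hypothetical and are not part of the libc build system.

```python
from functools import lru_cache

# Hypothetical dependency graph: target -> direct dependencies.
DEPS = {
    "test_a": ("lib_x", "lib_y"),
    "test_b": ("lib_y",),
    "lib_x": ("lib_z",),
    "lib_y": ("lib_z",),
    "lib_z": (),
}


@lru_cache(maxsize=None)
def object_files(target: str) -> frozenset:
    """Collect the object files of `target` and all of its dependencies."""
    objs = {target + ".o"}
    for dep in DEPS[target]:
        # Cached: each dependency is resolved only once across all targets.
        objs |= object_files(dep)
    return frozenset(objs)
```

Resolving `test_a` and then `test_b` reuses the cached results for `lib_y` and `lib_z` instead of walking the graph again.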
|
|
|
|
are not dumped (#75724)
When a section contains two functions x1 and x2, we incorrectly display
x1's relocations when dumping x2 for `--disassemble-symbols=x2 -r`.
Fix #75539 by ignoring these relocations.
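
As a rough illustration of the filtering, relocations can be kept only when their offset falls inside the address range of the symbol being dumped. This is not llvm-objdump's code. The function name, the `(offset, name)` tuple layout, and the half-open range `[start, end)` are assumptions made for the sketch.

```python
def relocations_for_symbol(relocs, start, end):
    """Keep relocations whose offset lies in the symbol's range [start, end)."""
    return [(off, name) for off, name in relocs if start <= off < end]


# Hypothetical section: x1 occupies [0, 8) and x2 occupies [8, 16).
SECTION_RELOCS = [(0x1, "foo"), (0x9, "bar"), (0xd, "baz")]
```

Here `relocations_for_symbol(SECTION_RELOCS, 8, 16)` drops the relocation that belongs to `x1` and keeps the two inside `x2`.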
|
|
(#75726)
Non-LTO compiles set the buffer name to "<inline asm>"
(`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to
`ClangDiagnosticHandler` (through the `MCContext` handler in
`MachineModuleInfoWrapperPass::doInitialization`) to ensure that
the exit code is 1 in the presence of errors. In contrast, LTO compiles
spuriously succeed even if error messages are printed.
```
% cat a.c
void _start() {}
asm("unknown instruction");
% clang -c a.c
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 error generated.
% clang -c -flto a.c; echo $? # -flto=thin is the same
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
0
```
`CollectAsmSymbols` parses inline assembly and is transitively called by
both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading
to duplicate diagnostics.
This patch updates `CollectAsmSymbols` to be similar to non-LTO
compiles.
```
% clang -c -flto=thin a.c; echo $?
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 errors generated.
1
```
The `HasErrors` check does not prevent duplicate warnings but assembler
warnings are very uncommon.
|
|
While debugging https://github.com/llvm/llvm-project/issues/71326, the
`LoadOp::verify` code and error were very confusing. This PR improves
that.
This code was part of the reverted PR
https://github.com/llvm/llvm-project/pull/75519. Fixing the
`-convert-vector-to-scf` issue is going to take a bit longer and this
code was out of scope anyway.
Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
|
|
semicolon as delimiter for local-linkage variables." (#75835)
Reverts llvm/llvm-project#74008
The compiler-rt test failed because `llvm-dis` was not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884).
Will revert and investigate how to require the proper dependency.
|
|
(#70612)
In kernel language mode, use the user's grid and block sizes directly.
There is no validity check, which means that if the user's values are too
large, the launch will fail, similar to what CUDA and HIP are doing right
now.
|
|
Summary:
The CPU target currently inherits all the libraries from the normal link
job to ensure that it has access to the same environment that the host
does. However, this previously was not respecting argument libraries
that are passed by name rather than `-l` as well as the whole archive
flags. This patch fixes this to allow the CPU linker to correctly pick
up the libraries associated with things like address sanitizers.
Fixes: https://github.com/llvm/llvm-project/issues/75651
|
|
as delimiter for local-linkage variables. (#74008)
Since commit fe05193 (phab D156569), IRPGO names use the format
`[<filepath>;]<linkage-name>` while the prior format is
`[<filepath>:]<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`.
This patch changes `GlobalValue::getGlobalIdentifier` to use the
semicolon.
To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/compiler-rt/include/profile/InstrProfData.inc#L72)
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/ProfileData/InstrProf.cpp#L876-L885)
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1707)
as value profiles, and used to import indirect-call-promotion candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.
* `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
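
The mismatch in step 3 can be reproduced with a small sketch: the same function hashed under the two delimiters yields different MD5 values, so a lookup keyed on one format misses entries keyed on the other. The helpers `irpgo_name` and `name_hash` are hypothetical, the same symbol string is used for both formats for simplicity, and the full MD5 hex digest stands in for the value LLVM actually stores.

```python
import hashlib


def irpgo_name(filepath: str, symbol: str, delimiter: str) -> str:
    """Build '[<filepath><delimiter>]<symbol>'; the prefix is optional."""
    return f"{filepath}{delimiter}{symbol}" if filepath else symbol


def name_hash(name: str) -> str:
    """Simplified stand-in for the MD5-based name hash."""
    return hashlib.md5(name.encode()).hexdigest()


new_hash = name_hash(irpgo_name("a.cpp", "foo", ";"))
old_hash = name_hash(irpgo_name("a.cpp", "foo", ":"))
```

Since `new_hash != old_hash`, a callee annotated with one format cannot be found by an importer that computes the other.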
|
|
This reverts commit 318d5bff0b65aa7d52fc7004d49587416f0fb564.
Has incomplete test updates.
|
|
Summary:
This patch fixes the remaining global constructor in the plugins after
addressing the ones in the JIT interface. This struct was mistakenly
using global constructors as not all the members were being initialized
properly. This was almost certainly being optimized out because it's
trivial, but would still be present in debug builds and prevented us
from compiling with `-Werror=global-constructors`. We will want to do
that once offloading is moved to a runtimes only build.
|
|
This patch enables the following builtins for SME2
svbfmlslb_f32
svbfmlslb_lane_f32
svbfmlslt_f32
svbfmlslt_lane_f32
Patch by: Kerry McLaughlin <kerry.mclaughlin@arm.com>
---------
Co-authored-by: Matthew Devereau <matthew.devereau@arm.com>
|
|
predicate-as-counter (#75200)
The `_s64`/`_u64` part can be omitted now and the name variants do not
include unsigned comparison mnemonics. Both are inferred from
the argument types.
|
|
This is a preparation to start using CMake 3.28 in the CI.
|
|
Adds the following SME2 builtins:
- svzip (x2 & x4)
- svzipq (x2 & x4)
- svuzp (x2 & x4)
- svuzpq (x2 & x4)
See https://github.com/ARM-software/acle/pull/217/files
Patch by David Sherwood <david.sherwood@arm.com>
|
|
A miscompilation issue has been addressed with refined checking.
|
|
We were only folding cases which remained extloads, but DAG.getExtLoad can also handle the cases which don't need to extend at all (we just can't do truncloads).
reduceLoadWidth can handle this for scalar loads, but not for vectors.
Noticed while triaging D152928
|
|
This patch adds a warning that's emitted when a builtin call uses ZA
state but the calling function doesn't provide any.
Patch by David Sherwood <david.sherwood@arm.com>.
|
|
-SAME and -LITERAL do not compose in CHECK commands.
|
|
Add missing < %s to RUN line.
|
|
Some negative tests turn into positive tests, as the differences
between undef and poison propagation allow additional transforms.
|
|
|
|
|
|
This patch fixes the following build error on z/OS `error: unknown type name 'Dl_info'` by adding a guard to check if we have dladdr.
|
|
This is usually handled by demanded elements simplification. However,
as that is not supported for scalable vectors, also handle it
explicitly here.
|
|
(#68373)
This patch makes `num_teams` and `thread_limit` mandatory for bare
kernels, similar to a regular kernel language, where the grid size has to
be set explicitly when launching a kernel.
|
|
Summary:
Libomptarget supports JIT by treating an LLVM-IR file as a regular input
image. The handling here used a global map to keep track of triples once
it was parsed. This was done to save time; however, this created a global
constructor as well as an extra mutex to handle it. This patch removes
the use of this map.
Instead, we simply use the file magic to perform a quick check if the
input image is valid bitcode. If not, we then create a lazy module. This
should be roughly equivalent to the old handling that created an IR symbol
table. Here we can prevent the module from materializing everything but
the single triple metadata we read in later.
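
A file-magic check of this kind can be sketched as below. This is not the libomptarget code and the function name `looks_like_bitcode` is hypothetical. The sketch relies on the raw LLVM bitcode magic (`'B' 'C' 0xC0 0xDE`) and the bitcode wrapper magic (`0x0B17C0DE`, stored little-endian).

```python
RAW_BITCODE_MAGIC = b"BC\xc0\xde"
WRAPPER_MAGIC = (0x0B17C0DE).to_bytes(4, "little")


def looks_like_bitcode(image: bytes) -> bool:
    """Cheap check on the first four bytes; does not parse the module."""
    return image[:4] in (RAW_BITCODE_MAGIC, WRAPPER_MAGIC)
```

Only the header is inspected, so the check stays cheap no matter how large the input image is.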
|