Age | Commit message (Collapse) | Author | Files | Lines |
|
This has been flaky for a while, for example
https://lab.llvm.org/buildbot/#/builders/96/builds/50350
```
Command Output (stdout):
--
lldb version 18.0.0git (https://github.com/llvm/llvm-project.git revision 3974d89bde66a2ec61261b969b51993da81205c7)
clang revision 3974d89bde66a2ec61261b969b51993da81205c7
llvm revision 3974d89bde66a2ec61261b969b51993da81205c7
"can't evaluate expressions when the process is running."
```
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
#0 0x0000ffffa46191a0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x529a1a0)
#1 0x0000ffffa4617144 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x5298144)
#2 0x0000ffffa46198d0 SignalHandler(int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x529a8d0)
#3 0x0000ffffab25b7dc (linux-vdso.so.1+0x7dc)
#4 0x0000ffffab13d050 /build/glibc-Q8DG8B/glibc-2.31/string/../sysdeps/aarch64/multiarch/memcpy_advsimd.S:92:0
#5 0x0000ffffa446f420 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::PrivateSetRegisterValue(unsigned int, llvm::ArrayRef<unsigned char>) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f0420)
#6 0x0000ffffa446f7b8 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::GetPrimordialRegister(lldb_private::RegisterInfo const*, lldb_private::process_gdb_remote::GDBRemoteCommunicationClient&) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f07b8)
#7 0x0000ffffa446f308 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::ReadRegisterBytes(lldb_private::RegisterInfo const*) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f0308)
#8 0x0000ffffa446ec1c lldb_private::process_gdb_remote::GDBRemoteRegisterContext::ReadRegister(lldb_private::RegisterInfo const*, lldb_private::RegisterValue&) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50efc1c)
#9 0x0000ffffa412eaa4 lldb_private::RegisterContext::ReadRegisterAsUnsigned(lldb_private::RegisterInfo const*, unsigned long) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4dafaa4)
#10 0x0000ffffa420861c ReadLinuxProcessAddressMask(std::shared_ptr<lldb_private::Process>, llvm::StringRef) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4e8961c)
#11 0x0000ffffa4208430 ABISysV_arm64::FixCodeAddress(unsigned long) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4e89430)
```
Judging by the backtrace something is trying to read the pointer authentication address/code mask
registers. This explains why I've not seen this issue locally, as the buildbot runs on Graviton
3 with has the pointer authentication extension.
I will try to reproduce, fix and re-enable the test.
|
|
Generalize checkAndSecondOpImpliedByFirst to also check if the second
operand implies the first.
|
|
std::midpoint is specified by having a pointer overload in
[numeric.ops.midpoint].
With the way the pointer overload is specified, users can expect that
calling
std::midpoint as `std::midpoint<T>(a, b)` should work, but it didn't in
libc++
due to the way the pointer overload was specified.
Fixes #67046
|
|
Tests for vector.outerproduct for scalable vectors from
"vector-scalable-outerproduct.mlir" are moved to:
* ops.mlir and invalid.mlir.
These files are effectively used to document what Ops are supported and
That's basically what the original file was testing (but specifically
for scalable vectors).
|
|
Needed for switch to lookup table optimization.
|
|
|
|
Return ConstantPoolSDNode instead of const ConstantPoolSDNode - doesn't affect the accessors at all and makes it easier to use result in calls expecting a SDNode.
|
|
According to the latest update of the ISA
https://developer.arm.com/documentation/ddi0602/2023-09/?lang=en all of
the affected instruction encodings now require
(FEAT_SVE2 or FEAT_SME2) and FEAT_SVE_B16B16
|
|
For instructions that don't map to a mnemonic string, the implementation
of MCInstPrinter::getMnemonic would return an invalid pointer due to the
result of the calculation of the instruction's position in the `AsmStrs`
table. This patch fixes the issue by ensuring those cases return a
`nullptr` value instead.
Fixes #74177.
|
|
Return poison instead of undef for non-demanded lanes in the AMDGPU
demanded element simplification hook.
Also bail out of dmask is 0, as this case has special semantics:
> If DMASK==0, the TA overrides DMASK=1 and puts zeros in VGPR followed by
> LWE status if exists. TFE status is not generated since the fetch is dropped.
|
|
|
|
Header gurads are not detected until we hit EOF. Make sure we postpone
any such detection until then.
|
|
So we can see it failing and get the extra logged information.
|
|
And remove the workaround I was trying, as this logging may prove what
the actual issue is.
Which I think is that the thread plan map in Process is cleared before
the threads are destroyed. So Thread::ShouldStop could be getting
the current plan, then the plan map is cleared, then Thread::ShouldStop
is deciding based on that plan to pop a plan from the now empty stack.
|
|
As currently defined `MAX_EXPONENT` actually corresponds to the biased
exponent (i.e. an unsigned value).
|
|
Be consistent with complex and character rewrite so that the pass can be
run selectively.
|
|
specializations. (#75795)
This fixes tooling (clangd, include-cleaner) bugs where we miss
functionalities on concept AST nodes.
|
|
Cleanup test sleef-calls-aarch64.ll:
- make the util update script's regex more clear
- eliminate scalar epilogues in tests
|
|
This patch fixes build attributes w/r to Zcmt extension dependency on
Zicsr.
|
|
If this works it'll give me a clue for the underlying issue.
|
|
MCTargetExpr (#75693)
This fixes #73109.
In instruction `addl %eax %rax`, because there is a missing comma in the
middle of two registers, the asm parser will treat it as a binary
expression.
```
%rax % rax --> register mod identifier
```
However, In `MCExpr::evaluateAsRelocatableImpl`, it only checks the left
side of the expression. This patch ensures the right side will also be
checked.
|
|
|
|
|
|
See more details: https://lab.llvm.org/buildbot/#/builders/119/builds/16346
Behalf of @vgvassilev
|
|
|
|
|
|
performFP_TO_INT_SATCombine. (#76014)
performFP_TO_INT_SATCombine could also serve pattern (fp_to_int_sat
(frint X)).
|
|
PromoteAllocaToVector (#68744)
Attached test will cause crash without this change.
We should not remove isAssumeLikeIntrinsic instruction if it is used by
other instruction.
|
|
|
|
toolchain (#73765) (#75890)
Extend the multi-lib re-use selection mechanism for RISC-V.
This funciton will try to re-use multi-lib if they are compatible.
Definition of compatible:
- ABI must be the same.
- multi-lib is a subset of current arch, e.g. multi-lib=march=rv32im
is a subset of march=rv32imc.
- march that contains atomic extension can't reuse multi-lib that
doesn't has atomic, vice versa. e.g. multi-lib=march=rv32im and
march=rv32ima are not compatible, because software and hardware
atomic operation can't work together correctly.
|
|
function template (#75913)
Fixes #75732
|
|
Infer the side effects of `gpu.launch` from its body.
|
|
Notable things in this commit:
* refactors `__indirect_binary_left_foldable`, making it slightly
different (but equivalent) to _`indirect-binary-left-foldable`_, which
improves readability (a [patch to the Working Paper][patch] was made)
* omits `__cpo` namespace, since it is not required for implementing
niebloids (a cleanup should happen in 2024)
* puts tests ensuring invocable robustness and dangling correctness
inside the correctness testing to ensure that the algorithms' results
are still correct
[patch]: https://github.com/cplusplus/draft/pull/6734
|
|
R16-R31 was added into GPRs in
https://github.com/llvm/llvm-project/pull/70958,
This patch supports the encoding/decoding for promoted SHA instruction
in EVEX space.
RFC:
https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
|
|
This change makes block/region walkers consistent with operation
walkers. An operation walk enumerates the current operation. Similarly,
block/region walks should enumerate the current block/region.
Example:
```
// Current behavior:
op1->walk([](Operation *op2) { /* op1 is enumerated */ });
block1->walk([](Block *block2) { /* block1 is NOT enumerated */ });
region1->walk([](Block *block) { /* blocks of region1 are NOT enumerated */ });
region1->walk([](Region *region2) { /* region1 is NOT enumerated });
// New behavior:
op1->walk([](Operation *op2) { /* op1 is enumerated */ });
block1->walk([](Block *block2) { /* block1 IS enumerated */ });
region1->walk([](Block *block) { /* blocks of region1 ARE enumerated */ });
region1->walk([](Region *region2) { /* region1 IS enumerated });
```
|
|
In LLVMContext::diagnose, set `HasErrors` for `DS_Error` so that all
derived `DiagnosticHandler` have correct `HasErrors` information.
An alternative is to set `HasErrors` in
`DiagnosticHandler::handleDiagnostics`, but all derived
`handleDiagnostics` would have to call the base function.
|
|
libunwind uses C-style and low-level C++, so the language standard
doesn't matter that much, but bumping to C++17 aligns with the rest of
LLVM and enables some features that would cause pedantic warnings in
C++11 (e.g. -Wc++17-attribute-extensions for [[fallthrough]]/
[[nodiscard]]/[[maybe_unused]]). (Contributors might use these features
unaware of the pedantic warnings).
Suggested-by: Christopher Di Bella <cjdb@google.com>
|
|
This commit adds extra assertions to `OperationFolder` and `OpBuilder`
to ensure that the types of the folded SSA values match with the result
types of the op. There used to be checks that discard the folded results
if the types do not match. This commit makes these checks stricter and
turns them into assertions.
Discarding folded results with the wrong type (without failing
explicitly) can hide bugs in op folders. Two such bugs became apparent
in MLIR (and some more in downstream projects) and are fixed with this
change.
Note: The existing type checks were introduced in
https://reviews.llvm.org/D95991.
Migration guide: If you see failing assertions (`folder produced value
of incorrect type`; make sure to run with assertions enabled!), run with
`-debug` or dump the operation right before the failing assertion. This
will point you to the op that has the broken folder. A common mistake is
a mismatch between static/dynamic dimensions (e.g., input has a static
dimension but folded result has a dynamic dimension).
|
|
This is to avoid confusion when dealing with reduction/combining kinds.
For example, see a recent PR comment:
https://github.com/llvm/llvm-project/pull/75846#discussion_r1430722175.
Previously, they were picked to mostly mirror the names of the llvm
vector reduction intrinsics:
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic. In
isolation, it was not clear if `<maxf>` has `arith.maxnumf` or
`arith.maximumf` semantics. The new reduction kind names map 1:1 to
arith ops, which makes it easier to tell/look up their semantics.
Because both the vector and the gpu dialect depend on the arith dialect,
it's more natural to align names with those in arith than with the
lowering to llvm intrinsics.
Issue: https://github.com/llvm/llvm-project/issues/72354
|
|
The test failure is about failed to build instrumented binary on ppc
(https://lab.llvm.org/buildbot/#/builders/18/builds/13228). Not sure how
to fix this for now. Mark the test unsupported on ppc.
```
RUN: at line 46: /home/buildbots/ppc64be-sanitizer/sanitizer-ppc64be/build/build_gcc/./bin/clang --driver-mode=g++ -m64 -ldl -fprofile-generate -fuse-ld=lld -O2 lib.cpp main.cpp -o main
+ /home/buildbots/ppc64be-sanitizer/sanitizer-ppc64be/build/build_gcc/./bin/clang --driver-mode=g++ -m64 -ldl -fprofile-generate -fuse-ld=lld -O2 lib.cpp main.cpp -o main
ld.lld: error: /lib/../lib64/Scrt1.o: ABI version 1 is not supported
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
|
|
is generated on 64-bit systems (#76005)
|
|
compiler-rt test to three tested platforms. (#76001)
- The IR test failed to import indirect callees on big-endian systems.
The raw profiles are generated on little-endian systems. Going to
require little-endian.
- Limit the compiler-rt test to three tested platforms.
|
|
Refer to RISCV [1], LoongArch also need delayed decision for ADD/SUB
relocations. In handleAddSubRelocations, just return directly if SecA !=
SecB, handleFixup usually will finish the the rest of creating PCRel
relocations works. Otherwise we emit relocs depends on whether
relaxation is enabled. If not, we return true and avoid record ADD/SUB
relocations.
Now the two symbols separated by alignment directive will return without
folding symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect when relaxation is enabled.
[1] https://reviews.llvm.org/D155357
|
|
- After #73887, spirv.Image cannot has a zeroinitializer, even though
it's only used in metadata to pass down the type info to the backend.
Instead of creating zeros, create undef ones of that target-specific
types.
|
|
(#75900)
Adding a test case that this warns even for completely covered switches.
|
|
This commit makes reductions part of the terminator. Instead of
`scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops.
`scf.reduce` may contain an arbitrary number of reductions, with one
region per reduction.
Example:
```mlir
%init = arith.constant 0.0 : f32
%r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init)
-> f32, f32 {
%elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32>
%elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32>
scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) {
^bb0(%lhs : f32, %rhs: f32):
%res = arith.addf %lhs, %rhs : f32
scf.reduce.return %res : f32
}, {
^bb0(%lhs : f32, %rhs: f32):
%res = arith.mulf %lhs, %rhs : f32
scf.reduce.return %res : f32
}
}
```
`scf.reduce` operations can no longer be interleaved with other ops in
the body of `scf.parallel`. This simplifies the op and makes it possible
to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was
not possible before because the op was not a terminator, causing the op
to be DCE'd.)
|
|
Summary:
This patch reorganizes a lot of the code used to check for compatibility
with the current environment. The main bulk of this patch involves
moving from using a separate `__tgt_image_info` struct (which just
contains a string for the architecture) to instead simply checking this
information from the ELF directly. Checking information in the ELF is
very inexpensive as creating an ELF file is simply writing a base
pointer.
The main desire to do this was to reorganize everything into the ELF
image. We can then do the majority of these checks without first
initializing the plugin. A future patch will move the first ELF checks
to happen without initializing the plugin so we no longer need to
initialize and plugins that don't have needed images.
This patch also adds a lot more sanity checks for whether or not the ELF
is actually compatible. Such as if the images have a valid ABI, 64-bit
width, executable, etc.
|
|
Summary:
Recently we added support for detecting the CUDA processor with the ELF
flags. This allows us to get a string representation of it in other
code. This will be used by the offloading runtime.
|
|
platform (#75905)
`cmp; bc; isync` is more performant than `lwsync` theoretically.
64-bit platform already features it, now implement it for 32-bit
platform.
|
|
|