Age | Commit message (Collapse) | Author | Files | Lines |
|
Created using spr 1.3.4
[skip ci]
|
|
|
|
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:96:8:
error: unused variable 'resultElementType' [-Werror,-Wunused-variable]
Type resultElementType =
^
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:122:1:
error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
}
^
2 errors generated.
|
|
This commit adds listener support to the dialect conversion. Similarly
to the greedy pattern rewrite driver, an optional listener can be
specified in the configuration object.
Listeners are notified only if the dialect conversion succeeds. In case
of a failure, where some IR changes are first performed and then rolled
back, no notifications are sent.
Due to the fact that some kinds of rewrite are reflected in the IR
immediately and some in a delayed fashion, there are certain limitations
when attaching a listener; these are documented in `ConversionConfig`.
To summarize, users are always notified about all rewrites that
happened, but the notifications are sent all at once at the very end,
and not interleaved with the actual IR changes.
This change is in preparation for improvements to
`transform.apply_conversion_patterns`, which currently invalidates all
handles. In the future, it can use a listener to update handles
accordingly, similar to `transform.apply_patterns`.
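
The "notify only on success, all at once at the end" behavior can be pictured with a small standalone sketch. This is not MLIR code: `DeferredNotifier`, `record`, `commit`, `rollback`, and `runConversion` are hypothetical names invented for illustration, and the actual limitations are the ones documented in `ConversionConfig`.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a driver that buffers listener notifications.
// None of these names exist in MLIR; they only model the described behavior.
class DeferredNotifier {
public:
  using Listener = std::function<void(const std::string &)>;

  explicit DeferredNotifier(Listener l) : listener(std::move(l)) {}

  // Remember that a rewrite happened, without telling the listener yet.
  void record(std::string event) { pending.push_back(std::move(event)); }

  // Conversion succeeded: deliver every buffered notification, in order.
  void commit() {
    assert(listener && "a listener must be attached before committing");
    for (const std::string &event : pending)
      listener(event);
    pending.clear();
  }

  // Conversion failed and the IR was rolled back: drop everything silently.
  void rollback() { pending.clear(); }

private:
  Listener listener;
  std::vector<std::string> pending;
};

// Runs a fake conversion and returns what the listener observed.
std::vector<std::string> runConversion(bool succeeds) {
  std::vector<std::string> seen;
  DeferredNotifier notifier(
      [&seen](const std::string &event) { seen.push_back(event); });
  notifier.record("op inserted");
  notifier.record("op replaced");
  if (succeeds)
    notifier.commit();
  else
    notifier.rollback();
  return seen;
}
```

In this model, `runConversion(true)` lets the listener see both events after the fact, while `runConversion(false)` lets it see nothing.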
|
|
|
|
During block signature conversion, a new block is inserted and ops are
moved from the old block to the new block. This commit changes the
implementation such that ops are moved in bulk (`splice`) instead of
one-by-one; that's what `splitBlock` is doing.
This also makes it possible to pass the new block argument types
directly to `createBlock` instead of using `addArgument` (which bypasses
the rewriter). This doesn't change anything from a technical point of
view (there is no rewriter API for adding arguments at the moment), but
the implementation reads a bit nicer.
|
|
Allows linalg structured operations to be handled during spmdization and
sharding propagation.
There is only support for projected permutation indexing maps.
|
|
|
|
Link: https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322
|
|
Link: https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322
|
|
When dumping FDEs, `readelf` prints new location values after
`DW_CFA_advance_loc(*)` instructions, which looks quite convenient:
```
> readelf -wf test.o
...
... FDE ... pc=0000000000000030..0000000000000064
DW_CFA_advance_loc: 4 to 0000000000000034
...
DW_CFA_advance_loc: 4 to 0000000000000038
...
```
This patch makes `llvm-dwarfdump` and `llvm-readobj` do the same.
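
The printed location is just the running sum of the advances from the FDE's start address. A minimal sketch of that bookkeeping follows; `formatAdvances` is a hypothetical helper, not a function from `llvm-dwarfdump` or `llvm-readobj`, and it assumes each delta has already been scaled by the CIE's code alignment factor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Formats DW_CFA_advance_loc lines in the style of the readelf output above,
// appending the location reached after each advance.
// Assumption: each delta is already scaled by the code alignment factor.
std::vector<std::string>
formatAdvances(std::uint64_t startPC,
               const std::vector<std::uint64_t> &deltas) {
  std::vector<std::string> lines;
  std::uint64_t pc = startPC;
  for (std::uint64_t delta : deltas) {
    pc += delta; // the new location is the running sum of the deltas
    char buf[96];
    int n = std::snprintf(buf, sizeof(buf),
                          "DW_CFA_advance_loc: %llu to %016llx",
                          static_cast<unsigned long long>(delta),
                          static_cast<unsigned long long>(pc));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    (void)n; // silence unused-variable warnings when asserts are disabled
    lines.emplace_back(buf);
  }
  return lines;
}
```

Calling `formatAdvances(0x30, {4, 4})` reproduces the two `advance_loc` lines shown for the FDE starting at `0x30`.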
|
|
The saved state of the AARCH64_DWARF_PAUTH_RA_STATE register was not
updated, so `llvm-dwarfdump` continued to dump it as `reg34=1` even if
the correct value is `0`:
```
> llvm-dwarfdump -v test.o
...
0000002c 00000024 00000030 FDE cie=00000000 pc=00000030...00000064
Format: DWARF32
DW_CFA_advance_loc: 4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_advance_loc: 4
DW_CFA_def_cfa_offset: +16
DW_CFA_offset: W30 -16
DW_CFA_remember_state:
DW_CFA_advance_loc: 16
DW_CFA_def_cfa_offset: +0
DW_CFA_advance_loc: 4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_restore: W30
DW_CFA_advance_loc: 4
DW_CFA_restore_state:
DW_CFA_advance_loc: 12
DW_CFA_def_cfa_offset: +0
DW_CFA_advance_loc: 4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_restore: W30
DW_CFA_nop:
0x30: CFA=WSP
0x34: CFA=WSP: reg34=1
0x38: CFA=WSP+16: W30=[CFA-16], reg34=1
0x48: CFA=WSP: W30=[CFA-16], reg34=1
0x4c: CFA=WSP: reg34=1 <--- should be '=0'
0x50: CFA=WSP+16: W30=[CFA-16], reg34=1
0x5c: CFA=WSP: W30=[CFA-16], reg34=1
0x60: CFA=WSP: reg34=1 <--- should be '=0'
```
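
The expected `reg34` values can be checked by hand with a simplified model of the CFI program. The sketch below is an assumption-laden illustration, not `llvm-dwarfdump` code: `Op`, `Inst`, `raStatePerRow`, and `exampleProgram` are hypothetical names, and only the opcodes that affect the RA signing state are modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simplified CFI opcodes; only what is needed to track the RA signing state.
enum class Op { AdvanceLoc, NegateRAState, RememberState, RestoreState };

struct Inst {
  Op op;
  std::uint64_t arg; // byte delta for AdvanceLoc, unused otherwise
};

// Returns the RA signing state (0 or 1) in effect for each row address.
std::map<std::uint64_t, int> raStatePerRow(std::uint64_t startPC,
                                           const std::vector<Inst> &program) {
  std::map<std::uint64_t, int> rows;
  std::vector<int> saved; // stack used by remember_state / restore_state
  std::uint64_t pc = startPC;
  int raState = 0;
  for (const Inst &inst : program) {
    switch (inst.op) {
    case Op::AdvanceLoc:
      rows[pc] = raState; // finish the current row before moving on
      pc += inst.arg;
      break;
    case Op::NegateRAState:
      raState ^= 1; // toggles between 0 and 1
      break;
    case Op::RememberState:
      saved.push_back(raState); // the RA state is part of the saved row
      break;
    case Op::RestoreState:
      assert(!saved.empty() && "restore_state without remember_state");
      raState = saved.back();
      saved.pop_back();
      break;
    }
  }
  rows[pc] = raState; // final row
  return rows;
}

// The FDE from the dump above, reduced to the opcodes that matter here.
std::vector<Inst> exampleProgram() {
  return {{Op::AdvanceLoc, 4},  {Op::NegateRAState, 0}, {Op::AdvanceLoc, 4},
          {Op::RememberState, 0}, {Op::AdvanceLoc, 16}, {Op::AdvanceLoc, 4},
          {Op::NegateRAState, 0}, {Op::AdvanceLoc, 4},  {Op::RestoreState, 0},
          {Op::AdvanceLoc, 12}, {Op::AdvanceLoc, 4},  {Op::NegateRAState, 0}};
}
```

Under this model, `raStatePerRow(0x30, exampleProgram())` yields state 0 for the rows at `0x4c` and `0x60`, matching the two lines marked "should be '=0'" above.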
|
|
loops (#83939)
This PR adds OMP runtime support for more efficient partitioning of
certain types of collapsed loops that can be used by compilers that
support loop collapsing (e.g. MSVC) to achieve more optimal thread load
balancing.
In particular, this PR addresses double nested upper and lower isosceles
triangular loops of the following types
1. lower triangular 'less_than'
for (int i=0; i<N; i++)
for (int j=0; j<i; j++)
2. lower triangular 'less_than_equal'
for (int i=0; i<N; i++)
for (int j=0; j<=i; j++)
3. upper triangular
for (int i=0; i<N; i++)
for (int j=i; j<N; j++)
Includes tests for the three supported loop types.
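
To see why collapsing helps load balancing, consider the lower triangular 'less_than' nest: row `i` contributes `i` iterations, so splitting only the outer loop gives later threads far more work. A rough sketch of the alternative follows, flattening the nest into one index space and cutting that evenly. It is not the runtime's actual algorithm; `lowerTriangularCount`, `unflatten`, and `chunk` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Total iterations of: for (i = 0; i < n; i++) for (j = 0; j < i; j++)
std::uint64_t lowerTriangularCount(std::uint64_t n) {
  return n * (n - 1) / 2;
}

// Maps a flat index k back to the (i, j) pair of the loop nest.
// Row i holds i iterations, so the rows before i hold i*(i-1)/2 in total.
std::pair<std::uint64_t, std::uint64_t> unflatten(std::uint64_t k) {
  std::uint64_t i = 1;
  while ((i + 1) * i / 2 <= k) // linear search keeps the sketch exact
    ++i;
  return {i, k - i * (i - 1) / 2};
}

// Half-open range of flat indices assigned to thread `tid` of `nthreads`.
std::pair<std::uint64_t, std::uint64_t>
chunk(std::uint64_t total, unsigned tid, unsigned nthreads) {
  assert(nthreads > 0 && tid < nthreads);
  return {total * tid / nthreads, total * (tid + 1) / nthreads};
}
```

For `N = 5` there are 10 iterations in total; with 4 threads the chunks hold 2, 3, 2, and 3 iterations, whereas splitting only the outer loop would assign whole rows of 0 to 4 iterations each.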
---------
Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
|
|
909ab0e0d1903ad2329ca9fdf248d21330f9437f. NFC
|
|
|
|
|
|
|
|
in tryFoldSelectOfConstants()
|
|
Following-up on GH-76761.
|
|
llvm-project/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp:650:13:
error: private field 'OwnerFn' is not used [-Werror,-Wunused-private-field]
Function *OwnerFn = nullptr;
^
1 error generated.
|
|
|
|
We should moreElements <3 x s1> to <4 x s1> before we try to widen the element,
otherwise we end up with a <3 x s21> nonsense type.
|
|
We cannot use OtherPredicates or SubtargetPredicate because they
should be copied from pseudo to real, and we should not override them.
|
|
libc/src/__support/CPP/bit.h and cpp:: are meant to mirror std::. Fix the
TODO.
|
|
- [libc] finish documenting c23 additions
- sort according to appearance in Annex B and section 7
|
|
* Include whether functions are inlinable as they impact whether to add
them into the tbd file and for future verification.
* Fix how clang arguments got passed along, previously spacing was
passed along to CC1 causing search path inputs to look non-existent.
|
|
Handle out-of-bounds reading errors correctly in LinuxKernelRewriter.
|
|
This PR follows our previous
[RFC](https://discourse.llvm.org/t/rfc-add-xegpu-dialect-for-intel-gpus/75723)
to add the XeGPU dialect definition for Intel GPUs. It contains dialect,
type, attribute, and operator definitions, as well as test cases for
semantic checks. The lowering and optimization passes will be submitted
separately.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
I don't know why the premerge setup didn't fail on this, but many
buildbots are broken right now.
|
|
* Create a libcall for s64 type for 32 bit targets.
* Fix a bug in REM selection: SUBREG_TO_REG is not intended to produce a
value from super registers.
* Replace selector tests by end-to-end tests. Other passes
check the selected MIR better.
|
|
|
|
When Apple released its new linker, it had a subtle bug that caused
LLDB's TLS tests to fail. Unfortunately this means that TLS tests are
not going to work on machines that have affected versions of the linker,
so we should annotate the tests so that they only work when we are
confident the linker has the required fix.
I'm not completely satisfied with this implementation. That being said,
I believe that adding support for linker versions in general is a
non-trivial change that would require far more thought. There are a few
challenges involved:
- LLDB's testing infra takes an argument to change the compiler, but
there's no way to switch out the linker.
- There's no standard way to ask a compiler what linker it will use.
- There's no standard way to ask a linker what its version is. Many
platforms have the same name for their linker (ld).
- Some platforms automatically switch out the linker underneath you. We
do this for Windows tests (where we use LLD no matter what).
Given that this is affecting the tests on our CI, I think this is an
acceptable solution in the interim.
|
|
Select blocks poison, but AND/OR do not. We need to insert a freeze
to block poison propagation.
This creates suboptimal codegen which I will try to fix with other
patches. I'm prioritizing the correctness fix since we have 2 bug reports.
Fixes #84200 and #84350
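
The difference between select and AND/OR can be modeled in a few lines. This is a toy model under stated assumptions, not LLVM code: poison is represented by `std::nullopt`, the names `Bit`, `selectBit`, `bitAnd`, and `freeze` are made up for the sketch, and `freeze` arbitrarily picks `false` where the real operation may pick any fixed value. It requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Models an i1 value that may be poison: std::nullopt stands for poison.
using Bit = std::optional<bool>;

// select(c, t, f): only the chosen arm is observed, so poison in the other
// arm is blocked. A poison condition still poisons the result.
Bit selectBit(Bit c, Bit t, Bit f) {
  if (!c)
    return std::nullopt;
  return *c ? t : f;
}

// and(a, b): poison in either operand poisons the result.
Bit bitAnd(Bit a, Bit b) {
  if (!a || !b)
    return std::nullopt;
  return *a && *b;
}

// freeze(x): replaces poison with a fixed value (false in this sketch).
Bit freeze(Bit x) {
  Bit frozen = x ? x : Bit(false);
  assert(frozen.has_value()); // freeze never returns poison
  return frozen;
}
```

In this model, `selectBit(false, poison, false)` is `false`, but the naive rewrite `bitAnd(false, poison)` is poison; `bitAnd(false, freeze(poison))` is `false` again, which is why the freeze is needed.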
|
|
|
|
Summary:
The old driver embeds PTX in rdc-mode and so does the `nvcc` compiler.
The new driver currently does not do this, so we should keep it
consistent in this case. This simply requires adding the assembler
output as an input to the offloading action that gets fed to fatbin.
|
|
I am trying to bring up a MacOS buildbot targeting x86 and noticed that
two Dexter tests were failing,
cross-project-tests/debuginfo-tests/llgdb-tests/static-member.cpp and
cross-project-tests/debuginfo-tests/llgdb-tests/static-member-2.cpp.
Looking in the history for these tests, they were XFAILed for Apple
Silicon in 9c46606 and are failing similarly on x86 for me, so we should extend
the XFAIL to all MacOS architectures.
|
|
If notifyEmitted encounters a failure (either because some plugin returned one,
or because the ResourceTracker was defunct) then we need to deallocate the
FinalizedAlloc manually.
No testcase yet: This requires a concurrent setup -- we'll need to build some
infrastructure to coordinate links and deliberately injected failures in order
to reliably test this.
|
|
We're going from a single use to two independent uses, so we need these two
to see consistent values for undef. As an example, consider x = 0x2 when
y = 0b00u1. If the sub use picks 0b0001 and the cmp use picks 0b0011,
that would be incorrect.
|
|
|
|
(#84174)
With out-of-process execution the target triple can be different from
the one on the host. We need an interface to configure it.
Relanding this with cleanup-fixes in the unittest.
|
|
This is useful to attach generators to JITDylibs or inject initial
symbol definitions.
|
|
Allows a processor to define different latencies for the two operations.
|
|
Add test showing that tbaa.struct is generated when using TSan with
relaxed-aliasing.
|
|
|
|
Admittedly a bit awkward, `visionos` is the correct and accepted
spelling for annotating availability for xrOS target triples. This patch
detects errors and handles cases when `xros` is mistakenly passed.
In addition, add APIs for introduced/deprecated/obsoleted versioning in
DarwinSDKInfo mappings.
|
|
Read .altinstructions and annotate instructions that have alternative
sequences with "AltInst" annotation. Note that some instructions may
have more than one alternative, in which case they will have multiple
annotations in the form "AltInst", "AltInst2", "AltInst3", etc.
|
|
|
|
Equivalent to the changes made in https://github.com/llvm/llvm-project/pull/83941,
except to support shell tests.
|
|
This reverts commit 4ce52e2d576937fe930294cae883a0daa17eeced to fix
issues detected by https://lab.llvm.org/buildbot/#/builders/74/builds/26470/steps/12/logs/stdio.
|
|
- Factor out common setup code.
- Split the ProgressManager test into separate tests as they test
separate things.
- Fix usage of EXPECT (which continues on failure) and ASSERT (which
halts on failure). We must use the latter when calling GetEvent as
otherwise we'll try to dereference a null EventSP.
|