Age  Commit message  Author  Files  Lines
2024-03-08[𝘀𝗽𝗿] changes introduced through rebaseusers/fmayer/spr/main.nfc-hwasan-also-be-more-consistent-when-getting-pointer-typesFlorian Mayer63-225/+2007
Created using spr 1.3.4 [skip ci]
2024-03-07[NFC] [hwasan] use for_each and move commentFlorian Mayer1-6/+5
2024-03-08[mlir] Fix build failure in MeshShardingInterfaceImpl.cpp (NFC)Jie Fu1-1/+2
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:96:8: error: unused variable 'resultElementType' [-Werror,-Wunused-variable]
  Type resultElementType =
       ^
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:122:1: error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
}
^
2 errors generated.
2024-03-08[mlir][Transforms] Add listener support to dialect conversion (#83425)Matthias Springer4-55/+302
This commit adds listener support to the dialect conversion. Similarly to the greedy pattern rewrite driver, an optional listener can be specified in the configuration object. Listeners are notified only if the dialect conversion succeeds. In case of a failure, where some IR changes are first performed and then rolled back, no notifications are sent. Because some kinds of rewrite are reflected in the IR immediately and some in a delayed fashion, there are certain limitations when attaching a listener; these are documented in `ConversionConfig`. To summarize, users are always notified about all rewrites that happened, but the notifications are sent all at once at the very end, and not interleaved with the actual IR changes. This change is in preparation for improvements to `transform.apply_conversion_patterns`, which currently invalidates all handles. In the future, it can use a listener to update handles accordingly, similar to `transform.apply_patterns`.
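The "notify only on success, all at once at the end" behavior described above can be sketched with a toy model. This is not the MLIR API: `DeferredNotifier`, `record`, `commit` and `rollback` are hypothetical names chosen for illustration, and events are plain strings.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model only; none of these names exist in MLIR.
class DeferredNotifier {
public:
  using Callback = std::function<void(const std::string &)>;

  explicit DeferredNotifier(Callback cb) : callback(std::move(cb)) {}

  // Remember a rewrite event without telling the listener yet.
  void record(std::string event) { pending.push_back(std::move(event)); }

  // The conversion succeeded: deliver every recorded event, in order.
  void commit() {
    for (const std::string &event : pending)
      callback(event);
    pending.clear();
  }

  // The conversion failed and was rolled back: drop all recorded events.
  void rollback() { pending.clear(); }

private:
  Callback callback;
  std::vector<std::string> pending;
};
```

In this model a listener never observes events from a conversion that was rolled back, and it observes the events of a successful conversion only after `commit()` is called.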
2024-03-07[PPC] precommit cases for issue 74915Chen Zheng1-0/+54
2024-03-08[mlir][Transforms][NFC] Make signature conversion more efficient (#83922)Matthias Springer1-12/+15
During block signature conversion, a new block is inserted and ops are moved from the old block to the new block. This commit changes the implementation such that ops are moved in bulk (`splice`) instead of one-by-one; that's what `splitBlock` is doing. This also makes it possible to pass the new block argument types directly to `createBlock` instead of using `addArgument` (which bypasses the rewriter). This doesn't change anything from a technical point of view (there is no rewriter API for adding arguments at the moment), but the implementation reads a bit nicer.
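As an analogy for the one-by-one versus bulk move, here is a minimal sketch using `std::list`, whose `splice` relinks a whole range without touching individual elements. The `Block` alias and both function names are assumptions made for this illustration; MLIR blocks use their own intrusive list type.

```cpp
#include <list>
#include <string>
#include <utility>

// Stand-in for a block's operation list; purely illustrative.
using Block = std::list<std::string>;

// Moves elements individually: one push and one pop per operation.
void moveOneByOne(Block &from, Block &to) {
  while (!from.empty()) {
    to.push_back(std::move(from.front()));
    from.pop_front();
  }
}

// Moves every element in a single splice call.
void moveInBulk(Block &from, Block &to) {
  to.splice(to.end(), from);
}
```

Both functions leave `from` empty and preserve element order; the bulk version expresses the move as one list operation.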
2024-03-07[mlir] Implement Mesh's ShardingInterface for Linalg ops (#82284)Boian Petkantchin19-19/+754
Allows linalg structured operations to be handled during spmdization and sharding propagation. There is only support for projected permutation indexing maps.
2024-03-07[NFC] [hwasan] consistent naming for cl::optFlorian Mayer1-10/+10
2024-03-07[Instrumentation] Convert tests to opaque pointers (NFC)Fangrui Song5-17/+17
Link: https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322
2024-03-07[Object] Convert tests to opaque pointers (NFC)Fangrui Song8-25/+25
Link: https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322
2024-03-08[DWARF] Dump an updated location for DW_CFA_advance_loc* (#84274)Igor Kudrin5-31/+41
When dumping FDEs, `readelf` prints new location values after `DW_CFA_advance_loc(*)` instructions, which looks quite convenient:
```
> readelf -wf test.o
...
... FDE ... pc=0000000000000030..0000000000000064
  DW_CFA_advance_loc: 4 to 0000000000000034
  ...
  DW_CFA_advance_loc: 4 to 0000000000000038
  ...
```
This patch makes `llvm-dwarfdump` and `llvm-readobj` do the same.
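A minimal sketch of how such a line could be produced, assuming the dumper tracks a running location and that `delta` is already in bytes. `formatAdvanceLoc` is a hypothetical helper written for this note, not a function from `llvm-dwarfdump` or `readelf`.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: advances `loc` by `delta` bytes and formats the
// instruction the way the readelf output above shows it.
std::string formatAdvanceLoc(std::uint64_t &loc, std::uint64_t delta) {
  loc += delta;
  char buf[64];
  std::snprintf(buf, sizeof(buf),
                "DW_CFA_advance_loc: %" PRIu64 " to %016" PRIx64, delta, loc);
  return buf;
}
```

Starting from `pc=0x30`, two calls with `delta = 4` would yield the locations `0x34` and `0x38` from the dump above.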
2024-03-08[llvm-dwarfdump] Fix parsing DW_CFA_AARCH64_negate_ra_state (#84128)Igor Kudrin2-0/+21
The saved state of the AARCH64_DWARF_PAUTH_RA_STATE register was not updated, so `llvm-dwarfdump` continued to dump it as `reg34=1` even if the correct value is `0`:
```
> llvm-dwarfdump -v test.o
...
0000002c 00000024 00000030 FDE cie=00000000 pc=00000030...00000064
  Format: DWARF32
  DW_CFA_advance_loc: 4
  DW_CFA_AARCH64_negate_ra_state:
  DW_CFA_advance_loc: 4
  DW_CFA_def_cfa_offset: +16
  DW_CFA_offset: W30 -16
  DW_CFA_remember_state:
  DW_CFA_advance_loc: 16
  DW_CFA_def_cfa_offset: +0
  DW_CFA_advance_loc: 4
  DW_CFA_AARCH64_negate_ra_state:
  DW_CFA_restore: W30
  DW_CFA_advance_loc: 4
  DW_CFA_restore_state:
  DW_CFA_advance_loc: 12
  DW_CFA_def_cfa_offset: +0
  DW_CFA_advance_loc: 4
  DW_CFA_AARCH64_negate_ra_state:
  DW_CFA_restore: W30
  DW_CFA_nop:

  0x30: CFA=WSP
  0x34: CFA=WSP: reg34=1
  0x38: CFA=WSP+16: W30=[CFA-16], reg34=1
  0x48: CFA=WSP: W30=[CFA-16], reg34=1
  0x4c: CFA=WSP: reg34=1    <--- should be '=0'
  0x50: CFA=WSP+16: W30=[CFA-16], reg34=1
  0x5c: CFA=WSP: W30=[CFA-16], reg34=1
  0x60: CFA=WSP: reg34=1    <--- should be '=0'
```
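The expected `reg34` values can be checked with a small simulation. This is a simplified model written for illustration, not the `llvm-dwarfdump` implementation: it tracks only the RA-state bit, ignores every other CFA instruction, and all type and function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Only the instructions that affect the PC or the RA-state bit are modeled.
enum class Op { AdvanceLoc, NegateRaState, RememberState, RestoreState };
struct Insn { Op op; std::uint64_t operand; };
struct Row { std::uint64_t pc; unsigned raState; };

// Emits one row per location, as in the unwind table of the dump above.
std::vector<Row> unwindRows(std::uint64_t startPc, const std::vector<Insn> &prog) {
  std::vector<Row> rows;
  std::vector<unsigned> saved;  // stack used by remember/restore_state
  std::uint64_t pc = startPc;
  unsigned ra = 0;              // assumed initial value of reg34
  for (const Insn &insn : prog) {
    switch (insn.op) {
    case Op::AdvanceLoc:
      rows.push_back({pc, ra}); // close the row for the current location
      pc += insn.operand;
      break;
    case Op::NegateRaState:
      ra ^= 1u;                 // the update the commit message says was missing
      break;
    case Op::RememberState:
      saved.push_back(ra);
      break;
    case Op::RestoreState:
      if (!saved.empty()) { ra = saved.back(); saved.pop_back(); }
      break;
    }
  }
  rows.push_back({pc, ra});
  return rows;
}

// The instruction stream from the dump, reduced to the modeled opcodes.
std::vector<Insn> exampleProgram() {
  return {{Op::AdvanceLoc, 4},  {Op::NegateRaState, 0}, {Op::AdvanceLoc, 4},
          {Op::RememberState, 0}, {Op::AdvanceLoc, 16}, {Op::AdvanceLoc, 4},
          {Op::NegateRaState, 0}, {Op::AdvanceLoc, 4},  {Op::RestoreState, 0},
          {Op::AdvanceLoc, 12}, {Op::AdvanceLoc, 4},  {Op::NegateRaState, 0}};
}
```

Under this model the rows at `0x4c` and `0x60` carry an RA state of `0`, which is what the two `should be '=0'` markers in the dump ask for.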
2024-03-07[OpenMP] runtime support for efficient partitioning of collapsed triangular loops (#83939)vadikp-intel5-2/+692
This PR adds OMP runtime support for more efficient partitioning of certain types of collapsed loops. Compilers that support loop collapsing (e.g. MSVC) can use it to achieve better thread load balancing. In particular, this PR addresses doubly nested upper and lower isosceles triangular loops of the following types:
1. lower triangular 'less_than': for (int i=0; i<N; i++) for (int j=0; j<i; j++)
2. lower triangular 'less_than_equal': for (int i=0; i<N; i++) for (int j=0; j<=i; j++)
3. upper triangular: for (int i=0; i<N; i++) for (int j=i; j<N; j++)
Includes tests for the three supported loop types.
---------
Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
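One way to balance a collapsed lower triangular 'less_than' loop is to number its iterations 0..N(N-1)/2-1, give each thread a contiguous slice of that range, and map a flat index back to `(i, j)`. The sketch below illustrates that idea only; it is not the runtime's actual algorithm, and every name in it is hypothetical.

```cpp
#include <cmath>
#include <cstdint>

struct Iteration { std::uint64_t i, j; };

// Iteration count of: for (i = 0; i < n; i++) for (j = 0; j < i; j++)
std::uint64_t lowerTriangleSize(std::uint64_t n) { return n * (n - 1) / 2; }

// Maps flat index k to (i, j): row i starts at flat index i*(i-1)/2.
Iteration unflattenLowerTriangle(std::uint64_t k) {
  // Floating-point estimate of the row, corrected with integer arithmetic.
  std::uint64_t i = static_cast<std::uint64_t>(
      (1.0 + std::sqrt(1.0 + 8.0 * static_cast<double>(k))) / 2.0);
  while (i * (i - 1) / 2 > k) --i;
  while ((i + 1) * i / 2 <= k) ++i;
  return {i, k - i * (i - 1) / 2};
}

// First flat index owned by thread `tid`; thread `tid` covers
// [chunkBegin(total, nthreads, tid), chunkBegin(total, nthreads, tid + 1)).
std::uint64_t chunkBegin(std::uint64_t total, std::uint64_t nthreads,
                         std::uint64_t tid) {
  return total * tid / nthreads;
}
```

With `N = 5` there are 10 iterations; split over 4 threads, the slices hold 2, 3, 2 and 3 iterations, whereas splitting only the outer loop over `i` would give some threads far more inner iterations than others.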
2024-03-07[RISCV] Update some tests I missed in 909ab0e0d1903ad2329ca9fdf248d21330f9437f. NFCCraig Topper2-29/+33
2024-03-08[compiler-rt][Fuzzer] fix windows typo (#84407)David CARLIER1-1/+1
2024-03-08[Clang] Implement constexpr support for `__builtin_popcountg` (#84318)OverMighty5-1/+15
2024-03-08[compiler-rt] adding fchmodat2 syscall introduced in Linux 6.6. (#82275)David CARLIER1-0/+9
2024-03-07[GlobalISel] Fix yet another pointer type invalid combining issue, this time in tryFoldSelectOfConstants()Amara Emerson2-0/+40
2024-03-07[compiler-rt][fuzzer] Reland "SetThreadName windows implementation" (#83562)David CARLIER1-4/+21
Following-up on GH-76761.
2024-03-08[Asan] Fix -Wunused-private-field in non-assertion builds (NFC)Jie Fu1-0/+1
llvm-project/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp:650:13: error: private field 'OwnerFn' is not used [-Werror,-Wunused-private-field]
  Function *OwnerFn = nullptr;
            ^
1 error generated.
2024-03-07[NFC] [hwasan] be consistent about how to get integer types (#84396)Florian Mayer1-4/+2
2024-03-07[AArch64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELTAmara Emerson2-0/+71
We should moreElements <3 x s1> to <4 x s1> before we try to widen the element, otherwise we end up with a <3 x s21> nonsense type.
2024-03-07AMDGPU: Use True16Predicate for UseRealTrue16Insts in VOP2 Reals (#84394)Changpeng Fang1-3/+3
We cannot use OtherPredicates or SubtargetPredicate because they should be copied from pseudo to real, and we should not override them.
2024-03-07[libc] rename cpp::count_ones to cpp::popcount to better mirror std:: (#84388)Nick Desaulniers8-14/+12
libc/src/__support/CPP/bit.h and cpp:: are meant to mirror std::. Fix the TODO.
2024-03-07[libc] finish documenting c23 additions (#84383)Nick Desaulniers1-33/+79
- [libc] finish documenting c23 additions
- sort according to appearance in Annex B and section 7
2024-03-07[InstallAPI] Collect global functions (#83952)Cyndy Ishida10-11/+182
* Include whether functions are inlinable, as this affects whether to add them to the tbd file and is needed for future verification.
* Fix how clang arguments got passed along; previously spacing was passed along to CC1, causing search path inputs to look non-existent.
2024-03-07[BOLT] Properly propagate Cursor errors (#84378)Maksim Panchenko2-10/+25
Handle out-of-bounds reading errors correctly in LinuxKernelRewriter.
2024-03-07[MLIR] XeGPU dialect for Intel GPU - core definitions and base classes (#78483)Chao Chen15-1/+250
This PR follows our previous [RFC](https://discourse.llvm.org/t/rfc-add-xegpu-dialect-for-intel-gpus/75723) to add the XeGPU dialect definition for Intel GPUs. It contains dialect, type, attribute and operator definitions, as well as test cases for semantic checks. The lowering and optimization passes will be submitted in separate patches.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-03-07Fix build: llvm::Error needs to be moved for implicit conversion to Expected.Mehdi Amini1-1/+1
I don't know why the premerge setup didn't fail on this, but many buildbots are broken right now.
2024-03-08[X86][GlobalISel] Enable G_SDIV/G_UDIV/G_SREM/G_UREM (#81615)Evgenii Kudriashov26-3388/+1044
* Create a libcall for s64 type for 32 bit targets.
* Fix a bug in REM selection: SUBREG_TO_REG is not intended to produce a value from super registers.
* Replace selector tests by end-to-end tests. Other passes check the selected MIR better.
2024-03-07[libc][docs] add page linking to talks (#84393)Nick Desaulniers2-0/+30
2024-03-07[lldb] Add ability to detect darwin host linker version to xfail tests (#83941)Alex Langford2-0/+28
When Apple released its new linker, it had a subtle bug that caused LLDB's TLS tests to fail. Unfortunately this means that TLS tests are not going to work on machines that have affected versions of the linker, so we should annotate the tests so that they only work when we are confident the linker has the required fix.
I'm not completely satisfied with this implementation. That being said, I believe that adding support for linker versions in general is a non-trivial change that would require far more thought. There are a few challenges involved:
- LLDB's testing infra takes an argument to change the compiler, but there's no way to switch out the linker.
- There's no standard way to ask a compiler what linker it will use.
- There's no standard way to ask a linker what its version is. Many platforms have the same name for their linker (ld).
- Some platforms automatically switch out the linker underneath you. We do this for Windows tests (where we use LLD no matter what).
Given that this is affecting the tests on our CI, I think this is an acceptable solution in the interim.
2024-03-07[RISCV] Insert a freeze before converting select to AND/OR. (#84232)Craig Topper21-2356/+2587
Select blocks poison, but AND/OR do not. We need to insert a freeze to block poison propagation. This creates suboptimal codegen which I will try to fix with other patches. I'm prioritizing the correctness fix since we have 2 bug reports. Fixes #84200 and #84350
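Why the freeze is needed can be shown with a toy model of poison. The sketch below is an illustration only: `Val`, `selectOp`, `andOp` and `freeze` are hypothetical names, and `freeze` arbitrarily picks `false`, whereas a real freeze may yield any fixed value.

```cpp
// A one-bit value that is either a concrete bool or poison.
struct Val { bool poison; bool value; };

const Val kPoison{true, false};

// select: only the condition and the chosen operand matter, so poison in
// the operand that is not chosen does not propagate.
Val selectOp(Val cond, Val ifTrue, Val ifFalse) {
  if (cond.poison) return kPoison;
  return cond.value ? ifTrue : ifFalse;
}

// and: poison in either operand makes the result poison.
Val andOp(Val a, Val b) {
  if (a.poison || b.poison) return kPoison;
  return {false, a.value && b.value};
}

// freeze: replaces poison with some fixed, non-poison value.
Val freeze(Val v) { return v.poison ? Val{false, false} : v; }
```

With `c = false` and `x = poison`, `selectOp(c, x, false)` is a well-defined `false`, but the naive rewrite `andOp(c, x)` is poison; `andOp(c, freeze(x))` restores a well-defined `false`.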
2024-03-07[libc][stdfix] Add exp function for short _Accum and _Accum types. (#84391)lntue15-13/+423
2024-03-07[CUDA] Include PTX in non-RDC mode using the new driver (#84367)Joseph Huber4-22/+36
Summary: The old driver embeds PTX in rdc-mode and so does the `nvcc` compiler. The new driver currently does not do this, so we should keep it consistent in this case. This simply requires adding the assembler output as an input to the offloading action that gets fed to fatbin.
2024-03-07[Dexter] Extend XFAIL of Dexter tests to all MacOS architectures. (#83936)dyung2-2/+2
I am trying to bring up a MacOS buildbot targeting x86 and noticed that two Dexter tests were failing, cross-project-tests/debuginfo-tests/llgdb-tests/static-member.cpp and cross-project-tests/debuginfo-tests/llgdb-tests/static-member-2.cpp. Looking in the history for these tests, they were XFAILed for Apple Silicon in 9c46606 and are failing similarly on x86 for me, so we should extend the XFAIL to all MacOS architectures.
2024-03-07[ORC] Deallocate FinalizedAllocs on error paths in notifyEmitted.Lang Hames1-2/+10
If notifyEmitted encounters a failure (either because some plugin returned one, or because the ResourceTracker was defunct) then we need to deallocate the FinalizedAlloc manually. No testcase yet: This requires a concurrent setup -- we'll need to build some infrastructure to coordinate links and deliberately injected failures in order to reliably test this.
2024-03-07[CVP] Freeze Y when expanding urem x, y with X < 2Y (#84390)Philip Reames2-11/+18
We're going from a single use to two independent uses, we need these two to see consistent values for undef. As an example, consider x = 0x2 when y = 0b00u1. If the sub use picks 0b0001 and the cmp use picks 0b0011, that would be incorrect.
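The expansion and the hazard can be sketched in plain C++. This is an illustration, not the CVP code: both function names are hypothetical, and the second function models two uses of an unfrozen `y` observing different values by taking two separate parameters. The input values used afterwards were chosen for this illustration and are not the ones from the message above.

```cpp
#include <cstdint>

// urem x, y expanded under the precondition x < 2*y (and y != 0):
// at most one subtraction of y is needed.
std::uint32_t expandedURem(std::uint32_t x, std::uint32_t y) {
  return x >= y ? x - y : x;
}

// Same expansion, but the compare and the subtract each see their own
// value of y, as could happen with two uses of an undef operand.
std::uint32_t expandedURemTwoReads(std::uint32_t x, std::uint32_t yForCmp,
                                   std::uint32_t yForSub) {
  return x >= yForCmp ? x - yForSub : x;
}
```

For example, `expandedURem(5, 3)` gives `2`. If `y` may be either `1` or `3`, then for `x = 1` the legal results are `1 % 1 = 0` and `1 % 3 = 1`, but `expandedURemTwoReads(1, 1, 3)` wraps around to a value that is neither. Freezing `y` makes both uses see the same value.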
2024-03-07[gn build] Port 2a4a852a67eaLLVM GN Syncbot1-0/+1
2024-03-07Reland [clang-repl] Expose setter for triple in IncrementalCompilerBuilder (#84174)Stefan Gränitz4-6/+59
With out-of-process execution the target triple can be different from the one on the host. We need an interface to configure it. Relanding this with cleanup-fixes in the unittest.
2024-03-07[Orc] Add NotifyCreated callback for LLJITBuilder (#84175)Stefan Gränitz1-0/+18
This is useful to attach generators to JITDylibs or inject initial symbol definitions.
2024-03-07[RISCV] Split div vs rem scheduling information [nfc] (#84385)Philip Reames7-8/+57
Allows a processor to define different latencies for the two operations.
2024-03-07[TBAA] Add test showing tbaa.struct being generated with relaxed-alias.Florian Hahn1-0/+26
Add test showing that tbaa.struct is generated when using TSan with relaxed-aliasing.
2024-03-07[mlir][sparse] Move n:m printing into toMLIRString (#84264)Yinying Li2-14/+6
2024-03-07[clang] Upstream visionOS Availability & DarwinSDKInfo APIs (#84279)Cyndy Ishida7-2/+97
Admittedly a bit awkward, `visionos` is the correct and accepted spelling for annotating availability for xrOS target triples. This patch detects errors and handles cases when `xros` is mistakenly passed. In addition, add APIs for introduced/deprecated/obsoleted versioning in DarwinSDKInfo mappings.
2024-03-07[BOLT] Add reading support for Linux kernel .altinstructions section (#84283)Maksim Panchenko2-0/+233
Read .altinstructions and annotate instructions that have alternative sequences with "AltInst" annotation. Note that some instructions may have more than one alternative, in which case they will have multiple annotations in the form "AltInst", "AltInst2", "AltInst3", etc.
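The naming scheme for multiple alternatives can be written down as a small helper. This is a sketch of the pattern described above, not BOLT's code; `altInstAnnotationName` is a hypothetical function and its 1-based index is an assumption.

```cpp
#include <string>

// Hypothetical helper: name of the n-th (1-based) alternative annotation.
// The first alternative has no numeric suffix; later ones append the index.
std::string altInstAnnotationName(unsigned n) {
  return n <= 1 ? std::string("AltInst") : "AltInst" + std::to_string(n);
}
```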
2024-03-07[GlobalISel] Fix crash in tryFoldAndOrOrICmpsUsingRanges() with pointer types.Amara Emerson2-0/+66
2024-03-07[lldb] Disable shell tests affected by ld_new bug (#84246)Dave Lee3-2/+18
Equivalent to the changes made in https://github.com/llvm/llvm-project/pull/83941, except to support shell tests.
2024-03-07Revert "[SLP]Improve minbitwidth analysis."Alexey Bataev15-553/+305
This reverts commit 4ce52e2d576937fe930294cae883a0daa17eeced to fix issues detected by https://lab.llvm.org/buildbot/#/builders/74/builds/26470/steps/12/logs/stdio.
2024-03-07[lldb] Do some gardening in ProgressReportTest (NFC) (#84278)Jonas Devlieghere1-103/+96
- Factor out common setup code.
- Split the ProgressManager test into separate tests as they test separate things.
- Fix usage of EXPECT (which continues on failure) and ASSERT (which halts on failure). We must use the latter when calling GetEvent as otherwise we'll try to dereference a null EventSP.