Age  Commit message  Author  Files  Lines
2024-04-04[𝘀𝗽𝗿] changes to main this commit is based onusers/vitalybuka/spr/main.nfccodegen-precommit-test-for-84858Vitaly Buka8-38/+195
Created using spr 1.3.4 [skip ci]
2024-04-04[NFC][UBSAN] Regenerate a testVitaly Buka1-8/+30
2024-04-04Revert "Debuginfod Testing & fixes: 3rd times the charm? (#87676)"Shubham Rastogi10-515/+17
This reverts commit d6713ad80d6907210c629f22babaf12177fa329c. This change was reverted because of greendragon failures such as: Unresolved Tests (2): lldb-api :: debuginfod/Normal/TestDebuginfod.py, lldb-api :: debuginfod/SplitDWARF/TestDebuginfodDWP.py
2024-04-04[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)Alexey Bataev27-69/+2124
This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By giving the Loop Vectorizer a way to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions, these architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (it just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for load/store instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for loads/stores to use VP store/load with EVL. Another important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL).) We use an experimental intrinsic `get_vector_length`, which can be lowered to architecture-specific instruction(s) to compute EVL. The patch also adds a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750
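The EVL-based tail folding described in this commit message can be pictured with a small scalar simulation. This is only a sketch and not the Loop Vectorizer's code: `get_vector_length` here is a hypothetical stand-in that returns `min(remaining, vf)`, and `vp_add` is an illustrative name. The point is that the last iteration simply runs with a shorter explicit vector length instead of a lane mask.

```python
def get_vector_length(remaining: int, vf: int) -> int:
    """Hypothetical stand-in for the EVL computation (assumed to be min)."""
    return min(remaining, vf)


def vp_add(a, b, vf=4):
    """Element-wise add processed in chunks of at most `vf` lanes.

    Returns the result and the EVL used in each "vector" iteration.
    """
    assert len(a) == len(b)
    out = [0] * len(a)
    evls = []
    i = 0
    while i < len(a):
        evl = get_vector_length(len(a) - i, vf)
        # Models a vp.load / vp.add / vp.store that touches only `evl` lanes.
        out[i:i + evl] = [x + y for x, y in zip(a[i:i + evl], b[i:i + evl])]
        evls.append(evl)
        i += evl
    return out, evls
```

With ten elements and a vectorization factor of four, the loop runs three times and the final iteration handles the two-element tail without a separate scalar epilogue.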
2024-04-04[RISCV][NFC] Add isTargetAndroid API in RISCVSubtarget (#87671)Paul Kirth1-0/+1
This is required to set target specific code generation options for Android, like using the TLS slot for the stack protector.
2024-04-04[bazel] Add support for building lldb (#87589)Keith Smiley5-3/+3289
This adds build configuration for building LLDB on macOS and Linux. It uses a default subset of features that should work out of the box with macOS + Ubuntu. It is notably missing Python support right now, although some of the scaffolding is there, because of the complexity of linking a Python dylib, especially if you plan to distribute the resulting liblldb.so. Most of this build file is pretty simple. One of the unfortunate patterns I had to use was to split the header and sources cc_library targets to break circular dependencies.
2024-04-05[flang] Add --gcc-toolchain and --gcc-install-dir options to flang. (#87360)Michael Kruse31-1/+45
The `--gcc-toolchain` and `--gcc-install-dir` options were previously only visible to the Clang driver, but not Flang. These determine which assembler, linker, and libraries to use, e.g. for cross-compiling, and therefore are relevant for Flang as well. Tests are implemented using a mock GCC installation in `basic_cross_linux_tree` copied over from Clang's tests. The Clang driver already contains tests with `--driver-mode=flang` but `flang-new` is an entirely different executable (containing the `-fc1` stage) that should be tested as well. While not all files in `basic_cross_linux_tree` are strictly needed for testing those two driver flags, they will be needed for flags added in the future, such as `--rtlib`. Also remove the entry `*.o` in flang's `.gitignore` since `crt*.o` files are needed in the GCC mock installation. Fixes #86729
2024-04-04[UBSAN] Remove invalid assert added with #87709Vitaly Buka1-1/+0
2024-04-05[SPARC] Implement L and H inline asm argument modifiers (#87259)Koakuma4-0/+64
This adds support for using the L and H argument modifiers for twinword operands in inline asm code, such as in: ``` %1 = tail call i64 asm sideeffect "rd %pc, ${0:L} ; srlx ${0:L}, 32, ${0:H}", "={o4}"() ``` This is needed by the Linux kernel.
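As a rough illustration of what the inline asm above computes, the sketch below models the twinword operand as a pair of registers. It assumes, following the GCC convention, that `H` names the register holding the upper 32 bits and `L` the register holding the lower 32 bits; the function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF


def rd_pc_into_pair(pc: int):
    """Model of: rd %pc, ${0:L} ; srlx ${0:L}, 32, ${0:H}."""
    low = pc             # rd writes the full 64-bit value into the L register
    high = low >> 32     # srlx moves the upper 32 bits into the H register
    return high, low


def pair_to_i64(high: int, low: int) -> int:
    """Recombine the pair into the i64 result (low 32 bits of each register)."""
    return ((high & MASK32) << 32) | (low & MASK32)
```

Under these assumptions the pair round-trips the original 64-bit value, which is why the two modifiers are enough to express the operation.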
2024-04-04[UBSAN][HWASAN] Remove redundant flags (#87709)Vitaly Buka7-23/+19
Presence of `cutoff-hot` or `random-skip-rate` should be enough to trigger optimization.
2024-04-04[NFC][HWASAN][UBSAN] Remove cl::init from a few opts (#87692)Vitaly Buka2-2/+2
They are supposed to be used with `getNumOccurrences`.
2024-04-04[bazel] Add missing dependency for mlir:SCFUtils (#87711)Chenguang Wang1-0/+1
https://github.com/llvm/llvm-project/commit/5aeb604c7ce417eea110f9803a6c5cb1cdbc5372 https://buildkite.com/llvm-project/upstream-bazel/builds/93859
2024-04-04[libc] Temporary math macros fix (#87681)Michael Jones3-2/+19
Downstream is having some issues caused by math-macros.h. These will be fixed properly soon. See https://github.com/llvm/llvm-project/issues/87683 for tracking this tech debt.
2024-04-04[HWASAN][UBSAN] Don't use default `profile-summary-cutoff-hot` (#87691)Vitaly Buka4-22/+9
The default cutoff is not useful here. The decision to enable a sanitizer or not has a more significant performance impact than the typical optimizations that rely on `profile-summary-cutoff-hot`.
2024-04-04[flang] Added windows-include.h wrapper to resolve name conflicts. (#87650)Slava Zakharin5-11/+29
The header file includes windows.h in a lean-and-mean way to avoid bringing in names that may conflict with Flang code.
2024-04-04[libc++][NFC] Make __desugars_to a variable template and rename the header to desugars_to.h (#87337)Nikolas Klauser11-31/+26
This improves compile times and memory usage slightly and removes some boilerplate.
2024-04-04[mlir][SCF] Modernize `coalesceLoops` method to handle `scf.for` loops with iter_args (#87019)MaheshRavishankar13-240/+587
As part of this extension this change also does some general cleanup: 1) Make all the methods take `RewriterBase` as arguments instead of creating their own builders, which tend to crash when used within pattern rewrites. 2) Split `coalesePerfectlyNestedLoops` into two separate methods, one for `scf.for` and the other for `affine.for`. The templatization didn't seem to be buying much there. Also general cleanup of tests.
2024-04-04[memprof] Introduce writeMemProf (NFC) (#87698)Kazu Hirata1-76/+142
This patch refactors the serialization of MemProf data to a switch statement style: switch (Version) { case Version0: return ...; case Version1: return ...; } just like IndexedMemProfRecord::serialize. A reasonable amount of code is shared and factored out to helper functions between writeMemProfV0 and writeMemProfV1 to the extent that doesn't hamper readability.
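The dispatch-by-version shape described here can be sketched as follows. The byte layouts are invented for illustration and do not match the real MemProf format; only the structure (a top-level dispatcher, one writer per version, shared helpers) mirrors the commit message.

```python
import struct


def _write_records(records):
    """Shared helper (hypothetical layout): each record as a little-endian u64."""
    return b"".join(struct.pack("<Q", r) for r in records)


def _write_header(version):
    """Shared helper (hypothetical layout): a single u64 version field."""
    return struct.pack("<Q", version)


def write_memprof_v0(records):
    # Assumed for this sketch: version 0 has no header.
    return _write_records(records)


def write_memprof_v1(records):
    # Assumed for this sketch: version 1 prepends a version header.
    return _write_header(1) + _write_records(records)


def write_memprof(records, version):
    """Top-level dispatcher, analogous to the switch on Version."""
    if version == 0:
        return write_memprof_v0(records)
    if version == 1:
        return write_memprof_v1(records)
    raise ValueError(f"unsupported MemProf version: {version}")
```

Keeping the per-version writers thin and pushing common work into helpers means adding a new version is one new function plus one new branch.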
2024-04-04Revert "[ARM][Thumb2] Mark BTI-clearing instructions as scheduling region boundaries" (#87699)Victor Campos3-189/+0
Reverts llvm/llvm-project#79173 The testcase fails in non-asserts builds.
2024-04-04[Headers] Don't declare unreachable() from stddef.h in C++ (#86748)Ian Anderson1-0/+4
Even if __need_unreachable is set, stddef.h should not declare unreachable() in C++ because it conflicts with the declaration in <utility>.
2024-04-04[NFC] [HWASan] clarify FIXME comment (#87689)Florian Mayer1-0/+3
2024-04-04[builtin][NFC] Remove ClangBuiltin<"__builtin_allow_ubsan_check"> (#87581)Vitaly Buka1-2/+1
We don't need a clang builtin for this one. It was copy-pasted from `__builtin_allow_runtime_check`. RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-04-04[flang][cuda] Add restriction on assumed size device variable (#87664)Valentin Clement (バレンタイン クレメン)2-1/+7
According to https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/#cfpg-var-qual-attr-device > A device array may be an explicit-shape array, an allocatable array, or an assumed-shape dummy array. Assumed-size arrays are not supported. This patch adds an error for that case.
2024-04-04[mlir][ods] Fix attribute setter gen when properties are on (#87688)Jeff Niu2-13/+60
ODS was still generating the old `Operation::setAttr` hooks for ODS methods for setting attributes, when the backing implementation of the attributes was changed to properties. No idea how this wasn't noticed until now.
2024-04-04[NFC][UBSAN] Similar to #87687 for UBSANVitaly Buka1-64/+64
2024-04-04[MLIR][CF] Fix cf.switch parsing with result numbers (#87658)Keyi Zhang2-2/+15
This PR should fix the parsing bug reported in https://github.com/llvm/llvm-project/issues/87430. It allows using result number as the `cf.switch` operand.
2024-04-04[HWASan] Allow stack_history_size of 4096 (#86362)Florian Mayer1-2/+2
There is no reason to limit the minimum to two pages.
2024-04-04[NFC][HWASAN] Cleanup opt opt test (#87687)Vitaly Buka1-12/+12
Main change is replacing DEFAULT with HOT99. I'll remove DEFAULT related functionality in the followup patches.
2024-04-04[NFC][HWASAN] Simplify `selectiveInstrumentationShouldSkip` (#87670)Vitaly Buka1-20/+16
2024-04-04[SLP]Fix PR87630: wrong result for externally used vector value.Alexey Bataev2-8/+14
We need to check that the externally used value can be represented with the BitWidth before applying it; otherwise the wider type must be kept.
2024-04-04[OpenMP] Unsupport absolute KMP_HW_SUBSET test for s390x (#87555)Jonathan Peyton1-0/+6
2024-04-04[libc++][CI] Updates to Clang 19. (#85301)Mark de Wever6-19/+19
Since the release, Clang 16 is no longer actively supported. However, the FreeBSD runner is still using it, so some tests still guard against Clang 16.
2024-04-04Debuginfod Testing & fixes: 3rd times the charm? (#87676)Kevin Frei10-17/+515
I believe I've got the tests properly configured to only run on Linux x86(_64), as I don't have a Linux AArch64/Arm device to diagnose what's going wrong with the tests (I suspect there's some issue with generating `.note.gnu.build-id` sections...) The actual code fixes have now been reviewed 3 times: https://github.com/llvm/llvm-project/pull/79181 (moved shell tests to API tests), https://github.com/llvm/llvm-project/pull/85693 (Changed some of the testing infra), and https://github.com/llvm/llvm-project/pull/86812 (didn't get the tests configured quite right). The Debuginfod integration for symbol acquisition in LLDB now works with the `executable` and `debuginfo` Debuginfod network requests working properly for normal, `objcopy --only-keep-debug` stripped, split-dwarf, and `objcopy --only-keep-debug` stripped *plus* split-dwarf symbols/binaries. The reasons for the multiple attempts have been tests on platforms I don't have access to (Linux AArch64/Arm + MacOS x86_64). I believe I've got the tests properly disabled for everything except for Linux x86(_64) now. I've built & tested on MacOS AArch64 and Linux x86_64. --------- Co-authored-by: Kevin Frei <freik@meta.com>
2024-04-04[libc] Move thread sync when closing port earlierJoseph Huber1-4/+4
Summary: This synchronization should be done before we handle the logic relating to closing the port. This isn't majorly important now but it would break if we ever decided to run a server on the GPU.
2024-04-04[SLP]Add a test with the incorrect casting for external user, NFC.Alexey Bataev1-0/+64
2024-04-04[AArch64] Fix heuristics for folding "lsl" into load/store ops. (#86894)Eli Friedman14-177/+119
The existing heuristics were assuming that every core behaves like an Apple A7, where any extend/shift costs an extra micro-op... but in reality, nothing else behaves like that. On some older Cortex designs, shifts by 1 or 4 cost extra, but all other shifts/extensions are free. On all other cores, as far as I can tell, all shifts/extensions for integer loads are free (i.e. the same cost as an unshifted load). To reflect this, this patch: - Enables aggressive folding of shifts into loads by default. - Removes the old AddrLSLFast feature, since it applies to everything except A7 (and even if you are explicitly targeting A7, we want to assume extensions are free because the code will almost always run on a newer core). - Adds a new feature AddrLSLSlow14 that applies specifically to the Cortex cores where shifts by 1 or 4 cost extra. I didn't add support for AddrLSLSlow14 on the GlobalISel side because it would require a bunch of refactoring to work correctly. Someone can pick this up as a followup.
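The cost rule stated in this message reduces to a small predicate. The sketch below is a simplification written for illustration, not the AArch64 backend's logic: `shift_is_free` and its boolean parameter are hypothetical names, with the parameter standing in for the new AddrLSLSlow14 feature.

```python
def shift_is_free(shift: int, addr_lsl_slow14: bool) -> bool:
    """Return True if folding `lsl #shift` into a load/store address is free.

    addr_lsl_slow14 models cores where shifts by 1 or 4 cost extra;
    on every other core all shifts are treated as free.
    """
    if addr_lsl_slow14 and shift in (1, 4):
        return False
    return True
```

For example, a shift by 3 is treated as free on every core, while a shift by 1 is only penalized when the slow-1/4 feature is set.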
2024-04-04[CostModel][X86] Add costkinds test coverage for masked load/store/gather/scatterSimon Pilgrim5-16/+7255
Noticed while starting triage for #87640
2024-04-04[AArch64][PAC][MC][ELF] Support PAuth ABI compatibility tag (#85236)Daniil Kovalev6-12/+127
Depends on #87545 Emit `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` property in `.note.gnu.property` section depending on `aarch64-elf-pauthabi-platform` and `aarch64-elf-pauthabi-version` llvm module flags.
2024-04-04[TextAPI] Reorder addRPath parameters (#87601)Cyndy Ishida4-9/+9
It matches up with other _attribute_ adding member functions and helps simplify InterfaceFile assignment for InstallAPI.
2024-04-04[ValueTracking] Add more conditions in to `isTruePredicate`Noah Goldstein3-64/+77
There is one notable "regression". This patch replaces the bespoke `or disjoint` logic with a direct match. This means we fail some simplification during `instsimplify`. All the cases we fail in `instsimplify` we do handle in `instcombine` as we add `disjoint` flags. Other than that, just some basic cases. See proofs: https://alive2.llvm.org/ce/z/_-g7C8 Closes #86083
2024-04-04[ValueTracking] Add tests for deducing more conditions in `isTruePredicate`; NFCNoah Goldstein2-0/+466
2024-04-04[ValueTracking] Infer known bits from `(icmp eq (and/or x,y), C)`Noah Goldstein3-20/+25
In `(icmp eq (and x,y), C)` all 1s in `C` must also be set in both `x`/`y`. In `(icmp eq (or x,y), C)` all 0s in `C` must also be clear in both `x`/`y`. Closes #87143
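The two implications can be checked exhaustively for a small bit width. This is a brute-force sketch in plain Python integers, not ValueTracking code; `known_bits_from_icmp_eq` and `exhaustive_check` are names chosen for this example.

```python
def known_bits_from_icmp_eq(op: str, c: int, width: int):
    """Return (known_zero, known_one) masks for x and y, given op(x, y) == c."""
    mask = (1 << width) - 1
    if op == "and":
        # Every 1 bit of C must be 1 in both operands.
        return 0, c & mask
    if op == "or":
        # Every 0 bit of C must be 0 in both operands.
        return ~c & mask, 0
    raise ValueError(f"unsupported op: {op}")


def exhaustive_check(width: int = 4) -> bool:
    """Verify the inferred masks against every (x, y) pair of `width` bits."""
    ops = {"and": lambda a, b: a & b, "or": lambda a, b: a | b}
    for x in range(1 << width):
        for y in range(1 << width):
            for name, fn in ops.items():
                known_zero, known_one = known_bits_from_icmp_eq(name, fn(x, y), width)
                for v in (x, y):
                    if v & known_one != known_one or v & known_zero != 0:
                        return False
    return True
```

Running `exhaustive_check(4)` covers all 256 operand pairs for both operators.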
2024-04-04[ValueTracking] Add tests for computing known bits from `(icmp eq (and/or x,y), C)`; NFCNoah Goldstein1-5/+105
2024-04-04[mlir] Add `requiresReplacedValues` and `visitReplacedValues` to `PromotableOpInterface` (#86792)Fabian Mora5-10/+82
Add `requiresReplacedValues` and `visitReplacedValues` methods to `PromotableOpInterface`. These methods allow `PromotableOpInterface` ops to transform definitions mutated by a `store`. This change is necessary to correctly handle the promotion of `LLVM_DbgDeclareOp`. --------- Co-authored-by: Théo Degioanni <30992420+Moxinilian@users.noreply.github.com>
2024-04-04[CMake] Install LLVMgold.so for LLVM_INSTALL_TOOLCHAIN_ONLY=on (#87567)Fangrui Song1-1/+1
LLVMgold.so can be used with GNU ar, gold, ld, and nm to process LLVM bitcode files. Install it in LLVM_INSTALL_TOOLCHAIN_ONLY=on builds like we install libLTO.so. Suggested by @emelife Fix #84271
2024-04-04[memprof] Make RecordWriterTrait a non-template class (#87604)Kazu Hirata2-7/+10
commit d89914f30bc7c180fe349a5aa0f03438ae6c20a4 Author: Kazu Hirata <kazu@google.com> Date: Wed Apr 3 21:48:38 2024 -0700 changed RecordWriterTrait to a template class with IndexedVersion as a template parameter. This patch changes the class back to a non-template one while retaining the ability to serialize multiple versions. The reason I changed RecordWriterTrait to a template class was because, even if RecordWriterTrait had IndexedVersion as a member variable, RecordWriterTrait::EmitKeyDataLength, being a static function, would not have access to the variable. Since OnDiskChainedHashTableGenerator calls EmitKeyDataLength as: const std::pair<offset_type, offset_type> &Len = InfoObj.EmitKeyDataLength(Out, I->Key, I->Data); we can make EmitKeyDataLength a member function, but we have one problem. InstrProfWriter::writeImpl calls: void insert(typename Info::key_type_ref Key, typename Info::data_type_ref Data) { Info InfoObj; insert(Key, Data, InfoObj); } which default-constructs RecordWriterTrait without a specific version number. This patch fixes the problem by adjusting InstrProfWriter::writeImpl to call the other form of insert instead: void insert(typename Info::key_type_ref Key, typename Info::data_type_ref Data, Info &InfoObj) To prevent an accidental invocation of the default constructor of RecordWriterTrait, this patch deletes the default constructor.
2024-04-04[gn build] Port fd38366e4525Arthur Eubanks1-1/+0
2024-04-04[gn build] Port 8bb9443333e0Arthur Eubanks1-0/+1
2024-04-04[gn build] Port 3365d6217901Arthur Eubanks1-0/+1
2024-04-04[gn build] Manually port 6f2d8cc0Arthur Eubanks2-0/+2