Age | Commit message | Author | Files | Lines
2024-06-22 | [𝘀𝗽𝗿] changes introduced through rebase (branch: users/koachan/spr/main.sparcias-rework-asi-tag-matching-in-prep-for-parseforallfeatures) | Koakuma | 3638 | -71330/+318329
Created using spr 1.3.5 [skip ci]
2024-06-19 | [NFC][SPARC] Fix typos and style mismatches | Koakuma | 3 | -7/+7
Fix style errors accidentally introduced in PRs #87259 and #94245.

Reviewers: rorth, jrtc27, brad0, s-barannikov
Reviewed By: s-barannikov
Pull Request: https://github.com/llvm/llvm-project/pull/96019
2024-06-19 | [LV] Add more masked store cost tests with different masks. | Florian Hahn | 1 | -0/+151
Add additional masked store tests which caused crashes with earlier versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-19 | [GISel][RISCV]Implement indirect parameter passing (#95429) | Gábor Spaits | 4 | -32/+898
Some targets like RISC-V pass scalars wider than 2×XLEN bits by reference, so those arguments are replaced in the argument list with an address (see the RISC-V ABIs Specification 1.0, section 2.1). This commit implements this indirect parameter passing in GlobalISel.

Co-authored-by: Gabor Spaits <Gabor.Spaits@hightec-rt.com>
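The rule quoted in the commit above (scalars wider than 2×XLEN are passed by reference) can be illustrated with a small model. This is only a sketch under stated assumptions: `passes_indirectly` and `lower_arguments` are hypothetical names, arguments are reduced to bit widths, and nothing here reflects the actual GlobalISel code.

```python
def passes_indirectly(scalar_bits: int, xlen: int) -> bool:
    """Return True if a scalar of this width would be passed by reference."""
    return scalar_bits > 2 * xlen


def lower_arguments(arg_bits, xlen):
    """Model argument lowering: too-wide scalars become an XLEN-sized address.

    arg_bits is a list of scalar widths in bits (hypothetical representation).
    """
    lowered = []
    for bits in arg_bits:
        if passes_indirectly(bits, xlen):
            # The value lives in memory; only its address is passed.
            lowered.append(("indirect", xlen))
        else:
            lowered.append(("direct", bits))
    return lowered
```

With `xlen=64`, a 128-bit scalar is exactly 2×XLEN and stays direct, while a 256-bit scalar is replaced by an address.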
2024-06-19 | [mlir][Conversion] Generalize and fix crash in `reconcile-unrealized-casts` (#95700) | Matthias Springer | 7 | -159/+180
This commit fixes a crash in `-reconcile-unrealized-casts` when cast ops have multiple operands:
```
DialectConversion.cpp:1583: virtual void mlir::ConversionPatternRewriter::replaceOp(mlir::Operation *, mlir::ValueRange): Assertion `op->getNumResults() == newValues.size() && "incorrect # of replacement values"' failed.
```
This commit also generalizes the pass such that more ops are folded. In particular (letters indicate types):
```
  A
 / \
B   C
|
A
```
Previously, such IR was not folded at all. The `A -> B -> A` type cast cycle is now folded away. (The `A -> C` cast stays in place.)

This commit also turns the pass from a dialect conversion into a simple IR walk. The pattern and its `populate` function are removed. The pattern was a (non-conversion) rewrite pattern, but used in a dialect conversion, which is generally not safe. In particular, the rewrite pattern may traverse IR that was already scheduled for erasure by the dialect conversion.

Note: Some test cases changed slightly (NFC) because the new pass implementation no longer attempts to fold ops.

Note for LLVM integration: If your pipeline uses the removed `populate` function, try to simply remove that function call. Chances are you may not need it at all. If it is in fact needed, run the `-reconcile-unrealized-casts` pass right after the pass that used to populate the pattern.

Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
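To make the `A -> B -> A` folding described in the commit above concrete, here is a rough model restricted to single-operand casts. It is a sketch, not the pass itself: the `Cast` record, `fold_cast_cycles`, and the string-based value and type names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cast:
    result: str    # name of the value this cast defines
    operand: str   # name of the value it consumes
    src_type: str
    dst_type: str


def fold_cast_cycles(casts, uses):
    """Fold cast chains that return to an earlier type.

    casts: list of Cast; uses: value names consumed by non-cast ops.
    Returns (casts still needed, rewritten uses).
    """
    by_result = {c.result: c for c in casts}

    def resolve(value):
        cast = by_result.get(value)
        if cast is None:
            return value
        wanted = cast.dst_type
        cur = cast
        # Walk backwards through the chain until a value of the wanted type appears.
        while cur is not None:
            if cur.src_type == wanted:
                return cur.operand
            cur = by_result.get(cur.operand)
        return value

    new_uses = [resolve(u) for u in uses]

    # Keep only casts that are still reachable from a use.
    live, worklist = set(), list(new_uses)
    while worklist:
        v = worklist.pop()
        c = by_result.get(v)
        if c is not None and v not in live:
            live.add(v)
            worklist.append(c.operand)
    return [c for c in casts if c.result in live], new_uses
```

For the diagram in the commit message, a use of the final `A` value is rewired to the original `A`, the two casts on the `B` path become dead, and the `A -> C` cast is kept.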
2024-06-19 | [mlir][side effect] refactor(*): Include more precise side effects (#94213) | donald chen | 40 | -397/+687
This patch adds more precise side effects to the current ops with memory effects, allowing us to determine which OpOperand/OpResult/BlockArgument the operation reads or writes, rather than just recording the reading and writing of values. This allows for convenient use of precise side effects to achieve analysis and optimization. Related discussions: https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
2024-06-19 | [AMDGPU] Add IsSingle to a few Interp instructions (#95984) | Joe Nash | 1 | -1/+1
A _e64 suffix should not be printed since these instructions only have one legal encoding length. The absence of the IsSingle flag is hidden by how the string is printed. We could fix it for GFX10 as well, but we shouldn't change the asm output to omit _e64 at this point. NFC.
2024-06-19 | [AMDGPU] Add IsSingle to V_DIV_FMAS* for consistency. (#95983) | Joe Nash | 1 | -0/+2
A _e64 suffix should not be printed since these instructions only have one legal encoding length. The absence of the IsSingle flag is hidden by how the string is printed, but fix it for consistency. NFC
2024-06-19 | [RISCV][test] Pre-commit test case where ConstantHoisting fails to trigger | Alex Bradbury | 1 | -0/+12
Our getIntImmCostInst is falling back to returning TCC_Free in this case even though both immediates take two instructions to materialise.
2024-06-19 | [mlir][vector] Specify bounds of dynamic indices in vector.extract/insert (#95933) | Benjamin Maxwell | 1 | -0/+8
2024-06-19 | [include-cleaner] don't consider the associated header unused (#67228) | Sam McCall | 2 | -6/+89
Loosely, the "associated header" of `foo.cpp` is `foo.h`. It should be included; many styles include it first.

So far we haven't special cased it in any way, and require this include to be justified. e.g. if foo.cpp defines a function declared in foo.h, then the #include is allowed to check these declarations match.

However this doesn't really align with what users want:
- people reasonably want to include the associated header for the side-effect of validating that it compiles. In the degenerate case, `lib.cpp` is just `#include "lib.h"` (see bug)
- That `void foo(){}` IWYU-uses `void foo();` is a bit artificial, and most users won't internalize this. Instead they'll stick with the simpler model "include the header that defines your API". In the rare cases where these give different answers[1], our current behavior is a puzzling special case from the user POV. It is more user-friendly to accept both models.
- even where this diagnostic is a true positive (defs don't match header decls) the diagnostic does not communicate this usefully.

Fixes https://github.com/llvm/llvm-project/issues/67140

[1] Example of an associated header that's not IWYU-used:
```
// http.h
inline URL buildHttpURL(string host, int port, string path) {
  return "http://" + host + ":" + port + "/" + path;
}
// http.cpp
class HTTPURLHandler : URLHandler { ... };
REGISTER_URL_HANDLER("http", HTTPURLHandler);
```
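The "accept both models" behavior described in the commit above could be modeled roughly as follows. This is a deliberately loose sketch: matching on the file stem is an assumption based on the commit's "loosely, `foo.cpp` and `foo.h`" wording, and `is_associated_header` and `unused_includes` are hypothetical helpers, not include-cleaner APIs.

```python
from pathlib import PurePosixPath


def is_associated_header(main_file: str, header: str) -> bool:
    """Loose check (assumption): same file stem, e.g. http.cpp and http.h."""
    return PurePosixPath(main_file).stem == PurePosixPath(header).stem


def unused_includes(main_file, includes, used):
    """Report includes that are neither used nor the associated header."""
    return [
        h for h in includes
        if h not in used and not is_associated_header(main_file, h)
    ]
```

In this model, `http.h` is never reported for `http.cpp`, even when nothing in `http.cpp` uses a symbol it declares.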
2024-06-19 | [AArch64] Avoid using NEON BSL for streaming[-compatible] functions (#95803) | Sander de Smalen | 2 | -36/+71
2024-06-19 | [AArch64] Let patterns for NEON instructions check runtime mode. (#95560) | Sander de Smalen | 8 | -883/+313
This helps identify any failures where the compiler might otherwise silently emit instructions that are not valid for the given runtime mode.
2024-06-19 | [llvm-mca] Use llvm::erase_if (NFC) (#96029) | Kazu Hirata | 1 | -5/+3
2024-06-19 | [InstCombine] Preserve all gep flags in gep of exact div fold | Nikita Popov | 2 | -4/+13
2024-06-19 | [mlir][ArmSVE] Add `arm_sve.psel` operation (#95764) | Benjamin Maxwell | 6 | -3/+166
This adds a new operation for the SME/SVE2.1 psel instruction. This allows selecting a predicate based on a bit within another predicate, essentially allowing for 2-D predication. Informally, the semantics are:
```mlir
%pd = arm_sve.psel %p1, %p2[%index] : vector<[4]xi1>, vector<[8]xi1>
```
=>
```
if p2[index % num_elements(p2)] == 1:
  pd = p1 : type(p1)
else:
  pd = all-false : type(p1)
```
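The informal semantics above translate almost directly into an executable model. The following is a sketch only: predicates are represented as plain lists of 0/1 with a fixed length (an assumption, since the real vectors are scalable), and `psel` here is a hypothetical Python function rather than the MLIR op.

```python
def psel(p1, p2, index):
    """Model of the informal psel semantics on fixed-size predicates.

    p1, p2: lists of 0/1 bits; index: non-negative integer.
    """
    if p2[index % len(p2)] == 1:
        return list(p1)          # pd = p1
    return [0] * len(p1)         # pd = all-false, same shape as p1
```

Calling it once per row index with a row predicate `p2` and a column predicate `p1` gives the 2-D predication the commit message mentions.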
2024-06-19 | Avoid object libraries in the VS IDE (#93519) | Michael Kruse | 7 | -28/+70
As discussed in #89743, when using the Visual Studio solution generators, object library projects are displayed as a collection of non-editable *.obj files. To look for the corresponding source files, one has to browse (or search) to the library's obj.libname project. This patch tries to avoid this as much as possible.

For Clang, there is already an exception for XCode. We handle MSVC_IDE the same way.

For MLIR, this is more complicated. There are explicit references to the obj.libname target that only work when there is an object library. This patch cleans up the reasons for why an object library is needed:

1. The obj.libname is modified in the calling CMakeLists.txt. Note that with use-only references, `add_library(<name> ALIAS <target>)` could have been used.
2. A `libMLIR.so` (mlir-shlib) is also created. This works by linking the object libraries' object files into the libMLIR.so (in addition to the library's own .so/.a). XCode is handled using the `-force_load` linker option instead. Windows is not supported. This mechanism is different from LLVM's llvm-shlib, which is created by linking static libraries with `-Wl,--whole-archive` (and `-Wl,-all_load` on MacOS).
3. The library might be added to an aggregate library. In-tree, this seems to be only `libMLIR-C.so` and the standalone example. In XCode, it uses the object library and `-force_load` mechanism as above. Again, this is different from `libLLVM-C.so`.
4. Build an object library whenever it was before this patch, except when generating a Visual Studio solution. This condition could be removed, but I am trying to avoid build breakages of whatever configurations others use.

This seems to never have worked with XCode because of the explicit references to obj.libname (reason 1.). I don't have access to XCode, but I tried to preserve the current working. IMHO there should be a common mechanism to build aggregate libraries for all LLVM projects instead of the 4 that we have now.

As far as I can see, this means for LLVM there are the following changes on whether object libraries are created:

1. An object library is created even in XCode if FORCE_OBJECT_LIBRARY is set. I do not know how XCode handles it, but I also know CMake will abort otherwise.
2. An object library is created even for explicitly SHARED libraries for building `libMLIR.so`. Again, mlir-shlib does not work otherwise. `libMLIR.so` itself is created using SHARED so this patch is marking it as EXCLUDE_FROM_LIBMLIR.
3. For the second condition, it is now sensitive to whether the mlir-shlib is built at all (LLVM_BUILD_LLVM_DYLIB). However, an object library is still built using the fourth condition unless using the MSVC solution generator.

That is, except with MSVC_IDE, when an object library was built before, it will also be an object library now.
2024-06-19 | [RISCV] Move RISCVInsertVSETVLI::coalesceVSETVLIs back to before insertReadVL (#96056) | Luke Lau | 1 | -5/+5
2024-06-19 | [mlir][ArmSME] Fold MoveTileSliceToVector + TransferWrite to StoreTileSlice (#95907) | Benjamin Maxwell | 3 | -6/+117
2024-06-19 | [AArch64] NFC: Precommit some tests for SME | Sander de Smalen | 3 | -0/+727
This shows that when compiling for +sme only, the code-generator doesn't consider streaming mode to determine whether to use (compatible) SVE instructions. A follow-up patch will fix these issues.
2024-06-19 | [InstCombine] Preserve all gep flags when emitting offset | Nikita Popov | 2 | -1/+19
2024-06-19 | [InstCombine] Preserve all gep flags in gep of select fold | Nikita Popov | 2 | -3/+23
2024-06-19 | [InstCombine] Preserve all gep flags in dependent IV fold | Nikita Popov | 2 | -3/+34
2024-06-19 | [X86] computeKnownBitsForPMADDWD/PMADDUBSW - tidyup line overflow by moving extensions to the multiply stage. NFC. | Simon Pilgrim | 1 | -21/+13
2024-06-19 | [InstCombine] Preserve all gep flags in another select of gep fold | Nikita Popov | 2 | -3/+28
2024-06-19 | [InstCombine] Preserve all flags in phi of gep fold | Nikita Popov | 3 | -4/+58
Preserve the intersection of all flags. Add GEPNoWrapFlags::all() to serve as the initialization value for the intersection.
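The idea of starting from an "all flags" value and intersecting can be sketched with a Python `Flag` enum. The flag names mirror the GEP no-wrap flags (`inbounds`, `nusw`, `nuw`), but the class, the `ALL` constant, and `intersect_flags` are hypothetical, and the sketch does not model any implication between flags.

```python
from enum import Flag
from functools import reduce


class GEPFlags(Flag):
    NONE = 0
    INBOUNDS = 1
    NUSW = 2
    NUW = 4


# Plays the role the commit gives to GEPNoWrapFlags::all(): the identity
# element for intersection, so the first real operand passes through unchanged.
ALL = GEPFlags.INBOUNDS | GEPFlags.NUSW | GEPFlags.NUW


def intersect_flags(flag_sets):
    """Return only the flags that every incoming GEP carries."""
    return reduce(lambda acc, flags: acc & flags, flag_sets, ALL)
```

Starting from `ALL` rather than `NONE` matters: intersecting with an empty initial value would always yield no flags.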
2024-06-19 | [include-cleaner] Use filename as requested, not resolved path | Kadir Cetinkaya | 1 | -2/+2
This was an unintended change in d5297b72aa32ad3a69563a1fcc61294282f0b379. We don't want to resolve symlinks in filenames, as these might lead to unexpected spellings, compared to requested filenames.
2024-06-19 | [InstCombine] Preserve all flags in select of gep fold | Nikita Popov | 2 | -3/+26
Preserve the flag intersection.
2024-06-19 | [RISCV][NFC] Add UnsupportedSched<F|D|A> multiclasses (#95948) | Anton Sidorenko | 2 | -96/+108
These multiclasses will be used by new processors (e.g. https://github.com/llvm/llvm-project/pull/95427)
2024-06-19 | DenseMap: support enum class keys (#95972) | Ramkumar Ramachandra | 2 | -2/+32
Implemented using std::underlying_type.
2024-06-19 | [NFC][AArch64] Organise extensions by architecture version (#95898) | Lucas Duarte Prates | 1 | -355/+419
This updates the way the AArch64 architecture extensions are organised in AArch64Features.td to improve readability and maintainability of the file. Extensions are now grouped by the corresponding architecture version in which they were introduced.
2024-06-19 | [X86] combineConstantPoolLoads - early-out if the load is not from a constant pool. NFC. | Simon Pilgrim | 1 | -2/+5
Don't embed inside the for-loop later on
2024-06-19 | [X86] Replace (void) with [[maybe_unused]] for some variables unused (or only used in asserts). NFC. | Simon Pilgrim | 1 | -6/+3
2024-06-19 | [SPIR-V] Add __spirv_ wrapper to the OpAtomicExchange instruction (#95961) | Vyacheslav Levytskyy | 2 | -0/+97
This PR adds __spirv_ wrapper to the OpAtomicExchange instruction. A new test case is added for the change introduced.
2024-06-19 | [SPIR-V] Improve implementation of the duplicates tracker's storage (#95958) | Vyacheslav Levytskyy | 3 | -160/+113
This PR continues https://github.com/llvm/llvm-project/pull/94952, managing FunctionType in the same way as pointee types in that PR (that is, working with TypedPointers pointee types rather than with LLVM's original untyped pointers).

This PR also fully reworks the base type for the duplicates tracker's storage to conform with and reuse DenseMapInfo. The previous implementation didn't store enough information to distinguish between key values (isEqual() was implemented as equality of hash values derived from the arguments). This, in turn, led to random crashes on very rare occasions when the hash value of an actual key matched the hash value of the empty or tombstone instance. In this PR we use std::tuple instead of a tailor-made class hierarchy, both reusing DenseMapInfo templates and getting rid of the crash condition.
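The failure mode described in the commit above, where key equality is defined purely by hash value, can be reproduced in miniature. This is an illustrative sketch only: `HashOnlyKey`, the sentinel values, and the helper functions are hypothetical and do not correspond to the SPIR-V backend's classes or to DenseMap's internals.

```python
class HashOnlyKey:
    """Key whose equality is 'hashes match' (the problematic scheme)."""

    def __init__(self, payload, hash_value):
        self.payload = payload
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        return isinstance(other, HashOnlyKey) and self.hash_value == other.hash_value


# Sentinel that marks an unused slot in a hypothetical open-addressing table.
EMPTY = HashOnlyKey(None, 0)


def looks_empty(key) -> bool:
    """How a table would test whether a slot is free."""
    return key == EMPTY


# Tuple-based alternative: equality compares the full contents.
EMPTY_TUPLE = ("<empty>",)


def looks_empty_tuple(key) -> bool:
    return key == EMPTY_TUPLE
```

A real key whose hash happens to equal the sentinel's hash is misclassified as an empty slot under the first scheme, while a tuple key is only equal to the sentinel if its contents are identical.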
2024-06-19 | [LICM] Fix dropped metadata (#95221) | Tim Gymnich | 2 | -0/+39
LICM drops metadata for call instructions when cloning instructions. This patch just adds the missing `copyMetadata`. Fixes #91919.
2024-06-19 | [flang] allow assumed-rank box in fir.store (#95980) | jeanPerier | 4 | -13/+58
Codegen is done with a memcpy using the rank from the "value" descriptor, as for the fir.load case. The rationale is described in https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md.
2024-06-19 | [X86][CodeGen] Share code between CompressEVEX pass and ND2NonND transform, NFCI | Shengchen Kan | 5 | -86/+115
2024-06-19 | [mlir] Fix loop-like interface (#95817) | Ivan Kulagin | 1 | -4/+4
Using the `this` pointer inside interface methods is illegal because it breaks concept-based interfaces. It is necessary to use `$_op` instead.

Co-authored-by: ikulagin <i.kulagin@ispras.ru>
2024-06-19 | [mlir][vector] Add `vector.from_elements` op (#95938) | Matthias Springer | 7 | -4/+305
This commit adds a new operation to the vector dialect: `vector.from_elements`

The op constructs a new vector from a given list of scalar values. It is similar to `tensor.from_elements`.
```mlir
%0 = vector.from_elements %a, %b, %c, %a, %a, %a : vector<2x3xf32>
```
Constructing a new vector from elements was tedious before this op existed: a typical way was to define an `arith.constant ... : vector<...>`, followed by a chain of `vector.insert`.

Folders/canonicalizations are added that can fold `vector.extract` ops and convert the `vector.from_elements` op into a `vector.splat` op.

The LLVM lowering generates an `llvm.mlir.undef`, followed by a sequence of scalar insertions in the form of `llvm.insertelement`. Only 0-D and 1-D vectors are currently supported in the LLVM lowering.
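As a rough model of what the op in the commit above computes, the sketch below builds a nested list from a flat element list. It assumes row-major element order, which the commit message does not state, and `from_elements` and `is_splat` are hypothetical helpers rather than MLIR APIs.

```python
def from_elements(elements, shape):
    """Build a nested list of the given shape from a flat element list.

    Assumption: elements are laid out in row-major order. A 0-D shape ()
    yields the single element itself.
    """
    total = 1
    for dim in shape:
        total *= dim
    if len(elements) != total:
        raise ValueError("number of elements must match the vector shape")

    def build(offset, dims):
        if not dims:
            return elements[offset]
        stride = 1
        for dim in dims[1:]:
            stride *= dim
        return [build(offset + i * stride, dims[1:]) for i in range(dims[0])]

    return build(0, list(shape))


def is_splat(elements) -> bool:
    """True if every element is the same value, i.e. a splat would suffice."""
    return len(elements) > 0 and all(e == elements[0] for e in elements)
```

The `is_splat` check corresponds to the situation in which the commit's canonicalization can replace the op with a `vector.splat`.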
2024-06-19 | [flang] allow assumed-rank box in fir.alloca (#95947) | jeanPerier | 2 | -2/+14
The alloca can be sized for the maximum rank, which is reasonable (currently 15, per the standard). Introducing a rank-based dynamic allocation would complicate alloca hoisting and stack size analysis (this can be revisited if the standard changes to allow more ranks). No change is needed since this is already reflected in how the fir.box type is translated to LLVM.
2024-06-19 | [LangRef] Relax semantics of writeonly / memory(write) (#95238) | Nikita Popov | 1 | -2/+19
Instead of making reads immediate undefined behavior, consider these attributes in terms of their externally observable effects. We don't care if a location is read within the function, as long as it has no impact on observed behavior. In particular, allow:

* Reading a location after writing it.
* Reading a location before writing it (within the function) returns a poison value.

The latter could be further relaxed to also allow things like "reading the value and then writing it back", but I'm not sure how one would specify that operationally (so that proof checkers can verify it).

While here, also explicitly mention the fact that reads and writes to allocas and reads from constant globals are `memory(none)`.

Fixes https://github.com/llvm/llvm-project/issues/95152.
2024-06-19 | Reland "[scudo] Apply filling when realloc shrinks and re-grows a block in-place" (#95838) | Fabio D'Urso | 2 | -5/+31
Reland of #93212, which had been reverted in commit bddd8eae17df6511aee789744ccdc158de817081.
2024-06-19 | [mlir][emitc] Refactor ArithToEmitC: perform sign adaptation, type conversions / cast insertion in a single place (#95789) | Corentin Ferry | 2 | -52/+33
Factor EmitC type signedness adaptation and cast operations in ArithToEmitC using adaptValueType and adaptIntegralTypeSignedness.
2024-06-19 | [NFC] [Serialization] Unify how LocalDeclID can be created | Chuanqi Xu | 13 | -137/+193
Currently a LocalDeclID can be created directly from an integer without verification. This makes it hard to refactor if we want to change the way we serialize DeclIDs (see https://github.com/llvm/llvm-project/pull/95897). It is also hard to debug if someone constructs a LocalDeclID with an incorrect value.

So this patch unifies the way a LocalDeclID is constructed in ASTReader, where it is built from the serialized data. The new interface also lets us verify the constructed LocalDeclID sooner.
2024-06-19 | [mlir][vector] Add tests for xfer-permute-lowering (1/n)(nfc) (#95529) | Andrzej Warzyński | 1 | -6/+83
Adds more tests to "vector-transfer-permutation-lowering.mlir", specifically for the `TransferWritePermutationLowering` pattern - such tests seem to be missing ATM. The following edge cases are covered:

* plain fixed-width (supported)
* scalable vectors with mask (supported)
* plain fixed-width, masked (not supported)

This is a part of a larger effort to make sure that all key cases for patterns under `populateVectorTransferPermutationMapLoweringPatterns` (*) are tested. I also want to make sure that tests use consistent function and variable names.

(*) `transform.apply_patterns.vector.transfer_permutation_patterns` in TD parlance
2024-06-19 | [llvm][CodeGen] Fix failure in window scheduler caused by phi (#95900) | Hua Tian | 2 | -4/+48
In certain cases, a register paired with the kernel MBB in a phi is not defined within the kernel MBB. This patch adds handling for that case.
2024-06-19 | [IR] Mark shl constant expression as undesirable (#95940) | Nikita Popov | 3 | -8/+12
Mark shl constant expressions undesirable, so that they are no longer automatically created by IRBuilder, constant folding, etc. This is in preparation for removing them entirely.
2024-06-19 | [NFC][CodeGen] Remove dead ParallelCG.h/.cpp API (#95770) | Pierre van Houtryve | 5 | -142/+0
LTOBackend inlined it a while ago and now uses a static copy. This API was unused. We can always restore it at some point if it's needed, but right now it's just bloat.
2024-06-19 | [lldb][ObjC] Don't query objective-c runtime for decls in C++ contexts (#95963) | Michael Buch | 3 | -1/+29