rocket-tools/riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2024-06-22	[𝘀𝗽𝗿] changes introduced through rebase (users/koachan/spr/main.sparcias-rework-asi-tag-matching-in-prep-for-parseforallfeatures)	Koakuma	3638	-71330/+318329
	Created using spr 1.3.5 [skip ci]
2024-06-19	[NFC][SPARC] Fix typos and style mismatches	Koakuma	3	-7/+7
	Fix style errors accidentally introduced in PRs #87259 and #94245. Reviewers: rorth, jrtc27, brad0, s-barannikov Reviewed By: s-barannikov Pull Request: https://github.com/llvm/llvm-project/pull/96019
2024-06-19	[LV] Add more masked store cost tests with different masks.	Florian Hahn	1	-0/+151
	Add additional masked store tests which caused crashes with earlier versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-19	[GISel][RISCV]Implement indirect parameter passing (#95429)	Gábor Spaits	4	-32/+898
	Some targets like RISC-V pass scalars wider than 2×XLEN bits by reference, so those arguments are replaced in the argument list with an address (See RISC-V ABIs Specification 1.0 section 2.1). This commit implements this indirect parameter passing in GlobalISel. --------- Co-authored-by: Gabor Spaits <Gabor.Spaits@hightec-rt.com>
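The rule quoted in this commit message can be written as a one-line predicate. The sketch below is only an illustration of that rule; `isPassedIndirectly` is a hypothetical name and not part of GlobalISel or the RISC-V backend.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): decides whether a scalar argument of
// the given bit width is replaced by an address in the argument list.
inline bool isPassedIndirectly(std::uint32_t scalarBits, std::uint32_t xlen) {
  // Per the commit message: scalars wider than 2*XLEN bits go by reference.
  return scalarBits > 2 * xlen;
}
```

Under this rule an `i128` would be passed indirectly on RV32 (128 > 64) but directly on RV64 (128 is not wider than 128).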
2024-06-19	[mlir][Conversion] Generalize and fix crash in `reconcile-unrealized-casts` (#95700)	Matthias Springer	7	-159/+180
	This commit fixes a crash in `-reconcile-unrealized-casts` when cast ops have multiple operands:
	```
	DialectConversion.cpp:1583: virtual void mlir::ConversionPatternRewriter::replaceOp(mlir::Operation *, mlir::ValueRange): Assertion `op->getNumResults() == newValues.size() && "incorrect # of replacement values"' failed.
	```
	This commit also generalizes the pass such that more ops are folded. In particular (letters indicate types):
	```
	  A
	 / \
	B   C
	|
	A
	```
	Previously, such IR was not folded at all. The `A -> B -> A` type cast cycle is now folded away. (The `A -> C` cast stays in place.)
	This commit also turns the pass from a dialect conversion into a simple IR walk. The pattern and its `populate` function are removed. The pattern was a (non-conversion) rewrite pattern, but used in a dialect conversion, which is generally not safe. In particular, the rewrite pattern may traverse IR that was already scheduled for erasure by the dialect conversion.
	Note: Some test cases changed slightly (NFC) because the new pass implementation no longer attempts to fold ops.
	Note for LLVM integration: If your pipeline uses the removed `populate` function, try to simply remove that function call. Chances are you may not need it at all. If it is in fact needed, run the `-reconcile-unrealized-casts` pass right after the pass that used to populate the pattern.
	---------
	Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
	Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2024-06-19	[mlir][side effect] refactor(*): Include more precise side effects (#94213)	donald chen	40	-397/+687
	This patch adds more precise side effects to the current ops with memory effects, allowing us to determine which OpOperand/OpResult/BlockArgument the operation reads or writes, rather than just recording the reading and writing of values. This allows for convenient use of precise side effects to achieve analysis and optimization. Related discussions: https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
2024-06-19	[AMDGPU] Add IsSingle to a few Interp instructions (#95984)	Joe Nash	1	-1/+1
	A _e64 suffix should not be printed since these instructions only have one legal encoding length. The absence of the IsSingle flag is hidden by how the string is printed. We could fix it for GFX10 as well, but we shouldn't change the asm output to omit _e64 at this point. NFC.
2024-06-19	[AMDGPU] Add IsSingle to V_DIV_FMAS* for consistency. (#95983)	Joe Nash	1	-0/+2
	A _e64 suffix should not be printed since these instructions only have one legal encoding length. The absence of the IsSingle flag is hidden by how the string is printed, but fix it for consistency. NFC
2024-06-19	[RISCV][test] Pre-commit test case where ConstantHoisting fails to trigger	Alex Bradbury	1	-0/+12
	Our getIntImmCostInst is falling back to returning TCC_Free in this case even though both immediates take two instructions to materialise.
2024-06-19	[mlir][vector] Specify bounds of dynamic indices in vector.extract/insert (#95933)	Benjamin Maxwell	1	-0/+8
2024-06-19	[include-cleaner] don't consider the associated header unused (#67228)	Sam McCall	2	-6/+89
	Loosely, the "associated header" of `foo.cpp` is `foo.h`. It should be included, and many styles include it first. So far we haven't special-cased it in any way, and have required this include to be justified, e.g. if foo.cpp defines a function declared in foo.h, then the #include is allowed to check that these declarations match. However, this doesn't really align with what users want:
	- People reasonably want to include the associated header for the side effect of validating that it compiles. In the degenerate case, `lib.cpp` is just `#include "lib.h"` (see bug).
	- That `void foo(){}` IWYU-uses `void foo();` is a bit artificial, and most users won't internalize this. Instead they'll stick with the simpler model "include the header that defines your API". In the rare cases where these give different answers[1], our current behavior is a puzzling special case from the user POV. It is more user-friendly to accept both models.
	- Even where this diagnostic is a true positive (defs don't match header decls), the diagnostic does not communicate this usefully.
	Fixes https://github.com/llvm/llvm-project/issues/67140
	[1] Example of an associated header that's not IWYU-used:
	```
	// http.h
	inline URL buildHttpURL(string host, int port, string path) { return "http://" + host + ":" + port + "/" + path; }
	// http.cpp
	class HTTPURLHandler : URLHandler { ... };
	REGISTER_URL_HANDLER("http", HTTPURLHandler);
	```
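The loose "`foo.cpp` pairs with `foo.h`" notion can be approximated by comparing file stems. This is a simplified sketch under that assumption only; the helper names are hypothetical and the real include-cleaner logic may use different or additional criteria.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips the directory part and the last extension.
inline std::string stemOf(const std::string &path) {
  const std::size_t slash = path.find_last_of('/');
  const std::string name =
      slash == std::string::npos ? path : path.substr(slash + 1);
  const std::size_t dot = name.find_last_of('.');
  return dot == std::string::npos ? name : name.substr(0, dot);
}

// Assumption for illustration: a header is "associated" with a source file
// when both share the same stem, regardless of directory.
inline bool looksLikeAssociatedHeader(const std::string &source,
                                      const std::string &header) {
  return stemOf(source) == stemOf(header);
}
```

A tool following the commit's reasoning would then skip the "unused include" diagnostic for any header that passes such a check.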
2024-06-19	[AArch64] Avoid using NEON BSL for streaming[-compatible] functions (#95803)	Sander de Smalen	2	-36/+71

2024-06-19	[AArch64] Let patterns for NEON instructions check runtime mode. (#95560)	Sander de Smalen	8	-883/+313
	This helps identify any failures where the compiler might otherwise silently emit instructions that are not valid for the given runtime mode.
2024-06-19	[llvm-mca] Use llvm::erase_if (NFC) (#96029)	Kazu Hirata	1	-5/+3
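For readers unfamiliar with the idiom this NFC change adopts, the following is a standalone stand-in for an `erase_if`-style helper, written against the standard library only. `eraseIf` and `dropNegatives` are hypothetical names; this is not the LLVM implementation.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for an erase_if-style helper: removes every element for which
// pred returns true, using the classic erase/remove_if idiom.
template <typename Container, typename Pred>
void eraseIf(Container &c, Pred pred) {
  c.erase(std::remove_if(c.begin(), c.end(), pred), c.end());
}

// Example use: drop all negative entries from a vector.
inline std::vector<int> dropNegatives(std::vector<int> values) {
  eraseIf(values, [](int v) { return v < 0; });
  return values;
}
```

Wrapping the two-step idiom in one call is what lets a change like this shrink the call site by a few lines.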

2024-06-19	[InstCombine] Preserve all gep flags in gep of exact div fold	Nikita Popov	2	-4/+13

2024-06-19	[mlir][ArmSVE] Add `arm_sve.psel` operation (#95764)	Benjamin Maxwell	6	-3/+166
	This adds a new operation for the SME/SVE2.1 psel instruction. This allows selecting a predicate based on a bit within another predicate, essentially allowing for 2-D predication. Informally, the semantics are:
	```mlir
	%pd = arm_sve.psel %p1, %p2[%index] : vector<[4]xi1>, vector<[8]xi1>
	```
	=>
	```
	if p2[index % num_elements(p2)] == 1:
	  pd = p1 : type(p1)
	else:
	  pd = all-false : type(p1)
	```
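The informal semantics in this commit message translate directly into a scalar model. The sketch below models predicates as `std::vector<bool>` of fixed length, which is an assumption made for illustration (real SVE predicates are scalable); `psel` here is a hypothetical host-side function, not the MLIR op.

```cpp
#include <cstddef>
#include <vector>

using Predicate = std::vector<bool>;

// Models the informal semantics: if the selected bit of p2 is set, the result
// is p1; otherwise it is an all-false predicate of p1's length.
// Precondition: p2 is non-empty.
inline Predicate psel(const Predicate &p1, const Predicate &p2,
                      std::size_t index) {
  if (p2[index % p2.size()])
    return p1;
  return Predicate(p1.size(), false);
}
```

Note that the index wraps around the length of `p2`, and that `p1` and `p2` may have different lengths, as in the `vector<[4]xi1>` / `vector<[8]xi1>` example.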
2024-06-19	Avoid object libraries in the VS IDE (#93519)	Michael Kruse	7	-28/+70
	As discussed in #89743, when using the Visual Studio solution generators, object library projects are displayed as a collection of non-editable *.obj files. To look for the corresponding source files, one has to browse (or search) to the library's obj.libname project. This patch tries to avoid this as much as possible. For Clang, there is already an exception for XCode. We handle MSVC_IDE the same way. For MLIR, this is more complicated. There are explicit references to the obj.libname target that only work when there is an object library. This patch cleans up the reasons why an object library is needed:
	1. The obj.libname is modified in the calling CMakeLists.txt. Note that with use-only references, `add_library(<name> ALIAS <target>)` could have been used.
	2. A `libMLIR.so` (mlir-shlib) is also created. This works by linking the object libraries' object files into the libMLIR.so (in addition to the library's own .so/.a). XCode is handled using the `-force_load` linker option instead. Windows is not supported. This mechanism is different from LLVM's llvm-shlib, which is created by linking static libraries with `-Wl,--whole-archive` (and `-Wl,-all_load` on MacOS).
	3. The library might be added to an aggregate library. In-tree, this seems to be only `libMLIR-C.so` and the standalone example. In XCode, it uses the object library and `-force_load` mechanism as above. Again, this is different from `libLLVM-C.so`.
	4. Build an object library whenever it was built before this patch, except when generating a Visual Studio solution. This condition could be removed, but I am trying to avoid build breakages of whatever configurations others use.
	This seems to never have worked with XCode because of the explicit references to obj.libname (reason 1.). I don't have access to XCode, but I tried to preserve the current behavior. IMHO there should be a common mechanism to build aggregate libraries for all LLVM projects instead of the 4 that we have now.
	As far as I can see, this means for LLVM there are the following changes on whether object libraries are created:
	1. An object library is created even in XCode if FORCE_OBJECT_LIBRARY is set. I do not know how XCode handles it, but I also know CMake will abort otherwise.
	2. An object library is created even for explicitly SHARED libraries for building `libMLIR.so`. Again, mlir-shlib does not work otherwise. `libMLIR.so` itself is created using SHARED so this patch is marking it as EXCLUDE_FROM_LIBMLIR.
	3. For the second condition, it is now sensitive to whether the mlir-shlib is built at all (LLVM_BUILD_LLVM_DYLIB). However, an object library is still built using the fourth condition unless using the MSVC solution generator. That is, except with MSVC_IDE, when an object library was built before, it will also be an object library now.
2024-06-19	[RISCV] Move RISCVInsertVSETVLI::coalesceVSETVLIs back to before insertReadVL (#96056)	Luke Lau	1	-5/+5
2024-06-19	[mlir][ArmSME] Fold MoveTileSliceToVector + TransferWrite to StoreTileSlice (#95907)	Benjamin Maxwell	3	-6/+117
2024-06-19	[AArch64] NFC: Precommit some tests for SME	Sander de Smalen	3	-0/+727
	This shows that when compiling for +sme only, the code-generator doesn't consider streaming mode to determine whether to use (compatible) SVE instructions. A follow-up patch will fix these issues.
2024-06-19	[InstCombine] Preserve all gep flags when emitting offset	Nikita Popov	2	-1/+19

2024-06-19	[InstCombine] Preserve all gep flags in gep of select fold	Nikita Popov	2	-3/+23

2024-06-19	[InstCombine] Preserve all gep flags in dependent IV fold	Nikita Popov	2	-3/+34

2024-06-19	[X86] computeKnownBitsForPMADDWD/PMADDUBSW - tidyup line overflow by moving extensions to the multiply stage. NFC.	Simon Pilgrim	1	-21/+13
2024-06-19	[InstCombine] Preserve all gep flags in another select of gep fold	Nikita Popov	2	-3/+28

2024-06-19	[InstCombine] Preserve all flags in phi of gep fold	Nikita Popov	3	-4/+58
	Preserve the intersection of all flags. Add GEPNoWrapFlags::all() to serve as the initialization value for the intersection.
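The role of an `all()` value as the starting point of an intersection can be shown with a small bitmask. The class below is a hypothetical stand-in with three anonymous flag bits; it does not reproduce the actual `GEPNoWrapFlags` interface or its flag names.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical flag set with three independent bits (illustration only).
class NoWrapFlags {
  std::uint8_t bits;
  explicit NoWrapFlags(std::uint8_t b) : bits(b) {}

public:
  // Identity element for intersection: every flag set.
  static NoWrapFlags all() { return NoWrapFlags(0x7); }
  static NoWrapFlags fromRaw(std::uint8_t b) { return NoWrapFlags(b & 0x7); }
  NoWrapFlags intersect(NoWrapFlags other) const {
    return NoWrapFlags(bits & other.bits);
  }
  std::uint8_t raw() const { return bits; }
};

// Keeps only the flags that every incoming value carries. Starting from
// all() means the first operand is not special-cased.
inline NoWrapFlags intersectAll(std::initializer_list<NoWrapFlags> incoming) {
  NoWrapFlags result = NoWrapFlags::all();
  for (NoWrapFlags f : incoming)
    result = result.intersect(f);
  return result;
}
```

Starting from an empty flag set instead would make every intersection empty, which is why an "all flags" initial value is the natural choice for a fold over phi operands.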
2024-06-19	[include-cleaner] Use filename as requested, not resolved path	Kadir Cetinkaya	1	-2/+2
	This was an unintended change in d5297b72aa32ad3a69563a1fcc61294282f0b379. We don't want to resolve symlinks in filenames, as these might lead to unexpected spellings, compared to requested filenames.
2024-06-19	[InstCombine] Preserve all flags in select of gep fold	Nikita Popov	2	-3/+26
	Preserve the flag intersection.
2024-06-19	[RISCV][NFC] Add UnsupportedSched<F|D|A> multiclasses (#95948)	Anton Sidorenko	2	-96/+108
	These multiclasses will be used by new processors (e.g. https://github.com/llvm/llvm-project/pull/95427)
2024-06-19	DenseMap: support enum class keys (#95972)	Ramkumar Ramachandra	2	-2/+32
	Implemented using std::underlying_type.
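The general technique, hashing an `enum class` key through its underlying integer type, can be sketched with the standard library. This example uses `std::unordered_map` and a hypothetical `EnumHash` functor purely for illustration; it is not the `DenseMapInfo` code added by the commit.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

enum class Opcode : unsigned char { Add = 1, Sub = 2 };

// Hypothetical hasher: converts the enum to its underlying integer type and
// defers to the integer hash.
struct EnumHash {
  template <typename E> std::size_t operator()(E e) const {
    static_assert(std::is_enum<E>::value, "EnumHash requires an enum type");
    using U = typename std::underlying_type<E>::type;
    return std::hash<U>{}(static_cast<U>(e));
  }
};

// Example lookup table keyed by a scoped enum.
inline std::string mnemonic(Opcode op) {
  static const std::unordered_map<Opcode, std::string, EnumHash> names = {
      {Opcode::Add, "add"}, {Opcode::Sub, "sub"}};
  const auto it = names.find(op);
  return it == names.end() ? std::string("unknown") : it->second;
}
```

The same cast through `std::underlying_type` lets a map reuse whatever key handling it already has for plain integers.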
2024-06-19	[NFC][AArch64] Organise extensions by architecture version (#95898)	Lucas Duarte Prates	1	-355/+419
	This updates the way the AArch64 architecture extensions are organised in AArch64Features.td to improve readability and maintainability of the file. Extensions are now grouped by the corresponding architecture version in which they were introduced.
2024-06-19	[X86] combineConstantPoolLoads - early-out if the load is not from a constant pool. NFC.	Simon Pilgrim	1	-2/+5
	Don't embed inside the for-loop later on
2024-06-19	[X86] Replace (void) with [[maybe_unused]] for some variables unused (or only used in asserts). NFC.	Simon Pilgrim	1	-6/+3
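A minimal illustration of the pattern this commit title describes, using a hypothetical function: a variable that is only read inside an `assert` would trigger an unused-variable warning in builds with `NDEBUG`, and `[[maybe_unused]]` (C++17) suppresses it without a separate `(void)` statement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example function, not taken from the X86 backend.
inline int sumChecked(const std::vector<int> &values) {
  // Only read by the assert below, which compiles away under NDEBUG.
  [[maybe_unused]] const bool nonEmpty = !values.empty();
  assert(nonEmpty && "expected at least one value");

  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```

The older spelling would declare `nonEmpty` normally and add `(void)nonEmpty;` on a separate line, which is what makes the attribute form shorter.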
2024-06-19	[SPIR-V] Add __spirv_ wrapper to the OpAtomicExchange instruction (#95961)	Vyacheslav Levytskyy	2	-0/+97
	This PR adds __spirv_ wrapper to the OpAtomicExchange instruction. A new test case is added for the change introduced.
2024-06-19	[SPIR-V] Improve implementation of the duplicates tracker's storage (#95958)	Vyacheslav Levytskyy	3	-160/+113
	This PR continues https://github.com/llvm/llvm-project/pull/94952, managing FunctionType in the same way as pointee types in https://github.com/llvm/llvm-project/pull/94952 (that is, working with TypedPointers pointee types rather than with LLVM's original untyped pointers). This PR also fully reworks the base type for the duplicates tracker's storage to conform with and reuse DenseMapInfo. The previous implementation didn't store enough info to distinguish between key values (see isEqual() implemented as equality of hash values derived from the arguments). This, in turn, led to random crashes on very rare occasions when the hash value of an actual key matched the hash values of the empty and tombstone instances. In this PR we use std::tuple instead of a tailor-made class hierarchy, both reusing DenseMapInfo templates and getting rid of the crash condition.
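The key idea, comparing full key contents rather than only a derived hash, can be sketched with a tuple-keyed map. The field layout of `TypeKey` and the `DuplicatesTracker` class below are hypothetical, and `std::map` is used instead of `DenseMap` to keep the sketch self-contained.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical key: the tuple stores every distinguishing field, so two keys
// compare equal only when all fields match, never merely because of a hash
// collision.
using TypeKey = std::tuple<unsigned, unsigned, std::string>;

class DuplicatesTracker {
  std::map<TypeKey, int> ids;
  int nextId = 0;

public:
  // Returns the id already assigned to this key, or assigns a fresh one.
  int getOrCreate(const TypeKey &key) {
    const auto result = ids.emplace(key, nextId);
    if (result.second)
      ++nextId;
    return result.first->second;
  }
};
```

Because equality is defined on the tuple's members, a key can never be mistaken for a sentinel value just because two hashes happen to coincide.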
2024-06-19	[LICM] Fix dropped metadata (#95221)	Tim Gymnich	2	-0/+39
	LICM drops metadata for call instructions when cloning instructions. This patch just adds the missing `copyMetadata`. Fixes #91919.
2024-06-19	[flang] allow assumed-rank box in fir.store (#95980)	jeanPerier	4	-13/+58
	Codegen is done with a memcpy using the rank from the "value" descriptor, as in the fir.load case. The rationale is described in https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md.
2024-06-19	[X86][CodeGen] Share code between CompressEVEX pass and ND2NonND transform, NFCI	Shengchen Kan	5	-86/+115

2024-06-19	[mlir] Fix loop-like interface (#95817)	Ivan Kulagin	1	-4/+4
	Using the `this` pointer inside interface methods is illegal because it breaks concept-based interfaces. It is necessary to use `$_op` instead. Co-authored-by: ikulagin <i.kulagin@ispras.ru>
2024-06-19	[mlir][vector] Add `vector.from_elements` op (#95938)	Matthias Springer	7	-4/+305
	This commit adds a new operation to the vector dialect: `vector.from_elements`. The op constructs a new vector from a given list of scalar values. It is similar to `tensor.from_elements`.
	```mlir
	%0 = vector.from_elements %a, %b, %c, %a, %a, %a : vector<2x3xf32>
	```
	Constructing a new vector from elements was tedious before this op existed: a typical way was to define an `arith.constant ... : vector<...>`, followed by a chain of `vector.insert`. Folders/canonicalizations are added that can fold `vector.extract` ops and convert the `vector.from_elements` op into a `vector.splat` op. The LLVM lowering generates an `llvm.mlir.undef`, followed by a sequence of scalar insertions in the form of `llvm.insertelement`. Only 0-D and 1-D vectors are currently supported in the LLVM lowering.
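The commit message does not spell out when `vector.from_elements` is converted to `vector.splat`; a plausible condition, assumed here, is that every operand is the same value. The sketch models operands as integer value ids, and `isSplat` is a hypothetical helper rather than MLIR code.

```cpp
#include <vector>

// Assumed condition for the splat canonicalization: all operands refer to the
// same value. Operands are modelled as integer ids (illustration only).
inline bool isSplat(const std::vector<int> &operandIds) {
  if (operandIds.empty())
    return false;
  for (int id : operandIds)
    if (id != operandIds.front())
      return false;
  return true;
}
```

Under this assumption the `%a, %b, %c, %a, %a, %a` example from the commit message would not be a splat, while six copies of `%a` would be.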
2024-06-19	[flang] allow assumed-rank box in fir.alloca (#95947)	jeanPerier	2	-2/+14
	The alloca can be sized for the maximum number of ranks, which is reasonable (15 currently as per the standard). Introducing a rank-based dynamic allocation would complicate alloca hoisting and stack size analysis (this can be revisited if the standard changes to allow more ranks). No change is needed since this is already reflected in how the fir.box type is translated to LLVM.
2024-06-19	[LangRef] Relax semantics of writeonly / memory(write) (#95238)	Nikita Popov	1	-2/+19
	Instead of making reads immediate undefined behavior, consider these attributes in terms of their externally observable effects. We don't care if a location is read within the function, as long as it has no impact on observed behavior. In particular, allow:
	* Reading a location after writing it.
	* Reading a location before writing it (within the function) returns a poison value.
	The latter could be further relaxed to also allow things like "reading the value and then writing it back", but I'm not sure how one would specify that operationally (so that proof checkers can verify it). While here, also explicitly mention the fact that reads and writes to allocas and reads from constant globals are `memory(none)`. Fixes https://github.com/llvm/llvm-project/issues/95152.
2024-06-19	Reland "[scudo] Apply filling when realloc shrinks and re-grows a block in-place" (#95838)	Fabio D'Urso	2	-5/+31
	Reland of #93212, which had been reverted in commit bddd8eae17df6511aee789744ccdc158de817081.
2024-06-19	[mlir][emitc] Refactor ArithToEmitC: perform sign adaptation, type conversions / cast insertion in a single place (#95789)	Corentin Ferry	2	-52/+33
	Factor EmitC type signedness adaptation and cast operations in ArithToEmitC using adaptValueType and adaptIntegralTypeSignedness.
2024-06-19	[NFC] [Serialization] Unify how LocalDeclID can be created	Chuanqi Xu	13	-137/+193
	Currently we can create a LocalDeclID directly from an integer without verifying it. This may make it hard to refactor if we want to change the way we serialize DeclIDs (see https://github.com/llvm/llvm-project/pull/95897). It is also hard to debug if someday someone constructs a LocalDeclID with an incorrect value. So in this patch, I tried to unify the way we construct a LocalDeclID in ASTReader, where we construct the LocalDeclID from the serialized data. Also, we can now verify the constructed LocalDeclID sooner in the new interface.
2024-06-19	[mlir][vector] Add tests for xfer-permute-lowering (1/n)(nfc) (#95529)	Andrzej Warzyński	1	-6/+83
	Adds more tests to "vector-transfer-permutation-lowering.mlir", specifically for the `TransferWritePermutationLowering` pattern - such tests seem to be missing ATM. The following edge cases are covered:
	* plain fixed-width (supported)
	* scalable vectors with mask (supported)
	* plain fixed-width, masked (not supported)
	This is a part of a larger effort to make sure that all key cases for patterns under `populateVectorTransferPermutationMapLoweringPatterns` (*) are tested. I also want to make sure that tests use consistent function and variable names.
	(*) `transform.apply_patterns.vector.transfer_permutation_patterns` in TD parlance
2024-06-19	[llvm][CodeGen] Fix failure in window scheduler caused by phi (#95900)	Hua Tian	2	-4/+48
	In certain cases, the registers passed from the kernel MBB in a phi are not defined within the kernel MBB. This patch adds the corresponding handling.
2024-06-19	[IR] Mark shl constant expression as undesirable (#95940)	Nikita Popov	3	-8/+12
	Mark shl constant expressions undesirable, so that they are no longer automatically created by IRBuilder, constant folding, etc. This is in preparation for removing them entirely.
2024-06-19	[NFC][CodeGen] Remove dead ParallelCG.h/.cpp API (#95770)	Pierre van Houtryve	5	-142/+0
	LTOBackend inlined it a while ago and now uses a static copy. This API was unused. We can always restore it at some point if it's needed, but right now it's just bloat.
2024-06-19	[lldb][ObjC] Don't query objective-c runtime for decls in C++ contexts (#95963)	Michael Buch	3	-1/+29