Age | Commit message | Author | Files | Lines
4 days | Consider dynamic shapes in verify functions (users/hsiangkai/winograd-ops) | Hsiangkai Wang | 3 | -50/+105
8 days | Address more comments | Hsiangkai Wang | 1 | -18/+13
9 days | Add more tests in Linalg/roundtrip.mlir and Linalg/invalid.mlir | Hsiangkai Wang | 3 | -13/+137
12 days | Address Max191's comments | Hsiangkai Wang | 4 | -131/+163
14 days | Address ftynse's comments | Hsiangkai Wang | 4 | -241/+199
2024-06-20 | [mlir][linalg] Implement Conv2D using Winograd Conv2D algorithm | Hsiangkai Wang | 7 | -0/+779
Define high-level Winograd operators and convert conv_2d_nhwc_fhwc into Winograd operators. According to the Winograd Conv2D algorithm, we need three transform operators for input, filter, and output transformation.

The formula of the Winograd Conv2D algorithm is

Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A

filter transform: G x g x G^T
input transform: B^T x d x B
output transform: A^T x y x A

The implementation is based on the paper Fast Algorithms for Convolutional Neural Networks (https://arxiv.org/abs/1509.09308).
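To make the three transforms concrete, here is a minimal sketch of the one-dimensional analogue, F(2,3), which produces two outputs of a 3-tap filter from a 4-element input tile. It is not the MLIR implementation from this commit. The function names are hypothetical, and the constant matrices are the F(2,3) ones from the cited paper written out as explicit arithmetic.

```cpp
#include <array>
#include <cassert>

using Vec2 = std::array<double, 2>;
using Vec3 = std::array<double, 3>;
using Vec4 = std::array<double, 4>;

// Filter transform (1D form of G x g x G^T): G x g.
Vec4 transformFilter(const Vec3 &g) {
  return {g[0], 0.5 * (g[0] + g[1] + g[2]), 0.5 * (g[0] - g[1] + g[2]), g[2]};
}

// Input transform (1D form of B^T x d x B): B^T x d.
Vec4 transformInput(const Vec4 &d) {
  return {d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]};
}

// Output transform (1D form of A^T x y x A): A^T x y.
Vec2 transformOutput(const Vec4 &y) {
  return {y[0] + y[1] + y[2], y[1] - y[2] - y[3]};
}

// Y = A^T x [(G x g) @ (B^T x d)], where @ is the elementwise product.
Vec2 winogradF23(const Vec4 &d, const Vec3 &g) {
  Vec4 u = transformFilter(g);
  Vec4 v = transformInput(d);
  Vec4 m{};
  for (int i = 0; i < 4; ++i)
    m[i] = u[i] * v[i]; // 4 multiplications instead of 6
  return transformOutput(m);
}

// Reference: direct sliding-window computation of the same two outputs.
Vec2 directConv(const Vec4 &d, const Vec3 &g) {
  return {d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
          d[1] * g[0] + d[2] * g[1] + d[3] * g[2]};
}

// The two paths agree exactly on this tile because every intermediate
// value is a small multiple of 0.5.
void checkExampleTile() {
  Vec2 fast = winogradF23({1.0, 2.0, 3.0, 4.0}, {1.0, 2.0, 3.0});
  Vec2 ref = directConv({1.0, 2.0, 3.0, 4.0}, {1.0, 2.0, 3.0});
  assert(fast[0] == ref[0] && fast[1] == ref[1]);
}
```

The 2D formula in the commit message applies the same transforms along both dimensions of a tile, which is why each matrix appears on both sides of its operand.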
2024-06-20 | [AMDGPU] Preserve chain when selecting llvm.amdgcn.pops.exiting.wave.id (#96167) | Jay Foad | 2 | -1/+41
Without this, SelectionDAG could fail assertions when using the intrinsic in a non-entry BB.
2024-06-20 | [MC] Fix compilation | Alexis Engelke | 1 | -1/+1
2024-06-20 | [CodeGen] Use temp symbol for MBBs (#95031) | Alexis Engelke | 6 | -20/+32
Internal label names never occur in the symbol table, so when using an object streamer, there's no point in constructing these names and then adding them to hash tables -- they are never visible in the output. It's not possible to reuse createTempSymbol, because BPF has a different prefix for globals and basic blocks right now.
2024-06-20 | [NewPM] Move PassManager::run() into Impl.h (NFC) | Nikita Popov | 2 | -45/+49
We use explicit template instantiation for these classes, so there is no need to have the definition in the header. The places that instantiate the method will include the PassManagerImpl.h file.
2024-06-20 | Fix bazel build past e2296d8295516e9991cd6ca99ba193fbd232b6da (#96166) | Danial Klimkin | 1 | -0/+1
2024-06-20 | [ValueTracking] Support gep nuw in isKnownNonZero() | Nikita Popov | 2 | -2/+60
gep nuw can be null if and only if both the base pointer and offset are null. Unlike the inbounds case this does not depend on whether the null pointer is valid. Proofs: https://alive2.llvm.org/ce/z/PLoqK5
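The "if and only if" claim can be checked exhaustively in a toy model. The sketch below assumes an 8-bit address space and reduces the GEP address computation to a single unsigned addition; real `gep nuw` semantics also cover index scaling, which is not modelled here. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Adds base + offset in an 8-bit address space. Returns false if the
// addition would wrap, which is the case a nuw flag rules out.
bool addNoUnsignedWrap(uint8_t base, uint8_t offset, uint8_t &result) {
  unsigned wide = static_cast<unsigned>(base) + static_cast<unsigned>(offset);
  if (wide > 0xFFu)
    return false;
  result = static_cast<uint8_t>(wide);
  return true;
}

// Checks every (base, offset) pair: among the non-wrapping additions, the
// result is null (0) exactly when both base and offset are null.
bool nullIffBothNull() {
  for (unsigned b = 0; b <= 0xFFu; ++b) {
    for (unsigned o = 0; o <= 0xFFu; ++o) {
      uint8_t r = 0;
      if (!addNoUnsignedWrap(static_cast<uint8_t>(b), static_cast<uint8_t>(o), r))
        continue; // wrapping pairs are excluded by nuw
      bool resultNull = (r == 0);
      bool bothNull = (b == 0 && o == 0);
      if (resultNull != bothNull)
        return false;
    }
  }
  return true;
}
```

Without the no-wrap condition the property fails: in this model 0xFF + 1 wraps to 0 even though neither operand is null.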
2024-06-20 | [AArch64] Remove -debug flag from mlicm-csr-mask.mir | pvanhout | 1 | -1/+1
2024-06-20 | [LV] Remove loads from null from pr73894.ll test. | Florian Hahn | 1 | -9/+7
Load from null is UB, load from pointer arg instead.
2024-06-20 | [MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions. (#92934) | Hugo Trachino | 2 | -26/+65
Generalizes `DropUnitDimFromElementwiseOps` to support inner unit dimensions. This change stems from improving lowering of contractionOps for Arm SME, where we end up with inner unit dimensions on MulOp, BroadcastOp and TransposeOp, preventing the generation of outerproducts.

Discussed [here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
2024-06-20 | [flang][OpenMP] support more reduction types for procedure designators (#96057) | Tom Eccles | 4 | -68/+444
This re-uses reduction declarations from intrinsic operators to add support for reductions of allocatables, pointers, and arrays with procedure designators (e.g. min/max). I have split this into two commits to make it easier to review. The first one makes the functional change. The second cleans things up now that we can share much more code between intrinsic operators and procedure designators.
2024-06-20 | [MC] Eliminate two symbol-related hash maps (#95464) | aengelke | 15 | -101/+147
Previously, a symbol insertion required (at least) three hash table operations:
- Lookup/create entry in Symbols (main symbol table)
- Lookup NextUniqueID to deduplicate identical temporary labels
- Add entry to UsedNames, which is also used to serve as storage for the symbol name in the MCSymbol.

All three lookups are done with the same name, so combining these into a single table reduces the number of lookups to one. Thus, a pointer to a symbol table entry can be passed to createSymbol to avoid a duplicate lookup of the same name.

The new symbol table entry value is placed in a separate header to avoid including MCContext in MCSymbol or vice versa.
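The idea of folding the three tables into one can be sketched as follows. This is an illustration only, not the MC code: `SymbolTableValue`, `Context`, and `getOrCreateSymbol` are hypothetical names, and the standard containers stand in for LLVM's own string map and allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

struct Symbol {
  const std::string *name = nullptr; // points at the table key, not a copy
};

// One value type carries what the three separate tables used to hold.
struct SymbolTableValue {
  Symbol *symbol = nullptr;  // formerly the main symbol table
  unsigned nextUniqueID = 0; // formerly the temporary-label counter map
  bool used = false;         // formerly the used-names set
};

class Context {
public:
  Symbol *getOrCreateSymbol(const std::string &name) {
    // The only hash lookup: find the entry or create an empty one.
    auto it = table_.try_emplace(name).first;
    SymbolTableValue &entry = it->second;
    if (!entry.symbol) {
      // The key inside the map doubles as storage for the symbol name.
      storage_.push_back(Symbol{&it->first});
      entry.symbol = &storage_.back();
      entry.used = true;
    }
    return entry.symbol;
  }

  std::size_t size() const { return table_.size(); }

private:
  std::unordered_map<std::string, SymbolTableValue> table_;
  std::deque<Symbol> storage_; // push_back keeps earlier addresses valid
};
```

Pointing the symbol at the map key relies on `std::unordered_map` keeping element addresses stable across rehashing.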
2024-06-20 | [X86] Fix indention in X86InstrArithmetic.td, NFCI | Shengchen Kan | 1 | -199/+187
2024-06-20 | [LLVM] Add InsertPosition union-type to remove overloads of Instruction-creation (#94226) | Stephen Tozer | 12 | -3138/+396
This patch simplifies instruction creation by replacing all overloads of instruction constructors/Create methods that are identical other than the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator InsertBefore argument with a single version that takes an InsertPosition argument. The InsertPosition class can be implicitly constructed from any of the above, internally converting them to the appropriate BasicBlock::iterator value which can then be used to insert the instruction (or to not insert it if an invalid iterator is passed).

The upshot of this is that code will be deduplicated, and all callsites will switch to calling the new unified version without any changes needed to make the compiler happy. There is at least one exception to this; the construction of InsertPosition is a user-defined conversion, so any caller that was already relying on a different user-defined conversion won't work. In all of LLVM and Clang this happens exactly once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca with an AssertingVH<Instruction> argument, which must now be cast to an Instruction* by using `&*`. If this is more common elsewhere, it could be fixed by adding an appropriate constructor to InsertPosition.
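The pattern described above, one parameter type that is implicitly constructible from several kinds of insertion point, can be illustrated with a toy model. Everything in this sketch is hypothetical: `Block`, `InstHandle`, `InsertPos`, and `createInst` are stand-ins built on `std::list`, not LLVM's classes or signatures.

```cpp
#include <cassert>
#include <list>
#include <string>

using InstList = std::list<std::string>;

struct Block {
  InstList insts;
};

// Stand-in for "insert before this instruction".
struct InstHandle {
  Block *parent;
  InstList::iterator pos;
};

// Every supported way of naming an insertion point converts to this type,
// so creation functions need only one signature.
class InsertPos {
public:
  InsertPos() = default; // no position: do not insert
  InsertPos(InstHandle before) : block_(before.parent), pos_(before.pos) {}
  InsertPos(Block *atEnd)
      : block_(atEnd), pos_(atEnd ? atEnd->insts.end() : InstList::iterator{}) {}

  bool isValid() const { return block_ != nullptr; }
  Block *block() const { return block_; }
  InstList::iterator pos() const { return pos_; }

private:
  Block *block_ = nullptr;
  InstList::iterator pos_{};
};

// One function replaces an overload per insertion style. Returns whether
// the instruction was inserted into a block.
bool createInst(const std::string &name, InsertPos where = {}) {
  if (!where.isValid())
    return false;
  where.block()->insts.insert(where.pos(), name);
  return true;
}
```

A caller can pass `&block` or an `InstHandle` directly and the conversion happens at the call site. The exception the commit message mentions follows from a language rule: an implicit conversion sequence may contain at most one user-defined conversion, so an argument that itself needs a user-defined conversion before reaching one of these constructors will not compile without an explicit cast.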
2024-06-20 | [mlir][ArmSME] Lower extract from 2D scalable create_mask to psel (#96066) | Benjamin Maxwell | 6 | -10/+168
Example:

```mlir
%mask = vector.create_mask %a, %b : vector<[4]x[8]xi1>
%slice = vector.extract %mask[%index]
  : vector<[8]xi1> from vector<[4]x[8]xi1>
```

Becomes:

```mlir
%mask_rows = vector.create_mask %a : vector<[4]xi1>
%mask_cols = vector.create_mask %b : vector<[8]xi1>
%slice = arm_sve.psel %mask_cols, %mask_rows[%index]
  : vector<[8]xi1>, vector<[4]xi1>
```

Note: While psel is under ArmSVE it requires SME (or SVE 2.1), so this is currently the most logical place for this lowering.
2024-06-20 | [AMDGPU] Add ALL prefix to all RUN lines for better diagnostics | Jay Foad | 1 | -6/+6
2024-06-20 | [ARM] CMSE security mitigation on function arguments and returned values (#89944) | Lucas Duarte Prates | 4 | -15/+953
The ABI mandates two things related to function calls:
- Function arguments must be sign- or zero-extended to the register size by the caller.
- Return values must be sign- or zero-extended to the register size by the callee.

As a consequence, callees can assume that function arguments have been extended, and so can callers with regard to return values.

Here lies the problem: Nonsecure code might deliberately ignore this mandate with the intent of attempting an exploit. It might try to pass values that lie outside the expected type's value range in order to trigger undefined behaviour, e.g. out of bounds access.

With the mitigation implemented, Secure code always performs extension of values passed by Nonsecure code.

This addresses the vulnerability described in CVE-2024-0151.

Patches by Victor Campos.

---------

Co-authored-by: Victor Campos <victor.campos@arm.com>
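The effect of the mitigation can be sketched in portable C++ by modelling an argument register as a raw 32-bit value. This only illustrates the idea; the actual change is in the ARM backend's code generation. The function names and the 256-entry table are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Re-derives an unsigned 8-bit value from a 32-bit register, discarding
// whatever the caller left in the upper 24 bits.
uint32_t zeroExtend8(uint32_t reg) { return reg & 0xFFu; }

// Re-derives a signed 8-bit value from a 32-bit register using arithmetic
// that is well-defined on every C++ implementation.
int32_t signExtend8(uint32_t reg) {
  int32_t low = static_cast<int32_t>(reg & 0xFFu);
  return (low ^ 0x80) - 0x80;
}

// A Secure-side function taking an 8-bit index. It extends the value
// itself instead of trusting that the caller did, so the access below
// stays within the 256-entry table for any register contents.
int secureLookup(const int (&table)[256], uint32_t argRegister) {
  uint32_t index = zeroExtend8(argRegister);
  return table[index];
}
```

If the callee trusted the caller instead, a register value such as 0x105 passed as an 8-bit index would read past the end of the table.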
2024-06-20 | [AMDGPU] Add a RUN line to test the OSABI-PAL-ERR prefix | Jay Foad | 1 | -0/+1
2024-06-20 | [AMDGPU] Fix GFX90A/GFX940 check prefix typos | Jay Foad | 2 | -2/+2
2024-06-20 | [AMDGPU] Tweak comment to fix warning from filecheck_lint.py | Jay Foad | 1 | -2/+2
2024-06-20 | [AMDGPU] Fix typo "GXF" in check prefix | Jay Foad | 1 | -17/+17
2024-06-20 | [MachineLICM] Work-around Incomplete RegUnits (#95926) | Pierre van Houtryve | 3 | -10/+85
Reverts the behavior introduced by 770393b while keeping the refactored code. Fixes a miscompile on AArch64, at the cost of a small regression on AMDGPU. #96146 was opened to investigate the issue.
2024-06-20 | [MC] Remove SectionKind from MCSection (#96067) | aengelke | 27 | -328/+224
There are only three actual uses of the section kind in MCSection: isText(), XCOFF, and WebAssembly. Store isText() in the MCSection, and store other info in the actual section variants where required. ELF and COFF flags also encode all relevant information, so for these two section variants, remove the SectionKind parameter entirely. This allows removing the string switch (which is unnecessary and inaccurate) from createELFSectionImpl. This was introduced in [D133456](https://reviews.llvm.org/D133456), but apparently, it was never hit for non-writable sections anyway and the resulting kind was never used.
2024-06-20 | [LLD] [MinGW] Interpret an empty -entry option as no entry point (#96055) | Martin Storsjö | 2 | -1/+5
This fixes https://github.com/llvm/llvm-project/issues/93309, and seems to match how GNU ld handles this case.
2024-06-20 | [clang] Fix `static_cast` to array of unknown bound (#96041) | Mariya Podchishchaeva | 3 | -0/+37
Per P1975R0 an expression like static_cast<U[]>(...) defines the type of the expression as U[1]. Fixes https://github.com/llvm/llvm-project/issues/62863
2024-06-20 | mmapForContinuousMode: Align Linux's impl to __APPLE__'s more. NFC. (#95702) | NAKAMURA Takumi | 1 | -4/+26
2024-06-20 | [nsan] Fix style issue | Fangrui Song | 12 | -869/+841
The initial check-in of compiler-rt/lib/nsan #94322 has a lot of style issues. Fix them before the history becomes more useful. Pull Request: https://github.com/llvm/llvm-project/pull/96142
2024-06-20 | Update ExternalPreprocessorSource.h (#96144) | Danial Klimkin | 1 | -0/+3
Add missing includes.
2024-06-20 | Fix bazel build past abd95342f0b94e140b36ac954b8f8c29b1393861 (#96143) | Danial Klimkin | 1 | -0/+27
2024-06-20 | [mlir][vector] Disable Gather1DToConditionalLoads for scalable vectors (#96049) | Cullen Rhodes | 2 | -0/+13
Pattern scalarizes vector.gather operations and is incorrect for scalable vectors.
2024-06-20 | [flang] lower assumed-rank TARGET to intent(in) POINTER (#96082) | jeanPerier | 3 | -4/+26
The only special thing to do is to use fir.rebox_assumed_rank when reboxing the target to properly set the POINTER attribute inside the descriptor.
2024-06-20 | [flang] enable copy-in/out of assumed-rank arrays (#96080) | jeanPerier | 2 | -2/+37
Just remove the TODO and add a test. There is nothing special to do to deal with assumed-rank copy-in/out after the previous copy-in/out API change in https://github.com/llvm/llvm-project/pull/95822.
2024-06-20 | [IR] Remove support for shl constant expressions (#96037) | Nikita Popov | 18 | -141/+25
Remove support for shl constant expressions, as part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
2024-06-20 | [LLVM] Extend setModuleFlag interface. (#86031) | Daniel Kiss | 3 | -0/+27
Add the same interface variants to `Module::setModuleFlag` as `Module::addModuleFlag` has.
2024-06-20 | [mlir] Apply ClangTidy finding. | Adrian Kuegel | 1 | -1/+1
2024-06-20 | [lldb/DWARF] Fix type definition search with simple template names (#95905) | Pavel Labath | 5 | -46/+48
With simple template names, the template arguments aren't embedded in the DW_AT_name attribute of the type. The code in FindDefinitionTypeForDWARFDeclContext was comparing the synthesized template arguments on the leaf (most deeply nested) DIE, but that was not sufficient, as the difference can be at any level above that (Foo<T>::Bar vs. Foo<U>::Bar). This patch makes sure we compare the entire context. As a drive-by, I also remove the completely unnecessary ConstStringification of the GetDIEClassTemplateParams result.
2024-06-20 | Reland "[CVP] Check whether the default case is reachable (#79993)" (#96089) | Yingwei Zheng | 3 | -5/+339
This patch reverts https://github.com/llvm/llvm-project/pull/81585 as https://github.com/llvm/llvm-project/pull/78582 has landed. Now clang works well with the reproducer https://github.com/llvm/llvm-project/pull/79993#issuecomment-1936822679.
2024-06-19 | -fsanitize=vptr: Change hash function and simplify bit mixer | Fangrui Song | 2 | -32/+18
llvm::hash_value is not guaranteed to be deterministic. Use the deterministic xxh3_64bits. A strong bit mixer isn't necessary. Use a simpler one that works well with pointers.
2024-06-20 | [NFC] Fix header level in LangRef | Qiu Chaofan | 1 | -1/+1
2024-06-20 | [Serialization] No transitive identifier change (#92085) | Chuanqi Xu | 9 | -82/+210
Follow-up to https://github.com/llvm/llvm-project/pull/92083

The motivation is still to cut off unnecessary changes in the dependency chain. See the above link (recursively) for details.

After this patch (and the above patch), we can already do something pretty interesting. For example:

#### Motivation example

```
//--- m-partA.cppm
export module m:partA;

export inline int getA() {
  return 43;
}

export class A {
public:
  int getMem();
};

export template <typename T>
class ATempl {
public:
  T getT();
};

//--- m-partA.v1.cppm
export module m:partA;

export inline int getA() {
  return 43;
}

// Now we add a new declaration without introducing a new type.
// The consuming module which didn't use m:partA completely is expected to be
// not changed.
export inline int getA2() {
  return 88;
}

export class A {
public:
  int getMem();
  // Now we add a new declaration without introducing a new type.
  // The consuming module which didn't use m:partA completely is expected to be
  // not changed.
  int getMem2();
};

export template <typename T>
class ATempl {
public:
  T getT();
  // Add a new declaration without introducing a new type.
  T getT2();
};

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
  return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
  return getB();
}
```

In this example, module `m` exports two partitions, `:partA` and `:partB`, and a consumer `useBOnly` only consumes the entities from `:partB`. So we don't want the BMI of `useBOnly` to change if only `:partA` changes. After this patch, we can achieve that if the change to `:partA` doesn't introduce new types. (And we can lift this restriction once no-transitive-type-change is implemented.)

As the example shows, when we change the implementation of `:partA` from `m-partA.cppm` to `m-partA.v1.cppm`, we add a new function declaration `getA2()` at the global namespace, add a new member function `getMem2()` to class `A`, and add a new member function `getT2()` to class template `ATempl`. Since `:partA` is not used by `useBOnly` at all, the BMI of `useBOnly` won't change after we make the above changes.

#### Design details

The method used in this patch is similar to https://github.com/llvm/llvm-project/pull/92083 and https://github.com/llvm/llvm-project/pull/86912. It extends the 32-bit IdentifierID to 64 bits and uses the higher 32 bits to store the module file index, so that the encoding of an identifier won't be affected by other modules.

#### Overhead

Similar to https://github.com/llvm/llvm-project/pull/92083 and https://github.com/llvm/llvm-project/pull/86912. The change is only expected to increase the size of the on-disk .pcm files and not to affect compile-time performance. In my experiment, the size of the on-disk files only increased by 1%+ and I observed no compile-time impact.

#### Future Plans

I'll try to do the same thing for type IDs. IIRC, adding a new type in an unused unit won't change the dependency graph. I do think this is a significant win, and it will be a pretty good answer to "why modules are better than headers."
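The ID layout described under "Design details" can be sketched with plain bit operations. The sketch assumes only what the message states (a 64-bit ID whose higher 32 bits hold the module file index); the helper names and the meaning of the lower half as a per-module local index are assumptions for illustration, not Clang's actual API.

```cpp
#include <cassert>
#include <cstdint>

using IdentifierID = uint64_t;

// Packs the module file index into the higher 32 bits and the identifier's
// index within that module file into the lower 32 bits.
IdentifierID makeIdentifierID(uint32_t moduleFileIndex, uint32_t localIndex) {
  return (static_cast<uint64_t>(moduleFileIndex) << 32) | localIndex;
}

uint32_t moduleFileIndexOf(IdentifierID id) {
  return static_cast<uint32_t>(id >> 32);
}

uint32_t localIndexOf(IdentifierID id) {
  return static_cast<uint32_t>(id & 0xFFFFFFFFu);
}
```

Because the lower half is scoped to one module file, adding identifiers to another module changes only IDs whose higher half names that module, which is the property the patch relies on.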
2024-06-20 | [RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets (#93481) | Roger Ferrer Ibáñez | 4 | -0/+80
This change is a preliminary step to support trampolines on RISC-V. Trampolines are used by flang to implement obtaining the address of an internal program (i.e., a nested function in Fortran parlance). In this change we lower the `llvm.clear_cache` intrinsic on glibc targets to `__riscv_flush_icache`, which is what GCC is currently doing for Linux targets.
2024-06-20 | [HeaderSearch] Introduce LazyIdentifierInfoPtr for Controlling Macro in HeaderFileInfo | Chuanqi Xu | 5 | -30/+76
This patch saves 32 bits in HeaderFileInfo by combining a uint32_t and a pointer into a tagged pointer. This was reviewed as part of https://github.com/llvm/llvm-project/pull/92085 and was required to be split out as a separate commit.
2024-06-20 | [PowerPC] Make verifier happy after peephole on MMA COPYs (#94321) | Kai Luo | 2 | -7/+39
2024-06-20 | [mlir][linalg] Fix numerical issue with softmax (#96090) | Prashant Kumar | 2 | -3/+3
For more info: https://github.com/iree-org/iree/issues/17670#issuecomment-2167591878
2024-06-20 | [ADT] Update hash function of uint64_t for DenseMap (#95734) | Chuanqi Xu | 1 | -2/+5
(Background: See the comment: https://github.com/llvm/llvm-project/pull/92083#issuecomment-2168121729)

It looks like the hash function for 64-bit integers is not very good:

```
static unsigned getHashValue(const unsigned long long& Val) {
  return (unsigned)(Val * 37ULL);
}
```

Since the result is truncated to 32 bits, the higher 32 bits won't contribute to the result. So `0x1'00000001` will have the same result as `0x2'00000001`, `0x3'00000001`, ... and we may hit a lot of collisions in such cases.

I feel it is generally good to include the higher 32 bits in hashing functions.

Not sure who's the appropriate reviewer, adding some people by impressions.
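The collision is easy to reproduce outside of DenseMap. In the sketch below, `truncatingHash` mirrors the quoted function, while `foldingHash` is just one simple way to let the higher 32 bits contribute. It is an assumption for illustration and not necessarily the mixer this commit introduces.

```cpp
#include <cassert>
#include <cstdint>

// Same computation as the quoted getHashValue: multiply, then keep only
// the low 32 bits of the product.
uint32_t truncatingHash(uint64_t val) {
  return static_cast<uint32_t>(val * 37ULL);
}

// Illustrative alternative: fold the higher half into the lower half
// before multiplying, so both halves affect the 32-bit result.
uint32_t foldingHash(uint64_t val) {
  uint64_t mixed = val ^ (val >> 32);
  return static_cast<uint32_t>(mixed * 37ULL);
}
```

With `truncatingHash`, the keys `0x100000001` and `0x200000001` both hash to 37, because the product of the higher half with 37 lies entirely above bit 31. With `foldingHash` the same two keys produce different values.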