Age | Commit message | Author | Files | Lines
11 hours | [NFC][LLVM] Pass/return SMLoc by value instead of const reference (#160797) | Rahul Joshi | 8 | -26/+23
SMLoc itself encapsulates just a pointer, so there is no need to pass or return it by reference.
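The reasoning can be illustrated with a minimal sketch. The `Loc` class below is a hypothetical stand-in written for this example, not the real `llvm::SMLoc`; it only assumes what the commit message states, namely that the type wraps a single pointer.

```cpp
#include <type_traits>

// Hypothetical stand-in for a source-location handle (NOT llvm::SMLoc).
// It wraps exactly one pointer, as the commit message describes.
class Loc {
  const char *Ptr = nullptr;

public:
  constexpr Loc() = default;
  constexpr explicit Loc(const char *P) : Ptr(P) {}
  constexpr bool isValid() const { return Ptr != nullptr; }
  constexpr const char *getPointer() const { return Ptr; }
};

// The wrapper is pointer-sized and trivially copyable, so a copy costs
// the same as copying the raw pointer.
static_assert(sizeof(Loc) == sizeof(const char *), "Loc is pointer-sized");
static_assert(std::is_trivially_copyable_v<Loc>, "Loc is cheap to copy");

// Taking and returning Loc by value avoids the extra indirection that a
// `const Loc &` parameter would introduce.
constexpr Loc advance(Loc L, unsigned N) { return Loc(L.getPointer() + N); }
```

With a type like this, `advance(Loc(Buf), 2)` simply returns a new pointer-sized handle, and nothing is gained by passing a reference to it.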
11 hours | [NFC][LLVM] Use Unix line endings for a few source files (#160794) | Rahul Joshi | 3 | -542/+542
11 hours | [NFC][LLVM] Fix line endings for DXILABI.cpp (#160791) | Rahul Joshi | 1 | -33/+33
Fix line ending to Unix style by running dos2unix on this file.
11 hours | [ARM] Add extra mulh tests with known-bits. NFC | David Green | 1 | -12/+247
12 hours | [BOLT][AArch64] Fix BUILD_SHARED_LIBS after #158738 (#160854) | Paschalis Mpeis | 1 | -1/+1
Link BOLTUtils against the AArch64 target to support the new option that enables instrumentation without LSE (see #158738). This fixes shared library builds, e.g.: https://lab.llvm.org/staging/#/builders/220/builds/1537 Note: the link points to a collapsing builder.
12 hours | [llvm] Proofread BugLifeCycle.rst (#160817) | Kazu Hirata | 1 | -7/+7
12 hours | [Support] Consolidate runOnNewStack (NFC) (#160816) | Kazu Hirata | 1 | -10/+8
This patch consolidates two implementations of runOnNewStack with "if constexpr".
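As a rough illustration of the technique, here is a sketch of how two overloads (one for callables returning `void`, one for callables returning a value) can collapse into a single template with `if constexpr`. The names `runErased` and `runWrapped` are hypothetical and invented for this example; the actual `runOnNewStack` code may be structured differently.

```cpp
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical type-erased primitive: it can only run a void() callback.
inline void runErased(const std::function<void()> &Fn) { Fn(); }

// One template covers both the void and the value-returning case.
template <typename Fn> auto runWrapped(Fn &&F) {
  using R = std::invoke_result_t<Fn &>;
  if constexpr (std::is_void_v<R>) {
    // Nothing to return: forward straight to the primitive.
    runErased([&] { F(); });
  } else {
    // Capture the result through the void() callback, then return it.
    std::optional<R> Result;
    runErased([&] { Result.emplace(F()); });
    return std::move(*Result);
  }
}
```

Because the discarded branch is not instantiated, the return type is deduced as `void` for void callables and as `R` otherwise, so no second overload or SFINAE constraint is needed.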
12 hours | [ADT] Apply Rule of Five to StringTable::Iterator (#160815) | Kazu Hirata | 1 | -6/+2
StringTable::Iterator has a user-defined copy assignment operator, a defaulted copy constructor, and a defaulted move constructor. This patch makes the copy assignment operator defaulted and adds a defaulted move assignment operator to adhere to the Rule of Five while making the operators constexpr.
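A minimal sketch of the pattern follows. `OffsetIterator` is a hypothetical class made up for illustration and is not the real `StringTable::Iterator`; it only shows what defaulting all copy/move operations as `constexpr` looks like.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical iterator over a character buffer (NOT StringTable::Iterator).
class OffsetIterator {
  const char *Base = nullptr;
  std::size_t Offset = 0;

public:
  constexpr OffsetIterator(const char *B, std::size_t O) : Base(B), Offset(O) {}

  // Rule of Five: every special member is declared, and all are defaulted.
  constexpr OffsetIterator(const OffsetIterator &) = default;
  constexpr OffsetIterator(OffsetIterator &&) = default;
  constexpr OffsetIterator &operator=(const OffsetIterator &) = default;
  constexpr OffsetIterator &operator=(OffsetIterator &&) = default;
  ~OffsetIterator() = default;

  constexpr char operator*() const { return Base[Offset]; }
  constexpr OffsetIterator &operator++() {
    ++Offset;
    return *this;
  }
};

// Defaulted operations keep the type trivially copyable.
static_assert(std::is_trivially_copyable_v<OffsetIterator>,
              "defaulted special members keep the iterator trivial");
```

A user-provided copy assignment operator would have made the type non-trivially-copyable; defaulting it avoids that and lets the compiler generate the member-wise copy.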
12 hours | [ADT] Refactor SmallPtrSetIterator (NFC) (#160814) | Kazu Hirata | 1 | -25/+30
SmallPtrSetIterator and its base class SmallPtrSetIteratorImpl collectively have the following responsibilities:
- type-safe user-facing iterator interface
- type-erased iterator increment/dereference core
- DebugEpochBase via inheritance

This patch refactors the two classes so that SmallPtrSetIteratorImpl implements everything except the type-safe user-facing interface.

Benefits:
- DebugEpochBase::HandleBase is now part of the type-erased class.
- AdvanceIfNotValid is now private in SmallPtrSetIteratorImpl.
- SmallPtrSetIterator is a very thin wrapper around SmallPtrSetIteratorImpl and should generate very little code on its own.
12 hours | [ADT] Reduce code duplication in SmallDenseMap (NFC) (#160813) | Kazu Hirata | 1 | -25/+13
This patch reduces code duplication by having allocateBuckets take a larger role. Specifically, allocateBuckets now checks to see if we need to allocate heap memory and initializes Small appropriately. With this patch, allocateBuckets mirrors deallocateBuckets cleanly. Both methods handle the Small mode without asserting and are responsible for constructing and destructing LargeRep.
12 hours | PeepholeOpt: Use initializer list (#160898) | Matt Arsenault | 1 | -2/+1
12 hours | [libc++] Support comparison of more than two data sets in compare-benchmarks | Louis Dionne | 1 | -35/+45
12 hours | [clang][bytecode][NFC] Simplify align_up/down implementation (#160880) | Timm Baeder | 1 | -7/+5
Fix a double assignment to a local variable and use the new popToAPSInt() overload.
12 hours | [llvm][clang] Use the VFS in `FileCollector` (#160788) | Jan Svoboda | 7 | -20/+36
This PR changes `llvm::FileCollector` to use the `llvm::vfs::FileSystem` API for making file paths absolute instead of using `llvm::sys::fs::make_absolute()` directly. This matches the behavior of the compiler on most other input files.
12 hours | Greedy: Make trySplitAroundHintReg try to match hints with subreg copies (#160294) | Matt Arsenault | 2 | -30/+50
This is essentially the same patch as 116ca9522e89f1e4e02676b5bbe505e80c4d4933; when trying to match a physreg hint, try to find a compatible physreg if there is a subregister copy. This has the slight difference of using getSubReg on the hint instead of getMatchingSuperReg (the other use should also use getSubReg instead, it's faster). At the moment this turns out to have very little effect. The adjacent code needs better handling of subregisters, so continue adding this piecemeal. The X86 test shows a net reduction in real instructions, plus a few new kills.
12 hours | [clang][bytecode][NFC] Use switches for pointer type distinction (#160879) | Timm Baeder | 2 | -26/+36
In the important places. They are all fully covered switch statements so we know where to add code when adding a new pointer type.
12 hours | Revert "[RegAlloc] Strengthen asserts in LiveRangeEdit::scanRemattable [nfc]" (#160897) | Philip Reames | 1 | -3/+3
Reverts llvm/llvm-project#160765. Failures on the buildbots indicate that the second assertion does not in fact hold.
13 hours | [RegAlloc] Add printer and dump for VNInfo [nfc] (#160758) | Philip Reames | 2 | -12/+26
Uses the existing format of the LiveRange printer, and just factors it out so that you can do vni->dump() when debugging, or log a vni in a debug print statement.
13 hours | llvm-tli-checker: Take ifunc symbols into account (#158596) | Gleb Popov | 2 | -2/+45
FreeBSD libc has a lot of symbols that are ifuncs, which makes TLI checker believe they are not available. This change makes the tool consider symbols with the STT_GNU_IFUNC type.
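The idea can be sketched at the raw ELF level. The helper names below are hypothetical; the numeric values for `STT_FUNC` (2) and `STT_GNU_IFUNC` (10) and the `st_info` layout (type in the low four bits) come from the ELF format, but this is not how the tool itself is written, since it works through LLVM's object-file APIs.

```cpp
#include <cstdint>

// Symbol type values as defined for ELF symbol tables.
constexpr std::uint8_t kSttFunc = 2;      // STT_FUNC
constexpr std::uint8_t kSttGnuIfunc = 10; // STT_GNU_IFUNC (GNU extension)

// st_info stores the binding in the high nibble and the type in the low one.
constexpr std::uint8_t symbolType(std::uint8_t StInfo) {
  return StInfo & 0x0f;
}

// Hypothetical predicate: treat both ordinary functions and ifuncs as
// "available" library functions.
constexpr bool isCallableSymbol(std::uint8_t StInfo) {
  const std::uint8_t Type = symbolType(StInfo);
  return Type == kSttFunc || Type == kSttGnuIfunc;
}
```

A checker that accepted only `STT_FUNC` would report every ifunc-based libc entry point as missing, which matches the symptom described above.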
13 hours | [AArch64][GlobalISel] Add support for ldexp (#160517) | Ryan Cowan | 3 | -3/+120
13 hours | [Clang][RVV][SVE] Cache getScalableVectorType lookups (#160108) | Shaoce SUN | 2 | -7/+46
Currently, RVV/SVE intrinsics are cached, but the corresponding type construction is not. As a result, `ASTContext::getScalableVectorType` can become a performance hotspot, since every query must run through a long sequence of type checks and macro expansions.
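The general shape of such a cache can be sketched as a memoized lookup. Everything below is hypothetical: the class name, the key layout, and the integer "type id" result are invented for illustration and do not reflect how `ASTContext` stores types.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical memoizing wrapper around an expensive lookup.
class ScalableTypeCache {
  std::unordered_map<std::uint64_t, int> Cache;
  unsigned SlowLookups = 0;

  // Pack the query into one key. Assumes NumFields < 256 (illustrative).
  static std::uint64_t makeKey(unsigned ElemKind, unsigned NumElts,
                               unsigned NumFields) {
    return (std::uint64_t(ElemKind) << 40) | (std::uint64_t(NumElts) << 8) |
           std::uint64_t(NumFields & 0xff);
  }

  // Stand-in for the long chain of type checks; counts how often it runs.
  int slowLookup(unsigned ElemKind, unsigned NumElts, unsigned NumFields) {
    ++SlowLookups;
    return int(ElemKind * 1000 + NumElts * 10 + NumFields);
  }

public:
  int get(unsigned ElemKind, unsigned NumElts, unsigned NumFields) {
    const std::uint64_t Key = makeKey(ElemKind, NumElts, NumFields);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // Cache hit: skip the slow path entirely.
    const int Result = slowLookup(ElemKind, NumElts, NumFields);
    Cache.emplace(Key, Result);
    return Result;
  }

  unsigned slowLookupCount() const { return SlowLookups; }
};
```

Repeated queries with the same arguments then pay for the slow path only once, which is the effect the patch aims for on the hot lookup.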
13 hours | [RegAlloc] Strengthen asserts in LiveRangeEdit::scanRemattable [nfc] (#160765) | Philip Reames | 1 | -3/+3
We should always be able to find the VNInfo in the original live interval which corresponds to the subset we're trying to spill, and the only cases where we have a VNInfo without a definition instruction are if the vni is unused, or corresponds to a phi. Adjust the code structure to explicitly check for PHIDef, and assert the stronger conditions.
13 hours | [RegAlloc] Add additional tracing in InlineSpiller::rematerializeFor (#160761) | Philip Reames | 1 | -2/+11
We didn't have trace logging for two cases in this routine, which sometimes makes it hard to tell what is going on. In addition to debug trace statements, add comments to explain the logic behind the early exits which don't mark the virtual register live. Suggestions on how to word these more precisely are very welcome; I'm not sure I understand all the intricacies of this code myself.
13 hours | [CodeGen] Adjust global-split remat heuristic to match LICM (#160709) | Philip Reames | 1 | -1/+2
This heuristic was originally added in 40c4aa with the stated purpose of avoiding global split on long live ranges created by MachineLICM hoisting trivially rematerializable instructions. In the meantime, various backends have introduced non-trivial rematerialization cases, MachineLICM gained an explicit triviality check, and we've reworked our APIs to match naming-wise. Let's move this heuristic back to truly trivial remat only. This is a functional change, though somewhat hard to hit. This change will cause non-trivially rematerializable instructions to be globally split more often. This is likely a good thing since non-trivial remat may not be legal at all possible points in the live interval, but may cost slightly more compile time. I don't have a motivating example; I found it when reviewing the callers of isRemMaterializable(MI).
14 hours | [MLIR] Fix LivenessAnalysis/RemoveDeadValues handling of dead function arguments (#160755) | Mehdi Amini | 3 | -3/+36
In #153973 I added correct handling of block arguments. Unfortunately, this was gated on operations that also have results. This wasn't intentional, and it excluded operations like functions from being correctly processed.
14 hours | [NFC][OpenACC][CIR] Extract 'base' class for Recipe generation (#160603) | Erich Keane | 3 | -293/+357
It was brought up on a previous review that the CIRGenOpenACCRecipe.h file was getting too large. I noticed that the 'dependent on template argument' parts were actually quite small, so I extract a base class in this patch that allows me to implement it in the .cpp file, plus minimize the amount of code that needs instantiating.
14 hours | Allowing RDV to call `getArgOperandsMutable()` (#160415) | Francisco Geiman Thiesen | 4 | -14/+165
## Problem

`RemoveDeadValues` can legally drop dead function arguments on private `func.func` callees. But call-sites to such functions aren't fixed if the call operation keeps its call arguments in a **segmented operand group** (i.e., uses `AttrSizedOperandSegments`), unless the call op implements `getArgOperandsMutable` and the RDV pass actually uses it.

## Fix

When RDV decides to drop callee function args, it should, for each call-site that implements `CallOpInterface`, **shrink the call's argument segment** via `getArgOperandsMutable()` using the same dead-arg indices. This keeps both the flat operand list and the `operand_segment_sizes` attribute in sync (that's what `MutableOperandRange` does when bound to the segment).

## Note

This change is a no-op for:
* call ops without segment operands (they still get their flat operands erased via the generic path)
* call ops whose callee args weren't dropped (public, external, non-`func.func`, unresolved symbol, etc.)
* `llvm.call`/`llvm.invoke` (RDV doesn't drop `llvm.func` args)

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
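The invariant described in the Fix section can be modelled with plain containers. This is only a sketch in standard C++: `CallOperands` and `eraseFromSegment` are hypothetical names, operands are plain integers, and no MLIR API is used. It assumes the dead-argument mask has one entry per operand in the chosen segment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a call op with segmented operands.
struct CallOperands {
  std::vector<int> Flat;                 // All operands, segments concatenated.
  std::vector<std::size_t> SegmentSizes; // Number of operands per segment.
};

// Erase the operands of segment `Seg` whose entry in `Dead` is true,
// updating the flat list and the segment size together.
// Assumes Dead.size() == Ops.SegmentSizes[Seg].
inline void eraseFromSegment(CallOperands &Ops, std::size_t Seg,
                             const std::vector<bool> &Dead) {
  // Find where the segment starts in the flat operand list.
  std::size_t Start = 0;
  for (std::size_t I = 0; I < Seg; ++I)
    Start += Ops.SegmentSizes[I];

  // Walk backwards so earlier indices stay valid while erasing.
  for (std::size_t I = Dead.size(); I-- > 0;) {
    if (!Dead[I])
      continue;
    Ops.Flat.erase(Ops.Flat.begin() + static_cast<std::ptrdiff_t>(Start + I));
    --Ops.SegmentSizes[Seg];
  }
}
```

Erasing only from the flat list would leave the recorded segment sizes stale, which is the kind of mismatch the patch avoids by going through the segment-bound mutable range.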
14 hours | [MLIR][TBLGen] Added compound assignment operator for any BitEnum (#160840) | Alexandra Sidorova | 5 | -4/+25
## Details:
- Added missing compound assignment operators `|=`, `&=`, `^=` to `mlir-tblgen`
- Replaced the arithmetic operators with the added assignment operators for `BitEnum` in the transformations
- Updated related documentation

## Tickets:
- Closes https://github.com/llvm/llvm-project/issues/158098
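For readers unfamiliar with the pattern, here is a hand-written sketch of what such operators look like for a scoped bit enum. The `Flags` enum is hypothetical, and the code actually emitted by `mlir-tblgen` may differ in naming and details.

```cpp
#include <cstdint>

// Hypothetical bit enum, written by hand for illustration.
enum class Flags : std::uint32_t { None = 0, A = 1, B = 2, C = 4 };

// Binary operators on the underlying integer value.
constexpr Flags operator|(Flags L, Flags R) {
  return static_cast<Flags>(static_cast<std::uint32_t>(L) |
                            static_cast<std::uint32_t>(R));
}
constexpr Flags operator&(Flags L, Flags R) {
  return static_cast<Flags>(static_cast<std::uint32_t>(L) &
                            static_cast<std::uint32_t>(R));
}
constexpr Flags operator^(Flags L, Flags R) {
  return static_cast<Flags>(static_cast<std::uint32_t>(L) ^
                            static_cast<std::uint32_t>(R));
}

// Compound assignment forms, defined in terms of the binary operators.
constexpr Flags &operator|=(Flags &L, Flags R) { return L = L | R; }
constexpr Flags &operator&=(Flags &L, Flags R) { return L = L & R; }
constexpr Flags &operator^=(Flags &L, Flags R) { return L = L ^ R; }
```

With these in place, `F = F | Flags::B;` can be written as `F |= Flags::B;`, which is the kind of replacement the second bullet describes.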
14 hours | [libc] Update the memory helper functions for simd types (#160174) | Joseph Huber | 3 | -18/+119
Summary: This unifies the interface to just be a bunch of `load` and `store` functions that optionally accept a mask / indices for gathers and scatters with masks. I had to rename this from `load` and `store` because it conflicts with the other version in `op_generic`. I might just work around that with a trait instead.
15 hours | [flang-rt] Set _POSIX_C_SOURCE on Darwin (#160130) | Leandro Lupori | 2 | -1/+17
Clang on Darwin enables non-POSIX extensions by default. This causes some macros to leak, such as HUGE from <math.h>, which causes some conflicts with Flang symbols (but not with Flang-RT, for now). It also causes some Flang-RT extensions to be disabled, such as FDATE, that checks for _POSIX_C_SOURCE. Setting _POSIX_C_SOURCE avoids these issues. This is already being done in Flang, but it was not ported to Flang-RT. This also fixes check-flang-rt on Darwin, as NoArgv.FdateNotSupported is broken since the flang runtime was moved to flang-rt. Fixes #82036
15 hours | [ASan] Update meminstrinsics to use library memmove rather than internal (#160740) | Dan Blackwell | 2 | -1/+4
Currently `memcpy` and `memset` intrinsics map through to the library implementations if ASan has been inited, whereas `memmove` always calls `internal_memmove`. This patch changes `memmove` to use the library implementation if ASan has been inited.
15 hours | [SCEV] Add tests for computing trip counts with align assumptions. | Florian Hahn | 1 | -0/+159
15 hours | [Flang][OpenMP] Enable no-loop kernels (#155818) | Dominik Adamski | 8 | -38/+210
Enable the generation of no-loop kernels for Fortran OpenMP code. target teams distribute parallel do pragmas can be promoted to no-loop kernels if the user adds the -fopenmp-assume-teams-oversubscription and -fopenmp-assume-threads-oversubscription flags. If the OpenMP kernel contains reduction or num_teams clauses, it is not promoted to no-loop mode. The global OpenMP device RTL oversubscription flags no longer force no-loop code generation for Fortran.
16 hours | [libc++] Switch back to plotting on revlist order for visualize-historical | Louis Dionne | 1 | -2/+2
That provides vastly better plots.
16 hours | [X86] canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::VPERMILPV handling (#160849) | Simon Pilgrim | 2 | -2/+2
X86ISD::VPERMILPV shuffles can't create undef/poison themselves, allowing us to fold freeze(vpermilps(x,y)) -> vpermilps(freeze(x),freeze(y))
16 hours | [X86] Add test showing failure to fold freeze(insertps(x,y,i)) -> insertps(freeze(x),freeze(y),i) (#160852) | Simon Pilgrim | 1 | -0/+20
16 hours | [mlir] Add splitDebugFilename field in DIComplileUnitAttr. (#160704) | Abid Qadeer | 9 | -20/+27
Mostly mechanical changes to add the missing field.
17 hours | [ARM] Remove -fno-unsafe-math from a number of tests. NFC | David Green | 7 | -382/+673
llvm.convert/to.fp16 and from.fp16 are no longer used / deprecated and do not need to be tested any more.
17 hours | [X86] canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::VPERMV handling (#160845) | Simon Pilgrim | 2 | -13/+5
X86ISD::VPERMV shuffles can't create undef/poison themselves, allowing us to fold freeze(vpermps(x,y)) -> vpermps(freeze(x),freeze(y))
17 hours | [clang] Support building native tools when cross-compiling standalone clang (#160605) | Ross Burton | 1 | -0/+6
When cross-compiling the LLVM project as a whole (from llvm/), if it cannot find presupplied tools it will create a native build environment to build the tools it needs. However, when doing a standalone build of clang (that is, from clang/ and linking against an existing libLLVM) this doesn't work. Instead a _target_ binary is built which predictably then fails.

The conventional workaround for this is to build the native tools in a separate native compile phase and pass the paths to the cross build, for example see OpenEmbedded[1] or Nix[2]. But we can do better!

The first problem is that LLVM_USE_HOST_TOOLS is only set in the llvm/ CMakeLists.txt, so setup_host_tool() will never consider building a native binary. This can be solved by setting LLVM_USE_HOST_TOOLS based on CMAKE_CROSSCOMPILING in clang/CMakeLists.txt in the standalone case.

Now setup_host_tool() will try to build a native tool, but it needs build_native_tool() from CrossCompile.cmake, so that also needs to be included.

Finally, the native binary then fails because there's no provider for the dependency "CONFIGURE_Clang_NATIVE", so use llvm_create_cross_target to create the native environment.

These few lines mirror what the lldb CMakeLists.txt does in the standalone case, so there is prior art for this.

[1] https://git.openembedded.org/openembedded-core/tree/meta/recipes-devtools/clang/clang_git.bb?id=e18d697e92b55e57124e80234369d46575226386#n212
[2] https://github.com/NixOS/nixpkgs/blob/3354d448f2a26117a74638957b0131ce3da9c8c4/pkgs/development/compilers/llvm/common/tblgen.nix#L54
17 hours | [clang][bytecode] Remove Program include from InterpFrame.h (#160843) | Timm Baeder | 3 | -4/+7
Program itself is unused in that file, so just include the needed headers.
17 hours | [SelectionDAG] Improve v2f16 maximumnum expansion (#160723) | Lewis Crawford | 3 | -405/+165
On targets where f32 maximumnum is legal, but maximumnum on vectors of smaller types is not legal (e.g. v2f16), try unrolling the vector first as part of the expansion. Only fall back to expanding the full maximumnum computation into compares + selects if maximumnum on the scalar element type cannot be supported.
17 hours | [X86] canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::PSHUFB handling (#160842) | Simon Pilgrim | 2 | -13/+5
X86ISD::PSHUFB shuffles can't create undef/poison themselves, allowing us to fold freeze(pshufb(x,y)) -> pshufb(freeze(x),freeze(y))
17 hours | [libc++][test] Guard non-guaranteed implicit-lifetime-ness cases with `_LIBCPP_VERSION` (#160627) | A. Jiang | 1 | -0/+11
And add some guaranteed cases (namely, for `expected`, `optional`, and `variant`) to `is_implicit_lifetime.pass.cpp`. It's somewhat unfortunate that `pair` and `tuple` are not guaranteed to propagate triviality of copy/move constructors, and MSVC STL fails to do so due to ABI compatibility. This affects the implicit-lifetime property.
17 hours | [QualGroup][docs] Reorganize QualGroup docs under Reference section (#160021) | LeeYoungJoon | 3 | -4/+2
This patch makes the following updates to the `QualGroup` documentation:

✅ 1. Move to Reference section
Relocated the Qualification Working Group (QualGroup) docs from the main index into the Reference section for better organization and consistency.

✅ 2. Add link in GettingInvolved
Inserted a proper link to the QualGroup documentation in the GettingInvolved sync-ups table, improving discoverability for newcomers.

✅ 3. Align structure with Security Group
Revised the documentation layout to follow the same structure pattern as the Security Group docs, ensuring consistency across LLVM working group references.
17 hours | [flang][Driver] Support -gsplit-dwarf. (#160540) | Abid Qadeer | 12 | -52/+199
This flag enables the compiler to generate most of the debug information in a separate file, which can be useful for executable size and link times. Clang already supports this flag. I have tried to follow the logic of the clang implementation where possible. Some functions were moved where they could be used by both clang and flang. The `addOtherOptions` function was renamed to `addDebugOptions` to better reflect its purpose. Clang also sets the `splitDebugFilename` field of the `DICompileUnit` in the IR when this option is present. That part is currently missing from this patch and will come in a follow-up PR.
17 hours | [X86] Add test showing failure to fold freeze(permilvar(x,y)) -> permilvar(freeze(x),freeze(y)) (#160836) | Simon Pilgrim | 1 | -0/+12
17 hours | [X86] Add test showing failure to fold freeze(vpermps(x,y)) -> vpermps(freeze(x),freeze(y)) (#160837) | Simon Pilgrim | 1 | -0/+20
18 hours | [X86] Add test showing failure to fold freeze(pshufb(x,y)) -> pshufb(freeze(x),freeze(y)) (#160835) | Simon Pilgrim | 1 | -0/+20
18 hours | [VPlan] Run CSE closer to VPlan::execute. (#160572) | Florian Hahn | 21 | -152/+85
Additional CSE opportunities are exposed after converting to concrete recipes/dissolving regions and materializing various expressions. Run CSE later, to capitalize on some of the late opportunities. PR: https://github.com/llvm/llvm-project/pull/160572