path: root/llvm/lib/Target/ARM/ARMParallelDSP.cpp
Age | Commit message | Author | Files | Lines
2025-03-20 | [Target] Use *Set::insert_range (NFC) (#132140) | Kazu Hirata | 1 | -1/+1
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.
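The replacement described above can be sketched with standard containers. This is a minimal illustration only: `insertRange` is a hypothetical free-function stand-in for the member `insert_range`, not LLVM's implementation, and `std::set` is used in place of the LLVM set types.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the member function `Dest.insert_range(Src)`.
// It forwards to the iterator-pair overload that the patch replaces.
template <typename SetT, typename RangeT>
void insertRange(SetT &Dest, const RangeT &Src) {
  Dest.insert(Src.begin(), Src.end());
}
```

A call such as `insertRange(Dest, Src)` names the source range once, which removes the chance of pairing `begin()` and `end()` from different containers.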
2025-03-15 | [ARM] Avoid repeated map lookups (NFC) (#131420) | Kazu Hirata | 1 | -4/+8
2025-02-14 | [ARM] Avoid repeated map lookups (NFC) (#127168) | Kazu Hirata | 1 | -3/+4
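These two commits carry no description. As a general illustration of what "avoiding repeated map lookups" usually means, here is a sketch using `std::map`; the function names are hypothetical and the code is not taken from ARMParallelDSP.cpp.

```cpp
#include <cassert>
#include <map>
#include <string>

// Before: count() and at() each search the map, so the key is looked up twice.
inline int lookupTwice(const std::map<std::string, int> &M,
                       const std::string &Key) {
  if (M.count(Key))
    return M.at(Key);
  return -1;
}

// After: a single find() returns an iterator that also gives the value.
inline int lookupOnce(const std::map<std::string, int> &M,
                      const std::string &Key) {
  auto It = M.find(Key);
  return It != M.end() ? It->second : -1;
}
```

Both functions return the same result, which is why such changes can be marked NFC (no functional change).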
2025-01-24 | [NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583) | Jeremy Morse | 1 | -1/+1
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago, but to ensure some type safety we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-11-12 | [ARM] Remove unused includes (NFC) (#115995) | Kazu Hirata | 1 | -1/+0
Identified with misc-include-cleaner.
2024-10-11 | [NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752) | Rahul Joshi | 1 | -6/+7
Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e., just lookup, no creation).
2024-06-29 | [IRBuilder] Don't include Module.h (NFC) (#97159) | Nikita Popov | 1 | -1/+2
This used to be necessary to fetch the DataLayout, but isn't anymore.
2024-06-24 | Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)" | Stephen Tozer | 1 | -2/+4
Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 | [IR][NFC] Update IRBuilder to use InsertPosition (#96497) | Stephen Tozer | 1 | -4/+2
Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.
2024-04-15 | [ARM] Don't include IRBuilder.h in ARMISelLowering.h (NFC) | Nikita Popov | 1 | -0/+1
Just a forward declaration is sufficient.
2023-06-28 | [llvm] Replace uses of Type::getPointerTo (NFC) | Youngsuk Kim | 1 | -6/+4
Partial progress towards removing in-tree uses of `Type::getPointerTo`, before we can deprecate the API. If the API is used solely to support an unnecessary bitcast, get rid of the bitcast as well. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153933
2023-05-19 | [ARM] Remove unused member variable MulCandidate::ReadOnly | Kazu Hirata | 1 | -1/+0
The last use was removed by: commit a33e311a3b96086248cf347222f18e14e7adcf84 Author: Sam Parker <sam.parker@arm.com> Date: Mon May 13 09:23:32 2019 +0000
2023-05-19 | [ARM] Remove unused declaration CreateParallelPairs | Kazu Hirata | 1 | -4/+0
The declaration was added without a corresponding function definition by: commit 85ad78b1cfa3932eb658365b74f5b08c25dbfb0e Author: Sam Parker <sam.parker@arm.com> Date: Thu Jul 11 07:47:50 2019 +0000
2022-08-28 | [Target] Qualify auto in range-based for loops (NFC) | Kazu Hirata | 1 | -3/+3
2022-08-01 | [ARMParallelDSP] Remove unnecessary ModRef intersection (NFC) | Nikita Popov | 1 | -2/+1
Intersecting with ModRef is a no-op, as these are the only two possible values.
2022-06-09 | [ARM][ParallelDSP] Fix self reference bug | Sam Parker | 1 | -0/+5
Ensure we don't generate a smlad intrinsic that takes itself as an argument. Differential Revision: https://reviews.llvm.org/D127213
2021-06-23 | [ARMParallelDSP] Remove unnecessary wrapper function (NFC) | Nikita Popov | 1 | -9/+1
AreSequentialAccesses() forwards directly to isConsecutiveAccess() and has an unnecessary template parameter to boot.
2021-01-24 | [Target] Use llvm::append_range (NFC) | Kazu Hirata | 1 | -2/+1
2021-01-23 | Revert "[Target] Use llvm::append_range (NFC)" | Kazu Hirata | 1 | -1/+2
This reverts commit cc7a23828657f35f706343982cf96bb6583d4d73. The X86WinEHState.cpp hunk seems to break certain builds.
2021-01-23 | [Target] Use llvm::append_range (NFC) | Kazu Hirata | 1 | -2/+1
2020-12-03 | [NFC] Reduce include files dependency. | dfukalov | 1 | -0/+1
1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489
2020-11-26 | [AA] Split up LocationSize::unknown() | Nikita Popov | 1 | -1/+1
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object). This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses. The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way. Differential Revision: https://reviews.llvm.org/D91649
2020-06-06 | LoopAnalysisManager.h - reduce includes to forward declarations. NFC. | Simon Pilgrim | 1 | -0/+2
Move implicit include dependencies down to header/source files.
2020-04-26 | [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly | Simon Pilgrim | 1 | -1/+0
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815
2020-04-21 | [ARM][ParallelDSP] Handle squaring multiplies | Sam Parker | 1 | -0/+4
The logic in ARMParallelDSP is set up to merge two 16-bit loads into a 32-bit load and feed them into the smlads. This requires that four loads are combined for the four inputs, but there wasn't actually a check for this. Differential Revision: https://reviews.llvm.org/D78492
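To make the "four loads for four inputs" requirement concrete, here is a scalar model of what a dual 16-bit multiply-accumulate computes. It is only a sketch: `packHalves` and `smladModel` are hypothetical names, a little-endian layout is assumed, and the model wraps on overflow without tracking the Q flag that the real instruction sets.

```cpp
#include <cassert>
#include <cstdint>

// Models widening two adjacent 16-bit loads into one 32-bit value:
// Lo lands in bits [15:0] and Hi in bits [31:16] (little-endian assumption).
inline uint32_t packHalves(int16_t Lo, int16_t Hi) {
  return static_cast<uint32_t>(static_cast<uint16_t>(Lo)) |
         (static_cast<uint32_t>(static_cast<uint16_t>(Hi)) << 16);
}

// Scalar model of a dual multiply-accumulate:
// Acc + lo(A) * lo(B) + hi(A) * hi(B), with each half treated as signed.
inline int32_t smladModel(uint32_t A, uint32_t B, int32_t Acc) {
  int64_t ALo = static_cast<int16_t>(A & 0xFFFFu);
  int64_t AHi = static_cast<int16_t>(A >> 16);
  int64_t BLo = static_cast<int16_t>(B & 0xFFFFu);
  int64_t BHi = static_cast<int16_t>(B >> 16);
  int64_t Sum = Acc + ALo * BLo + AHi * BHi;
  // Truncate to 32 bits to model wrap-around.
  return static_cast<int32_t>(static_cast<uint32_t>(Sum));
}
```

Each packed operand stands for two narrow loads, so two operands normally account for four loads. In a squaring pattern such as `smladModel(A, A, Acc)` both operands come from the same pair, which is one way the four-load assumption can fail to hold.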
2020-02-18 | [IR] Lazily number instructions for local dominance queries | Reid Kleckner | 1 | -12/+10
Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664
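The lazy numbering scheme described above can be sketched as follows. This is a simplified model under stated assumptions, not LLVM's code: `Inst`, `Block`, `comesBefore`, and `isOrderValid` are hypothetical names, and a `std::list` stands in for the instruction list.

```cpp
#include <cassert>
#include <list>

// Hypothetical instruction that carries a cached position number.
struct Inst {
  unsigned Order = 0;
};

// Hypothetical block that numbers its instructions on demand.
class Block {
  std::list<Inst> Insts;
  bool OrderValid = false;

  // Assigns increasing numbers in list order and marks the cache valid.
  void renumber() {
    unsigned N = 0;
    for (Inst &I : Insts)
      I.Order = N++;
    OrderValid = true;
  }

public:
  using iterator = std::list<Inst>::iterator;

  iterator end() { return Insts.end(); }

  // Inserting invalidates the cached numbering.
  iterator insert(iterator Pos) {
    OrderValid = false;
    return Insts.insert(Pos, Inst{});
  }

  // Erasing leaves a gap but keeps the relative order of the rest,
  // so the cached numbering stays valid.
  void erase(iterator It) { Insts.erase(It); }

  // Renumbers only if an insertion happened since the last query.
  bool comesBefore(iterator A, iterator B) {
    if (!OrderValid)
      renumber();
    return A->Order < B->Order;
  }

  bool isOrderValid() const { return OrderValid; }
};
```

A pass that mostly deletes instructions pays for one renumbering and then answers every later ordering query with an integer comparison.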
2020-01-23 | [Alignement][NFC] Deprecate untyped CreateAlignedLoad | Guillaume Chatelet | 1 | -2/+1
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260
2019-12-11 | [IR] Split out target specific intrinsic enums into separate headers | Reid Kleckner | 1 | -7/+8
This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
2019-10-16 | [ARM][ParallelDSP] Change smlad insertion order | Sam Parker | 1 | -16/+51
Instead of inserting everything after the 'root' of the reduction, insert all instructions as close to their operands as possible. This can help reduce register pressure. Differential Revision: https://reviews.llvm.org/D67392 llvm-svn: 374981
2019-09-09 | [ARM][ParallelDSP] Fix for sext input | Sam Parker | 1 | -3/+9
The incoming accumulator value can be discovered through a sext, in which case there will be a mismatch between the input and the result. So sign extend the accumulator input if we're performing a 64-bit mac. Differential Revision: https://reviews.llvm.org/D67220 llvm-svn: 371370
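The need for the sign extension can be illustrated with plain integer arithmetic. This is a minimal sketch with hypothetical helper names (`sextAcc`, `zextAcc`, `mac64`); it models the value computation only and is not the pass's IR transformation.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends a 32-bit accumulator to 64 bits, preserving its value.
inline int64_t sextAcc(int32_t Acc) { return static_cast<int64_t>(Acc); }

// Zero-extension, shown for contrast: a negative input becomes a large
// positive number, which would corrupt the accumulated sum.
inline int64_t zextAcc(int32_t Acc) {
  return static_cast<int64_t>(static_cast<uint32_t>(Acc));
}

// Hypothetical 64-bit multiply-accumulate over two 16-bit products.
// The 32-bit accumulator is widened first so all operands are 64-bit.
inline int64_t mac64(int16_t A0, int16_t B0, int16_t A1, int16_t B1,
                     int32_t Acc) {
  return sextAcc(Acc) + int64_t(A0) * B0 + int64_t(A1) * B1;
}
```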
2019-09-07 | Change TargetLibraryInfo analysis passes to always require Function | Teresa Johnson | 1 | -1/+1
Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284
2019-09-04 | [ARM][ParallelDSP] SExt mul for accumulation | Sam Parker | 1 | -5/+14
For any unpaired muls, we accumulate them as an input to the reduction. Check the type of the mul and perform a sext if the existing accumulator input type is not the same. Differential Revision: https://reviews.llvm.org/D66993 llvm-svn: 370851
2019-08-28 | [ARM][ParallelDSP] Change search for muls | Sam Parker | 1 | -166/+185
rL369567 reverted a couple of recent changes made to ARMParallelDSP because of a miscompilation error: PR43073. The issue stemmed from an underlying bug that was caused by adding muls into a reduction before it was proved that they could be executed in parallel with another mul. Most of the changes here are from the previously reverted commits. The additional changes are:
1) The Search function now doesn't insert any muls into the Reduction object. That now happens once the search has successfully finished.
2) For any muls added into the reduction but that weren't paired, we accumulate their values as an input into the smlad.
Differential Revision: https://reviews.llvm.org/D66660
llvm-svn: 370171
2019-08-21 | Revert r367389 (and follow-up r368404); it caused PR43073. | Nico Weber | 1 | -50/+76
llvm-svn: 369567
2019-08-15 | [llvm] Migrate llvm::make_unique to std::make_unique | Jonas Devlieghere | 1 | -2/+2
Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
2019-08-09 | [ARM][ParallelDSP] Replace SExt uses | Sam Parker | 1 | -3/+5
As loads are combined and widened, we replaced their sext users operands whereas we should have been replacing the uses of the sext. I've added a load of tests, with only a few of them originally causing assertion failures, the rest improve pattern coverage. Differential Revision: https://reviews.llvm.org/D65740 llvm-svn: 368404
2019-08-02 | [NFC][ARM[ParallelDSP] Rename/remove/change types | Sam Parker | 1 | -13/+8
Remove forward declaration, fold a couple of typedefs and change one to be more useful. llvm-svn: 367665
2019-08-02 | [NFC][ARM][ParallelDSP] Remove ValueList | Sam Parker | 1 | -10/+8
We only care about the first element in the list. llvm-svn: 367660
2019-08-01 | [NFC][ARM][ParallelDSP] Getters and renaming | Sam Parker | 1 | -16/+22
Add a couple of getters for Reduction and do some renaming of variables around CreateSMLAD for clarity. llvm-svn: 367522
2019-07-31 | [ARM][ParallelDSP] Convert to function pass | Sam Parker | 1 | -73/+45
Run across a whole function, visiting each basic block one at a time. Differential Revision: https://reviews.llvm.org/D65324 llvm-svn: 367389
2019-07-29 | [NFC][ARM[ParallelDSP] Cleanup of BinOpChain | Sam Parker | 1 | -81/+58
- Remove some unused typedefs.
- Rename BinOpChain struct to MulCandidate.
- Remove the size method of MulCandidate.
- Store only the first input of the ValueList provided to MulCandidate, as it's the only value we care about. This means we don't have to perform any ugly (and unnecessary) iterations of the list later on.
llvm-svn: 367208
2019-07-29 | [NFC][ARM][ParallelDSP] Remove AreSymmetrical | Sam Parker | 1 | -43/+0
We explicitly search for a parallel mac and we only care about its inputs, checking for symmetry doesn't add anything here. llvm-svn: 367205
2019-07-29 | [NFC][ARM][ParallelDSP] Remove PopulateLoads | Sam Parker | 1 | -9/+0
We no longer have to check what loads are used, all this is performed at the start of the transform, so it's not doing anything now. llvm-svn: 367204
2019-07-26 | [ARM][ParallelDSP] Combine structs | Sam Parker | 1 | -19/+15
Combine OpChain and BinOpChain structs as OpChain is a base class to BinOpChain that is never used. llvm-svn: 367114
2019-07-26 | [NFC][ARM][ParallelDSP] Cleanup isNarrowSequence | Sam Parker | 1 | -26/+5
Remove unused logic. llvm-svn: 367099
2019-07-24 | [ARM][ParallelDSP] Fix pointer operand reordering | Sam Parker | 1 | -2/+2
While combining two loads into a single load, we often need to reorder the pointer operands for the new load. This reordering was broken in the cases where there was a chain of values that built up the pointer. Differential Revision: https://reviews.llvm.org/D65193 llvm-svn: 366881
2019-07-23 | [ARM] Add opt-bisect support to ARMParallelDSP. | Eli Friedman | 1 | -0/+3
llvm-svn: 366851
2019-07-11 | [ARM][ParallelDSP] Change the search for smlads | Sam Parker | 1 | -252/+316
Two functional changes have been made here:
- Now search up from any add instruction to find the chains of operations that we may turn into a smlad. This allows the generation of a smlad which doesn't accumulate into a phi.
- The search function has been corrected to stop it falsely searching up through an invalid path.
The bulk of the changes have been making the Reduction struct a class and making it more C++y with getters and setters.
Differential Revision: https://reviews.llvm.org/D61780
llvm-svn: 365740
2019-05-30 | [NFC][ARM][ParallelDSP] Refactor narrow sequence | Sam Parker | 1 | -48/+19
Most of the code used for finding a 'narrow' sequence is not used, so I've removed it and simplified the calls from the smlad matcher. llvm-svn: 362104
2019-05-13 | [ARM][ParallelDSP] Relax alias checks | Sam Parker | 1 | -184/+174
When deciding the safety of generating smlad, we checked for any writes within the block that may alias with any of the loads that need to be widened. This is overly conservative because it only matters when there's a potential aliasing write to a location accessed by a pair of loads. Now we check for aliasing writes only once, during setup. If two loads are found to have an aliasing write between them, we don't add these loads to LoadPairs. This means that later during the transform, we can safely widen a pair without worrying about aliasing. However, to maintain correctness, we also need to change the way that wide loads are inserted because the order is now important. The MatchSMLAD method has also been changed, absorbing MatchReductions and AddMACCandidate to hopefully improve readability. Differential Revision: https://reviews.llvm.org/D6102 llvm-svn: 360567