path: root/llvm/lib/CodeGen/CodeGenPrepare.cpp
Each entry below gives the commit date, subject, author, and diffstat (files changed, lines -removed/+added).
2020-09-11  [CodeGenPrepare] Simplify code. NFCI.  (Benjamin Kramer; 1 file, -16/+5)
2020-09-02  [CodeGenPrepare][X86] Teach optimizeGatherScatterInst to turn a splat pointer into GEP with scalar base and 0 index  (Craig Topper; 1 file, -65/+89)
This helps SelectionDAGBuilder recognize the splat can be used as a uniform base.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D86371
2020-08-28  [CodeGenPrepare] Zap the argument of llvm.assume when deleting it  (Benjamin Kramer; 1 file, -0/+5)
We know that the argument is most likely dead, so we can purge it early. Otherwise it would make it to codegen, where it can block further optimizations.
2020-08-28  [SVE] Make ElementCount members private  (David Sherwood; 1 file, -2/+2)
This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue().
Differential Revision: https://reviews.llvm.org/D86065
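A minimal usage sketch of the accessors described above, assuming a VectorType* in hand; the function and variable names here are illustrative only, not part of the patch.

```cpp
#include "llvm/IR/DerivedTypes.h"
#include "llvm/Support/TypeSize.h"
#include "llvm/Support/raw_ostream.h"

// Illustrative only: query an ElementCount through the accessors rather
// than touching the now-private Min/Scalable members.
static void printLaneInfo(llvm::raw_ostream &OS, llvm::VectorType *VTy) {
  llvm::ElementCount EC = VTy->getElementCount();
  // For scalable vectors getKnownMinValue() is only a lower bound; the real
  // lane count is a runtime multiple of it.
  OS << (EC.isScalable() ? "vscale x " : "") << EC.getKnownMinValue()
     << " lanes\n";
}
```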
2020-08-25  [ARM][CGP] Fix scalar condition selects for MVE  (David Green; 1 file, -4/+3)
The ARM backend does not handle select/select_cc on vectors with scalar conditions, preferring to expand them in CodeGenPrepare instead. This usually works, except when optimizing for size, where the optsize check would end up overruling the backend isSelectSupported check. We could handle the selects in ISel too, but this seems like smaller code than trying to splat the condition to all lanes.
Differential Revision: https://reviews.llvm.org/D86433
2020-07-31  Support addrspacecast initializers with isNoopAddrSpaceCast  (Matt Arsenault; 1 file, -1/+1)
Moves isNoopAddrSpaceCast to the TargetMachine. It logically belongs with the DataLayout.
2020-07-22  [CGP] Add Pass Dependencies  (Andrew Litteken; 1 file, -0/+4)
Add pass dependencies:
- TargetTransformInfoWrapperPass
- TargetPassConfig
- LoopInfoWrapperPass
- TargetLibraryInfoWrapperPass
This fixes inconsistencies when passes are added to the pipeline.
Reviewers: efriedma, kmclaughlin, paquette
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84346
2020-07-15  CodeGenPrep: remove AssertingVH references before deleting dead instructions.  (Tim Northover; 1 file, -9/+43)
CodeGenPrepare keeps fairly close track of various instructions it's seen, particularly GEPs, in maps and vectors. However, sometimes those instructions become dead and get removed while it's still executing. This trips the AssertingVH references to them in an asserts build and could lead to miscompiles in a release build (I've only seen a later segfault though).

So this patch adds a callback to RecursivelyDeleteTriviallyDeadInstructions which can make sure the instruction about to be deleted is removed from CodeGenPrepare's data structures.
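A caller-side sketch of how such a callback could be used, assuming the overload takes the callback as a trailing parameter; the parameter layout and the removeFromCgpMaps name are assumptions for illustration, not the actual code of the patch.

```cpp
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/Instruction.h"
#include "llvm/Transforms/Utils/Local.h"

// Hypothetical sketch: the lambda runs for each value that is about to be
// erased, giving CGP a chance to drop its AssertingVH handles first.
// removeFromCgpMaps is a stand-in name, not a real function.
void eraseDeadValueSafely(llvm::Instruction *I,
                          const llvm::TargetLibraryInfo *TLI) {
  llvm::RecursivelyDeleteTriviallyDeadInstructions(
      I, TLI, /*MSSAU=*/nullptr,
      [](llvm::Value *V) { /* removeFromCgpMaps(V); */ });
}
```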
2020-07-09  [SVE] Remove calls to VectorType::getNumElements from CodeGen  (Christopher Tetreault; 1 file, -2/+2)
Reviewers: efriedma, fpetrogalli, sdesmalen, RKSimon, arsenm
Reviewed By: RKSimon
Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82210
2020-07-08  [CodeGen] Fix warnings in sve-ld1-addressing-mode-reg-imm.ll  (David Sherwood; 1 file, -8/+13)
For the GetElementPtr case in the function AddressingModeMatcher::matchOperationAddr I've changed the code to use the TypeSize class instead of relying upon the implicit conversion to a uint64_t. As part of this we now check for scalable types and, if we encounter one, just bail out for now, as the subsequent optimisations don't currently support them. This change fixes up all warnings in the following tests:
llvm/test/CodeGen/AArch64/sve-ld1-addressing-mode-reg-imm.ll
llvm/test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
Differential Revision: https://reviews.llvm.org/D83124
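A rough sketch of the pattern described (names are made up, and it uses the TypeSize API as it existed around this time): consult the TypeSize explicitly and bail out on scalable types instead of letting the implicit uint64_t conversion fire.

```cpp
#include <cstdint>
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/TypeSize.h"

// Illustrative only: report a fixed allocation size, or fail when the size
// is scalable (only known up to a runtime vscale factor).
static bool getFixedAllocSize(const llvm::DataLayout &DL, llvm::Type *Ty,
                              uint64_t &SizeOut) {
  llvm::TypeSize TS = DL.getTypeAllocSize(Ty);
  if (TS.isScalable())
    return false; // bail out: downstream folds assume a fixed byte count
  SizeOut = TS.getFixedSize();
  return true;
}
```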
2020-07-08  Upgrade TypePromotionTransaction to be able to report changes in CodeGenPrepare  (serge-sans-paille; 1 file, -11/+14)
optimizeMemoryInst was reporting no change while still modifying the IR. Inspect the status of TypePromotionTransaction to report changes more accurately.
Related to https://reviews.llvm.org/D80916
Differential Revision: https://reviews.llvm.org/D81256
2020-06-25  CodeGenPrepare.cpp - remove unused IntrinsicsX86.h header. NFC.  (Simon Pilgrim; 1 file, -1/+0)
2020-06-25  Fix typos in CodeGenPrepare::splitLargeGEPOffsets comments.  (Simon Pilgrim; 1 file, -3/+3)
2020-06-22  Revert "[CGP] Enable CodeGenPrepares phi type convertion."  (Tres Popp; 1 file, -1/+1)
This reverts commit 67121d7b82ed78a47ea32f0c87b7317e2b469ab2. This is causing compile times to be 2x slower on some large binaries.
2020-06-21  [CGP] Enable CodeGenPrepares phi type convertion.  (David Green; 1 file, -1/+1)
2020-06-21  [CGP] Convert phi types  (David Green; 1 file, -0/+157)
If a collection of interconnected phi nodes is only ever loaded, stored or bitcast, then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phis are passed across basic block boundaries. This has to be done in CodeGenPrepare as it naturally straddles basic blocks.

The algorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes so that we do not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and stores are bitcast to the correct type, which should then be folded into the load/store, changing its type.

This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from AArch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extractelement handling, which increased the potency in some cases.

This adds it with an option that defaults to off, and it is disabled for 32-bit X86 due to potential issues around canonicalizing NaNs.

Differential Revision: https://reviews.llvm.org/D81827
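A compressed sketch of the kind of use-scan this description implies; it is not the actual implementation, the names and the simplified acceptance test are assumptions, and the real pass additionally walks operands (loads, undef, extractelement) before rewriting the subgraph.

```cpp
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Illustrative only: scan the phi graph reachable from Root and check that
// every user of a phi in the set is another phi, a store or a bitcast,
// i.e. the whole set could plausibly be rewritten in another type.
static bool usersLookConvertible(PHINode *Root) {
  SmallVector<PHINode *, 8> Worklist{Root};
  SmallPtrSet<PHINode *, 8> Visited;
  while (!Worklist.empty()) {
    PHINode *PN = Worklist.pop_back_val();
    if (!Visited.insert(PN).second)
      continue; // each phi is processed only once
    for (User *U : PN->users()) {
      if (auto *OtherPN = dyn_cast<PHINode>(U))
        Worklist.push_back(OtherPN); // keep following the phi graph
      else if (!isa<StoreInst>(U) && !isa<BitCastInst>(U))
        return false; // a use we would not know how to retype
    }
  }
  return true;
}
```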
2020-06-17  [CGP] Reset the debug location when promoting zext(s).  (Davide Italiano; 1 file, -9/+5)
When the zext gets promoted, it used to retain the original location, which pessimizes the debugging experience, causing an unexpected jump in stepping at -Og.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro).
Differential Revision: https://reviews.llvm.org/D81437
2020-06-15  [CodeGenPrepare] Reset the debug location when promoting trunc(s)  (Davide Italiano; 1 file, -0/+1)
The promotion machinery in CGP moves instructions while retaining their debug locations. When the transformation is local, this is mostly correct, but when instructions are moved across BBs, this is not always true and causes jumpiness in line tables. This is the first of a series of commits; sext(s) and zext(s) need to be treated similarly.
Differential Revision: https://reviews.llvm.org/D81879
2020-06-08  [SVE] Eliminate calls to default-false VectorType::get() from CodeGen  (Christopher Tetreault; 1 file, -2/+3)
Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet
Reviewed By: spatel, gchatelet
Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80313
2020-06-05  [CGP] Remove unnecessary MaybeAlign use (NFC)  (Nikita Popov; 1 file, -3/+3)
Stores now always have an alignment.
2020-06-03  [Statepoints][CGP] Minor parameter type cleanup  (Philip Reames; 1 file, -6/+5)
2020-06-02  Undo initialization of TRI in CGP as this is unconditionally initialized later.  (Eric Christopher; 1 file, -1/+1)
2020-06-02  Fix up clang-tidy warnings around null and pointers.  (Eric Christopher; 1 file, -23/+23)
2020-05-29  [CGP] Ensure address scaled offset is representable as int64_t  (Simon Pilgrim; 1 file, -2/+3)
AddressingModeMatcher::matchScaledValue was calling getSExtValue for a constant before ensuring that we can actually represent the value as an int64_t.
Fixes OSSFuzz#22723, which is a follow-up to rGc479052a74b2 (PR46004 / OSSFuzz#22357).
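A hedged sketch of the kind of guard described here and in the related matchAddr fix below (the helper name is illustrative): check that the APInt fits in a signed 64-bit integer before asking for its sign-extended value.

```cpp
#include "llvm/ADT/Optional.h"
#include "llvm/IR/Constants.h"

// Illustrative only: getSExtValue() asserts when the constant needs more
// than 64 bits, so check representability first.
static llvm::Optional<int64_t> asInt64(const llvm::ConstantInt *CI) {
  if (CI->getValue().getMinSignedBits() > 64)
    return llvm::None; // cannot be represented as an int64_t
  return CI->getSExtValue();
}
```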
2020-05-27  [Statepoint] Replace uses of isX functions with idiomatic isa<X>  (Philip Reames; 1 file, -1/+1)
Now that all of the statepoint-related routines have classes with isa support, let's clean up. I'm leaving the (dead) utilities in tree for a few days so that I can do the same cleanup downstream without breakage.
2020-05-24  [PatternMatch] abbreviate vector inst matchers; NFC  (Sanjay Patel; 1 file, -3/+2)
Readability is not reduced with these opcodes/match lines, so reduce odds of awkward wrapping from 80-col limit.
2020-05-22  [CGP] Ensure address offset is representable as int64_t  (Simon Pilgrim; 1 file, -5/+7)
AddressingModeMatcher::matchAddr was calling getSExtValue for a constant before ensuring that we can actually represent the value as an int64_t.
Fixes PR46004 / OSSFuzz#22357.
2020-05-17  Fix warning "defined but not used" for debug function (NFC)  (Mehdi Amini; 1 file, -1/+1)
2020-05-16  AllocaInst should store Align instead of MaybeAlign.  (Eli Friedman; 1 file, -1/+1)
Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
2020-05-16  [x86][CGP] try to hoist funnel shift above select-of-splats  (Sanjay Patel; 1 file, -0/+39)
This is basically the same patch as D63233, but converted to funnel shifts rather than regular shifts. I did not see a way to effectively share code for these 2 cases though. This follows D79718 and D79827 to re-fix PR37426 because that gets canonicalized to funnel shift intrinsics in IR. I did draft an alternative patch as an enhancement to "shouldSinkOperands()", but that was awkward because we have to key the transform from the select, but then look at both its users and its operands.
2020-05-14  [x86][CGP] improve sinking of splatted vector shift amount operand  (Sanjay Patel; 1 file, -67/+1)
Expands on the enablement of the shouldSinkOperands() TLI hook in D79718.
The last codegen/IR test diff shows what I suspected could happen - we were sinking all splat shift operands into a loop. But that's not what we want in general; we only want to sink the *shift amount* operand if it is a splat.
Differential Revision: https://reviews.llvm.org/D79827
2020-05-13  [CodeGenPrepare] Remove a superflouos variable. NFC.  (Benjamin Kramer; 1 file, -2/+1)
Fixes a -Wunused-variable warning in Release builds.
2020-05-13  [ARM] Convert floating point splats to integer  (David Green; 1 file, -1/+55)
Under MVE a vdup will always take a GPR, not a floating point value. During DAG combine we convert the types to a bitcast to an integer in an attempt to fold the bitcast into other instructions. This is OK, but only works inside the same basic block. To do the same trick across a basic block boundary we need to convert the type in CodeGenPrepare, before the splat is sunk into the loop.

This adds a convertSplatType function to CodeGenPrepare to do that, putting bitcasts around the splat to force the type to an integer. There is then some adjustment to the code in shouldSinkOperands to handle the extra bitcasts.

Differential Revision: https://reviews.llvm.org/D78728
2020-05-11  [CGP] remove duplicate function for finding a splat shuffle; NFC  (Sanjay Patel; 1 file, -13/+1)
2020-05-08  [SampleFDO] For functions without profiles, provide an option to put them in a special text section.  (Wei Mi; 1 file, -0/+14)
For sampleFDO, because the optimized build uses a profile generated from a previous release, previously we couldn't tell whether a function without a profile was truly cold or just newly created, so we had to treat them conservatively and put them in the .text section instead of .text.unlikely. The result was that when pursuing the best performance by locking .text.hot and .text in memory, we wasted a lot of memory keeping cold functions inside.

In https://reviews.llvm.org/D66374, we introduced the profile symbol list to discriminate functions being cold versus functions being newly added. This mechanism works quite well for regular use cases in AutoFDO.

However, in some cases we can only have a partial profile when optimizing a target. The partial profile may be an aggregated profile collected from many targets. The profile symbol list method used for regular sampleFDO profiles is not applicable to the partial profile use case because it may be too large and introduce many false positives.

To solve the problem for the partial profile use case, we provide an option called --profile-unknown-in-special-section. For functions without profiles, we will still treat them conservatively in compiler optimizations -- for example, treat them as warm instead of cold in the inliner. When we use profile info to add a section prefix for functions, we will discriminate functions known to be not cold versus functions without profiles (being unknown), and we will put functions being unknown in a special text section called .text.unknown. The runtime system will have the flexibility to decide where to put the special section in order to achieve a balance between performance and memory saving.

Differential Revision: https://reviews.llvm.org/D62540
2020-05-07  [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare.  (Hiroshi Yamauchi; 1 file, -0/+25)
Summary: This helps detect some missed BFI updates during CodeGenPrepare. This is debug-build only and disabled behind a flag. Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77417
2020-05-05  [NFC][CostModel] Add TargetCostKind to relevant APIs  (Sam Parker; 1 file, -4/+11)
Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html
Differential Revision: https://reviews.llvm.org/D79002
2020-04-28  [TTI] Add TargetCostKind argument to getUserCost  (Sam Parker; 1 file, -1/+2)
There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API.

getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-27  [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().  (Craig Topper; 1 file, -2/+2)
This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here; for example, removing a use of getElementType on a pointer when we could just use getFunctionType from the call.
Differential Revision: https://reviews.llvm.org/D78882
2020-04-23  [SVE] Remove calls to isScalable from CodeGen  (Christopher Tetreault; 1 file, -1/+1)
Reviewers: efriedma, sdesmalen, stoklund, sunfish
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77755
2020-04-20  [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.  (Craig Topper; 1 file, -4/+6)
Differential Revision: https://reviews.llvm.org/D78443
2020-04-20  Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign."  (Craig Topper; 1 file, -6/+4)
This is breaking the clang build. This reverts commit 897409fb56f4525639b0e47e88960f24cd91c924.
2020-04-20  [Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.  (Craig Topper; 1 file, -4/+6)
Differential Revision: https://reviews.llvm.org/D78443
2020-04-17  Remove asserting getters from base Type  (Christopher Tetreault; 1 file, -1/+1)
Summary: Remove asserting vector getters from Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value.
Reviewers: dexonsmith, sdesmalen, efriedma
Reviewed By: efriedma
Subscribers: cfe-commits, hiraditya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D77278
2020-04-16  [SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP.  (Craig Topper; 1 file, -0/+119)
I've always found the "findValue" a little odd and inconsistent with other things in SDB. This simplifies the code in SDB to just handle a splat constant address or a 2-operand GEP in the same BB. This removes the need for "findValue" since the operands to the GEP are guaranteed to be available.

The splat constant handling is new, but was needed to avoid regressions due to constant folding combining GEPs created in CGP. CGP is now responsible for canonicalizing gather/scatters into this form. The pattern I'm using for scalarizing, a scalar GEP followed by a GEP with an all zeroes index, seems to be subject to constant folding that the insertelement+shufflevector was not.

Differential Revision: https://reviews.llvm.org/D76947
2020-04-12  [CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface.  (Craig Topper; 1 file, -4/+2)
Differential Revision: https://reviews.llvm.org/D77929
2020-04-10  Clean up usages of asserting vector getters in Type  (Christopher Tetreault; 1 file, -2/+2)
Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value.
Reviewers: stoklund, sdesmalen, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77272
2020-03-31  Remove "mask" operand from shufflevector.  (Eli Friedman; 1 file, -2/+2)
Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31  [CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall  (Guozhi Wei; 1 file, -0/+8)
The attached test case is simplified from tcmalloc. Both function calls should be optimized as tail calls, but LLVM can only optimize the first call; the second call can't be optimized because the function dupRetToEnableTailCallOpts failed to duplicate the ret into block case2. Two problems blocked the duplication:
1. The intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts.
2. The control flow is more complex than expected; dupRetToEnableTailCallOpts can only duplicate the ret into its predecessor, but here we have an intermediate block between the call and the ret.
The solutions (a sketch of the first is shown below):
1. Since CodeGenPrepare is already at the end of the LLVM IR phase, we can simply delete the intrinsic call to llvm.assume.
2. A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late; there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it.
Differential Revision: https://reviews.llvm.org/D76539
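A small illustrative sketch of what "delete the intrinsic call to llvm.assume" can look like at the IR level; the function name is made up and this is not the exact code of the patch.

```cpp
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/IntrinsicInst.h"

using namespace llvm;

// Illustrative only: erase llvm.assume calls from a block so they no longer
// sit between a call and the ret and block tail-call duplication.
static bool stripAssumes(BasicBlock &BB) {
  bool Changed = false;
  for (auto It = BB.begin(), End = BB.end(); It != End;) {
    Instruction &I = *It++; // advance first: we may erase I below
    if (auto *II = dyn_cast<IntrinsicInst>(&I))
      if (II->getIntrinsicID() == Intrinsic::assume) {
        II->eraseFromParent();
        Changed = true;
      }
  }
  return Changed;
}
```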
2020-03-25  Minor fixes to a comment in CodeGenPrepare  (Juneyoung Lee; 1 file, -1/+1)