path: root/llvm/lib/CodeGen/CodeGenPrepare.cpp
2022-06-03  [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC  (Fangrui Song, 1 file, -1/+1)
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added out of cargo-culting. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.
2022-05-26  Reland "[Propeller] Promote functions with propeller profiles to .text.hot."  (Rahman Lavaee, 1 file, -1/+20)
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554. The major change here is using `addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests. Differential Revision: https://reviews.llvm.org/D126518
2022-05-26  Revert "[Propeller] Promote functions with propeller profiles to .text.hot."  (Rahman Lavaee, 1 file, -20/+1)
This reverts commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
2022-05-26  [Propeller] Promote functions with propeller profiles to .text.hot.  (Rahman Lavaee, 1 file, -1/+20)
Today, text section prefixes (none, .unlikely, .hot, and .unknown) are determined based on the PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, with `-Wl,-keep-text-section-prefix=true`, Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely). This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix`, which defaults to `true`. The new implementation refactors the parsing of the basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions' text prefix. `BasicBlockSectionsProfileReader` will be used by both the `BasicBlockSections` pass and `CodeGenPrepare`. Differential Revision: https://reviews.llvm.org/D122930
2022-05-23  [CGP] Freeze condition when despeculating ctlz/cttz  (Nikita Popov, 1 file, -2/+6)
Freeze the condition of the newly introduced conditional branch, to avoid immediate undefined behavior if the input to ctlz/cttz was originally poison. Differential Revision: https://reviews.llvm.org/D125887
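For context, "despeculating" ctlz/cttz means replacing an unconditional call to the zero-is-poison form of the intrinsic with an explicit is-zero branch, so the intrinsic only executes on non-zero inputs; the freeze added here keeps the new branch condition well-defined even when the input was poison. A rough scalar sketch of the despeculated shape (hypothetical helper names, not the actual pass code):

```cpp
#include <cassert>
#include <cstdint>

// Models the zero-is-poison form of llvm.ctlz: only defined for X != 0.
static unsigned ctlzNonZero(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Bit = 1u << 31; (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Despeculated form: branch on an is-zero test (the condition CGP now
// freezes) so the poison-prone intrinsic only runs on non-zero input.
static unsigned despeculatedCtlz(uint32_t X) {
  if (X == 0)  // the conditional branch CodeGenPrepare introduces
    return 32; // bit width: the defined result for a zero input
  return ctlzNonZero(X);
}
```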
2022-05-18  Extend switch condition in optimizeSwitchPhiConst when free  (Matthias Braun, 1 file, -4/+29)
In a case like:

    switch ((i32)x) {
    case 42:
      phi((i64)42, ...);
    }

replace `(i64)42` with `zext(x)` when we can do so for free. This fixes a part of https://github.com/llvm/llvm-project/issues/55153. Differential Revision: https://reviews.llvm.org/D124897
2022-05-13  [IRBuilder] Add IsInBounds parameter to CreateGEP()  (Nikita Popov, 1 file, -10/+4)
We commonly want to create either an inbounds or non-inbounds GEP based on a boolean value, e.g. when preserving inbounds from existing GEPs. Directly accept such a boolean in the API, rather than requiring a ternary between CreateGEP and CreateInBoundsGEP. This change is not entirely NFC, because we now preserve an inbounds flag in a constant expression edge-case in InstCombine.
2022-05-11  [CodeGenPrepare] Use const reference to avoid unnecessary APInt copy. NFC  (Craig Topper, 1 file, -1/+1)
Spotted while looking at Matthias' patches. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124985
2022-05-11  Fix endless loop in optimizePhiConst with integer constant switch condition  (Matthias Braun, 1 file, -1/+5)
Avoid endless loop in degenerate case with an integer constant as switch condition as reported in https://reviews.llvm.org/D124552
2022-05-10  CodeGenPrepare: Replace constant PHI arguments with switch condition value  (Matthias Braun, 1 file, -1/+57)
We often see code like the following after running SCCP:

    switch (x) {
    case 42:
      phi(42, ...);
    }

This tends to produce bad code as we currently materialize the constant phi-argument in the switch-block. This increases register pressure and if the pattern repeats for `n` case statements, we end up generating `n` constant values. This changes CodeGenPrepare to catch this pattern and revert it back to:

    switch (x) {
    case 42:
      phi(x, ...);
    }

Differential Revision: https://reviews.llvm.org/D124552
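The rewrite is sound because on the edge out of a `case 42:` block the switch condition is known to equal 42, so a constant phi argument can be replaced by the condition value itself. A toy scalar illustration of that equivalence (not the pass code):

```cpp
#include <cassert>
#include <cstdint>

// "Before": the constant is rematerialized in the case block.
static int64_t phiWithConstant(int32_t X) {
  switch (X) {
  case 42: return 42;  // constant phi argument
  default: return -1;
  }
}

// "After": the switch condition value is reused instead, saving a
// constant materialization (and a register) in the case block.
static int64_t phiWithCondition(int32_t X) {
  switch (X) {
  case 42: return X;   // X is provably 42 on this edge
  default: return -1;
  }
}
```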
2022-05-10  Avoid 8- and 16-bit switch conditions on x86  (Matthias Braun, 1 file, -1/+1)
This adds a `TargetLoweringBase::getSwitchConditionType` callback to give targets a chance to control the type used in `CodeGenPrepare::optimizeSwitchInst`. Implement callback for X86 to avoid i8 and i16 types where possible as they often incur extra zero-extensions. This is NFC for non-X86 targets. Differential Revision: https://reviews.llvm.org/D124894
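A minimal sketch of what such a target hook might compute; the function name and shape here are illustrative stand-ins, not the actual `TargetLoweringBase` API:

```cpp
#include <cassert>

// Hypothetical stand-in for a target callback that picks the bit width
// CodeGenPrepare uses when promoting a switch condition on x86.
static unsigned switchConditionWidthForX86(unsigned CondWidthInBits) {
  // i8/i16 conditions tend to incur extra zero-extensions on x86,
  // so round them up to i32; i32 and wider are left alone.
  return CondWidthInBits < 32 ? 32 : CondWidthInBits;
}
```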
2022-04-13  [InlineAsm] Add support for address operands ("p").  (Jonas Paulsson, 1 file, -1/+2)
This patch adds support for inline assembly address operands using the "p" constraint on X86 and SystemZ. This was in fact broken on X86 (see example at https://reviews.llvm.org/D110267, Nov 23). These operands should probably be treated the same as memory operands by CodeGenPrepare, which is noted with a "TODO" there. Review: Xiang Zhang and Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122220
2022-03-16  Cleanup codegen includes  (serge-sans-paille, 1 file, -2/+0)
This is a (fixed) recommit of https://reviews.llvm.org/D121169. after: 1061034926, before: 1063332844. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10  Revert "Cleanup codegen includes"  (Nico Weber, 1 file, -0/+2)
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10  Cleanup codegen includes  (serge-sans-paille, 1 file, -2/+0)
after: 1061034926, before: 1063332844. Differential Revision: https://reviews.llvm.org/D121169
2022-02-23  PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs  (Matthias Braun, 1 file, -1/+2)
The `SplitIndirectBrCriticalEdges` function was originally designed for `CodeGenPrepare` and skipped splitting of edges when the destination block didn't contain any `PHI` instructions. This only makes sense when reducing COPYs like `CodeGenPrepare` does. In the case of `PGOInstrumentation` or `GCOVProfiling` it would result in missed counters and wrong results in functions with computed goto. Differential Revision: https://reviews.llvm.org/D120096
2022-02-03  [CodeGenPrepare] Avoid out-of-bounds shift  (Bjorn Pettersson, 1 file, -3/+3)
AddressingModeMatcher::matchOperationAddr may attempt to shift a variable by the same number of steps as found in the IR in a SHL instruction. This was done without considering that there could be undefined behavior in the IR, so the shift performed when compiling could end up having undefined behavior as well. This patch avoids UB in CodeGenPrepare by making sure that we limit the shift amount used, in a similar way as is already done in CodeGenPrepare::optimizeLoadExt. Differential Revision: https://reviews.llvm.org/D118602
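In C++ (as in LLVM's host-side arithmetic), shifting a 64-bit value by 64 or more is undefined behavior, so compiler code that mirrors an IR shift amount must bound it first. A minimal illustrative sketch of that kind of guard; the helper name and the choice to return 0 for over-wide shifts are assumptions for this example, not the patch's exact code:

```cpp
#include <cassert>
#include <cstdint>

// Shifting a 64-bit value by >= 64 is UB in C++, so clamp/guard the
// amount before using an IR-derived shift count in host arithmetic.
static uint64_t safeShl(uint64_t V, uint64_t ShAmt) {
  const unsigned BitWidth = 64;
  if (ShAmt >= BitWidth)
    return 0; // an over-wide logical shift leaves no bits behind
  return V << ShAmt;
}
```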
2022-01-30  [CodeGen] Use default member initialization (NFC)  (Kazu Hirata, 1 file, -2/+2)
Identified with modernize-use-default-member-init.
2022-01-23  [CodeGenPrepare] Use dyn_cast result to check for null pointers  (Simon Pilgrim, 1 file, -8/+8)
Simplifies logic and helps the static analyzer correctly check for nullptr dereferences
2022-01-18  Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"  (David Sherwood, 1 file, -1/+1)
This reverts commit 197f3c0deb76951315118ef13937b67ea9cbd5aa. Reverting after miscompilation errors discovered with ffmpeg.
2022-01-17  [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants  (David Sherwood, 1 file, -1/+1)
When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction.

New tests added here:

    CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing tests too:

    CodeGen/AArch64/reduce-and.ll
    CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
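The zext/sext difference for a negative constant is visible directly on the bit patterns; a short scalar demonstration of the i32 -> i64 case described above:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extension preserves the value -1 (all ones) when widening,
// while zero-extension of the same bits yields 0xFFFFFFFF instead.
static int64_t sext32to64(int32_t V) { return static_cast<int64_t>(V); }
static int64_t zext32to64(int32_t V) {
  return static_cast<int64_t>(static_cast<uint32_t>(V));
}
```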
2022-01-13  Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"  (David Sherwood, 1 file, -1/+1)
This reverts commit 31009f0b5afb504fc1f30769c038e1b7be6ea45b. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.
2022-01-13  [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants  (David Sherwood, 1 file, -1/+1)
When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction.

New tests added here:

    CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing tests too:

    CodeGen/AArch64/dag-numsignbits.ll
    CodeGen/AArch64/reduce-and.ll
    CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2021-12-12  [llvm] Use llvm::reverse (NFC)  (Kazu Hirata, 1 file, -2/+1)
2021-12-03  [CodeGen] Use range-based for loops (NFC)  (Kazu Hirata, 1 file, -6/+2)
2021-10-25  CodeGenPrep: remove all copies of GEP from list if there are duplicates.  (Tim Northover, 1 file, -5/+1)
Unfortunately ToT has changed enough from the revision where this actually caused problems that the test no longer triggers an assertion failure.
2021-10-20  [CodeGenPrepare] Avoid a scalable-vector crash in ctlz/cttz  (Fraser Cormack, 1 file, -1/+1)
This patch fixes a crash when despeculating ctlz/cttz intrinsics with scalable-vector types. It is not safe to speculatively get the size of the vector type in bits in case the vector type is not a fixed-length type. As it happens this isn't required as vector types are skipped anyway. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112141
2021-10-04  [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.  (Jay Foad, 1 file, -1/+1)
Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807
2021-09-30  [llvm] Migrate from arg_operands to args (NFC)  (Kazu Hirata, 1 file, -2/+2)
Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.
2021-09-21  [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for and/or/xor.  (Craig Topper, 1 file, -2/+3)
This requires a minor change to CodeGenPrepare to ensure that shouldSinkOperands will be called for And. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D110106
2021-09-19  [llvm] Use pop_back_val (NFC)  (Kazu Hirata, 1 file, -4/+2)
2021-09-18  [CodeGen] Use make_early_inc_range (NFC)  (Kazu Hirata, 1 file, -6/+4)
2021-09-12  [CGP] Support opaque pointers in address mode fold  (Nikita Popov, 1 file, -26/+14)
Rather than inspecting the pointer element type, use the access type of the load/store/atomicrmw/cmpxchg. In the process of doing this, simplify the logic by storing the address + type in MemoryUses, rather than an Instruction + Operand pair (which was then used to fetch the address).
2021-08-26  [CGP] Fix the crash for combining address mode when having cyclic dependency  (Andrew Wei, 1 file, -1/+2)
In the combination of addressing modes, when replacing the matched phi nodes, sometimes the phi node to be replaced has already been modified. For example, with matcher sets [A, B] and [C, A] there is a cyclic dependency: A is replaced by B, and C will then be replaced by A. Because we tried to match a new phi node to another new phi node, we should ignore new phi nodes when mapping new phi nodes to old ones. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D108635
2021-08-17  [CodeGenPrepare] The instruction to be sunk should be inserted before its user in a block  (Tiehu Zhang, 1 file, -2/+12)
In the current implementation, the instruction to be sunk will be inserted before the target instruction without considering the def-use tree, which may cause an "Instruction does not dominate all uses" error. We need to choose a suitable insertion location according to the use chain. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D107262
2021-07-24  [CGP] despeculateCountZeros - Don't create is-zero branch if cttz/ctlz source is known non-zero  (Simon Pilgrim, 1 file, -0/+4)
If value tracking can confirm that the cttz/ctlz source is known non-zero then we don't need to create a branch (which DAG will struggle to recover from). Differential Revision: https://reviews.llvm.org/D106685
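In scalar terms: the zero-guard branch of the despeculated form is dead when the source is provably non-zero, so the intrinsic can run unguarded. An illustrative sketch (hypothetical helper names; the "known non-zero" fact is modeled here by forcing a bit on, standing in for what value tracking would prove):

```cpp
#include <cassert>
#include <cstdint>

// Models the zero-is-poison form of llvm.cttz: defined only for X != 0.
static unsigned cttzNonZero(uint32_t X) {
  unsigned N = 0;
  while ((X & 1u) == 0) { X >>= 1; ++N; }
  return N;
}

// If value tracking proves the source non-zero (here: bit 0 is set, so
// the value can never be zero), no is-zero branch is needed at all.
static unsigned cttzOfKnownNonZero(uint32_t X) {
  uint32_t NonZero = X | 1u; // known non-zero by construction
  return cttzNonZero(NonZero); // unguarded: no branch emitted
}
```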
2021-07-22  [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR  (Paulo Matos, 1 file, -0/+4)
Reland of 31859f896. This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen patterns match those and convert them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D104797
2021-07-05  [DebugInfo] CGP+HWasan: Handle dbg.values with duplicate location ops  (Stephen Tozer, 1 file, -1/+3)
This patch fixes an issue which occurred in CodeGenPrepare and HWAddressSanitizer, which both at some point create a map of Old->New instructions and update dbg.value uses of these. They did this by iterating over the dbg.value's location operands, and if an instance of the old instruction was found, replaceVariableLocationOp would be called on that dbg.value. This would cause an error if the same operand appeared multiple times as a location operand, as the first call to replaceVariableLocationOp would update all uses of the old instruction, invalidating the old iterator and eventually hitting an assertion. This has been fixed by no longer iterating over the dbg.value's location operands directly, but by first collecting them into a set and then iterating over that, ensuring that we never attempt to replace a duplicated operand multiple times. Differential Revision: https://reviews.llvm.org/D105129
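The fix pattern can be shown on a plain container: iterating over the raw operand list while each replacement rewrites *all* matching operands means a duplicated operand triggers a second, now-invalid replacement; iterating over a de-duplicated set of values performs each bulk replacement exactly once. A simplified sketch (ints standing in for instruction operands; names are illustrative, not the actual API):

```cpp
#include <cassert>
#include <set>
#include <vector>

// Replace every occurrence of Old in Ops with New, visiting each
// *distinct* operand value once. Collecting into a set first mirrors
// the fix: the bulk replace-all per distinct value cannot be
// re-triggered by a second copy of the same operand.
static void replaceAllUses(std::vector<int> &Ops, int Old, int New) {
  std::set<int> Distinct(Ops.begin(), Ops.end());
  for (int V : Distinct)
    if (V == Old)
      for (int &Op : Ops) // replace-all, like replaceVariableLocationOp
        if (Op == Old)
          Op = New;
}
```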
2021-07-02  Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR"  (Roman Lebedev, 1 file, -4/+0)
This reverts commit 4facbf213c51e4add2e8c19b08d5e58ad71c72de.

```
********************
FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468)
******************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ********************
Script:
--
: 'RUN: at line 1'; /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types | /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll
--
Exit Code: 2

Command Output (stderr):
--
llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed.
```
2021-07-02  [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR  (Paulo Matos, 1 file, -0/+4)
Reland of 31859f896. This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen patterns match those and convert them to Wasm global.get/set. Differential Revision: https://reviews.llvm.org/D104797
2021-06-23  [CGP][RISCV] Teach CodeGenPrepare::optimizeSwitchInst to honor isSExtCheaperThanZExt.  (Craig Topper, 1 file, -6/+14)
This optimization pre-promotes the input and constants for a switch instruction to a legal type so that all the generated compares share the same extend. Since RISCV prefers sext for i32 to i64 extends, we should honor that and use sext.w instead of a pair of shifts. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D104612
2021-06-11  [NFC][OpaquePtr] Explicitly pass GEP source type in optimizeGatherScatterInst()  (Arthur Eubanks, 1 file, -3/+17)
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103480
2021-06-10  Revert "Implementation of global.get/set for reftypes in LLVM IR"  (David Spickett, 1 file, -4/+0)
This reverts commit 31859f896cf90d64904134ce7b31230f374c3fcc. Causing SVE and RISC-V test failures on bots.
2021-06-10  Implementation of global.get/set for reftypes in LLVM IR  (Paulo Matos, 1 file, -0/+4)
This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen patterns match those and convert them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D95425
2021-05-16  [CGP][ARM] Optimize towards branch on zero in codegenprepare  (David Green, 1 file, -0/+63)
This adds a simple fold into codegenprepare that converts comparisons feeding branches into comparisons with zero where possible. For example:

    %c = icmp ult %x, 8
    br %c, bla, blb
    %tc = lshr %x, 3

becomes:

    %tc = lshr %x, 3
    %c = icmp eq %tc, 0
    br %c, bla, blb

As a first order approximation, this can reduce the number of instructions needed to perform the branch as the shift is (often) needed anyway. At the moment this does not affect very much, as llvm tends to prefer the opposite form. But it can protect against regressions from commits like rG9423f78240a2. Simple cases of Add and Sub are added along with Shift, as the comparison to zero can often be folded with cpsr flags. Differential Revision: https://reviews.llvm.org/D101778
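The fold relies on the identity that, for unsigned x, `x u< 2^k` holds exactly when `(x >> k) == 0`. A short scalar check of that equivalence for the k = 3 case shown above:

```cpp
#include <cassert>
#include <cstdint>

// Original branch condition: x u< 8.
static bool cmpUlt8(uint32_t X) { return X < 8; }

// Folded form: reuse the existing shift and compare against zero,
// valid because x u< 2^k  <=>  (x >> k) == 0 for unsigned x.
static bool cmpShiftZero(uint32_t X) { return (X >> 3) == 0; }
```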
2021-04-23  [TTI] NFC: Change getIntImmCost[Inst|Intrin] to return InstructionCost  (Sander de Smalen, 1 file, -3/+2)
This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565
2021-04-19  [DebugInfo] Move the findDbg* functions into DebugInfo.cpp  (OCHyams, 1 file, -0/+1)
Move the findDbg* functions into lib/IR/DebugInfo.cpp from lib/Transforms/Utils/Local.cpp. D99169 adds a call to a function (findDbgUsers) that lives in lib/Transforms/Utils/Local.cpp (LLVMTransformUtils) from lib/IR/Value.cpp (LLVMCore). The Core lib doesn't include TransformUtils. The buildbots caught this here: https://lab.llvm.org/buildbot/#/builders/109/builds/12664. This patch moves the function, and the 3 similar ones for consistency, into DebugInfo.cpp, which is part of LLVMCore. Reviewed By: dblaikie, rnk Differential Revision: https://reviews.llvm.org/D100632
2021-04-07  [CSSPGO] Move pseudo probes to the beginning of a block to unblock SelectionDAG combine.  (Hongtao Yu, 1 file, -0/+24)
Pseudo probes, when scattered in a block, can be chained dependencies of other regular DAG nodes and block DAG combine optimizations. To fix this, scattered probes in a block are grouped and placed at the beginning of the block. This shouldn't affect the profile quality. Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D100002
2021-04-06  Use AssumeInst in a few more places [nfc]  (Philip Reames, 1 file, -5/+3)
Follow up to a6d2a8d6f5. These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.
2021-03-22  [TargetTransformInfo] move branch probability query from TargetLoweringInfo  (Sanjay Patel, 1 file, -1/+1)
This is no-functional-change intended (NFC), but needed to allow optimizer passes to use the API. See D98898 for a proposed usage by SimplifyCFG. I'm simplifying the code by removing the cl::opt. That was added back with the original commit in D19488, but I don't see any evidence in regression tests that it was used. Target-specific overrides can use the usual patterns to adjust as necessary. We could also restore that cl::opt, but it was not clear to me exactly how to do it in the convoluted TTI class structure.