aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/SelectOptimize.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-04-23[CostModel] Remove optional from InstructionCost::getValue() (#135596)David Green1-3/+3
InstructionCost is already an optional value, containing an Invalid state that can be checked with isValid(). There is little point in returning another optional from getValue(). Most uses do not make use of it being a std::optional, dereferencing the value directly (either isValid has been checked previously or the Cost is assumed to be valid). The one case that does in AMDGPU used value_or which has been replaced by a isValid() check.
2025-03-29[CodeGen] Use llvm::append_range (NFC) (#133603)Kazu Hirata1-2/+1
2025-02-25[CodeGen] Avoid repeated hash lookups (NFC) (#128631)Kazu Hirata1-4/+5
2025-01-28[CodeGen] Avoid repeated hash lookups (NFC) (#124677)Kazu Hirata1-11/+10
2025-01-24[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites ↵Jeremy Morse1-1/+1
(#123737) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few). --------- Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-24[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)Jeremy Morse1-4/+4
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-12-25[SelectOpt] Optimise big select groups in the latch of a non-inner loop to ↵Igor Kirillov1-0/+12
branches (#119728) Loop latches often have a loop-carried dependency, and if they have several SelectLike instructions in one select group, it is usually profitable to convert it to branches rather than keep selects.
2024-12-17[SelectOpt] Support BinOps with SExt operands. (#115879)Florian Hahn1-6/+9
Building on top of https://github.com/llvm/llvm-project/pull/115489 extend support for binops with SExt operand. PR: https://github.com/llvm/llvm-project/pull/115879
2024-12-12[SelectOpt] Add support for AShr/LShr operands (#118495)Igor Kirillov1-24/+66
For conditional increments with sign check conditions like X < 0 or X >= 0, the compiler may generate code like this: %cmp = icmp sgt i64 %1, -1 %shift = ashr i64 %1, 63 %j.next = add nsw i64 %j, %shift %sel = select i1 %cmp ... , where %cmp is not in computation but in some other implicit or regular expressions. This patch allows SelectOptimize pass to recognise these cases.
2024-12-10[SelectOpt] Fix incorrect IR for SUB when comparison dependent operand is ↵Igor Kirillov1-1/+6
first (#119362)
2024-11-30[SelectOpt] Support ADD and SUB with zext operands. (#115489)Florian Hahn1-4/+21
Extend the support for implicit selects in the form of OR with a ZExt operand to support ADD and SUB binops as well. They similarly can form implicit selects which can be profitable to convert back the branches. PR: https://github.com/llvm/llvm-project/pull/115489
2024-11-27[SelectOpt] Refactor to prepare for support more select-like operations ↵Igor Kirillov1-220/+262
(#117582) * Enables conversion of several select-like instructions within one group * Any number of auxiliary instructions depending on the same condition can be in between select-like instructions * After splitting the basic block, move select-like instructions into the relevant basic blocks and optimise them * Make it easier to add support shift-base select-like instructions and also any mixture of zext/sext/not instructions
2024-11-25Revert "[SelectOpt] Refactor to prepare for support more select-like ↵Igor Kirillov1-261/+220
operations (#115745)" This reverts commit b5a11d378db4b39ceb085ebd59c941e9369d9596.
2024-11-25[SelectOpt] Refactor to prepare for support more select-like operations ↵Igor Kirillov1-220/+261
(#115745) * Enables conversion of several select-like instructions within one group * Any number of auxiliary instructions depending on the same condition can be in between select-like instructions * After splitting the basic block, move select-like instructions into the relevant basic blocks and optimise them * Make it easier to add support shift-base select-like instructions and also any mixture of zext/sext/not instructions
2024-11-12[CodeGen] Remove unused includes (NFC) (#115996)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-11-04[CodeGen] Avoid repeated hash lookups (NFC) (#113414)Kazu Hirata1-10/+15
2024-10-28Check hasOptSize() in shouldOptimizeForSize() (#112626)Ellis Hoag1-2/+2
2024-07-13[CodeGen] Use range-based for loops (NFC) (#98706)Kazu Hirata1-2/+2
2024-05-22[SelectOpt] Add handling for not conditions. (#92517)David Green1-18/+64
This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue. This helps fix a regression in fortran minloc constructs, after #84628 helped simplify a loop with branches into a loop with selects.
2024-03-19[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)Stephen Tozer1-2/+2
This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, *except* for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.
2024-03-13[RemoveDI][NFC] Rename DPValue->DbgRecord in comments and varnames (#84939)Stephen Tozer1-7/+8
This patch continues the ongoing rename work, replacing DPValue with DbgRecord in comments and the names of variables, both members and fn-local. This is the most labour-intensive part of the rename, as it is where the most decisions have to be made about whether a given comment or variable is referring to DPValues (equivalent to debug variable intrinsics) or DbgRecords (a catch-all for all debug intrinsics); these decisions are not individually difficult, but comprise a fairly large amount of text to review. This patch still largely performs basic string substitutions followed by clang-format; there are almost* no places where, for example, a comment has been expanded or modified to reflect the semantic difference between DPValues and DbgRecords. I don't believe such a change is generally necessary in LLVM, but it may be useful in the docs, and so I'll be submitting docs changes as a separate patch. *In a few places, `dbg.values` was replaced with `debug intrinsics`.
2024-03-12[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords ↵Stephen Tozer1-3/+3
(#84793) As part of the effort to rename the DbgRecord classes, this patch renames the widely-used functions that operate on DbgRecords but refer to DbgValues or DPValues in their names to refer to DbgRecords instead; all such functions are defined in one of `BasicBlock.h`, `Instruction.h`, and `DebugProgramInstruction.h`. This patch explicitly does not change the names of any comments or variables, except for where they use the exact name of one of the renamed functions. The reason for this is reviewability; this patch can be trivially examined to determine that the only changes are direct string substitutions and any results from clang-format responding to the changed line lengths. Future patches will cover renaming variables and comments, and then renaming the classes themselves.
2024-02-14[RemoveDIs] Replicate dbg intrinsic movement pattern in SelectOptimize (#81737)Orlando Cazalet-Hyams1-0/+6
Fix crash mentioned in comments on d759618df76361a8e490eeae5c5399e0738cbfd0. The assertion being hit was complaining that we had dangling DPValues; the DPValues attached to the terminator of StartBlock become dangling after the terminator is erased, and they're never "flushed" back onto the new terminator once it's added. Doing that makes the crash go away, but doesn't replicate existing dbg.* behaviour. See the comment in the patch. This change both fixes the crash (because there are now no DPValues left on the terminator to dangle) and replicates existing behaviour (moves those DPValues down to the new block).
2024-02-01[SelectOpt] Print instruction instead of pointerwangpc1-1/+1
Pull Request: https://github.com/llvm/llvm-project/pull/80125
2024-01-22[SelectOpt] Add handling for Select-like operations. (#77284)David Green1-122/+273
Some operations behave like selects. For example `or(zext(c), y)` is the same as select(c, y|1, y)` and instcombine can canonicalize the select to the or form. These operations can still be worthwhile converting to branch as opposed to keeping as a select or or instruction. This patch attempts to add some basic handling for them, creating a SelectLike abstraction in the select optimization pass. The backend can opt into handling `or(zext(c),x)` as a select if it could be profitable, and the select optimization pass attempts to handle them in much the same way as a `select(c, x|1, x)`. The Or(x, 1) may need to be added as a new instruction, generated as the or is converted to branches. This helps fix a regression from selects being converted to or's recently.
2024-01-22[DebugInfo][RemoveDIs] Handle DPValues in SelectOptimize (#79005)Jeremy Morse1-0/+17
When there are debug intrinsics in-between groups of select instructions, select-optimise sinks them into the "end" block. This needs to be replicated for DPValues, the non-instruction variable assignment object. Implement that and add a RUN line to a test that was sensitive to this to ensure it gets tested. (The exact range of instructions being transformed here is a little fiddly, hence I've gone with a helper lambda).
2023-12-12[CodeGen] Port `SelectOptimize` to new pass manager (#74920)paperchalice1-61/+113
- Use `BlockFrequencyInfoWrapperPass` in legacy pass so member `std::unique_ptr<BranchProbabilityInfo> BPI` could be removed. - Member `DominatorTree *DT = nullptr` is unused, remove it.
2023-12-03[llvm] Stop including string (NFC)Kazu Hirata1-1/+0
Identified with clangd.
2023-10-05Use BlockFrequency type in more places (NFC) (#68266)Matthias Braun1-1/+1
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it more consistently in various APIs and disable implicit conversion to make usage more consistent and explicit. - Use `BlockFrequency Freq` parameter for `setBlockFreq`, `getProfileCountFromFreq` and `setBlockFreqAndScale` functions. - Return `BlockFrequency` in `getEntryFreq()` functions. - While on it change some `const BlockFrequency& Freq` parameters to plain `BlockFreqency Freq`. - Mark `BlockFrequency(uint64_t)` constructor as explicit. - Add missing `BlockFrequency::operator!=`. - Remove `uint64_t BlockFreqency::getMaxFrequency()`. - Add `BlockFrequency BlockFrequency::max()` function.
2023-09-11[NFC][RemoveDIs] Prefer iterator-insertion over instructionsJeremy Morse1-1/+2
Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become signifiant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-09-07[NFC][RemoveDIs] Create a new spelling of the moveBefore methodJeremy Morse1-1/+1
As outlined in my proposal of how to get rid of debug intrinsics, this patch adds a moveBefore method that signals the caller /intends/ the order of moved instructions is to stay the same. This semantic difference has an effect on debug-info, as it signals whether debug-info needs to move with instructions or not. The patch just replaces a few calls to moveBefore with calls to moveBeforePreserving -- and the latter just calls the former, so it's all NFC right now. A future patch will add an implementation of moveBeforePreserving that takes action to correctly preserve debug-info, but that's tightly coupled with our non-instruction debug-info representation that's still being reviewed. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D156369
2023-05-20[llvm] Reduce ComplexDeinterleavingPass.h includesElliot Goodrich1-0/+1
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from `ComplexDeinterleavingPass.h` and move it to the corresponding source file. Add missing includes that were transitively included by this header to 3 other source files. This reduces the total number of preprocessing tokens across the LLVM source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a reduction of ~1.52%. This should result in a small improvement in compilation time.
2023-05-20Revert "[llvm] Reduce ComplexDeinterleavingPass.h includes"Elliot Goodrich1-1/+0
This reverts commit 058ca5c07106d38ad66e3ec4972a613a64e88151.
2023-05-20[llvm] Reduce ComplexDeinterleavingPass.h includesElliot Goodrich1-0/+1
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from `ComplexDeinterleavingPass.h` and move it to the corresponding source file. Add missing includes that were transitively included by this header to 2 other source files. This reduces the total number of preprocessing tokens across the LLVM source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a reduction of ~1.52%. This should result in a small improvement in compilation time. Differential Revision: https://reviews.llvm.org/D150514
2023-04-17Fix uninitialized pointer members in CodeGenAkshay Khadse1-5/+5
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303
2022-12-16[Transforms,CodeGen] std::optional::value => operator*/operator->Fangrui Song1-2/+2
value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
2022-12-14Don't include Optional.hKazu Hirata1-1/+0
These files no longer use llvm::Optional.
2022-12-13[NFC] Add checks for potential null returnsPhoebe Wang1-1/+1
2022-12-13[CodeGen] llvm::Optional => std::optionalFangrui Song1-3/+3
2022-12-03[AArch64] Enable the select optimize pass for AArch64David Green1-0/+4
This enabled the select optimize patch for ARM Out of order AArch64 cores. It is trying to solve a problem that is difficult for the compiler to fix. The criteria for when a csel is better or worse than a branch depends heavily on whether the branch is well predicted and the amount of ILP in the loop (as well as other criteria like the core in question and the relative performance of the branch predictor). The pass seems to do a decent job though, with the inner loop heuristics being well implemented and doing a better job than I had expected in general, even without PGO information. I've been doing quite a bit of benchmarking. The headline numbers are these for SPEC2017 on a Neoverse N1: 500.perlbench_r -0.12% 502.gcc_r 0.02% 505.mcf_r 6.02% 520.omnetpp_r 0.32% 523.xalancbmk_r 0.20% 525.x264_r 0.02% 531.deepsjeng_r 0.00% 541.leela_r -0.09% 548.exchange2_r 0.00% 557.xz_r -0.20% Running benchmarks with a combination of the llvm-test-suite plus several versions of SPEC gave between a 0.2% and 0.4% geomean improvement depending on the core/run. The instruction count went down by 0.1% too, which is a good sign, but the results can be a little noisy. Some issues from other benchmarks I had ran were improved in rGca78b5601466f8515f5f958ef8e63d787d9d812e. In summary well predicted branches will see in improvement, badly predicted branches may get worse, and on average performance seems to be a little better overall. This patch enables the pass for AArch64 under -O3 for cores that will benefit for it. i.e. not in-order cores that do not fit into the "Assume infinite resources that allow to fully exploit the available instruction-level parallelism" cost model. It uses a subtarget feature for specifying when the pass will be enabled, which I have enabled under cpu=generic as the performance increases for out of order cores seems larger than any decreases for inorder, which were minor. Differential Revision: https://reviews.llvm.org/D138990
2022-12-02[CodeGen] Use std::nullopt instead of None (NFC)Kazu Hirata1-1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-24[SelectOpt] Don't treat LogicalAnd/LogicalOr as selectsDavid Green1-0/+15
A `select i1 %c, i1 true, i1 %d` is just an or and a `select i1 %c, i1 %d, i1 false` is just an and. There are better treated as such in the logic of SelectOpt, allowing the backend to optimize them to and/or directly. Differential Revision: https://reviews.llvm.org/D138490
2022-11-22[SelectOptimize] Add some debug logging. NFCDavid Green1-13/+27
This is some quick debug messages for the SelectOptimize pass, adding some information for the costs that are measured from getInstructionCost calls, and re-using the existing optimization remarks to print some information about if transforms were performed or not. Differential Revision: https://reviews.llvm.org/D138108
2022-11-21Return None instead of Optional<T>() (NFC)Kazu Hirata1-1/+1
This patch replaces: return Optional<T>(); with: return None; to make the migration from llvm::Optional to std::optional easier. Specifically, I can deprecate None (in my source tree, that is) to identify all the instances of None that should be replaced with std::nullopt. Note that "return None" far outnumbers "return Optional<T>();". There are more than 2000 instances of "return None" in our source tree. All of the instances in this patch come from functions that return Optional<T> except Archive::findSym and ASTNodeImporter::import, where we return Expected<Optional<T>>. Note that we can construct Expected<Optional<T>> from any parameter convertible to Optional<T>, which None certainly is. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138464
2022-09-16[SelectOpti] Restrict load sinkingSotiris Apostolakis1-22/+28
This is a follow-up to D133777, which resolved a use-after-free case but did not cover all possible memory bugs due to misplacement of loads. In short, the overall problem was that sinked loads could be moved after state-modifying instructions leading to memory bugs. The solution is to restrict load sinking unless it is found to be sound. i) Within a basic block (to-be-sinked load and select-user are in the same BB), loads can be sinked only if there is no intervening state-modifying instruction. This is a conservative approach to avoid resorting to alias analysis to detect potential memory overlap. ii) Across basic blocks, sinking of loads is avoided. This is because going over multiple basic blocks looking for memory conflicts could be computationally expensive and also unlikely to allow loads to sink. Further, experiments showed that not sinking these loads has a slight positive performance effect. Maybe for some of these loads, having some separation allows enough time for the load to be executed in time for its user. This is not the case for floating point operations that benefit more from sinking. The solution in D133777 was essentially undone in this patch, since the latter is a complete solution to the observed problem. Overall, the performance impact of this patch is minimal. Tested on two internal Google workloads with instrPGO. Search application showed <0.05% perf difference, while the database one showed a slight improvement, but not statistically significant. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D133999
2022-09-13[SelectOpti] Fix lifetime intrinsic bugSotiris Apostolakis1-0/+17
When a select is converted to a branch and load instructions are sinked to the true/false blocks, lifetime intrinsics (if present) could be made unsound if not moved. This conservatively moves all lifetime intrinsics in a transformed BB to the end block to ensure preserved lifetime semantics. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D133777
2022-08-20Remove redundant initialization of Optional (NFC)Kazu Hirata1-1/+1
2022-08-03[llvm][NFC] Refactor code to use ProfDataUtilsPaul Kirth1-3/+4
In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860
2022-07-27Revert "[llvm][NFC] Refactor code to use ProfDataUtils"Paul Kirth1-4/+3
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0. We will reland once these issues are ironed out.
2022-07-27[llvm][NFC] Refactor code to use ProfDataUtilsPaul Kirth1-3/+4
In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860