path: root/llvm/lib/IR/Instructions.cpp
Age | Commit message | Author | Files | Lines
9 days | [IR] Fix a few implicit conversions from TypeSize to uint64_t. NFC (#159894) | Craig Topper | 1 | -2/+6
12 days | [IR][CaptureTracking] Consider assume operand bundles captures(none) (#159083) | Nikita Popov | 1 | -0/+4
Something like `call void @llvm.assume(i1 true) ["align"(ptr %p, i64 8)]` is equivalent to placing an `align 8` attribute on the parameter and should not be considered as capturing.
2025-08-08 | [IR] Introduce the `ptrtoaddr` instruction | Alexander Richardson | 1 | -18/+34
This introduces a new `ptrtoaddr` instruction which is similar to `ptrtoint` but has two differences:

1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low index-width bits of the pointer

For most architectures, difference 2) does not matter since index (address) width and pointer representation width are the same, but this does make a difference for architectures that have pointers that aren't just plain integer addresses such as AMDGPU fat pointers or CHERI capabilities.

This commit introduces textual and bitcode IR support as well as basic code generation, but optimization passes do not handle the new instruction yet so it may result in worse code than using ptrtoint. Follow-up changes will update capture tracking, etc. for the new instruction.

RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
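To make difference 2) concrete, here is a minimal standalone sketch, not LLVM code. It assumes a hypothetical 128-bit pointer whose low 64 bits hold the address and whose high 64 bits hold metadata. `FatPtr`, `lowBits`, and `modelPtrToAddr` are invented names, and the use of zero-extension is an assumption, since the commit message only says "extends/truncates".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit pointer representation (not an LLVM type):
// Lo holds the address bits, Hi holds non-address metadata.
struct FatPtr {
  uint64_t Lo;
  uint64_t Hi;
};

// Keep only the low `Bits` bits of V. Supports widths 1..64.
inline uint64_t lowBits(uint64_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "sketch only supports widths 1..64");
  return Bits == 64 ? V : (V & ((uint64_t{1} << Bits) - 1));
}

// Model of the described behaviour: take the low index-width bits of the
// pointer, then zero-extend or truncate them to the result width.
// The metadata in P.Hi never reaches the result.
inline uint64_t modelPtrToAddr(FatPtr P, unsigned IndexBits,
                               unsigned ResultBits) {
  uint64_t Addr = lowBits(P.Lo, IndexBits);
  return lowBits(Addr, ResultBits);
}
```

On a target where the index width equals the full pointer width, this model and a plain integer cast agree, which matches the remark that the difference does not matter for most architectures.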
2025-07-15 | [IR] Make intrinsic checks more efficient (NFC) (#148682) | Nikita Popov | 1 | -1/+1
Directly cast the callee operand instead of going through getCalledFunction(). We can do this because for intrinsics the function type between the call and the function is guaranteed to match. This is a minor compile-time improvement as is, but has a much bigger impact with a future change that makes getCalledFunction() more expensive. There is some code duplication between these four uses, but they are each just different enough that representing one in terms of another would be less efficient.
2025-07-07 | [IR] Remove an unnecessary cast (NFC) (#147453) | Kazu Hirata | 1 | -1/+1
predicate is already of Predicate.
2025-06-28 | [IR] Remove an unnecessary cast (NFC) (#146250) | Kazu Hirata | 1 | -1/+1
Agg is already of Type *.
2025-06-25 | [FunctionAttrs][IR] Fix memory attr inference for volatile mem intrinsics (#122926) | Nikita Popov | 1 | -0/+4
Per LangRef volatile operations can read and write inaccessible memory:

> any volatile operation can read and/or modify state which is not
> accessible via a regular load or store in this module

Model this by adding inaccessible memory effects in getMemoryEffects() if the operation is volatile. In the future, we should model volatile using operand bundles instead.

Fixes https://github.com/llvm/llvm-project/issues/120932.
2025-06-10 | [IR] Simplify scalable vector handling in ShuffleVectorInst::getShuffleMask. NFC (#143596) | Craig Topper | 1 | -12/+7
Combine the scalable vector UndefValue check with the earlier ConstantAggregateZero handling for fixed and scalable vectors. Assert that the rest of the code is only reached for fixed vectors. Use append instead of resize since we know the size is increasing.
2025-06-02 | [llvm] annotate interfaces in llvm/IR for DLL export (#141650) | Andrew Rogers | 1 | -7/+8
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/IR`, `llvm/IRPrinter`, and `llvm/IRReader` libraries. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background

This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`.

The following manual adjustments were also applied after running IDS on Linux:

- Add `#include "llvm/Support/Compiler.h"` to files where it was not auto-added by IDS due to no pre-existing block of include statements.
- Add `LLVM_ABI_FRIEND` to friend member functions declared with `LLVM_ABI`
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported instantiated templates
- Add `LLVM_ABI` to a subset of private class methods and fields that require export
- Add `LLVM_ABI` to a small number of symbols that require export but are not declared in headers
- Reorder `LLVM_ABI` with `[[deprecated]]` and `[[nodiscard]]` attributes.

## Validation

Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-06-02 | [SimplifyCFG] Switch to use `paramHasNonNullAttr` (#125383) | Yingwei Zheng | 1 | -1/+1
2025-04-30 | Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701) | Jonathan Thackray | 1 | -0/+4
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details.

Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.
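The NaN difference can be sketched in standalone C++. This is only a model of the described semantics, not how LLVM lowers the instruction. `fmaximumModel` and `atomicFMaximum` are hypothetical names, and the compare-exchange loop is an assumed fallback strategy for targets without a native instruction.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <limits>

// IEEE 754-2019 `maximum`: the result is NaN if either input is NaN, and
// +0.0 is treated as greater than -0.0. std::fmax instead returns the
// non-NaN operand when exactly one input is NaN.
inline float fmaximumModel(float A, float B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<float>::quiet_NaN();
  if (A == B) // also true for +0.0 vs -0.0, so pick by sign bit
    return std::signbit(A) ? B : A;
  return A > B ? A : B;
}

// Read-modify-write via a compare-exchange loop. Like atomicrmw, it
// returns the value that was in memory before the update.
inline float atomicFMaximum(std::atomic<float> &Obj, float Val) {
  float Old = Obj.load();
  // On failure, Old is refreshed with the current value and the new
  // candidate is recomputed on the next iteration.
  while (!Obj.compare_exchange_weak(Old, fmaximumModel(Old, Val))) {
  }
  return Old;
}
```

Once a NaN has been stored through `atomicFMaximum`, every later update keeps the location NaN, whereas an `fmax`-style update would discard the NaN again.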
2025-04-28 | Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657) | Jonathan Thackray | 1 | -4/+0
Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4
2025-04-28 | [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759) | Jonathan Thackray | 1 | -0/+4
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details.

Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.
2025-04-26 | [llvm] Use llvm::copy (NFC) (#137470) | Kazu Hirata | 1 | -1/+1
2025-04-22 | [IR] Intersect call and fn range in CallBase::getRange() | Nikita Popov | 1 | -3/+11
To make sure that a larger range on the call-site does not suppress information from a smaller range at the declaration.
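A small standalone sketch of the idea follows. It uses a deliberately simplified range type: `SimpleRange` is a non-wrapping half-open interval, unlike `llvm::ConstantRange`, and `effectiveRange` is a hypothetical helper, not the actual `CallBase::getRange()` code. The sketch also assumes the two ranges overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified half-open range [Lo, Hi). Cannot wrap, unlike ConstantRange.
struct SimpleRange {
  int64_t Lo, Hi;
};

inline SimpleRange intersect(SimpleRange A, SimpleRange B) {
  SimpleRange R{std::max(A.Lo, B.Lo), std::min(A.Hi, B.Hi)};
  assert(R.Lo < R.Hi && "sketch assumes the two ranges overlap");
  return R;
}

// If both the call site and the declaration carry a range, use their
// intersection, so a wide call-site range cannot hide a narrow one on the
// declaration. If only one is present, use it. If neither, report nothing.
inline std::optional<SimpleRange>
effectiveRange(std::optional<SimpleRange> CallSite,
               std::optional<SimpleRange> Decl) {
  if (CallSite && Decl)
    return intersect(*CallSite, *Decl);
  return CallSite ? CallSite : Decl;
}
```

For example, a call-site range of [0, 100) combined with a declaration range of [0, 10) yields [0, 10) in this model.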
2025-03-20 | [SelectionDAG] Not issue TRAP node if naked function (#132147) | yonghong-song | 1 | -0/+21
In [1], Nikita Popov suggested that during lowering the 'unreachable' insn should not generate extra code for naked functions, and that this applies to all architectures. Note that for naked functions, the 'unreachable' insn is necessary in IR since the basic block needs a terminator to end.

This patch checks whether a function is a naked function. If it is, the 'unreachable' insn will not generate ISD::TRAP.

[1] https://github.com/llvm/llvm-project/pull/131731

Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
2025-03-05 | [IR] Return correct memory effects for `convergencectrl` (#129874) | Yingwei Zheng | 1 | -3/+5
`convergencectrl` doesn't imply any memory access. Closes https://github.com/llvm/llvm-project/issues/129856.
2025-02-27 | Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020) | Nikita Popov | 1 | -0/+14
Relative to the previous attempt this includes two fixes:

* Adjust callCapturesBefore() to not skip captures(ret: address, provenance) arguments, as these will not count as a capture at the call-site.
* When visiting uses during stack slot optimization, don't skip the ModRef check for passthru captures. Calls can both modref and be passthru for captures.

------

This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures.
* The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the instruction return value.
  * Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.
2025-02-19 | Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)" | Nico Weber | 1 | -14/+0
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.

Seems to break LTO builds of clang on Windows, see comments on https://github.com/llvm/llvm-project/pull/125880
2025-02-14 | Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) | Nikita Popov | 1 | -0/+14
Relative to the previous attempt, this adjusts isEscapeSource() to not treat calls with captures(ret: address, provenance) or similar arguments as escape sources. This addresses the miscompile reported at: https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577

The implementation uses a helper function on CallBase to make this check a bit more efficient (e.g. by skipping the byval checks) as checking attributes on all arguments is fairly expensive.

------

This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures.
* The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the instruction return value.
  * Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.
2025-02-05 | [IR][NFC] Remove obsolete comments in `BinaryOperator::swapOperands` (#125819) | Yingwei Zheng | 1 | -2/+1
Closes https://github.com/llvm/llvm-project/issues/125438
2025-01-29 | [IR] Convert from nocapture to captures(none) (#123181) | Nikita Popov | 1 | -0/+19
This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup.

Some notes:

* `nocapture` will be upgraded to `captures(none)` by the bitcode reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.
2025-01-29 | [ValueTracking] Handle nonnull attributes at callsite (#124908) | Yingwei Zheng | 1 | -0/+17
Alive2: https://alive2.llvm.org/ce/z/yJfskv Closes https://github.com/llvm/llvm-project/issues/124540.
2025-01-24 | [NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737) | Jeremy Morse | 1 | -1/+1
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-21 | [IR] Replace of PointerType::get(Type) with opaque version (NFC) (#123617) | Mats Jun Larsen | 1 | -1/+1
In accordance with https://github.com/llvm/llvm-project/issues/123569

In order to keep the patch at a reasonable size, this PR only covers the llvm subproject, unittests excluded.
2025-01-14 | [ValueTracking] Squash compile-time regression from 66badf2 (#122700) | Ramkumar Ramachandra | 1 | -0/+4
66badf2 (VT: teach a special-case optz about samesign) introduced a compile-time regression due to the use of CmpPredicate::getMatching, which is unnecessarily inefficient. Introduce CmpPredicate::getPreferredSignedPredicate, which alleviates the inefficiency problem and squashes the compile-time regression.
2025-01-14 | IR: handle FP predicates in CmpPredicate::getMatching (#122924) | Ramkumar Ramachandra | 1 | -0/+2
CmpPredicate::getMatching implicitly assumes that both predicates are integer-predicates, and this has led to a crash being reported in VectorCombine after e409204 (VectorCombine: teach foldExtractedCmps about samesign). FP predicates are simple enough to handle as there is never any samesign information associated with them: hence handle them in CmpPredicate::getMatching, fixing the VectorCombine crash and guarding against future incorrect usages.
2025-01-13 | IR: introduce ICmpInst::isImpliedByMatchingCmp (#122597) | Ramkumar Ramachandra | 1 | -16/+25
Create an abstraction over isImplied{True,False}ByMatchingCmp to faithfully communicate the result of both functions, cleaning up code in callsites. While at it, fix a bug in the implied-false version of the function, which was inadvertently dropping samesign information.
2025-01-11 | VT: teach isImpliedCondMatchingOperands about samesign (#122474) | Ramkumar Ramachandra | 1 | -3/+10
Move isImplied{True,False}ByMatchingCmp from CmpInst to ICmpInst, so that it can operate on CmpPredicate instead of CmpInst::Predicate, and teach it about samesign. There are two callers of this function, and we choose to migrate the one in ValueTracking, namely isImpliedCondMatchingOperands to CmpPredicate, hence teaching it about samesign, with visible test impact.
2024-12-13 | EarlyCSE: fix CmpPredicate duplicate-hashing (#119902) | Ramkumar Ramachandra | 1 | -4/+0
Strip hash_value() for CmpPredicate, as different callers have different hashing use-cases. In this case, there is just one caller, namely EarlyCSE, which calls hash_combine() on a CmpPredicate, which used to call hash_combine() on a CmpInst::Predicate prior to 4a0d53a (PatternMatch: migrate to CmpPredicate). This has uncovered a bug where two icmp instructions differing in just the fact that one of them has the samesign flag on it are hashed differently, leading to divergent hashing, and a crash. Fix this crash by dropping samesign information on icmp instructions before hashing them, preserving the former behavior. Fixes #119893.
2024-12-13 | PatternMatch: migrate to CmpPredicate (#118534) | Ramkumar Ramachandra | 1 | -0/+18
With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.
2024-12-03 | IR: introduce struct with CmpInst::Predicate and samesign (#116867) | Ramkumar Ramachandra | 1 | -3/+19
Introduce llvm::CmpPredicate, an abstraction over a floating-point predicate, and a pack of an integer predicate with samesign information, in order to ease extending large portions of the codebase that take a CmpInst::Predicate to respect the samesign flag. We have chosen to demonstrate the utility of this new abstraction by migrating parts of ValueTracking, InstructionSimplify, and InstCombine from CmpInst::Predicate to llvm::CmpPredicate. There should be no functional changes, as we don't perform any extra optimizations with samesign in this patch, or use CmpPredicate::getMatching. The design approach taken by this patch allows for unaudited callers of APIs that take a llvm::CmpPredicate to silently drop the samesign information; it does not pose a correctness issue, and allows us to migrate the codebase piece-wise.
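The "silently drop the samesign information" design can be illustrated with a standalone model. This is a sketch only: `Predicate` here is a tiny stand-in enum, and `PredWithSameSign` and `isUnsignedLess` are hypothetical names, not the real `llvm::CmpPredicate` interface.

```cpp
#include <cassert>

// Tiny stand-in for CmpInst::Predicate (not the real enum).
enum Predicate { ICMP_EQ, ICMP_ULT, ICMP_SLT, FCMP_OLT };

// Hypothetical model of a predicate packed with a samesign flag.
class PredWithSameSign {
  Predicate Pred;
  bool HasSameSign;

public:
  // Implicit on purpose: code that passes a plain Predicate still compiles.
  PredWithSameSign(Predicate P, bool SameSign = false)
      : Pred(P), HasSameSign(SameSign) {
    assert((P != FCMP_OLT || !SameSign) &&
           "samesign only makes sense for integer predicates");
  }

  // Implicit conversion back to the plain enum drops the flag.
  operator Predicate() const { return Pred; }
  bool hasSameSign() const { return HasSameSign; }
};

// An "unaudited" callee that still takes the plain enum.
inline bool isUnsignedLess(Predicate P) { return P == ICMP_ULT; }
```

Passing a `PredWithSameSign` to `isUnsignedLess` compiles and gives the same answer as before the migration; the flag is simply lost at that boundary, which is conservative rather than incorrect.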
2024-11-21 | [LLVM][IR] Teach extractelement folds about constant ConstantInt/FP. (#116793) | Paul Walker | 1 | -2/+10
2024-11-20 | IR: de-duplicate two CmpInst routines (NFC) (#116866) | Ramkumar Ramachandra | 1 | -35/+1
De-duplicate the functions getSignedPredicate and getUnsignedPredicate, nearly identical versions of which were present in CmpInst and ICmpInst, creating less confusion.
2024-11-12 | [IR] Add helper for comparing KnownBits with IR predicate (NFC) (#115878) | Nikita Popov | 1 | -0/+30
Add `ICmpInst::compare()` overload accepting `KnownBits`, similar to the existing one accepting `APInt`. This is not directly part of KnownBits (or APInt) for layering reasons.
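What "comparing known bits with a predicate" can mean is sketched below for a single predicate, unsigned less-than. `Known64` is a simplified 64-bit stand-in for `llvm::KnownBits`, and `knownULT` is a hypothetical helper, not the signature of the new overload.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified known-bits for a 64-bit value: a bit set in Zero is known to
// be 0, a bit set in One is known to be 1, all other bits are unknown.
struct Known64 {
  uint64_t Zero = 0, One = 0;
  uint64_t minValue() const { return One; }   // every unknown bit is 0
  uint64_t maxValue() const { return ~Zero; } // every unknown bit is 1
};

// Decide `LHS u< RHS` when the known bits are enough to do so.
// Returns std::nullopt when the comparison could go either way.
inline std::optional<bool> knownULT(const Known64 &LHS, const Known64 &RHS) {
  assert((LHS.Zero & LHS.One) == 0 && (RHS.Zero & RHS.One) == 0 &&
         "a bit cannot be known 0 and known 1 at once");
  if (LHS.maxValue() < RHS.minValue())
    return true;
  if (LHS.minValue() >= RHS.maxValue())
    return false;
  return std::nullopt;
}
```

The optional result reflects that known bits usually give only a partial answer, in contrast to comparing two fully known `APInt` values.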
2024-11-04 | IR: introduce CmpInst::isEquivalence (#111979) | Ramkumar Ramachandra | 1 | -0/+31
Steal impliesEquivalanceIf{True,False} (sic) from GVN, and extend it for floating-point constant vectors, and accounting for denormal values. Since InstCombine also performs GVN-like replacements, introduce CmpInst::isEquivalence, and remove the corresponding code in GVN, with the intent of using it in more places. The code in GVN also has a bad FIXME saying that the optimization may be valid in the nsz case, but this is not the case. Alive2 proof: https://alive2.llvm.org/ce/z/vEaK8M
2024-09-30 | [NFC] Use initial-stack-allocations for more data structures (#110544) | Jeremy Morse | 1 | -1/+1
This replaces some of the most frequent offenders of using a DenseMap that cause a malloc, where the typical element-count is small enough to fit in an initial stack allocation. Most of these are fairly obvious, one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we end up calling malloc as well for the DenseMap in the MapVector.
2024-09-11 | Don't rely on undefined behavior to store how a `User` object's allocation is laid out (#105714) | Daniel Paoliello | 1 | -117/+118
In `User::operator new` a single allocation is created to store the `User` object itself, "intrusive" operands or a pointer for "hung off" operands, and the descriptor. After allocation, details about the layout (number of operands, how the operands are stored, if there is a descriptor) are stored in the `User` object by setting its fields. The `Value` and `User` constructors are then very careful not to initialize these fields so that the values set during allocation can be subsequently read.

However, when the `User` object is returned from `operator new` [its value is technically "indeterminate" and so reading a field without first initializing it is undefined behavior (and will be erroneous in C++26)](https://en.cppreference.com/w/cpp/language/default_initialization#Indeterminate_and_erroneous_values).

We discovered this issue when trying to build LLVM using MSVC's [`/sdl` flag](https://learn.microsoft.com/en-us/cpp/build/reference/sdl-enable-additional-security-checks?view=msvc-170) which clears class fields after allocation (the docs say that this feature shouldn't be turned on for custom allocators and should only clear pointers, but that doesn't seem to match the implementation). MSVC's behavior both with and without the `/sdl` flag is standards conforming since a program is supposed to initialize storage before reading from it, thus the compiler implementation changing any values will never be observed in a well-formed program. The standard also provides no provisions for making storage bytes not indeterminate by setting them during allocation or `operator new`.

The fix for this is to create a set of types that encode the layout and provide these to both `operator new` and the constructor:

* The `AllocMarker` types are used to select which `operator new` to use.
* `AllocMarker` can then be implicitly converted to an `AllocInfo` which tells the constructor how the type was laid out.
2024-09-06 | Add usub_cond and usub_sat operations to atomicrmw (#105568) | anjenner | 1 | -0/+4
These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.
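The two operations can be modelled in standalone C++ with a compare-exchange loop. This is a sketch of the described semantics on 32-bit values, not the way any backend lowers them; `atomicUSubCond` and `atomicUSubSat` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// usub_cond: subtract only if it would not go below zero, otherwise keep
// the old value (the minuend). Returns the previous value, like atomicrmw.
inline uint32_t atomicUSubCond(std::atomic<uint32_t> &Obj, uint32_t Val) {
  uint32_t Old = Obj.load();
  // On failure, Old is refreshed and the new value is recomputed.
  while (!Obj.compare_exchange_weak(Old, Old >= Val ? Old - Val : Old)) {
  }
  return Old;
}

// usub_sat: subtract, clamping the stored result at zero.
inline uint32_t atomicUSubSat(std::atomic<uint32_t> &Obj, uint32_t Val) {
  uint32_t Old = Obj.load();
  while (!Obj.compare_exchange_weak(Old, Old >= Val ? Old - Val : 0u)) {
  }
  return Old;
}
```

Starting from 10, `atomicUSubCond(A, 3)` stores 7; a following `atomicUSubCond(A, 8)` leaves 7 in place, while `atomicUSubSat(A, 8)` stores 0.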
2024-08-08 | [DebugInfo][RemoveDIs] Use iterators to insert everywhere (#102003) | Jeremy Morse | 1 | -2/+3
These are the final few places in LLVM where we use instruction pointers to identify the position that we're inserting something. We're trying to get away from that with a view to deprecating those methods, thus use iterators in all these places. I believe they're all debug-info safe. The sketchiest part is the ExtractValueInst copy constructor, where we cast nullptr to a BasicBlock pointer, so that we take the non-default insert-into-no-block path for instruction insertion, instead of the default nullptr-instruction path for UnaryInstruction. Such a hack is necessary until we get rid of the instruction constructor entirely.
2024-07-25 | Remove the `x86_mmx` IR type. (#98505) | James Y Knight | 1 | -9/+0
It is now translated to `<1 x i64>`, which allows the removal of a bunch of special casing. This _incompatibly_ changes the ABI of any LLVM IR function with `x86_mmx` arguments or returns: instead of passing in mmx registers, they will now be passed via integer registers. However, the real-world incompatibility caused by this is expected to be minimal, because Clang never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>` or `double`, depending on ABI. This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type. That type simply no longer corresponds to an IR type, and is used only by MMX intrinsics and inline-asm operands. Because SelectionDAGBuilder only knows how to generate the operands/results of intrinsics based on the IR type, it thus now generates the intrinsics with the type MVT::v1i64, instead of MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus have the X86 backend fix them up in DAGCombine. (This may be a short-lived hack, if all the MMX intrinsics can be removed in upcoming changes.) Works towards issue #98272.
2024-07-03 | [IR] Add overflow check in AllocaInst::getAllocationSize (#97170) | Tsz Chan | 1 | -4/+13
Fixes #91380.
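The commit message gives no detail beyond its title, so the following is only a generic sketch of an overflow-checked size computation, not the code that was added. `checkedAllocSize` is a hypothetical name, and the choice of `std::optional` for the failure case is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Size in bytes of `Count` elements of `ElemSize` bytes each, or
// std::nullopt if the product does not fit in 64 bits.
inline std::optional<uint64_t> checkedAllocSize(uint64_t ElemSize,
                                                uint64_t Count) {
  // ElemSize * Count overflows exactly when ElemSize > max / Count.
  if (Count != 0 && ElemSize > std::numeric_limits<uint64_t>::max() / Count)
    return std::nullopt;
  return ElemSize * Count;
}
```

Returning "unknown" instead of a wrapped-around product keeps callers from treating a huge allocation as a small one.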
2024-06-27 | [IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902) | Nikita Popov | 1 | -2/+2
This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2024-06-24 | [llvm][ProfDataUtils] Provide getNumBranchWeights API (#90146) | Paul Kirth | 1 | -5/+1
As suggested in https://github.com/llvm/llvm-project/pull/86609/files#r1556689262 an API for getting the number of branch weights directly from the MD node would be useful in a variety of checks, and keeps the logic within ProfDataUtils.
2024-06-24 | [IR] Generate poison for all-poison scalable shufflevector mask | Nikita Popov | 1 | -1/+1
Ultimately doesn't matter because the bitcode reader interprets undef and poison interchangeably in this context.
2024-06-24 | [IR] Use poison instead of undef for self-referential phi | Nikita Popov | 1 | -1/+1
2024-06-20 | [LLVM] Add InsertPosition union-type to remove overloads of Instruction-creation (#94226) | Stephen Tozer | 1 | -1317/+93
This patch simplifies instruction creation by replacing all overloads of instruction constructors/Create methods that are identical other than the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator InsertBefore argument with a single version that takes an InsertPosition argument. The InsertPosition class can be implicitly constructed from any of the above, internally converting them to the appropriate BasicBlock::iterator value which can then be used to insert the instruction (or to not insert it if an invalid iterator is passed).

The upshot of this is that code will be deduplicated, and all callsites will switch to calling the new unified version without any changes needed to make the compiler happy. There is at least one exception to this; the construction of InsertPosition is a user-defined conversion, so any caller that was already relying on a different user-defined conversion won't work. In all of LLVM and Clang this happens exactly once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca with an AssertingVH<Instruction> argument, which must now be cast to an Instruction* by using `&*`. If this is more common elsewhere, it could be fixed by adding an appropriate constructor to InsertPosition.
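The pattern of collapsing overloads behind one implicitly constructible position type can be sketched with standard containers. Everything here is a hypothetical model: `Inst`, `Block`, `InsertPos`, and `createInst` are invented names, and a `std::list` stands in for a basic block.

```cpp
#include <cassert>
#include <list>

struct Inst {
  int Id;
};
using Block = std::list<Inst>;

// Hypothetical stand-in for an insert-position type: it can be built from
// "nothing", from a block (meaning "at end"), or from a block plus an
// iterator (meaning "before this element").
class InsertPos {
  Block *Parent = nullptr;
  Block::iterator It;

public:
  InsertPos() = default; // invalid position: do not insert
  InsertPos(Block &AtEnd) : Parent(&AtEnd), It(AtEnd.end()) {}
  InsertPos(Block &B, Block::iterator Before) : Parent(&B), It(Before) {}

  bool isValid() const { return Parent != nullptr; }
  Block *getBlock() const { return Parent; }
  Block::iterator getIterator() const { return It; }
};

// A single creation function replaces one overload per insertion style.
inline Inst createInst(int Id, InsertPos Pos = {}) {
  Inst I{Id};
  if (Pos.isValid())
    Pos.getBlock()->insert(Pos.getIterator(), I);
  return I;
}
```

Call sites such as `createInst(1, B)`, `createInst(2, {B, B.begin()})`, and `createInst(3)` all resolve to the same function. As the message notes for the real class, a caller that already depends on another user-defined conversion would need an explicit cast, because only one such conversion is applied implicitly.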
2024-06-12 | Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" #95136 (#95281) | Paul Kirth | 1 | -1/+5
Reverts #95060, and relands #86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for branch weights, which can be used to distinguish weights added from llvm.expect* intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler.

One of the major motivations is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case of ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.
2024-06-11 | Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060) | Paul Kirth | 1 | -5/+1
Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds (https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u). It also introduced unintended changes to `.text` which should be addressed before relanding.
2024-06-10 | [llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609) | Paul Kirth | 1 | -1/+5
This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for branch weights, which can be used to distinguish weights added from `llvm.expect*` intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler.

One of the major motivations is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case of ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.