path: root/llvm/lib/IR
Age | Commit message | Author | Files | Lines
2 days | [Intrinsic] Unify IIT_STRUCT{2-9} into IIT_STRUCT to support up to 257 return values | Michael Liao | 1 | -23/+3
Currently, an intrinsic can only have up to 9 return values. If new intrinsics require more than 9 return values, additional IIT_STRUCTxxx values need to be added to support them. Instead, this patch unifies them into a single IIT_STRUCT followed by a byte specifying the number of return values, from the minimum of 2 (encoded as 0) to the maximum of 257 (encoded as 255).
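The byte encoding described above can be sketched as follows. This is only an illustration of the arithmetic; the helper names `encodeStructCount` and `decodeStructCount` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLVM API: they only illustrate the
// "count minus 2" byte encoding described in the commit message.

// Encode a return-value count in [2, 257] as one byte: 2 -> 0, 257 -> 255.
inline std::uint8_t encodeStructCount(unsigned NumReturns) {
  assert(NumReturns >= 2 && NumReturns <= 257 && "count out of range");
  return static_cast<std::uint8_t>(NumReturns - 2);
}

// Decode the byte that follows IIT_STRUCT back into a return-value count.
inline unsigned decodeStructCount(std::uint8_t Byte) {
  return static_cast<unsigned>(Byte) + 2;
}
```

Biasing by 2 lets a single byte cover the whole range, since a struct return with fewer than two members never needs the marker.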
3 days | [DebugInfo] Handle followup loop metadata in updateLoopMetadataDebugLocations (#157557) | Björn Pettersson | 1 | -3/+35
Inliner/IROutliner/CodeExtractor all use the updateLoopMetadataDebugLocations helper to modify debug locations related to loop metadata. However, the helper has only been updating DILocation nodes found as operands at the first level of the MD_loop metadata. There can be more DILocations as part of the various kinds of followup metadata. A typical example would be llvm.loop metadata like this:
    !6 = distinct !{!6, !7, !8, !9, !10, !11}
    !7 = !DILocation(line: 6, column: 3, scope: !3)
    !8 = !DILocation(line: 7, column: 22, scope: !3)
    !11 = !{!"llvm.loop.distribute.followup_all", !7, !8, ..., !14}
    !14 = !{!"llvm.loop.vectorize.followup_all", !7, !8, ...}
Instead of just updating !7 and !8 in !6, this patch makes sure that we now recursively update the DILocations in !11 and !14 as well. Fixes #141568
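The recursive walk can be illustrated with a toy tree. This is a sketch under simplifying assumptions: `Node`, `makeLoc`, `makeTuple` and `updateLocations` are hypothetical stand-ins rather than LLVM's `MDNode` API, and the tree is stored by value, so the self-reference in the first operand of real loop metadata is not modelled.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a metadata node (hypothetical, not llvm::MDNode).
struct Node {
  bool IsLocation = false;    // true for a DILocation-like leaf
  unsigned Line = 0;          // payload of a location leaf
  std::vector<Node> Operands; // children of a tuple-like node
};

inline Node makeLoc(unsigned Line) {
  Node N;
  N.IsLocation = true;
  N.Line = Line;
  return N;
}

inline Node makeTuple(std::vector<Node> Ops) {
  Node N;
  N.Operands = std::move(Ops);
  return N;
}

// Apply Update to every location leaf, descending into nested tuples
// (the "followup" nodes) instead of stopping at the first level.
template <typename Fn> void updateLocations(Node &N, Fn Update) {
  if (N.IsLocation) {
    N.Line = Update(N.Line);
    return;
  }
  for (Node &Op : N.Operands)
    updateLocations(Op, Update);
}
```

A first-level-only version would loop over `N.Operands` once without recursing, which is the behaviour the commit message describes as insufficient.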
4 days | [LLVMContext] Add OB_align assume bundle op ID. (#158078) | Florian Hahn | 1 | -1/+3
Assume operand bundles are emitted in a few more places now, including in various places in libc++. Add a dedicated ID for them. PR: https://github.com/llvm/llvm-project/pull/158078
4 days | [Verifier] Modify TBAAVerifier helpers signatures to accept a nullable (NFC) | Antonio Frighetto | 1 | -15/+15
sanitizer-aarch64-linux-bootstrap-ubsan buildbot was previously failing. Resolves: https://lab.llvm.org/buildbot/#/builders/169/builds/15232.
4 days | [IR] Introduce `llvm.errno.tbaa` metadata for errno alias disambiguation | Antonio Frighetto | 1 | -27/+42
Add a new named, module-level, frontend-annotated metadata node that specifies the TBAA node for an integer access, which C/C++ `errno` accesses are guaranteed to use (under strict aliasing). This should allow LLVM to prove that the involved memory locations/accesses may not alias `errno`, and thus to perform optimizations around errno-writing libcalls (store-to-load forwarding amongst others). Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
4 days | [IR] Forbid mixing condition and operand bundle assumes (#160460) | Nikita Popov | 1 | -0/+5
Assumes either have a boolean condition, or a number of attribute based operand bundles. Currently, we also allow mixing both forms, though we don't make use of this in practice. This adds additional complexity for code dealing with assumes. Forbid mixing both forms, by requiring that assumes with operand bundles have an i1 true condition.
5 days | [DataLayout][LangRef] Split non-integral and unstable pointer properties | Alexander Richardson | 1 | -10/+34
This commit adds finer-grained versions of isNonIntegralAddressSpace() and isNonIntegralPointerType() where the current semantics prohibit introduction of both ptrtoint and inttoptr instructions. The current semantics are too strict for some targets (e.g. AMDGPU/CHERI) where ptrtoint has a stable value, but the pointer has additional metadata. Currently, marking a pointer address space as non-integral also marks it as having an unstable bitwise representation (e.g. when pointers can be changed by a copying GC). This property inhibits a lot of optimizations that are perfectly legal for other non-integral pointers such as fat pointers or CHERI capabilities that have a well-defined bitwise representation but can't be created with only an address. This change splits the properties of non-integral pointers and allows for address spaces to be marked as unstable or non-integral (or both) independently using the 'p' part of the DataLayout string. A 'u' following the p marks the address space as unstable, and specifying an index width != representation width marks it as non-integral. Finally, we also add an 'e' flag to mark pointers with external state (such as the CHERI capability validity state). These pointers require special handling of loads and stores in addition to being non-integral. This does not change the checks in any of the passes yet - we currently keep the existing non-integral behaviour. In the future I plan to audit calls to DL.isNonIntegral[PointerType]() and replace them with the DL.mustNotIntroduce{IntToPtr,PtrToInt}() checks that allow for more optimizations. RFC: https://discourse.llvm.org/t/rfc-finer-grained-non-integral-pointer-properties/83176 Reviewed By: nikic, krzysz00 Pull Request: https://github.com/llvm/llvm-project/pull/105735
6 days | [IR] Check identical alignment for atomic instructions (#155349) | Ellis Hoag | 1 | -1/+5
I noticed that `hasSameSpecialState()` checks alignment for `load`/`store` instructions, but not for `cmpxchg` or `atomicrmw`, which I assume is a bug. It looks like alignment for these instructions was added in https://github.com/llvm/llvm-project/commit/74c723757e69fbe7d85e42527d07b728113699ae.
6 days | [Remarks] Restructure bitstream remarks to be fully standalone (#156715) | Tobias Stadler | 1 | -20/+43
Currently there are two serialization modes for bitstream Remarks: standalone and separate. The separate mode splits remark metadata (e.g. the string table) from actual remark data. The metadata is written into the object file by the AsmPrinter, while the remark data is stored in a separate remarks file. This means we can't use bitstream remarks with tools like opt that don't generate an object file. Also, it is confusing to post-process bitstream remarks files, because only the standalone files can be read by llvm-remarkutil. We always need to use dsymutil to convert the separate files to standalone files, which only works for MachO. It is not possible for clang/opt to directly emit bitstream remark files in standalone mode, because the string table can only be serialized after all remarks were emitted. Therefore, this change completely removes the separate serialization mode. Instead, the remark string table is now always written to the end of the remarks file. This requires us to tell the serializer when to finalize remark serialization. This automatically happens when the serializer goes out of scope. However, often the remark file goes out of scope before the serializer is destroyed. To diagnose this, I have added an assert to alert users that they need to explicitly call finalizeLLVMOptimizationRemarks. This change paves the way for further improvements to the remark infrastructure, including more tooling (e.g. #159784), size optimizations for bitstream remarks, and more. Pull Request: https://github.com/llvm/llvm-project/pull/156715
6 days | [CHERI] Add enum values and LL parse/print support for CHERIoT calling conventions. (#156328) | Owen Anderson | 1 | -0/+9
This is the set of the calling conventions supported by the CHERIoT downstream of LLVM. --------- Co-authored-by: Nikita Popov <github@npopov.com>
8 days | [IR] Fix a few implicit conversions from TypeSize to uint64_t. NFC (#159894) | Craig Topper | 3 | -4/+8
8 days | [IR] Simplify dispatchRecalculateHash and dispatchResetHash (NFC) (#159903) | Kazu Hirata | 1 | -4/+2
This patch simplifies dispatchRecalculateHash and dispatchResetHash with "constexpr if". This patch does not inline dispatchRecalculateHash and dispatchResetHash into their respective call sites. Using "constexpr if" in a non-template context like MDNode::uniquify would still require the discarded branch to be syntactically valid, causing a compilation error for node types that do not have recalculateHash/setHash. Using template functions ensures that the "constexpr if" is evaluated in a proper template context, allowing the compiler to fully discard the inactive branch.
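The point about needing a template context can be sketched as below. The node types and the `HasRecalculateHash` trait are hypothetical stand-ins, and the detection idiom used here (`std::void_t`) is swapped in for brevity; it is not claimed to be what the patch uses.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical node types: only one of them caches a hash.
struct WithHash {
  unsigned Hash = 0;
  void recalculateHash() { Hash = 42; }
};
struct WithoutHash {};

// Stand-in trait: true when T has a callable recalculateHash() member.
template <typename T, typename = void>
struct HasRecalculateHash : std::false_type {};
template <typename T>
struct HasRecalculateHash<
    T, std::void_t<decltype(std::declval<T &>().recalculateHash())>>
    : std::true_type {};

// Because NodeTy is a template parameter, the discarded branch is never
// instantiated, so WithoutHash compiles even though it lacks the member.
template <typename NodeTy> void dispatchRecalculateHash(NodeTy *N) {
  if constexpr (HasRecalculateHash<NodeTy>::value)
    N->recalculateHash();
}
```

Writing the same `if constexpr` directly in a non-template function that names `WithoutHash` concretely would still have to compile `N->recalculateHash()`, which is the failure mode the commit message warns about.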
8 days | [IR] Modernize HasCachedHash (NFC) (#159902) | Kazu Hirata | 1 | -7/+3
This patch modernizes HasCachedHash. - "struct SFINAE" is replaced with identically defined SameType. - The return types Yes and No are replaced with std::true_type and std::false_type. My previous attempt (#159510) to clean up HasCachedHash failed on clang++-18, but this version works with clang++-18.
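A minimal sketch of the overload-based detection with `std::true_type` and `std::false_type` return types follows. `SameType` is defined locally here for self-containment, and `CachedNode`/`PlainNode` are hypothetical test types; the exact shape of the code in the patch may differ.

```cpp
#include <cassert>
#include <type_traits>

// Local helper: its only job is to make the member-pointer expression
// part of a type, so that a missing member causes substitution failure.
template <class U, U Val> struct SameType {};

template <class NodeTy> struct HasCachedHash {
  // Viable only if U has a member `void setHash(unsigned)`.
  template <class U>
  static std::true_type check(SameType<void (U::*)(unsigned), &U::setHash> *);
  // Fallback overload, chosen when the one above is not viable.
  template <class U> static std::false_type check(...);

  static constexpr bool value = decltype(check<NodeTy>(nullptr))::value;
};

// Hypothetical node types used to exercise the trait.
struct CachedNode {
  void setHash(unsigned) {}
};
struct PlainNode {};
```

Returning `std::true_type`/`std::false_type` lets the result be read from the overload's return type via `decltype`, instead of comparing `sizeof` of two differently sized `Yes`/`No` types.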
9 days | [IR] enable attaching metadata on ifuncs (#158732) | Wael Yehia | 2 | -0/+18
Teach the IR parser and writer to support metadata on ifuncs, and update documentation. In PR #153049, we have a use case of attaching the `!associated` metadata to an ifunc. Since an ifunc is similar to a function declaration, it seems natural to allow metadata on ifuncs. Currently, the metadata API allows adding Metadata to llvm::GlobalObject, so the in-memory IR allows for metadata on ifuncs, but the IR reader/writer is not aware of that. --------- Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>
10 days | Revert "[IR] Simplify HasCachedHash with is_detected (NFC) (#159510)" (#159622) | Jordan Rupprecht | 1 | -2/+7
This reverts commit d6b7ac830ab4c1b26a1b2eecd15306eccf9cea90. Build breakages reported on the PR hint at not working with certain versions of the host compiler.
10 days | [IR] Simplify HasCachedHash with is_detected (NFC) (#159510) | Kazu Hirata | 1 | -7/+2
With is_detected, we don't need to implement a SFINAE trick on our own.
11 days | [IR] NFC: Remove 'experimental' from partial.reduce.add intrinsic (#158637) | Sander de Smalen | 2 | -2/+5
The partial reduction intrinsics are no longer experimental, because they've been used in production for a while and are unlikely to change.
11 days | [IR][CaptureTracking] Consider assume operand bundles captures(none) (#159083) | Nikita Popov | 1 | -0/+4
Something like `call void @llvm.assume(i1 true) ["align"(ptr %p, i64 8)]` is equivalent to placing an `align 8` attribute on the parameter and should not be considered as capturing.
12 days | Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161) | Mingming Liu | 1 | -1/+13
This is a reland of https://github.com/llvm/llvm-project/pull/158460 Test failures are gone once I undo the changes in codegenprepare.
12 days | Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159) | Mingming Liu | 1 | -13/+1
Reverts llvm/llvm-project#158460 due to buildbot failures
12 days | [NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460) | Mingming Liu | 1 | -1/+13
Before this change, `setSectionPrefix` overwrites the existing section prefix with the new one unconditionally. After this change, `setSectionPrefix` checks for equivalence, updates conditionally and returns whether an update happened. Update the existing callers to make use of the return value. [PR 155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9) is a motivating use case where the 'update' semantic is needed.
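The "compare, update conditionally, report whether anything changed" pattern can be sketched as follows. `ToyGlobal` is a hypothetical class, and treating an empty string as equivalent to "no prefix" is an assumption made for this sketch, not something stated in the commit message.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a global object that carries a section prefix.
class ToyGlobal {
  std::optional<std::string> Prefix;

public:
  // Returns true only if the stored prefix actually changed.
  // Assumption: an empty string means "no prefix".
  bool setSectionPrefix(const std::string &NewPrefix) {
    std::optional<std::string> New;
    if (!NewPrefix.empty())
      New = NewPrefix;
    if (Prefix == New)
      return false; // equivalent prefix: nothing to do
    Prefix = std::move(New);
    return true;
  }

  const std::optional<std::string> &getSectionPrefix() const { return Prefix; }
};
```

Callers can then accumulate the return value into a "changed" flag instead of assuming every call modified the object.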
12 days | Add DebugSSAUpdater class to track debug value liveness (#135349) | Stephen Tozer | 1 | -0/+4
This patch adds a class that uses SSA construction, with debug values as definitions, to determine whether and which debug values for a particular variable are live at each point in an IR function. This will be used by the IR reader of llvm-debuginfo-analyzer to compute variable ranges and coverage, although it may be applicable to other debug info IR analyses.
14 days | [llvm] Use std::bool_constant (NFC) (#158520) | Kazu Hirata | 1 | -3/+2
This patch replaces std::integral_constant<bool, ...> with std::bool_constant for brevity. Note that std::bool_constant was introduced as part of C++17. There are cases where we could replace EXPECT_EQ(false, ...) with EXPECT_FALSE(...), but I'm not doing that in this patch to avoid doing multiple things in one patch.
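As a small illustration of the equivalence, the two spellings name the same type. The `FitsInWord` traits below are hypothetical examples and are not taken from the patch.

```cpp
#include <cassert>
#include <type_traits>

// Pre-C++17 spelling of a boolean trait (hypothetical example).
template <typename T>
struct FitsInWordOld : std::integral_constant<bool, (sizeof(T) <= 8)> {};

// C++17 spelling: std::bool_constant<B> is an alias for
// std::integral_constant<bool, B>.
template <typename T>
struct FitsInWord : std::bool_constant<(sizeof(T) <= 8)> {};

static_assert(std::is_same_v<std::bool_constant<true>,
                             std::integral_constant<bool, true>>,
              "both spellings name the same type");
```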
2025-09-12 | [IR] Add `MD_prof` to the `Keep` list of `dropUBImplyingAttrsAndMetadata` (#154635) | Mircea Trofin | 1 | -3/+4
`MD_prof` is safe to keep when e.g. hoisting instructions. Issue #147390
2025-09-12 | [IntrinsicEmitter] Make AttributesMap bits adaptive (#157965) | Elvin Wang | 1 | -8/+0
Make the function and argument fields of IntrinsicsToAttributesMap adapt their sizes to the input rather than using a hardcoded 8 bits/8 bits. This will ease the pressure of adding new intrinsics in private downstreams. The function attribute bit size will become 7 (127/128) vs 8 (255/256).
2025-09-11 | [PGO] Add llvm.loop.estimated_trip_count metadata (#152775) | Joel E. Denny | 2 | -0/+13
This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As the RFC explains, that metadata enables future patches, such as PR #128785, to fix block frequency issues without losing estimated trip counts.
2025-09-11 | [llvm] Move data layout string computation to TargetParser (#157612) | Reid Kleckner | 1 | -12/+0
Clang and other frontends generally need the LLVM data layout string in order to generate LLVM IR modules for LLVM. MLIR clients often need it as well, since MLIR users often lower to LLVM IR. Before this change, the LLVM datalayout string was computed in the LLVM${TGT}CodeGen library in the relevant TargetMachine subclass. However, none of the logic for computing the data layout string requires any details of code generation. Clients who want to avoid duplicating this information were forced to link in LLVMCodeGen and all registered targets, leading to bloated binaries. This happened in PR #145899, which measurably increased binary size for some of our users. By moving this information to the TargetParser library, we can delete the duplicate datalayout strings in Clang, and retain the ability to generate IR for unregistered targets. This is intended to be a very mechanical LLVM-only change, but there is an immediately obvious follow-up to clang, which will be prepared separately. The vast majority of data layouts are computable with two inputs: the triple and the "ABI name". There is only one exception, NVPTX, which has a cl::opt to enable short device pointers. I invented a "shortptr" ABI name to pass this option through the target independent interface. Everything else fits. Mips is a bit awkward because it uses a special MipsABIInfo abstraction, which includes members with codegen-like concepts like ABI physical registers that can't live in TargetParser. I think the string logic of looking for "n32" "n64" etc is reasonable to duplicate. We have plenty of other minor duplication to preserve layering. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-11 | ARM: Move remaining half convert libcall config into tablegen (#153408) | Matt Arsenault | 1 | -25/+0
The __truncdfhf2 handling is kind of convoluted, but reproduces the existing, likely wrong, handling.
2025-09-10 | [profcheck] Require `unknown` metadata have an origin parameter (#157594) | Mircea Trofin | 2 | -12/+25
Rather than having passes use `!prof = !{!"unknown"}` in cases where they don't have enough information to emit profile values, this patch captures the pass (or some other information) that can help diagnostics, i.e. `!{!"unknown", !"some-pass-name"}`. For example, suppose we emitted a `select` with the unknown metadata, and, later, end up needing to lower that to a conditional branch. If we observe (via sample profiling, for example) that the branch is biased and would have benefitted from a valid profile, the extra information can help speed up debugging. We can also (in a subsequent pass) generate optimization remarks about such lowered selects, with a similar aim: identify patterns lowering to `select` that may be worth some extra investment in extracting a more precise profile.
2025-09-10 | [DebugInfo] When merging locations prefer unannotated empty locs (#157707) | Stephen Tozer | 1 | -3/+12
When merging DILocations, we prefer to use DebugLoc::getMergedLocation when possible to better preserve DebugLoc coverage tracking information through transformations (as conversion to DILocations drops all coverage tracking data). Currently, DebugLoc::getMergedLocation checks to see if either DebugLoc is empty and returns it directly if so, to propagate that DebugLoc's coverage tracking data to the merged location; however, it only checks whether either location is valid, not whether they are annotated. This is significant because an annotated location is not a bug, while an empty unannotated location may be one; therefore, we check to see if either location is unannotated, and prefer to return that location if it exists rather than an annotated one. This change is NFC outside of DebugLoc coverage tracking builds.
2025-09-10 | [x86][AVX-VNNI] Fix VPDPBUSD Argument Types (#155194) | BaiXilin | 1 | -13/+89
Fixed intrinsic VPDPBUSD[,S]_128/256/512's argument types to match with the ISA. Fixes part of #97271
2025-09-10 | [Verifier] Remove redundant null-check (NFC) (#157458) | Daniel Kuts | 1 | -1/+1
Fixes #157448
2025-09-08 | [DataLayout] Remove i1 alignment entry (#156657) | Nikita Popov | 1 | -1/+0
I don't think we need to explicitly specify i1 alignment, as this is going to fall back to i8 alignment. This may change behavior if a data layout explicitly sets i8 alignment without also setting i1 layout, but I'd expect this to be a bug fix in that case.
2025-09-05 | [LLD][COFF] Add more `--time-trace` tags for ThinLTO linking (#156471) | Alexandre Ganea | 4 | -0/+9
In order to better see what's going on during ThinLTO linking, this PR adds more profile tags when using `--time-trace` on a `lld-link.exe` invocation. After this PR, linking `clang.exe`: [screenshot: time trace of linking `clang.exe`] Linking a custom (Unreal Engine game) binary gives a completely different picture, probably because of using Unity files, and the sheer amount of input files (here, providing over 60 GB of .OBJs/.LIBs). [screenshot: time trace of linking the custom binary]
2025-09-05 | [X86][AVX10] Remove EVEX512 and AVX10-256 implementations (#157034) | Phoebe Wang | 1 | -10/+0
The 256-bit maximum vector register size control was removed from the AVX10 whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343 We have warned about these options in LLVM21 through #132542. This patch removes the underlying implementations in LLVM22.
2025-09-04 | [profcheck] Allow `unknown` function entry count (#155918) | Mircea Trofin | 2 | -7/+14
Some passes synthesize functions, e.g. WPD, so we may need to indicate "this synthesized function's entry count cannot be estimated at compile time", akin to `branch_weights`. Issue #147390
2025-09-04 | [AMDGPU][gfx1250] Add 128B cooperative atomics (#156418) | Pierre van Houtryve | 1 | -0/+22
- Add clang built-ins + sema/codegen - Add IR Intrinsic + verifier - Add DAG/GlobalISel codegen for the intrinsics - Add lowering in SIMemoryLegalizer using a MMO flag.
2025-09-04 | [DataLayout] Specialize the getTypeAllocSize() implementation (#156687) | Nikita Popov | 1 | -0/+38
getTypeAllocSize() currently works by taking the type store size and aligning it to the ABI alignment. However, this ends up doing redundant work in various cases, for example arrays will unnecessarily repeat the alignment step, and structs will fetch the StructLayout multiple times. As this code is rather hot (it is called every time we need to calculate GEP offsets for example), specialize the implementation. This repeats a small amount of logic from getAlignment(), but I think that's worthwhile.
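The generic rule (store size rounded up to the ABI alignment) and the redundancy in the array case can be sketched with plain integers. The function names below are local to this sketch and are not LLVM's `DataLayout` API; sizes and alignments are in bytes, and alignments are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstdint>

// Round Size up to the next multiple of Align (Align: power of two).
constexpr std::uint64_t alignTo(std::uint64_t Size, std::uint64_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Generic rule: alloc size = store size aligned to the ABI alignment.
constexpr std::uint64_t allocSize(std::uint64_t StoreSize,
                                  std::uint64_t ABIAlign) {
  return alignTo(StoreSize, ABIAlign);
}

// Array case: the element alloc size is already a multiple of the element
// alignment, so Count * element alloc size needs no second rounding step.
constexpr std::uint64_t arrayAllocSize(std::uint64_t Count,
                                       std::uint64_t ElemStoreSize,
                                       std::uint64_t ElemABIAlign) {
  return Count * allocSize(ElemStoreSize, ElemABIAlign);
}
```

For instance, an element with a 10-byte store size and 16-byte alignment occupies 16 bytes, and four of them occupy 64 bytes, which is already 16-byte aligned.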
2025-09-03 | Add documentation on debugging LLVM. | Peter Collingbourne | 1 | -0/+2
Reviewers: fmayer, nikic Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/156128
2025-09-03 | [DataLayout] Use linear scan to determine integer alignment (NFC) | Nikita Popov | 1 | -1/+6
The number of alignment entries is usually very small (5-7), so it is more efficient to use a linear scan than a binary search.
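A linear scan over a short, sorted table might look like the sketch below. `IntAlignSpec` and `lookupIntAlign` are hypothetical names, and the fallback rule used here (first entry at least as wide as the request, otherwise the widest entry) is an assumption of this sketch rather than a quote from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical table entry: integer bit width and its ABI alignment in bytes.
struct IntAlignSpec {
  unsigned BitWidth;
  unsigned ABIAlignBytes;
};

// Specs must be non-empty and sorted by ascending BitWidth.
// Returns the alignment of the first entry whose width is >= BitWidth,
// falling back to the widest entry when the request is larger than all.
inline unsigned lookupIntAlign(const std::vector<IntAlignSpec> &Specs,
                               unsigned BitWidth) {
  assert(!Specs.empty() && "need at least one alignment entry");
  for (const IntAlignSpec &S : Specs)
    if (S.BitWidth >= BitWidth)
      return S.ABIAlignBytes;
  return Specs.back().ABIAlignBytes;
}
```

With only a handful of entries, the loop touches a few adjacent elements and avoids the branching overhead of a binary search.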
2025-09-02 | [NFC] RuntimeLibcalls: Prefix the impls with 'Impl_' (#153850) | Daniel Paoliello | 1 | -6/+12
As noted in #153256, TableGen is generating reserved names for RuntimeLibcalls, which resulted in a build failure for Arm64EC since `vcruntime.h` defines `__security_check_cookie` as a macro. To avoid using reserved names, all impl names will now be prefixed with `Impl_`. `NumLibcallImpls` was lifted out as a `constexpr size_t` instead of being an enum field. While I was churning the dependent code, I also removed the TODO to move the impl enum into its own namespace and use an `enum class`: I experimented with using an `enum class` and adding a namespace, but we decided it was too verbose so it was dropped.
2025-09-02 | [IR] Allow nofree metadata to inttoptr (#153149) | Ruiling, Song | 2 | -0/+14
Our GPU compiler usually constructs pointers through inttoptr. The memory is pre-allocated before the shader function executes and remains valid throughout its execution. This brings back the expected behavior of instruction hoisting for the test `hoist-speculatable-load.ll`, which was broken by #126117.
2025-09-02 | [DataLayout] Explicitly call getFixedValue() (NFC) | Nikita Popov | 1 | -3/+4
Instead of relying on the implicit cast. The scalable case has been explicitly checked beforehand.
2025-09-01 | [MLIR] Add target_specific_attrs attribute to mlir.global (#154706) | Vadim Curcă | 1 | -0/+13
Adds a `target_specific_attrs` optional array attribute to `mlir.global`, as well as conversions to and from LLVM attributes on `llvm::GlobalVariable` objects. This is necessary to preserve unknown attributes on global variables when converting to and from the LLVM Dialect. Previously, any attributes on an `llvm::GlobalVariable` not explicitly modeled by `mlir.global` were dropped during conversion.
2025-08-30 | [NFC] Fix typos 'seperate' -> 'separate' (#144368) | Roman | 1 | -1/+1
Correct a few typos: 'seperate' -> 'separate'.
2025-08-29 | [DirectX] Make dx.RawBuffer an op that can't be replaced (#154620) | Farzon Lotfi | 2 | -6/+15
Fixes #152348. SimplifyCFG collapses a raw buffer store from an if/else load into a select. This change prevents the TargetExtType dx.RawBuffer from being replaced, thus preserving the if/else blocks. A further change was needed to eliminate the phi node before we process Intrinsic::dx_resource_getpointer in DXILResourceAccess.cpp.
2025-08-29 | Singleton hack of fixing static initialisation order fiasco (#154541) | dalmurii | 1 | -9/+13
https://github.com/llvm/llvm-project/issues/154528
# Brief
Indirect linking of llvm as a shared library is causing a "free() invalid size" abort. In my case, my project depends on google/clspv, which in turn pulls in `llvm`. Note that the issue does not occur when `clspv` and `llvm` are all statically linked.
# Structure of a project which might be causing an error
[google/clspv](https://github.com/google/clspv) has been depending on this project (llvm-project) as a static library. My personal project has been depending on [google/clspv](https://github.com/google/clspv) as a shared library. So `MyProject` was linked to the shared object `clspv_core.so`, which contains `llvm-project` as a component.
# Problem
Linking `llvm-project` indirectly to `MyProject` via `clspv_core` was causing the `free() invalid size` abort.
> When the library is all statically linked, this problem did not occur.
[This issue](https://github.com/llvm/llvm-project/issues/154528) has a full log of the programme running with valgrind.
# Suspected reason
`KnownAssumptionStrings` from [clang/lib/Sema/SemaOpenMP.cpp](https://github.com/llvm/llvm-project/pull/154541/files#diff-032b46da5a8b94f6d8266072e296726c361066e32139024c86dcba5bf64960fc), [llvm/include/llvm/IR/Assumptions.h](https://github.com/llvm/llvm-project/pull/154541/files#diff-ebb09639e5957c2e4d27be9dcb1b1475da67d88db829d24ed8039f351a63ccff), [llvm/lib/IR/Assumptions.cpp](https://github.com/llvm/llvm-project/pull/154541/files#diff-1b490dd29304c875364871e35e1cc8e47bf71898affe3a4dbde6eb91c4016d06) and `FeatureMap` from [llvm/lib/Analysis/MLInlineAdvisor.cpp](https://github.com/llvm/llvm-project/pull/154541/files#diff-26c738eb291410ed83595a4162de617e8cbebddb46331f56d39d193868e29857), [llvm/include/llvm/Analysis/InlineModelFeatureMaps.h](https://github.com/llvm/llvm-project/pull/154541/files#diff-3b5a3359b2a0784186fb3f90dfabf905e8640b6adfd7d2c75259a6835751a6a7) have been placed at global scope, causing a static initialisation order fiasco when indirectly linked by `MyProject`.
# Attempted fix
Changing the global instances mentioned above (`KnownAssumptionStrings` and `FeatureMap`) to functions which return an lvalue reference to a static variable (`getKnownAssumptionStrings()`, `getFeatureMap()`) has solved my personal problem, so I am opening a pull request for it.
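The fix described above follows the "construct on first use" idiom, which can be sketched as below. The names `getKnownStrings` and `Registrar` and the string `"example_assumption"` are hypothetical, and `std::set<std::string>` is used here in place of whatever container the real code uses.

```cpp
#include <cassert>
#include <set>
#include <string>

// Accessor returning a reference to a function-local static. The set is
// constructed the first time the function is called, so any caller,
// including another global's constructor, sees a fully built object.
inline std::set<std::string> &getKnownStrings() {
  static std::set<std::string> Strings;
  return Strings;
}

// A global whose constructor depends on the set. With a plain global set
// defined in another translation unit, the relative order of the two
// initialisations would be unspecified.
struct Registrar {
  explicit Registrar(const std::string &S) { getKnownStrings().insert(S); }
};

static Registrar ExampleRegistration("example_assumption");
```

The trade-off is that every access goes through a function call with a guarded initialisation check, in exchange for a well-defined construction order.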
2025-08-29 | [llvm][DebugInfo] Fix set debug validation with the addition of the new subrange debug (#154665) | peter mckinna | 1 | -0/+2
Set debug was failing validation where the base type was subrange. This has happened with the addition of the new functionality for proper subrange debugging. This fix just allows set types to be based on subranges.
2025-08-27 | [NVPTX] Auto-upgrade nvvm.grid_constant to param attribute (#155489) | Alex MacLean | 1 | -0/+10
Upgrade the !"grid_constant" !nvvm.annotation to a "nvvm.grid_constant" attribute. This attribute is much simpler for front-ends to apply and faster and simpler to query.
2025-08-26 | [AMDGPU] wmma_scale* IR verification (#155493) | Stanislav Mekhanoshin | 1 | -1/+3