aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/IR/DataLayout.cpp
AgeCommit message (Collapse)AuthorFilesLines
5 days[DataLayout][LangRef] Split non-integral and unstable pointer propertiesAlexander Richardson1-10/+34
This commit adds finer-grained versions of isNonIntegralAddressSpace() and isNonIntegralPointerType() where the current semantics prohibit introduction of both ptrtoint and inttoptr instructions. The current semantics are too strict for some targets (e.g. AMDGPU/CHERI) where ptrtoint has a stable value, but the pointer has additional metadata. Currently, marking a pointer address space as non-integral also marks it as having an unstable bitwise representation (e.g. when pointers can be changed by a copying GC). This property inhibits a lot of optimizations that are perfectly legal for other non-integral pointers such as fat pointers or CHERI capabilities that have a well-defined bitwise representation but can't be created with only an address. This change splits the properties of non-integral pointers and allows for address spaces to be marked as unstable or non-integral (or both) independently using the 'p' part of the DataLayout string. A 'u' following the p marks the address space as unstable and specifying a index width != representation width marks it as non-integral. Finally, we also add an 'e' flag to mark pointers with external state (such as the CHERI capability validity) state. These pointers require special handling of loads and stores in addition to being non-integral. This does not change the checks in any of the passes yet - we currently keep the existing non-integral behaviour. In the future I plan to audit calls to DL.isNonIntegral[PointerType]() and replace them with the DL.mustNotIntroduce{IntToPtr,PtrToInt}() checks that allow for more optimizations. RFC: https://discourse.llvm.org/t/rfc-finer-grained-non-integral-pointer-properties/83176 Reviewed By: nikic, krzysz00 Pull Request: https://github.com/llvm/llvm-project/pull/105735
2025-09-11[llvm] Move data layout string computation to TargetParser (#157612)Reid Kleckner1-12/+0
Clang and other frontends generally need the LLVM data layout string in order to generate LLVM IR modules for LLVM. MLIR clients often need it as well, since MLIR users often lower to LLVM IR. Before this change, the LLVM datalayout string was computed in the LLVM${TGT}CodeGen library in the relevant TargetMachine subclass. However, none of the logic for computing the data layout string requires any details of code generation. Clients who want to avoid duplicating this information were forced to link in LLVMCodeGen and all registered targets, leading to bloated binaries. This happened in PR #145899, which measurably increased binary size for some of our users. By moving this information to the TargetParser library, we can delete the duplicate datalayout strings in Clang, and retain the ability to generate IR for unregistered targets. This is intended to be a very mechanical LLVM-only change, but there is an immediately obvious follow-up to clang, which will be prepared separately. The vast majority of data layouts are computable with two inputs: the triple and the "ABI name". There is only one exception, NVPTX, which has a cl::opt to enable short device pointers. I invented a "shortptr" ABI name to pass this option through the target independent interface. Everything else fits. Mips is a bit awkward because it uses a special MipsABIInfo abstraction, which includes members with codegen-like concepts like ABI physical registers that can't live in TargetParser. I think the string logic of looking for "n32" "n64" etc is reasonable to duplicate. We have plenty of other minor duplication to preserve layering. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-08[DataLayout] Remove i1 alignment entry (#156657)Nikita Popov1-1/+0
I don't think we need to explicitly specify i1 alignment, as this is going to fall back to i8 alignment. This may change behavior if a data layout explicitly sets i8 alignment without also setting i1 layout, but I'd expect this to be a bug fix in that case.
2025-09-04[DataLayout] Specialize the getTypeAllocSize() implementation (#156687)Nikita Popov1-0/+38
getTypeAllocSize() currently works by taking the type store size and aligning it to the ABI alignment. However, this ends up doing redundant work in various cases, for example arrays will unnecessarily repeat the alignment step, and structs will fetch the StructLayout multiple times. As this code is rather hot (it is called every time we need to calculate GEP offsets for example), specialize the implementation. This repeats a small amount of logic from getAlignment(), but I think that's worthwhile.
2025-09-03[DataLayout] Use linear scan to determine integer alignment (NFC)Nikita Popov1-1/+6
The number of alignment entries is usually very small (5-7), so it is more efficient to use a linear scan than a binary search.
2025-09-02[DataLayout] Explicitly call getFixedValue() (NFC)Nikita Popov1-3/+4
Instead of relying on the implicit cast. The scalable case has been explicitly checked beforehand.
2025-05-12[IR] Use llvm::upper_bound (NFC) (#139656)Kazu Hirata1-5/+4
2025-05-06[NFC][llvm] Drop isOsWindowsOrUEFI API (#138733)Prabhu Rajasekaran1-1/+1
The Triple and SubTarget API functions isOsWindowsOrUEFI is not preferred. Dropping them.
2025-01-28[nfc][llvm] Clean up isUEFI checks (#124845)Prabhuk1-1/+1
The check for `isOSWindows() || isUEFI()` is used in several places across the codebase. Introducing `isOSWindowsOrUEFI()` in Triple.h to simplify these checks.
2024-12-13[DataLayout] Remove getMaxIndexSizeInBits() APINikita Popov1-7/+0
The last use was removed in #119365, and we should not add more uses of this concept in the future either.
2024-12-06DataLayout: Fix latent issues with getMaxIndexSizeInBits (#118740)Owen Anderson1-4/+2
Because it was implemented in terms of getMaxIndexSize, it was always rounding the values up to a multiple of 8. Additionally, it was using the PointerSpec's BitWidth rather than its IndexBitWidth, which was self-evidently incorrect. Since getMaxIndexSize was only used by getMaxIndexSizeInBits, and its name and function seem niche and somewhat confusing, go ahead and remove it until a concrete need for it arises.
2024-10-25[DataLayout] Refactor storage of non-integral address spacesAlexander Richardson1-9/+23
Instead of storing this as a separate array of non-integral pointers, add it to the PointerSpec class instead. This will allow for future simplifications such as splitting the non-integral property into multiple distinct ones: relocatable (i.e. non-stable representation) and non-integral representation (i.e. pointers with metadata). Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/105734
2024-08-20[DataLayout] Refactor the rest of `parseSpecification` (#104545)Sergei Barannikov1-97/+43
The aim is to improve test coverage of data layout string parsing. Pull Request: https://github.com/llvm/llvm-project/pull/104545
2024-08-19[DataLayout] Refactor parsing of i/f/v/a specifications (#104699)Sergei Barannikov1-93/+89
Split off of #104545 to reduce patch size.
2024-08-16[DataLayout] Refactor parsing of "p" specification (#104583)Sergei Barannikov1-73/+108
Split off of #104545 to reduce patch size. Similar to #104546, this introduces `parseSize` and `parseAlignment`, which are improved versions of `getInt` tailored for specific needs. I'm not a GTest guru, so the tests are not ideal.
2024-08-16[DataLayout] Refactor parsing of "ni" specification (#104546)Sergei Barannikov1-16/+36
Split off of #104545 to reduce patch size. This introduces `parseAddrSpace` function, intended as a replacement for `getAddrSpace`, which doesn't check for trailing characters after the address space number. `getAddrSpace` will be removed after switching all uses to `parseAddrSpace`. Pull Request: https://github.com/llvm/llvm-project/pull/104546
2024-08-15[DataLayout] Extract loop body into a function to reduce nesting (NFC) (#104420)Sergei Barannikov1-244/+251
Also, use `iterator_range` version of `split`. Pull Request: https://github.com/llvm/llvm-project/pull/104420
2024-08-15[DataLayout] Add helper predicates to sort specifications (NFC) (#104417)Sergei Barannikov1-23/+24
2024-08-15[DataLayout] Move '*AlignElem' structs and enum inside DataLayout (NFC) ↵Sergei Barannikov1-131/+104
(#103723) This makes `LayoutAlignElem` / `PointerAlignElem` and `AlignTypeEnum` inner types of `DataLayout`. The types are also renamed to match their meaning (LangRef refers to them as "specification" and "specifier"). Pull Request: https://github.com/llvm/llvm-project/pull/103723
2024-08-14[DataLayout] Use member initialization (NFC) (#103712)Sergei Barannikov1-29/+30
This also adds a default constructor and a few uses of it.
2024-08-14[DataLayout] Split StructAlignment into two fields (NFC) (#103700)Sergei Barannikov1-7/+8
Aggregate type specification doesn't have the size component. Don't abuse LayoutAlignElem to avoid confusion.
2024-08-13[DataLayout] Remove `clear` and `reset` methods (NFC) (#102993)Sergei Barannikov1-48/+29
`clear` was never necessary as it is always called on a fresh instance of the class or just before freeing an instance's memory. `reset` is effectively the same as the constructor. Pull Reuquest: https://github.com/llvm/llvm-project/pull/102993
2024-08-13[DataLayout] Remove constructor accepting a pointer to Module (#102841)Sergei Barannikov1-7/+0
The constructor initializes `*this` with `M->getDataLayout()`, which is effectively the same as calling the copy constructor. There does not seem to be a case where a copy would be necessary. Pull Request: https://github.com/llvm/llvm-project/pull/102841
2024-08-12[DataLayout] Move `operator=` to cpp file (NFC) (#102849)Sergei Barannikov1-19/+39
`DataLayout` isn't exactly cheap to copy (448 bytes on a 64-bit host). Move `operator=` to cpp file to improve compilation time. Also move `operator==` closer to `operator=` and add a couple of FIXMEs.
2024-08-08[DataLayout] Remove deprecated method (#101495)Sergei Barannikov1-5/+0
The method has been deprecated for a year and a half now.
2024-07-25Remove the `x86_mmx` IR type. (#98505)James Y Knight1-1/+0
It is now translated to `<1 x i64>`, which allows the removal of a bunch of special casing. This _incompatibly_ changes the ABI of any LLVM IR function with `x86_mmx` arguments or returns: instead of passing in mmx registers, they will now be passed via integer registers. However, the real-world incompatibility caused by this is expected to be minimal, because Clang never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>` or `double`, depending on ABI. This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type. That type simply no longer corresponds to an IR type, and is used only by MMX intrinsics and inline-asm operands. Because SelectionDAGBuilder only knows how to generate the operands/results of intrinsics based on the IR type, it thus now generates the intrinsics with the type MVT::v1i64, instead of MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus have the X86 backend fix them up in DAGCombine. (This may be a short-lived hack, if all the MMX intrinsics can be removed in upcoming changes.) Works towards issue #98272.
2024-03-10Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)Justin Lebar1-1/+1
For some reason this was missing from STLExtras.
2024-01-04[IR] Fix GEP offset computations for vector GEPs (#75448)Jannik Silvanus1-3/+2
Vectors are always bit-packed and don't respect the elements' alignment requirements. This is different from arrays. This means offsets of vector GEPs need to be computed differently than offsets of array GEPs. This PR fixes many places that rely on an incorrect pattern that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`. We replace these by usages of `GTI.getSequentialElementStride(DL)`, which is a new helper function added in this PR. This changes behavior for GEPs into vectors with element types for which the (bit) size and alloc size is different. This includes two cases: * Types with a bit size that is not a multiple of a byte, e.g. i1. GEPs into such vectors are questionable to begin with, as some elements are not even addressable. * Overaligned types, e.g. i16 with 32-bit alignment. Existing tests are unaffected, but a miscompilation of a new test is fixed. --------- Co-authored-by: Nikita Popov <github@npopov.com>
2023-11-22[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)Sander de Smalen1-5/+5
It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)
2023-10-26[IR] Require index width to be ule pointer width (#70015)Nikita Popov1-0/+2
I don't think there is a use case for having an index type that is wider than the pointer type, and I'm not entirely clear what semantics this would even have. Also clarify the GEP semantics to explicitly say how they interact with the index type width.
2023-09-28[Basic] Support 64-bit x86 target for UEFIprabhukr1-1/+1
Adding support for X86_64 UEFI target to begin with. Reviewed By: phosek, MaskRay Differential Revision: https://reviews.llvm.org/D152206
2023-09-28Revert "[Basic] Support 64-bit x86 target for UEFI"prabhukr1-1/+1
This reverts commit 315a407086b0ab302d0293b720d7f9b3e8f6ffa9. The new test added fails to link the unit tests correctly and breaks certain buildbots.
2023-09-27[Basic] Support 64-bit x86 target for UEFIprabhukr1-1/+1
Adding support for X86_64 UEFI target to begin with. Reviewed By: phosek, MaskRay Differential Revision: https://reviews.llvm.org/D152206
2023-06-12[IR] Remove getABITypeAlignmentKazu Hirata1-5/+0
The last use of getABITypeAlignment was removed by: commit 26bd6476c61f08fc8c01895caa02b938d6a37221 Author: Guillaume Chatelet <gchatelet@google.com> Date: Fri Jan 13 15:05:24 2023 +0000 Differential Revision: https://reviews.llvm.org/D152670
2023-05-19[1/11][IR] Permit load/store/alloca for struct of the same scalable vector typeeopXD1-15/+35
This patch-set aims to simplify the existing RVV segment load/store intrinsics to use a type that represents a tuple of vectors instead. To achieve this, first we need to relax the current limitation for an aggregate type to be a target of load/store/alloca when the aggregate type contains homogeneous scalable vector types. Then to adjust the prolog of an LLVM function during lowering to clang. Finally we re-define the RVV segment load/store intrinsics to use the tuple types. The pull request under the RVV intrinsic specification is riscv-non-isa/rvv-intrinsic-doc#198 --- This is the 1st patch of the patch-set. This patch is originated from D98169. This patch allows aggregate type (StructType) that contains homogeneous scalable vector types to be a target of load/store/alloca. The RFC of this patch was posted in LLVM Discourse. https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 The main changes in this patch are: Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to accommodate an expression of scalable size. Allow `StructType:isSized` to also return true for homogeneous scalable vector types. Let `Type::isScalableTy` return true when `Type` is `StructType` and contains scalable vectors Extra description is added in the LLVM Language Reference Manual on the relaxation of this patch. Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com> Reviewed By: craig.topper, nikic Differential Revision: https://reviews.llvm.org/D146872
2023-03-28[llvm] Use pointer index type for more GEP offsets (pre-codegen)Krzysztof Drewniak1-0/+5
Many uses of getIntPtrType() were using that type to calculate the neened type for GEP offset arguments. However, some time ago, DataLayout was extended to support pointers where the size of the pointer is not equal to the size of the values used to index it. Much code was already migrated to, for example, use getIndexSizeInBits instead of getPtrSizeInBits, but some rewrites still used getIntPtrType() to get the type for GEP offsets. This commit changes uses of getIntPtrType() to getIndexType() where they are involved in a GEP-related calculation. In at least one case (bounds check insertion) this resolves a compiler crash that the new test added here would previously trigger. This commit does not impact - C library-related rewriting (memcpy()), which are operating under the assumption that intptr_t == size_t. While all the mechanisms for breaking this assumption now exist, doing so is outside the scope of this commit. - Code generation and below. Note that the use of getIntPtrType() in CodeGenPrepare will be changed in a future commit. - Usage of getIntPtrType() in any backend Depends on D143435 Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D143437
2023-02-23[LangRef] Correct value ranges for address space, vector, and float bit sizes.Michael Liao1-2/+2
- The current implementation checks them for 24-bit inegers but the document says 23-bit one effectively by listing the range as [1,2^23). - Minor error message correction. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D144685
2023-02-16[DataLayout] Use separate vectors to store alignment (NFC)Nikita Popov1-62/+74
Instead of storing alignment for integers, floats, vectors and structs in a single vector with a type tag, store them in separate vectors instead. This makes the alignment lookup faster, as we don't have to scan over irrelevant alignment entries.
2023-02-07[NFC][TargetParser] Remove llvm/ADT/Triple.hArchibald Elliott1-1/+1
I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.
2023-01-23[llvm] Fix warningsKazu Hirata1-1/+1
This patch fixes: llvm/lib/IR/DataLayout.cpp:942:13: warning: unused variable ‘VecTy’ [-Wunused-variable] llvm/lib/Transforms/IPO/OpenMPOpt.cpp:2899:27: warning: unused variable ‘MI’ [-Wunused-variable]
2023-01-23[IR] Avoid creation of GEPs into vectors (in one place)Jannik Silvanus1-7/+4
The method DataLayout::getGEPIndexForOffset(Type *&ElemTy, APInt &Offset) allows to generate GEP indices for a given byte-based offset. This allows to generate "natural" GEPs using the given type structure if the byte offset happens to match a nested element object. With opaque pointers and a general move towards byte-based GEPs [1], this function may be questionable in the future. This patch avoids creation of GEPs into vectors in routines that use DataLayout::getGEPIndexForOffset by not returning indices in that case. The reason is that A) GEPs into vectors have been discouraged for a long time [2], and B) that GEPs into vectors are currently broken if the element type is overaligned [1]. This is also demonstrated by a lit test where previously InstCombine replaced valid loads by poison. Note that the result of InstCombine on that test is *still* invalid, because padding bytes are assumed. Moreover, GEPs into vectors may be outright forbidden in the future [1]. [1]: https://discourse.llvm.org/t/67497 [2]: https://llvm.org/docs/GetElementPtr.html The test case is new. It will be precommitted if this patch is accepted. Differential Revision: https://reviews.llvm.org/D142146
2023-01-23[LangRef] Require i8s to be naturally alignedJannik Silvanus1-0/+3
It is widely assumed that i8 is naturally aligned (i8:8), and that hence i8s can be used to access arbitrary bytes. As discussed in https://discourse.llvm.org/t/status-of-overaligned-i8, this patch makes this assumption explicit, by documenting it in the LangRef, and enforcing it when parsing a data layout string. Historically, there have been data layouts that violate this requirement, notably the old DXIL data layout that aligns i8 to 32 bits. A previous patch (df1a74a) enabled importing modules with invalid data layouts using override callbacks. Users who wish to continue importing modules with overaligned i8s (e.g. DXIL) thus need to provide a data layout override callback that fixes the data layout, at minimum by setting natural alignment for i8. Any further adjustments to the module (e.g. adding padding bytes if necessary) need to be done after module import. In the case of DXIL, this should not be necessary, because i8 usage in DXIL is very limited and its alignment actually does not matter, see https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#primitive-types Differential Revision: https://reviews.llvm.org/D142211
2023-01-11[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()Guillaume Chatelet1-2/+2
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2023-01-11[NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize()Guillaume Chatelet1-2/+2
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2023-01-06Revert D141134 "[NFC] Only expose getXXXSize functions in TypeSize"Guillaume Chatelet1-1/+1
The patch should be discussed further. This reverts commit dd56e1c92b0e6e6be249f2d2dd40894e0417223f.
2023-01-06[NFC] Only expose getXXXSize functions in TypeSizeGuillaume Chatelet1-1/+1
Currently 'TypeSize' exposes two functions that serve the same purpose: - getFixedSize / getFixedValue - getKnownMinSize / getKnownMinValue source : https://github.com/llvm/llvm-project/blob/bf82070ea465969e9ae86a31dfcbf94c2a7b4c4c/llvm/include/llvm/Support/TypeSize.h#L337-L338 This patch offers to remove one of the two and stick to a single function in the code base. Differential Revision: https://reviews.llvm.org/D141134
2022-12-20[IR] Add a target extension type to LLVM.Joshua Cranmer1-0/+4
Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizations. This patch doesn't add any uses of these types by an existing backend, it only provides basic infrastructure such that these types would work correctly. Reviewed By: nikic, barannikov88 Differential Revision: https://reviews.llvm.org/D135202
2022-12-05[IR] llvm::Optional => std::optionalFangrui Song1-3/+3
Many llvm/IR/* files have been migrated by other contributors. This migrates most remaining files.
2022-12-02[IR] Use std::nullopt instead of None (NFC)Kazu Hirata1-3/+3
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-02-02Cleanup header dependencies in LLVMCoreserge-sans-paille1-1/+2
Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948 200k lines less to process is no that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652