path: root/clang/lib/Serialization/ASTWriterDecl.cpp
2024-11-06  Krystian Stasiowski  (1 file, -7/+10)  Revert "Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585, #111173)" (#111852)" (#115159)
This reverts commit 2bb3d3a3f32ffaef3d9b6a27db7f1941f0cb1136.
2024-11-06  cor3ntin  (1 file, -1/+1)  [Clang] Correctly initialize placeholder fields from their initializers (#114196)
We made the incorrect assumption that names of fields are unique when creating their default initializers. We fix that by keeping track of the instantiation pattern for field decls that are placeholder vars, like we already do for unnamed fields. Fixes #114069
2024-10-24  Jay Foad  (1 file, -1/+1)  [clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)
Follow up to #109133.
2024-10-11  Krystian Stasiowski  (1 file, -10/+7)  Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585, #111173)" (#111852)
This patch reapplies #111173, fixing a bug when instantiating dependent expressions that name a member template that is later explicitly specialized for a class specialization that is implicitly instantiated. The bug is addressed by adding the `hasMemberSpecialization` function, which returns `true` if _any_ redeclaration is a member specialization. This is then used when determining the instantiation pattern for a specialization of a template, and when collecting template arguments for a specialization of a template.
2024-10-09  Krystian Stasiowski  (1 file, -7/+10)  Revert "Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585)" (#111173)" (#111766)
This reverts commit 4da8ac34f76e707ab94380b94f616457cfd2cb83.
2024-10-09  Krystian Stasiowski  (1 file, -2/+1)  Revert "[clang] Track function template instantiation from definition (#110387)" (#111764)
This reverts commit 4336f00f2156970cc0af2816331387a0a4039317.
2024-10-09  Matheus Izvekov  (1 file, -1/+2)  [clang] Track function template instantiation from definition (#110387)
This fixes instantiation of the definition for friend function templates when the declaration found and the one containing the definition have different template contexts. In these cases, the function declaration corresponding to the definition is not available; it may not even be instantiated at all. So this patch adds a bit which tracks which function template declaration was instantiated from the member template. It's used to find which primary template serves as a context for the purpose of obtaining the template arguments needed to instantiate the definition. Fixes #55509
2024-10-08  Krystian Stasiowski  (1 file, -10/+7)  Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585)" (#111173)
Reapplies #106585, fixing an issue where non-dependent names of member templates appearing prior to that member template being explicitly specialized for an implicitly instantiated class template specialization would incorrectly use the definition of the explicitly specialized member template.
2024-09-29  Chuanqi Xu  (1 file, -0/+12)  [C++20] [Modules] Emit implicit Deduction Guide for implicit class specialization
Fixed a crash for the attached test case caused by failing to emit the deduction guide. The reason is that the deduction guide is attached to the export-decl in the imported module, so we won't emit it by traversing the AST of the current TU.
2024-09-27  Dmitry Polukhin  (1 file, -1/+2)  [C++20][Modules] Fix non-determinism in serialized AST (#110131)
Summary: https://github.com/llvm/llvm-project/pull/109167 serializes FunctionToLambdasMap in the order of pointers in a DenseMap, which gives a different order with different memory layouts. Fix this issue by using LocalDeclID instead of pointers. Test Plan: check-clang
2024-09-25  Dmitry Polukhin  (1 file, -0/+5)  [C++20][Modules] Fix crash when function and lambda inside loaded from
different modules (#109167) Summary: Because AST loading code is lazy and happens in unpredictable order, it is possible that a function and lambda inside the function can be loaded from different modules. As a result, the captured DeclRefExpr won’t match the corresponding VarDecl inside the function. This situation is reflected in the AST as follows: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff modifies the AST deserialization process to load lambdas within the canonical function declaration sooner, immediately following the function, ensuring that they are loaded from the same module. Re-land https://github.com/llvm/llvm-project/pull/104512 Added test case that caused crash due to multiple enclosed lambdas deserialization. Test Plan: check-clang
2024-09-21  Martin Storsjö  (1 file, -7/+10)  Revert "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585)"
This reverts commit cdd71d61664b63ae57bdba9ee0d891f78ef79c07 (and 30adb43c897a45c18d7dd163fb4ff40c915fc488). This change broke compiling Qt, see https://github.com/llvm/llvm-project/pull/106585#issuecomment-2365309463 for details.
2024-09-20  Krystian Stasiowski  (1 file, -10/+7)  [Clang][Sema] Refactor collection of multi-level template argument lists (#106585)
Currently, clang rejects the following explicit specialization of `f` due to the constraints not being equivalent:
```
template<typename T>
struct A {
  template<bool B>
  void f() requires B;
};

template<>
template<bool B>
void A<int>::f() requires B { }
```
This happens because, in most cases, we do not set the flag indicating whether a `RedeclarableTemplate` is an explicit specialization of a member of an implicitly instantiated class template specialization until _after_ we compare constraints for equivalence. This patch addresses the issue (and a number of other issues) by:
- storing the flag indicating whether a declaration is a member specialization on a per declaration basis, and
- significantly refactoring `Sema::getTemplateInstantiationArgs` so we collect the right set of template arguments in all cases.
Many of our declaration matching & constraint evaluation woes can be traced back to bugs in `Sema::getTemplateInstantiationArgs`. This change/refactor should fix a lot of them. It also paves the way for fixing #101330 and #105462 per my suggestion in #102267 (which I have implemented on top of this patch but will merge in a subsequent PR).
2024-09-11  Pranav Kant  (1 file, -42/+0)  Revert "[RFC][C++20][Modules] Fix crash when function and lambda inside loaded from different modules (#104512)" (#108311)
This reverts commit d778689fdc812033e7142ed87e4ee13c4997b3f9.
2024-09-10  Dmitry Polukhin  (1 file, -0/+42)  [RFC][C++20][Modules] Fix crash when function and lambda inside loaded from
different modules (#104512) Summary: Because AST loading code is lazy and happens in unpredictable order it could happen that function and lambda inside function can be loaded from different modules. In this case, captured DeclRefExpr won’t match the corresponding VarDecl inside function. In AST it looks like this: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff changes AST deserialization to load lambdas inside canonical function declaration earlier right after the function to make sure that their canonical decl is loaded from the same module. Test Plan: check-clang
2024-08-15  Sirraide  (1 file, -0/+1)  [Clang] Implement C++26’s P2893R3 ‘Variadic friends’ (#101448)
Implement P2893R3 ‘Variadic friends’ for C++26. This closes #98587. Co-authored-by: Younan Zhang <zyn7109@gmail.com>
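For readers unfamiliar with the feature, here is a minimal illustration of the syntax P2893R3 adds; the `Passkey` class is my own example, not taken from the patch's tests. A friend declaration may now expand a template parameter pack:
```
// Requires -std=c++2c with a compiler implementing P2893R3.
template <typename... Ts>
class Passkey {
  friend Ts...;          // C++26: befriend every type in the pack
  Passkey() = default;   // only friends can construct the key
};

class DatabaseWriter;
class DatabaseReader;

// Only DatabaseWriter and DatabaseReader can mint this passkey.
using DbKey = Passkey<DatabaseWriter, DatabaseReader>;
```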
2024-08-08  Chuanqi Xu  (1 file, -0/+6)  Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287)
Reland https://github.com/llvm/llvm-project/pull/75912
The differences between this PR and https://github.com/llvm/llvm-project/pull/75912 are:
- Fixed a regression in `Decl::isInAnotherModuleUnit()` in DeclBase.cpp pointed out by @mizvekov and added the corresponding test.
- Fixed the regression on Windows (https://github.com/llvm/llvm-project/issues/97447). The changes are in `CodeGenModule::getVTableLinkage` from `clang/lib/CodeGen/CGVTables.cpp`. According to the feedback from MSVC devs, the linkage of vtables won't be affected by modules, so I simply skipped the case for MSVC.
Given this is more or less fundamental to the use of modules, I hope we can backport this to 19.x.
2024-07-18  Chuanqi Xu  (1 file, -13/+4)  [C++20] [Modules] Write ODRHash for decls in GMF
Previously, we skipped calculating the ODR hash for decls in the GMF when writing them to .pcm files as an optimization. But it is not actually true that this is a pure optimization; whether it is beneficial depends on the use case. For example, if we're writing a function `a` in a module and there are 10 consumers of `a` in other TUs, the other TUs pay the cost of calculating the ODR hash for `a` ten times, and the optimization doesn't pay off. However, if none of the consumers of the module touch `a`, then we save the cost of calculating its ODR hash once. The assumption behind the optimization was that, generally, the consumers of a module only use a small part of the imported module. That is the same reason we try to load declarations, types and identifiers lazily, so it looked reasonable to do the same for calculating ODR hashes. It worked fine for a long time, until we started to look into supporting modules in clangd. Then we received multiple issue reports complaining that we calculate the ODR hash in the wrong place. To work around these reports, I decided to always write the ODR hash for decls in the GMF. In my local tests, I observed less than 1% compile-time regression after doing this, so it should be fine.
2024-07-15  Krystian Stasiowski  (1 file, -1/+1)  [Clang][AST] Move NamespaceDecl bits to DeclContext (#98567)
Currently, `NamespaceDecl` has a member `AnonOrFirstNamespaceAndFlags` which stores a few pieces of data:
- a bit indicating whether the namespace was declared `inline`,
- a bit indicating whether the namespace was declared as a _nested-namespace-definition_, and
- a pointer to a `NamespaceDecl` that stores either a pointer to the first declaration of that namespace (if the declaration is not the first declaration), or a pointer to the unnamed namespace that inhabits the namespace otherwise.
`Redeclarable` already stores a pointer to the first declaration of an entity, so it's unnecessary to store this in `NamespaceDecl`. `DeclContext` has 8 bytes in which various bitfields can be stored for a declaration, so it's not necessary to store these in `NamespaceDecl` either. We only need to store a pointer to the unnamed namespace that inhabits the first declaration of a namespace. This patch moves the two bits currently stored in `NamespaceDecl` to `DeclContext`, and only stores a pointer to the unnamed namespace that inhabits a namespace in the first declaration of that namespace. Since `getOriginalNamespace` always returns the same `NamespaceDecl` as `getFirstDecl`, this function is removed to avoid confusion.
2024-07-10  Chuanqi Xu  (1 file, -6/+0)  Revert "[C++20] [Modules] [Itanium ABI] Generate the vtable in the module unit of dynamic classes (#75912)"
This reverts commit 18f3bcbb13ca83d33223b00761d8cddf463e9ffb, 15bb02650e26875c48889053d6a9697444583721 and 99873b35da7ecb905143c8a6b8deca4d4416f1a9. See the post-commit message in https://github.com/llvm/llvm-project/pull/75912 for the reasons.
2024-06-19  Chuanqi Xu  (1 file, -2/+2)  [NFC] [Serialization] Unify how LocalDeclID can be created
Now we can create a LocalDeclID directly from an integer without verifying it. That may make refactoring harder if we want to change the way we serialize DeclIDs (see https://github.com/llvm/llvm-project/pull/95897), and it is hard to debug if someone constructs a LocalDeclID with an incorrect value. So in this patch I tried to unify the way we construct a LocalDeclID in ASTReader, where we build the LocalDeclID from the serialized data. Also, the new interface lets us verify the constructed LocalDeclID sooner.
2024-06-17  Chuanqi Xu  (1 file, -0/+6)  [C++20] [Modules] [Itanium ABI] Generate the vtable in the module unit of dynamic classes (#75912)
Close https://github.com/llvm/llvm-project/issues/70585 and reflect https://github.com/itanium-cxx-abi/cxx-abi/issues/170. The significant change of the patch is: for dynamic classes attached to module units, we generate the vtable in the attached module unit directly, and the key function for such classes is meaningless.
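As a rough illustration of the rule described above (this snippet is mine, not one of the patch's test cases), a dynamic class attached to a named module unit now gets its vtable emitted with that module unit, so the usual key-function heuristic no longer matters for it:
```
// m.cppm -- a named module unit.
export module m;

// 'Base' is a dynamic class attached to module 'm'. Per the Itanium ABI
// change referenced above, its vtable is emitted in m's object file
// directly instead of being keyed to a particular virtual member function.
export class Base {
public:
  virtual void f();          // defined in an implementation unit of 'm'
  virtual ~Base() = default;
};
```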
2024-06-03  Chuanqi Xu  (1 file, -1/+1)  [C++20] [Modules] [Reduced BMI] Handling Deduction Guide in reduced BMI carefully
Close https://github.com/llvm/llvm-project/issues/93859
The direct pattern of the issue is that, in a reduced BMI, we're going to write a class but we didn't write the deduction guide. Although we handled the deduction guide, we tried to record the found deduction guide from `noload_lookup` directly. That is slightly problematic if the found deduction guide comes from an AST file, e.g.,
```
module;
export module m;
import xxx; // Also contains the class and the deduction guide
...
```
Then when we write the class in the current file, we try to record the deduction guide, but `noload_lookup` returns the deduction guide from the AST file, so we don't record the local deduction guide, and a mismatch happens. To mitigate the problem, we now record the canonical declaration for the deduction guide.
2024-05-22  Matheus Izvekov  (1 file, -1/+1)  [clang] NFCI: use TemplateArgumentLoc for NTTP DefaultArgument (#92852)
This is an enabler for https://github.com/llvm/llvm-project/pull/92855. This allows an NTTP default argument to be set as an arbitrary TemplateArgument, not just an expression. This allows template parameter packs to have default arguments in the AST, even though the language proper doesn't support the syntax for it. This allows NTTP default arguments to be other kinds of arguments, like packs, integral constants, and such.
2024-05-21  Matheus Izvekov  (1 file, -1/+1)  [clang] NFCI: use TemplateArgumentLoc for type-param DefaultArgument (#92854)
This is an enabler for a future patch. This allows a type-parameter default argument to be set as an arbitrary TemplateArgument, not just a type. This allows template parameter packs to have default arguments in the AST, even though the language proper doesn't support the syntax for it. This will be used in a later patch which synthesizes template parameter lists with arbitrary default arguments taken from template specializations. There are a few places where we used SubstType because we only had a type; now we use SubstTemplateArgument. SubstTemplateArgument was missing arguments for setting the instantiation location and entity names. Adding those is needed so we don't regress in diagnostics.
2024-05-14  Krystian Stasiowski  (1 file, -10/+26)  Reapply "[Clang] Unify interface for accessing template arguments as written for class/variable template specializations (#81642)" (#91393)
Reapplies #81642, fixing the crash which occurs when running the lldb test suite.
2024-05-07  Adrian Prantl  (1 file, -26/+10)  Revert "[Clang] Unify interface for accessing template arguments as written for class/variable template specializations (#81642)"
This reverts commit 7115ed0fff027b65fa76fdfae215ed1382ed1473. This commit broke several LLDB tests. https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/3480/
2024-05-07  Krystian Stasiowski  (1 file, -10/+26)  [Clang] Unify interface for accessing template arguments as written for
class/variable template specializations (#81642) Our current method of storing the template arguments as written for `(Class/Var)Template(Partial)SpecializationDecl` suffers from a number of flaws: - We use `TypeSourceInfo` to store `TemplateArgumentLocs` for class template/variable template partial/explicit specializations. For variable template specializations, this is a rather unintuitive hack (as we store a non-type specialization as a type). Moreover, we don't ever *need* the type as written -- in almost all cases, we only want the template arguments (e.g. in tooling use-cases). - The template arguments as written are stored in a number of redundant data members. For example, `(Class/Var)TemplatePartialSpecialization` have their own `ArgsAsWritten` member that stores an `ASTTemplateArgumentListInfo` (the template arguments). `VarTemplateSpecializationDecl` has yet _another_ redundant member "`TemplateArgsInfo`" that also stores an `ASTTemplateArgumentListInfo`. This patch eliminates all `(Class/Var)Template(Partial)SpecializationDecl` members which store the template arguments as written, and turns the `ExplicitInfo` member into a `llvm::PointerUnion<const ASTTemplateArgumentListInfo*, ExplicitInstantiationInfo*>` (to avoid unnecessary allocations when the declaration isn't an explicit instantiation). The template arguments as written are now accessed via `getTemplateArgsWritten` in all cases. The "most breaking" change is to AST Matchers, insofar that `hasTypeLoc` will no longer match class template specializations (since they no longer store the type as written).
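A small tooling-style sketch of how the unified accessor might be used after this change; the helper name is made up, and the return and iteration types are assumed from the description above rather than quoted from the patch:
```
#include "clang/AST/DeclTemplate.h"
#include "llvm/Support/raw_ostream.h"

using namespace clang;

// Print the template arguments exactly as the user wrote them, if any were
// written (implicit instantiations carry no written arguments).
void printWrittenArgs(const ClassTemplateSpecializationDecl *D) {
  if (const ASTTemplateArgumentListInfo *Written = D->getTemplateArgsWritten())
    for (const TemplateArgumentLoc &ArgLoc : Written->arguments())
      ArgLoc.getArgument().dump(llvm::outs());
}
```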
2024-05-06  Chuanqi Xu  (1 file, -3/+5)  Reland "[Modules] No transitive source location change (#86912)"
This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. 
Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. 
We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
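A standalone sketch of the offset arithmetic described in the commit message above; the struct and function names are invented for illustration and are not Clang's real SourceLocation or ASTReader types:
```
#include <cstdint>

// Old scheme: module 'b' stored an absolute offset (e.g. 135) plus the base
// offset it had allocated for module 'a' (e.g. 90). An importer of 'b' that
// allocated its own base for 'a' (e.g. 200) translates with:
//   Recorded Offset + Base Offset of its module file - Recorded Base Offset.
uint32_t translateOld(uint32_t RecordedOffset, uint32_t RecordedBase,
                      uint32_t MyBase) {
  return RecordedOffset + (MyBase - RecordedBase); // 135 + (200 - 90) = 245
}

// New scheme: store <module file index, local offset> instead, e.g. <1, 45>.
// The reader only needs the base offset it allocated for that module file, so
// the BMI of 'b' no longer depends on how many locations module 'a' had.
struct EncodedLoc {
  uint32_t ModuleFileIndex; // 0 = current module file, 1 = first import, ...
  uint32_t LocalOffset;     // offset within that module file's own space
};

uint32_t translateNew(EncodedLoc Loc, const uint32_t *BaseOffsets) {
  return BaseOffsets[Loc.ModuleFileIndex] + Loc.LocalOffset; // 200 + 45 = 245
}
```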
2024-04-30  Chuanqi Xu  (1 file, -5/+3)  Revert "[Modules] No transitive source location change (#86912)"
This reverts commit 6c3110464bac3600685af9650269b0b2b8669d34. Required by the post commit comments: https://github.com/llvm/llvm-project/pull/86912
2024-04-30  Chuanqi Xu  (1 file, -3/+5)  [Modules] No transitive source location change (#86912)
This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. # Motivation Example This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. # Internal perspective of status quo To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. 
In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. # The high level design of current patch From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. # Some low level details I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save some space in case the index of the module file is 0. 
``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. # Correctness The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. # Future Plans I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
2024-04-25  Chuanqi Xu  (1 file, -10/+12)  [NFC] [Serialization] Avoid using DeclID directly as much as possible
This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.
2024-04-25  Chuanqi Xu  (1 file, -12/+10)  Revert "[NFC] [Serialization] Avoid using DeclID directly as much as possible"
This reverts commit 42070a5c092ed420bf92ebf38229c594885e94c7. I forgot to touch lldb.
2024-04-25  Chuanqi Xu  (1 file, -10/+12)  [NFC] [Serialization] Avoid using DeclID directly as much as possible
This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.
2024-04-25  Chuanqi Xu  (1 file, -2/+2)  [NFC] Move DeclID from serialization/ASTBitCodes.h to AST/DeclID.h (#89873)
Previously, DeclID was defined in serialization/ASTBitCodes.h under the clang::serialization namespace. However, DeclID is not used purely in the serialization part; it is already widely used in the AST and all around the clang project via classes like `LazyPtrDecl` or calls to `ExternalASTSource::GetExternalDecl()`. All such uses go through the raw underlying type of `DeclID`, `uint32_t`, which is not great. This patch moves the DeclID class family to a new header, `AST/DeclID.h`, so that the whole project can use the wrapped classes `DeclID`, `GlobalDeclID` and `LocalDeclID` instead of the raw underlying type. This improves readability and type safety.
2024-04-16  Volodymyr Sapsai  (1 file, -1/+0)  [unused-includes][Serialization] Remove unused includes. NFC. (#88790)
2024-04-16  Chuanqi Xu  (1 file, -4/+4)  [StmtProfile] Don't profile the body of lambda expressions
Close https://github.com/llvm/llvm-project/issues/87609
We tried to profile the body of lambda expressions in https://reviews.llvm.org/D153957. But as the original comments show, it is indeed dangerous. After we tried to skip calculating ODR hash values recently, we have fallen into this trap twice. So in this patch I choose not to profile the body of the lambda expression. The signature of the lambda is still profiled.
2024-04-14  Sirraide  (1 file, -2/+9)  [Clang] [C++26] Implement P2573R2: `= delete("should have a reason");` (#86526)
This implements support for the `= delete("message")` syntax that was only just added to C++26 ([P2573R2](https://isocpp.org/files/papers/P2573R2.html#proposal-scope)).
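A minimal example of the new syntax (my own illustration, not the patch's test case); with `-std=c++2c` the reason string is echoed in the diagnostic when the deleted function is used:
```
struct Widget {
  Widget() = default;
  Widget(Widget &&) = default;
  // C++26: the deleted definition carries a reason the compiler prints.
  Widget(const Widget &) = delete("Widget is move-only; use std::move");
};

int main() {
  Widget a;
  Widget b = static_cast<Widget &&>(a); // moving is fine
  // Widget c = b; // error: call to deleted constructor; note shows the reason
}
```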
2024-04-12  Chuanqi Xu  (1 file, -2/+25)  [C++20] [Modules] [Reduced BMI] Remove unreachable decls in GMF in reduced BMI (#88359)
Following https://github.com/llvm/llvm-project/pull/76930. This follows the idea of "only write what we write", which I think is the most natural and efficient way to implement this optimization. We start writing the BMI from the first declaration in the module purview instead of the global module fragment, so that everything in the GMF that is untouched won't be written into the BMI. The exception is, as I said in https://github.com/llvm/llvm-project/pull/76930, that when we write a declaration we need to write its decl context, and when we write the decl context, we need to write everything in it. So when we see `std::vector`, we basically need to write everything under namespace std, which violates our intention. To fix this, this patch delays the writing of namespaces in the GMF. From my local measurement, the size of the BMI decreases from 112M to 90M for a local modules build, which I think is significant. This feature is covered by the experimental reduced BMI, so it won't affect any existing users. So I'd like to land this when the CI gets green. Documentation will be added separately.
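To make the "only write what we write" idea concrete, here is an illustrative module unit (mine, not from the patch); under the reduced BMI the untouched parts of the global module fragment are simply never reached by the writer:
```
// M.cppm -- illustrative only.
module;              // global module fragment
#include <vector>    // namespace std is the decl context of what we use below
#include <string>    // nothing from <string> is referenced in the purview
export module M;

// Writing starts here, at the first declaration of the module purview.
// Emitting 'v' needs std::vector and its decl context (namespace std); the
// patch delays writing the GMF namespace so this does not drag every other
// declaration under std into the BMI. The unused <string> declarations are
// never written at all.
export std::vector<int> v;
```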
2024-04-11  Krystian Stasiowski  (1 file, -0/+1)  [Clang][AST] Track whether template template parameters used the 'typename' keyword (#88139)
This patch adds a `Typename` bit-field to `TemplateTemplateParmDecl` which stores whether the template template parameter was declared with the `typename` keyword.
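For context, both spellings below declare the same kind of template template parameter; the new `Typename` bit simply records which keyword the user wrote (example mine):
```
// 'class' has always been allowed here; 'typename' is allowed since C++17.
template <template <typename> class Container>     // Typename bit: false
struct UsesClassKeyword {};

template <template <typename> typename Container>  // Typename bit: true
struct UsesTypenameKeyword {};
```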
2024-03-14  Chuanqi Xu  (1 file, -1/+8)  [C++20] [Modules] [Reduced BMI] Generate the function body from implicitly instantiated classes and constant variables
After this patch, we will generate the function body from implicitly instantiated classes. This is important for consumers that use the same template arguments; otherwise the consumers won't see the function body, since consumers won't instantiate the templates again if they find an existing instantiation. Also, we will generate the variable definition if the variable is non-inline but known to be constant. Such variables may not affect the ABI, but they may be involved in compile-time constant computation in the consumer's code, so we have to generate such definitions.
2024-03-11  Chuanqi Xu  (1 file, -4/+4)  [C++20] [Modules] Avoid computing ODR hash for functions when comparing constraint expressions
Previously we disabled computing the ODR hash for declarations from the global module fragment. However, we missed the case where the functions live in constraint requirements (see the attached test files for an example), and the mismatch causes a potential crash. We set the function body as lazy after we deserialize it and only take its body when needed; however, taking the body during deserialization is not allowed. So it is potentially problematic to set the body as lazy first and then compute the hash value of the function, which requires deserializing its body, and we would hit a crash here. This patch tries to solve the issue by not taking the body of functions from the GMF. Note that we can't skip comparing the constraint expressions from the GMF directly, since they are a key part of function selection; this may be the reason why we can't return 0 directly for `FunctionDecl::getODRHash()` in the GMF.
2024-03-08  Chuanqi Xu  (1 file, -7/+38)  [C++20] [Modules] Introduce reduced BMI (#75894)
Close https://github.com/llvm/llvm-project/issues/71034
See https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755
This patch introduces the reduced BMI, which doesn't contain the definitions of functions and variables if their definitions won't contribute to the ABI. Testing is a big part of the patch; we want to make sure the reduced BMI behaves the same as the existing and relatively stable fat BMI. This is pretty helpful for further reduction. The user interface part is left to following patches to ease the reviewing.
2024-02-20  Chuanqi Xu  (1 file, -5/+10)  [Serialization] Record whether the ODR is skipped (#82302)
Close https://github.com/llvm/llvm-project/issues/80570.
In https://github.com/llvm/llvm-project/commit/a0b6747804e46665ecfd00295b60432bfe1775b6, we skipped ODR checks for decls in the GMF. Then it should be natural to skip storing the ODR values in the BMI. Generally this should be fine as long as the writer and the reader stay consistent. However, the use of the preamble in clangd shows the tricky part. For,
```
// test.cpp
module;

// any one of these is enough to crash clangd
// #include <iostream>
// #include <string_view>
// #include <cmath>
// #include <system_error>
// #include <new>
// #include <bit>
// probably many more

// only ok with libc++, not the system provided libstdc++ 13.2.1
// these are ok
export module test;
```
clangd will store the headers as a preamble to speed up parsing, and the preamble reuses the serialization techniques. (Generally we'd call the preamble a PCH; however, that is not strictly true. I've tested that a PCH wouldn't be problematic.) However, the tricky part is that the preamble is not a module; it literally serializes and deserializes things. So before clangd parses the above test module, it will serialize the headers into the preamble. Note that there is no concept like the GMF at that point, so the ODR bits are stored. However, when clangd actually parses the file, the decls from the preamble are treated as being literally in the GMF, so the ODR bits are skipped, and a mismatch happens. To solve the problem, this patch adds another bit for decls to record whether or not the ODR bits are skipped.
2024-02-01  Chuanqi Xu  (1 file, -4/+4)  [C++20] [Modules] Introduce -fskip-odr-check-in-gmf (#79959)
Close https://github.com/llvm/llvm-project/issues/79240
Citing the comment from @mizvekov in https://github.com/llvm/llvm-project/issues/79240:
> There are two kinds of bugs / issues relevant here:
> Clang bugs that this change hides. Here we can add a Frontend flag that disables the GMF ODR check, just so we can keep tracking, testing and fixing these issues. The Driver would just always pass that flag. We could add that flag in this current issue.
> Bugs in user code: I don't think it's worth adding a corresponding Driver flag for controlling the above Frontend flag, since we intend its behavior to become default as we fix the problems, and users interested in testing the more strict behavior can just use the Frontend flag directly.
This patch follows the suggestion:
- Introduce the CC1 flag `-fskip-odr-check-in-gmf`, which is off by default, so that every existing test is still tested with ODR violation checking.
- Pass `-fskip-odr-check-in-gmf` in the driver to keep the behavior we intended.
- Edit the documentation to tell users who are still interested in stricter checks that they can use `-Xclang -fno-skip-odr-check-in-gmf` to get the existing behavior.
2024-01-29  Chuanqi Xu  (1 file, -4/+9)  [C++20] [Modules] Don't perform ODR checks in GMF
Close https://github.com/llvm/llvm-project/issues/79240. See the linked issue for details. Given the frequency of issue reporting about false positive ODR checks (I received private issue reports too), I'd like to backport this to 18.x too.
2024-01-18  cor3ntin  (1 file, -1/+1)  [Clang][NFC] Rename CXXMethodDecl::isPure -> isVirtualPure (#78463)
To avoid any possible confusion with the notion of pure function and the gnu::pure attribute.
2023-12-28  Chuanqi Xu  (1 file, -3/+8)  [Serialization] Don't pack bits for the function scope index of ParmVarDecl
Close https://github.com/llvm/llvm-project/issues/76443
Previously we assumed the function scope index of a ParmVarDecl won't exceed 8 bits. But this is a misreading: see the implementation of `ParmVarDecl::getParameterIndex()`, which may exceed the size of the normal bitfield. So it is better not to pack these bits.
2023-12-21  Chuanqi Xu  (1 file, -112/+149)  [NFC] [Serialization] Improve AST serialization by reordering packed bits and extracting big bits from packed bits
Previously I tried to improve the size of .pcm files by introducing packed bits, and I found we can improve it further by reordering the bits. The secret comes from the VBR format (see its formal definition in the LLVM docs): VBR is pretty efficient for small numbers. For example, if we need to pack 8 bits into a value and the stored value is 0xf0, the actual stored value will be 0b000111'110000, which takes 12 bits. However, if we change the order so the value becomes 0x0f, then we can store it as 0b001111, which takes only 6 bits. So we can improve the size by placing bits with a lower probability of being 1 in the higher positions, and by extracting big bits from the packed bits so that they can be optimized by VBR. After this patch, the size of the std module goes from 28.1MB down to 27.7MB.
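A small standalone sketch of the arithmetic in the message above; it mimics the VBR chunking rule rather than calling LLVM's real BitstreamWriter:
```
#include <cstdint>
#include <cstdio>

// Bits needed to emit 'Value' in VBR with 'Width'-bit chunks: each chunk
// carries Width-1 payload bits plus one continuation bit.
static unsigned vbrBits(uint64_t Value, unsigned Width) {
  unsigned Chunks = 1;
  for (Value >>= (Width - 1); Value != 0; Value >>= (Width - 1))
    ++Chunks;
  return Chunks * Width;
}

int main() {
  // The same 8 packed bits, two orderings: keeping the rarely-set bits in the
  // high positions keeps the stored value small and the VBR6 encoding short.
  std::printf("VBR6(0xf0) = %u bits\n", vbrBits(0xf0, 6)); // 12 bits
  std::printf("VBR6(0x0f) = %u bits\n", vbrBits(0x0f, 6)); // 6 bits
}
```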
2023-12-21  Chuanqi Xu  (1 file, -89/+292)  Recommit [NFC] [Serialization] Packing more bits and refactor AbbrevToUse
This patch tries to pack more bits into a value to reduce the size of .pcm files. Also, after we introduced BitsPackers, it may be slightly better to adjust the way we use Abbrev. After this patch, the size of the BMI for the std module reduces from 28.94MB to 28.1MB. This was reverted earlier because it broke the build of lldb; the reason was that we skipped the serialization of a source location incorrectly, and this patch now fixes that.