aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/AST/DeclBase.cpp
diff options
context:
space:
mode:
authorChuanqi Xu <yedeng.yd@linux.alibaba.com>2024-06-06 11:51:05 +0800
committerChuanqi Xu <yedeng.yd@linux.alibaba.com>2024-06-07 20:21:55 +0800
commit5a0181f568e56e37df80d0f74eca4775776fa8cd (patch)
tree4776c024543b1730e04fa1a24cb2bea364365904 /clang/lib/AST/DeclBase.cpp
parentb8cc85b318c0dd89e4dd69e3691ffcad5e401885 (diff)
downloadllvm-5a0181f568e56e37df80d0f74eca4775776fa8cd.zip
llvm-5a0181f568e56e37df80d0f74eca4775776fa8cd.tar.gz
llvm-5a0181f568e56e37df80d0f74eca4775776fa8cd.tar.bz2
[serialization] no transitive decl change (#92083)
Following of https://github.com/llvm/llvm-project/pull/86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to https://github.com/llvm/llvm-project/pull/86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As https://github.com/llvm/llvm-project/pull/86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Diffstat (limited to 'clang/lib/AST/DeclBase.cpp')
-rw-r--r--clang/lib/AST/DeclBase.cpp40
1 files changed, 33 insertions, 7 deletions
diff --git a/clang/lib/AST/DeclBase.cpp b/clang/lib/AST/DeclBase.cpp
index 1e9c879..7f1ed9c 100644
--- a/clang/lib/AST/DeclBase.cpp
+++ b/clang/lib/AST/DeclBase.cpp
@@ -74,18 +74,17 @@ void *Decl::operator new(std::size_t Size, const ASTContext &Context,
GlobalDeclID ID, std::size_t Extra) {
// Allocate an extra 8 bytes worth of storage, which ensures that the
// resulting pointer will still be 8-byte aligned.
- static_assert(sizeof(unsigned) * 2 >= alignof(Decl),
- "Decl won't be misaligned");
+ static_assert(sizeof(uint64_t) >= alignof(Decl), "Decl won't be misaligned");
void *Start = Context.Allocate(Size + Extra + 8);
void *Result = (char*)Start + 8;
- unsigned *PrefixPtr = (unsigned *)Result - 2;
+ uint64_t *PrefixPtr = (uint64_t *)Result - 1;
- // Zero out the first 4 bytes; this is used to store the owning module ID.
- PrefixPtr[0] = 0;
+ *PrefixPtr = ID.get();
- // Store the global declaration ID in the second 4 bytes.
- PrefixPtr[1] = ID.get();
+ // We leave the upper 16 bits to store the module IDs. 48 bits should be
+ // sufficient to store a declaration ID.
+ assert(*PrefixPtr < llvm::maskTrailingOnes<uint64_t>(48));
return Result;
}
@@ -111,6 +110,29 @@ void *Decl::operator new(std::size_t Size, const ASTContext &Ctx,
return ::operator new(Size + Extra, Ctx);
}
+GlobalDeclID Decl::getGlobalID() const {
+ if (!isFromASTFile())
+ return GlobalDeclID();
+ // See the comments in `Decl::operator new` for details.
+ uint64_t ID = *((const uint64_t *)this - 1);
+ return GlobalDeclID(ID & llvm::maskTrailingOnes<uint64_t>(48));
+}
+
+unsigned Decl::getOwningModuleID() const {
+ if (!isFromASTFile())
+ return 0;
+
+ uint64_t ID = *((const uint64_t *)this - 1);
+ return ID >> 48;
+}
+
+void Decl::setOwningModuleID(unsigned ID) {
+ assert(isFromASTFile() && "Only works on a deserialized declaration");
+ uint64_t *IDAddress = (uint64_t *)this - 1;
+ *IDAddress &= llvm::maskTrailingOnes<uint64_t>(48);
+ *IDAddress |= (uint64_t)ID << 48;
+}
+
Module *Decl::getOwningModuleSlow() const {
assert(isFromASTFile() && "Not from AST file?");
return getASTContext().getExternalSource()->getModule(getOwningModuleID());
@@ -2163,3 +2185,7 @@ DependentDiagnostic *DependentDiagnostic::Create(ASTContext &C,
return DD;
}
+
+unsigned DeclIDBase::getLocalDeclIndex() const {
+ return ID & llvm::maskTrailingOnes<DeclID>(32);
+}