diff options
author | Dmitry Polukhin <34227995+dmpolukhin@users.noreply.github.com> | 2024-09-25 08:31:49 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-09-25 08:31:49 +0100 |
commit | 2ccac07bf22d17029d4437b0a727dd55c8c86d56 (patch) | |
tree | a97f405a101b1576f2d50b5c9d0f8684c323cd2b /clang/lib/Serialization/ASTWriter.cpp | |
parent | ae7b454f98ba002ba1cc4cf19fac55f629cfc52f (diff) | |
download | llvm-2ccac07bf22d17029d4437b0a727dd55c8c86d56.zip llvm-2ccac07bf22d17029d4437b0a727dd55c8c86d56.tar.gz llvm-2ccac07bf22d17029d4437b0a727dd55c8c86d56.tar.bz2 |
[C++20][Modules] Fix crash when function and lambda inside loaded from different modules (#109167)
Summary:
Because AST loading code is lazy and happens in unpredictable order, it
is possible that a function and lambda inside the function can be loaded
from different modules. As a result, the captured DeclRefExpr won’t
match the corresponding VarDecl inside the function. This situation is
reflected in the AST as follows:
```
FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline
|-also in ./folly-conv.h
`-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1>
|-DeclStmt 0x555564f7ced8 <line:34:3, col:17>
| `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit
| `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0
|-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>'
| |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0
| |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing
| `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)'
| |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition
| | |-also in ./thrift_cpp2_base.h
| | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init
| | |-DefaultConstructor defaulted_is_constexpr
| | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveConstructor exists simple trivial needs_implicit
| | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveAssignment
| | `-Destructor simple irrelevant trivial constexpr needs_implicit
| `-CompoundStmt 0x555564f7d1a8 <col:58, col:75>
| `-ReturnStmt 0x555564f7d198 <col:60, col:67>
| `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture
`-ReturnStmt 0x555564f7bc78 <line:40:3, col:11>
`-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void'
```
This diff modifies the AST deserialization process to load lambdas
within the canonical function declaration sooner, immediately following
the function, ensuring that they are loaded from the same module.
Re-land https://github.com/llvm/llvm-project/pull/104512 Added test case
that caused crash due to multiple enclosed lambdas deserialization.
Test Plan: check-clang
Diffstat (limited to 'clang/lib/Serialization/ASTWriter.cpp')
-rw-r--r-- | clang/lib/Serialization/ASTWriter.cpp | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp index 4ee14b1..f326e3c 100644 --- a/clang/lib/Serialization/ASTWriter.cpp +++ b/clang/lib/Serialization/ASTWriter.cpp @@ -903,6 +903,7 @@ void ASTWriter::WriteBlockInfoBlock() { RECORD(PENDING_IMPLICIT_INSTANTIATIONS); RECORD(UPDATE_VISIBLE); RECORD(DELAYED_NAMESPACE_LEXICAL_VISIBLE_RECORD); + RECORD(FUNCTION_DECL_TO_LAMBDAS_MAP); RECORD(DECL_UPDATE_OFFSETS); RECORD(DECL_UPDATES); RECORD(CUDA_SPECIAL_DECL_REFS); @@ -5707,6 +5708,27 @@ void ASTWriter::WriteDeclAndTypes(ASTContext &Context) { Stream.EmitRecord(DELAYED_NAMESPACE_LEXICAL_VISIBLE_RECORD, DelayedNamespaceRecord); + if (!FunctionToLambdasMap.empty()) { + // TODO: on disk hash table for function to lambda mapping might be more + // efficent becuase it allows lazy deserialization. + RecordData FunctionToLambdasMapRecord; + for (const auto &Pair : FunctionToLambdasMap) { + FunctionToLambdasMapRecord.push_back( + GetDeclRef(Pair.first).getRawValue()); + FunctionToLambdasMapRecord.push_back(Pair.second.size()); + for (const auto &Lambda : Pair.second) + FunctionToLambdasMapRecord.push_back(Lambda.getRawValue()); + } + + auto Abv = std::make_shared<llvm::BitCodeAbbrev>(); + Abv->Add(llvm::BitCodeAbbrevOp(FUNCTION_DECL_TO_LAMBDAS_MAP)); + Abv->Add(llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Array)); + Abv->Add(llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::VBR, 6)); + unsigned FunctionToLambdaMapAbbrev = Stream.EmitAbbrev(std::move(Abv)); + Stream.EmitRecord(FUNCTION_DECL_TO_LAMBDAS_MAP, FunctionToLambdasMapRecord, + FunctionToLambdaMapAbbrev); + } + const TranslationUnitDecl *TU = Context.getTranslationUnitDecl(); // Create a lexical update block containing all of the declarations in the // translation unit that do not come from other AST files. |