path: root/clang/lib/Interpreter/IncrementalParser.cpp
Age | Commit message | Author | Files | Lines
2025-08-26 | [clang-repl] Delegate CodeGen related operations for PTU to IncrementalParser (#137458) | Anutosh Bhat | 1 | -2/+28
Read the discussion at https://github.com/llvm/llvm-project/pull/136404#discussion_r2059149768 and the following comments for context.
Motivation
1) `IncrementalAction` is designed to keep Frontend state alive across inputs. As per the docstring: “IncrementalAction ensures it keeps its underlying action's objects alive as long as the IncrementalParser needs them.”
2) To align responsibilities with that contract, the parser layer (host: `IncrementalParser`, device: `IncrementalCUDADeviceParser`) should manage PTU registration and module generation, while the interpreter orchestrates at a higher level.
What this PR does
1) Moves CodeGen surfaces behind IncrementalAction: GenModule(), getCodeGen(), and the cached “first CodeGen module” now live in IncrementalAction.
2) Moves PTU ownership to the parser layer: adds IncrementalParser::RegisterPTU(…) (and device counterpart).
3) Adds device-side registration in IncrementalCUDADeviceParser.
4) Removes Interpreter::{getCodeGen, GenModule, RegisterPTU}.
2025-06-02 | [clang-repl] Fix error recovery while PTU cleanup (#127467) | Anutosh Bhat | 1 | -1/+1
Fixes #123300
What is seen
```
clang-repl> int x = 42;
clang-repl> auto capture = [&]() { return x * 2; };
In file included from <<< inputs >>>:1:
input_line_4:1:17: error: non-local lambda expression cannot have a capture-default
    1 | auto capture = [&]() { return x * 2; };
      |                 ^
zsh: segmentation fault clang-repl --Xcc="-v"
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988
    frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416
    frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612
    frame #3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180
    frame #4: 0x0000000100003498 clang-repl`main + 3560
    frame #5: 0x000000018d39a0e0 dyld`start + 2360
```
Though the error is justified, clang-repl should not exit through a segfault in such cases. The issue is that named decls with an empty name weren't being handled, which resulted in this assert: https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503
This can also be seen when the example is attempted through xeus-cpp-lite.
![image](https://github.com/user-attachments/assets/9b0e6ead-138e-4b06-9ad9-fcb9f8d5bf6e)
2025-03-05 | [Clang] Don't give up on an unsuccessful function instantiation (#126723) | Younan Zhang | 1 | -2/+3
For constexpr function templates, we immediately instantiate them upon reference. However, if the function isn't defined at the time of instantiation, even though it might be defined later, the instantiation would fail forever. This patch corrects the behavior by popping up failed instantiations through PendingInstantiations, so that we are able to instantiate them again in the future (e.g. at the end of the TU). Fixes https://github.com/llvm/llvm-project/issues/125747
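The retry behavior described above can be illustrated outside Clang with a small queue model: a failed attempt is remembered rather than dropped, and is attempted again later. This is only a sketch; `Registry`, `tryInstantiate` and `drainPending` are hypothetical names and do not correspond to Sema's actual interfaces.

```cpp
#include <deque>
#include <set>
#include <string>

// Hypothetical model, not Clang's Sema: tracks which function templates
// have a definition and which instantiations are still waiting for one.
struct Registry {
  std::set<std::string> Defined;      // names whose definition is available
  std::deque<std::string> Pending;    // failed instantiations to retry later
  std::set<std::string> Instantiated; // instantiations that succeeded

  // Attempt an instantiation; on failure, queue it instead of giving up.
  bool tryInstantiate(const std::string &Name) {
    if (Defined.count(Name) == 0) {
      Pending.push_back(Name); // retry later, e.g. at the end of the TU
      return false;
    }
    Instantiated.insert(Name);
    return true;
  }

  // Retry every queued instantiation once; failures are queued again.
  void drainPending() {
    std::deque<std::string> Work;
    Work.swap(Pending);
    for (const std::string &Name : Work)
      tryInstantiate(Name);
  }
};
```

In this model, an instantiation requested before its definition exists stays in `Pending` and succeeds on the first `drainPending()` call made after the definition is registered.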
2024-09-23 | [clang-repl] Simplify the value printing logic to enable out-of-process. (#107737) | Vassil Vassilev | 1 | -256/+16
This patch improves the design of the IncrementalParser and Interpreter classes. Now the incremental parser is only responsible for building the partial translation unit declaration and the AST, while the Interpreter fills in the lower level llvm::Module and other JIT-related infrastructure. Finally the Interpreter class now orchestrates the AST and the LLVM IR with the IncrementalParser and IncrementalExecutor classes. The design improvement allows us to rework some of the logic that extracts an interpreter value into the clang::Value object. The new implementation simplifies use-cases which are used for out-of-process execution by allowing the interpreter to be inherited from or customized with a clang::ASTConsumer. This change will enable completing the pretty printing work which is in llvm/llvm-project#84769
2024-07-05 | [BPF] Fix linking issues in static map initializers (#91310) | Nick Zavaritsky | 1 | -1/+1
When BPF object files are linked with bpftool, every symbol must be accompanied by BTF info. Ensure that extern functions referenced by global variable initializers are included in BTF. The primary motivation is "static" initialization of PROG maps:
```c
extern int elsewhere(struct xdp_md *);

struct {
  __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
  __uint(max_entries, 1);
  __type(key, int);
  __type(value, int);
  __array(values, int (struct xdp_md *));
} prog_map SEC(".maps") = { .values = { elsewhere } };
```
The BPF backend needs debug info to produce BTF. Debug info is not normally generated for external variables and functions. Previously, this was solved differently for variables (collecting variable declarations in the ExternalDeclarations vector) and functions (logic invoked during codegen in CGExpr.cpp). This patch generalises ExternalDeclarations to include both function and variable declarations. This change ensures that function references are not missed no matter the context. Previously, external functions referenced in constant expressions lacked debug info.
2024-06-05 | Fix clang reject valid C++ code after d999ce0302f06d250f6d496b56a5a5f (#94471) | Haojian Wu | 1 | -1/+2
The incremental processing mode doesn't seem to work well for C++; see https://github.com/llvm/llvm-project/pull/89804#issuecomment-2149840711 for details.
2024-06-04 | Reland "[clang-repl] Extend the C support. (#89804)" | Vassil Vassilev | 1 | -2/+11
Original commit message:" [clang-repl] Extend the C support. (#89804) The IdResolver chain is the main way for C to implement lookup rules. Every new partial translation unit caused clang to exit the top-most scope which in turn cleaned up the IdResolver chain. That was not an issue for C++ because its lookup is implemented on the level of declaration contexts. This patch keeps the IdResolver chain across partial translation units maintaining proper C-style lookup infrastructure. "
It was reverted in dfdf1c5fe45a82b9c578306f3d7627fd251d63f8 because it broke the lldb bots. This failure was subtle to debug, but the current model does not work well with ObjectiveC support in lldb. This patch cleans up the partial translation units in ObjectiveC. In the future, if we want to support ObjectiveC, we need to understand what exactly lldb is doing when recovering from errors...
2024-05-21 | Revert "[clang-repl] Extend the C support. (#89804)" | Jason Molenda | 1 | -11/+2
This reverts commit 253c28fa829cee0104c2fc59ed1a958980b5138c. This commit is causing failures on the lldb CI bots, e.g. https://ci.swift.org/view/all/job/llvm.org/view/LLDB/job/as-lldb-cmake/4307/
On my local macOS desktop build,
```
bin/lldb-dotest -p TestImportBuiltinFileID.py
Assertion failed: (D->getLexicalDeclContext() == this && "Decl inserted into wrong lexical context"), function addHiddenDecl, file DeclBase.cpp, line 1692.
6 libsystem_c.dylib 0x0000000185f0b8d0 abort + 128
7 libsystem_c.dylib 0x0000000185f0abc8 err + 0
8 liblldb.19.0.0git.dylib 0x00000001311e5800 clang::DeclContext::addHiddenDecl(clang::Decl*) + 120
9 liblldb.19.0.0git.dylib 0x00000001311e5978 clang::DeclContext::addDecl(clang::Decl*) + 32
10 liblldb.19.0.0git.dylib 0x000000012f617b48 clang::Sema::ActOnStartTopLevelStmtDecl(clang::Scope*) + 64
11 liblldb.19.0.0git.dylib 0x000000012eaf76c8 clang::Parser::ParseTopLevelStmtDecl() + 208
12 liblldb.19.0.0git.dylib 0x000000012ec051fc clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) + 3412
13 liblldb.19.0.0git.dylib 0x000000012ec03274 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) + 2020
14 liblldb.19.0.0git.dylib 0x000000012eaca860 clang::ParseAST(clang::Sema&, bool, bool) + 604
15 liblldb.19.0.0git.dylib 0x000000012e8554c0 clang::ASTFrontendAction::ExecuteAction() + 308
16 liblldb.19.0.0git.dylib 0x000000012e854c78 clang::FrontendAction::Execute() + 124
17 liblldb.19.0.0git.dylib 0x000000012e76dcfc clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 984
18 liblldb.19.0.0git.dylib 0x000000012e784500 compileModuleImpl(clang::CompilerInstance&, clang::SourceLocation, llvm::StringRef, clang::FrontendInputFile, llvm::StringRef, llvm::StringRef, llvm::function_ref<void (clang::CompilerInstance&)>, llvm::function_ref<void (clang::CompilerInstance&)>)::$_1::operator()() const + 52
```
Reverting until Vassil has a chance to look into it.
2024-05-21 | [clang-repl] Extend the C support. (#89804) | Vassil Vassilev | 1 | -2/+11
The IdResolver chain is the main way for C to implement lookup rules. Every new partial translation unit caused clang to exit the top-most scope which in turn cleaned up the IdResolver chain. That was not an issue for C++ because its lookup is implemented on the level of declaration contexts. This patch keeps the IdResolver chain across partial translation units maintaining proper C-style lookup infrastructure.
2024-04-20 | Reland "[clang-repl] Keep the first llvm::Module empty to avoid invalid memory access. (#89031)" | Vassil Vassilev | 1 | -5/+19
Original commit message: " Clang's CodeGen is designed to work with a single llvm::Module. In many cases, for convenience, various CodeGen parts have a reference to the llvm::Module (TheModule or Module) which does not change when a new module is pushed. However, the execution engine wants to take ownership of the module, which does not map well to CodeGen's design. To work around this we clone the module and pass it down. With some effort it is possible to teach CodeGen to ask the CodeGenModule for its current module, and that would have an overall positive impact on CodeGen by improving the encapsulation of various parts, but that is not resilient to future regressions. This patch takes a more conservative approach and keeps the first llvm::Module empty intentionally and does not pass it to the Jit. That is also not bulletproof because we have to guarantee that CodeGen does not write on the blueprint. However, we have inserted some assertions to catch accidental additions to that canary module. This change fixes a long-standing invalid memory access reported by valgrind when we enable the TBAA optimization passes. It also unblocks progress on https://github.com/llvm/llvm-project/pull/84758. "
This patch reverts adc4f6233df734fbe3793118ecc89d3584e0c90f and removes the check of `named_metadata_empty` of the first llvm::Module because on darwin clang inserts some harmless metadata which we can ignore.
2024-04-20 | Revert "[clang-repl] Keep the first llvm::Module empty to avoid invalid memory access. (#89031)" | Vassil Vassilev | 1 | -20/+5
This reverts commit ca090452d64e229b539a66379a3be891c4e8f3d8 and 1faf3148fdef34ce0d556ec6a4049e06cbde71b3 because it broke a darwin bot.
2024-04-20 | [Interpreter] Fix warnings | Kazu Hirata | 1 | -2/+2
This patch fixes: clang/lib/Interpreter/IncrementalParser.cpp:214:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move] clang/lib/Interpreter/IncrementalParser.cpp:232:22: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
2024-04-20 | [clang-repl] Keep the first llvm::Module empty to avoid invalid memory access. (#89031) | Vassil Vassilev | 1 | -5/+20
Clang's CodeGen is designed to work with a single llvm::Module. In many cases, for convenience, various CodeGen parts have a reference to the llvm::Module (TheModule or Module) which does not change when a new module is pushed. However, the execution engine wants to take ownership of the module, which does not map well to CodeGen's design. To work around this we clone the module and pass it down. With some effort it is possible to teach CodeGen to ask the CodeGenModule for its current module, and that would have an overall positive impact on CodeGen by improving the encapsulation of various parts, but that is not resilient to future regressions. This patch takes a more conservative approach and keeps the first llvm::Module empty intentionally and does not pass it to the Jit. That is also not bulletproof because we have to guarantee that CodeGen does not write on the blueprint. However, we have inserted some assertions to catch accidental additions to that canary module. This change fixes a long-standing invalid memory access reported by valgrind when we enable the TBAA optimization passes. It also unblocks progress on https://github.com/llvm/llvm-project/pull/84758.
2024-03-26 | [clang-repl] Fix remove invalidates iterators in CleanUpPTU() (#85378) | Stefan Gränitz | 1 | -6/+12
Using remove() on DeclContext::lookup_result list invalidates iterators. This assertion failure was one (fortunate) symptom: ``` clang/include/clang/AST/DeclBase.h:1337: reference clang::DeclListNode::iterator::operator*() const: Assertion `Ptr && "dereferencing end() iterator"' failed. ```
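The general hazard, removing elements from a container while iterating over it, can be shown with standard containers. The sketch below is a generic C++ illustration and not the change made in `CleanUpPTU()`; `DeclList` and `removeDeclsFromPTU` are hypothetical names, and plain strings stand in for declarations.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a lookup result list; not Clang's DeclContext API.
using DeclList = std::vector<std::string>;

// Unsafe pattern (left as a comment on purpose): erasing from `Decls` inside
// a loop that iterates over `Decls` invalidates the loop's iterators.

// Safe pattern: first snapshot what has to go, then mutate the container.
inline void removeDeclsFromPTU(DeclList &Decls, const std::string &PTUPrefix) {
  std::vector<std::string> ToRemove;
  for (const std::string &D : Decls)
    if (D.rfind(PTUPrefix, 0) == 0) // name starts with the PTU prefix
      ToRemove.push_back(D);

  // The iteration above has finished, so erasing is safe here.
  for (const std::string &D : ToRemove)
    Decls.erase(std::remove(Decls.begin(), Decls.end(), D), Decls.end());
}
```

Collecting the entries first costs an extra copy but keeps the loop that reads the container separate from the code that modifies it.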
2023-08-28 | Reland "[clang-repl] support code completion at a REPL." | Fred Fu | 1 | -7/+2
Original commit message: " This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point for handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API, codeComplete in Interpreter/CodeCompletion, is a thin wrapper that calls ASTUnit::codeComplete with the necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, the PCC_TopLevelOrExpression and CCC_TopLevelOrExpression top levels were added so that SemaCodeCompletion can treat top-level statements like expression statements at the REPL. For example,
clang-repl> int foo = 42;
clang-repl> f<tab>
From a parser's perspective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382 "
The new patch also fixes clangd and several memory issues that the bots reported, and uploads the missing files.
2023-08-28 | Revert "Reland "[clang-repl] support code completion at a REPL."" | Vassil Vassilev | 1 | -2/+7
This reverts commit 5ab25a42ba70c4b50214b0e78eaaccd30696fa09 due to forgotten files.
2023-08-28 | Reland "[clang-repl] support code completion at a REPL." | Fred Fu | 1 | -7/+2
Original commit message: " This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point for handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API, codeComplete in Interpreter/CodeCompletion, is a thin wrapper that calls ASTUnit::codeComplete with the necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, the PCC_TopLevelOrExpression and CCC_TopLevelOrExpression top levels were added so that SemaCodeCompletion can treat top-level statements like expression statements at the REPL. For example,
clang-repl> int foo = 42;
clang-repl> f<tab>
From a parser's perspective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382 "
The new patch also fixes clangd and several memory issues that the bots reported.
2023-08-23 | Revert "[clang-repl] support code completion at a REPL." | Vassil Vassilev | 1 | -2/+7
This reverts commit eb0e6c3134ef6deafe0a4958e9e1a1214b3c2f14 due to failures in clangd such as https://lab.llvm.org/buildbot/#/builders/57/builds/29377
2023-08-23 | [clang-repl] support code completion at a REPL. | Fred Fu | 1 | -7/+2
This patch enabled code completion for ClangREPL. The feature was built upon three existing Clang components: a list completer for LineEditor, a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method. The first component serves as the main entry point for handling interactive inputs. Because a completion point for a compiler instance has to be unchanged once it is set, an incremental compiler instance is created for each code completion. Such a compiler instance carries over AST context source from the main interpreter compiler in order to obtain declarations or bindings from previous input in the same REPL session. The most important API, codeComplete in Interpreter/CodeCompletion, is a thin wrapper that calls ASTUnit::codeComplete with the necessary arguments, such as a code completion point and a ReplCompletionConsumer, which communicates completion results from SemaCodeCompletion back to the list completer for the REPL. In addition, the PCC_TopLevelOrExpression and CCC_TopLevelOrExpression top levels were added so that SemaCodeCompletion can treat top-level statements like expression statements at the REPL. For example,
clang-repl> int foo = 42;
clang-repl> f<tab>
From a parser's perspective, the cursor is at a top level. If we used code completion without any changes, PCC_Namespace would be supplied to Sema::CodeCompleteOrdinaryName, and thus the completion results would not include foo. Currently, the way we use PCC_TopLevelOrExpression and CCC_TopLevelOrExpression is no different from the way we use PCC_Statement and CCC_Statement respectively. Differential revision: https://reviews.llvm.org/D154382
2023-05-27 | [clang-repl][CUDA] Re-land: Initial interactive CUDA support for clang-repl | Anubhab Ghosh | 1 | -10/+26
CUDA support can be enabled in clang-repl with the --cuda flag. Device code linking is not yet supported. `inline` must be used with all __device__ functions. Differential Revision: https://reviews.llvm.org/D146389
2023-05-23 | Reland "Reland [clang-repl] Introduce Value to capture expression results" | Jun Zhang | 1 | -9/+90
This reverts commit 094ab4781262b6cb49d57b0ecdf84b047c879295. Reland with `ParseAndExecute` changed to `Parse` in `Interpreter::create`. This avoids creating a JIT instance every time, even when we don't really need one. This should fix failures like https://lab.llvm.org/buildbot/#/builders/38/builds/11955 The original reverted patch also caused GN bot failures on M1 (https://lab.llvm.org/buildbot/#/builders/38/builds/11955). However, we can't reproduce them, so let's reland it and see what happens. See discussions here: https://reviews.llvm.org/rGd71a4e02277a64a9dece591cdf2b34f15c3b19a0
2023-05-20 | Revert "[clang-repl][CUDA] Initial interactive CUDA support for clang-repl" | Anubhab Ghosh | 1 | -26/+10
This reverts commit 80e7eed6a610ab3c7289e6f9b7ec006bc7d7ae31.
2023-05-20 | [clang-repl][CUDA] Initial interactive CUDA support for clang-repl | Anubhab Ghosh | 1 | -10/+26
CUDA support can be enabled in clang-repl with the --cuda flag. Device code linking is not yet supported. `inline` must be used with all __device__ functions. Differential Revision: https://reviews.llvm.org/D146389
2023-05-19 | Revert "Reland [clang-repl] Introduce Value to capture expression results" | Jun Zhang | 1 | -90/+9
This reverts commit d71a4e02277a64a9dece591cdf2b34f15c3b19a0. See http://45.33.8.238/macm1/61024/step_7.txt
2023-05-19 | Reland [clang-repl] Introduce Value to capture expression results | Jun Zhang | 1 | -9/+90
This reverts commit 7158fd381a0bc0222195d6a07ebb42ea57957bda.
* Fixes endianness issue on big endian machines like PowerPC-bl
* Disables tests on platforms that have trouble supporting JIT
Signed-off-by: Jun Zhang <jun@junz.org>
2023-05-16 | Revert "[clang-repl] Introduce Value to capture expression results" | Jun Zhang | 1 | -90/+9
This reverts commit a423b7f1d7ca8b263af85944f57a69aa08fc942c. See https://lab.llvm.org/buildbot/#/changes/95083
2023-05-16 | [clang-repl] Introduce Value to capture expression results | Jun Zhang | 1 | -9/+90
This is the second part of the below RFC: https://discourse.llvm.org/t/rfc-handle-execution-results-in-clang-repl/68493
This patch implements a Value class that can be used to carry expression results in clang-repl. In other words, when we see a top-level expression without a trailing semicolon, it will be captured and stored in a Value object. You can explicitly specify where you want to store the object, like:
```
Value V;
llvm::cantFail(Interp->ParseAndExecute("int x = 42;"));
llvm::cantFail(Interp->ParseAndExecute("x", &V));
```
`V` now stores some useful information about `x`: you can get its real value (42), its `clang::QualType` or anything interesting. However, if you don't specify the optional argument, the result will be captured in a local variable and `Value::dump` will be called on it automatically, which is not implemented yet in this patch.
Signed-off-by: Jun Zhang <jun@junz.org>
2023-05-16 | [clang] Add a new annotation token: annot_repl_input_end | Jun Zhang | 1 | -8/+8
This patch is the first part of the below RFC: https://discourse.llvm.org/t/rfc-handle-execution-results-in-clang-repl/68493
It adds an annotation token which will replace the original EOF token when we are in the incremental C++ mode. In addition, when we're parsing an ExprStmt and there's a missing semicolon after the expression, we set a marker in the annotation token and continue parsing. Eventually, we propagate this info in ParseTopLevelStmtDecl and are able to mark this Decl as something we want to do value printing for. Below is an example:
clang-repl> int x = 42;
clang-repl> x
// `x` is a TopLevelStmtDecl and without a semicolon, we should set
// its IsSemiMissing bit so we can do something interesting in
// ASTConsumer::HandleTopLevelDecl.
The idea of the annotation token was proposed by Richard Smith, thanks!
Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D148997
2022-12-03 | [clang-repl] Support statements on global scope in incremental mode. | Vassil Vassilev | 1 | -4/+0
This patch teaches clang to parse statements on the global scope to allow:
```
./bin/clang-repl
clang-repl> int i = 12;
clang-repl> ++i;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> printf("%d\n", i);
13
clang-repl> %quit
```
Generally, disambiguating between statements and declarations is a non-trivial task for a C++ parser. The challenge is to allow both standard C++ to be translated as if this patch does not exist and, in the cases where the user typed a statement, to have it executed as if it were in a function body. Clang's Parser does pretty well in disambiguating between declarations and expressions. We have added a DisambiguatingWithExpression flag which allows us to preserve the existing and optimized behavior where needed and implement the extra rules for disambiguating. Only a few cases require additional attention:
* Constructors/destructors -- Parser::isConstructorDeclarator was used to disambiguate between ctor-looking declarations and statements on the global scope (e.g. `Ns::f()`).
* The template keyword -- the template keyword can appear in both declarations and statements. This patch considers the template keyword to be a declaration starter, which breaks a few cases in incremental mode which will be tackled later.
* The inline (and similar) keyword -- looking at the first token in many cases allows us to classify what is a declaration.
* Other language keywords and specifiers -- ObjC/ObjC++/OpenCL/OpenMP rely on pragmas or special tokens which will be handled in subsequent patches.
The patch conceptually models a "top-level" statement as a TopLevelStmtDecl. The TopLevelStmtDecl is lowered into a void function with no arguments. We attach this function to the global initializer list to execute the statement blocks in the correct order.
Differential revision: https://reviews.llvm.org/D127284
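The lowering described in the last paragraph can be pictured with a small model in which each top-level statement becomes a `void()` callable appended to an ordered list. This is a sketch only: `GlobalInitList`, `addTopLevelStmt` and `runPending` are hypothetical names, and running only the not-yet-executed entries is an assumption of this model rather than a description of CodeGen.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of "statement lowered to a void function with no
// arguments, attached to an ordered initializer list".
class GlobalInitList {
  std::vector<std::function<void()>> Inits; // one entry per top-level statement
  std::size_t NextToRun = 0;                // first entry not executed yet

public:
  // Register the body of a top-level statement, preserving input order.
  void addTopLevelStmt(std::function<void()> Body) {
    Inits.push_back(std::move(Body));
  }

  // Execute every statement added since the previous call, in order.
  void runPending() {
    for (; NextToRun < Inits.size(); ++NextToRun)
      Inits[NextToRun]();
  }
};
```

With `int i = 12;` in scope, adding `[&] { ++i; }` followed by a statement that reads `i` and then calling `runPending()` runs both in input order, so the second statement observes 13, as in the session above.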
2022-08-08 | [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC | Fangrui Song | 1 | -7/+7
With C++17 there is no Clang pedantic warning or MSVC C5051. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D131346
2022-08-02 | Handles failing driver tests of clang | Purva-Chaudhari | 1 | -0/+4
Added support for incremental mode 8 and 28, i.e. `frontend::EmitBC:` and `frontend::PrintPreprocessedInput:`. Added supporting clang tests to test in clang-repl mode. Reviewed By: v.g.vassilev Differential Revision: https://reviews.llvm.org/D125946
2022-06-26 | [clang-repl] Implement code undo. | Jun Zhang | 1 | -21/+21
In interactive C++ it is convenient to roll back to a previous state of the compiler. For example:
clang-repl> int x = 42;
clang-repl> %undo
clang-repl> float x = 24 // not an error
To support this, the patch extends the functionality used to recover from errors and adds functionality to recover the low-level execution infrastructure. The current implementation is based on watermarks. It exploits the fact that at each incremental input the underlying compiler infrastructure is in a valid state. We can only go N incremental inputs back to a previous valid state. We do not need and do not do any further dependency tracking.
This patch was co-developed with V. Vassilev, relies on the past work of Purva Chaudhari in clang-repl and is inspired by the past work on the same feature in the Cling interpreter.
Co-authored-by: Purva-Chaudhari <purva.chaudhari02@gmail.com> Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com> Signed-off-by: Jun Zhang <jun@junz.org>
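The watermark idea, that every accepted input leaves the system in a valid state and undo simply drops the last N inputs, can be sketched as a stack. The class below is a simplified, hypothetical model (`InputHistory`, `accept`, `undo` are made-up names); the real implementation also has to unwind AST and JIT state, which is not modelled here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: each accepted input is a watermark; no dependency
// tracking between inputs is attempted.
class InputHistory {
  std::vector<std::string> Inputs; // accepted incremental inputs, oldest first

public:
  // Record an input that was processed successfully.
  void accept(std::string Input) { Inputs.push_back(std::move(Input)); }

  // Go N inputs back. Fails, changing nothing, if fewer than N inputs exist.
  bool undo(std::size_t N = 1) {
    if (N > Inputs.size())
      return false;
    Inputs.resize(Inputs.size() - N);
    return true;
  }

  std::size_t size() const { return Inputs.size(); }
};
```

After `accept("int x = 42;")` and `undo()`, the history is empty again, which is why a later `float x = 24` does not conflict in the session above.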
2022-06-24 | Implement soft reset of the diagnostics engine. | Tapasweni Pathak | 1 | -2/+2
This patch implements soft reset and adds tests for soft reset success of the diagnostics engine. This allows us to recover from errors in clang-repl without resetting the pragma handlers' state. Differential revision: https://reviews.llvm.org/D126183
2022-06-24 | Reland "[clang-repl] Recover the lookup tables of the primary context." | Vassil Vassilev | 1 | -1/+1
The asan issue was fixed in llvm/llvm-project@7bc00ce5cd41 This reverts commit 575e297fcb289f0a9b0ac4b01d1d0fa051f5cc29. Differential revision: https://reviews.llvm.org/D123674
2022-06-18 | [clang-repl] Remove memory leak of ASTContext/TargetMachine. | Sunho Kim | 1 | -1/+4
Removes memory leak of ASTContext and TargetMachine. When DisableFree is turned on, it intentionally leaks these instances as they can be trivially deallocated. This patch turns this off and deletes the Parser instance early so that they will not reference dangling pragma headers. Asan shouldn't detect these as leaks normally, since burypointer is called for them. But every invocation of the incremental parser creates an additional leak of TargetMachine. If there are many invocations within a single test case, the number of leaks easily exceeds kGraveYardMaxSize (which is 12) and leaks start to get reported by the asan buildbots. Reviewed By: v.g.vassilev Differential Revision: https://reviews.llvm.org/D127991
2022-05-31 | Revert "[clang-repl] Recover the lookup tables of the primary context." | Vassil Vassilev | 1 | -1/+1
This reverts commit 5ff27fe1ff03d5aeaf8567c97618170f0cef8f58. This patch caused failures in asan: https://lab.llvm.org/buildbot/#/builders/5/builds/24221
2022-05-29 | [clang-repl] Recover the lookup tables of the primary context. | Purva-Chaudhari | 1 | -1/+1
Before this patch, a redeclaration error was reported if an error was encountered on the same line. The recovery support acted only if this type of error was encountered on the first line of the program and not on subsequent lines. For example:
```
clang-repl> int i=9;
clang-repl> int j=9; err;
input_line_3:1:5: error: redefinition of 'j'
int j = 9;
```
Differential revision: https://reviews.llvm.org/D123674
2022-02-21 | [C++20][Modules][1/8] Track valid import state. | Iain Sandoe | 1 | -2/+3
In C++20 modules, imports must be together and at the start of the module. Rather than growing more ad-hoc flags to test state, this keeps track of the phase of a valid module TU (first decl, global module frag, module, private module frag). If the phasing is broken (with some diagnostic) the pattern does not conform to a valid C++20 module, and we set the state accordingly. We can thus issue diagnostics when imports appear in the wrong places and decouple the C++20 modules state from other module variants (modules-ts and clang modules). Additionally, we attempt to diagnose wrong imports before trying to find the module where possible (the latter will generally emit an unhelpful diagnostic about the module not being available). Although this generally simplifies the handling of C++20 module import diagnostics, the motivation was that, in particular, it allows detecting invalid imports like:
import module A;
int some_decl();
import module B;
where being in a module purview is insufficient to identify them. Differential Revision: https://reviews.llvm.org/D118893
2022-02-20 | Revert "[C++20][Modules][1/8] Track valid import state." | Iain Sandoe | 1 | -3/+2
This reverts commit 8a3f9a584ad43369cf6a034dc875ebfca76d9033. need to investigate build failures that do not show on CI or local testing.
2022-02-20 | [C++20][Modules][1/8] Track valid import state. | Iain Sandoe | 1 | -2/+3
In C++20 modules, imports must be together and at the start of the module. Rather than growing more ad-hoc flags to test state, this keeps track of the phase of a valid module TU (first decl, global module frag, module, private module frag). If the phasing is broken (with some diagnostic) the pattern does not conform to a valid C++20 module, and we set the state accordingly. We can thus issue diagnostics when imports appear in the wrong places and decouple the C++20 modules state from other module variants (modules-ts and clang modules). Additionally, we attempt to diagnose wrong imports before trying to find the module where possible (the latter will generally emit an unhelpful diagnostic about the module not being available). Although this generally simplifies the handling of C++20 module import diagnostics, the motivation was that, in particular, it allows detecting invalid imports like:
import module A;
int some_decl();
import module B;
where being in a module purview is insufficient to identify them. Differential Revision: https://reviews.llvm.org/D118893
2021-12-29 | [clang] Use nullptr instead of 0 or NULL (NFC) | Kazu Hirata | 1 | -1/+1
Identified with modernize-use-nullptr.
2021-11-10 | [clang-repl] Allow Interpreter::getSymbolAddress to take a mangled name. | Vassil Vassilev | 1 | -0/+7
2021-10-05 | Reland "[clang-repl] Allow loading of plugins in clang-repl." | Vassil Vassilev | 1 | -0/+2
Differential revision: https://reviews.llvm.org/D110484
2021-10-05 | Revert "[clang-repl] Allow loading of plugins in clang-repl." | Vassil Vassilev | 1 | -2/+0
This reverts commit 81fb640f83b6a5d099f9124739ab3049be79ea56 due to bot failures: https://lab.llvm.org/buildbot#builders/57/builds/10807
2021-10-05[clang-repl] Allow loading of plugins in clang-repl.Vassil Vassilev1-0/+2
Differential revision: https://reviews.llvm.org/D110484
2021-07-12Reland "[clang-repl] Implement partial translation units and error recovery."Vassil Vassilev1-27/+62
Original commit message: [clang-repl] Implement partial translation units and error recovery.

https://reviews.llvm.org/D96033 contained a discussion regarding efficient modeling of error recovery. @rjmccall has outlined the key ideas:

Conceptually, we can split the translation unit into a sequence of partial translation units (PTUs). Every declaration will be associated with a unique PTU that owns it.

The first key insight here is that the owning PTU isn't always the "active" (most recent) PTU, and it isn't always the PTU that the declaration "comes from". A new declaration (that isn't a redeclaration or specialization of anything) does belong to the active PTU. A template specialization, however, belongs to the most recent PTU of all the declarations in its signature - mostly that means that it can be pulled into a more recent PTU by its template arguments.

The second key insight is that processing a PTU might extend an earlier PTU. Rolling back the later PTU shouldn't throw that extension away. For example, if the second PTU defines a template, and the third PTU requires that template to be instantiated at float, that template specialization is still part of the second PTU. Similarly, if the fifth PTU uses an inline function belonging to the fourth, that definition still belongs to the fourth. When we go to emit code in a new PTU, we map each declaration we have to emit back to its owning PTU and emit it in a new module for just the extensions to that PTU. We keep track of all the modules we've emitted for a PTU so that we can unload them all if we decide to roll it back.

Most declarations/definitions will only refer to entities from the same or earlier PTUs. However, it is possible (primarily by defining a previously-declared entity, but also through templates or ADL) for an entity that belongs to one PTU to refer to something from a later PTU. We will have to keep track of this and prevent unwinding to a later PTU when we recognize it. Fortunately, this should be very rare; and crucially, we don't have to do the bookkeeping for this if we've only got one PTU, e.g. in normal compilation. Otherwise, PTUs after the first just need to record enough metadata to be able to revert any changes they've made to declarations belonging to earlier PTUs, e.g. to redeclaration chains or template specialization lists.

It should even eventually be possible for PTUs to provide their own slab allocators which can be thrown away as part of rolling back the PTU. We can maintain a notion of the active allocator and allocate things like Stmt/Expr nodes in it, temporarily changing it to the appropriate PTU whenever we go to do something like instantiate a function template. More care will be required when allocating declarations and types, though.

We would want the PTU to be efficiently recoverable from a Decl; I'm not sure how best to do that. An easy option that would cover most declarations would be to make multiple TranslationUnitDecls and parent the declarations appropriately, but I don't think that's good enough for things like member function templates, since an instantiation of that would still be parented by its original class. Maybe we can work this into the DC chain somehow, like how lexical DCs are.

We add a different kind of translation unit `TU_Incremental` which is a complete translation unit that we might nonetheless incrementally extend later. Because it is complete (and we might want to generate code for it), we do perform template instantiation, but because it might be extended later, we don't warn if it declares or uses undefined internal-linkage symbols.

This patch teaches clang-repl how to recover from errors by disconnecting the most recent PTU and updating the primary PTU lookup tables.
For instance:
```
./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
            ^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
2021-07-11Revert "[clang-repl] Implement partial translation units and error recovery."Vassil Vassilev1-62/+27
This reverts commit 6775fc6ffa3ca1c36b20c25fa4e7f48f81213cf2. It also reverts "[lldb] Fix compilation by adjusting to the new ASTContext signature." This reverts commit 03a3f86071c10a1f6cbbf7375aa6fe9d94168972. We see some failures on the lldb infrastructure; these changes might play a role in them. Let's revert it now and see if the bots become green. Ref: https://reviews.llvm.org/D104918
2021-07-11[clang-repl] Implement partial translation units and error recovery.Vassil Vassilev1-27/+62
https://reviews.llvm.org/D96033 contained a discussion regarding efficient modeling of error recovery. @rjmccall has outlined the key ideas:

Conceptually, we can split the translation unit into a sequence of partial translation units (PTUs). Every declaration will be associated with a unique PTU that owns it.

The first key insight here is that the owning PTU isn't always the "active" (most recent) PTU, and it isn't always the PTU that the declaration "comes from". A new declaration (that isn't a redeclaration or specialization of anything) does belong to the active PTU. A template specialization, however, belongs to the most recent PTU of all the declarations in its signature - mostly that means that it can be pulled into a more recent PTU by its template arguments.

The second key insight is that processing a PTU might extend an earlier PTU. Rolling back the later PTU shouldn't throw that extension away. For example, if the second PTU defines a template, and the third PTU requires that template to be instantiated at float, that template specialization is still part of the second PTU. Similarly, if the fifth PTU uses an inline function belonging to the fourth, that definition still belongs to the fourth. When we go to emit code in a new PTU, we map each declaration we have to emit back to its owning PTU and emit it in a new module for just the extensions to that PTU. We keep track of all the modules we've emitted for a PTU so that we can unload them all if we decide to roll it back.

Most declarations/definitions will only refer to entities from the same or earlier PTUs. However, it is possible (primarily by defining a previously-declared entity, but also through templates or ADL) for an entity that belongs to one PTU to refer to something from a later PTU. We will have to keep track of this and prevent unwinding to a later PTU when we recognize it. Fortunately, this should be very rare; and crucially, we don't have to do the bookkeeping for this if we've only got one PTU, e.g. in normal compilation. Otherwise, PTUs after the first just need to record enough metadata to be able to revert any changes they've made to declarations belonging to earlier PTUs, e.g. to redeclaration chains or template specialization lists.

It should even eventually be possible for PTUs to provide their own slab allocators which can be thrown away as part of rolling back the PTU. We can maintain a notion of the active allocator and allocate things like Stmt/Expr nodes in it, temporarily changing it to the appropriate PTU whenever we go to do something like instantiate a function template. More care will be required when allocating declarations and types, though.

We would want the PTU to be efficiently recoverable from a Decl; I'm not sure how best to do that. An easy option that would cover most declarations would be to make multiple TranslationUnitDecls and parent the declarations appropriately, but I don't think that's good enough for things like member function templates, since an instantiation of that would still be parented by its original class. Maybe we can work this into the DC chain somehow, like how lexical DCs are.

We add a different kind of translation unit `TU_Incremental` which is a complete translation unit that we might nonetheless incrementally extend later. Because it is complete (and we might want to generate code for it), we do perform template instantiation, but because it might be extended later, we don't warn if it declares or uses undefined internal-linkage symbols.

This patch teaches clang-repl how to recover from errors by disconnecting the most recent PTU and updating the primary PTU lookup tables.
For instance:
```
./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
            ^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
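The recovery step described above (disconnect the most recent PTU, then fix up the lookup tables) can be illustrated with a toy model. This is a minimal sketch, not the clang implementation: `ToyPTU`, `ToySession`, and their members are hypothetical names, and the "lookup table" is reduced to a map from names to integer values.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a partial translation unit: it only records
// which names it introduced, so they can be removed again on rollback.
struct ToyPTU {
  std::vector<std::string> Names;
};

// Hypothetical session holding one shared lookup table and a PTU stack.
class ToySession {
  std::unordered_map<std::string, int> Lookup; // name -> value
  std::vector<ToyPTU> PTUs;                    // oldest first

public:
  // Start a new PTU for the next input line.
  void begin() { PTUs.emplace_back(); }

  // Declare a name in the active PTU; returns false on redefinition.
  bool declare(const std::string &Name, int Value) {
    if (PTUs.empty())
      begin();
    if (!Lookup.emplace(Name, Value).second)
      return false;
    PTUs.back().Names.push_back(Name);
    return true;
  }

  // Disconnect the most recent PTU and drop its names from the lookup table.
  void rollBack() {
    if (PTUs.empty())
      return;
    for (const std::string &Name : PTUs.back().Names)
      Lookup.erase(Name);
    PTUs.pop_back();
  }

  // Returns a pointer to the value, or nullptr if the name is unknown.
  const int *find(const std::string &Name) const {
    auto It = Lookup.find(Name);
    return It == Lookup.end() ? nullptr : &It->second;
  }
};
```

In terms of the session shown in the commit message: the failed input `int i = 12; error;` corresponds to `declare("i", 12)` followed by `rollBack()`, after which `declare("i", 13)` succeeds because the first `i` is no longer visible.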
2021-05-18[clang-repl] Better match the underlying architecture.Vassil Vassilev1-1/+4
In cases where -fno-integrated-as is specified we should overwrite the EmitAssembly action as well. We also should rely on the target triple from the process at least until we implement out-of-process execution. This patch should improve clang-repl on AIX. Discussion available at: https://reviews.llvm.org/D96033 Differential revision: https://reviews.llvm.org/D102688
2021-05-13[clang-repl] Recommit "Land initial infrastructure for incremental parsing"Vassil Vassilev1-0/+254
Original commit message:

In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have mentioned our plans to make some of the incremental compilation facilities available in llvm mainline.

This patch proposes a minimal version of a repl, clang-repl, which enables interpreter-like interaction for C++. For instance:

./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit

The patch allows very limited functionality, for example, it crashes on invalid C++. The design of the proposed patch follows closely the design of cling. The idea is to gather feedback and gradually evolve both clang-repl and cling to what the community agrees upon.

The IncrementalParser class is responsible for driving the clang parser and codegen and allows the compiler infrastructure to process more than one input. Every input adds to the “ever-growing” translation unit. That model is enabled by an IncrementalAction which prevents teardown when HandleTranslationUnit is called.

The IncrementalExecutor class hides some of the underlying implementation details of the concrete JIT infrastructure. It exposes the minimal set of functionality required by our incremental compiler/interpreter.

The Transaction class keeps track of the AST and the LLVM IR for each incremental input. That tracking information will be later used to implement error recovery.

The Interpreter class orchestrates the IncrementalParser and the IncrementalExecutor to model interpreter-like behavior. It provides the public API which can be used (in future) when using the interpreter library.

Differential revision: https://reviews.llvm.org/D96033
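The division of responsibilities described in this commit message can be sketched as a toy model. All names below (`ToyTransaction`, `ToyParser`, `ToyExecutor`, `ToyInterpreter`) are hypothetical and only mirror the roles of the classes named above; none of this is the real clang API, and "executing" an input is reduced to recording its index.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the per-input record (the real one tracks AST and IR).
struct ToyTransaction {
  std::size_t Index;
  std::string Source;
};

// Toy parser: every input is appended to an ever-growing list that is
// never torn down between inputs.
class ToyParser {
  std::vector<ToyTransaction> Inputs;

public:
  const ToyTransaction &parse(std::string Code) {
    Inputs.push_back({Inputs.size(), std::move(Code)});
    return Inputs.back();
  }
  std::size_t inputCount() const { return Inputs.size(); }
};

// Toy executor: hides how an input is run; here it only records the index.
class ToyExecutor {
  std::vector<std::size_t> Executed;

public:
  void run(const ToyTransaction &T) { Executed.push_back(T.Index); }
  std::size_t executedCount() const { return Executed.size(); }
};

// Toy interpreter: orchestrates parser and executor behind one entry point.
class ToyInterpreter {
  ToyParser Parser;
  ToyExecutor Executor;

public:
  void parseAndExecute(std::string Code) {
    Executor.run(Parser.parse(std::move(Code)));
  }
  std::size_t inputCount() const { return Parser.inputCount(); }
  std::size_t executedCount() const { return Executor.executedCount(); }
};
```

The design choice the sketch tries to show is that callers only see the interpreter's single entry point, while the parser keeps all earlier inputs alive so later inputs can refer to them.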