diff options
author | Utkarsh Saxena <usx@google.com> | 2025-07-10 23:42:20 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-10 23:42:20 +0200 |
commit | 3076794e924f30ae21d1a12f27b1e6349dfa5fc4 (patch) | |
tree | 5f902266b31d02545e6c45427c80a02f12ba8d4e | |
parent | 7920dff39406c2af3859d8e316c8f098526d6af3 (diff) | |
download | llvm-3076794e924f30ae21d1a12f27b1e6349dfa5fc4.zip llvm-3076794e924f30ae21d1a12f27b1e6349dfa5fc4.tar.gz llvm-3076794e924f30ae21d1a12f27b1e6349dfa5fc4.tar.bz2 |
[LifetimeSafety] Introduce intra-procedural analysis in Clang (#142313)
This patch introduces the initial implementation of the
intra-procedural, flow-sensitive lifetime analysis for Clang, as
proposed in the recent RFC:
https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-analysis-in-clang/86291
The primary goal of this initial submission is to establish the core
dataflow framework and gather feedback on the overall design, fact
representation, and testing strategy. The focus is on the dataflow
mechanism itself rather than exhaustively covering all C++ AST edge
cases, which will be addressed in subsequent patches.
#### Key Components
* **Conceptual Model:** Introduces the fundamental concepts of `Loan`,
`Origin`, and `Path` to model memory borrows and the lifetime of
pointers.
* **Fact Generation:** A frontend pass traverses the Clang CFG to
generate a representation of lifetime-relevant events, such as pointer
assignments, taking an address, and variables going out of scope.
* **Testing:** `llvm-lit` tests validate the analysis by checking the
generated facts.
### Next Steps
*(Not covered in this PR but planned for subsequent patches)*
The following functionality is planned for the upcoming patches to build
upon this foundation and make the analysis usable in practice:
* **Dataflow Lattice:** A dataflow lattice used to map each pointer's
symbolic `Origin` to the set of `Loans` it may contain at any given
program point.
* **Fixed-Point Analysis:** A worklist-based, flow-sensitive analysis
that propagates the lattice state across the CFG to a fixed point.
* **Placeholder Loans:** Introduce placeholder loans to represent the
lifetimes of function parameters, forming the basis for analysis
involving function calls.
* **Annotation and Opaque Call Handling:** Use placeholder loans to
correctly model **function calls**, both by respecting
`[[clang::lifetimebound]]` annotations and by conservatively handling
opaque/un-annotated functions.
* **Error Reporting:** Implement the final analysis phase that consumes
the dataflow results to generate user-facing diagnostics. This will
likely require liveness analysis to identify live origins holding
expired loans.
* **Strict vs. Permissive Modes:** Add the logic to support both
high-confidence (permissive) and more comprehensive (strict) warning
levels.
* **Expanded C++ Coverage:** Broaden support for common patterns,
including the lifetimes of temporary objects and pointers within
aggregate types (structs/containers).
* Performance benchmarking
* Capping number of iterations or number of times a CFGBlock is
processed.
---------
Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
-rw-r--r-- | clang/include/clang/Analysis/Analyses/LifetimeSafety.h | 30 | ||||
-rw-r--r-- | clang/include/clang/Basic/DiagnosticGroups.td | 3 | ||||
-rw-r--r-- | clang/include/clang/Basic/DiagnosticSemaKinds.td | 4 | ||||
-rw-r--r-- | clang/lib/Analysis/CMakeLists.txt | 1 | ||||
-rw-r--r-- | clang/lib/Analysis/LifetimeSafety.cpp | 510 | ||||
-rw-r--r-- | clang/lib/Sema/AnalysisBasedWarnings.cpp | 10 | ||||
-rw-r--r-- | clang/test/Sema/warn-lifetime-safety-dataflow.cpp | 191 |
7 files changed, 749 insertions, 0 deletions
diff --git a/clang/include/clang/Analysis/Analyses/LifetimeSafety.h b/clang/include/clang/Analysis/Analyses/LifetimeSafety.h new file mode 100644 index 0000000..9998702 --- /dev/null +++ b/clang/include/clang/Analysis/Analyses/LifetimeSafety.h @@ -0,0 +1,30 @@ +//===- LifetimeSafety.h - C++ Lifetime Safety Analysis -*----------- C++-*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines the entry point for a dataflow-based static analysis +// that checks for C++ lifetime violations. +// +// The analysis is based on the concepts of "origins" and "loans" to track +// pointer lifetimes and detect issues like use-after-free and dangling +// pointers. See the RFC for more details: +// https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-analysis-in-clang/86291 +// +//===----------------------------------------------------------------------===// +#ifndef LLVM_CLANG_ANALYSIS_ANALYSES_LIFETIMESAFETY_H +#define LLVM_CLANG_ANALYSIS_ANALYSES_LIFETIMESAFETY_H +#include "clang/AST/DeclBase.h" +#include "clang/Analysis/AnalysisDeclContext.h" +#include "clang/Analysis/CFG.h" +namespace clang { + +void runLifetimeSafetyAnalysis(const DeclContext &DC, const CFG &Cfg, + AnalysisDeclContext &AC); + +} // namespace clang + +#endif // LLVM_CLANG_ANALYSIS_ANALYSES_LIFETIMESAFETY_H diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td index f54a830..9a7a308 100644 --- a/clang/include/clang/Basic/DiagnosticGroups.td +++ b/clang/include/clang/Basic/DiagnosticGroups.td @@ -532,6 +532,9 @@ def Dangling : DiagGroup<"dangling", [DanglingAssignment, DanglingInitializerList, DanglingGsl, ReturnStackAddress]>; + +def LifetimeSafety : DiagGroup<"experimental-lifetime-safety">; + def DistributedObjectModifiers : DiagGroup<"distributed-object-modifiers">; def DllexportExplicitInstantiationDecl : DiagGroup<"dllexport-explicit-instantiation-decl">; def ExcessInitializers : DiagGroup<"excess-initializers">; diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 934f445..09ba796 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -10627,6 +10627,10 @@ def warn_dangling_reference_captured_by_unknown : Warning< "object whose reference is captured will be destroyed at the end of " "the full-expression">, InGroup<DanglingCapture>; +def warn_experimental_lifetime_safety_dummy_warning : Warning< + "todo: remove this warning after we have atleast one warning based on the lifetime analysis">, + InGroup<LifetimeSafety>, DefaultIgnore; + // For non-floating point, expressions of the form x == x or x != x // should result in a warning, since these always evaluate to a constant. // Array comparisons have similar warnings diff --git a/clang/lib/Analysis/CMakeLists.txt b/clang/lib/Analysis/CMakeLists.txt index 8cd3990..0523d92 100644 --- a/clang/lib/Analysis/CMakeLists.txt +++ b/clang/lib/Analysis/CMakeLists.txt @@ -21,6 +21,7 @@ add_clang_library(clangAnalysis FixitUtil.cpp IntervalPartition.cpp IssueHash.cpp + LifetimeSafety.cpp LiveVariables.cpp MacroExpansionContext.cpp ObjCNoReturn.cpp diff --git a/clang/lib/Analysis/LifetimeSafety.cpp b/clang/lib/Analysis/LifetimeSafety.cpp new file mode 100644 index 0000000..1f18952 --- /dev/null +++ b/clang/lib/Analysis/LifetimeSafety.cpp @@ -0,0 +1,510 @@ +//===- LifetimeSafety.cpp - C++ Lifetime Safety Analysis -*--------- C++-*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +#include "clang/Analysis/Analyses/LifetimeSafety.h" +#include "clang/AST/Decl.h" +#include "clang/AST/Expr.h" +#include "clang/AST/StmtVisitor.h" +#include "clang/AST/Type.h" +#include "clang/Analysis/Analyses/PostOrderCFGView.h" +#include "clang/Analysis/AnalysisDeclContext.h" +#include "clang/Analysis/CFG.h" +#include "llvm/ADT/FoldingSet.h" +#include "llvm/ADT/PointerUnion.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/TimeProfiler.h" +#include <cstdint> + +namespace clang { +namespace { + +/// Represents the storage location being borrowed, e.g., a specific stack +/// variable. +/// TODO: Model access paths of other types, e.g., s.field, heap and globals. +struct AccessPath { + const clang::ValueDecl *D; + + AccessPath(const clang::ValueDecl *D) : D(D) {} +}; + +/// A generic, type-safe wrapper for an ID, distinguished by its `Tag` type. +/// Used for giving ID to loans and origins. +template <typename Tag> struct ID { + uint32_t Value = 0; + + bool operator==(const ID<Tag> &Other) const { return Value == Other.Value; } + bool operator!=(const ID<Tag> &Other) const { return !(*this == Other); } + bool operator<(const ID<Tag> &Other) const { return Value < Other.Value; } + ID<Tag> operator++(int) { + ID<Tag> Tmp = *this; + ++Value; + return Tmp; + } + void Profile(llvm::FoldingSetNodeID &IDBuilder) const { + IDBuilder.AddInteger(Value); + } +}; + +template <typename Tag> +inline llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, ID<Tag> ID) { + return OS << ID.Value; +} + +using LoanID = ID<struct LoanTag>; +using OriginID = ID<struct OriginTag>; + +/// Information about a single borrow, or "Loan". A loan is created when a +/// reference or pointer is created. +struct Loan { + /// TODO: Represent opaque loans. + /// TODO: Represent nullptr: loans to no path. Accessing it UB! Currently it + /// is represented as empty LoanSet + LoanID ID; + AccessPath Path; + SourceLocation IssueLoc; + + Loan(LoanID id, AccessPath path, SourceLocation loc) + : ID(id), Path(path), IssueLoc(loc) {} +}; + +/// An Origin is a symbolic identifier that represents the set of possible +/// loans a pointer-like object could hold at any given time. +/// TODO: Enhance the origin model to handle complex types, pointer +/// indirection and reborrowing. The plan is to move from a single origin per +/// variable/expression to a "list of origins" governed by the Type. +/// For example, the type 'int**' would have two origins. +/// See discussion: +/// https://github.com/llvm/llvm-project/pull/142313/commits/0cd187b01e61b200d92ca0b640789c1586075142#r2137644238 +struct Origin { + OriginID ID; + /// A pointer to the AST node that this origin represents. This union + /// distinguishes between origins from declarations (variables or parameters) + /// and origins from expressions. + llvm::PointerUnion<const clang::ValueDecl *, const clang::Expr *> Ptr; + + Origin(OriginID ID, const clang::ValueDecl *D) : ID(ID), Ptr(D) {} + Origin(OriginID ID, const clang::Expr *E) : ID(ID), Ptr(E) {} + + const clang::ValueDecl *getDecl() const { + return Ptr.dyn_cast<const clang::ValueDecl *>(); + } + const clang::Expr *getExpr() const { + return Ptr.dyn_cast<const clang::Expr *>(); + } +}; + +/// Manages the creation, storage and retrieval of loans. +class LoanManager { +public: + LoanManager() = default; + + Loan &addLoan(AccessPath Path, SourceLocation Loc) { + AllLoans.emplace_back(getNextLoanID(), Path, Loc); + return AllLoans.back(); + } + + const Loan &getLoan(LoanID ID) const { + assert(ID.Value < AllLoans.size()); + return AllLoans[ID.Value]; + } + llvm::ArrayRef<Loan> getLoans() const { return AllLoans; } + +private: + LoanID getNextLoanID() { return NextLoanID++; } + + LoanID NextLoanID{0}; + /// TODO(opt): Profile and evaluate the usefullness of small buffer + /// optimisation. + llvm::SmallVector<Loan> AllLoans; +}; + +/// Manages the creation, storage, and retrieval of origins for pointer-like +/// variables and expressions. +class OriginManager { +public: + OriginManager() = default; + + Origin &addOrigin(OriginID ID, const clang::ValueDecl &D) { + AllOrigins.emplace_back(ID, &D); + return AllOrigins.back(); + } + Origin &addOrigin(OriginID ID, const clang::Expr &E) { + AllOrigins.emplace_back(ID, &E); + return AllOrigins.back(); + } + + OriginID get(const Expr &E) { + // Origin of DeclRefExpr is that of the declaration it refers to. + if (const auto *DRE = dyn_cast<DeclRefExpr>(&E)) + return get(*DRE->getDecl()); + auto It = ExprToOriginID.find(&E); + // TODO: This should be an assert(It != ExprToOriginID.end()). The current + // implementation falls back to getOrCreate to avoid crashing on + // yet-unhandled pointer expressions, creating an empty origin for them. + if (It == ExprToOriginID.end()) + return getOrCreate(E); + + return It->second; + } + + OriginID get(const ValueDecl &D) { + auto It = DeclToOriginID.find(&D); + // TODO: This should be an assert(It != DeclToOriginID.end()). The current + // implementation falls back to getOrCreate to avoid crashing on + // yet-unhandled pointer expressions, creating an empty origin for them. + if (It == DeclToOriginID.end()) + return getOrCreate(D); + + return It->second; + } + + OriginID getOrCreate(const Expr &E) { + auto It = ExprToOriginID.find(&E); + if (It != ExprToOriginID.end()) + return It->second; + + if (const auto *DRE = dyn_cast<DeclRefExpr>(&E)) { + // Origin of DeclRefExpr is that of the declaration it refers to. + return getOrCreate(*DRE->getDecl()); + } + OriginID NewID = getNextOriginID(); + addOrigin(NewID, E); + ExprToOriginID[&E] = NewID; + return NewID; + } + + const Origin &getOrigin(OriginID ID) const { + assert(ID.Value < AllOrigins.size()); + return AllOrigins[ID.Value]; + } + + llvm::ArrayRef<Origin> getOrigins() const { return AllOrigins; } + + OriginID getOrCreate(const ValueDecl &D) { + auto It = DeclToOriginID.find(&D); + if (It != DeclToOriginID.end()) + return It->second; + OriginID NewID = getNextOriginID(); + addOrigin(NewID, D); + DeclToOriginID[&D] = NewID; + return NewID; + } + +private: + OriginID getNextOriginID() { return NextOriginID++; } + + OriginID NextOriginID{0}; + /// TODO(opt): Profile and evaluate the usefullness of small buffer + /// optimisation. + llvm::SmallVector<Origin> AllOrigins; + llvm::DenseMap<const clang::ValueDecl *, OriginID> DeclToOriginID; + llvm::DenseMap<const clang::Expr *, OriginID> ExprToOriginID; +}; + +/// An abstract base class for a single, atomic lifetime-relevant event. +class Fact { + +public: + enum class Kind : uint8_t { + /// A new loan is issued from a borrow expression (e.g., &x). + Issue, + /// A loan expires as its underlying storage is freed (e.g., variable goes + /// out of scope). + Expire, + /// An origin is propagated from a source to a destination (e.g., p = q). + AssignOrigin, + /// An origin escapes the function by flowing into the return value. + ReturnOfOrigin + }; + +private: + Kind K; + +protected: + Fact(Kind K) : K(K) {} + +public: + virtual ~Fact() = default; + Kind getKind() const { return K; } + + template <typename T> const T *getAs() const { + if (T::classof(this)) + return static_cast<const T *>(this); + return nullptr; + } + + virtual void dump(llvm::raw_ostream &OS) const { + OS << "Fact (Kind: " << static_cast<int>(K) << ")\n"; + } +}; + +class IssueFact : public Fact { + LoanID LID; + OriginID OID; + +public: + static bool classof(const Fact *F) { return F->getKind() == Kind::Issue; } + + IssueFact(LoanID LID, OriginID OID) : Fact(Kind::Issue), LID(LID), OID(OID) {} + LoanID getLoanID() const { return LID; } + OriginID getOriginID() const { return OID; } + void dump(llvm::raw_ostream &OS) const override { + OS << "Issue (LoanID: " << getLoanID() << ", OriginID: " << getOriginID() + << ")\n"; + } +}; + +class ExpireFact : public Fact { + LoanID LID; + +public: + static bool classof(const Fact *F) { return F->getKind() == Kind::Expire; } + + ExpireFact(LoanID LID) : Fact(Kind::Expire), LID(LID) {} + LoanID getLoanID() const { return LID; } + void dump(llvm::raw_ostream &OS) const override { + OS << "Expire (LoanID: " << getLoanID() << ")\n"; + } +}; + +class AssignOriginFact : public Fact { + OriginID OIDDest; + OriginID OIDSrc; + +public: + static bool classof(const Fact *F) { + return F->getKind() == Kind::AssignOrigin; + } + + AssignOriginFact(OriginID OIDDest, OriginID OIDSrc) + : Fact(Kind::AssignOrigin), OIDDest(OIDDest), OIDSrc(OIDSrc) {} + OriginID getDestOriginID() const { return OIDDest; } + OriginID getSrcOriginID() const { return OIDSrc; } + void dump(llvm::raw_ostream &OS) const override { + OS << "AssignOrigin (DestID: " << getDestOriginID() + << ", SrcID: " << getSrcOriginID() << ")\n"; + } +}; + +class ReturnOfOriginFact : public Fact { + OriginID OID; + +public: + static bool classof(const Fact *F) { + return F->getKind() == Kind::ReturnOfOrigin; + } + + ReturnOfOriginFact(OriginID OID) : Fact(Kind::ReturnOfOrigin), OID(OID) {} + OriginID getReturnedOriginID() const { return OID; } + void dump(llvm::raw_ostream &OS) const override { + OS << "ReturnOfOrigin (OriginID: " << getReturnedOriginID() << ")\n"; + } +}; + +class FactManager { +public: + llvm::ArrayRef<const Fact *> getFacts(const CFGBlock *B) const { + auto It = BlockToFactsMap.find(B); + if (It != BlockToFactsMap.end()) + return It->second; + return {}; + } + + void addBlockFacts(const CFGBlock *B, llvm::ArrayRef<Fact *> NewFacts) { + if (!NewFacts.empty()) + BlockToFactsMap[B].assign(NewFacts.begin(), NewFacts.end()); + } + + template <typename FactType, typename... Args> + FactType *createFact(Args &&...args) { + void *Mem = FactAllocator.Allocate<FactType>(); + return new (Mem) FactType(std::forward<Args>(args)...); + } + + void dump(const CFG &Cfg, AnalysisDeclContext &AC) const { + llvm::dbgs() << "==========================================\n"; + llvm::dbgs() << " Lifetime Analysis Facts:\n"; + llvm::dbgs() << "==========================================\n"; + if (const Decl *D = AC.getDecl()) + if (const auto *ND = dyn_cast<NamedDecl>(D)) + llvm::dbgs() << "Function: " << ND->getQualifiedNameAsString() << "\n"; + // Print blocks in the order as they appear in code for a stable ordering. + for (const CFGBlock *B : *AC.getAnalysis<PostOrderCFGView>()) { + llvm::dbgs() << " Block B" << B->getBlockID() << ":\n"; + auto It = BlockToFactsMap.find(B); + if (It != BlockToFactsMap.end()) { + for (const Fact *F : It->second) { + llvm::dbgs() << " "; + F->dump(llvm::dbgs()); + } + } + llvm::dbgs() << " End of Block\n"; + } + } + + LoanManager &getLoanMgr() { return LoanMgr; } + OriginManager &getOriginMgr() { return OriginMgr; } + +private: + LoanManager LoanMgr; + OriginManager OriginMgr; + llvm::DenseMap<const clang::CFGBlock *, llvm::SmallVector<const Fact *>> + BlockToFactsMap; + llvm::BumpPtrAllocator FactAllocator; +}; + +class FactGenerator : public ConstStmtVisitor<FactGenerator> { + +public: + FactGenerator(FactManager &FactMgr, AnalysisDeclContext &AC) + : FactMgr(FactMgr), AC(AC) {} + + void run() { + llvm::TimeTraceScope TimeProfile("FactGenerator"); + // Iterate through the CFG blocks in reverse post-order to ensure that + // initializations and destructions are processed in the correct sequence. + for (const CFGBlock *Block : *AC.getAnalysis<PostOrderCFGView>()) { + CurrentBlockFacts.clear(); + for (unsigned I = 0; I < Block->size(); ++I) { + const CFGElement &Element = Block->Elements[I]; + if (std::optional<CFGStmt> CS = Element.getAs<CFGStmt>()) + Visit(CS->getStmt()); + else if (std::optional<CFGAutomaticObjDtor> DtorOpt = + Element.getAs<CFGAutomaticObjDtor>()) + handleDestructor(*DtorOpt); + } + FactMgr.addBlockFacts(Block, CurrentBlockFacts); + } + } + + void VisitDeclStmt(const DeclStmt *DS) { + for (const Decl *D : DS->decls()) + if (const auto *VD = dyn_cast<VarDecl>(D)) + if (hasOrigin(VD->getType())) + if (const Expr *InitExpr = VD->getInit()) + addAssignOriginFact(*VD, *InitExpr); + } + + void VisitCXXNullPtrLiteralExpr(const CXXNullPtrLiteralExpr *N) { + /// TODO: Handle nullptr expr as a special 'null' loan. Uninitialized + /// pointers can use the same type of loan. + FactMgr.getOriginMgr().getOrCreate(*N); + } + + void VisitImplicitCastExpr(const ImplicitCastExpr *ICE) { + if (!hasOrigin(ICE->getType())) + return; + Visit(ICE->getSubExpr()); + // An ImplicitCastExpr node itself gets an origin, which flows from the + // origin of its sub-expression (after stripping its own parens/casts). + // TODO: Consider if this is actually useful in practice. Alternatively, we + // could directly use the sub-expression's OriginID instead of creating a + // new one. + addAssignOriginFact(*ICE, *ICE->getSubExpr()); + } + + void VisitUnaryOperator(const UnaryOperator *UO) { + if (UO->getOpcode() == UO_AddrOf) { + const Expr *SubExpr = UO->getSubExpr(); + if (const auto *DRE = dyn_cast<DeclRefExpr>(SubExpr)) { + if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) { + // Check if it's a local variable. + if (VD->hasLocalStorage()) { + OriginID OID = FactMgr.getOriginMgr().getOrCreate(*UO); + AccessPath AddrOfLocalVarPath(VD); + const Loan &L = FactMgr.getLoanMgr().addLoan(AddrOfLocalVarPath, + UO->getOperatorLoc()); + CurrentBlockFacts.push_back( + FactMgr.createFact<IssueFact>(L.ID, OID)); + } + } + } + } + } + + void VisitReturnStmt(const ReturnStmt *RS) { + if (const Expr *RetExpr = RS->getRetValue()) { + if (hasOrigin(RetExpr->getType())) { + OriginID OID = FactMgr.getOriginMgr().getOrCreate(*RetExpr); + CurrentBlockFacts.push_back( + FactMgr.createFact<ReturnOfOriginFact>(OID)); + } + } + } + + void VisitBinaryOperator(const BinaryOperator *BO) { + if (BO->isAssignmentOp()) { + const Expr *LHSExpr = BO->getLHS(); + const Expr *RHSExpr = BO->getRHS(); + + // We are interested in assignments like `ptr1 = ptr2` or `ptr = &var` + // LHS must be a pointer/reference type that can be an origin. + // RHS must also represent an origin (either another pointer/ref or an + // address-of). + if (const auto *DRE_LHS = dyn_cast<DeclRefExpr>(LHSExpr)) + if (const auto *VD_LHS = + dyn_cast<ValueDecl>(DRE_LHS->getDecl()->getCanonicalDecl()); + VD_LHS && hasOrigin(VD_LHS->getType())) + addAssignOriginFact(*VD_LHS, *RHSExpr); + } + } + +private: + // Check if a type has an origin. + bool hasOrigin(QualType QT) { return QT->isPointerOrReferenceType(); } + + template <typename Destination, typename Source> + void addAssignOriginFact(const Destination &D, const Source &S) { + OriginID DestOID = FactMgr.getOriginMgr().getOrCreate(D); + OriginID SrcOID = FactMgr.getOriginMgr().get(S); + CurrentBlockFacts.push_back( + FactMgr.createFact<AssignOriginFact>(DestOID, SrcOID)); + } + + void handleDestructor(const CFGAutomaticObjDtor &DtorOpt) { + /// TODO: Also handle trivial destructors (e.g., for `int` + /// variables) which will never have a CFGAutomaticObjDtor node. + /// TODO: Handle loans to temporaries. + /// TODO: Consider using clang::CFG::BuildOptions::AddLifetime to reuse the + /// lifetime ends. + const VarDecl *DestructedVD = DtorOpt.getVarDecl(); + if (!DestructedVD) + return; + // Iterate through all loans to see if any expire. + /// TODO(opt): Do better than a linear search to find loans associated with + /// 'DestructedVD'. + for (const Loan &L : FactMgr.getLoanMgr().getLoans()) { + const AccessPath &LoanPath = L.Path; + // Check if the loan is for a stack variable and if that variable + // is the one being destructed. + if (LoanPath.D == DestructedVD) + CurrentBlockFacts.push_back(FactMgr.createFact<ExpireFact>(L.ID)); + } + } + + FactManager &FactMgr; + AnalysisDeclContext &AC; + llvm::SmallVector<Fact *> CurrentBlockFacts; +}; + +// ========================================================================= // +// TODO: Run dataflow analysis to propagate loans, analyse and error reporting. +// ========================================================================= // +} // anonymous namespace + +void runLifetimeSafetyAnalysis(const DeclContext &DC, const CFG &Cfg, + AnalysisDeclContext &AC) { + llvm::TimeTraceScope TimeProfile("LifetimeSafetyAnalysis"); + DEBUG_WITH_TYPE("PrintCFG", Cfg.dump(AC.getASTContext().getLangOpts(), + /*ShowColors=*/true)); + FactManager FactMgr; + FactGenerator FactGen(FactMgr, AC); + FactGen.run(); + DEBUG_WITH_TYPE("LifetimeFacts", FactMgr.dump(Cfg, AC)); +} +} // namespace clang diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clang/lib/Sema/AnalysisBasedWarnings.cpp index 7420ba2..4cce77e3 100644 --- a/clang/lib/Sema/AnalysisBasedWarnings.cpp +++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp @@ -29,6 +29,7 @@ #include "clang/Analysis/Analyses/CFGReachabilityAnalysis.h" #include "clang/Analysis/Analyses/CalledOnceCheck.h" #include "clang/Analysis/Analyses/Consumed.h" +#include "clang/Analysis/Analyses/LifetimeSafety.h" #include "clang/Analysis/Analyses/ReachableCode.h" #include "clang/Analysis/Analyses/ThreadSafety.h" #include "clang/Analysis/Analyses/UninitializedValues.h" @@ -49,6 +50,7 @@ #include "llvm/ADT/STLFunctionalExtras.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" +#include "llvm/Support/Debug.h" #include <algorithm> #include <deque> #include <iterator> @@ -2744,6 +2746,8 @@ void clang::sema::AnalysisBasedWarnings::IssueWarnings( .setAlwaysAdd(Stmt::UnaryOperatorClass); } + bool EnableLifetimeSafetyAnalysis = !Diags.isIgnored( + diag::warn_experimental_lifetime_safety_dummy_warning, D->getBeginLoc()); // Install the logical handler. std::optional<LogicalErrorHandler> LEH; if (LogicalErrorHandler::hasActiveDiagnostics(Diags, D->getBeginLoc())) { @@ -2866,6 +2870,12 @@ void clang::sema::AnalysisBasedWarnings::IssueWarnings( } } + // TODO: Enable lifetime safety analysis for other languages once it is + // stable. + if (EnableLifetimeSafetyAnalysis && S.getLangOpts().CPlusPlus) { + if (CFG *cfg = AC.getCFG()) + runLifetimeSafetyAnalysis(*cast<DeclContext>(D), *cfg, AC); + } // Check for violations of "called once" parameter properties. if (S.getLangOpts().ObjC && !S.getLangOpts().CPlusPlus && shouldAnalyzeCalledOnceParameters(Diags, D->getBeginLoc())) { diff --git a/clang/test/Sema/warn-lifetime-safety-dataflow.cpp b/clang/test/Sema/warn-lifetime-safety-dataflow.cpp new file mode 100644 index 0000000..df45afa --- /dev/null +++ b/clang/test/Sema/warn-lifetime-safety-dataflow.cpp @@ -0,0 +1,191 @@ +// RUN: %clang_cc1 -mllvm -debug-only=LifetimeFacts -Wexperimental-lifetime-safety %s 2>&1 | FileCheck %s + +struct MyObj { + int id; + ~MyObj() {} // Non-trivial destructor +}; + +// Simple Local Variable Address and Return +// CHECK-LABEL: Function: return_local_addr +MyObj* return_local_addr() { + MyObj x {10}; + MyObj* p = &x; +// CHECK: Block B{{[0-9]+}}: +// CHECK: Issue (LoanID: [[L_X:[0-9]+]], OriginID: [[O_ADDR_X:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_X]]) + return p; +// CHECK: AssignOrigin (DestID: [[O_RET_VAL:[0-9]+]], SrcID: [[O_P]]) +// CHECK: ReturnOfOrigin (OriginID: [[O_RET_VAL]]) +// CHECK: Expire (LoanID: [[L_X]]) +} + + +// Pointer Assignment and Return +// CHECK-LABEL: Function: assign_and_return_local_addr +// CHECK-NEXT: Block B{{[0-9]+}}: +MyObj* assign_and_return_local_addr() { + MyObj y{20}; + MyObj* ptr1 = &y; +// CHECK: Issue (LoanID: [[L_Y:[0-9]+]], OriginID: [[O_ADDR_Y:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_PTR1:[0-9]+]], SrcID: [[O_ADDR_Y]]) + MyObj* ptr2 = ptr1; +// CHECK: AssignOrigin (DestID: [[O_PTR1_RVAL:[0-9]+]], SrcID: [[O_PTR1]]) +// CHECK: AssignOrigin (DestID: [[O_PTR2:[0-9]+]], SrcID: [[O_PTR1_RVAL]]) + ptr2 = ptr1; +// CHECK: AssignOrigin (DestID: [[O_PTR1_RVAL_2:[0-9]+]], SrcID: [[O_PTR1]]) +// CHECK: AssignOrigin (DestID: [[O_PTR2]], SrcID: [[O_PTR1_RVAL_2]]) + ptr2 = ptr2; // Self assignment. +// CHECK: AssignOrigin (DestID: [[O_PTR2_RVAL:[0-9]+]], SrcID: [[O_PTR2]]) +// CHECK: AssignOrigin (DestID: [[O_PTR2]], SrcID: [[O_PTR2_RVAL]]) + return ptr2; +// CHECK: AssignOrigin (DestID: [[O_PTR2_RVAL_2:[0-9]+]], SrcID: [[O_PTR2]]) +// CHECK: ReturnOfOrigin (OriginID: [[O_PTR2_RVAL_2]]) +// CHECK: Expire (LoanID: [[L_Y]]) +} + + +// Return of Non-Pointer Type +// CHECK-LABEL: Function: return_int_val +// CHECK-NEXT: Block B{{[0-9]+}}: +int return_int_val() { + int x = 10; + return x; +} +// CHECK-NEXT: End of Block + + +// Loan Expiration (Automatic Variable, C++) +// CHECK-LABEL: Function: loan_expires_cpp +// CHECK-NEXT: Block B{{[0-9]+}}: +void loan_expires_cpp() { + MyObj obj{1}; + MyObj* pObj = &obj; +// CHECK: Issue (LoanID: [[L_OBJ:[0-9]+]], OriginID: [[O_ADDR_OBJ:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_POBJ:[0-9]+]], SrcID: [[O_ADDR_OBJ]]) +// CHECK: Expire (LoanID: [[L_OBJ]]) +} + + +// FIXME: No expire for Trivial Destructors +// CHECK-LABEL: Function: loan_expires_trivial +// CHECK-NEXT: Block B{{[0-9]+}}: +void loan_expires_trivial() { + int trivial_obj = 1; + int* pTrivialObj = &trivial_obj; +// CHECK: Issue (LoanID: [[L_TRIVIAL_OBJ:[0-9]+]], OriginID: [[O_ADDR_TRIVIAL_OBJ:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_PTOBJ:[0-9]+]], SrcID: [[O_ADDR_TRIVIAL_OBJ]]) +// CHECK-NOT: Expire (LoanID: [[L_TRIVIAL_OBJ]]) +// CHECK-NEXT: End of Block + // FIXME: Add check for Expire once trivial destructors are handled for expiration. +} + + +// CHECK-LABEL: Function: conditional +void conditional(bool condition) { + int a = 5; + int b = 10; + int* p = nullptr; + + if (condition) + p = &a; + // CHECK: Issue (LoanID: [[L_A:[0-9]+]], OriginID: [[O_ADDR_A:[0-9]+]]) + // CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_A]]) + else + p = &b; + // CHECK: Issue (LoanID: [[L_B:[0-9]+]], OriginID: [[O_ADDR_B:[0-9]+]]) + // CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_ADDR_B]]) + int *q = p; + // CHECK: AssignOrigin (DestID: [[O_P_RVAL:[0-9]+]], SrcID: [[O_P]]) + // CHECK: AssignOrigin (DestID: [[O_Q:[0-9]+]], SrcID: [[O_P_RVAL]]) +} + + +// CHECK-LABEL: Function: overwrite_origin +void overwrite_origin() { + MyObj s1; + MyObj s2; + MyObj* p = &s1; +// CHECK: Block B{{[0-9]+}}: +// CHECK: Issue (LoanID: [[L_S1:[0-9]+]], OriginID: [[O_ADDR_S1:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_S1]]) + p = &s2; +// CHECK: Issue (LoanID: [[L_S2:[0-9]+]], OriginID: [[O_ADDR_S2:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_ADDR_S2]]) +// CHECK: Expire (LoanID: [[L_S2]]) +// CHECK: Expire (LoanID: [[L_S1]]) +} + + +// CHECK-LABEL: Function: reassign_to_null +void reassign_to_null() { + MyObj s1; + MyObj* p = &s1; +// CHECK: Block B{{[0-9]+}}: +// CHECK: Issue (LoanID: [[L_S1:[0-9]+]], OriginID: [[O_ADDR_S1:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_S1]]) + p = nullptr; +// CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_NULLPTR:[0-9]+]]) +// CHECK: Expire (LoanID: [[L_S1]]) +} +// FIXME: Have a better representation for nullptr than just an empty origin. +// It should be a separate loan and origin kind. + + +// CHECK-LABEL: Function: reassign_in_if +void reassign_in_if(bool condition) { + MyObj s1; + MyObj s2; + MyObj* p = &s1; +// CHECK: Block B{{[0-9]+}}: +// CHECK: Issue (LoanID: [[L_S1:[0-9]+]], OriginID: [[O_ADDR_S1:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_S1]]) + if (condition) { + p = &s2; +// CHECK: Block B{{[0-9]+}}: +// CHECK: Issue (LoanID: [[L_S2:[0-9]+]], OriginID: [[O_ADDR_S2:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_ADDR_S2]]) + } +// CHECK: Block B{{[0-9]+}}: +// CHECK: Expire (LoanID: [[L_S2]]) +// CHECK: Expire (LoanID: [[L_S1]]) +} + + +// CHECK-LABEL: Function: nested_scopes +void nested_scopes() { + MyObj* p = nullptr; +// CHECK: Block B{{[0-9]+}}: +// CHECK: AssignOrigin (DestID: [[O_NULLPTR_CAST:[0-9]+]], SrcID: [[O_NULLPTR:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_NULLPTR_CAST]]) + { + MyObj outer; + p = &outer; +// CHECK: Issue (LoanID: [[L_OUTER:[0-9]+]], OriginID: [[O_ADDR_OUTER:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_ADDR_OUTER]]) + { + MyObj inner; + p = &inner; +// CHECK: Issue (LoanID: [[L_INNER:[0-9]+]], OriginID: [[O_ADDR_INNER:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P]], SrcID: [[O_ADDR_INNER]]) + } +// CHECK: Expire (LoanID: [[L_INNER]]) + } +// CHECK: Expire (LoanID: [[L_OUTER]]) +} + + +// CHECK-LABEL: Function: pointer_indirection +void pointer_indirection() { + int a; + int *p = &a; +// CHECK: Block B1: +// CHECK: Issue (LoanID: [[L_A:[0-9]+]], OriginID: [[O_ADDR_A:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_P:[0-9]+]], SrcID: [[O_ADDR_A]]) + int **pp = &p; +// CHECK: Issue (LoanID: [[L_P:[0-9]+]], OriginID: [[O_ADDR_P:[0-9]+]]) +// CHECK: AssignOrigin (DestID: [[O_PP:[0-9]+]], SrcID: [[O_ADDR_P]]) + +// FIXME: The Origin for the RHS is broken + int *q = *pp; +// CHECK: AssignOrigin (DestID: [[O_Q:[0-9]+]], SrcID: {{[0-9]+}}) +} |