path: root/clang/lib/Analysis
Age | Commit message | Author | Files | Lines
36 hours[LifetimeSafety] Avoid adding already present items in sets/maps (#159582)Utkarsh Saxena1-4/+8
Optimize lifetime safety analysis performance

- Added early return optimization in `join` function for ImmutableSet when sets are identical
- Improved ImmutableMap join logic to avoid unnecessary operations when values are equal

I was under the impression that ImmutableSets/Maps would not modify the underlying structure if already existing elements are added to the container (and was hoping for structural equality in this aspect). It looks like the current implementation of `ImmutableSet` performs the addition nevertheless, thereby creating (presumably `O(log(N))`) tree nodes.

This change considerably brings down compile times for some edge cases which happened to be present in the LLVM codebase. Now it is actually possible to compile LLVM in under 20 min with the lifetime analysis. The compile time hit is still significant but not as bad as before this change, when it was not possible to compile LLVM without severely limiting the analysis' scope (giving up on CFGs with > 3000 blocks).

Fixes https://github.com/llvm/llvm-project/issues/157420

<details>
<summary>Report (Before)</summary>
</details>

<details>
<summary>Report (After)</summary>

# Lifetime Analysis Performance Report
> Generated on: 2025-09-18 14:28:00

---

## Test Case: Pointer Cycle in Loop

**Timing Results:**

| N (Input Size) | Total Time | Analysis Time (%) | Fact Generator (%) | Loan Propagation (%) | Expired Loans (%) |
|:---------------|-----------:|------------------:|-------------------:|---------------------:|------------------:|
| 25 | 53.76 ms | 85.58% | 0.00% | 85.46% | 0.00% |
| 50 | 605.35 ms | 98.39% | 0.00% | 98.37% | 0.00% |
| 75 | 2.89 s | 99.62% | 0.00% | 99.61% | 0.00% |
| 100 | 8.62 s | 99.80% | 0.00% | 99.80% | 0.00% |

**Complexity Analysis:**

| Analysis Phase | Complexity O(n<sup>k</sup>) |
|:------------------|:--------------------------|
| Total Analysis | O(n<sup>3.82</sup> &pm; 0.01) |
| FactGenerator | (Negligible) |
| LoanPropagation | O(n<sup>3.82</sup> &pm; 0.01) |
| ExpiredLoans | (Negligible) |

---

## Test Case: CFG Merges

**Timing Results:**

| N (Input Size) | Total Time | Analysis Time (%) | Fact Generator (%) | Loan Propagation (%) | Expired Loans (%) |
|:---------------|-----------:|------------------:|-------------------:|---------------------:|------------------:|
| 400 | 66.02 ms | 58.61% | 1.04% | 56.53% | 1.02% |
| 1000 | 319.24 ms | 81.31% | 0.63% | 80.04% | 0.64% |
| 2000 | 1.43 s | 92.00% | 0.40% | 91.32% | 0.28% |
| 5000 | 9.35 s | 97.01% | 0.25% | 96.63% | 0.12% |

**Complexity Analysis:**

| Analysis Phase | Complexity O(n<sup>k</sup>) |
|:------------------|:--------------------------|
| Total Analysis | O(n<sup>2.12</sup> &pm; 0.02) |
| FactGenerator | O(n<sup>1.54</sup> &pm; 0.02) |
| LoanPropagation | O(n<sup>2.12</sup> &pm; 0.03) |
| ExpiredLoans | O(n<sup>1.13</sup> &pm; 0.03) |

---

## Test Case: Deeply Nested Loops

**Timing Results:**

| N (Input Size) | Total Time | Analysis Time (%) | Fact Generator (%) | Loan Propagation (%) | Expired Loans (%) |
|:---------------|-----------:|------------------:|-------------------:|---------------------:|------------------:|
| 50 | 137.30 ms | 90.72% | 0.00% | 90.42% | 0.00% |
| 100 | 1.09 s | 98.13% | 0.00% | 98.02% | 0.09% |
| 150 | 4.06 s | 99.24% | 0.00% | 99.18% | 0.05% |
| 200 | 10.44 s | 99.66% | 0.00% | 99.63% | 0.03% |

**Complexity Analysis:**

| Analysis Phase | Complexity O(n<sup>k</sup>) |
|:------------------|:--------------------------|
| Total Analysis | O(n<sup>3.29</sup> &pm; 0.01) |
| FactGenerator | (Negligible) |
| LoanPropagation | O(n<sup>3.29</sup> &pm; 0.01) |
| ExpiredLoans | O(n<sup>1.42</sup> &pm; 0.19) |

---

</details>
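The early-return idea in this commit can be illustrated with a small sketch. This is not the patch itself: `LoanSet`, `join`, and the `Allocations` counter are hypothetical names, and a `std::shared_ptr` to a `std::set` stands in for an immutable set handle. The point is only that a join can skip all work when both handles are identical and can avoid building a new set when every element is already present.

```cpp
#include <cassert>
#include <memory>
#include <set>

// Hypothetical stand-in for an immutable set handle: two handles that
// compare equal share the same underlying storage.
using LoanSet = std::shared_ptr<const std::set<int>>;

// Counts how many fresh sets join() had to build (illustration only).
static int Allocations = 0;

LoanSet join(const LoanSet &A, const LoanSet &B) {
  // Early return: identical handles need no work at all.
  if (A == B)
    return A;

  // Copy A lazily, only once an element missing from A is found.
  std::shared_ptr<std::set<int>> Result;
  for (int Loan : *B) {
    if (A->count(Loan))
      continue; // already present, leave the container untouched
    if (!Result) {
      Result = std::make_shared<std::set<int>>(*A);
      ++Allocations;
    }
    Result->insert(Loan);
  }
  return Result ? LoanSet(Result) : A;
}
```

With this shape, joining a set with itself or with one of its subsets returns the original handle, so a later equality check between the old and new state is a cheap pointer comparison.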
4 days[clang][BufferUsage] Fix a StringRef lifetime issue (#159109)Timm Baeder1-13/+13
The code before assigned the `std::string` returned from `tryEvaluateString()` to the `StringRef`, but it was possible that the underlying data of that string vanished in the meantime, passing invalid stack memory to `ParsePrintfString`. Fix this by using two different code paths for the `getCharByteWidth() == 1` case and the `tryEvaluateString()` one.
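A minimal sketch of this class of bug, using `std::string_view` as a stand-in for `llvm::StringRef`. The functions `tryEvaluate`, `countPercent`, and `analyze` are hypothetical and only mirror the shape described above: a non-owning view must not outlive the owning string it was created from.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical evaluator that returns an owning string on success.
std::optional<std::string> tryEvaluate(bool Succeeds) {
  if (!Succeeds)
    return std::nullopt;
  return std::string("%d items");
}

// Hypothetical consumer that takes a non-owning view of the format string.
std::size_t countPercent(std::string_view Format) {
  std::size_t N = 0;
  for (char C : Format)
    N += (C == '%');
  return N;
}

std::size_t analyze(bool Succeeds) {
  // Problematic shape (deliberately not executed here):
  //   std::string_view View = *tryEvaluate(true); // temporary dies here
  //   countPercent(View);                         // reads freed memory
  // Safer shape: keep the owner alive for as long as the view is used.
  if (std::optional<std::string> Owned = tryEvaluate(Succeeds))
    return countPercent(*Owned);
  return 0;
}
```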
7 days[LifetimeSafety] Add support for GSL Pointer types (#154009)Utkarsh Saxena1-4/+90
This extends the lifetime safety analysis to support C++ types annotated with `gsl::Pointer`, which represent non-owning "view" types like `std::string_view`. These types have the same lifetime safety concerns as raw pointers and references.

- Added support for detecting and analyzing `gsl::Pointer` annotated types in lifetime safety analysis
- Implemented handling for various expressions involving `gsl::Pointer` types:
  - Constructor expressions
  - Member call expressions (especially conversion operators)
  - Functional cast expressions
  - Initialization list expressions
  - Materialized temporary expressions
- Updated the pointer type detection to recognize `gsl::Pointer` types
- Added handling for function calls that create borrows through reference parameters

Fixes: https://github.com/llvm/llvm-project/issues/152513
10 days[LifetimeSafety] Associate origins to all l-valued expressions (#156896)Utkarsh Saxena1-65/+113
This patch refactors the C++ lifetime safety analysis to implement a more consistent model for tracking borrows. The central idea is to make loan creation a consequence of referencing a variable, while making loan propagation dependent on the type's semantics.

This change introduces a more uniform model for tracking borrows from non-pointer types:

* Centralised Loan Creation: A Loan is now created for every `DeclRefExpr` that refers to a **non-pointer type** (e.g., `std::string`, `int`). This correctly models that any use of a **glvalue** is a borrow of its storage, replacing the previous heuristic-based loan creation.
* The address-of operator (&) no longer creates loans. Instead, it propagates the origin (and thus the loans) of its sub-expression. This is guarded to exclude expressions that are already pointer types, deferring the complexity of pointers-to-pointers.

**Future Work: Multi-Origin Model**

This patch deliberately defers support for creating loans on references to pointer-type expressions (e.g., `&my_pointer`). The current single-origin model is unable to distinguish between a loan to the pointer variable itself (its storage) and a loan to the object it points to. The future plan is to move to a multi-origin model where a type has a "list of origins" governed by its level of indirection, which will allow the analysis to track these distinct lifetimes separately. Once this more advanced model is in place, the restriction can be lifted, and all `DeclRefExpr` nodes, regardless of type, can uniformly create a loan, making the analysis consistent.
10 daysThread Safety Analysis: Basic capability alias-analysis (#142955)Marco Elver2-26/+167
Add basic alias analysis for capabilities by reusing LocalVariableMap, which tracks currently valid definitions of variables. Aliases created through complex control flow are not tracked. This implementation would satisfy the basic needs of addressing the concerns for Linux kernel application [1].

For example, the analysis will no longer generate false positives for cases such as (and many others):

    void testNestedAccess(Container *c) {
      Foo *ptr = &c->foo;
      ptr->mu.Lock();
      c->foo.data = 42;  // OK - no false positive
      ptr->mu.Unlock();
    }

    void testNestedAcquire(Container *c) EXCLUSIVE_LOCK_FUNCTION(&c->foo.mu) {
      Foo *buf = &c->foo;
      buf->mu.Lock();  // OK - no false positive
    }

Given the analysis is now able to identify potentially unsafe patterns it was not able to identify previously (see added FIXME test case for an example), mark alias resolution as a "beta" feature behind the flag `-Wthread-safety-beta`.

**Fixing LocalVariableMap:** It was found that LocalVariableMap was not properly tracking loop-invariant aliases: the old implementation failed because the merge logic compared raw VarDefinition IDs. The algorithm for handling back-edges (in createReferenceContext()) generates new 'reference' definitions for loop-scoped variables. Later ID comparison caused alias invalidation at back-edge merges (in intersectBackEdge()) and at subsequent forward-merges with non-loop paths (in intersectContexts()).

Fix LocalVariableMap by adding the getCanonicalDefinitionID() helper that resolves any definition ID down to its non-reference base. As a result, a variable's definition is preserved across control-flow merges as long as its underlying canonical definition remains the same.

Link: https://lore.kernel.org/all/CANpmjNPquO=W1JAh1FNQb8pMQjgeZAKCPQUAd7qUg=5pjJ6x=Q@mail.gmail.com/ [1]
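The canonicalization step described for LocalVariableMap can be sketched as follows. This is a simplified illustration, not the actual implementation: the `VarDefinition` layout, `canonicalID`, and `sameDefinition` are assumptions made for the example. A "reference" definition points at an earlier definition, and two IDs are treated as the same definition when they resolve to the same non-reference base.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified definition record. A reference definition
// forwards to an earlier definition; a base definition does not.
struct VarDefinition {
  bool IsReference;
  unsigned Ref; // index of the referenced definition when IsReference
};

// Follow reference links until a non-reference base definition is reached.
unsigned canonicalID(const std::vector<VarDefinition> &Defs, unsigned ID) {
  while (Defs[ID].IsReference)
    ID = Defs[ID].Ref;
  return ID;
}

// Compare canonical IDs instead of raw IDs, so a definition survives a
// merge as long as its underlying base definition is unchanged.
bool sameDefinition(const std::vector<VarDefinition> &Defs, unsigned A,
                    unsigned B) {
  return canonicalID(Defs, A) == canonicalID(Defs, B);
}
```

Comparing raw IDs would treat a loop's reference definition and the definition it refers to as different, which is the alias invalidation the commit message describes.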
11 days[analyzer][NFC] Modernize LivenessValues::isLive (#157800)Balazs Benics1-6/+8
Removing statefulness also adds the benefit of short-circuiting.
11 daysReland [clang][dataflow] Transfer more cast expressions. (#157535)Samira Bakon2-19/+94
Reverts llvm/llvm-project#157148 Adds fixes to `TransferVisitor::VisitCXXConstructExpr` and `copyRecord` to avoid crashing on base class initialization from sibling-derived class instances. I believe this is the only use of copyRecord where we need this special handling for a shared base class.
12 days[analyzer][NFC] Change LiveVariablesImpl::inAssignment from DenseMap to DenseSet (#157685)Balazs Benics1-3/+4
The `inAssignment` variable is actually used as a set; let's declare it as a set.
12 days[analyzer][NFC] Modernize loops in LiveVariables analysis (#157670)Balazs Benics1-25/+12
12 days[analyzer][NFC] Remove dead LiveVariables::Observer::observerKill (#157661)Balazs Benics1-33/+0
This API was never used in the clang code base. There might be downstream users, but I highly doubt that. I think the best is to get rid of this unused API.
12 days[clang-tidy] `bugprone-unchecked-optional-access`: handle `BloombergLP::bdlb::NullableValue::makeValue` to prevent false-positives (#144313)Valentyn Yukhymenko1-0/+14
https://github.com/llvm/llvm-project/pull/101450 added support for `BloombergLP::bdlb::NullableValue`. However, `NullableValue::makeValue` and `NullableValue::makeValueInplace` were missed, which affects code like this:

```cpp
if (opt.isNull()) {
  opt.makeValue(42);
}

opt.value(); // triggers false positive warning from `bugprone-unchecked-optional-access`
```

My patch addresses this issue.

[Docs that I used for methods mocks](https://bloomberg.github.io/bde-resources/doxygen/bde_api_prod/classbdlb_1_1NullableValue.html)

---------

Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
12 days[analyzer][NFC] Rename LivenessValues::equals to LivenessValues::operator== (#157657)Balazs Benics1-2/+2
This is just more conventional.
12 days[analyzer] In LivenessValues::equals also check liveBindings (#157645)Balazs Benics1-1/+2
This was likely accidentally omitted when `liveBindings` was introduced. I don't think in practice it matters.
2025-09-05[LifetimeSafety] Mark all DeclRefExpr as usages of the corresp. origin (#154316)Utkarsh Saxena1-26/+63
Instead of identifying various forms of pointer usage (like dereferencing, member access, or function calls) individually, this new approach simplifies the logic by treating all `DeclRefExpr`s as uses of their underlying origin.

When a `DeclRefExpr` appears on the left-hand side of an assignment, the corresponding `UseFact` is marked as a "write" operation. These write operations are then exempted from use-after-free checks.
2025-09-05Revert "[clang][dataflow] Transfer more cast expressions." (#157148)Samira Bakon1-59/+12
Reverts llvm/llvm-project#153066

copyRecord crashes if copying from the RecordStorageLocation shared by the base/derived objects after a DerivedToBase cast because the source type is still `Derived` but the copy destination could be of a sibling type derived from Base that has children not present in `Derived`.

For example, running the dataflow analysis over the following produces UB by nullptr deref, or fails asserts if enabled:

```cc
struct Base {};

struct DerivedOne : public Base {
  int DerivedOneField;
};

struct DerivedTwo : public Base {
  int DerivedTwoField;

  DerivedTwo(const DerivedOne& d1)
      : Base(d1), DerivedTwoField(d1.DerivedOneField) {}
};
```

The constructor initializer for `DerivedTwoField` serves the purpose of forcing `DerivedOneField` to be modeled, which is necessary to trigger the crash but not the assert failure.
2025-09-03[LifetimeSafety] Fix duplicate loan generation for ImplicitCastExpr (#153661)Utkarsh Saxena1-1/+0
This PR fixes a bug in the lifetime safety analysis where `ImplicitCastExpr` nodes were causing duplicate loan generation. The changes:

1. Remove the recursive `Visit(ICE->getSubExpr())` call in `VisitImplicitCastExpr` to prevent duplicate processing of the same expression
2. Ensure the CFG build options are properly configured for lifetime safety analysis by moving the flag check earlier
3. Enhance the unit test infrastructure to properly handle multiple loans per variable
4. Add a test case that verifies implicit casts to const don't create duplicate loans
5. Add a test case for ternary operators with a FIXME note about origin propagation

These changes prevent the analysis from generating duplicate loans when expressions are wrapped in implicit casts, which improves the accuracy of the lifetime safety analysis.
2025-09-01[clang]: Support `analyzer_noreturn` attribute in `CFG` (#150952)Andrey Karlov1-1/+2
## Problem

Currently, functions with the `analyzer_noreturn` attribute aren't recognized as `no-return` by `CFG`:

```cpp
void assertion_handler() __attribute__((analyzer_noreturn)) {
  log(...);
}

void handle_error(std::optional<int> opt) {
  if (!opt) {
    assertion_handler(); // Static analyzer doesn't know this never returns
  }
  *opt = 1; // False-positive `unchecked-optional-access` warning as analyzer thinks this is reachable
}
```

## Solution

1. Extend the `FunctionDecl` class by adding an `isAnalyzerNoReturn()` function
2. Update `CFGBuilder::VisitCallExpr` to check both `FD->isNoReturn()` and `FD->isAnalyzerNoReturn()` properties

## Comments

This PR incorporates part of the work done in https://github.com/llvm/llvm-project/pull/146355
2025-08-27[clang] NFC: reintroduce clang/include/clang/AST/Type.h (#155050)Matheus Izvekov15-15/+15
This reintroduces `Type.h`, having earlier been renamed to `TypeBase.h`, as a redirection to `TypeBase.h`, and redirects most users to include the former instead. This is a preparatory patch for being able to provide inline definitions for `Type` methods which would otherwise cause a circular dependency with `Decl{,CXX}.h`. Splitting these operations into their own NFC patch helps the git rename detection logic work, preserving the history. This patch makes clang just a little slower to build (~0.17%), just because it makes more code indirectly include `DeclCXX.h`.
2025-08-27[clang] NFC: rename clang/include/clang/AST/Type.h to TypeBase.h (#155049)Matheus Izvekov15-15/+15
This is a preparatory patch, to be able to provide inline definitions for `Type` functions which depend on `Decl{,CXX}.h`. As the latter also depends on `Type.h`, this would not be possible without some reorganizing. Splitting this rename into its own patch allows git to track this as a rename, and preserve all git history, and not force any code reformatting. A later NFC patch will reintroduce `Type.h` as redirection to `TypeBase.h`, rewriting most places back to directly including `Type.h` instead of `TypeBase.h`, leaving only a handful of places where this is necessary. Then yet a later patch will exploit this by making more stuff inline.
2025-08-25[clang] NFC: change more places to use Type::getAsTagDecl and friends (#155313)Matheus Izvekov2-8/+7
This changes a bunch of places which use getAs<TagType>, including derived types, just to obtain the tag definition. This is preparation for #155028, offloading all the changes that PR used to introduce which don't depend on any new helpers.
2025-08-20[clang][dataflow] Fix uninitialized memory bug. (#154575)Yitzhak Mandelbaum1-3/+3
Commit #3ecfc03 introduced a bug involving an uninitialized field in `exportLogicalContext`. This patch initializes the field properly.
2025-08-20[clang][dataflow] Transfer more cast expressions. (#153066)Samira Bakon1-12/+59
Transfer all casts by kind as we currently do implicit casts. This obviates the need for specific handling of static casts. Also transfer CK_BaseToDerived and CK_DerivedToBase and add tests for these and missing tests for already-handled cast types. Ensure that CK_BaseToDerived casts result in modeling of the fields of the derived class.
2025-08-20Thread Safety Analysis: Graduate ACQUIRED_BEFORE() and ACQUIRED_AFTER() from beta features (#152853)Marco Elver1-2/+1
Both these attributes were introduced in ab1dc2d54db5 ("Thread Safety Analysis: add support for before/after annotations on mutexes") back in 2015 as "beta" features. Anecdotally, we've been using `-Wthread-safety-beta` for years without problems. Furthermore, this feature requires the user to explicitly use these attributes in the first place. After 10 years, let's graduate the feature to the stable feature set, and reserve `-Wthread-safety-beta` for new upcoming features.
2025-08-19[clang] Replace SmallSet with SmallPtrSet (NFC) (#154262)Kazu Hirata1-1/+1
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types:

    template <typename PointeeType, unsigned N>
    class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};

We only have 30 instances that rely on this "redirection", with about half of them under clang/. Since the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types. I'm planning to remove the redirection eventually.
2025-08-19[LifetimeSafety] Improve Origin information in debug output (#153951)Utkarsh Saxena1-14/+34
The previous debug output only showed numeric IDs for origins, making it difficult to understand what each origin represented. This change makes the debug output more informative by showing what kind of entity each origin refers to (declaration or expression) and additional details like declaration names or expression class names. This improved output makes it easier to debug and understand the lifetime safety analysis.
2025-08-18[clang][dataflow] Add support for serialization and deserialization. (#152487)Yitzhak Mandelbaum3-0/+233
Adds support for compact serialization of Formulas, and a corresponding parse function. Extends Environment and AnalysisContext with necessary functions for serializing and deserializing all formula-related parts of the environment.
2025-08-18[LifetimeSafety] Implement a basic use-after-free diagnostic (#149731)Utkarsh Saxena1-48/+216
Implement use-after-free detection in the lifetime safety analysis with two warning levels.

- Added a `LifetimeSafetyReporter` interface for reporting lifetime safety issues
- Created two warning levels:
  - Definite errors (reported with `-Wexperimental-lifetime-safety-permissive`)
  - Potential errors (reported with `-Wexperimental-lifetime-safety-strict`)
- Implemented a `LifetimeChecker` class that analyzes loan propagation and expired loans to detect use-after-free issues.
- Added tracking of use sites through a new `UseFact` class.
- Enhanced the `ExpireFact` to track the expressions where objects are destroyed.
- Added test cases for both definite and potential use-after-free scenarios.

The implementation now tracks pointer uses and can determine when a pointer is dereferenced after its loan has been expired, with appropriate diagnostics. The two warning levels provide flexibility: definite errors for high-confidence issues and potential errors for cases that depend on control flow.
2025-08-09[clang] Improve nested name specifier AST representation (#147835)Matheus Izvekov4-9/+17
This is a major change on how we represent nested name qualifications in the AST.

* The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
* The ElaboratedType node is removed; all type nodes to which it could previously apply can now store the elaborated keyword and name qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well: ![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831)

This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscellaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just to make rebasing easier. I plan to rename it back after this lands.

Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
2025-08-07[-Wunsafe-buffer-usage] Do not warn about class methods with libc function names (#151270)Ziqing Luo1-0/+8
This commit fixes a false positive where calls to C++ class methods that share a name with a libc function were warned about. For example,

```
struct T {void strcpy() const;};

void test(const T& t) {
  t.strcpy(); // no warn
}
```

rdar://156264388
2025-08-04Thread safety analysis: Allocate FactEntrys with BumpPtrAllocator (#149660)Aaron Puchert1-87/+145
The FactManager managing the FactEntrys stays alive for the analysis of a single function. We are already using that by allocating TIL S-expressions via BumpPtrAllocator. We can do the same with FactEntrys. If we allocate the facts in an arena, we won't get destructor calls for them, and they better be trivially destructible. This required replacing the SmallVector member of ScopedLockableFactEntry with TrailingObjects. FactEntrys are now passed around by plain pointer. However, to hide the allocation of TrailingObjects from users, we introduce `create` methods, and this allows us to make the constructors private.
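The arena idea can be sketched with a tiny bump allocator. This is an illustration under stated assumptions, not LLVM's `BumpPtrAllocator`: `BumpArena`, `FactEntry`, and the fixed 4096-byte slab size are hypothetical. Because the arena never runs destructors, the sketch requires trivially destructible types, and objects are handed out as plain pointers through a `create` function.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical bump arena: memory is released all at once when the arena
// is destroyed, and no destructors are ever called for the objects in it.
class BumpArena {
  static constexpr std::size_t SlabSize = 4096;
  std::vector<std::unique_ptr<unsigned char[]>> Slabs;
  std::size_t Used = SlabSize; // forces a slab on the first allocation

public:
  template <typename T, typename... Args> T *create(Args &&...As) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "arena objects never get a destructor call");
    static_assert(sizeof(T) <= SlabSize, "object larger than a slab");
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types are not supported in this sketch");
    // Round the current offset up to the alignment of T.
    std::size_t Offset = (Used + alignof(T) - 1) / alignof(T) * alignof(T);
    if (Offset + sizeof(T) > SlabSize) {
      Slabs.push_back(std::make_unique<unsigned char[]>(SlabSize));
      Offset = 0;
    }
    Used = Offset + sizeof(T);
    return new (Slabs.back().get() + Offset) T{std::forward<Args>(As)...};
  }
};

// Hypothetical trivially destructible fact record.
struct FactEntry {
  int Kind;
  int Loc;
};
```

Pointers returned by `create` stay valid until the arena itself is destroyed, which matches the lifetime described above: one arena per analyzed function.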
2025-08-03[LifetimeSafety] Handle pruned-edges (null blocks) in dataflow (#150670)Utkarsh Saxena1-0/+2
Fix a crash in the lifetime safety dataflow analysis when handling null CFG blocks. Added a null check for adjacent blocks in the dataflow analysis algorithm to prevent dereferencing null pointers. This occurs when processing CFG blocks with unreachable successors or predecessors. Original crash: https://compiler-explorer.com/z/qfzfqG5vM Fixes https://github.com/llvm/llvm-project/issues/150095
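A minimal sketch of the kind of guard this fix describes, assuming a simplified CFG where a pruned edge is represented by a null successor pointer. `Block` and `visitReachable` are hypothetical names for the example and do not correspond to Clang's `CFGBlock` API.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <vector>

// Hypothetical CFG block. A pruned (unreachable) edge is stored as nullptr.
struct Block {
  int ID;
  std::vector<const Block *> Succs;
};

// Worklist traversal that skips null successors instead of dereferencing
// them. Returns the IDs of all blocks reachable from Entry.
std::set<int> visitReachable(const Block &Entry) {
  std::set<int> Visited;
  std::deque<const Block *> Worklist{&Entry};
  while (!Worklist.empty()) {
    const Block *B = Worklist.front();
    Worklist.pop_front();
    if (!Visited.insert(B->ID).second)
      continue; // already processed
    for (const Block *Succ : B->Succs) {
      if (!Succ)
        continue; // pruned edge: nothing to propagate to
      Worklist.push_back(Succ);
    }
  }
  return Visited;
}
```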
2025-08-03Thread safety analysis: Don't warn on acquiring reentrant capability (#150857)Aaron Puchert1-1/+1
The point of reentrant capabilities is that they can be acquired multiple times, so they should probably be excluded from requiring a negative capability on acquisition via -Wthread-safety-negative. However, we still propagate explicit negative requirements.
2025-08-01[Analysis] Avoid creating a temporary instance of std::string (NFC) (#151625)Kazu Hirata1-2/+1
hasName takes StringRef, so we don't need to create a temporary instance of std::string.
2025-07-30[-Wunsafe-buffer-usage] Support safe patterns of "%.*s" in printf functions (#145862)Ziqing Luo1-15/+74
The character buffer passed to a "%.*s" specifier may be safely bound if the precision is properly specified, even if the buffer does not guarantee null-termination. For example,

```
void f(std::span<char> span) {
  printf("%.*s", (int)span.size(), span.data()); // "span.data()" does not guarantee null-termination but is safely bound by "span.size()", so this call is safe
}
```

rdar://154072130
2025-07-23[LifetimeSafety] Add loan expiry analysis (#148712)Utkarsh Saxena1-6/+95
This PR adds the `ExpiredLoansAnalysis` class to track which loans have expired. The analysis uses a dataflow lattice (`ExpiredLattice`) to maintain the set of expired loans at each program point. This is a very lightweight dataflow analysis and is expected to reach a fixed point in ~2 iterations. In principle, this does not need a dataflow analysis, but one is used for convenience and to keep the code lean.
2025-07-22Reapply "[LifetimeSafety] Revamp test suite using unittests (#149158)"Utkarsh Saxena1-46/+135
This reverts commit 54b50681ca0fd1c0c6ddb059c88981a45e2f1b19.
2025-07-22Revert "[LifetimeSafety] Revamp test suite using unittests (#149158)"Utkarsh Saxena1-135/+46
This reverts commit 688ea048affe8e79221ea1a8c376bcf20ef8f3bb.
2025-07-22[analyzer] Prettify checker registration and unittest code (#147797)Donát Nagy3-20/+24
This commit tweaks the interface of `CheckerRegistry::addChecker` to make it more practical for plugins and tests:

- The parameter `IsHidden` now defaults to `false` even in the non-templated overload (because setting it to true is unusual, especially in plugins).
- The parameter `DocsUri` defaults to the dummy placeholder string `"NoDocsUri"` because (as of now) nothing queries its value from the checker registry (it's only used by the logic that generates the clang-tidy documentation, but that loads it directly from `Checkers.td` without involving the `CheckerRegistry`), so there is no reason to demand specifying this value.

In addition to propagating these changes, this commit clarifies, corrects and extends lots of comments and performs various minor code quality improvements in the code of unit tests and example plugins.

I originally wrote the bulk of this commit when I was planning to add an extra parameter to `addChecker` in order to implement some technical details of the CheckerFamily framework. In the end I decided against adding that extra parameter, so this cleanup was left out of the PR https://github.com/llvm/llvm-project/pull/139256 and I'm merging it now as a separate commit (after minor tweaks).

This commit is mostly NFC: the only functional change is that the analyzer will be compatible with plugins that rely on the default argument values and don't specify `IsHidden` or `DocsUri`. (But existing plugin code will remain valid as well.)
2025-07-22[LifetimeSafety] Revamp test suite using unittests (#149158)Utkarsh Saxena1-46/+135
Refactor the Lifetime Safety Analysis infrastructure to support unit testing.

- Created a public API class `LifetimeSafetyAnalysis` that encapsulates the analysis functionality
- Added support for test points via a special `TestPointFact` that can be used to mark specific program points
- Added unit tests that verify loan propagation in various code patterns
2025-07-22[LifetimeSafety] Add per-program-point lattice tracking (#149199)Utkarsh Saxena1-8/+31
Add per-program-point state tracking to the dataflow analysis framework.

- Added a `ProgramPoint` type representing a pair of a CFGBlock and a Fact within that block
- Added a `PerPointStates` map to store lattice states at each program point
- Modified the `transferBlock` method to store intermediate states after each fact is processed
- Added a `getLoans` method to the `LoanPropagationAnalysis` class that uses program points

This change enables more precise analysis by tracking program state at each individual program point rather than just at block boundaries. This is necessary for answering queries about the state of loans, origins, and other properties at specific points in the program, which is required for error reporting in the lifetime safety analysis.
2025-07-16[LifetimeSafety] Support bidirectional dataflow analysis (#148967)Utkarsh Saxena1-49/+59
Generalize the dataflow analysis to support both forward and backward analyses. Some program analyses are naturally expressed as backward dataflow problems (like liveness analysis). This change enables the framework to support both forward analyses (like the loan propagation analysis) and backward analyses with the same infrastructure.
2025-07-16[LifetimeSafety] Make the dataflow analysis generic (#148222)Utkarsh Saxena1-177/+218
Refactored the lifetime safety analysis to use a generic dataflow framework with a policy-based design.

### Changes

- Introduced a generic `DataflowAnalysis` template class that can be specialized for different analyses
- Renamed `LifetimeLattice` to `LoanPropagationLattice` to better reflect its purpose
- Created a `LoanPropagationAnalysis` class that inherits from the generic framework
- Moved transfer functions from the standalone `Transferer` class into the analysis class
- Restructured the code to separate the dataflow engine from the specific analysis logic
- Updated debug output and test expectations to use the new class names

### Motivation

In order to add more analyses, e.g. [loan expiry](https://github.com/llvm/llvm-project/pull/148712) and origin liveness, the previous implementation would have required separate, nearly identical dataflow runners for each analysis. This change creates a single, reusable component, which will make it much simpler to add subsequent analyses without repeating boilerplate code.

This is quite close to the existing dataflow framework!
2025-07-15[NFC][-Wunsafe-buffer-usage] Refactor safe pattern check for pointer-size pairs (#145626)Ziqing Luo1-101/+109
Refactor the safe pattern analysis of pointer and size expression pairs so that the check can be re-used in more places. For example, it can be used to check whether the following cases are safe:

- `std::span<T>{ptr, size} // span construction`
- `snprintf(ptr, size, "%s", ...) // unsafe libc call`
- `printf("%.*s", size, ptr) // unsafe libc call`
2025-07-14[clang] Fix -Wuninitialized for values passed by const pointers (#147221)Igor Kudrin1-2/+0
This enables producing a "variable is uninitialized" warning when a value is passed to a pointer-to-const argument:

```
void foo(const int *);

void test() {
  int *v;
  foo(v);
}
```

Fixes #37460
2025-07-14[clang] Add -Wuninitialized-const-pointer (#148337)Igor Kudrin1-9/+17
This option is similar to -Wuninitialized-const-reference, but diagnoses the passing of an uninitialized value via a const pointer, like in the following code:
```
void foo(const int *);
void test() {
  int v;
  foo(&v);
}
```
This is an extract from #147221 as suggested in [this comment](https://github.com/llvm/llvm-project/pull/147221#discussion_r2190998730).
2025-07-14[LifetimeSafety] Implement dataflow analysis for loan propagation (#148065)Utkarsh Saxena1-1/+253
This patch introduces the core dataflow analysis infrastructure for the C++ Lifetime Safety checker. This change implements the logic to propagate "loan" information across the control-flow graph. The primary goal is to compute a fixed-point state that accurately models which pointer (Origin) can hold which borrow (Loan) at any given program point.

Key components
* `LifetimeLattice`: Defines the dataflow state, mapping an `OriginID` to a `LoanSet` using `llvm::ImmutableMap`.
* `Transferer`: Implements the transfer function, which updates the `LifetimeLattice` by applying the lifetime facts (Issue, AssignOrigin, etc.) generated for each basic block.
* `LifetimeDataflow`: A forward dataflow analysis driver that uses a worklist algorithm to iterate over the CFG until the lattice state converges.

The existing test suite has been extended to check the final dataflow results. This work is a prerequisite for the final step of the analysis: consuming these results to identify and report lifetime violations.
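The lattice and transfer function described above can be pictured with a small standalone model. This is a sketch under assumptions, not the patch's code: it uses `std::map` and `std::set` in place of `llvm::ImmutableMap`, plain `int` identifiers, a hypothetical `ToyFact` with only the Issue and AssignOrigin kinds, and it assumes that Issue makes an origin hold exactly one loan while AssignOrigin copies the source origin's loans. The names `ToyLattice`, `transferBlock`, and `runBranchExample` are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using OriginID = int; // plain ints here; an assumption of this sketch
using LoanID = int;
using LoanSet = std::set<LoanID>;

// Dataflow state: which loans each origin (pointer) may currently hold.
struct ToyLattice {
  std::map<OriginID, LoanSet> Origins;

  bool operator==(const ToyLattice &Other) const {
    return Origins == Other.Origins;
  }

  // Join is a pointwise union: an origin may hold a loan after the merge
  // point if it may hold it on either incoming path.
  ToyLattice join(const ToyLattice &Other) const {
    ToyLattice Result = *this;
    for (const auto &Entry : Other.Origins)
      Result.Origins[Entry.first].insert(Entry.second.begin(),
                                         Entry.second.end());
    return Result;
  }
};

struct ToyFact {
  enum Kind { Issue, AssignOrigin } K;
  OriginID Dest; // origin written by the fact
  int Arg;       // loan for Issue, source origin for AssignOrigin
};

// Applies the facts of one basic block, in order, to the incoming state.
ToyLattice transferBlock(ToyLattice State, const std::vector<ToyFact> &Facts) {
  for (const ToyFact &F : Facts) {
    if (F.K == ToyFact::Issue) {
      State.Origins[F.Dest] = LoanSet{F.Arg};
    } else {
      assert(F.K == ToyFact::AssignOrigin);
      LoanSet Src;
      auto It = State.Origins.find(F.Arg);
      if (It != State.Origins.end())
        Src = It->second;
      State.Origins[F.Dest] = std::move(Src);
    }
  }
  return State;
}

// Models: int *p; if (c) p = &a; else p = &b; int *q = p;
ToyLattice runBranchExample() {
  const OriginID P = 0, Q = 1;
  const LoanID LoanA = 10, LoanB = 11;
  ToyLattice Entry;
  ToyLattice Then = transferBlock(Entry, {{ToyFact::Issue, P, LoanA}});
  ToyLattice Else = transferBlock(Entry, {{ToyFact::Issue, P, LoanB}});
  return transferBlock(Then.join(Else), {{ToyFact::AssignOrigin, Q, P}});
}
```

In this model, after the merge both `p` and `q` may hold either loan, so `runBranchExample().Origins.at(1)` is the set `{10, 11}`. A driver would repeat the join and transfer steps over the CFG until no block's state changes.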
2025-07-10[clang] Combine ConstRefUse with other warnings for uninitialized values ↵Igor Kudrin1-8/+5
(#147898) This helps to avoid duplicating warnings in cases like:
```
> cat test.cpp
void bar(int);
void foo(const int &);
void test(bool a) {
  int v = v;
  if (a)
    bar(v);
  else
    foo(v);
}
> clang++.exe test.cpp -fsyntax-only -Wuninitialized
test.cpp:4:11: warning: variable 'v' is uninitialized when used within its own initialization [-Wuninitialized]
    4 |   int v = v;
      |       ~   ^
test.cpp:4:11: warning: variable 'v' is uninitialized when used within its own initialization [-Wuninitialized]
    4 |   int v = v;
      |       ~   ^
2 warnings generated.
```
2025-07-10[clang][NFC] Remove an unused parameter in CFGBlockValues::getValue() (#147897)Igor Kudrin1-6/+6
The second parameter is unused since 6080d32194.
2025-07-10[LifetimeSafety] Introduce intra-procedural analysis in Clang (#142313)Utkarsh Saxena2-0/+511
This patch introduces the initial implementation of the intra-procedural, flow-sensitive lifetime analysis for Clang, as proposed in the recent RFC: https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-analysis-in-clang/86291

The primary goal of this initial submission is to establish the core dataflow framework and gather feedback on the overall design, fact representation, and testing strategy. The focus is on the dataflow mechanism itself rather than exhaustively covering all C++ AST edge cases, which will be addressed in subsequent patches.

#### Key Components

* **Conceptual Model:** Introduces the fundamental concepts of `Loan`, `Origin`, and `Path` to model memory borrows and the lifetime of pointers.
* **Fact Generation:** A frontend pass traverses the Clang CFG to generate a representation of lifetime-relevant events, such as pointer assignments, taking an address, and variables going out of scope.
* **Testing:** `llvm-lit` tests validate the analysis by checking the generated facts.

### Next Steps

*(Not covered in this PR but planned for subsequent patches)*

The following functionality is planned for the upcoming patches to build upon this foundation and make the analysis usable in practice:

* **Dataflow Lattice:** A dataflow lattice used to map each pointer's symbolic `Origin` to the set of `Loans` it may contain at any given program point.
* **Fixed-Point Analysis:** A worklist-based, flow-sensitive analysis that propagates the lattice state across the CFG to a fixed point.
* **Placeholder Loans:** Introduce placeholder loans to represent the lifetimes of function parameters, forming the basis for analysis involving function calls.
* **Annotation and Opaque Call Handling:** Use placeholder loans to correctly model **function calls**, both by respecting `[[clang::lifetimebound]]` annotations and by conservatively handling opaque/un-annotated functions.
* **Error Reporting:** Implement the final analysis phase that consumes the dataflow results to generate user-facing diagnostics. This will likely require liveness analysis to identify live origins holding expired loans.
* **Strict vs. Permissive Modes:** Add the logic to support both high-confidence (permissive) and more comprehensive (strict) warning levels.
* **Expanded C++ Coverage:** Broaden support for common patterns, including the lifetimes of temporary objects and pointers within aggregate types (structs/containers).
* Performance benchmarking
* Capping number of iterations or number of times a CFGBlock is processed.

---------

Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
2025-07-01[Analysis] Use range-based for loops (NFC) (#146466)Kazu Hirata6-20/+13