aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-05-31[Analysis] Remove unused includes (NFC) (#142255)Kazu Hirata1-3/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2024-11-26[clang][analysis][NFC]add static for internal linkage function (#117481)Congcong Cai1-6/+8
Detected by misc-use-internal-linkage
2024-11-15[Clang] [NFC] Refactor AST visitors in Sema and the static analyser to use ↵Sirraide1-7/+7
DynamicRecursiveASTVisitor (#115144) This pr refactors all recursive AST visitors in `Sema`, `Analyze`, and `StaticAnalysis` to inherit from DRAV instead. This is over half of the visitors that inherit from RAV directly. See also #115132, #110040, #93462 LLVM Compile-Time Tracker link for this branch: https://llvm-compile-time-tracker.com/compare.php?from=5adb5c05a2e9f31385fbba8b0436cbc07d91a44d&to=b58e589a86c06ba28d4d90613864d10be29aa5ba&stat=instructions%3Au
2024-07-26[clang][dataflow] Handle CXXInheritedCtorInitExpr in ResultObjectVisitor. ↵Pasquale Riello1-1/+1
(#99616) `CXXInheritedCtorInitExpr` is another of the node kinds that should be considered an "original initializer". An assertion failure in `assert(Children.size() == 1)` happens without this fix. --------- Co-authored-by: martinboehme <mboehme@google.com>
2024-07-22[clang][dataflow] Handle this-capturing lambdas in field initializers. (#99519)Samira Bazuzi1-6/+15
We previously would assume these lambdas appeared inside a method definition and end up crashing.
2024-07-13[clang][dataflow]Propagate the result object location for ↵Samira Bazuzi1-2/+8
CXXDefaultInitExpr. (#98490) These are not "original initializers"; the single node underneath represents the initializing node.
2024-06-11[clang][dataflow] Handle `AtomicExpr` in `ResultObjectVisitor`. (#94963)martinboehme1-1/+1
This is one of the node kinds that should be considered an "original initializer". The patch adds a test that was causing an assertion failure in `assert(Children.size() == 1)` without the fix.
2024-06-03[clang][dataflow] Rewrite `getReferencedDecls()` with a ↵martinboehme1-36/+1
`RecursiveASTVisitor`. (#93461) We previously had a hand-rolled recursive traversal here that was exactly what `RecursiveASTVistor` does anyway. Using the visitor not only eliminates the explicit traversal logic but also allows us to introduce a common visitor base class for `getReferencedDecls()` and `ResultObjectVisitor`, ensuring that the two are consistent in terms of the nodes they visit. Inconsistency between these two has caused crashes in the past when `ResultObjectVisitor` tried to propagate result object locations to entities that weren't modeled becasue `getReferencedDecls()` didn't visit them.
2024-05-15[clang][dataflow] Fully support Environment construction for Stmt analysis. ↵Samira Bazuzi1-50/+57
(#91616) Assume in fewer places that the analysis is of a `FunctionDecl`, and initialize the `Environment` properly for `Stmt`s. Moves constructors for `Environment` to header to make it more obvious that there are only minor differences between them and very little initialization in the constructors. Tested with check-clang-tooling.
2024-05-06[clang][dataflow] Don't propagate result objects in unevaluated contexts ↵martinboehme1-0/+11
(reland #90438) (#91172) This relands #90348 with a fix for a [buildbot failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446) caused by the test being run with `-fno-rtti`.
2024-05-02Revert "[clang][dataflow] Don't propagate result objects in unevaluated ↵Weaver1-11/+0
contexts (#90438)" This reverts commit 597a3150e932a9423c65b5ea4b53dd431aff5865. Caused test failure on the following buildbot: https://lab.llvm.org/buildbot/#/builders/216/builds/38446
2024-05-02[clang][dataflow] Don't propagate result objects in unevaluated contexts ↵martinboehme1-0/+11
(#90438) Trying to do so can cause crashes -- see newly added test and the comments in the fix. We're starting to see a repeating pattern here: We're getting crashes because `ResultObjectVisitor` and `getReferencedDecls()` don't agree on which parts of the AST to visit and, hence, which fields should be modeled. I think we should ensure consistency between these two parts of the code by using a `RecursiveASTVisitor` in `getReferencedDecls()`[^1]; the `Traverse...()` functions that control which parts of the AST we visit would go in a common base class that would be used for both `ResultObjectVisitor` and `getReferencedDecls()`. I'd like to focus this PR, however, on a targeted fix for the current crash and postpone the refactoring to a later PR (which will be easier to revert if there are unintended side-effects). [^1]: As an added bonus, this would make the code better structured and more efficient than the current sequence of `if (dyn_cast<T>(...))` statements).
2024-04-25[clang][dataflow] Crash fix for `widenDistinctValues()`. (#89895)martinboehme1-4/+9
We used to crash if the previous iteration contained a `BoolValue` and the current iteration contained an `IntegerValue`. The accompanying test sets up this situation -- see comments there for details. While I'm here, clean up the tests for integral casts to use the test helpers we have available now. I was looking at these tests to understand how we handle integral casts, and the test helpers make the tests easier to read.
2024-04-25[clang][dataflow] Don't propagate result objects in nested declarations. ↵martinboehme1-0/+12
(#89903) Trying to do so can cause crashes -- see newly added test and the comments in the fix.
2024-04-23Reapply "[clang][dataflow] Model conditional operator correctly." with fixes ↵martinboehme1-22/+24
(#89596) I reverted https://github.com/llvm/llvm-project/pull/89213 beause it was causing buildbots to fail with assertion failures. Embarrassingly, it turns out I had been running tests locally in `Release` mode, i.e. with `assert()` compiled away. This PR re-lands #89213 with fixes for the failing assertions.
2024-04-22Revert "[clang][dataflow] Model conditional operator correctly." (#89577)martinboehme1-24/+22
Reverts llvm/llvm-project#89213 This is causing buildbot failures.
2024-04-22[clang][dataflow] Model conditional operator correctly. (#89213)martinboehme1-22/+24
2024-04-19[clang][dataflow][NFC] Fix code formatting in DataflowEnvironment.cpp (#89352)martinboehme1-8/+6
For some reason, when I merged #89235, two lines were mis-formatted. This patch corrects this; while I'm here, I'm also correcting other existing formatting errors.
2024-04-19[clang][nullability] Remove `RecordValue`. (#89052)martinboehme1-72/+26
This class no longer serves any purpose; see also the discussion here: https://reviews.llvm.org/D155204#inline-1503204 A lot of existing tests in TransferTest.cpp check for the existence of `RecordValue`s. Some of these checks are now simply redundant and have been removed. In other cases, tests were checking for the existence of a `RecordValue` as a way of testing whether a record has been initialized. I have typically changed these test to instead check whether a field of the record has a value.
2024-04-19[clang][dataflow] Support `CXXParenListInitExpr` in ↵martinboehme1-19/+29
`PropagateResultObject()`. (#89235)
2024-04-18Revert "[clang][dataflow] Refactor `PropagateResultObject()` with a switch ↵martinboehme1-50/+36
statement." (#89176) Reverts llvm/llvm-project#88865 There were failing tests in the CI that I didn't notice. Sorry.
2024-04-18[clang][dataflow] Refactor `PropagateResultObject()` with a switch ↵martinboehme1-36/+50
statement. (#88865) See also discussion in #88726.
2024-04-17[clang][dataflow] Treat `BuiltinBitCastExpr` correctly in ↵martinboehme1-1/+5
`PropagateResultObject()`. (#88875) This patch includes a test that assert-fails without the fix.
2024-04-17[clang][dataflow] Support `StmtExpr` in `PropagateResultObject()`. (#88872)martinboehme1-0/+5
This patch adds a test that assert-fails without the fix.
2024-04-16[clang][dataflow] Expose getReferencedDecls and relocate free functions. ↵Samira Bazuzi1-172/+5
(#88754) Moves free functions from DataflowEnvironment.h/cc and DataflowAnalysisContext.h/cc to RecordOps and a new ASTOps and exposes them as needed for current use and to expose getReferencedDecls for out-of-tree use. Minimal change in functionality, only to modify the return type of getReferenceDecls to return the collected decls instead of using output params. Tested with `ninja check-clang-tooling`.
2024-04-16[clang][dataflow] Fix result object location for builtin `<=>`. (#88726)martinboehme1-0/+5
The newly added test causes an assertion failure in `PropagateResultObject()` without the fix added here.
2024-04-11[clang][dataflow] Reland #87320: Propagate locations from result objects to ↵martinboehme1-120/+307
initializers. (#88316) This relands #87320 and additionally removes the now-unused function `isOriginalRecordConstructor()`, which was causing buildbots to fail.
2024-04-10Revert "[clang][dataflow] Propagate locations from result objects to ↵martinboehme1-307/+98
initializers." (#88315) Reverts llvm/llvm-project#87320 This is causing buildbots to fail because `isOriginalRecordConstructor()` is now unused.
2024-04-10[clang][dataflow] Propagate locations from result objects to initializers. ↵martinboehme1-98/+307
(#87320) Previously, we were propagating storage locations the other way around, i.e. from initializers to result objects, using `RecordValue::getLoc()`. This gave the wrong behavior in some cases -- see the newly added or fixed tests in this patch. In addition, this patch now unblocks removing the `RecordValue` class entirely, as we no longer need `RecordValue::getLoc()`. With this patch, the test `TransferTest.DifferentReferenceLocInJoin` started to fail because the framework now always uses the same storge location for a `MaterializeTemporaryExpr`, meaning that the code under test no longer set up the desired state where a variable of reference type is mapped to two different storage locations in environments being joined. Rather than trying to modify this test to set up the test condition again, I have chosen to replace the test with an equivalent test in DataflowEnvironmentTest.cpp that sets up the test condition directly; because this test is more direct, it will also be less brittle in the face of future changes.
2024-04-07Fix warnings discovered by #87348 [-Wunused-but-set-variable]NAKAMURA Takumi1-0/+1
2024-04-04[clang][dataflow] Refactor `widen` API to be explicit about change effect. ↵Yitzhak Mandelbaum1-24/+25
(#87233) The previous API relied on pointer equality of inputs and outputs to signal whether a change occured. This was too subtle and led to bugs in practice. It was also very limiting: the override could not return an equivalent (but not identical) value.
2024-03-28[clang][dataflow] Introduce a helper class for handling record initializer ↵martinboehme1-0/+36
lists. (#86675) This is currently only used in one place, but I'm working on a patch that will use this from a second place. And I think this already improves the readability of the one place this is used so far.
2024-03-26[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with ↵smanna121-1/+1
auto keyword (#85962) Reported by Static Analyzer Tool: In clang::dataflow::Environment::initialize(): Using the auto keyword without an & causes the copy of an object of type LambdaCapture
2024-03-18[clang][dataflow] Fix `getResultObjectLocation()` on `CXXDefaultArgExpr`. ↵martinboehme1-0/+1
(#85072) This patch includes a test that causes an assertion failure without the other changes in this patch.
2024-03-08[clang][dataflow] When analyzing ctors, don't initialize fields of `*this` ↵martinboehme1-2/+19
with values. (#84164) This is the constructor's job, and we want to be able to test that it does this.
2024-03-07[clang][nullability] Don't discard expression state before end of ↵martinboehme1-4/+24
full-expression. (#82611) In https://github.com/llvm/llvm-project/pull/72985, I made a change to discard expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each basic block. I did so with the claim that "we never need to access entries from these maps outside of the current basic block", noting that there are exceptions to this claim when control flow happens inside a full-expression (the operands of `&&`, `||`, and the conditional operator live in different basic blocks than the operator itself) but that we already have a mechanism for retrieving the values of these operands from the environment for the block they are computed in. It turns out, however, that the operands of these operators aren't the only expressions whose values can be accessed from a different basic block; when control flow happens within a full-expression, that control flow can be "interposed" between an expression and its parent. Here is an example: ```cxx void f(int*, int); bool cond(); void target() { int i = 0; f(&i, cond() ? 1 : 0); } ``` ([godbolt](https://godbolt.org/z/hrbj1Mj3o)) In the CFG[^1] , note how the expression for `&i` is computed in block B4, but the parent of this expression (the `CallExpr`) is located in block B1. The the argument expression `&i` and the `CallExpr` are essentially "torn apart" into different basic blocks by the conditional operator in the second argument. In other words, the edge between the `CallExpr` and its argument `&i` straddles the boundary between two blocks. I used to think that this scenario -- where an edge between an expression and one of its children straddles a block boundary -- could only happen between the expression that triggers the control flow (`&&`, `||`, or the conditional operator) and its children, but the example above shows that other expressions can be affected as well; the control flow is still triggered by `&&`, `||` or the conditional operator, but the expressions affected lie outside these operators. Discarding expression state too soon is harmful. For example, an analysis that checks the arguments of the `CallExpr` above would not be able to retrieve a value for the `&i` argument. This patch therefore ensures that we don't discard expression state before the end of a full-expression. In other cases -- when the evaluation of a full-expression is complete -- we still want to discard expression state for the reasons explained in https://github.com/llvm/llvm-project/pull/72985 (avoid performing joins on boolean values that are no longer needed, which unnecessarily extends the flow condition; improve debuggability by removing clutter from the expression state). The impact on performance from this change is about a 1% slowdown in the Crubit nullability check benchmarks: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 71.9µs ± 1% 71.9µs ± 2% ~ (p=0.987 n=15+20) BM_PointerAnalysisIntLoop 190µs ± 1% 192µs ± 2% +1.06% (p=0.000 n=14+16) BM_PointerAnalysisPointerLoop 325µs ± 5% 324µs ± 4% ~ (p=0.496 n=18+20) BM_PointerAnalysisBranch 193µs ± 0% 192µs ± 4% ~ (p=0.488 n=14+18) BM_PointerAnalysisLoopAndBranch 521µs ± 1% 525µs ± 3% +0.94% (p=0.017 n=18+19) BM_PointerAnalysisTwoLoops 337µs ± 1% 341µs ± 3% +1.19% (p=0.004 n=17+19) BM_PointerAnalysisJoinFilePath 1.62ms ± 2% 1.64ms ± 3% +0.92% (p=0.021 n=20+20) BM_PointerAnalysisCallInLoop 1.14ms ± 1% 1.15ms ± 4% ~ (p=0.135 n=16+18) ``` [^1]: ``` [B5 (ENTRY)] Succs (1): B4 [B1] 1: [B4.9] ? [B2.1] : [B3.1] 2: [B4.4]([B4.6], [B1.1]) Preds (2): B2 B3 Succs (1): B0 [B2] 1: 1 Preds (1): B4 Succs (1): B1 [B3] 1: 0 Preds (1): B4 Succs (1): B1 [B4] 1: 0 2: int i = 0; 3: f 4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int)) 5: i 6: &[B4.5] 7: cond 8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void)) 9: [B4.8]() T: [B4.9] ? ... : ... Preds (1): B5 Succs (2): B2 B3 [B0 (EXIT)] Preds (1): B1 ```
2024-03-01[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986)martinboehme1-5/+20
This fixes a crash introduced by https://github.com/llvm/llvm-project/pull/82348 but also adds additional handling to make sure that we treat empty initializer lists for both unions and structs/classes correctly (see tests added in this patch).
2024-02-26Revert "[clang][dataflow] Correctly handle `InitListExpr` of union type." ↵Samira Bazuzi1-14/+4
(#82856) Reverts llvm/llvm-project#82348, which caused crashes when analyzing empty InitListExprs for unions, e.g. ```cc union U { double double_value; int int_value; }; void target() { U value; value = {}; } ``` Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>
2024-02-21[clang][dataflow] Correctly handle `InitListExpr` of union type. (#82348)martinboehme1-4/+14
2024-02-13[clang][Dataflow] Fix unnecessary copy in `initializeFieldsWithValues` (NFC)Antonio Frighetto1-1/+1
2024-02-13[clang][dataflow] Add `Environment::initializeFieldsWithValues()`. (#81239)martinboehme1-27/+47
This function will be useful when we change the behavior of record-type prvalues so that they directly initialize the associated result object. See also the comment here for more details: https://github.com/llvm/llvm-project/blob/9e73656af524a2c592978aec91de67316c5ce69f/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L354 As part of this patch, we document and assert that synthetic fields may not have reference type. There is no practical use case for this: A `StorageLocation` may not have reference type, and a synthetic field of the corresponding non-reference type can serve the same purpose.
2024-02-06[clang][dataflow] Add new `join` API and replace existing `merge` ↵Yitzhak Mandelbaum1-31/+28
implementations. (#80361) This patch adds a new interface for the join operation, now properly called `join`. Originally, the framework offered a single `merge` operation, which could serve either as a join or a widening. In practice, though we found this conflation didn't work for non-trivial anlyses, and split of the widening operation (`widen`). This change completes the transition by introducing a proper `join` with strict join semantics. In the process, it drops an odd (and often misused) aspect of `merge` wherein callees could implictly instruct the framework to drop the current entry by returning `false`. This features was never used correctly in analyses and doesn't belong in a join operation, so it is omitted. --------- Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com> Co-authored-by: martinboehme <mboehme@google.com>
2024-01-31[clang][dataflow] Extend debug output for `Environment`. (#79982)martinboehme1-6/+30
* Print `ReturnLoc`, `ReturnVal`, and `ThisPointeeLoc` if applicable. * For entries in `LocToVal` that correspond to declarations, print the names of the declarations next to them. I've removed the FIXME because all relevant fields are now being dumped. I'm not sure we actually need the capability for the caller to specify which fields to dump, so I've simply deleted this part of the comment. Some examples of the output: ![image](https://github.com/llvm/llvm-project/assets/29098113/17d0978f-b86d-4555-8a61-d1f2021f8d59) ![image](https://github.com/llvm/llvm-project/assets/29098113/021dbb24-5fe2-4720-8a08-f48dcf4b88f8)
2024-01-24[clang][dataflow] Eliminate two uses of `RecordValue::getLoc()`. (#79163)martinboehme1-2/+2
This is a small step towards eventually eliminating `RecordValue` entirely.
2024-01-22[clang][dataflow] Treat comma operator correctly in ↵martinboehme1-2/+7
`getResultObjectLocation()`. (#78427)
2024-01-18[clang][dataflow] Consider `CXXDefaultInitExpr` to be an "original record ↵martinboehme1-0/+1
ctor". (#78423) The CFG doesn't contain a CFGElement for the `CXXDefaultInitExpr::getInit()`, so it makes sense to consider the `CXXDefaultInitExpr` to be the expression that originally constructs the object.
2024-01-16[clang][dataflow] Use `ignoreCFGOmittedNodes()` in `setValue()`. (#78245)martinboehme1-4/+6
This is to be consistent with `getValue()`, which also uses `ignoreCFGOmittedNodes()`. Before this fix, it was not possible to retrieve a `Value` from a "CFG omitted" node that had previously been set using `setValue()`; see the accompanying test, which fails without the fix. I discovered this issue while running internal integration tests on https://github.com/llvm/llvm-project/pull/78127.
2024-01-16[clang][dataflow] Tighten checking for existence of a function body. (#78163)martinboehme1-2/+2
In various places, we would previously call `FunctionDecl::hasBody()` (which checks whether any redeclaration of the function has a body, not necessarily the one on which `hasBody()` is being called). This is bug-prone, as a recent bug in Crubit's nullability checker has shown ([fix](https://github.com/google/crubit/commit/4b01ed0f14d953cda20f92d62256e7365d206b2e), [fix for the fix](https://github.com/google/crubit/commit/e0c5d8ddd7d647da483c2ae198ff91d131c12055)). Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()` which, as the name implies, checks whether the specific redeclaration it is being called on has a body. Alternatively, I considered being more lenient and "canonicalizing" to the `FunctionDecl` that has the body if the `FunctionDecl` being passed is a different redeclaration. However, this also risks hiding bugs: A caller might inadverently perform the analysis for all redeclarations of a function and end up duplicating work without realizing it. By accepting only the redeclaration that contains the body, we prevent this. I've checked, and all clients that I'm aware of do currently pass in the redeclaration that contains the function body. Typically this is because they use the `ast_matchers::hasBody()` matcher which, unlike `FunctionDecl::hasBody()`, only matches for the redeclaration containing the body.
2023-12-21[clang][dataflow] Add `Environment::get<>()`. (#76027)martinboehme1-6/+5
This template function casts the result of `getValue()` or `getStorageLocation()` to a given subclass of `Value` or `StorageLocation` (using `cast_or_null`). It's a common pattern to do something like this: ```cxx auto *Val = cast_or_null<PointerValue>(Env.getValue(E)); ``` This can now be expressed more concisely like this: ```cxx auto *Val = Env.get<PointerValue>(E); ``` Instead of adding a new method `get()`, I had originally considered simply adding a template parameter to `getValue()` and `getStorageLocation()` (with a default argument of `Value` or `StorageLocation`), but this results in an undesirable repetition at the callsite, e.g. `getStorageLocation<RecordStorageLocation>(...)`. The `Value` and `StorageLocation` in the method name adds nothing of value when the template argument already contains this information, so it seemed best to shorten the method name to simply `get()`.
2023-12-18[clang][dataflow] Fix an issue with ↵martinboehme1-21/+58
`Environment::getResultObjectLocation()`. (#75483) So far, if there was a chain of record type prvalues, `getResultObjectLocation()` would assign a different result object location to each one. This makes no sense, of course, as all of these prvalues end up initializing the same result object. This patch fixes this by propagating storage locations up through the entire chain of prvalues. The new implementation also has the desirable effect of making it possible to make `getResultObjectLocation()` const, which seems appropriate given that, logically, it is just an accessor.