Age | Commit message (Collapse) | Author | Files | Lines |
|
This commit converts the class DereferenceChecker to the checker family
framework and simplifies some parts of the implementation.
This commit is almost NFC, the only technically "functional" change is
that it removes the hidden modeling checker `DereferenceModeling` which
was only relevant as an implementation detail of the old checker
registration procedure.
|
|
MallocChecker.cpp has a complex heuristic that supresses reports where
the memory release happens during the release of a reference-counted
object (to suppress a significant amount of false positives).
Previously this logic asserted that there is at most one release point
corresponding to a symbol, but it turns out that there is a rare corner
case where the symbol can be released, forgotten and then released
again. This commit removes that assertion to avoid the crash. (As this
issue just affects a bug suppression heuristic, I didn't want to dig
deeper and modify the way the state of the symbol is changed.)
Fixes #149754
|
|
sugar types (#149613)
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
https://github.com/llvm/llvm-project/commit/4c85bf2fe8042c855c9dd5be4b02191e9d071ffd.
In the latest commit
https://github.com/llvm/llvm-project/pull/149613/commits/27c58629ec76a703fde9c0b99b170573170b4a7a,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).https://github.com/llvm/llvm-project/blob/2e67dcfdcd023df2f06e0823eeea23990ce41534/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h#L203
|
|
(#148988)
This patch addresses the lack of support for parenthesized
initialization in the Clang Static Analyzer's `ExprEngine`. Previously,
initializations such as `V v(1, 2);` were not modeled properly, which
could lead to false negatives in analyses like `DivideZero`.
```cpp
struct A {
int x;
A(int v) : x(v) {}
};
int t() {
A a(42);
return 1 / (a.x - 42); // expected-warning {{Division by zero}}
}
```
Fixes #148875
|
|
sugar types instead of built-in types (#143653)"
This reverts commit c27e283cfbca2bd22f34592430e98ee76ed60ad8.
A builbot failure has been reported:
https://lab.llvm.org/buildbot/#/builders/186/builds/10819/steps/10/logs/stdio
I'm also getting a large number of warnings related to %zu and %zx.
|
|
types instead of built-in types (#143653)
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
|
|
|
|
Variables `stdin`, `stdout`, `stderr` are now added to internal memory space
instead of system memory space. The system memory space is invalidated
at calls to system-defined functions which is not the correct behavior for
the standard stream variables.
|
|
This commit removes the DoubleDelete bug type from `MallocChecker.cpp`
because it's completely redundant with the `DoubleFree` bug (which is
already used for all allocator families, including new/delete).
This simplifies the code of the checker and prevents the potential
confusion caused by two semantically equivalent and very similar, but
not identical bug report messages.
|
|
This commit converts MallocChecker to the new checker family framework
that was introduced in the recent commit
6833076a5d9f5719539a24e900037da5a3979289 -- and gets rid of some awkward
unintended interactions between the checker frontends.
|
|
(#146859)
Fix crash when ConditionBRVisitor::VisitTerminator is faced with `if
consteval` which doesn't have the expected traditional condition
Fixes #139130
|
|
(#146212)
Bounded string functions takes smallest of two values as it's copy size
(`amountCopied` variable in `evalStrcpyCommon`), and it's used to
decided whether this operation will cause out-of-bound access and
invalidate it's super region if it does.
for `strlcat`: `amountCopied = min (size - dstLen - 1 , srcLen)`
for others: `amountCopied = min (srcLen, size)`
Currently when one of two values is unknown or `SValBuilder` can't
decide which one is smaller, `amountCopied` will remain `UnknownVal`,
which will invalidate copy destination's super region unconditionally.
This patch add check to see if one of these two values is definitely
in-bound, if so `amountCopied` has to be in-bound too, because it‘s less
than or equal to them, we can avoid the invalidation of super region and
some related false positives in this situation.
Note: This patch uses `size` as an approximation of `size - dstLen - 1`
in `strlcat` case because currently analyzer doesn't handle complex
expressions like this very well.
Closes #143807.
|
|
unnamed bit-field (#145066)
For the following code in C mode: https://godbolt.org/z/3eo1MeGhe
(There is no warning in C++ mode though).
```c++
struct B {
int i : 2;
int : 30; // unnamed bit-field
};
extern void consume_B(struct B);
void bitfield_B_init(void) {
struct B b1;
b1.i = 1; // b1 is initialized
consume_B(b1); // FP: Passed-by-value struct argument contains uninitialized data (e.g., field: '') [core.CallAndMessage]
}
```
|
|
(#146369)
CFEqual is a trivial function, so treat it as safe.
|
|
Mostly `else-after-return` and `else-after-continue` warnings
|
|
SourceLocation (#145711)
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
|
|
For lambdas that are converted to C function pointers,
```
int (*ret_zero)() = []() { return 0; };
```
clang will generate conversion method like:
```
CXXConversionDecl implicit used constexpr operator int (*)() 'int (*() const noexcept)()' inline
-CompoundStmt
-ReturnStmt
-ImplicitCastExpr 'int (*)()' <FunctionToPointerDecay>
-DeclRefExpr 'int ()' lvalue CXXMethod 0x5ddb6fe35b18 '__invoke' 'int ()'
-CXXMethodDecl implicit used __invoke 'int ()' static inline
-CompoundStmt (empty)
```
Based on comment in Sema, `__invoke`'s function body is left empty
because it's will be filled in CodeGen, so in AST analysis phase we
should get lambda's `operator()` directly instead of calling `__invoke`
itself.
|
|
Fixes #144884
|
|
This commit converts the class DynamicTypePropagation to a very simple
checker family, which has only one checker frontend -- but also supports
enabling the backend ("modeling checker") without the frontend.
As a tangentially related change, this commit adds the backend of
DynamicTypePropagation as a dependency of alpha.core.DynamicTypeChecker
in Checkers.td, because the header comment of DynamicTypeChecker.cpp
claims that it depends on DynamicTypePropagation and the source code
seems to confirm this.
(The lack of this dependency relationship didn't cause problems, because
'core.DynamicTypePropagation' is in the group 'core', so it is
practically always active. However, explicitly declaring the dependency
clarifies the fact that the separate existence of the modeling checker
is warranted.)
|
|
## Purpose
This patch makes a minor changes to LLVM and Clang so that LLVM can be
built as a Windows DLL with `clang-cl`. These changes were not required
for building a Windows DLL with MSVC.
## Background
The Windows DLL effort is tracked in #109483. Additional context is
provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
Specific changes made in this patch:
- Remove `constexpr` fields that reference DLL exported symbols. These
symbols cannot be resolved at compile time when building a Windows DLL
using `clang-cl`, so they cannot be `constexpr`. Instead, they are made
`const` and initialized in the implementation file rather than at
declaration in the header.
- Annotate symbols now defined out-of-line with `LLVM_ABI` so they are
exported when building as a shared library.
- Explicitly add default copy assignment operator for `ELFFile` to
resolve a compiler warning.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
|
|
(#144491)
This patch corrects the state of the error node generated by the
core.DivideZero checker when it detects potential division by zero
involving a tainted denominator.
The checker split in
https://github.com/llvm/llvm-project/pull/106389/commits/91ac5ed10a154410c246d985752c1bbfcf23b105
started to introduce a conflicting assumption about the denominator into
the error node:
Node with the Bug Report "Division by a tainted value, possibly zero"
has an assumption "denominator != 0".
This has been done as a shortcut to continue analysis with the correct
assumption *after* the division - if we proceed, we can only assume the
denominator was not zero. However, this assumption is introduced
one-node too soon, leading to a self-contradictory error node.
In this patch, I make the error node with assumption of zero denominator
fatal, but allow analysis to continue on the second half of the state
split with the assumption of non-zero denominator.
---
CPP-6376
|
|
This commit converts NullabilityChecker to the new checker family
framework that was introduced in the recent commit
6833076a5d9f5719539a24e900037da5a3979289
This commit removes the dummy checker `nullability.NullabilityBase`
because it was hidden from the users and didn't have any useful role
except for helping the registration of the checker parts in the old
ad-hoc system (which is replaced by the new standardized framework).
Except for the removal of this dummy checker, no functional changes
intended.
|
|
Placement new does not allocate memory, so it should not be reported as
a memory leak. A recent MallocChecker refactor changed inlining of
placement-new calls with manual evaluation by MallocChecker.
https://github.com/llvm/llvm-project/commit/339282d49f5310a2837da45c0ccc19da15675554
This change avoids marking the value returned by placement new as
allocated and hence avoids the false leak reports.
Note that the there are two syntaxes to invoke placement new:
`new (p) int` and an explicit operator call `operator new(sizeof(int), p)`.
The first syntax was already properly handled by the engine.
This change corrects handling of the second syntax.
CPP-6375
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
This removes the delayed typo correction functionality from Clang
(regular typo correction still remains) due to fragility of the
solution.
An RFC was posted here:
https://discourse.llvm.org/t/rfc-removing-support-for-delayed-typo-correction/86631
and while that RFC was asking for folks to consider stepping up to be
maintainers, and we did have a few new contributors show some interest,
experiments show that it's likely worth it to remove this functionality
entirely and focus efforts on improving regular typo correction.
This removal fixes ~20 open issues (quite possibly more), improves
compile time performance by roughly .3-.4%
(https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=AaronBallman&sortBy=date),
and does not appear to regress diagnostic behavior in a way we wouldn't
find acceptable.
Fixes #142457
Fixes #139913
Fixes #138850
Fixes #137867
Fixes #137860
Fixes #107840
Fixes #93308
Fixes #69470
Fixes #59391
Fixes #58172
Fixes #46215
Fixes #45915
Fixes #45891
Fixes #44490
Fixes #36703
Fixes #32903
Fixes #23312
Fixes #69874
|
|
NS_REQUIRES_PROPERTY_DEFINITIONS (#143408)
Allow @property of a raw pointer when NS_REQUIRES_PROPERTY_DEFINITIONS
is specified on the interface since such an interface does not
automatically synthesize raw pointer ivars.
Also emit a warning for @property(assign) and
@property(unsafe_unretained) under ARC as well as when explicitly
synthesizing a unsafe raw pointer property.
|
|
This PR adds the WebKit checker support for
[[clang::annotate_type("webkit.pointerconversion")]].
When this attribute is set on the return value of a function, the
function is treated as safe to call anywhere and the return value's
pointer origin is the argument.`
|
|
CheckedPtr as safe. (#142485)
It's safe for a member function of a class or struct to call a function
or allocate a local variable with a pointer or a reference to a member
variable since "this" pointer, and therefore all its members, will be
kept alive by its caller so recognize as such.
|
|
|
|
not result in a warning (#142471)
This PR fixes the bug that the checker emits a warning when a function
takes T&& and passes it to another function using std::move. We should
treat std::move like any other pointer conversion and the origin of the
pointer to be that of the argument.
|
|
Previously some checkers attached explicitly created program point tags
to some of the exploded graph nodes that they created. In most of the
checkers this ad-hoc tagging only affected the debug dump of the
exploded graph (and they weren't too relevant for debugging) so this
commit removes them.
There were two checkers where the tagging _did_ have a functional role:
- In `RetainCountChecker` the presence of tags were checked by
`RefCountReportVisitor`.
- In `DynamicTypePropagation` the checker sometimes wanted to create two
identical nodes and had to apply an explicit tag on the second one to
avoid "caching out".
In these two situations I preserved the tags but switched to using
`SimpleProgramPointTag` instead of `CheckerProgramPointTag` because
`CheckerProgramPointTag` didn't provide enough benefits to justify its
existence.
Note that this commit depends on the earlier commit "[analyzer] Fix
tagging of PostAllocatorCall" ec96c0c072ef3f78813c378949c00e1c07aa44e5
and would introduce crashes when cherry-picked onto a branch that
doesn't contain that commit.
For more details about the background see the discourse thread
https://discourse.llvm.org/t/role-of-programpointtag-in-the-static-analyzer/
As a tangentially related changes, this commit also adds some comments
to document the surprising behavior of `CheckerContext::addTransition`
and an assertion in the constructor of `PathSensitiveBugReport` to get a
more readable crash dump in the case when the report is constructed with
`nullptr` as the `ErrorNode`. (This can happen due to "caching out".)
|
|
The function `tryExpandAsInteger` attempts to extract an integer from a
macro definition. Previously, the attempt would fail when the macro is
from a PCH, because the function tried to access the text buffer of the
source file, which does not exist in case of PCHs. The fix uses
`Preprocessor::getSpelling`, which works in either cases.
rdar://151403070
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
|
|
By design the `Location` data member of a `CheckerContext` is always a
`ProgramPoint` which is tagged with the currently active checker (note
that all checker classes are subclasses of `ProgramPointTag`). This
ensures that exploded nodes created by the checker are by default tagged
by the checker object unless the checker specifies some other tag (e.g.
a note tag) when it calls the `addTransition`-like method that creates
the node.
This was followed by all the `CheckerManager::runCheckersForXXX`
methods, except for `runCheckerForNewAllocator`, where the
implementation constructed the `PostAllocatorCall` program point which
was used to create the `CheckerContext` without passing
`checkFn.Checker` as the tag of the program point.
This commit elimintates this inconsistency and adds an assertion to the
constructor of `CheckerContext` to ensure that this invariant will be
upheld even if we e.g. add a new program point kind.
I strongly suspect that this is a non-functional change because program
point tags are a vestigial feature in the codebase that barely affect
anything -- but e.g. their presence affects the infamous node
reclamation process, so I'm not marking this as NFC.
|
|
Tranersing the CFG blocks of a function is a fundamental operation. Many
C++ constructs can create splits in the control-flow, such as `if`,
`for`, and similar control structures or ternary expressions, gnu
conditionals, gotos, switches and possibly more.
Checkers should be able to get notifications about entering or leaving a
CFG block of interest.
Note that in the ExplodedGraph there is always a BlockEntrance
ProgramPoint right after the BlockEdge ProgramPoint. I considered naming
this callback check::BlockEdge, but then that may leave the observer of
the graph puzzled to see BlockEdge points followed more BlockEdge nodes
describing the same CFG transition. This confusion could also apply to
Bug Report Visitors too.
Because of this, I decided to hook BlockEntrance ProgramPoints instead.
The same confusion applies here, but I find this still a better place
TBH. There would only appear only one BlockEntrance ProgramPoint in the
graph if no checkers modify the state or emit a bug report. Otherwise
they modify some GDM (aka. State) thus create a new ExplodedNode with
the same BlockEntrance ProgramPoint in the graph.
CPP-6484
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
|
|
The checker classes (i.e. classes derived from `CheckerBase` via the
utility template `Checker<...>`) act as intermediates between the user
and the analyzer engine, so they have two interfaces:
- On the frontend side, they have a public name, can be enabled or
disabled, can accept checker options and can be reported as the source
of bug reports.
- On the backend side, they can handle various checker callbacks and
they "leave a mark" on the `ExplodedNode`s that are created by them.
(These `ProgramPointTag` marks are internal: they appear in debug logs
and can be queried by checker logic; but the user doesn't see them.)
In a significant majority of the checkers there is 1:1 correspondence
between these sides, but there are also many checker classes where
several related user-facing checkers share the same backend class.
Historically each of these "multi-part checker" classes had its own
hacks to juggle its multiple names, which led to lots of ugliness like
lazy initialization of `mutable std::unique_ptr<BugType>` members and
redundant data members (when a checker used its custom `CheckNames`
array and ignored the inherited single `Name`).
My recent commit 27099982da2f5a6c2d282d6b385e79d080669546 tried to unify
and standardize these existing solutions to get rid of some of the
technical debt, but it still used enum values to identify the checker
parts within a "multi-part" checker class, which led to some ugliness.
This commit introduces a new framework which takes a more direct,
object-oriented approach: instead of identifying checker parts with
`{parent checker object, index of part}` pairs, the parts of a
multi-part checker become stand-alone objects that store their own name
(and enabled/disabled status) as a data member.
This is implemented by separating the functionality of `CheckerBase`
into two new classes: `CheckerFrontend` and `CheckerBackend`. The name
`CheckerBase` is kept (as a class derived from both `CheckerFrontend`
and `CheckerBackend`), so "simple" checkers that use `CheckerBase` and
`Checker<...>` continues to work without changes. However we also get
first-class support for the "many frontends - one backend" situation:
- The class `CheckerFamily<...>` works exactly like `Checker<...>` but
inherits from `CheckerBackend` instead of `CheckerBase`, so it won't
have a superfluous single `Name` member.
- Classes deriving from `CheckerFamily` can freely own multiple
`CheckerFrontend` data members, which are enabled within the
registration methods corresponding to their name and can be used to
initialize the `BugType`s that they can emit.
In this scheme each `CheckerFamily` needs to override the pure virtual
method `ProgramPointTag::getTagDescription()` which returns a string
which represents that class for debugging purposes. (Previously this
used the name of one arbitrary sub-checker, which was passable for
debugging purposes, but not too elegant.)
I'm planning to implement follow-up commits that convert all the
"multi-part" checkers to this `CheckerFamily` framework.
|
|
|
|
checker (#141232)
Resolves
https://github.com/llvm/llvm-project/issues/76208#issuecomment-2830854351
Quoting the docs of `[[clang::flag_enum]]`:
https://clang.llvm.org/docs/AttributeReference.html#flag-enum
> This attribute can be added to an enumerator to signal to the compiler that it
> is intended to be used as a flag type. This will cause the compiler to assume
> that the range of the type includes all of the values that you can get by
> manipulating bits of the enumerator when issuing warnings.
Ideally, we should still check the upper bounds but for simplicity let's not bother for now.
|
|
|
|
|
|
checker (#141076)
Add extra branches for the case when the buffer argument is NULL.
Fixes #135720
|
|
This helps to gain contextual information about how we entered a CFG block.
The `noexprcrash.c` test probably changed due to the fact that now
BlockEntrance ProgramPoint Profile also hashes the pointer of the
previous CFG block. I didn't investigate.
CPP-6483
|
|
(#140035)
[analyzer][NFC] Move PrettyStackTraceLocationContext into
dispatchWorkItem
This change helps with ensuring that the abstract machine call stack is
only dumped exactly once no matter what checker callback we have the
crash in.
Note that `check::EndAnalysis` callbacks are resolved outside of
`dispatchWorkItem`, but that's the only checker callback that is outside
of `dispatchWorkItem`.
CPP-6476
|
|
|
|
(Reland) (#139980)
Fixes #139779.
The bug was introduced in #137355 in `SymbolConjured::getStmt`, when
trying to obtain a statement for a CFG initializer without an
initializer. This commit adds a null check before access.
Previous PR #139820, Revert #139936
Additional notes since previous PR:
When conjuring a symbol, sometimes there is no valid CFG element, e.g.
in the file causing the crash, there is no element at all in the CFG. In
these cases, the CFG element reference in the expression engine will be
invalid. As a consequence, there needs to be extra checks to ensure the
validity of the CFG element reference.
|
|
Previously the class `ExplodedGraph` had a data member called `Roots`
which could (in theory) store multiple root nodes -- but in practice
exploded graphs always had at most one root node (zero was possible for
empty and partially copied graphs) and introducing a graph with multiple
roots would've caused severe malfuncitons (e.g. in code that used the
pattern `*roots_begin()` to refer to _the_ root node).
I don't see any practical use case for adding multiple root nodes in a
single graph (this seems to be yet another of the "generalize for the
sake of generalization" decisions which were common in the early history
of the analyzer), so this commit replaces the vector `Roots` with
`ExplodedNode *Root` (which may be null when the graph is empty or under
construction).
Note that the complicated logic of `ExplodedGraph::trim` deserves a
through cleanup, but I left that for a follow-up commit.
|
|
Previously, CSA did not handle __builtin_bit_cast correctly. It
evaluated the LvalueToRvalue conversion for the casting expression,
but did not actually convert the value of the expression to be of the
destination type.
This commit fixes the problem.
rdar://149987320
|
|
Also fix some typos in comments
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
SymbolConjured (#137355)
Closes #57270.
This PR changes the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.
Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
There is a caveat at loop widening, where the correct location is the
CFG terminator (which is not an element and does not have a ref). In
this case, the first element in the block is passed as a location.
Previous PR #128251, Reverted at #137304.
|