Age | Commit message (Collapse) | Author | Files | Lines |
|
Revert llvm/llvm-project#143520 for now since it’s causing issues for
people who are using symlinks and prefer to preserve the original path
(i.e. looks like we’ll have to make this configurable after all; I just
need to figure out how to pass `-no-canonical-prefixes` down through the
driver); I’m planning to refactor this a bit and reland it in a few
days.
|
|
This can significantly shorten file paths to standard library headers,
e.g. on my system, `<ranges>` is currently printed as
```console
/usr/lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/ranges
```
but with this change, we instead print
```console
/usr/include/c++/15/ranges
```
This is of course just a heuristic; there are paths that would get longer
as a result of this, so we use whichever path ends up being shorter.
@AaronBallman pointed out that this might be problematic for network
file systems since path resolution might take a while, so this is enabled
only for paths that are part of a local filesystem—though not on Windows
since there we noticed that the check itself is slow.
The file names are cached in `SourceManager`.
|
|
The `SLocEntry` structure is 24 bytes, and the binary search only needs
the offset. Loading an entry's offset might pull the entire SLocEntry
object into the CPU cache.
To make the binary search much more cache-efficient, we use a separate
offset table.
See
https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.
|
|
getFileID (#146782)
`getFileID` is a hot method. By caching the offset range in
`LastFileIDLookup`, we can more quickly check whether a given offset
falls within it, avoiding calling `isOffsetInFileID`.
https://llvm-compile-time-tracker.com/compare.php?from=0588e8188c647460b641b09467fe6b13a8d510d5&to=64843a500f0191b79a8109da9acd7e80d961c7a3&stat=instructions:u
|
|
SourceLocation (#145711)
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
|
|
(#139584)"
This reverts commit e2a885537f11f8d9ced1c80c2c90069ab5adeb1d. Build failures were fixed right away and reverting the original commit without the fixes breaks the build again.
|
|
(#139584)"
This reverts commit 9e306ad4600c4d3392c194a8be88919ee758425c.
Multiple builtbot failures have been reported:
https://github.com/llvm/llvm-project/pull/139584
|
|
The `DiagnosticOptions` class is currently intrusively
reference-counted, which makes reasoning about its lifetime very
difficult in some cases. For example, `CompilerInvocation` owns the
`DiagnosticOptions` instance (wrapped in `llvm::IntrusiveRefCntPtr`) and
only exposes an accessor returning `DiagnosticOptions &`. One would
think this gives `CompilerInvocation` exclusive ownership of the object,
but that's not the case:
```c++
void shareOwnership(CompilerInvocation &CI) {
llvm::IntrusiveRefCntPtr<DiagnosticOptions> CoOwner = &CI.getDiagnosticOptions();
// ...
}
```
This is a perfectly valid pattern that is being actually used in the
codebase.
I would like to ensure the ownership of `DiagnosticOptions` by
`CompilerInvocation` is guaranteed to be exclusive. This can be
leveraged for a copy-on-write optimization later on. This PR changes
usages of `DiagnosticOptions` across `clang`, `clang-tools-extra` and
`lldb` to not be intrusively reference-counted.
|
|
There are checks in clang codebase that determine the type of source
file, associated with a given location - specifically, if it is an
ordonary file or comes from sources like command-line options or a
built-in definitions. These checks often rely on calls to
`getPresumedLoc`, which is relatively expensive. In certain cases, these
checks are combined, leading to repeated calculations of the costly
function negatively affecting compile time.
This change tries to optimize such checks. It must fix compile time
regression introduced in
https://github.com/llvm/llvm-project/pull/137306/.
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
|
|
|
|
(#96292)
We have been running into source location exhaustion recently and want
to use the statistics to monitor the usage in various files to be able
to anticipate where the next problem will happen.
I picked `Statistic` because it can be written into a structured JSON
file and is easier to consume by further automation.
This commit does not change any existing per-source-manager metrics
exposed via `SourceManager::PrintStats()`. This does create some
redundancy, but I also expect to be non-controversial because it aligns
with the intended use of `Statistic`.
|
|
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.
(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)
The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
|
|
This patch adds several const-qualified variants of existing member
functions to `SourceManager`.
I started with removing const qualification from
`setNumCreatedFIDsForFileID`, and removing `const_cast` in the body of
this function, as I think it doesn't make sense to const-qualify
setters.
|
|
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
24 under clang/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
|
|
|
|
This patch doesn't touch `CodeGenOptions.h`, `DiagnosticOptions.h`, `LangOptions.h`, `IdentifierTable.h`.
|
|
In `SourceManager::getFileID()`, Clang performs binary search over its
buffer of `SLocEntries`. For modules, this binary search fully
deserializes the entire `SLocEntry` block for each visited entry. For
some entries, that includes decompressing the associated buffer (e.g.
the predefines buffer, macro expansion buffers, contents of volatile
files), which shows up in profiles of the dependency scanner.
This patch moves the binary search over loaded entries into `ASTReader`,
which can perform cheaper partial deserialization during the binary
search, reducing the wall time of dependency scans by ~3%. This also
reduces the number of retired instructions by ~1.4% on regular
(implicit) modules compilation.
Note that this patch drops the optimizations based on the last lookup ID
(pruning the search space and performing linear search before resorting
to the full binary search). Instead, it reduces the search space by
asking `ASTReader::GlobalSLocOffsetMap` for the containing `ModuleFile`
and only does binary search over entries of single module file.
|
|
This commit removes the list of SLocEntry offsets to preload eagerly
from PCM files. Commit introducing this functionality (258ae54a) doesn't
clarify why this would be more performant than the lazy approach used
regularly.
Currently, the only SLocEntry the reader is supposed to preload is the
predefines buffer, but in my experience, it's not actually referenced in
most modules, so the time spent deserializing its SLocEntry is wasted.
This is especially noticeable in the dependency scanner, where this
change brings 4.56% speedup on my benchmark.
|
|
This patch removes the `SourceManager` APIs that create `FileID` from a
`const FileEntry *` in favor of APIs that take `FileEntryRef`. This also
removes a misleading documentation that claims `nullptr` file entry
represents stdin. I don't think that's right, since we just try to
dereference that pointer anyways.
|
|
The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.
By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
|
|
`SourceManager::getMemoryBufferForFileOr{None,Fake}()`
|
|
|
|
|
|
|
|
|
|
|
|
This patch replaces (llvm::|)Optional< with std::optional<. I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.
I'll post a separate patch to actually replace llvm::Optional with
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
The needed tweaks are mostly trivial, the one nasty bit is Clang's usage
of OptionalStorage. To keep this working old Optional stays around as
clang::CustomizableOptional, with the default Storage removed.
Optional<File/DirectoryEntryRef> is replaced with a typedef.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
|
|
std::optional"
This reverts commit 8f0df9f3bbc6d7f3d5cbfd955c5ee4404c53a75d.
The Optional*RefDegradesTo*EntryPtr types want to keep the same size as
the underlying type, which std::optional doesn't guarantee. For use with
llvm::Optional, they define their own storage class, and there is no way
to do that in std::optional.
On top of that, that commit broke builds with older GCCs, where
std::optional was not trivially copyable (static_assert in the clang
sources was failing).
|
|
|
|
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
indicating why we ran out.
|
|
This is a prep-patch for D136624 which allows querying `SourceManager` with raw offsets instead of `SourceLocation`s.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D137215
|
|
Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.
Differential Revision: https://reviews.llvm.org/D135220
|
|
isBeforeInTranslationUnit compares SourceLocations across FileIDs by
mapping them onto a common ancestor file, following include/expansion edges.
It is possible to get a tie in the common ancestor, because multiple
"chunks" of a macro arg will expand to the same macro param token in the body:
#define ID(X) X
#define TWO 2
ID(1 TWO)
Here two FileIDs both expand into `X` in ID's expansion:
- one containing `1` and spelled on line 3
- one containing `2` and spelled by the macro expansion of TWO
isBeforeInTranslationUnit breaks this tie by comparing the two FileIDs:
the one "on the left" is always created first and is numerically smaller.
This seems correct so far.
Prior to this patch it also takes a shortcut (unclear if intentionally).
Instead of comparing the two FileIDs that directly expand to the same location,
it compares the original FileIDs being compared. These may not be the
same if there are multiple macro expansions in between.
This *almost* always yields the right answer, because macro expansion
yields "trees" of FileIDs allocated in a contiguous range: when comparing tree A
to tree B, it doesn't matter what representative you pick.
However, the splitting of >> tokens is modeled as macro expansion (as if
the first '>' was a macro that expands to a '>' spelled a scratch buffer).
This splitting occurs retroactively when parsing, so the FileID allocated is
larger than expected if it were a real macro expansion performed during lexing.
As a result, macro tree A can be on the left of tree B, and yet contain
a token-split FileID whose numeric value is *greator* than those in B.
In this case the tiebreak gives the wrong answer.
Concretely:
#define ID(X) X
template <typename> class S{};
ID(
ID(S<S<int>> x);
int y;
)
Given Greater = (typeloc of S<int>).getEndLoc();
Y = (decl of y).getLocation();
isBeforeInTranslationUnit(Greater, Y) should return true, but returns false.
Here the common FileID of (Greater, Y) is the body of the outer ID
expansion, and they both expand to X within it.
With the current tiebreak rules, we compare the FileID of Greater (a split)
to the FileID of Y (a macro arg expansion into X of the outer ID).
The former is larger because the token split occurred relatively late.
This patch fixes the issue by removing the shortcut. It tracks the immediate
FileIDs used to reach the common file, and uses these IDs to break ties.
In the example, we now compare the macro arg expansion of the inner ID()
to the macro arg expansion of Y, and find that it is smaller.
This requires some changes to the InBeforeInTUCacheEntry (sic).
We store a little more data so it's probably slightly slower.
It was difficult to resist more invasive changes:
- performance: the sizing is very suspicious, and once the cache "fills up"
we're thrashing a single entry
- API: the class seems to be needlessly complicated
However I tried to avoid mixing these with subtle behavior changes, and
will send a followup instead.
Differential Revision: https://reviews.llvm.org/D134685
|
|
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
|
|
This patch is a part of the upstreaming efforts. Cling has the ability to spawn
child interpreters (mainly for auto completions). The child interpreter import
Decls using the ASTImporter which casuses the assertion here
https://github.com/llvm/llvm-project/blob/65eb74e94b414fcde6bfa810d1c30c7fcb136b77/clang/include/clang/Basic/SourceLocation.h#L322
The patch is co-developed with V. Vassilev.
Differential revision: https://reviews.llvm.org/D88780
|
|
And haven't been since 2011: https://github.com/llvm/llvm-project/commit/eeca36fe9ad767380b2eab76a6fe5ba410a47393
|
|
> Includes regression test for problem noted by @hans.
> is reverts commit 973de71.
>
> Differential Revision: https://reviews.llvm.org/D106898
Feature implemented as-is is fairly expensive and hasn't been used by
libc++. A potential reimplementation is possible if libc++ become
interested in this feature again.
Differential Revision: https://reviews.llvm.org/D123885
|
|
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
NFC: this patch introduces typedefs for the integer type used by
SourceLocation and makes all the boring changes to use the typedefs
everywhere, but for the moment, they are unconditionally defined to
uint32_t.
Patch originally by Mikhail Maltsev.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D105492
|
|
Change `SourceManager::getOrCreateContentCache` to take a `FileEntryRef`
and update call sites (mostly internal to SourceManager.cpp). In a
couple of cases this temporarily relies on `FileEntry::getLastRef`, but
those can be cleaned up once other APIs switch over.
The one change outside of SourceManager.cpp is in ASTReader.cpp, which
stops relying on the auto-degrade-to-`FileEntry*` behaviour from
`InputFile::getFile` since it now needs a `FileEntryRef`.
No functionality change here.
Differential Revision: https://reviews.llvm.org/D92983
|
|
Currently, there are many instances where `SourceLocation` objects are
converted to raw representation to be stored in structs that are
used as fields of tagged unions.
This is done to make the corresponding structs trivial.
Triviality allows avoiding undefined behavior when implicitly changing
the active member of the union.
However, in most cases, we can explicitly construct an active member
using placement new. This patch adds the required active member
selections and replaces `SourceLocation`-s represented as
`unsigned int` with proper `SourceLocation`-s.
One notable exception is `DeclarationNameLoc`: the objects of this class
are often not properly initialized (so the code currently relies on
its default constructor which uses memset). This class will be fixed
in a separate patch.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94237
|
|
Migrate ObjCMT.cpp from using `const FileEntry*` to `FileEntryRef`. This
is one of the blockers for changing `SourceManager` to use
`FileEntryRef`.
This adds an initial version of `SourceManager::getFileEntryRefForID`,
which uses to `FileEntry::getLastRef`; after `SourceManager` switches,
`SourceManager::getFileEntryForID` will need to call this function.
This also adds uses of `FileEntryRef` as a key in a `DenseMap`, and a
call to `hash_value(Optional)` in `DenseMapInfo<EditEntry>`; support for
these were added in prep commits.
Differential Revision: https://reviews.llvm.org/D92678
|
|
CompilerInstance::InitializeSourceManager, NFC
Use `FileManager::getVirtualFileRef` to get the virtual file for stdin,
and add an overload of `SourceManager::overrideFileContents` that takes
a `FileEntryRef`, migrating `CompilerInstance::InitializeSourceManager`.
Differential Revision: https://reviews.llvm.org/D92680
|
|
Add a `FileEntryRef` overload of `SourceManager::translateFile`, and
migrate `ParseDirective` in VerifyDiagnosticConsumer.cpp to use it and
the corresponding overload of `createFileID`.
No functionality change here.
Differential Revision: https://reviews.llvm.org/D92699
|
|
getVirtualFileRef, NFC
Change the `InputFile` class to store `Optional<FileEntryRef>` instead
of `FileEntry*`. This paged in a few API changes:
- Added `FileManager::getVirtualFileRef`, and converted `getVirtualFile`
to a wrapper of it.
- Updated `SourceManager::bypassFileContentsOverride` to take
`FileEntryRef` and return `Optional<FileEntryRef>`
(`ASTReader::getInputFile` is the only caller).
Differential Revision: https://reviews.llvm.org/D90053
|
|
This seems to due to a bad merge by rG156e8b37024a
|