Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch integrates CallStackRadixTreeBuilder into the V3 format,
reducing the profile size to about 27% of the V2 profile size.
- Serialization: writeMemProfCallStackArray just needs to write out
the radix tree array prepared by CallStackRadixTreeBuilder.
Mappings from CallStackIds to LinearCallStackIds are moved by new
function CallStackRadixTreeBuilder::takeCallStackPos.
- Deserialization: Deserializing a call stack is the same as
deserializing an array encoded in the obvious manner -- the length
followed by the payload, except that we need to follow a pointer to
the parent to take advantage of common prefixes once in a while.
This patch teaches LinearCallStackIdConverter to how to handle those
pointers.
|
|
Call stacks are a huge portion of the MemProf profile, taking up 70+%
of the profile file size.
This patch implements a radix tree to compress call stacks, which are
known to have long common prefixes. Specifically,
CallStackRadixTreeBuilder, introduced in this patch, takes call stacks
in the MemProf profile, sorts them in the dictionary order to maximize
the common prefix between adjacent call stacks, and then encodes a
radix tree into a single array that is ready for serialization.
The resulting radix array is essentially a concatenation of call stack
arrays, each encoded with its length followed by the payload, except
that these arrays contain "instructions" like "skip 7 elements
forward" to borrow common prefixes from other call stacks.
This patch does not integrate with the MemProf
serialization/deserialization infrastructure yet. Once integrated,
the radix tree is expected to roughly halve the file size of the
MemProf profile.
|
|
CallStackIdConverter sets LastUnmappedId when a mapping failure
occurs. Now, since toMemProfRecord takes an instance of
CallStackIdConverter by value, namely std::function, the caller of
toMemProfRecord never receives the mapping failure that occurs inside
toMemProfRecord. The same problem applies to FrameIdConverter.
The patch fixes the problem by passing FrameIdConverter and
CallStackIdConverter by reference, namely llvm::function_ref.
While I am it, this patch deletes the copy constructor and copy
assignment operator to avoid accidental copies.
|
|
|
|
commit 4c8ec8f8bc3fb4dda4fd36c3b2ad745bd3451970
Author: Kazu Hirata <kazu@google.com>
Date: Wed Apr 24 16:25:35 2024 -0700
introduced the idea of serializing/deserializing a subset of the
fields in PortableMemInfoBlock. While it reduces the size of the
indexed MemProf profile file, we now could inadvertently access
unavailable fields and go without noticing.
To protect ourselves from the risk, this patch adds access checks to
PortableMemInfoBlock::get* methods by embedding a bit set representing
available fields into PortableMemInfoBlock.
|
|
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places. This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.
The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions. This
patch splits it into two phases for flexibility (but make them
composable) because some places only require the FrameId conversion.
This iteration fixes a problem uncovered with ubsan, where we were
dereferencing an uninitialized std::unique_ptr.
|
|
Reverts llvm/llvm-project#90307
Breaks bots https://lab.llvm.org/buildbot/#/builders/5/builds/42943
|
|
Currently, we convert FrameId to Frame and CallStackId to a call stack
at several places. This patch unifies those into function objects --
FrameIdConverter and CallStackIdConverter.
The existing implementation of CallStackIdConverter, being removed in
this patch, handles both FrameId and CallStackId conversions. This
patch splits it into two phases for flexibility (but make them
composable) because some places only require the FrameId conversion.
|
|
This patch replaces count with contains, following the spirit of
clang-tidy's readability-container-contains.
|
|
PortableMemInfoBlock (#90103)
These functions do not operate on PortableMemInfoBlock. This patch
moves them outside the class.
|
|
This patch removes getFullSchema in MemProfTest.cpp in favor of
llvm::memprof::PortableMemInfoBlock::getFullSchema as they do exactly
the same thing.
|
|
|
|
This patch enables users of MemProfReader to directly supply mappings
from CallStackId to actual call stacks.
Once the users of the current constructor without CSIdMap switch to
the new constructor, we'll have fewer users of:
- IndexedAllocationInfo::CallStack
- IndexedMemProfRecord::CallSites
bringing us one step closer to the removal of these fields in favor
of:
- IndexedAllocationInfo::CSId
- IndexedMemProfRecord::CallSiteIds
|
|
We are in the process of referring to call stacks with CallStackId in
IndexedMemProfRecord and IndexedAllocationInfo instead of holding call
stacks inline (both in memory and the serialized format). Doing so
deduplicates call stacks and reduces the MemProf profile file size.
Before we can eliminate the two fields holding call stacks inline:
- IndexedAllocationInfo::CallStack
- IndexedMemProfRecord::CallSites
we need to eliminate all the read operations on them.
This patch is a step toward that direction. Specifically, we
eliminate the read operations in the context of MemProfReader and
RawMemProfReader. A subsequent patch will eliminate the read
operations during the serialization.
|
|
(#88200)
This patch renames RawMemProfReader.{cpp,h} to MemProfReader.{cpp,h},
respectively. Also, it re-creates RawMemProfReader.h just to include
MemProfReader.h for compatibility with out-of-tree users.
|
|
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites. We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId. The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.
As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2. A
subsequent patch will add Version2 support to llvm-profdata.
The essense of the patch is to replace the serialization of a call
stack, a vector of FrameIDs, with that of a CallStackId. That is:
const IndexedAllocationInfo &N = ...;
...
LE.write<uint64_t>(N.CallStack.size());
for (const FrameId &Id : N.CallStack)
LE.write<FrameId>(Id);
becomes:
LE.write<CallStackId>(N.CSId);
|
|
The indexed MemProf file has a huge amount of redundancy. In a large
internal application, 82% of call stacks, stored in
IndexedAllocationInfo::CallStack, are duplicates.
We should work toward deduplicating call stacks by referring to them
with unique IDs with actual call stacks stored in a separate data
structure, much like we refer to memprof::Frame with memprof::FrameId.
At the same time, we need to facilitate a graceful transition from the
current version of the MemProf format to the next. We should be able
to read (but not write) the current version of the MemProf file even
after we move onto the next one.
With those goals in mind, I propose to have an integer ID next to
CallStack in IndexedAllocationInfo to refer to a call stack in a
succinct manner. We'll gradually increase the areas of the compiler
where IDs and call stacks have one-to-one correspondence and
eventually remove the existing CallStack field.
This patch adds call stack ID, named CSId, to IndexedAllocationInfo
and teaches the raw profile reader to compute unique call stack IDs
and store them in the new field. It does not introduce any user of
the call stack IDs yet, except in verifyFunctionProfileData.
|
|
GNU addr2line supports lookup by symbol name in addition to the existing
address lookup. llvm-symbolizer starting from
e144ae54dcb96838a6176fd9eef21028935ccd4f supports lookup by symbol name.
This change extends this lookup with possibility to specify optional
offset.
Now the address for which source information is searched for can be
specified with offset:
llvm-symbolize --obj=abc.so "SYMBOL func_22+0x12"
It decreases the gap in features of llvm-symbolizer and GNU addr2line.
This lookup now is supported for code only.
Migrated from: https://reviews.llvm.org/D139859
Pull request: https://github.com/llvm/llvm-project/pull/75067
|
|
Recent versions of GNU binutils starting from 2.39 support symbol+offset
lookup in addition to the usual numeric address lookup. This change adds
symbol lookup to llvm-symbolize and llvm-addr2line.
Now llvm-symbolize behaves closer to GNU addr2line, - if the value specified
as address in command line or input stream is not a number, it is treated as
a symbol name. For example:
llvm-symbolize --obj=abc.so func_22
llvm-symbolize --obj=abc.so "CODE func_22"
This lookup is now supported only for functions. Specification with
offset is not supported yet.
This is a recommit of 2b27948783e4bbc1132d3220d8517ef62607b558, reverted
in 39fec5457c0925bd39f67f63fe17391584e08258 because the test
llvm/test/Support/interrupts.test started failing on Windows. The test was
changed in 18f036d0105589c3175bb51a518c5d272dae61e2 and is also updated in
this commit.
Differential Revision: https://reviews.llvm.org/D149759
|
|
This reverts commit 2b27948783e4bbc1132d3220d8517ef62607b558.
On some buildbots the test LLVM::interrupts.test start failing.
|
|
Recent versions of GNU binutils starting from 2.39 support symbol+offset
lookup in addition to the usual numeric address lookup. This change adds
symbol lookup to llvm-symbolize and llvm-addr2line.
Now llvm-symbolize behaves closer to GNU addr2line, - if the value specified
as address in command line or input stream is not a number, it is treated as
a symbol name. For example:
llvm-symbolize --obj=abc.so func_22
llvm-symbolize --obj=abc.so "CODE func_22"
This lookup is now supported only for functions. Specification with
offset is not supported yet.
Differential Revision: https://reviews.llvm.org/D149759
|
|
Add a MemProfReader base class which can be used directly where
symbolization and processing a raw profile is unnecessary.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D159141
|
|
Canonicalize the function name (strip suffixes etc) to ensure that
function name suffixes added by late stage passes do not cause
mismatches when memprof profile data is consumed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D159132
|
|
|
|
Update the isRuntime check to only match against known memprof filenames
where interceptors are defined. This avoid issues where the path does
not include the directory based on how the runtime was compiled. Also
update the unittest.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D145521
|
|
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes check-llvm.
|
|
|
|
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
|
|
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
|
|
|
|
Extend the Frame struct to hold the symbol name if requested
when a RawMemProfReader object is constructed. This change updates the
tests and removes the need to pass --debug to obtain the mapping from
GUID to symbol names.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D126344
|
|
The current implementation of memprof information in the indexed profile
format stores the representation of each calling context fram inline.
This patch uses an interned representation where the frame contents are
stored in a separate on-disk hash table. The table is indexed via a hash
of the contents of the frame. With this patch, the compressed size of a
large memprof profile reduces by ~22%.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D123094
|
|
This reverts commit f4b794427e8037a4e952cacdfe7201e961f31a6f.
Reland with underlying msan issue fixed in D122260.
|
|
This reverts commit 0d362c90d335509c57c0fbd01ae1829e2b9c3765.
Reason: Causes the MSan buildbot to fail (see comments on
https://reviews.llvm.org/D121179 for more information
|
|
To ease profile annotation, each of the callsites in a function can be
annotated with profile data - "IR metadata format for MemProf" [1]. This
patch extends the on-disk serialized record format to store the debug
information for allocation callsites incl inline frames. This change is
incompatible with the existing format i.e. indexed profiles must be
regenerated, raw profiles are unaffected.
[1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D121179
|
|
Since DI frames are enumerated with the leaf function at index 0, this
patch fixes the logic when IsInlineFrame is set. Also update the
unittests to check that only the last frame is marked as non-inline from
a set of DI Frames for a PC address.
Differential Revision: https://reviews.llvm.org/D121830
|
|
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
|
|
This patch filters out callstack frames which can't be symbolized or if
the frames belong to the runtime. Symbolization may not be possible if
debug information is unavailable or if the addresses are from a shared
library. For now we only support optimization of the main binary which
is statically linked to the compiler runtime.
Differential Revision: https://reviews.llvm.org/D120860
|
|
Currently, symbolization of stack frames occurs on demand when the instrprof writer
iterates over all the records in the raw memprof reader. With this
change we symbolize and cache the frames immediately after reading the
raw profiles. For a large internal binary this results in a runtime
reduction of ~50% (2m -> 48s) when merging a memprof raw profile with a
raw instr profile to generate an indexed profile. This change also makes
it simpler in the future to generate additional calling context
metadata to attach to each memprof record.
Differential Revision: https://reviews.llvm.org/D120430
|
|
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
This commit also includes the changes reviewed separately in D120093.
Differential Revision: https://reviews.llvm.org/D120103
|
|
profiles.""
This reverts commit 807ba7aace188ada83ddb4477265728e97346af1.
|
|
This reverts commit 85355a560a33897453df2ef959e255ee725eebce.
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
Differential Revision: https://reviews.llvm.org/D118653
|
|
This reverts commit e6999040f5758f89a64b6232119b775b7bd1c85b.
Update test to fix signed int comparison warning, fix whitespace in
compiler-rt MIBEntryDef.inc file.
Differential Revision: https://reviews.llvm.org/D117256
|
|
This reverts commit 857ec0d01f8021ff0d9540fcbf6ff24e29868ba4.
Fixes -DLLVM_ENABLE_MODULES=On build by adding the new textual
header to the modulemap file.
Reviewed in https://reviews.llvm.org/D117722
|
|
This reverts commit 9def83c6d02944b2931efd50cd2491953a772aab. [4/4]
|
|
This reverts commit 9b67165285c5e752fce3b554769f5a22e7b38da8. [3/4]
|
|
profiles.""
This reverts commit de54e4ab78ef09b60f870e8df6f8a87e56d6bd94 [1/4]
|
|
This reverts commit 0f73fb18ca333e38cdb9ffa701a8db026c56041d.
Use llvm/Profile/MIBEntryDef.inc instead of relative path.
Generated the raw profile data with `-mllvm
-enable-name-compression=false` so that builbots where the reader is
built without zlib do not fail.
Also updated the test build instructions.
|
|
This reverts commit 43c2348c5b926df6bdbc5b70efaa35ecdefe12d5.
Buildbots are failing with an error on reading memprof testdata.
"Inputs/basic.profraw: profile uses zlib
compression but the profile reader was built without zlib support"
https://lab.llvm.org/buildbot/#/builders/16/builds/24490
|
|
This patch adds support for optional memory profile information to be
included with and indexed profile. The indexed profile header adds a new
field which points to the offset of the memory profile section (if
present) in the indexed profile. For users who do not utilize this
feature the only overhead is a 64-bit offset in the header.
The memory profile section contains (1) profile metadata describing the
information recorded for each entry (2) an on-disk hashtable containing
the profile records indexed via llvm::md5(function_name). We chose to
introduce a separate hash table instead of the existing one since the
indexing for the instrumented fdo hash table is based on a CFG hash
which itself is perturbed by memprof instrumentation.
Differential Revision: https://reviews.llvm.org/D118653
|