path: root/llvm/lib/ProfileData/InstrProfWriter.cpp
Age  Commit message  Author  Files  Lines
2024-04-20[memprof] Accept Schema in the constructor of RecordWriterTrait (NFC) (#89486)Kazu Hirata1-2/+1
The comment being deleted in this patch is not correct. We already construct an instance of RecordWriterTrait with Version. This patch teaches the constructor of RecordWriterTrait to accept Schema. While I am at it, this patch makes Version a private variable.
2024-04-18[memprof] Use structured binding (NFC) (#89315)Kazu Hirata1-9/+9
2024-04-18[memprof] Add Version2 of the indexed MemProf format (#89100)Kazu Hirata1-7/+83
This patch adds Version2 of the indexed MemProf format. The new format comes with a hash table from CallStackId to actual call stacks llvm::SmallVector<FrameId>. The rest of the format refers to call stacks with CallStackId. This "values + references" model effectively deduplicates call stacks. With this patch, a large indexed memprof file of mine shrinks from 4.4GB to 1.6GB, a 64% reduction. This patch does not make Version2 generally available yet as I am planning to make a few more changes to the format.
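To illustrate the "values + references" model described above, here is a minimal sketch, not the code from the patch. It assumes FrameId and CallStackId are 64-bit integers, and the hash function and the CallStackTable class are hypothetical stand-ins for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Assumed 64-bit stand-ins for the FrameId and CallStackId types named above.
using FrameId = std::uint64_t;
using CallStackId = std::uint64_t;

// Illustrative FNV-1a-style hash over frame IDs; not the hash LLVM uses.
inline CallStackId hashCallStack(const std::vector<FrameId> &Stack) {
  std::uint64_t H = 1469598103934665603ULL;
  for (FrameId F : Stack) {
    H ^= F;
    H *= 1099511628211ULL;
  }
  return H;
}

// Hypothetical table: each distinct call stack is stored once (the "value"),
// and everything else holds only its ID (the "reference").
class CallStackTable {
  std::map<CallStackId, std::vector<FrameId>> Stacks;

public:
  CallStackId intern(const std::vector<FrameId> &Stack) {
    CallStackId Id = hashCallStack(Stack);
    auto It = Stacks.emplace(Id, Stack).first;
    // Sketch-level collision check only.
    assert(It->second == Stack && "hash collision in sketch");
    (void)It;
    return Id;
  }

  const std::vector<FrameId> &lookup(CallStackId Id) const {
    return Stacks.at(Id);
  }

  std::size_t size() const { return Stacks.size(); }
};
```

Interning the same stack twice yields the same ID and a single stored copy, which is where the size reduction comes from when many records share call stacks.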
2024-04-12[memprof] Clean up writer traits (NFC) (#88549)Kazu Hirata1-7/+5
RecordWriter does not live past the end of writeMemProfRecords, so it can safely live on the stack. The constructor of FrameWriter does not take any parameter, so we can let OnDiskChainedHashTableGenerator::Emit (with a single parameter) default-construct an instance of the writer trait inside Emit.
2024-04-09[memprof] Use structured binding (NFC) (#88096)Kazu Hirata1-4/+4
2024-04-07[memprof] Fix a typo in writeMemProfV1 (#87890)Kazu Hirata1-1/+1
This patch borrows memprof-merge.test to test --memprof-version.
2024-04-04[memprof] Introduce writeMemProf (NFC) (#87698)Kazu Hirata1-76/+142
This patch refactors the serialization of MemProf data to a switch statement style:

  switch (Version) {
  case Version0:
    return ...;
  case Version1:
    return ...;
  }

just like IndexedMemProfRecord::serialize. A reasonable amount of code is shared and factored out to helper functions between writeMemProfV0 and writeMemProfV1 to the extent that doesn't hamper readability.
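A self-contained sketch of that dispatch shape follows. It is an illustration under assumptions, not the real writer: the Buffer type, the writeRecords helper, and the record representation are hypothetical, and the only difference modeled between the versions is a leading version field for Version1.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical version tags mirroring the Version0/Version1 names above.
enum IndexedVersion : std::uint64_t { Version0 = 0, Version1 = 1 };

// Hypothetical output buffer of 64-bit words.
using Buffer = std::vector<std::uint64_t>;

// Shared helper: in this sketch both versions emit records the same way.
inline void writeRecords(Buffer &Out,
                         const std::vector<std::uint64_t> &Records) {
  Out.insert(Out.end(), Records.begin(), Records.end());
}

inline bool writeMemProfV0(Buffer &Out,
                           const std::vector<std::uint64_t> &Records) {
  writeRecords(Out, Records);
  return true;
}

inline bool writeMemProfV1(Buffer &Out,
                           const std::vector<std::uint64_t> &Records) {
  Out.push_back(Version1); // Assumed: version 1 leads with a version field.
  writeRecords(Out, Records);
  return true;
}

// Single entry point that dispatches on the requested version.
inline bool writeMemProf(Buffer &Out,
                         const std::vector<std::uint64_t> &Records,
                         IndexedVersion Version) {
  switch (Version) {
  case Version0:
    return writeMemProfV0(Out, Records);
  case Version1:
    return writeMemProfV1(Out, Records);
  }
  return false; // Unknown version: refuse to write.
}
```

Keeping the per-version functions small and pushing common work into helpers means adding another version is mostly a matter of adding one more case.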
2024-04-04[memprof] Make RecordWriterTrait a non-template class (#87604)Kazu Hirata1-4/+3
commit d89914f30bc7c180fe349a5aa0f03438ae6c20a4
Author: Kazu Hirata <kazu@google.com>
Date: Wed Apr 3 21:48:38 2024 -0700

changed RecordWriterTrait to a template class with IndexedVersion as a template parameter. This patch changes the class back to a non-template one while retaining the ability to serialize multiple versions.

The reason I changed RecordWriterTrait to a template class was because, even if RecordWriterTrait had IndexedVersion as a member variable, RecordWriterTrait::EmitKeyDataLength, being a static function, would not have access to the variable.

Since OnDiskChainedHashTableGenerator calls EmitKeyDataLength as:

  const std::pair<offset_type, offset_type> &Len =
      InfoObj.EmitKeyDataLength(Out, I->Key, I->Data);

we can make EmitKeyDataLength a member function, but we have one problem. InstrProfWriter::writeImpl calls:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data) {
    Info InfoObj;
    insert(Key, Data, InfoObj);
  }

which default-constructs RecordWriterTrait without a specific version number. This patch fixes the problem by adjusting InstrProfWriter::writeImpl to call the other form of insert instead:

  void insert(typename Info::key_type_ref Key,
              typename Info::data_type_ref Data, Info &InfoObj)

To prevent an accidental invocation of the default constructor of RecordWriterTrait, this patch deletes the default constructor.
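The design can be sketched in isolation as below. This is a simplified illustration, not LLVM's classes: TableGenerator is a hypothetical stand-in for OnDiskChainedHashTableGenerator, the key and data types are assumptions, and the length values are made up to show that a member function can depend on the version.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

enum IndexedVersion { Version0, Version1 };

// Simplified stand-in for the writer trait: the version is per-instance state.
class RecordWriterTrait {
  IndexedVersion Version;

public:
  RecordWriterTrait() = delete; // No version-less instances.
  explicit RecordWriterTrait(IndexedVersion V) : Version(V) {}

  // A non-static member can read Version; a static function could not.
  // The extra 8 bytes for Version1 are invented for illustration.
  std::pair<std::size_t, std::size_t>
  EmitKeyDataLength(std::uint64_t /*Key*/, const std::string &Data) const {
    std::size_t Extra = (Version == Version1) ? 8 : 0;
    return {sizeof(std::uint64_t), Data.size() + Extra};
  }
};

// Hypothetical generator that only offers the insert overload taking a
// caller-provided trait object, so it never default-constructs Info.
template <typename Info> class TableGenerator {
  std::vector<std::pair<std::size_t, std::size_t>> Lengths;

public:
  void insert(std::uint64_t Key, const std::string &Data, Info &InfoObj) {
    Lengths.push_back(InfoObj.EmitKeyDataLength(Key, Data));
  }

  const std::vector<std::pair<std::size_t, std::size_t>> &lengths() const {
    return Lengths;
  }
};
```

With the default constructor deleted, any code path that tries to create the trait without a version fails to compile instead of silently picking one.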
2024-04-03[memprof] Add Version2 of IndexedMemProfRecord serialization (#87455)Kazu Hirata1-2/+4
I'm currently developing a new version of the indexed memprof format where we deduplicate call stacks in IndexedAllocationInfo::CallStack and IndexedMemProfRecord::CallSites. We refer to call stacks with integer IDs, namely CallStackId, just as we refer to Frame with FrameId. The deduplication will cut down the profile file size by 80% in a large memprof file of mine.

As a step toward the goal, this patch teaches IndexedMemProfRecord::{serialize,deserialize} to speak Version2. A subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call stack, a vector of FrameIDs, with that of a CallStackId. That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
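The size effect of that replacement can be seen in a small standalone sketch. It assumes FrameId and CallStackId are both 64-bit integers written in little-endian order; writeLE64, serializeInline, and serializeById are hypothetical helpers, not LLVM functions.

```cpp
#include <cstdint>
#include <vector>

// Assumed 64-bit stand-ins for the types named in the commit message.
using FrameId = std::uint64_t;
using CallStackId = std::uint64_t;

// Appends Value to Out as 8 little-endian bytes.
inline void writeLE64(std::vector<std::uint8_t> &Out, std::uint64_t Value) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<std::uint8_t>(Value >> (8 * I)));
}

// Before: a length prefix followed by every frame ID of the call stack.
inline void serializeInline(std::vector<std::uint8_t> &Out,
                            const std::vector<FrameId> &CallStack) {
  writeLE64(Out, CallStack.size());
  for (FrameId Id : CallStack)
    writeLE64(Out, Id);
}

// After: a single fixed-size reference to the call stack.
inline void serializeById(std::vector<std::uint8_t> &Out, CallStackId CSId) {
  writeLE64(Out, CSId);
}
```

Under these assumptions a stack of N frames costs 8 * (N + 1) bytes inline but 8 bytes by reference, regardless of depth.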
2024-04-01[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)Mingming Liu1-10/+19
(The profile format change is split out into a standalone change in https://github.com/llvm/llvm-project/pull/81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
  * This is controlled by `-enable-vtable-value-profiling` and off by default.
  * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
* Implement profile reader and writer support
  * Raw profile reader is used by `llvm-profdata` but not the compiler. The raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime addresses to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and the compiler. When initialized, the reader stores a pointer to the beginning of the in-memory compressed vtable names and the length of the string. When used in `llvm-profdata`, the reader decompresses the string to show symbols of a profiled site. When used in the compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to indexed profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-03-29Revert "[ProfileData] Use size_t in PatchItem (NFC) (#87014)"Muhammad Omair Javaid1-7/+7
This reverts commit c64a328cb4a32e81f8b694162750ec1b8823994c. This broke the 32-bit Arm build on various LLVM buildbots. For example: https://lab.llvm.org/buildbot/#/builders/17/builds/51129
2024-03-28[ProfileData] Use size_t in PatchItem (NFC) (#87014)Kazu Hirata1-7/+7
size_t in PatchItem eliminates the need for casts.
2024-03-28[memprof] Add MemProf version (#86414)Kazu Hirata1-11/+29
This patch adds a version field to the MemProf section of the indexed profile format, calling the new version "version 1". The existing version is called "version 0". The writer supports both versions via a command-line option: llvm-profdata merge --memprof-version=1 ... The reader supports both versions by automatically detecting the version from the header.
2024-03-14[ProfileData] Use ArrayRef in ProfOStream::patch (NFC) (#85317)Kazu Hirata1-12/+12
We always apply all of the items in PatchItems. This patch simplifies the interface of ProfOStream::patch by switching to ArrayRef.
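The idea of applying every patch item to already-written output can be sketched as follows. This is an illustration under assumptions, not ProfOStream itself: the PatchItem layout, the in-memory byte buffer, and the little-endian 64-bit encoding are all hypothetical, and a const std::vector reference stands in for ArrayRef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical patch description: overwrite 64-bit words starting at Pos.
struct PatchItem {
  std::size_t Pos;                   // Byte offset into the buffer.
  std::vector<std::uint64_t> Values; // Words to write at that offset.
};

// Applies all items; there is no separate count parameter because every
// item in the list is always applied.
inline void patch(std::vector<std::uint8_t> &Buf,
                  const std::vector<PatchItem> &Items) {
  for (const PatchItem &P : Items) {
    assert(P.Pos + 8 * P.Values.size() <= Buf.size() && "patch out of range");
    std::size_t Off = P.Pos;
    for (std::uint64_t V : P.Values) {
      // Little-endian encoding is an assumption of this sketch.
      for (int I = 0; I < 8; ++I)
        Buf[Off + I] = static_cast<std::uint8_t>(V >> (8 * I));
      Off += 8;
    }
  }
}
```

This back-patching pattern is typical when a header must record the offset of a section whose position is only known after the section has been written.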
2024-03-08[PGO] Add support for writing previous indexed format (#84505)Teresa Johnson1-40/+73
Enable temporary support to ease use of new llvm-profdata with slightly older indexed profiles after 16e74fd48988ac95551d0f64e1b36f78a82a89a2, which bumped the indexed format for type profiling.
2024-02-27Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711)Mingming Liu1-7/+36
New changes on top of the [reviewed patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits after this one](https://github.com/llvm/llvm-project/pull/82711/commits/d0757f46b3e3865b5f7c552bc0744309a363e0ac). Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms, by defining the missing functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`, `__llvm_profile_begin_vtabnames`, `__llvm_profile_end_vtabnames`} everywhere.
   * Tested on mac laptop (for Darwin) and Windows. Specifically, functions in `InstrProfilingPlatformWindows.c` return `NULL` to make it more explicit that type prof isn't supported; see comments for the reason.
   * For the rest (AIX, other), mostly follow existing examples (like this [one](https://github.com/llvm/llvm-project/commit/f95b2f1acf1171abb0d00089fd4c9238753847e3))
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section name, and make returned pointers [const](https://github.com/llvm/llvm-project/pull/82711/commits/a825d2a4ec00f07772a373091a702f149c3b0c34#diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121)

**Original Description**

* Raw profile format
  - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section.
  - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
  - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site.

[The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have the profile reader/writer change and the llvm-profdata change.
- To ensure this PR has all the necessary profile format changes along with the profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have the profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2024-02-21Revert type profiling change as compiler-rt test break on Windows. (#82583)Mingming Liu1-36/+7
Examples https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21[nfc]remove unused variable after pr/81691 (#82578)Mingming Liu1-1/+0
* `N` became unused after [pull request 81691](https://github.com/llvm/llvm-project/pull/81691) * This should fix the build bot failure of `unused variable` https://lab.llvm.org/buildbot/#/builders/77/builds/34840
2024-02-21[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691)Mingming Liu1-6/+36
* Raw profile format
  - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section.
  - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
  - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site.

[The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have the profile reader/writer change and the llvm-profdata change.
- To ensure this PR has all the necessary profile format changes along with the profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have the profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <modiking213@gmail.com>
2023-12-15[MemProf][NFC] Clear each IndexedMemProfRecord after it is written (#75205)Teresa Johnson1-0/+3
The on-disk hash table for the memprof writer holds copies of all the memprof records to be written. These hold a lot of memory in aggregate, due to the lists of alloc sites (which each have a list of context frames) and call sites. Clear each one after emitting it. This drops the peak memory when writing a very large indexed memprof profile by about 2.5G.
2023-12-15[MemProf][NFC] Free large data structures after last use (#75120)Teresa Johnson1-0/+4
The MemProf InstrProfWriter uses a couple of MapVector for building the lists of records it needs to write. Once its entries are all added to the associated OnDiskChainedHashTableGenerator, it is no longer used. Clearing these MapVectors, which grow quite large for large profiles, saved 4G for a large memory profile.
2023-11-20[InstrProf] Add pgo use block coverage test (#72443)Ellis Hoag1-0/+2
Back in https://reviews.llvm.org/D124490 we added a block coverage mode that instruments a subset of basic blocks using single byte counters to get coverage for the whole function. This commit adds a test to make sure that we correctly assign branch weights based on the coverage profile. I noticed this test was missing after seeing that we had no coverage on `PGOUseFunc::populateCoverage()` https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp.html#L1383
2023-10-30Reland "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"Alan Phipps1-0/+19
Part 1 of 3. This includes the LLVM back-end processing and profile reading/writing components. compiler-rt changes are included. Differential Revision: https://reviews.llvm.org/D138846
2023-10-13Use llvm::endianness::{big,little,native} (NFC)Kazu Hirata1-3/+4
Note that llvm::support::endianness has been renamed to llvm::endianness while becoming an enum class. This patch replaces {big,little,native} with llvm::endianness::{big,little,native}. This patch completes the migration to llvm::endianness and llvm::endianness::{big,little,native}. I'll post a separate patch to remove the migration helpers in llvm/Support/Endian.h:

  using endianness = llvm::endianness;
  constexpr llvm::endianness big = llvm::endianness::big;
  constexpr llvm::endianness little = llvm::endianness::little;
  constexpr llvm::endianness native = llvm::endianness::native;
2023-10-12Use llvm::endianness::{big,little,native} (NFC)Kazu Hirata1-3/+3
Note that llvm::support::endianness has been renamed to llvm::endianness while becoming an enum class as opposed to an enum. This patch replaces support::{big,little,native} with llvm::endianness::{big,little,native}.
2023-10-10Use llvm::endianness (NFC)Kazu Hirata1-3/+2
Now that llvm::support::endianness has been renamed to llvm::endianness, we can use the shorter form. This patch replaces support::endianness with llvm::endianness.
2023-10-04[NFC]Rename InstrProf::getFuncName{,orExternalSymbol} to getFuncOrValName{,IfDefined} (#68240)Mingming Liu1-2/+2
- This function looks up MD5ToNameMap to return a name for a given MD5. https://github.com/llvm/llvm-project/pull/66825 adds MD5 of global variable names into this map. So rename methods and update comments
2023-09-21Revert "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"Hans Wennborg1-19/+0
This seems to cause Clang to crash, see comments on the code review. Reverting until the problem can be investigated.

> Part 1 of 3. This includes the LLVM back-end processing and profile
> reading/writing components. compiler-rt changes are included.
>
> Differential Revision: https://reviews.llvm.org/D138846

This reverts commit a50486fd736ab2fe03fcacaf8b98876db77217a7.
2023-09-19[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)Alan Phipps1-0/+19
Part 1 of 3. This includes the LLVM back-end processing and profile reading/writing components. compiler-rt changes are included. Differential Revision: https://reviews.llvm.org/D138846
2023-07-20[llvm-profdata] Stabilize iteration order for InstrProfWriterFangrui Song1-1/+5
If two functions are inserted to the same bucket, their order in the serialized profile is dependent on StringMap iteration order, which is not guaranteed to be deterministic. (https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h). Use a sort like we do in writeText.
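A minimal sketch of the technique: copy the entries of an unordered container into a vector and sort them before writing, so the output does not depend on hash-table iteration order. Here std::unordered_map stands in for StringMap, and sortedEntries is a hypothetical helper rather than the code from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Returns the map's entries ordered by name (then by value), which gives a
// deterministic order no matter how the hash table happens to iterate.
inline std::vector<std::pair<std::string, std::uint64_t>>
sortedEntries(const std::unordered_map<std::string, std::uint64_t> &Map) {
  std::vector<std::pair<std::string, std::uint64_t>> Entries(Map.begin(),
                                                             Map.end());
  std::sort(Entries.begin(), Entries.end());
  return Entries;
}
```

Sorting costs O(n log n) per write, which is usually a small price for byte-identical output across runs and platforms.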
2023-06-28[instrprof] Add an overload to accept raw_string_ostream.Snehasish Kumar1-2/+6
Add an overload for InstrProfWriter::write so that users can emit the buffer to a string. Also use this new overload for existing unit test use cases. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D153904
2023-04-13[InstrProf][Temporal] Add weight field to tracesEllis Hoag1-8/+10
As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently. Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version. [0] https://reviews.llvm.org/D147812#4259507 [1] https://reviews.llvm.org/D147287 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D148150
2023-04-11[InstrProf] Temporal ProfilingEllis Hoag1-6/+108
As described in [0], this extends IRPGO to support //Temporal Profiling//. When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile. Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively. In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup. Special thanks to Julian Mestre for helping with reservoir sampling. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D147287
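For readers unfamiliar with reservoir sampling, the sketch below shows the classic "Algorithm R" form applied to traces. It is an illustration under assumptions, not the llvm-profdata implementation: the TraceReservoir class, the representation of a trace as a vector of 64-bit IDs, truncating over-long traces, and the seeding scheme are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical trace representation: an ordered list of function IDs.
using Trace = std::vector<std::uint64_t>;

// Keeps at most Capacity traces; after N traces have been offered, each one
// has an equal Capacity/N chance of being in the reservoir.
class TraceReservoir {
  std::size_t Capacity;
  std::size_t Seen = 0;
  std::mt19937_64 Rng;
  std::vector<Trace> Traces;

public:
  TraceReservoir(std::size_t Capacity, std::uint64_t Seed)
      : Capacity(Capacity), Rng(Seed) {}

  void add(Trace T, std::size_t MaxLength) {
    // Assumed policy: truncate traces longer than the length limit.
    if (T.size() > MaxLength)
      T.resize(MaxLength);
    ++Seen;
    if (Traces.size() < Capacity) {
      Traces.push_back(std::move(T));
      return;
    }
    // Pick a slot in [0, Seen - 1]; replace only if it lands in the reservoir.
    std::uniform_int_distribution<std::size_t> Dist(0, Seen - 1);
    std::size_t J = Dist(Rng);
    if (J < Capacity)
      Traces[J] = std::move(T);
  }

  const std::vector<Trace> &traces() const { return Traces; }
  std::size_t seen() const { return Seen; }
};
```

The two limits in the sketch play the roles that the commit message assigns to `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size`: one bounds each trace, the other bounds how many traces are kept.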
2023-03-14[llvm] Use *{Set,Map}::contains (NFC)Kazu Hirata1-1/+1
2022-12-29[profile] Add binary ids into indexed profilesGulfem Savrun Yeniceri1-4/+62
This patch adds support for including binary ids in an indexed profile. It adds a new field into the header that points to the offset of the binary id section. The binary id section consists of a size of the section, and a list of binary ids (if they are present) that consist of two parts: length and data. This patch guarantees that indexed profile is backwards compatible after adding binary ids. Differential Revision: https://reviews.llvm.org/D135929
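The section layout described above (a section size, then length and data for each binary id) can be sketched as follows. The 64-bit little-endian field widths and the absence of padding are assumptions of this sketch, and buildBinaryIdSection and writeLE64 are hypothetical helpers; the real writer may differ in these details.

```cpp
#include <cstdint>
#include <vector>

// Appends Value to Out as 8 little-endian bytes (assumed field encoding).
inline void writeLE64(std::vector<std::uint8_t> &Out, std::uint64_t Value) {
  for (int I = 0; I < 8; ++I)
    Out.push_back(static_cast<std::uint8_t>(Value >> (8 * I)));
}

// Builds: [payload size][len0][data0][len1][data1]...
// An empty list of ids yields just a size field of zero.
inline std::vector<std::uint8_t>
buildBinaryIdSection(const std::vector<std::vector<std::uint8_t>> &Ids) {
  std::vector<std::uint8_t> Payload;
  for (const std::vector<std::uint8_t> &Id : Ids) {
    writeLE64(Payload, Id.size());
    Payload.insert(Payload.end(), Id.begin(), Id.end());
  }
  std::vector<std::uint8_t> Section;
  writeLE64(Section, Payload.size());
  Section.insert(Section.end(), Payload.begin(), Payload.end());
  return Section;
}
```

Leading with the total size lets a reader that does not care about binary ids skip the whole section in one step.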
2022-12-14Revert "[profile] Add binary ids into indexed profiles"Gulfem Savrun Yeniceri1-62/+4
This reverts commit 7734053fd98e7d5ddc749808ce38134686425fb7 because it broke powerpc64 bot: https://lab.llvm.org/buildbot#builders/231/builds/6229
2022-12-14[profile] Add binary ids into indexed profilesGulfem Savrun Yeniceri1-4/+62
This patch adds support for including binary ids in an indexed profile. It adds a new field into the header that points to the offset of the binary id section. The binary id section consists of a size of the section, and a list of binary ids (if they are present) that consist of two parts: length and data. This patch guarantees that indexed profile is backwards compatible after adding binary ids. Differential Revision: https://reviews.llvm.org/D135929
2022-11-26[llvm] Use std::size (NFC)Kazu Hirata1-1/+1
std::size, introduced in C++17, allows us to directly obtain the number of elements of an array.
2022-11-04[llvm-profdata] Check for all duplicate entries in MemOpSize tableMatthew Voss1-6/+3
Previously, we only checked for duplicate zero entries when merging a MemOPSize table (see D92074), but a user recently provided a reproducer demonstrating that other entries can also be duplicated. As demonstrated by the test in this patch, PGOMemOPSizeOpt can potentially generate invalid IR for non-zero, non-consecutive duplicate entries. This seems to be a rare case, since the duplicate entry is often below the threshold, but possible. This patch extends the existing warning to check for any duplicate values in the table, both in the optimization and in llvm-profdata. Differential Revision: https://reviews.llvm.org/D136211
2022-04-08[memprof] Deduplicate and outline frame storage in the memprof profile.Snehasish Kumar1-21/+68
The current implementation of memprof information in the indexed profile format stores the representation of each calling context frame inline. This patch uses an interned representation where the frame contents are stored in a separate on-disk hash table. The table is indexed via a hash of the contents of the frame. With this patch, the compressed size of a large memprof profile reduces by ~22%. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D123094
2022-03-22Reland "[memprof] Store callsite metadata with memprof records."Snehasish Kumar1-25/+11
This reverts commit f4b794427e8037a4e952cacdfe7201e961f31a6f. Reland with underlying msan issue fixed in D122260.
2022-03-21Revert "[memprof] Store callsite metadata with memprof records."Mitch Phillips1-11/+25
This reverts commit 0d362c90d335509c57c0fbd01ae1829e2b9c3765. Reason: Causes the MSan buildbot to fail (see comments on https://reviews.llvm.org/D121179 for more information).
2022-03-21[memprof] Store callsite metadata with memprof records.Snehasish Kumar1-25/+11
To ease profile annotation, each of the callsites in a function can be annotated with profile data - "IR metadata format for MemProf" [1]. This patch extends the on-disk serialized record format to store the debug information for allocation callsites, including inline frames. This change is incompatible with the existing format, i.e. indexed profiles must be regenerated; raw profiles are unaffected. [1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D121179
2022-02-24Cleanup includes: ProfileDataserge-sans-paille1-1/+0
Estimation of the impact on preprocessor output: before: 1067349756 after: 1065940348 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120434
2022-02-23[instrprof] Rename the profile kind types to be more descriptive.Snehasish Kumar1-8/+10
Based on the discussion in D115393, I've updated the names to be more descriptive. Reviewed By: ellis, MaskRay Differential Revision: https://reviews.llvm.org/D120092
2022-02-17Reland "[memprof] Extend the index prof format to include memory profiles."Snehasish Kumar1-4/+86
This patch adds support for optional memory profile information to be included with an indexed profile. The indexed profile header adds a new field which points to the offset of the memory profile section (if present) in the indexed profile. For users who do not utilize this feature the only overhead is a 64-bit offset in the header. The memory profile section contains (1) profile metadata describing the information recorded for each entry (2) an on-disk hashtable containing the profile records indexed via llvm::md5(function_name). We chose to introduce a separate hash table instead of the existing one since the indexing for the instrumented fdo hash table is based on a CFG hash which itself is perturbed by memprof instrumentation. This commit also includes the changes reviewed separately in D120093. Differential Revision: https://reviews.llvm.org/D120103
2022-02-17Revert "Reland "[memprof] Extend the index prof format to include memory profiles.""Snehasish Kumar1-86/+4
This reverts commit 807ba7aace188ada83ddb4477265728e97346af1.
2022-02-17Reland "[memprof] Extend the index prof format to include memory profiles."Snehasish Kumar1-4/+86
This reverts commit 85355a560a33897453df2ef959e255ee725eebce. This patch adds support for optional memory profile information to be included with an indexed profile. The indexed profile header adds a new field which points to the offset of the memory profile section (if present) in the indexed profile. For users who do not utilize this feature the only overhead is a 64-bit offset in the header. The memory profile section contains (1) profile metadata describing the information recorded for each entry (2) an on-disk hashtable containing the profile records indexed via llvm::md5(function_name). We chose to introduce a separate hash table instead of the existing one since the indexing for the instrumented fdo hash table is based on a CFG hash which itself is perturbed by memprof instrumentation. Differential Revision: https://reviews.llvm.org/D118653
2022-02-14Revert "Reland "[memprof] Extend the index prof format to include memory profiles.""Snehasish Kumar1-86/+4
This reverts commit de54e4ab78ef09b60f870e8df6f8a87e56d6bd94 [1/4]
2022-02-14Reland "[memprof] Extend the index prof format to include memory profiles."Snehasish Kumar1-4/+86
This reverts commit 0f73fb18ca333e38cdb9ffa701a8db026c56041d. Use llvm/Profile/MIBEntryDef.inc instead of a relative path. Generated the raw profile data with `-mllvm -enable-name-compression=false` so that buildbots where the reader is built without zlib do not fail. Also updated the test build instructions.