aboutsummaryrefslogtreecommitdiff
path: root/llvm/unittests/ProfileData/InstrProfTest.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-06-24[InstrProf] Factor out getRecord() and use NamedInstrProfRecord (#145417)Ellis Hoag1-11/+9
Factor out code in `populateCounters()` and `populateCoverage()` used to grab the record into `PGOUseFunc::getRecord()` to reduce code duplication. And return `NamedInstrProfRecord` in `getInstrProfRecord()` to avoid an unnecessary cast. No functional change is intented.
2025-05-19[NFC][MemProf] Move IndexedMemProfData to its own header. (#140503)Snehasish Kumar1-0/+1
Part of a larger refactoring with the following goals 1. Reduce the size of MemProf.h 2. Avoid including ModuleSummaryIndex just for a couple of types
2025-05-19[NFC][MemProf] Move Radix tree methods to their own header and cpp. (#140501)Snehasish Kumar1-0/+1
Part of a larger refactoring with the following goals 1. Reduce the size of MemProf.h 2. Avoid including ModuleSummaryIndex just for a couple of types
2025-05-19[NFC][MemProf] Fix typo in type name (#140500)Snehasish Kumar1-3/+3
2025-03-12[MemProf] Extend CallSite information to include potential callees. (#130441)Snehasish Kumar1-1/+1
* Added YAML traits for `CallSiteInfo` * Updated the `MemProfReader` to pass `Frames` instead of the entire `CallSiteInfo` * Updated test cases to use `testing::Field` * Add YAML sequence traits for CallSiteInfo in MemProfYAML * Also extend IndexedMemProfRecord * XFAIL the MemProfYaml round trip test until we update the profile format For now we only read and write the additional information from the YAML format. The YAML round trip test will be enabled when the serialized format is updated.
2024-12-19[memprof] Introduce IndexedCallstackIdConveter (NFC) (#120540)Kazu Hirata1-20/+11
This patch introduces IndexedCallstackIdConveter as a convenience wrapper around FrameIdConverter and CallStackIdConverter just for tests. With the new wrapper, we get to replace idioms like: FrameIdConverter<decltype(MemProfData.Frames)> FrameIdConv( MemProfData.Frames); CallStackIdConverter<decltype(MemProfData.CallStacks)> CSIdConv( MemProfData.CallStacks, FrameIdConv); with: IndexedCallstackIdConveter CSIdConv(MemProfData); Unfortunately, this exact pattern occurs in tests only; the combinations of the frame ID converter and call stack ID converter are diverse in production code.
2024-12-08[memprof] Drop memprof:: in unit tests (NFC) (#119113)Kazu Hirata1-2/+2
This patch replaces memprof::Foo with Foo if we have corresponding: using llvm::memprof::Foo;
2024-12-07[memprof] Add getMemProfDataForTest for unit tests (#119061)Kazu Hirata1-31/+19
We always call getFrameMapping and getCallStackMapping together in InstrProfTest.cpp. This patch combines the two functions into new function getMemProfDataForTest.
2024-12-07[memprof] Remove a stale comment in a unit test (#119060)Kazu Hirata1-2/+0
We've removed IndexedAllocationInfo::CallStack, so we don't need to mention it.
2024-12-06[memprof] Use gtest matchers at more places (#119050)Kazu Hirata1-7/+5
These gtest matchers reduce the number of times we mention the variables under examined.
2024-11-22[memprof] Remove MemProf format Version 1 (#117357)Kazu Hirata1-57/+0
This patch removes MemProf format Version 1 now that Version 2 and 3 are working well.
2024-11-20[memprof] Upgrade a unit test to MemProf Version 3 (#117063)Kazu Hirata1-19/+3
This patch upgrades a unit test to MemProf Version 3 while removing those bits that cannot be upgraded to Version 3. The bits being removed expect instrprof_error::hash_mismatch from a broken MemProf profile that references a frame that doesn't actually exist. Now, Version 3 no longer issues instrprof_error::hash_mismatch. Even if it still issued instrprof_error::hash_mismatch, we would have a couple of hurdles: - InstrProfWriter::addMemProfData will soon require all (or none) of the fields (frames, call stacks, and records) be populated. That is, it won't accept an instance of IndexedMemProfData with frames missing. - writeMemProfV3 asserts that every frame occurs at least once: assert(MemProfData.Frames.size() == FrameHistogram.size()); This patch gives up on instrprof_error::hash_mismatch and tries to trigger instrprof_error::unknown_function with the empty profile.
2024-11-20[memprof] Use LineLocation in a unit test (NFC) (#117031)Kazu Hirata1-4/+6
We've switched to LineLocation from FieldsAre in MemProfUseTest.cpp. This patch does the same thing in InstrProfTest.cpp. llvm/unittests/Transforms/Instrumentation/MemProfUseTest.cpp
2024-11-18[memprof] Add another constructor to IndexedAllocationInfo (NFC) (#116684)Kazu Hirata1-2/+1
This patch adds another constructor to IndexedAllocationInfo that is identical to the existing constructor except that the new one leaves the CallStack field empty. I'm planning to remove MemProf format Version 1. Then we will migrate the users of the existing constructor to the new one as nobody will be using the CallStack field anymore. Adding the new constructor now allows us to migrate a few existing users of the old constructor even before we remove the CallStack field. In turn, that simplifies the patch to actually remove the field.
2024-11-18[memprof] Add InstrProfWriter::addMemProfData (#116528)Kazu Hirata1-61/+51
This patch adds InstrProfWriter::addMemProfData, which adds the complete MemProf profile (frames, call stacks, and records) to the writer context. Without this function, functions like loadInput in llvm-profdata.cpp and InstrProfWriter::mergeRecordsFromWriter must add one item (frame, call stack, or record) at a time. The new function std::moves the entire MemProf profile to the writer context if the destination is empty, which is the common use case. Otherwise, we fall back to adding one item at a time behind the scene. Here are a couple of reasons why we should add this function: - We've had a bug where we forgot to add one of the three data structures (frames, call stacks, and records) to the writer context, resulting in a nearly empty indexed profile. We should always package the three data structures together, especially on API boundaries. - We expose a little too much of the MemProf detail to InstrProfWriter. I'd like to gradually transform InstrProfReader/Writer to entities managing buffers (sequences of bytes), with actual serialization/deserialization left to external classes. We already do some of this in InstrProfReader, where InstrProfReader "contracts out" to IndexedMemProfReader to handle MemProf details. I am not changing loadInput or InstrProfWriter::mergeRecordsFromWriter for now because MemProfReader uses DenseMap for frames and call stacks, whereas MemProfData uses MapVector. I'll resolve these mismatches in subsequent patches.
2024-11-16[memprof] Upgrade a unit test to Version 3 (#116516)Kazu Hirata1-20/+16
I'm planning to remove MemProf Version 1, which is a maintenance burden because it uses a different set of struct fields in IndexedAllocationInfo and IndexedMemProfRecord compared to Version 2 and 3. (FWIW, Version 2 and 3 are much closer to each other.) Before we remove the old version, we need to upgrade test_memprof_merge to Version 3. Here are some remarks: - Without this patch, we call Writer.addMemProfFrame, which I don't think is correct. This way, we are adding IndexedMemProfRecord to Writer2 without any frame. I'm changing that to Writer2.addMemProfFrame. - This patch adds a call to getCallStackMapping. Version 2 and 3 require a map from call stack IDs to call stacks. - I'm calling makeRecordV2 instead of makeRecord to populate the struct fields used by Version 2 and 3. - Version 1 uses MemProfRecord::MemProfRecord (that is, a constructor) to convert IndexedMemProfRecord to MemProfRecord. Version 2 and 3 use MemProfRecord::toMemProfRecord, a member function, to do the same task.
2024-11-16[memprof] Update a comment (NFC) (#116500)Kazu Hirata1-1/+1
Note that Version0 has been removed in #116442.
2024-11-13[memprof] Add IndexedMemProfReader::getMemProfCallerCalleePairs (#115807)Kazu Hirata1-0/+62
Undrifting the MemProf profile requires two sets of information: - caller-callee pairs from the profile - callee-callee pairs from the IR This patch adds a function to do the former. The latter has been addressed by extractCallsFromIR. Unfortunately, the current MemProf format does not directly give us the caller-callee pairs from the profile. "struct Frame" just tells us where the call site is -- Caller GUID and line/column numbers; it doesn't tell us what function a given Frame is calling. To extract caller-callee pairs, we need to scan each call stack, look at two adjacent Frames, and extract a caller-callee pair. Conceptually, we would extract caller-callee pairs with: for each MemProfRecord in the profile: for each call stack in AllocSites: extract caller-callee pairs from adjacent pairs of Frames However, this is highly inefficient. Obtaining MemProfRecord involves looking up the OnDiskHashTable, allocating several vectors on the heap, and populating fields that are irrelevant to us, such as MIB and CallSites. This patch adds an efficient way of doing the above. Specifically, we - go though all IndexedMemProfRecords, - look at each linear call stack ID - extract caller-callee pairs from each call stack The extraction is done by a new class CallerCalleePairExtractor, modified from LinearCallStackIdConverter, which reconstructs a call stack from the radix tree array. For our purposes, we skip the reconstruction and immediately populates the data structure for caller-callee pairs. The resulting caller-callee-pairs is of the type: DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> CallerCalleePairs; which can be passed directly to longestCommonSequence just like the result of extractCallsFromIR. Further performance optimizations are possible for the new functions in this patch. I'll address those in follow-up patches.
2024-09-21[unittests] Use {} instead of std::nullopt to initialize empty ArrayRef ↵Jay Foad1-7/+7
(#109388) Follow up to #109133.
2024-09-13[llvm][unittests] Don't call raw_string_ostream::flush() (NFC)JOE19941-1/+0
raw_string_ostream::flush() is essentially a no-op (also specified in docs). Don't call it in tests that aren't meant to test 'raw_string_ostream' itself. p.s. remove a few redundant calls to raw_string_ostream::str()
2024-07-11[ProfileData] Take ArrayRef<InstrProfValueData> in addValueData (NFC) (#97363)Kazu Hirata1-39/+39
This patch fixes another place in ProfileData where we have a pointer to an array of InstrProfValueData and its length separately. addValueData is a bit unique in that it remaps incoming values in place before adding them to ValueSites. AFAICT, no caller of addValueData uses updated incoming values. With this patch, we add value data to ValueSites first and then remaps values there. This way, we can take ArrayRef<InstrProfValueData> as a parameter.
2024-06-29[ProfileData] Migrate to a new version of getValueProfDataFromInst (#96381)Kazu Hirata1-16/+10
2024-06-20[ProfileData] Teach addValueData to honor parameter Site (#96233)Kazu Hirata1-3/+3
This patch teaches addValueData to honor Site for verification purposes. It does not affect the profile data in any manner.
2024-06-14[ProfileData] Sink the length checks (#95604)Kazu Hirata1-14/+16
The new API getValueArrayForSite returns ArrayRef<InstrProfValueData>, packaging the array length and contents together. This patch sinks the array length checks just before we check the contents. This way, we check both the array length and contents immediately after calling getValueArrayForSite.
2024-06-14[ProfileData] Migrate to a new version of getValueProfDataFromInst (#95568)Kazu Hirata1-16/+12
Note that the version of getValueProfDataFromInst that returns bool has been "deprecated" since: commit 1e15371dd8843dfc52b9435afaa133997c1773d8 Author: Mingming Liu <mingmingl@google.com> Date: Mon Apr 1 15:14:49 2024 -0700
2024-06-14[ProfileData] Sink the length checks (#95559)Kazu Hirata1-11/+14
The new API getValueArrayForSite returns ArrayRef<InstrProfValueData>, packaging the array length and contents together. This patch sinks the array length checks just before we check the contents. This way, we check both the array length and contents immediately after calling getValueArrayForSite.
2024-06-14[ProfileData] Fix the order of tests (#95549)Kazu Hirata1-2/+2
Without this patch, we call getValueForSite before veryfing that we have an expected number of value sites with getNumValueSites. This patch fixes the order by "sinking" the call to getValueForSite. While I am at it, this patch migrates the use of getValueForSite to getValueArrayForSite.
2024-06-14[ProfileData] Migrate to getValueArrayForSite (#95493)Kazu Hirata1-34/+43
This patch migrates uses of getValueForSite to getValueArrayForSite. Each hunk is self-contained, meaning that each one can be applied independently of the others. In the unit test, there are cases where the array length check is performed a lot earlier than the array content check. For now, I'm leaving the length checks where they are. I'll consider moving them when I migrate uses of getNumValueDataForSite to getValueArrayForSite in a follow-up patch.
2024-06-13[ProfileData] Migrate to getValueArrayForSite (#95457)Kazu Hirata1-5/+4
This patch is a collection of one-liner migrations to getValueArrayForSite.
2024-06-13[ProfileData] Add getValueArrayForSite (#95335)Kazu Hirata1-7/+7
Without this patch, a typical traversal over the value data looks like: uint32_t NV = Func.getNumValueDataForSite(VK, S); std::unique_ptr<InstrProfValueData[]> VD = Func.getValueForSite(VK, S); for (uint32_t V = 0; V < NV; V++) Do something with VD[V].Value and/or VD[V].Count; This patch adds getValueArrayForSite, which returns ArrayRef<InstrProfValueData>, so we can do: for (const auto &V : Func.getValueArrayForSite(VK, S)) Do something with V.Value and/or V.Count; I'm planning to migrate the existing uses of getValueForSite to getValueArrayForSite in follow-up patches and remove getValueForSite and getNumValueDataForSite.
2024-06-12[ProfileData] Compute sum in annotateValueSite (NFC) (#95199)Kazu Hirata1-9/+3
getValueForSite computes the total count -- the total number of times a given value site is visited. The problem is that, excluding tests, annotateValueSite is the only place that needs the total count. This patch moves the total count computation to annotateValueSite.
2024-05-09[ProfData] Remove unused variable in unittestMin Hsu1-1/+0
Removed unused `VTables` in unittests/ProfileData/InstrProfTest.cpp. NFC.
2024-05-09[InstrProf] Add vtables with type metadata into symtab (#81051)Mingming Liu1-0/+47
The indirect-call-promotion pass will look up the vtable to find out the virtual function [1], and add vtable-derived information in icall candidate [2] for cost-benefit analysis. [1] https://github.com/llvm/llvm-project/pull/81442/files#diff-a95d1ac8a0da69713fcb3346135d4b219f0a73920318d2549495620ea215191bR395-R416 [2] https://github.com/llvm/llvm-project/pull/81442/files#diff-a95d1ac8a0da69713fcb3346135d4b219f0a73920318d2549495620ea215191bR195-R199
2024-04-28[memprof] Add access checks to PortableMemInfoBlock::get* (#90121)Kazu Hirata1-4/+4
commit 4c8ec8f8bc3fb4dda4fd36c3b2ad745bd3451970 Author: Kazu Hirata <kazu@google.com> Date: Wed Apr 24 16:25:35 2024 -0700 introduced the idea of serializing/deserializing a subset of the fields in PortableMemInfoBlock. While it reduces the size of the indexed MemProf profile file, we now could inadvertently access unavailable fields and go without noticing. To protect ourselves from the risk, this patch adds access checks to PortableMemInfoBlock::get* methods by embedding a bit set representing available fields into PortableMemInfoBlock.
2024-04-28Repply [memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)Kazu Hirata1-48/+14
Currently, we convert FrameId to Frame and CallStackId to a call stack at several places. This patch unifies those into function objects -- FrameIdConverter and CallStackIdConverter. The existing implementation of CallStackIdConverter, being removed in this patch, handles both FrameId and CallStackId conversions. This patch splits it into two phases for flexibility (but make them composable) because some places only require the FrameId conversion. This iteration fixes a problem uncovered with ubsan, where we were dereferencing an uninitialized std::unique_ptr.
2024-04-27Revert "[memprof] Introduce FrameIdConverter and CallStackIdConverter" (#90318)Vitaly Buka1-14/+48
Reverts llvm/llvm-project#90307 Breaks bots https://lab.llvm.org/buildbot/#/builders/5/builds/42943
2024-04-26[memprof] Introduce FrameIdConverter and CallStackIdConverter (#90307)Kazu Hirata1-48/+14
Currently, we convert FrameId to Frame and CallStackId to a call stack at several places. This patch unifies those into function objects -- FrameIdConverter and CallStackIdConverter. The existing implementation of CallStackIdConverter, being removed in this patch, handles both FrameId and CallStackId conversions. This patch splits it into two phases for flexibility (but make them composable) because some places only require the FrameId conversion.
2024-04-24[memprof] Reduce schema for Version2 (#89876)Kazu Hirata1-33/+109
Curently, the compiler only uses several fields of MemoryInfoBlock. Serializing all fields into the indexed MemProf file simply wastes storage. This patch limits the schema down to four fields for Version2 by default. It retains the old behavior of serializing all fields via: llvm-profdata merge --memprof-version=2 --memprof-full-schema This patch reduces the size of the indexed MemProf profile I have by 40% (1.6GB down to 1.0GB).
2024-04-18[memprof] Use std::optional (NFC) (#89317)Kazu Hirata1-6/+4
This is partly for readability and partly for consistency with other id-to-frame callbacks.
2024-04-18[memprof] Add Version2 of the indexed MemProf format (#89100)Kazu Hirata1-1/+88
This patch adds Version2 of the indexed MemProf format. The new format comes with a hash table from CallStackId to actual call stacks llvm::SmallVector<FrameId>. The rest of the format refers to call stacks with CallStackId. This "values + references" model effectively deduplicates call stacks. Without this patch, a large indexed memprof file of mine shrinks from 4.4GB to 1.6GB, a 64% reduction. This patch does not make Version2 generally available yet as I am planning to make a few more changes to the format.
2024-04-11[memprof] Use std::optional (NFC) (#88366)Kazu Hirata1-5/+4
2024-04-01[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write ↵Mingming Liu1-141/+479
(#66825) (The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691) * For InstrFDO value profiling, implement instrumentation and lowering for virtual table address. * This is controlled by `-enable-vtable-value-profiling` and off by default. * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads. * Implement profile reader and writer support * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols. * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab. * Indexed profile writer collects the list of vtable names, and stores that to index profiles. * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type. * `llvm-profdata show -show-vtables <args> <profile>` is implemented. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-03-23[memprof] Add call stack IDs to IndexedAllocationInfo (#85888)Kazu Hirata1-1/+2
The indexed MemProf file has a huge amount of redundancy. In a large internal application, 82% of call stacks, stored in IndexedAllocationInfo::CallStack, are duplicates. We should work toward deduplicating call stacks by referring to them with unique IDs with actual call stacks stored in a separate data structure, much like we refer to memprof::Frame with memprof::FrameId. At the same time, we need to facilitate a graceful transition from the current version of the MemProf format to the next. We should be able to read (but not write) the current version of the MemProf file even after we move onto the next one. With those goals in mind, I propose to have an integer ID next to CallStack in IndexedAllocationInfo to refer to a call stack in a succinct manner. We'll gradually increase the areas of the compiler where IDs and call stacks have one-to-one correspondence and eventually remove the existing CallStack field. This patch adds call stack ID, named CSId, to IndexedAllocationInfo and teaches the raw profile reader to compute unique call stack IDs and store them in the new field. It does not introduce any user of the call stack IDs yet, except in verifyFunctionProfileData.
2024-02-07[NFC][InstrProf]Generalize getParsedIRPGOFuncName to getParsedIRPGOName (#81054)Mingming Liu1-4/+2
- Function getParsedIRPGOFuncName splits name by delimiter. The `[filename;]mangled-name` format could be generalized for non-function global values (e.g., vtables for type profiling). So rename the function. - Use kGlobalIdentifierDelimiter rather than semicolon directly for defragmentation.
2024-01-04[InstrProf] No linkage prefixes in IRPGO names (#76994)Ellis Hoag1-16/+2
Change the format of IRPGO counter names to `[<filepath>;]<mangled-name>` which is computed by `GlobalValue::getGlobalIdentifier()` to fix #74565. In fe051934cbb0aaf25d960d7d45305135635d650b (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is basically `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O (yes, it is confusing that `<linkage-name>` is computed with `Mangler().getNameWithPrefix()` while `<mangled-name>` is just `F.getName()`). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix. Since `<mangled-name>` may not match the `<linkage-name>` on some targets like Mach-O, we will need to post-process the output of `llvm-profdata order` before passing to the linker via `-order_file`. Profiles generated after fe051934cbb0aaf25d960d7d45305135635d650b will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.
2023-12-19Reland the reland "[PGO][GlobalValue][LTO]In ↵Mingming Liu1-2/+2
GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954) Simplify the compiler-rt test to make it more general for different platforms, and use `*DAG` matchers for lines that may be emitted out-of-order. - The compiler-rt test passed on a Windows machine. Previously name matchers don't work for MSVC mangling (https://lab.llvm.org/buildbot/#/builders/127/builds/59907) - `*DAG` matchers fixed the error in https://lab.llvm.org/buildbot/#/builders/94/builds/17924 This is the second reland and fixed errors caught in first reland (https://github.com/llvm/llvm-project/pull/75860) **Original commit message** Commit fe05193 (phab D156569), IRPGO names uses format `[<filepath>;]<linkage-name>` while prior format is `[<filepath>:<mangled-name>`. The format change would break the use case demonstrated in (updated) `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` This patch changes `GlobalValues::getGlobalIdentifer` to use the semicolon. To elaborate on the scenario how things break without this PR 1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. The [NameRef](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/compiler-rt/include/profile/InstrProfData.inc#L72) field in per-function profile data is the MD5 hash of IRPGO names. 2. When raw profiles are converted to indexed format profiles, the profiled address is [mapped](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/ProfileData/InstrProf.cpp#L876-L885) to the MD5 hash of the callee. 3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names will be [annotated](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1707) as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported. * `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` is added to have an end-to-end test. * `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors)
2023-12-18Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, ↵Mingming Liu1-2/+2
use semicolon as delimiter for local-linkage varibles. "" (#75888) Reverts llvm/llvm-project#75860 - Mangled name mismatch on Windows (https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use ↵Mingming Liu1-2/+2
semicolon as delimiter for local-linkage varibles. " (#75860) Fixed build-bot failures caught by post-submit tests 1) Add the list of command line tools needed by new compiler-rt test into dependency. 2) Use `starts_with` to replace deprecated `startswith`. **Original commit message** Commit fe05193 (phab D156569), IRPGO names uses format `[<filepath>;]<linkage-name>` while prior format is `[<filepath>:<mangled-name>`. The format change would break the use case demonstrated in (updated) `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` This patch changes `GlobalValues::getGlobalIdentifer` to use the semicolon. To elaborate on the scenario how things break without this PR 1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. The [NameRef](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/compiler-rt/include/profile/InstrProfData.inc#L72) field in per-function profile data is the MD5 hash of IRPGO names. 2. When raw profiles are converted to indexed format profiles, the profiled address is [mapped](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/ProfileData/InstrProf.cpp#L876-L885) to the MD5 hash of the callee. 3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names will be [annotated](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1707) as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported. * `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` is added to have an end-to-end test. * `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors)
2023-12-18Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use ↵Mingming Liu1-2/+2
semicolon as delimiter for local-linkage varibles." (#75835) Reverts llvm/llvm-project#74008 The compiler-rt test failed due to `llvm-dis` not found (https://lab.llvm.org/buildbot/#/builders/127/builds/59884) Will revert and investigate how to require the proper dependency.
2023-12-18[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon ↵Mingming Liu1-2/+2
as delimiter for local-linkage varibles. (#74008) Commit fe05193 (phab D156569), IRPGO names uses format `[<filepath>;]<linkage-name>` while prior format is `[<filepath>:<mangled-name>`. The format change would break the use case demonstrated in (updated) `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` This patch changes `GlobalValues::getGlobalIdentifer` to use the semicolon. To elaborate on the scenario how things break without this PR 1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. The [NameRef](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/compiler-rt/include/profile/InstrProfData.inc#L72) field in per-function profile data is the MD5 hash of IRPGO names. 2. When raw profiles are converted to indexed format profiles, the profiled address is [mapped](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/ProfileData/InstrProf.cpp#L876-L885) to the MD5 hash of the callee. 3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names will be [annotated](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1707) as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported. *`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` is added to have an end-to-end test. * `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors)