aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/ProfileData
AgeCommit message (Collapse)AuthorFilesLines
2024-04-01[ThinLTO][TypeProf] Implement vtable def import (#79381)Mingming Liu1-21/+49
Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables. - This is the part3 of type profiling work, and described in the "Virtual Table Definition Import" [1] section of the RFC. [1] https://github.com/llvm/llvm-project/pull/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
2024-04-01[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write ↵Mingming Liu3-17/+111
(#66825) (The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691) * For InstrFDO value profiling, implement instrumentation and lowering for virtual table address. * This is controlled by `-enable-vtable-value-profiling` and off by default. * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads. * Implement profile reader and writer support * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols. * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab. * Indexed profile writer collects the list of vtable names, and stores that to index profiles. * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type. * `llvm-profdata show -show-vtables <args> <profile>` is implemented. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-03-29Revert "[ProfileData] Use size_t in PatchItem (NFC) (#87014)"Muhammad Omair Javaid1-7/+7
This reverts commit c64a328cb4a32e81f8b694162750ec1b8823994c. This broke Arm32 bit build on various LLVM buildbots. For example: https://lab.llvm.org/buildbot/#/builders/17/builds/51129
2024-03-28[ProfileData] Use size_t in PatchItem (NFC) (#87014)Kazu Hirata1-7/+7
size_t in PatchItem eliminates the need for casts.
2024-03-28[memprof] Add MemProf version (#86414)Kazu Hirata2-13/+61
This patch adds a version field to the MemProf section of the indexed profile format, calling the new version "version 1". The existing version is called "version 0". The writer supports both versions via a command-line option: llvm-profdata merge --memprof-version=1 ... The reader supports both versions by automatically detecting the version from the header.
2024-03-27[nfc]Make InstrProfSymtab non-copyable and non-movable (#86882)Mingming Liu1-16/+19
- The direct use case (in [1]) is to add `llvm::IntervalMap` [2] and the allocator required by IntervalMap ctor [3] to class `InstrProfSymtab` as owned members. The allocator class doesn't have a move-assignment operator; and it's going to take much effort to implement move-assignment operator for the allocator class such that the enclosing class is movable. - There is only one use of compiler-generated move-assignment operator in the repo, which is in CoverageMappingReader.cpp. Luckily it's possible to use std::unique_ptr<InstrProfSymtab> instead, so did the change. [1] https://github.com/llvm/llvm-project/pull/66825 [2] https://github.com/llvm/llvm-project/blob/4c2f68840e984b0f111779c46845ac00e3a7547d/llvm/include/llvm/ADT/IntervalMap.h#L936 [3] https://github.com/llvm/llvm-project/blob/4c2f68840e984b0f111779c46845ac00e3a7547d/llvm/include/llvm/ADT/IntervalMap.h#L1041
2024-03-25[memprof] Use #ifdef EXPENSIVE_CHECKS (#86585)Kazu Hirata1-1/+1
This patch replaces: #if EXPENSIVE_CHECKS with: #ifdef EXPENSIVE_CHECKS to follow the existing conventions.
2024-03-25[memprof] Compute CallStackId when deserializing IndexedAllocationInfo (#86421)Kazu Hirata2-4/+17
There are two ways to create in-memory instances of IndexedAllocationInfo -- deserialization of the raw MemProf data and that of the indexed MemProf data. With: commit 74799f424063a2d751e0f9ea698db1f4efd0d8b2 Author: Kazu Hirata <kazu@google.com> Date: Sat Mar 23 19:50:15 2024 -0700 we compute CallStackId for each call stack in IndexedAllocationInfo while deserializing the raw MemProf data. This patch does the same while deserilizing the indexed MemProf data. As with the patch above, this patch does not add any use of CallStackId yet.
2024-03-23[memprof] Add call stack IDs to IndexedAllocationInfo (#85888)Kazu Hirata2-1/+30
The indexed MemProf file has a huge amount of redundancy. In a large internal application, 82% of call stacks, stored in IndexedAllocationInfo::CallStack, are duplicates. We should work toward deduplicating call stacks by referring to them with unique IDs with actual call stacks stored in a separate data structure, much like we refer to memprof::Frame with memprof::FrameId. At the same time, we need to facilitate a graceful transition from the current version of the MemProf format to the next. We should be able to read (but not write) the current version of the MemProf file even after we move onto the next one. With those goals in mind, I propose to have an integer ID next to CallStack in IndexedAllocationInfo to refer to a call stack in a succinct manner. We'll gradually increase the areas of the compiler where IDs and call stacks have one-to-one correspondence and eventually remove the existing CallStack field. This patch adds call stack ID, named CSId, to IndexedAllocationInfo and teaches the raw profile reader to compute unique call stack IDs and store them in the new field. It does not introduce any user of the call stack IDs yet, except in verifyFunctionProfileData.
2024-03-21[memprof] Call SmallVector::reserve (#86055)Kazu Hirata1-0/+1
With one raw memprof file I have, NumPCs averages about 41. Given the default number of inline elements being 8 for SmallVector<uint64_t>, we should reserve the storage in advance.
2024-03-14[ProfileData] Use ArrayRef in ProfOStream::patch (NFC) (#85317)Kazu Hirata1-12/+12
We always apply all of the items in PatchItems. This patch simplifies the interface of ProfOStream::patch by switching to ArrayRef.
2024-03-10Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)Justin Lebar1-1/+1
For some reason this was missing from STLExtras.
2024-03-08[PGO] Add support for writing previous indexed format (#84505)Teresa Johnson1-40/+73
Enable temporary support to ease use of new llvm-profdata with slightly older indexed profiles after 16e74fd48988ac95551d0f64e1b36f78a82a89a2, which bumped the indexed format for type profiling.
2024-03-06[InstrProf][NFC] Fix -Wimplicit-fallthrough warning in InstrProf.cpp after ↵wanglei1-1/+0
#82711
2024-02-27Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change ↵Mingming Liu3-11/+87
for type profiling." (#82711) New change on top of [reviewed patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits after this one](https://github.com/llvm/llvm-project/pull/82711/commits/d0757f46b3e3865b5f7c552bc0744309a363e0ac). Previous commits are restored from the remote branch with timestamps. 1. Fix build breakage for non-ELF platforms, by defining the missing functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`, `__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`} everywhere. * Tested on mac laptop (for darwins) and Windows. Specifically, functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it more explicit that type prof isn't supported; see comments for the reason. * For the rest (AIX, other), mostly follow existing examples (like this [one](https://github.com/llvm/llvm-project/commit/f95b2f1acf1171abb0d00089fd4c9238753847e3)) 2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section name, and make returned pointers [const](https://github.com/llvm/llvm-project/pull/82711/commits/a825d2a4ec00f07772a373091a702f149c3b0c34#diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121) **Original Description** * Raw profile format - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section. - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8. * Indexed profile format - Header: records the byte offset of compressed vtable names. - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site. [The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have profile reader/write change and llvm-profdata change. - To ensure this PR has all the necessary profile format change along with profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600 --------- Co-authored-by: modiking <modiking213@gmail.com>
2024-02-27[MC/DC] Introduce `class TestVector` with a pair of `BitVector` (#82174)NAKAMURA Takumi1-22/+11
This replaces `SmallVector<CondState>` and emulates it. - -------- True False DontCare - Values: True False False - Visited: True True False `findIndependencePairs()` can be optimized with logical ops. FIXME: Specialize `findIndependencePairs()` for the single word.
2024-02-27[MC/DC] Refactor: Isolate the final result out of TestVector (#82282)NAKAMURA Takumi1-12/+25
To reduce conditional judges in the loop in `findIndependencePairs()`, I have tried a couple of tweaks. * Isolate the final result in `TestVectors` `using TestVectors = llvm::SmallVector<std::pair<TestVector, CondState>>;` The final result was just piggybacked on `TestVector`, so it has been isolated. * Filter out and sort `ExecVectors` by the final result It will cost more in constructing `ExecVectors`, but it can reduce at least one conditional judgement in the loop.
2024-02-26[InstrProf] Single byte counters in coverage (#75425)gulfemsavrun1-3/+10
This patch inserts 1-byte counters instead of an 8-byte counters into llvm profiles for source-based code coverage. The origial idea was proposed as block-cov for PGO, and this patch repurposes that idea for coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 The current 8-byte counters mechanism add counters to minimal regions, and infer the counters in the remaining regions via adding or subtracting counters. For example, it infers the counter in the if.else region by subtracting the counters between if.entry and if.then regions in an if statement. Whenever there is a control-flow merge, it adds the counters from all the incoming regions. However, we are not going to be able to infer counters by subtracting two execution counts when using single-byte counters. Therefore, this patch conservatively inserts additional counters for the cases where we need to add or subtract counters. RFC: https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
2024-02-26Introduce mcdc::TVIdxBuilder (LLVM side, NFC) (#80676)NAKAMURA Takumi1-12/+141
This is a preparation of incoming Clang changes (#82448) and just checks `TVIdx` is calculated correctly. NFC. `TVIdxBuilder` calculates deterministic Indices for each Condition Node. It is used for `clang` to emit `TestVector` indices (aka ID) and for `llvm-cov` to reconstruct `TestVectors`. This includes the unittest `CoverageMappingTest.TVIdxBuilder`. See also https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
2024-02-25Refactor: Let MCDC::State have DecisionByStmt and BranchByStmtNAKAMURA Takumi1-4/+1
- Prune `RegionMCDCBitmapMap` and `RegionCondIDMap`. They are handled by `MCDCState`. - Rename `s/BitmapMap/DecisionByStmt/`. It can handle Decision stuff. - Rename `s/CondIDMap/BranchByStmt/`. It can be handle Branch stuff. - `MCDCRecordProcessor`: Use `DecisionParams.BitmapIdx` directly.
2024-02-21Revert type profiling change as compiler-rt test break on Windows. (#82583)Mingming Liu3-87/+11
Examples https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio
2024-02-21[nfc]remove unused variable after pr/81691 (#82578)Mingming Liu1-1/+0
* `N` became unused after [pull request 81691](https://github.com/llvm/llvm-project/pull/81691) * This should fix the build bot failure of `unused variable` https://lab.llvm.org/buildbot/#/builders/77/builds/34840
2024-02-21[TypeProf][InstrPGO] Introduce raw and instr profile format change for type ↵Mingming Liu3-10/+87
profiling. (#81691) * Raw profile format - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section. - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8. * Indexed profile format - Header: records the byte offset of compressed vtable names. - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site. [The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have profile reader/write change and llvm-profdata change. - To ensure this PR has all the necessary profile format change along with profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600 --------- Co-authored-by: modiking <modiking213@gmail.com>
2024-02-15[MC/DC] Refactor: Let MCDCConditionID int16_t with zero-origin (#81257)NAKAMURA Takumi3-26/+33
Also, Let `NumConditions` `uint16_t`. It is smarter to handle the ID as signed. Narrowing to `int16_t` will reduce costs of handling byvalue. (See also #81221 and #81227) External behavior doesn't change. They below handle values as internal values plus 1. * `-dump-coverage-mapping` * `CoverageMappingReader.cpp` * `CoverageMappingWriter.cpp`
2024-02-14[MC/DC] Refactor: Introduce `ConditionIDs` as `std::array<2>` (#81221)NAKAMURA Takumi3-32/+28
Its 0th element corresponds to `FalseID` and 1st to `TrueID`. CoverageMappingGen.cpp: `DecisionIDPair` is replaced with `ConditionIDs`
2024-02-13[NFC][InstrProf]Factor out getCanonicalName to compute the canonical name ↵Mingming Liu1-19/+30
given a pgo name. (#81547) - Also update the `InstrProf::addFuncWithName` to call the newly added `getCanonicalName`.
2024-02-13[MC/DC] Refactor: Make `MCDCParams` as `std::variant` (#81227)NAKAMURA Takumi3-36/+57
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make sure them not initialized as zero. FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
2024-02-13CoverageMappingReader/Writer: MCDCConditionID shouldn't be zeroNAKAMURA Takumi2-0/+5
2024-02-13CoverageMapping.cpp: Apply std::move to MCDCRecord (#81220)NAKAMURA Takumi1-4/+4
2024-02-13[MC/DC] Refactor: Introduce `MCDCTypes.h` for `coverage::mcdc` (#81459)NAKAMURA Takumi2-5/+4
They can be also used in `clang`. Introduce the lightweight header instead of `CoverageMapping.h`. This includes for now: * `mcdc::ConditionID` * `mcdc::Parameters`
2024-02-09[Coverage] MCDCRecordProcessor: Find `ExecVectors` directly (#80816)NAKAMURA Takumi1-20/+11
Deprecate `TestVectors`, since no one uses it. This affects the output order of ExecVectors. The current impl emits sorted by binary value of ExecVector. This impl emits along the traversal of `buildTestVector()`.
2024-02-07[NFC][InstrProf]Generalize getParsedIRPGOFuncName to getParsedIRPGOName (#81054)Mingming Liu1-7/+6
- Function getParsedIRPGOFuncName splits name by delimiter. The `[filename;]mangled-name` format could be generalized for non-function global values (e.g., vtables for type profiling). So rename the function. - Use kGlobalIdentifierDelimiter rather than semicolon directly for defragmentation.
2024-02-06Anonymize `MCDCRecordProcessor`NAKAMURA Takumi1-0/+4
2024-02-06CoverageMapping.cpp: s/MaxBitmapID/MaxBitmapIdx/ in getMaxBitmapSize()NAKAMURA Takumi1-5/+5
2024-02-05InstrProf::getFunctionBitmap: Fix BE hosts (#80608)NAKAMURA Takumi1-4/+7
2024-02-05[Coverage] ProfileData: Handle MC/DC Bitmap as BitVector. NFC. (#80608)NAKAMURA Takumi2-52/+37
* `getFunctionBitmap()` stores not `std::vector<uint8_t>` but `BitVector`. * `CounterMappingContext` holds `Bitmap` (instead of the ref of bytes) * `Bitmap` and `BitmapIdx` are used instead of `evaluateBitmap()`. FIXME: `InstrProfRecord` itself should handle `Bitmap` as `BitVector`.
2024-02-02[Coverage] Let `Decision` take account of expansions (#78969)NAKAMURA Takumi1-43/+197
The current implementation (D138849) assumes `Branch`(es) would follow after the corresponding `Decision`. It is not true if `Branch`(es) are forwarded to expanded file ID. As a result, consecutive `Decision`(s) would be confused with insufficient number of `Branch`(es). `Expansion` will point `Branch`(es) in other file IDs if `Expansion` is included in the range of `Decision`. Fixes #77871 --------- Co-authored-by: Alan Phipps <a-phipps@ti.com>
2024-02-02CoverageMappingWriter: Emit `Decision` before `Expansion` (#78966)NAKAMURA Takumi1-1/+9
To relax scanning record, tweak order by `Decision < Expansion`, or `Expansion` could not be distinguished whether it belonged to `Decision` or not. Relevant to #77871
2024-01-29[llvm-cov] Simplify and optimize MC/DC computation (#79727)Fangrui Song1-101/+41
Update code from https://reviews.llvm.org/D138847 `buildTestVector` is a standard DFS (walking a reduced ordered binary decision diagram). Avoid shouldCopyOffTestVectorFor{True,False}Path complexity and redundant `Map[ID]` lookups. `findIndependencePairs` unnecessarily uses four nested loops (n<=6) to find independence pairs. Instead, enumerate the two execution vectors and find the number of mismatches. This algorithm can be optimized using the marking function technique described in _Efficient Test Coverage Measurement for MC/DC, 2013_, but this may be overkill.
2024-01-23[Coverage] getMaxBitmapSize: Scan `max(BitmapIdx)` instead of the last ↵NAKAMURA Takumi1-5/+6
`Decision` (#78963) In `CoverageMapping.cpp:getMaxBitmapSize()`, this assumed that the last `Decision` has the maxmum `BitmapIdx`. Let it scan `max(BitmapIdx)`. Note that `<=` is used insted of `<`, because `BitmapIdx == 0` is valid and `MaxBitmapID` is `unsigned`. `BitmapIdx` is unique in the record. Fixes #78922
2024-01-23[Coverage] Const-ize `MCDCRecordProcessor` stuff (#78918)NAKAMURA Takumi1-15/+17
The life of `MCDCRecordProcessor`'s instance is short. It may accept `const` objects to process. On the other hand, the life of `MCDCBranches` is shorter than `Record`. It may be rewritten with reference, rather than copying.
2024-01-22[coverage] skipping code coverage for 'if constexpr' and 'if consteval' (#78033)Hana Dusíková1-1/+8
`if constexpr` and `if consteval` conditional statements code coverage should behave more like a preprocesor `#if`-s than normal ConditionalStmt. This PR should fix that. --------- Co-authored-by: cor3ntin <corentinjabot@gmail.com>
2024-01-19[llvm] Use SmallString::operator std::string (NFC)Kazu Hirata1-1/+1
2024-01-19Revert "[InstrProf] Adding utility weights to BalancedPartitioning (#72717)"spupyrev1-4/+5
This reverts commit 5954b9dca21bb0c69b9e991b2ddb84c8b05ecba3 due to broken Windows build
2024-01-19[InstrProf] Adding utility weights to BalancedPartitioning (#72717)spupyrev1-5/+4
Adding weights to utility nodes in BP so that we can give more importance to certain utilities. This is useful when we optimize several objectives jointly.
2024-01-17BalancedPartitioning: minor updates (#77568)Fangrui Song1-4/+4
When LargestTraceSize is a power of two, createBPFunctionNodes does not allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1). Fix this and change floor+log2 to Log2_64. BalancedPartitioning::bisect can use unstable sort because `Nodes` contains distinct `InputOrderIndex`s. BalancedPartitioning::runIterations: use one DenseMap and simplify the node renumbering code.
2024-01-04[InstrProf] No linkage prefixes in IRPGO names (#76994)Ellis Hoag1-19/+8
Change the format of IRPGO counter names to `[<filepath>;]<mangled-name>` which is computed by `GlobalValue::getGlobalIdentifier()` to fix #74565. In fe051934cbb0aaf25d960d7d45305135635d650b (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is basically `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O (yes, it is confusing that `<linkage-name>` is computed with `Mangler().getNameWithPrefix()` while `<mangled-name>` is just `F.getName()`). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix. Since `<mangled-name>` may not match the `<linkage-name>` on some targets like Mach-O, we will need to post-process the output of `llvm-profdata order` before passing to the linker via `-order_file`. Profiles generated after fe051934cbb0aaf25d960d7d45305135635d650b will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.
2024-01-02[RawProfReader]When constructing symbol table, read the MD5 of function name ↵Mingming Liu1-1/+1
in the proper byte order (#76312) Before this patch, when the field `NameRef` is generated in little-endian systems and read back in big-endian systems, the information gets dropped. - The bug gets caught by a buildbot https://lab.llvm.org/buildbot/#/builders/94/builds/17931. In the error message (pasted below), two indirect call targets are not imported. ``` ; IMPORTS-DAG: Import _Z7callee1v ^ <stdin>:1:1: note: scanning from here main.ll: Import _Z11global_funcv from lib.cc ^ <stdin>:1:10: note: possible intended match here main.ll: Import _Z11global_funcv from lib.cc ^ Input file: <stdin> Check file: /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll -dump-input=help explains the following input dump. Input was: <<<<<< 1: main.ll: Import _Z11global_funcv from lib.cc dag:34'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found dag:34'1 ? possible intended match ``` [This commit](https://github.com/llvm/llvm-project/commit/b3999246b1a93bd8e6dc7acac36a7d57fac96542#diff-b196b796c5a396c7cdf93b347fe47e2b29b72d0b7dd0e2b88abb964d376ee50e) gates the fix by flag and provide test data by creating big-endian profiles (rather than reading the little-endian data on a big-endian system that might require a VM). - [This](https://github.com/llvm/llvm-project/commit/b3999246b1a93bd8e6dc7acac36a7d57fac96542#diff-643176077ddbe537bd0a05d2a8a53bdff6339420a30e8511710bf232afdda8b9) is a hexdump of little-endian profile data, and [this](https://github.com/llvm/llvm-project/commit/b3999246b1a93bd8e6dc7acac36a7d57fac96542#diff-1736a3ee25dde02bba55d670df78988fdb227e5a85b94b8707cf182cf70b28f0) is the big-endian version of it. - The [README.md](https://github.com/llvm/llvm-project/commit/b3999246b1a93bd8e6dc7acac36a7d57fac96542#diff-6717b6a385de3ae60ab3aec9638af2a43b55adaf6784b6f0393ebe1a6639438b) shows the result of `llvm-profdata show -ic-targets` before and after the fix when the profile is in big-endian.
2023-12-19Reland the reland "[PGO][GlobalValue][LTO]In ↵Mingming Liu2-13/+32
GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954) Simplify the compiler-rt test to make it more general for different platforms, and use `*DAG` matchers for lines that may be emitted out-of-order. - The compiler-rt test passed on a Windows machine. Previously name matchers don't work for MSVC mangling (https://lab.llvm.org/buildbot/#/builders/127/builds/59907) - `*DAG` matchers fixed the error in https://lab.llvm.org/buildbot/#/builders/94/builds/17924 This is the second reland and fixed errors caught in first reland (https://github.com/llvm/llvm-project/pull/75860) **Original commit message** Commit fe05193 (phab D156569), IRPGO names uses format `[<filepath>;]<linkage-name>` while prior format is `[<filepath>:<mangled-name>`. The format change would break the use case demonstrated in (updated) `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` This patch changes `GlobalValues::getGlobalIdentifer` to use the semicolon. To elaborate on the scenario how things break without this PR 1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. The [NameRef](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/compiler-rt/include/profile/InstrProfData.inc#L72) field in per-function profile data is the MD5 hash of IRPGO names. 2. When raw profiles are converted to indexed format profiles, the profiled address is [mapped](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/ProfileData/InstrProf.cpp#L876-L885) to the MD5 hash of the callee. 3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names will be [annotated](https://github.com/llvm/llvm-project/blob/fc715e4cd942612a091097339841733757b53824/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1707) as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported. * `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` is added to have an end-to-end test. * `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors)
2023-12-18[memprof][NFC] Free symbolizer memory eagerly (#75849)Teresa Johnson1-10/+15
Move the ownership of the symbolizer into symbolizeAndFilterStackFrames so that it is freed on exit, when we are done with it, to reduce peak memory in the reader. This reduces about 9G from the peak for one large profile.