aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/ProfileData/SampleProfReader.cpp
AgeCommit message (Collapse)AuthorFilesLines
30 hours[Support] Deprecate one form of support::endian::read (NFC) (#160979)Kazu Hirata1-2/+2
This is a follow-up to #156140, which deprecated one form of write. We have two forms of read: template <typename value_type, std::size_t alignment> [[nodiscard]] inline value_type read(const void *memory, endianness endian) template <typename value_type, endianness endian, std::size_t alignment> [[nodiscard]] inline value_type read(const void *memory) The difference is that endian is a function parameter in the former but a template parameter in the latter. This patch streamlines the code by migrating the use of the latter to the former while deprecating the latter.
2025-09-12[SampleFDO][TypeProf]Support vtable type profiling for ext-binary and text ↵Mingming Liu1-2/+118
format (#148002) This change extends SampleFDO ext-binary and text format to record the vtable symbols and their counts for virtual calls inside a function. The vtable profiles will allow the compiler to annotate vtable types on IR instructions and perform vtable-based indirect call promotion. An RFC is in https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283 Given a function below, the before vs after of a function's profile is illustrated in text format in the table: ``` __attribute__((noinline)) int loop_func(int i, int a, int b) { Base *ptr = createType(i); int sum = ptr->func(a, b); delete ptr; return sum; } ``` | before | after | | --- | --- | | Samples collected in the function's body { <br> 0: 636241 <br> 1: 681458, calls: _Z10createTypei:681458 <br> 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 <br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 <br> 7: 511057 <br> } | Samples collected in the function's body { <br> 0: 636241 <br> 1: 681458, calls: _Z10createTypei:681458 <br> 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 <br> 3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250 <br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 <br> 5.1: vtables: _ZTV8Derived1:227 _ZTVN12_GLOBAL__N_18Derived2E:765 <br> 7: 511057 <br> } | Key points for this change: 1. In-memory representation of vtable profiles * A field of type `map<LineLocation, map<FunctionId, uint64_t>>` is introduced in a function's in-memory representation [FunctionSamples](https://github.com/llvm/llvm-project/blob/ccc416312ed72e92a885425d9cb9c01f9afa58eb/llvm/include/llvm/ProfileData/SampleProf.h#L749-L754). 2. The vtable counters for one LineLocation represents the relative frequency among vtables for this LineLocation. They are not required to be comparable across LineLocations. 3. For backward compatibility of ext-binary format, we take one bit from ProfSummaryFlag as illustrated in the enum class `SecProfSummaryFlags`. The ext-binary profile reader parses the integer type flag and reads this bit. If it's set, the profile reader will parse vtable profiles. 4. The vtable profiles are optional in ext-binary format, and not serialized out by default, we introduce an LLVM boolean option (named `-extbinary-write-vtable-type-prof`). The ext-binary profile writer reads the boolean option and decide whether to set the section flag bit and serialize the in-memory class members corresponding to vtables. 5. This change doesn't implement `llvm-profdata overlap --sample` for the vtable profiles. A subsequent change will do it to keep this one focused on the profile format change. We don't plan to add the vtable support to non-extensible format mainly because of the maintenance cost to keep backward compatibility for prior versions of profile data. * Currently, the [non-extensible binary format](https://github.com/llvm/llvm-project/blob/5c28af409978c19a35021855a29dcaa65e95da00/llvm/lib/ProfileData/SampleProfWriter.cpp#L899-L900) does not have feature parity with extensible binary format today, for instance, the former doesn't support [profile symbol list](https://github.com/llvm/llvm-project/blob/41e22aa31b1905aa3e9d83c0343a96ec0d5187ec/llvm/include/llvm/ProfileData/SampleProf.h#L1518-L1522) or context-sensitive PGO, both of which give measurable performance boost. Presumably the non-extensible format is not in wide use. --------- Co-authored-by: Paschalis Mpeis <paschalis.mpeis@arm.com>
2025-08-23[NFC][SampleFDO] Re-apply "In text sample prof reader, report more concrete ↵Mingming Liu1-4/+11
parsing errors for different line types" (#155124) Re-apply https://github.com/llvm/llvm-project/pull/154885 with a fix to initialize `LineTy` before calling `ParseLine`.
2025-08-23Revert "[NFC][SampleFDO] In text sample prof reader, report dreport more ↵Mingming Liu1-10/+3
concrete parsing errors for different line types" (#155121) Reverts llvm/llvm-project#154885 to fix build bot failure (https://lab.llvm.org/buildbot/#/builders/144/builds/33611)
2025-08-23[NFC][SampleFDO] In text sample prof reader, report dreport more concrete ↵Mingming Liu1-3/+10
parsing errors for different line types (#154885) The format `'NUM[.NUM]: NUM[ mangled_name:NUM]*'` applies for most line types except metadata ones.
2025-07-12[ProfileData] Remove an unnecessary cast (NFC) (#148341)Kazu Hirata1-1/+1
getBufferStart() already returns const char *.
2025-07-09[NFC]Codestyle changes for SampleFDO library (#147840)Mingming Liu1-2/+2
* Introduce an error code for illegal_line_offset in sampleprof_error namespace, and use it for line offset parsing error. * Add `const` for `LineLocation::serialize`. * Use structured binding, make_first/second_range in loops. I'm working on a [sample-profile format change](https://github.com/llvm/llvm-project/compare/users/mingmingl-llvm/samplefdo-profile-format) to extend SampleFDO profile with vtable profiles. And this change splits the non-functional changes.
2025-05-27[NFCI] Clean up idempotent stack pop for inline context (#141544)Mingming Liu1-3/+0
In the top-of-tree, the stack pops at L414-416 [1] are no-op since there are prior stack pops at L400-402. [1] https://github.com/llvm/llvm-project/blame/e015626f189dc76f8df9fdc25a47638c6a2f3feb/llvm/lib/ProfileData/SampleProfReader.cpp#L414-L416 [2] https://github.com/llvm/llvm-project/blame/e015626f189dc76f8df9fdc25a47638c6a2f3feb/llvm/lib/ProfileData/SampleProfReader.cpp#L400-L402
2025-04-28Clean up external users of GlobalValue::getGUID(StringRef) (#129644)Owen Rodley1-1/+1
See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801 for context. This is a non-functional change which just changes the interface of GlobalValue, in preparation for future functional changes. This part touches a fair few users, so is split out for ease of review. Future changes to the GlobalValue implementation can then be focused purely on that class. This does the following: * Rename GlobalValue::getGUID(StringRef) to getGUIDAssumingExternalLinkage. This is simply making explicit at the callsite what is currently implicit. * Where possible, migrate users to directly calling getGUID on a GlobalValue instance. * Otherwise, where possible, have them call the newly renamed getGUIDAssumingExternalLinkage, to make the assumption explicit. There are a few cases where neither of the above are possible, as the caller saves and reconstructs the necessary information to compute the GUID themselves. We want to migrate these callers eventually, but for this first step we leave them be.
2025-03-04[CSSPGO] Fix redundant reading of profile metadata (#129609)Lei Wang1-6/+14
Fix a build speed regression due to repeated reading of profile metadata. Before the function `readFuncMetadata(ProfileHasAttribute, Profiles)` reads the metadata for all the functions(`Profiles`), however, it's actually used for on-demand loading, it can be called for multiple times, which leads to redundant reading that causes the build speed regression. Now fix it to read the metadata only for the new loaded functions(functions in the `FuncsToUse`).
2024-08-28[llvm-profdata] Enabled functionality to write split-layout profile (#101795)William Junda Huang1-3/+28
Using the flag `-split_layout` in llvm-profdata merge, the output profile can write profiles with and without inlined function into two different extbinary sections (and their FuncOffsetTable too). The section without inlined functions are marked with `SecFlagFlat` and is skipped by ThinLTO because it provides no useful info. The split layout feature was already implemented in SampleProfWriter but previously there is no way to use it from llvm-profdata.
2024-08-27[SampleFDO][NFC] Refactoring sample reader to support on-demand read ↵Lei Wang1-86/+140
profiles for given functions (#104654) Currently in extended binary format, sample reader only read the profiles when the function are in the current module at initialization time, this extends the support to read the arbitrary profiles for given input functions in later stage. It's used for https://github.com/llvm/llvm-project/pull/101053.
2024-07-09[NFC] Coding style fixes: SampleProf (#98208)Mircea Trofin1-12/+13
Also some control flow simplifications. Notably, this doesn't address `sampleprof_error`. I *think* the style there tries to match `std::error_category`. Also left `hash_value` as-is, because it matches what we do in Hashing.h
2024-05-20[llvm-profdata] Fix some style and clang-tidy issuesFangrui Song1-4/+4
Fix #92761 Fix #92762
2024-04-16[llvm] Drop unaligned from calls to readNext (NFC) (#88841)Kazu Hirata1-1/+1
Now readNext defaults to unaligned accesses. This patch drops unaligned to improve readability.
2023-12-11[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)Kazu Hirata1-2/+2
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
2023-11-10[SampleProfile] Fix bug where remapper returns empty string and crashing ↵William Junda Huang1-2/+5
Sample Profile loader (#71479) Normally SampleContext does not allow using an empty StirngRef to construct an object, this is to prevent bugs reading the profile. However empty names may be emitted by a function which its name is intentionally set to empty, or a bug in the remapper that returns an empty string. Regardless, converting it to FunctionId first will prevent the assert, and that assert check is unnecessary, which will be addressed in another patch
2023-10-17[llvm-profdata] Do not create numerical strings for MD5 function names read ↵William Junda Huang1-53/+45
from a Sample Profile. (#66164) This is phase 2 of the MD5 refactoring on Sample Profile following https://reviews.llvm.org/D147740 In previous implementation, when a MD5 Sample Profile is read, the reader first converts the MD5 values to strings, and then create a StringRef as if the numerical strings are regular function names, and later on IPO transformation passes perform string comparison over these numerical strings for profile matching. This is inefficient since it causes many small heap allocations. In this patch I created a class `ProfileFuncRef` that is similar to `StringRef` but it can represent a hash value directly without any conversion, and it will be more efficient (I will attach some benchmark results later) when being used in associative containers. ProfileFuncRef guarantees the same function name in string form or in MD5 form has the same hash value, which also fix a few issue in IPO passes where function matching/lookup only check for function name string, while returns a no-match if the profile is MD5. When testing on an internal large profile (> 1 GB, with more than 10 million functions), the full profile load time is reduced from 28 sec to 25 sec in average, and reading function offset table from 0.78s to 0.7s
2023-10-13Use llvm::endianness::{big,little,native} (NFC)Kazu Hirata1-3/+3
Note that llvm::support::endianness has been renamed to llvm::endianness while becoming an enum class. This patch replaces {big,little,native} with llvm::endianness::{big,little,native}. This patch completes the migration to llvm::endianness and llvm::endianness::{big,little,native}. I'll post a separate patch to remove the migration helpers in llvm/Support/Endian.h: using endianness = llvm::endianness; constexpr llvm::endianness big = llvm::endianness::big; constexpr llvm::endianness little = llvm::endianness::little; constexpr llvm::endianness native = llvm::endianness::native;
2023-10-10[llvm] Drop unaligned from calls to llvm::support::endian::{read,write} (NFC)Kazu Hirata1-2/+2
The last template parameter of llvm::support::endian::{read,write} defaults to unaligned, so we can drop that at call sites.
2023-09-25[llvm-profdata] Move error handling logic out of normal input's code path ↵William Junda Huang1-15/+8
(#67177) Based on disassembly and profiling results, it looks like the cost of std::error_code and llvm::ErrorOr<> are non-trivial, as it involves virtual function calls that are not optimized in -O3. This patch moves error handling logic into the conditional branch in error checking, only generates std::error_code on actual error instead of the default case to generate sampleprof_error::success. In the next patch I am looking to change the API to uses a plain old enum for error checking instead, because based on profiling results error handling related code accounts for 7-8% of the profile loading time even if there's no error at all.
2023-08-17[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build ↵William Huang1-34/+89
speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740
2023-07-28Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO ↵Aaron Ballman1-92/+34
build speed using MD5 as key to Sample Profile map" This reverts commit 66ba71d913df7f7cd75e92c0c4265932b7c93292. Addressing issues found by: https://lab.llvm.org/buildbot/#/builders/245/builds/11732 https://lab.llvm.org/buildbot/#/builders/187/builds/12251 https://lab.llvm.org/buildbot/#/builders/186/builds/11099 https://lab.llvm.org/buildbot/#/builders/182/builds/6976
2023-07-27[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build ↵William Huang1-34/+92
speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740
2023-06-27Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO ↵Haojian Wu1-88/+33
build speed using MD5 as key to Sample Profile map" This reverts commit 12e9c7aaa66b7624b5d7666ce2794d912bf9e4b7. The commit has broken the buildbot, see comment https://reviews.llvm.org/D147740#4451540
2023-06-27[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build ↵William Huang1-33/+88
speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740
2023-06-23Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO ↵Douglas Yung1-88/+33
build speed using MD5 as key to Sample Profile map" This reverts commit 31af18bccea95fe1ae8aa2c51cf7c8e92a1c208e. This change is causing build failures on many Windows build bots: https://lab.llvm.org/buildbot/#/builders/216/builds/22833 https://lab.llvm.org/buildbot/#/builders/123/builds/19602 https://lab.llvm.org/buildbot/#/builders/172/builds/28315 https://lab.llvm.org/buildbot/#/builders/119/builds/13870 https://lab.llvm.org/buildbot/#/builders/233/builds/794 https://lab.llvm.org/buildbot/#/builders/235/builds/387 https://lab.llvm.org/buildbot/#/builders/13/builds/36921 https://lab.llvm.org/buildbot/#/builders/127/builds/50510
2023-06-23[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build ↵William Huang1-33/+88
speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740
2023-05-12[llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoring - 3William Huang1-47/+89
Cleanup profile reader classes to prepare for complex refactoring as propsed in D147740, continuing D148872 This is patch 3/n. This patch changes the behavior of function offset table. Previously when reading ExtBinary profile, the funcOffsetTable (map) is always populated, and in addition if the profile is CS, the orderedFuncOffsets (list) is also populated. However when reading the function samples, only one of the container is being used, never both, so it's a huge waste of time to populate both. Added logic to select which one to use, and completely skip reading function offset table if we are in tool mode (all function samples are to be read sequentially regardless) Reviewed By: davidxl, wenlei Differential Revision: https://reviews.llvm.org/D149124
2023-05-09[llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoring - 2William Huang1-38/+25
Cleanup profile reader classes to prepare for complex refactoring as propsed in D147740, continuing D148868 This is patch 2/n. This patch refactors CSNameTable and related things The decision to move CSNameTable up to the base class is because a planned improvement (D147740) to use MD5 to lookup Functions/Context frames. In this case we want a unified data structure between contextless function or Context frames, so that it can be mapped by MD5 value. Since Context Frames can represent contextless functions, it is being used for MD5 lookup, therefore exposing it to the base class Reviewed By: snehasish, wenlei Differential Revision: https://reviews.llvm.org/D148872
2023-05-06[llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoringWilliam Huang1-56/+69
Cleanup profile reader classes to prepare for complex refactoring as propsed in D147740 (Use MD5 as key for profile map). Change is too complicated so I am cleaning up the reader implementation first with these goals. - Reduce duplicated/similar logic - Reduce virtual functions, changing them to non-virtual - Reduce unnecessry checks, indirections, and dead writes. This is patch 1/n. This patch refactors NameTable Explaining several decisions here 1. useMD5() means whether names of the profiles (the ProfileMap) are represented as MD5. It is NOT whether the input profile format is MD5. This function is an interface for IPO passes to decide whether to match function names or function MD5. There are two motives here: (a) Eventually we want to use MD5 to represent all function contexts because it is much faster to use it as a key for lookup tables (prototype implementation D147740), so in compilation mode we call setProfileUseMD5() to force use MD5. While in tools mode (llvm-profdata) we want to keep the function name info if it's in the input profile. (b) We also propose to allow multiple name tables and profile sections in ExtBinary format, and it could consist of name tables with or without using MD5, in this case MD5 prevails and other name tables are converted to MD5. 2. MD5 handling logic is pushed up to BinaryReader base class, because this trades a non-devirtualized virtual function call with a predictable branch. ReadStringFromTable() accounts for >5% time when loading a full 1 GB profile, it should not be virtual. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D148868
2023-05-01[llvm-profdata] Deprecate Compact Binary Sample Profile FormatWilliam Huang1-122/+2
Remove support for compact binary sample profile format Reviewed By: davidxl, wenlei Differential Revision: https://reviews.llvm.org/D149400
2023-04-13[llvm-profdata] Fixed various issue with Sample Profile ReaderWilliam Huang1-8/+8
Fixed various undefind behaviors with current Sample Profile Reader when reading unusual input. Furthermore, add the following rule on allowing multiple name table sections (current Reader has conflicted code handling such case): When a new name table section is read (in the order sections are read), the names in the previous name table are cleared. Any subsequent sections referring to function names will index into the most recent read name table. Also changed name table index to uint64_t to be consistent since there's a mix of using uint32_t and uint64_t. Reviewed By: snehasish, huangjd Differential Revision: https://reviews.llvm.org/D146182
2023-02-06[llvm-profdata] Fix bug llvm-profdata crashes when reading a text sample ↵William Huang1-1/+2
profile with an empty line with spaces. Text editors can introduce spaces aligning the previous line's indentation. This crashes llvm-profdata. Added check to handle this case. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D143369
2023-02-01[NFC][Profile] Access profile through VirtualFileSystemSteven Wu1-9/+12
Make the access to profile data going through virtual file system so the inputs can be remapped. In the context of the caching, it can make sure we capture the inputs and provided an immutable input as profile data. Reviewed By: akyrtzi, benlangmuir Differential Revision: https://reviews.llvm.org/D139052
2023-01-05Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ partserge-sans-paille1-2/+2
Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955
2023-01-03[llvm-profdata] Remove unnecessary file size checkWilliam Huang1-4/+0
Unsure why profile reader checks profile size to be less than 4 GB. This breaks builds using a very large profile. The limit is not seen anywhere else, so I am not sure why is it there in the first place. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D140741
2022-12-12[ProfileData] llvm::Optional => std::optionalFangrui Song1-1/+1
2022-12-02[llvm] Use std::nullopt instead of None (NFC)Kazu Hirata1-1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-09-17[Support] Rename llvm::compression::{zlib,zstd}::uncompress to more ↵Fangrui Song1-1/+1
appropriate decompress This improves consistency with other places (e.g. llvm::compression::decompress, llvm::object::Decompressor::decompress, llvm-objcopy). Note: when zstd::uncompress was added, we noticed that the API `ZSTD_decompress` is fine while the zlib API `uncompress` is a misnomer.
2022-08-28[llvm] Qualify auto in range-based for loops (NFC)Kazu Hirata1-1/+1
2022-08-23[Sample Profile Reader] Fix potential integer overflow/infinite loop bug in ↵William Huang1-5/+5
sample profile reader Change loop induction variable type to match the type of "SIZE" where it's compared against, to prevent infinite loop caused by overflow wraparound if there are more than 2^32 samples Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D132493
2022-08-23[llvm-profdata][NFC] fix warningliaochunyu1-2/+2
warning: unused variable ‘FC’ Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D132429
2022-08-09[llvm-profdata] Support JSON as as an output-only formatKazu Hirata1-0/+74
This patch teaches llvm-profdata to output the sample profile in the JSON format. The new option is intended to be used for research and development purposes. For example, one can write a Python script to take a JSON file and analyze how similar different inline instances of a given function are to each other. I've chosen JSON because Python can parse it reasonably fast, and it just takes a couple of lines to read the whole data: import json with open ('profile.json') as f: profile = json.load(f) Differential Revision: https://reviews.llvm.org/D130944
2022-07-13[Support] Change compression::zlib::{compress,uncompress} to use uint8_t *Fangrui Song1-5/+3
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has too much uint8_t *) and most callers use uint8_t * instead of char *. The functions are recently moved into `llvm::compression::zlib::`, so downstream projects need to make adaption anyway.
2022-07-08[NFC] Refactor llvm::zlib namespaceCole Kissane1-2/+2
* Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953
2022-04-29[CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlatHongtao Yu1-13/+13
To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles. `ProfileIsCSNested` is now renamed to `ProfileIsPreInlined` and is extended to be applicable for CS flat profiles too. More specifically, `ProfileIsPreInlined` is for any kind of profiles (flat or nested) that contain 'ShouldBeInlined' contexts. The flag is encoded in the profile summary section for extbinary profiles and is computed on-the-fly for text profiles. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D122602
2022-03-28Apply clang-tidy fixes for readability-redundant-smartptr-get in ↵Kazu Hirata1-1/+1
SampleProfReader.cpp (NFC)
2022-02-24Cleanup includes: ProfileDataserge-sans-paille1-1/+1
Estimation of the impact on preprocessor output: before: 1067349756 after: 1065940348 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120434
2022-02-22[ProfileData] Remove unused and racy FunctionSamples::Format after D51643Fangrui Song1-1/+0
The write may be racy if ThinLTO creates multiple `InProcessThinBackend` instances.