path: root/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
Age  Commit message  Author  Files  Lines
2025-06-04[MemProf] Optionally save context size info on largest cold allocations (#142837)Teresa Johnson1-3/+22
Reapply PR142507 with fix for test: add in the same x86_64-linux requirement as other tests as the stack ids are currently computed differently on big endian systems. This will be investigated separately. In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.
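The ordering requirement in this message can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the names `ContextInfo` and `buildContextSizeRecords`, and the exact form of the threshold test, are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one profiled allocation context.
struct ContextInfo {
  uint64_t FullContextId; // hash identifying the context
  uint64_t ColdBytes;     // profiled cold size of this context
};

// Returns one size entry per context, in the same order as Contexts.
// Contexts below the threshold get a 0 placeholder so that positions
// still line up with the other per-context information.
std::vector<uint64_t>
buildContextSizeRecords(const std::vector<ContextInfo> &Contexts,
                        uint64_t MaxColdBytes, unsigned MinPercentOfMax) {
  std::vector<uint64_t> Sizes;
  Sizes.reserve(Contexts.size());
  for (const ContextInfo &C : Contexts) {
    // Assumed threshold test: size is at least MinPercentOfMax percent of
    // the maximum cold size (overflow is ignored in this sketch).
    bool MeetsThreshold = C.ColdBytes * 100 >= MaxColdBytes * MinPercentOfMax;
    Sizes.push_back(MeetsThreshold ? C.ColdBytes : 0);
  }
  return Sizes;
}
```

Long runs of 0 placeholders compress well, which fits the message's remark about a more compact format in the bitcode writer.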
2025-06-03Revert "[MemProf] Optionally save context size info on largest cold allocations" (#142688)Teresa Johnson1-22/+3
Reverts llvm/llvm-project#142507 due to buildbot failures that I will look into tomorrow.
2025-06-03[MemProf] Optionally save context size info on largest cold allocations (#142507)Teresa Johnson1-3/+22
In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.
2025-05-27[llvm] annotate interfaces in llvm/Analysis for DLL export (#136623)Andrew Rogers1-1/+2
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/Analysis` library. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.

## Background

This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`.

The following manual adjustments were also applied after running IDS on Linux:
- Add `#include "llvm/Support/Compiler.h"` to files where it was not auto-added by IDS due to no pre-existing block of include statements.
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported instantiated templates
- Add `LLVM_ABI` to a subset of private class methods and fields that require export
- Add `LLVM_ABI` to a small number of symbols that require export but are not declared in headers

## Validation

Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-05-04[llvm] Remove unused local variables (NFC) (#138467)Kazu Hirata1-1/+0
2025-04-28Clean up external users of GlobalValue::getGUID(StringRef) (#129644)Owen Rodley1-3/+6
See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801 for context. This is a non-functional change which just changes the interface of GlobalValue, in preparation for future functional changes. This part touches a fair few users, so is split out for ease of review. Future changes to the GlobalValue implementation can then be focused purely on that class. This does the following: * Rename GlobalValue::getGUID(StringRef) to getGUIDAssumingExternalLinkage. This is simply making explicit at the callsite what is currently implicit. * Where possible, migrate users to directly calling getGUID on a GlobalValue instance. * Otherwise, where possible, have them call the newly renamed getGUIDAssumingExternalLinkage, to make the assumption explicit. There are a few cases where neither of the above are possible, as the caller saves and reconstructs the necessary information to compute the GUID themselves. We want to migrate these callers eventually, but for this first step we leave them be.
2025-04-21[LLVM] Cleanup pass initialization for Analysis passes (#135858)Rahul Joshi1-7/+2
- Do not call pass initialization from pass constructors. - Instead, pass initialization should happen in the `initializeAnalysis` function. - https://github.com/llvm/llvm-project/issues/111767
2025-03-25[Analysis] Use *Set::insert_range (NFC) (#132878)Kazu Hirata1-4/+2
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
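As a rough standalone illustration of the same collapse, the sketch below uses `std::set` and a hypothetical free function `insertRange` in place of the LLVM set classes and their `insert_range` member. The helper and `uniqueValues` are assumptions for illustration only.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical free-function analogue of an insert_range member:
// inserts every element of Range into Set.
template <typename SetT, typename RangeT>
void insertRange(SetT &Set, const RangeT &Range) {
  for (const auto &Elem : Range)
    Set.insert(Elem);
}

// Collects the unique values of Values.
std::set<int> uniqueValues(const std::vector<int> &Values) {
  std::set<int> Seen;
  insertRange(Seen, Values); // one call replaces the explicit loop
  return Seen;
}
```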
2025-03-24[MemProf] Enable memprof ICP support by default (#132625)Teresa Johnson1-1/+1
This was disabled by default earlier while some failures were investigated and ultimately fixed. It has been tested more extensively since and can be enabled by default.
2024-11-15[MemProf] Print full context hash when reporting hinted bytes (#114465)Teresa Johnson1-8/+23
Improve the information printed when -memprof-report-hinted-sizes is enabled. Now print the full context hash computed from the original profile, similar to what we do when reporting matching statistics. This will make it easier to correlate with the profile. Note that the full context hash must be computed at profile match time and saved in the metadata and summary, because we may trim the context during matching when it isn't needed for distinguishing hotness. Similarly, due to the context trimming, we may have more than one full context id and total size pair per MIB in the metadata and summary, which now get a list of these pairs. Remove the old aggregate size from the metadata and summary support. One other change from the prior support is that we no longer write the size information into the combined index for the LTO backends, which don't use this information, which reduces unnecessary bloat in distributed index files.
2024-11-05[Analysis] Remove unused includes (NFC) (#114936)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-10-18[MemProf] Disable memprof ICP support by default (#112940)Teresa Johnson1-1/+1
A failure showed up after this was committed. Rather than revert, simply disable this new support to simplify investigation and further testing.
2024-10-18[MemProf] Fix the option to disable memprof ICP (#112917)Teresa Johnson1-1/+9
The -enable-memprof-indirect-call-support option, meant to guard the recently added memprof ICP support, was not used in enough places. Specifically, it was not checked in mayHaveMemprofSummary, which is called from the ThinLTO backend applyImports. This led to failures when checking the callsite records, as we incorrectly expected records for indirect calls. Fix the option to be checked in all necessary locations, and add testing.
2024-10-11[MemProf] Support cloning for indirect calls with ThinLTO (#110625)Teresa Johnson1-23/+36
This patch enables support for cloning in indirect callsites. This is done by synthesizing callsite records for each virtual call target from the profile metadata. In the thin link all the synthesized records for a particular indirect callsite initially share the same context node, but support is added to partition the callsites and outgoing edges based on the callee function, creating a separate node for each target. In the LTO backend, when cloning is needed we first perform indirect call promotion, then change the target of the new direct call to the desired clone. Note this is ThinLTO-specific, since for regular LTO indirect call promotion should have already occurred.
2024-09-07[ThinLTO] Shrink FunctionSummary by 8 bytes (#107706)Kazu Hirata1-2/+2
During the ThinLTO indexing step for one of our large applications, we create 4 million instances of FunctionSummary. Changing: std::vector<EdgeTy> CallGraphEdgeList; to: SmallVector<EdgeTy, 0> CallGraphEdgeList; in FunctionSummary reduces the size of each instance by 8 bytes. The rest of the patch makes the same change to other places so that the types stay compatible across function boundaries.
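The 8-byte saving can be reasoned about with a small layout sketch. It assumes a typical 64-bit standard library in which `std::vector` holds three pointers, and it models the zero-inline-element vector header as one pointer plus 32-bit size and capacity fields. `CompactVecHeader` is a hypothetical type for illustration, not the actual `SmallVector` definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical header layout resembling a vector with no inline storage:
// one data pointer plus 32-bit size and capacity fields.
template <typename T> struct CompactVecHeader {
  T *Data = nullptr;
  uint32_t Size = 0;
  uint32_t Capacity = 0;
};

// Size in bytes of the compact header (16 on a typical 64-bit target).
constexpr std::size_t compactHeaderBytes() {
  return sizeof(CompactVecHeader<int>);
}

// Size in bytes of std::vector<int> (commonly 24 on a 64-bit target,
// but this is implementation-defined).
constexpr std::size_t stdVectorBytes() { return sizeof(std::vector<int>); }
```

Multiplied by the 4 million `FunctionSummary` instances mentioned in the message, a per-instance saving of 8 bytes comes to roughly 32 MB.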
2024-09-06[NFCI]Remove EntryCount from FunctionSummary and clean up surrounding synthetic count passes. (#107471)Mingming Liu1-4/+4
The primary motivation is to remove `EntryCount` from `FunctionSummary`. This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of https://github.com/llvm/llvm-project/commit/64498c54831bed9cf069e0923b9b73678c6451d8). While I'm at it, this PR cleans up {SummaryBasedOptimizations, SyntheticCountsPropagation} since they were not used and there are no plans to further invest in them. With this patch, the bitcode writer writes a placeholder 0 at the byte offset of `EntryCount` and the bitcode reader can parse the function entry count at the correct byte offset. Added a TODO to stop writing `EntryCount` and bump the bitcode version.
2024-09-06[ThinLTO] Shrink GlobalValueSummary by 8 bytes (#107342)Kazu Hirata1-17/+19
During the ThinLTO indexing step for one of our large applications, we create 7.5 million instances of GlobalValueSummary. Changing: std::vector<ValueInfo> RefEdgeList; to: SmallVector<ValueInfo, 0> RefEdgeList; in GlobalValueSummary reduces the size of each instance by 8 bytes. The rest of the patch makes the same change to other places so that the types stay compatible across function boundaries.
2024-07-11[MemProf] Track and report profiled sizes through cloning (#98382)Teresa Johnson1-0/+12
If requested, via the -memprof-report-hinted-sizes option, track the total profiled size of each MIB through the thin link, then report on the corresponding allocation coldness after all cloning is complete. To save size, a different bitcode record type is used for the allocation info when the option is specified, and the sizes are kept separate from the MIBs in the index.
2024-06-22[ProfileData] Add a variant of getValueProfDataFromInst (#95993)Kazu Hirata1-11/+6
This patch adds a variant of getValueProfDataFromInst that returns std::vector<InstrProfValueData> instead of std::unique_ptr<InstrProfValueData[]>. The new return type carries the length with it, so we can drop the out parameter ActualNumValueData. Also, the caller can directly feed the return value into a range-based for loop as shown in the patch. I'm planning to migrate other callers of getValueProfDataFromInst to the new variant in follow-up patches.
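The difference between the two return shapes can be sketched outside LLVM as follows. The functions `loadCountsLegacy`, `loadCounts`, and `totalCount`, and the `ValueCount` struct, are hypothetical stand-ins with hard-coded data. They are not the real `getValueProfDataFromInst` signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical value/count pair, standing in for a value profile entry.
struct ValueCount {
  uint64_t Value;
  uint64_t Count;
};

// Old shape: the array length comes back through an out parameter.
std::unique_ptr<ValueCount[]> loadCountsLegacy(uint32_t &ActualNum) {
  ActualNum = 2;
  std::unique_ptr<ValueCount[]> Data(new ValueCount[2]);
  Data[0] = {10, 5};
  Data[1] = {20, 7};
  return Data;
}

// New shape: the vector carries its own length.
std::vector<ValueCount> loadCounts() { return {{10, 5}, {20, 7}}; }

// The vector result feeds straight into a range-based for loop.
uint64_t totalCount() {
  uint64_t Total = 0;
  for (const ValueCount &VC : loadCounts())
    Total += VC.Count;
  return Total;
}
```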
2024-06-17[Analysis] Clean up getPromotionCandidatesForInstruction (NFC) (#95624)Kazu Hirata1-3/+3
Callers of getPromotionCandidatesForInstruction pass NumVals as an out parameter for the number of value-count pairs of the value profiling data, but nobody uses the out parameter. This patch removes the parameter and updates the callers. Note that the number of value-count pairs is still available via getPromotionCandidatesForInstruction(...).size().
2024-06-12[SystemZ][z/OS] Continue marking text files with OF_Text (#95111)Abhina Sree1-1/+1
Text files should be opened with OF_Text to have the correct encoding.
2024-05-16[ThinLTO]Mark referencers of local ifunc not eligible for import (#92431)Mingming Liu1-13/+29
If an ifunc has local linkage, do not add it into ref edges and mark its referencer (a function or global variable) not eligible for import. An ifunc doesn't have summary and ThinLTO cannot promote it. Importing the referencer may cause linkage errors. To reference a similar fix, https://reviews.llvm.org/D158961 marks callers of local ifunc not eligible for import to fix https://github.com/llvm/llvm-project/issues/58740
2024-04-15Reapply "[llvm] Teach GlobalDCE about dso_local_equivalent"Leonard Chan1-4/+33
Also reapply "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit 1c604a9780fcfe92a99d539913553f0835b81de3 and 474f5efebed24547e76d022f0c5ffcc9db97ce6f.
2024-04-15Revert "Reapply "[llvm] Teach whole program devirtualization about relative vtables""Leonard Chan1-33/+4
This reverts commit 09c3bfe9b3eb47a2af0c10531b25f90cfb5fa9f4.
2024-04-15Reapply "[llvm] Teach whole program devirtualization about relative vtables"Leonard Chan1-4/+33
This reverts commit 474f5efebed24547e76d022f0c5ffcc9db97ce6f.
2024-04-10[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597)Mingming Liu1-4/+8
The motivating use case is to support importing the function declaration across modules to construct call graph edges for indirect calls [1] when importing the function definition costs too much compile time (e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function definition but its postlink combined summary contains the function summary or a global alias summary with this function as aliasee, the function definition will be imported from the source module by IRMover. The implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function, both function summary and alias summary need to carry the def / decl state. Specifically, none of the existing summary fields differ across import modules, but the def / decl state is decided by `<ImportModule, Function>`.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`. In the subsequent changes:
1. The indexing step `computeImportForModule` [3] will compute the set of definitions and the set of declarations for each module, and pass the information on to the bitcode writer.
2. The bitcode writer will look up the def/decl state and set the state when it writes out the flag value. This is demonstrated in https://github.com/llvm/llvm-project/pull/87600
3. The function importer will read the def/decl state when reading the combined summary to figure out two sets of global values, and IRMover will be updated to import the declaration (aka linkGlobalValuePrototype [4]) into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600 [1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5 [2] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L1608-L1764 [3] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L856 [4] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Linker/IRMover.cpp#L605
2024-04-01[ThinLTO][TypeProf] Implement vtable def import (#79381)Mingming Liu1-0/+20
Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables. - This is the part3 of type profiling work, and described in the "Virtual Table Definition Import" [1] section of the RFC. [1] https://github.com/llvm/llvm-project/pull/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
2023-12-06[ThinLTO] Add tail call flag to call edges in summary (#74043)Teresa Johnson1-0/+2
This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.
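One way to picture a tail call flag encoded along with a hotness value is a small bit-packing sketch. The field widths, the bit position of the flag, and the names `Hotness`, `encodeEdgeInfo`, and `decodeEdgeInfo` are all assumptions chosen for illustration. The actual record layout is defined by the bitcode writer and reader in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical hotness categories; the values are illustrative only.
enum class Hotness : uint8_t { Unknown = 0, Cold = 1, None = 2, Hot = 3, Critical = 4 };

// Assumed layout: hotness in the low 3 bits, tail call flag in the next bit.
constexpr unsigned HotnessBits = 3;

struct EdgeInfo {
  Hotness H;
  bool HasTailCall;
};

// Packs both fields into one integer suitable for a single record operand.
uint64_t encodeEdgeInfo(EdgeInfo E) {
  return static_cast<uint64_t>(E.H) |
         (static_cast<uint64_t>(E.HasTailCall) << HotnessBits);
}

// Inverse of encodeEdgeInfo.
EdgeInfo decodeEdgeInfo(uint64_t Raw) {
  EdgeInfo E;
  E.H = static_cast<Hotness>(Raw & ((1u << HotnessBits) - 1));
  E.HasTailCall = ((Raw >> HotnessBits) & 1) != 0;
  return E;
}
```

With a layout like this, an edge with no profile data still has a well-defined encoding, so a single record format can cover both the profiled and unprofiled cases.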
2023-10-05Use BlockFrequency type in more places (NFC) (#68266)Matthias Braun1-1/+1
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it more consistently in various APIs and disable implicit conversion to make usage more consistent and explicit. - Use `BlockFrequency Freq` parameter for `setBlockFreq`, `getProfileCountFromFreq` and `setBlockFreqAndScale` functions. - Return `BlockFrequency` in `getEntryFreq()` functions. - While at it, change some `const BlockFrequency& Freq` parameters to plain `BlockFrequency Freq`. - Mark `BlockFrequency(uint64_t)` constructor as explicit. - Add missing `BlockFrequency::operator!=`. - Remove `uint64_t BlockFrequency::getMaxFrequency()`. - Add `BlockFrequency BlockFrequency::max()` function.
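The effect of an explicit constructor on a thin integer wrapper can be shown with a toy type. `Frequency` and `scaleByTwo` below are hypothetical and only mirror the shape of the changes listed in this message. They are not the LLVM `BlockFrequency` class.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Toy wrapper around a 64-bit frequency value.
class Frequency {
  uint64_t Value;

public:
  // explicit: a bare integer no longer converts silently.
  explicit Frequency(uint64_t V) : Value(V) {}

  // Factory for the largest representable frequency.
  static Frequency max() {
    return Frequency(std::numeric_limits<uint64_t>::max());
  }

  uint64_t getFrequency() const { return Value; }
  bool operator==(Frequency Other) const { return Value == Other.Value; }
  bool operator!=(Frequency Other) const { return Value != Other.Value; }
};

// Implicit conversion from uint64_t is rejected at compile time.
static_assert(!std::is_convertible<uint64_t, Frequency>::value,
              "Frequency must be constructed explicitly");

// Small types like this are cheap to pass by value.
uint64_t scaleByTwo(Frequency F) { return F.getFrequency() * 2; }
```

A call such as `scaleByTwo(21)` would fail to compile here, while `scaleByTwo(Frequency(21))` is accepted.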
2023-08-29[ThinLTO] Mark callers of local ifunc not eligible for importFangrui Song1-1/+12
Fix https://github.com/llvm/llvm-project/issues/58740

The `target_clones` attribute results in ifunc on eligible targets (Linux glibc/Android or FreeBSD). If the function has internal linkage, we will get an internal linkage ifunc.

```
__attribute__((target_clones("popcnt", "default")))
static int foo(int n) { return __builtin_popcount(n); }
int use(int n) { return foo(n); }

@foo.ifunc = internal ifunc i32 (i32), ptr @foo.resolver
define internal nonnull ptr @foo.resolver() comdat {
; local linkage comdat is another issue that should be fixed
  ...
  select i1 %.not, ptr @foo.default.1, ptr @foo.popcnt.0
  ...
}
define internal i32 @foo.default.1(i32 noundef %n)
```

ifuncs are not included in module summaries, so LTO doesn't know the local linkage `foo.default.1` referenced by `foo.resolver` should be promoted. If a caller of `foo` (e.g. `use`) is imported, the local linkage `foo.resolver` will be cloned as a definition (IRLinker::shouldLink), leading to linker errors.

```
ld.lld: error: undefined hidden symbol: foo.default.1.llvm.8017227050314953235
>>> referenced by bar.c
>>> lto.tmp:(foo.ifunc)
```

As a simple fix, just mark `use` as not eligible for import. Non-local linkage ifuncs do not have the problem, because they are not imported, and not cloned when a caller is imported.

---

https://reviews.llvm.org/D82745 contains a more involved fix, though the original bug it intended to fix (https://github.com/llvm/llvm-project/issues/45833) now works.

Note: importing ifunc is tricky. If we import an ifunc, we need to make sure the resolver and the implementation are in the translation unit, as required by https://sourceware.org/glibc/wiki/GNU_IFUNC

> Requirement (a): Resolver must be defined in the same translation unit as the implementations.

This is infeasible if the implementation is changed to available_externally. In addition, the imported ifunc may be referenced by two translation units.
This doesn't work with PowerPC32 -msecure-plt (https://maskray.me/blog/2021-01-18-gnu-indirect-function). At the very least, every referencing translation unit needs one extra IRELATIVE dynamic relocation. At least for the local linkage ifunc case, it doesn't have much use outside of `target_clones`, as a global pointer is usually a better replacement. I think ifuncs just have too many pitfalls to design more IR features around it to optimize them. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D158961
2023-07-25[Support] Change SetVector's default template parameter to SmallVector<*, 0>Fangrui Song1-17/+29
Similar to D156016 for MapVector. This brings back commit fae7b98c221b5b28797f7b56b656b6b819d99f27 with a fix to llvm/unittests/Support/ThreadPool.cpp's `_WIN32` code path.
2023-07-25Revert rGfae7b98c221b5b28797f7b56b656b6b819d99f27 "[Support] Change SetVector's default template parameter to SmallVector<*, 0>"Simon Pilgrim1-29/+17
This is failing on Windows MSVC builds: llvm\unittests\Support\ThreadPool.cpp(380): error C2440: 'return': cannot convert from 'Vector' to 'std::vector<llvm::BitVector,std::allocator<llvm::BitVector>>' with [ Vector=llvm::SmallVector<llvm::BitVector,0> ]
2023-07-25[Support] Change SetVector's default template parameter to SmallVector<*, 0>Fangrui Song1-17/+29
Similar to D156016 for MapVector.
2023-07-24[Support] Change MapVector's default template parameter to SmallVector<*, 0>Fangrui Song1-1/+3
SmallVector<*, 0> is often a better replacement for std::vector: both the object size and the code size are smaller. (SmallMapVector uses SmallVector as well, but it is not common.) clang size decreases by 0.0226%. instructions:u decreases 0.037% when compiling a sqlite3 amalgamation. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D156016
2023-07-05[llvm] A Unified LTO Bitcode FrontendMatthew Voss1-1/+5
Here's a high level summary of the changes in this patch. For more information on rationale, see the RFC. (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774). - Add config parameter to LTO backend, specifying which LTO mode is desired when using unified LTO. - Add unified LTO flag to the summary index for efficiency. Unified LTO modules can be detected without parsing the module. - Make sure that the ModuleID is generated by incorporating more types of symbols. Differential Revision: https://reviews.llvm.org/D123803
2023-06-29Add a type_checked_load_relative to support relative function pointer tablesArnold Schwaighofer1-0/+1
This adds a type_checked_load_relative intrinsic whose semantics are to load a relative function pointer. A relative function pointer is a pointer to a 32-bit value that, when added to its own address, yields the address of the function. Differential Revision: https://reviews.llvm.org/D143204
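The address arithmetic behind a relative function pointer, as described in this message, can be simulated on a byte buffer. The sketch uses offsets into a `std::vector` in place of real addresses. `storeRelative` and `resolveRelative` are hypothetical helpers and do not model the intrinsic's type check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes a signed 32-bit displacement at SlotOffset such that
// slot position + displacement == TargetOffset.
void storeRelative(std::vector<unsigned char> &Image, std::size_t SlotOffset,
                   std::size_t TargetOffset) {
  int32_t Rel = static_cast<int32_t>(static_cast<int64_t>(TargetOffset) -
                                     static_cast<int64_t>(SlotOffset));
  std::memcpy(Image.data() + SlotOffset, &Rel, sizeof(Rel));
}

// Reads the 32-bit displacement stored at SlotOffset and adds it to the
// slot's own position to recover the target position.
std::size_t resolveRelative(const std::vector<unsigned char> &Image,
                            std::size_t SlotOffset) {
  int32_t Rel;
  std::memcpy(&Rel, Image.data() + SlotOffset, sizeof(Rel));
  return static_cast<std::size_t>(static_cast<int64_t>(SlotOffset) + Rel);
}
```

Because only a 32-bit displacement is stored, each table entry is half the size of a 64-bit absolute pointer and does not depend on where the image is loaded.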
2023-05-05[MemProf] Context disambiguation cloning pass [patch 4/4]Teresa Johnson1-0/+68
Applies ThinLTO cloning decisions made during the thin link and recorded in the summary index to the IR during the ThinLTO backend. Depends on D141077. Differential Revision: https://reviews.llvm.org/D149117
2023-04-20[ThinLTO] Remove BlockCount for non partial sample profile buildsTeresa Johnson1-1/+5
As pointed out in https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the block count added to distributed ThinLTO index files breaks incremental builds on ThinLTO - if any linked file has a different number of BBs, then the accumulated sum placed in the index files will change, causing all ThinLTO backend compiles to be redone. The block count is only used for scaling of partial sample profiles, and was added in D80403 for D79831. This patch simply removes this field from the index files of non partial sample profile compiles, which is NFC on the output of the compiler. We subsequently need to see if this can be removed for partial sample profiles without significant performance loss, or redesigned in a way that does not destroy caching. Differential Revision: https://reviews.llvm.org/D148746
2023-03-26Revert "[llvm] Teach whole program devirtualization about relative vtables"Leonard Chan1-33/+4
This reverts commit db288184765c0b4010060ebea1f6de3ac1f66445. Reverting since it broke our lto builders reported by fxbug.dev/123807.
2023-02-23[llvm] Teach whole program devirtualization about relative vtablesLeonard Chan1-4/+33
Prior to this patch, WPD was not acting on relative-vtables in C++. This involves teaching WPD about these things: - llvm.load.relative which is how relative-vtables are indexed (instead of GEP) - dso_local_equivalent which is used in the vtable itself when taking the offset between a virtual function and vtable - Update llvm/test/ThinLTO/X86/devirt.ll to use opaque pointers and add equivalent tests for RV Differential Revision: https://reviews.llvm.org/D134320
2023-02-16[ThinLTO/WPD] Handle function alias in vtable correctlyTeresa Johnson1-6/+11
We were not summarizing a function alias in the vtable, leading to incorrect WPD in some cases, and missing WPD in others. Specifically, we would end up ignoring function aliases as they aren't summarized, so we could incorrectly devirtualize if there was a single other non-alias function in a compatible vtable. And if there was only one implementation, but it was an alias, we would not be able to identify and perform the single implementation devirtualization. Handling the alias summary correctly also required fixing the handling in mustBeUnreachableFunction, so that it is not incorrectly ignored. Regular LTO is conservatively correct because it will skip devirtualizing when any pointer within a vtable is not a function. However, it needs additional work to be able to take advantage of function alias within the vtable that is in fact the only implementation. For that reason, the Regular LTO testing in the second test case is currently disabled, and will be enabled along with a follow on enhancement fix for Regular LTO WPD. Differential Revision: https://reviews.llvm.org/D144209
2022-11-23ModuleSummaryAnalysis: Internalize some cl::optFangrui Song1-5/+7
2022-11-16Restore "[MemProf] ThinLTO summary support" with more fixesTeresa Johnson1-2/+60
This restores commit 98ed423361de2f9dc0113a31be2aa04524489ca9 and follow on fix 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b, which were reverted in 5d938eb6f79b16f55266dd23d5df831f552ea082 due to an MSVC bot failure. I've included a fix for that failure. Differential Revision: https://reviews.llvm.org/D135714
2022-11-16Revert "Restore "[MemProf] ThinLTO summary support" with fixes"Jeremy Morse1-60/+2
This reverts commit 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b. This reverts commit 98ed423361de2f9dc0113a31be2aa04524489ca9. Seemingly MSVC has some kind of issue with this patch, in terms of linking: https://lab.llvm.org/buildbot/#/builders/123/builds/14137 I'll post more detail on D135714 momentarily.
2022-11-15Restore "[MemProf] ThinLTO summary support" with fixesTeresa Johnson1-2/+60
This restores 47459455009db4790ffc3765a2ec0f8b4934c2a4, which was reverted in commit 452a14efc84edf808d1e2953dad2c694972b312f, along with fixes for a couple of bot failures.
2022-11-15Revert "[MemProf] ThinLTO summary support"Teresa Johnson1-60/+2
This reverts commit 47459455009db4790ffc3765a2ec0f8b4934c2a4. Revert while I try to fix a couple of non-Linux build failures.
2022-11-15[MemProf] ThinLTO summary supportTeresa Johnson1-2/+60
Implements the ThinLTO summary support for memprof related metadata. This includes support for the assembly format, and for building the summary from IR during ModuleSummaryAnalysis. To reduce space in both the bitcode format and the in memory index, we do 2 things: 1. We keep a single vector of all unique stack id hashes, and record the index into this vector in the callsite and allocation memprof summaries. 2. When building the combined index during the LTO link, the callsite and allocation memprof summaries are only kept on the FunctionSummary of the prevailing copy. Differential Revision: https://reviews.llvm.org/D135714
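The first space-saving measure, a single table of unique stack id hashes referenced by index, can be sketched as a small interning table. `StackIdTable` and its methods are hypothetical names for illustration. The real index stores and serializes its stack ids differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Interning table: each distinct 64-bit stack id hash is stored once, and
// summaries refer to it by its (smaller) index.
class StackIdTable {
  std::vector<uint64_t> Ids;                      // index -> hash
  std::unordered_map<uint64_t, unsigned> IndexOf; // hash -> index

public:
  // Returns the index of Hash, adding it to the table on first use.
  unsigned getOrInsert(uint64_t Hash) {
    auto It = IndexOf.find(Hash);
    if (It != IndexOf.end())
      return It->second;
    unsigned Idx = static_cast<unsigned>(Ids.size());
    Ids.push_back(Hash);
    IndexOf.emplace(Hash, Idx);
    return Idx;
  }

  // Looks up the hash stored at Idx.
  uint64_t at(unsigned Idx) const { return Ids[Idx]; }

  std::size_t size() const { return Ids.size(); }
};
```

Frames that appear in many callsite and allocation contexts then cost one 64-bit hash in the table plus a small index per use.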
2022-11-14Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"Fangrui Song1-0/+2
This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e. There is a layering violation: LLVMAnalysis depends on LLVMCore, so LLVMCore should not include LLVMAnalysis header llvm/Analysis/ModuleSummaryAnalysis.h
2022-11-14[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvmAlexander Shaposhnikov1-2/+0
Enable using -module-summary with -S (similarly to what currently can be achieved with opt <input> -o - | llvm-dis). This is a recommit of ef9e62469. Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D137768
2022-10-21[ModuleSummaryAnalysis] Use helper methods to check readnone/readonly (NFC)Nikita Popov1-2/+1
This makes sure that this code continues working when switching to the memory attribute. A caveat here is that onlyReadsMemory() will also be true for readnone. To be conservative, I'm explicitly excluding that case here.
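The caveat can be illustrated with a toy model of memory effects. `MemoryEffectsSketch` and `isReadOnlyButNotReadNone` are hypothetical. They only mimic the relationship between the two queries named in this message, not the LLVM attribute API.

```cpp
#include <cassert>

// Toy model: what a function may do to memory.
struct MemoryEffectsSketch {
  bool MayRead;
  bool MayWrite;

  // readnone: neither reads nor writes.
  bool doesNotAccessMemory() const { return !MayRead && !MayWrite; }

  // "Only reads" means "never writes", so it also holds for readnone.
  bool onlyReadsMemory() const { return !MayWrite; }
};

// Conservative check: readonly, with the readnone case explicitly excluded.
bool isReadOnlyButNotReadNone(const MemoryEffectsSketch &ME) {
  return ME.onlyReadsMemory() && !ME.doesNotAccessMemory();
}
```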