aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/LTO/LTO.cpp
AgeCommit message (Collapse)AuthorFilesLines
2022-11-22[LTO][COFF] Use bitcode file names in lto native object file names.Zequan Wu1-1/+1
Currently the lto native object files have names like main.exe.lto.1.obj. In PDB, those names are used as names for each compiland. Microsoft’s tool SizeBench uses those names to present to users the size of each object files. So, names like main.exe.lto.1.obj is not user friendly. This patch makes the lto native object file names more readable by using the bitcode file names as part of the file names. For example, if the input bitcode file has path like "path/to/foo.obj", its corresponding lto native object file path would be "path/to/main.exe.lto.foo.obj". Since the lto native object file name only bothers PDB, this patch only changes the lld-linker's behavior. Reviewed By: tejohnson, MaskRay, #lld-macho Differential Revision: https://reviews.llvm.org/D137217
2022-11-16[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Fangrui Song1-5/+5
available_externally For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) This fixes two problems. (a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to linger around (unreferenced, hence benign), and is now correctly discarded. ``` int foo(); inline int v = foo(); ``` (b) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' #include "c.h" int main() { bar(); foo(); } eof cat > m2.cc <<'eof' #include "c.h" __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw ``` If a GlobalAlias references a GlobalValue which is just changed to available_externally, change the GlobalAlias as well (e.g. C5/D5 comdats due to cc1 -mconstructor-aliases). The GlobalAlias may be referenced by other available_externally functions, so it cannot easily be removed. Depends on D137441: we use available_externally to mark a GlobalAlias in a non-prevailing COMDAT, similar to how we handle GlobalVariable/Function. GlobalAlias may refer to a ConstantExpr, not changing GlobalAlias to GlobalVariable gives flexibility for future extensions (the use case is niche. For simplicity we don't handle it yet). In addition, available_externally GlobalAlias is the most straightforward implementation and retains the aliasee information to help optimizers. See windows-vftable.ll: Windows vftable uses an alias pointing to a private constant where the alias is the COMDAT leader. The COMDAT use case is skeptical and ThinLTO does not discard the alias in the non-prevailing COMDAT. This patch retains the behavior. See new tests ctor-dtor-alias2.ll: depending on whether the complete object destructor emitted, when ctor/dtor aliases are used, we may see D0/D2 COMDATs in one TU and D0/D1/D2 in a D5 COMDAT in another TU. Allow such a mix-and-match with `if (GO->getComdat()->getName() == GO->getName()) NonPrevailingComdats.insert(GO->getComdat());` GlobalAlias handling in ThinLTO is still weird, but this patch should hopefully improve the situation for at least all cases I can think of. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427
2022-11-16Revert D135427 "[LTO] Make local linkage GlobalValue in non-prevailing ↵Fangrui Song1-5/+5
COMDAT available_externally" This reverts commit 8901635423cbea4324059a5127657126d2e00ce1. This change broke the following example and we need to check `if (GO->getComdat()->getName() == GO->getName())` before `NonPrevailingComdats.insert(GO->getComdat());` Revert for clarify. ``` // a.cc template <typename T> struct A final { virtual ~A() {} }; extern "C" void aa() { A<int> a; } // b.cc template <typename T> struct A final { virtual ~A() {} }; template struct A<int>; extern "C" void bb(A<int> *a) { delete a; } clang -c -fpic -O0 -flto=thin a.cc && ld.lld -shared a.o b.o ```
2022-11-16Restore "[MemProf] ThinLTO summary support" with more fixesTeresa Johnson1-2/+20
This restores commit 98ed423361de2f9dc0113a31be2aa04524489ca9 and follow on fix 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b, which were reverted in 5d938eb6f79b16f55266dd23d5df831f552ea082 due to an MSVC bot failure. I've included a fix for that failure. Differential Revision: https://reviews.llvm.org/D135714
2022-11-16Revert "Restore "[MemProf] ThinLTO summary support" with fixes"Jeremy Morse1-20/+2
This reverts commit 00c22351ba697dbddb4b5bf0ad94e4bcea4b316b. This reverts commit 98ed423361de2f9dc0113a31be2aa04524489ca9. Seemingly MSVC has some kind of issue with this patch, in terms of linking: https://lab.llvm.org/buildbot/#/builders/123/builds/14137 I'll post more detail on D135714 momentarily.
2022-11-15Restore "[MemProf] ThinLTO summary support" with fixesTeresa Johnson1-2/+20
This restores 47459455009db4790ffc3765a2ec0f8b4934c2a4, which was reverted in commit 452a14efc84edf808d1e2953dad2c694972b312f, along with fixes for a couple of bot failures.
2022-11-15Revert "[MemProf] ThinLTO summary support"Teresa Johnson1-20/+2
This reverts commit 47459455009db4790ffc3765a2ec0f8b4934c2a4. Revert while I try to fix a couple of non-Linux build failures.
2022-11-15[MemProf] ThinLTO summary supportTeresa Johnson1-2/+20
Implements the ThinLTO summary support for memprof related metadata. This includes support for the assembly format, and for building the summary from IR during ModuleSummaryAnalysis. To reduce space in both the bitcode format and the in memory index, we do 2 things: 1. We keep a single vector of all uniq stack id hashes, and record the index into this vector in the callsite and allocation memprof summaries. 2. When building the combined index during the LTO link, the callsite and allocation memprof summaries are only kept on the FunctionSummary of the prevailing copy. Differential Revision: https://reviews.llvm.org/D135714
2022-11-10[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Fangrui Song1-5/+5
available_externally For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) This fixes two problems. (a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to linger around (unreferenced, hence benign), and is now correctly discarded. ``` int foo(); inline int v = foo(); ``` (b) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' #include "c.h" int main() { bar(); foo(); } eof cat > m2.cc <<'eof' #include "c.h" __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw ``` If a GlobalAlias references a GlobalValue which is just changed to available_externally, change the GlobalAlias as well (e.g. C5/D5 comdats due to cc1 -mconstructor-aliases). The GlobalAlias may be referenced by other available_externally functions, so it cannot easily be removed. Depends on D137441: we use available_externally to mark a GlobalAlias in a non-prevailing COMDAT, similar to how we handle GlobalVariable/Function. GlobalAlias may refer to a ConstantExpr, not changing GlobalAlias to GlobalVariable gives flexibility for future extensions (the use case is niche. For simplicity we don't handle it yet). In addition, available_externally GlobalAlias is the most straightforward implementation and retains the aliasee information to help optimizers. See windows-vftable.ll: Windows vftable uses an alias pointing to a private constant where the alias is the COMDAT leader. The COMDAT use case is skeptical and ThinLTO does not discard the alias in the non-prevailing COMDAT. This patch retains the behavior. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427
2022-11-10Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Alan Zhao1-5/+5
available_externally" This reverts commit 89ddcff1d2d6e9f4de78f3a563a8b1987bf7ea8f. Reason: This breaks bootstrapping builds of LLVM on Windows using ThinLTO; see https://crbug.com/1382839
2022-11-07[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Fangrui Song1-5/+5
available_externally For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) This fixes two problems. (a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to linger around (unreferenced, hence benign), and is now correctly discarded. ``` int foo(); inline int v = foo(); ``` (b) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' #include "c.h" int main() { bar(); foo(); } eof cat > m2.cc <<'eof' #include "c.h" __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw ``` If a GlobalAlias references a GlobalValue which is just changed to available_externally, change the GlobalAlias as well (e.g. C5/D5 comdats due to cc1 -mconstructor-aliases). The GlobalAlias may be referenced by other available_externally functions, so it cannot easily be removed. Depends on D137441: we use available_externally to mark a GlobalAlias in a non-prevailing COMDAT, similar to how we handle GlobalVariable/Function. GlobalAlias may refer to a ConstantExpr, not changing GlobalAlias to GlobalVariable gives flexibility for future extensions (the use case is niche. For simplicity we don't handle it yet). In addition, available_externally GlobalAlias is the most straightforward implementation and retains the aliasee information to help optimizers. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427
2022-10-19Revert D135427 "[LTO] Make local linkage GlobalValue in non-prevailing ↵Fangrui Song1-5/+5
COMDAT available_externally" This reverts commit 8ef3fd8d59ba0100bc6e83350ab1e978536aa531. I mentioned that GlobalAlias was not handled. It turns out GlobalAlias has to be handled in the same patch (as opposed to in a follow-up), as otherwise clang codegen of C5/D5 constructor/destructor would regress (https://reviews.llvm.org/D135427#3869003).
2022-10-14[lto] Do not try to internalize symbols with escaped nameserge-sans-paille1-0/+16
Because of LLVM mangling escape sequence (through '\01' prefix), it is possible for a single symbols two have two different IR representations. For instance, consider @symbol and @"\01_symbol". On OSX, because of the system mangling rules, these two IR names point are converted in the same final symbol upon linkage. LTO doesn't model this behavior, which may result in symbols being incorrectly internalized (if all reference use the escaping sequence while the definition doesn't). The proper approach is probably to use the mangled name to compute GUID to avoid the dual representation, but we can also avoid discarding symbols that are bound to two different IR names. This is an approximation, but it's less intrusive on the codebase. Fix #57864 Differential Revision: https://reviews.llvm.org/D135710
2022-10-11[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Fangrui Song1-5/+5
available_externally For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) This fixes two problems. (a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to linger around (unreferenced, hence benign), and is now correctly discarded. ``` int foo(); inline int v = foo(); ``` (b) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' int main() { bar(); foo(); } eof cat > m2.cc <<'eof' __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427
2022-10-10Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Jordan Rupprecht1-5/+5
available_externally" This reverts commit 4fbe33593c8132fdc48647c06f4d1455bfff1c88. It causes linking errors, with details provided internally. (Hopefully the author/reviewers will be able to upstream the internal repro).
2022-10-08[LTO] Make local linkage GlobalValue in non-prevailing COMDAT ↵Fangrui Song1-5/+5
available_externally See the updated linkonce_resolution_comdat.ll. For a local linkage GV in a non-prevailing COMDAT, it remains defined while its leader has been made available_externally. This violates the COMDAT rule that its members must be retained or discarded as a unit. To fix this, update the regular LTO change D34803 to track local linkage GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.) Fix https://github.com/llvm/llvm-project/issues/58215: as a size optimization, we place private `__profd_` in a COMDAT with a `__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change makes the `__profd_` available_externally. ``` cat > c.h <<'eof' extern void bar(); inline __attribute__((noinline)) void foo() {} eof cat > m1.cc <<'eof' #include "c.h" int main() { bar(); foo(); } eof cat > m2.cc <<'eof' #include "c.h" __attribute__((noinline)) void bar() { foo(); } eof clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw # one _Z3foov clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw # one _Z3foov ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D135427
2022-09-19[lld][thinlto] Include -mllvm options in the thinlto cache keyMircea Trofin1-0/+2
They may modify thinlto optimization. This patch only extends support for `-mllvm`. There is another way to pass llvm flags, `-plugin-opt`, but its processing is different and will be provided in a subsequent patch. Differential Revision: https://reviews.llvm.org/D134013
2022-07-26[WPD] Use new llvm.public.type.test intrinsic for potentially publicly ↵Arthur Eubanks1-0/+4
visible classes Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`. Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`. To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D128955
2022-06-23[ThinLTO][ELF] Add --thinlto-emit-index-files optionJin Xin Ng1-29/+55
Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution. Note, this option does not currently work with: --thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR). --thinlto-prefix-replace support is left unimplemented but can be implemented if needed Differential Revision: https://reviews.llvm.org/D127777
2022-06-20[llvm] Don't use Optional::getValue (NFC)Kazu Hirata1-1/+1
2022-06-20[llvm] Don't use Optional::hasValue (NFC)Kazu Hirata1-1/+1
2022-06-20[NFC][Alignment] Remove max functions between Align and MaybeAlignGuillaume Chatelet1-3/+4
`llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are not used often enough to be required. They also make the code more opaque. Differential Revision: https://reviews.llvm.org/D128121
2022-04-13[LTO] Remove legacy PM supportNikita Popov1-1/+0
We don't have any places setting NewPM=false anymore, so drop the support code in LTOBackend.
2022-03-14[IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFCNick Desaulniers1-2/+1
2 of the 3 callsite of IRMover::move() pass empty lambda functions. Just make this parameter llvm::unique_function. Came about via discussion in D120781. Probably worth making this change regardless of the resolution of D120781. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D121630
2022-02-06[llvm] Use = default (NFC)Kazu Hirata1-1/+1
2022-02-04Tweak some uses of std::iota to skip initializing the underlying storage. NFCI.Benjamin Kramer1-3/+2
2022-02-02Cleanup header dependencies in LLVMCoreserge-sans-paille1-0/+1
Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948 200k lines less to process is no that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652
2022-01-31[BitcodeWriter] Fix cases of some functionsFangrui Song1-1/+1
`WriteIndexToFile` is used by external projects so I do not touch it.
2021-12-20[LTO] Fix incomplete optimization remarks for dead functions when ↵Xu Mingjie1-2/+2
PreOptModuleHook or PostInternalizeModuleHook is defined In 20a895c4be01769a37dfffb3c6b513a7bc9b8d17, we introduce `finalizeOptimizationRemarks()` to make sure we flush the diagnostic remarks file in case the linker doesn't call the global destructors before exiting. In https://reviews.llvm.org/D73597, we add optimization remarks for removed functions for debugging or for detecting dead code. But there is a case, if PreOptModuleHook or PostInternalizeModuleHook is defined (e.g. `--plugin-opt=emit-llvm` is passed to linker), we do not call `finalizeOptimizationRemarks()`, therefore we will get an incomplete optimization remarks file. This patch make sure we flush the diagnostic remarks file when PreOptModuleHook or PostInternalizeModuleHook is defined. Reviewed By: tejohnson, MaskRay Differential Revision: https://reviews.llvm.org/D115417
2021-11-04[Support] Improve Caching conformance with Support library behaviorNoah Shutty1-9/+13
This diff makes several amendments to the local file caching mechanism which was migrated from ThinLTO to Support in rGe678c51177102845c93529d457b020f969125373 in response to follow-up discussion on that commit. Patch By: noajshu Differential Revision: https://reviews.llvm.org/D113080
2021-10-08Move TargetRegistry.(h|cpp) from Support to MCReid Kleckner1-1/+1
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454
2021-10-06[IR][NFC] Rename getBaseObject to getAliaseeObjectItay Bookstein1-1/+1
To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction (D109792), the function is renamed to getAliaseeObject.
2021-09-29Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip ↵Wael Yehia1-6/+0
trace." This reverts commit a60405cf035dc114e7ee090139bed2577f4ea7ef.
2021-09-29[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace.Wael Yehia1-0/+6
Reviewed by: steven_wu, fhahn, tejohnson Differential Revision: https://reviews.llvm.org/D110075
2021-09-28[LTO] Avoid repeated Triple construction. NFCFangrui Song1-1/+1
2021-09-27[ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagationmodimo1-0/+2
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities. This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build: 1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities. 2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time. Implementation-wise this adds the following summary function attributes: 1. noUnwind: function is noUnwind 2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions) 3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well) Testing: Clang self-build passes and 2nd stage build passes check-all ninja check-all with newly added tests passing Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D36850
2021-09-21[LTO] Emit DebugLoc for dead function in optimization remarksXu Mingjie1-5/+10
Currently, the dead functions information getting from optimizations remarks does not contain debug location, but knowing where these dead functions locate could be useful for debugging or for detecting dead code. Cause in `LTO::addRegularLTO()` we use `BitcodeModule::getLazyModule()` to read the bitcode module, when we pass Function F to `ore::NV()`, F is not materialized, so `F->getSubprogram()` returns nullptr, and there is no debug location information of dead functions in optimizations remarks. This patch call `F->materialize()` before we pass Function F to `ore::NV()`, then debug location information will be emitted for dead functions in optimization remarks. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D109737
2021-08-03[ThinLTO] Add TimeTrace for Thinlink stepmodimo1-0/+11
Results from Clang self-build: {F17435948} Testing: ninja check-all Reviewed By: anton-afanasyev Differential Revision: https://reviews.llvm.org/D104428
2021-06-08LTO: Export functions referenced by non-canonical CFI jump tablesSami Tolvanen1-0/+3
LowerTypeTests pass adds functions with a non-canonical jump table to cfiFunctionDecls instead of cfiFunctionDefs. As the jump table is in the regular LTO object, these functions will also need to be exported. This change fixes the non-canonical jump table case and adds a test similar to the existing one for canonical jump tables. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D103120
2021-04-21[Support] Don't include VirtualFileSystem.h in CommandLine.hNico Weber1-0/+1
CommandLine.h is indirectly included in ~50% of TUs when building clang, and VirtualFileSystem.h is large. (Already remarked by jhenderson on D70769.) No behavior change. Differential Revision: https://reviews.llvm.org/D100957
2021-03-30[ThinLTO] During module importing, close one source module before openWei Mi1-1/+1
another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collect functions to be imported and append them to the destination module, then leave all the modules open through out the lto backend pipeline. This patch refactors it in the way that one source module will be closed before another source module is opened. All the source modules will be closed after importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed thinlto mode. For in process thinlto mode, one source module is shared acorss different thinlto backend threads so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554
2021-03-18[LTO][MC] Discard non-prevailing defined symbols in module-level assemblyYuanfang Chen1-1/+29
This is the alternative approach to D96931. In LTO, for each module with inlineasm block, prepend directive ".lto_discard <sym>, <sym>*" to the beginning of the inline asm. ".lto_discard" is both a module inlineasm block marker and (optionally) provides a list of symbols to be discarded. In MC while emitting for inlineasm, discard symbol binding & symbol definitions according to ".lto_disard". Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98762
2021-03-03[IRSymTab] Set FB_used on llvm.compiler.used symbolsFangrui Song1-2/+2
IR symbol table does not parse inline asm. A symbol only referenced by inline asm is not in the IR symbol table, so LTO does not know that the definition (in another translation unit) is referenced and may internalize it, even if that definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on ELF targets since D97446). ``` // cabac.c __attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...}; // h264_cabac.c asm("lea ff_h264_cabac_tables(%rip), %0" : ...); ``` `__attribute__((used))` is the recommended way to tell the compiler there may be inline asm references, so the usage is perfectly fine. This patch conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work around the IR symbol table limitation. Note: before D97446, Clang never emitted symbols in the `llvm.compiler.used` list, so this change does not punish any Clang emitted global object. Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external partition and get internalized. Then we will get a linker error because the `cabac.c` definition is not exposed. Differential Revision: https://reviews.llvm.org/D97755
2021-02-12[LTO] Perform DSOLocal propagation in combined indexWei Wang1-2/+2
Perform DSOLocal propagation within summary list of every GV. This avoids the repeated query of this information during function importing. Differential Revision: https://reviews.llvm.org/D96398
2021-01-29[LTO] Update splitCodeGen to take a reference to the module. (NFC)Florian Hahn1-3/+3
splitCodeGen does not need to take ownership of the module, as it currently clones the original module for each split operation. There is an ~4 year old fixme to change that, but until this is addressed, the function can just take a reference to the module. This makes the transition of LTOCodeGenerator to use LTOBackend a bit easier, because under some circumstances, LTOCodeGenerator needs to write the original module back after codegen. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D95222
2021-01-27[LTO] Prevent devirtualization for symbols dynamically exportedTeresa Johnson1-2/+9
Identify dynamically exported symbols (--export-dynamic[-symbol=], --dynamic-list=, or definitions needed to preempt shared objects) and prevent their LTO visibility from being upgraded. This helps avoid use of whole program devirtualization when there may be overrides in dynamic libraries. Differential Revision: https://reviews.llvm.org/D91583
2021-01-27[ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlagsFangrui Song1-7/+34
Imported functions and variable get the visibility from the module supplying the definition. However, non-imported definitions do not get the visibility from (ELF) the most constraining visibility among all modules (Mach-O) the visibility of the prevailing definition. This patch * adds visibility bits to GlobalValueSummary::GVFlags * computes the result visibility and propagates it to all definitions Protected/hidden can imply dso_local which can enable some optimizations (this is stronger than GVFlags::DSOLocal because the implied dso_local can be leveraged for ELF -shared while default visibility dso_local has to be cleared for ELF -shared). Note: we don't have summaries for declarations, so for ELF if a declaration has the most constraining visibility, the result visibility may not be that one. Differential Revision: https://reviews.llvm.org/D92900
2020-11-30[Remarks][1/2] Expand remarks hotness threshold option support in more toolsWei Wang1-3/+6
This is the #1 of 2 changes that make remarks hotness threshold option available in more tools. The changes also allow the threshold to sync with hotness threshold from profile summary with special value 'auto'. This change modifies the interface of lto::setupLLVMOptimizationRemarks() to accept remarks hotness threshold. Update all the tools that use it with remarks hotness threshold options: * lld: '--opt-remarks-hotness-threshold=' * llvm-lto2: '--pass-remarks-hotness-threshold=' * llvm-lto: '--lto-pass-remarks-hotness-threshold=' * gold plugin: '-plugin-opt=opt-remarks-hotness-threshold=' Differential Revision: https://reviews.llvm.org/D85809
2020-10-13Re-land [ThinLTO] Re-order modules for optimal multi-threaded processingAlexandre Ganea1-10/+54
This reverts 9b5b3050237db3642ed7ab1bdb3ffa2202511b99 and fixes the unwanted re-ordering when generating ThinLTO indexes. The goal of this patch is to better balance thread utilization during ThinLTO in-process linking (in llvm-lto2 or in LLD). Before this patch, large modules would often be scheduled late during execution, taking a long time to complete, thus starving the thread pool. We now sort modules in descending order, based on each module's bitcode size, so that larger modules are processed first. By doing so, smaller modules have a better chance to keep the thread pool active, and thus avoid starvation when the bitcode compilation is almost complete. In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & -flto=thin, /opt:lldltojobs=all, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc. Before patch: 100 sec After patch: 85 sec Inspired by the work done by David Callahan in D60495. Differential Revision: https://reviews.llvm.org/D87966
2020-10-09Temporarily revert "[ThinLTO] Re-order modules for optimal multi-threaded ↵Jordan Rupprecht1-27/+6
processing" This reverts commit 6537004913f3009d896bc30856698e7d22199ba7. This is causing test failures internally, and while a few of the cases turned out to be bad user code (relying on a specific order of static initialization across translation units), some cases are less clear. Temporarily reverting for now, and Teresa is going to follow up with more details.