path: root/llvm/lib/CodeGen/BasicBlockSections.cpp
Age | Commit message | Author | Files | Lines
2025-07-10[NFC] Split UniqueBBID definition to a separate file. (#148043)Rahman Lavaee1-0/+1
2024-09-25Reapply "Deprecate the `-fbasic-block-sections=labels` option." (#110039)Rahman Lavaee1-7/+0
This reapplies commit 1911a50fae8a441b445eb835b98950710d28fc88 with a minor fix in lld/ELF/LTO.cpp which sets Options.BBAddrMap when `--lto-basic-block-sections=labels` is passed.
2024-09-25Revert "Deprecate the `-fbasic-block-sections=labels` option. (#107494)"Kazu Hirata1-0/+7
This reverts commit 1911a50fae8a441b445eb835b98950710d28fc88. Several bots are failing: https://lab.llvm.org/buildbot/#/builders/190/builds/6519 https://lab.llvm.org/buildbot/#/builders/3/builds/5248 https://lab.llvm.org/buildbot/#/builders/18/builds/4463
2024-09-25Deprecate the `-fbasic-block-sections=labels` option. (#107494)Rahman Lavaee1-7/+0
This feature is supported via the newer option `-fbasic-block-address-map`. Using the old option still works by delegating to the newer option, while a warning is printed to show deprecation.
2024-08-06[CodeGen] Use optimized domtree for MachineFunction (#102107)Alexis Engelke1-0/+11
The dominator tree gained an optimization to use block numbers instead of a DenseMap to store blocks. Given that machine basic blocks already have numbers, expose these via appropriate GraphTraits. For debugging, block number epochs are added to MachineFunction -- this greatly helps in finding uses of block numbers after RenumberBlocks(). In a few cases where dominator trees are preserved across renumberings, the dominator tree is updated to use the new numbers.
2024-02-14[CodeGen][AArch64] Only split safe blocks in BBSections (#81553)Daniel Hoekwater1-3/+8
Some types of machine function and machine basic block are unsafe to split on AArch64: basic blocks that contain jump table dispatch or targets (D157124), and blocks that contain inline ASM GOTO blocks or their targets (D158647) all cause issues and have been excluded from Machine Function Splitting on AArch64. These issues are caused by any transformation pass that places same-function basic blocks in different text sections (MachineFunctionSplitter and BasicBlockSections) and must be special-cased in both passes.
2024-02-01[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)Rahman Lavaee1-6/+33
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}` cannot be combined with `-basic-block-sections=labels` (the labels option will be ignored). The inconsistency comes from the way basic block address map -- the underlying mechanism for basic block labels -- encodes basic block addresses (https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). Specifically, basic block offsets are computed relative to the function begin symbol. This relies on functions being contiguous, which is not the case for MFS and basic block section binaries. This means Propeller cannot use binary profiles collected from these binaries, which limits the applicability of Propeller for iterative optimization.

To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section binaries, we propose modifying the encoding of this section as follows. First let us review the current encoding, which emits the address of each function and its number of basic blocks, followed by basic block entries for each basic block.

| | |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1 |
| BB entry 2 |
| ... |
| BB entry #NumBlocks |

To make this work for basic block sections, we treat each basic block section similar to a function, except that basic block sections of the same function must be encapsulated in the same structure so we can map all of them to their single function. We modify the encoding to first emit the number of basic block sections (BB ranges) in the function. Then we emit the address map of each basic block section as before: the base address of the section, its number of blocks, and BB entries for its basic blocks. The first section in the BB address map is always the function entry section.

| | |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1 |
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges | NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges |

The encoding of basic block entries remains as before with the minor change that each basic block offset is now computed relative to the begin symbol of its containing BB section.

This patch adds a new boolean codegen option `-basic-block-address-map`. Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD flag `--lto-basic-block-address-map` are introduced. Analogously, we add a new TargetOption field `BBAddrMap`. This means BB address maps are either generated for all functions in the compilation unit, or for none (depending on `TargetOptions::BBAddrMap`).

This patch keeps the old `-fbasic-block-sections=labels` option working and does not remove it yet. A subsequent patch will remove the obsolete option.

We refactor the `BasicBlockSections` pass by separating the BB address map and BB sections handling into their own functions (named `handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers basic blocks and places them in their assigned sections. `handleBBAddrMap` is invoked after `handleBBSections` (if requested) and only renumbers the blocks.

- New tests added:
  - Two tests basic-block-address-map-with-basic-block-sections.ll and basic-block-address-map-with-mfs.ll to exercise the combination of `-basic-block-address-map` with `-basic-block-sections=list` and `-split-machine-functions`.
  - A driver sanity test for the `-fbasic-block-address-map` option (basic-block-address-map.c).
  - An LLD test for testing the `--lto-basic-block-address-map` option. This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block address map (`basic-block-sections-labels-functions-sections.ll` and `basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of `SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` versions less than 2 will happen in a separate PR in a few months.
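The per-function layout described in this commit message can be pictured as a small in-memory structure. The following is only a sketch under stated assumptions: the type and function names (`BBEntry`, `BBRange`, `FunctionBBAddrMap`, `findBlock`) are hypothetical and are not LLVM's own types, and each entry's offset is taken relative to the begin address of its containing section, as the message states. It shows how a sampled address could be mapped back to a basic block ID even when the function is split over several sections.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the encoding above; not LLVM's actual data types.
struct BBEntry {
  uint32_t ID;     // basic block ID
  uint64_t Offset; // offset from the begin address of the containing section
  uint64_t Size;   // block size in bytes
};

struct BBRange {
  uint64_t BaseAddress;         // section begin address
  std::vector<BBEntry> Entries; // NumBlocks == Entries.size()
};

struct FunctionBBAddrMap {
  // NumBBRanges == Ranges.size(); Ranges[0] is the function entry section.
  std::vector<BBRange> Ranges;
};

// Returns the ID of the block that contains Addr, or -1 if no block does.
// Because every range carries its own base address, the ranges do not need
// to be contiguous in memory.
int64_t findBlock(const FunctionBBAddrMap &Map, uint64_t Addr) {
  for (const BBRange &Range : Map.Ranges) {
    for (const BBEntry &Entry : Range.Entries) {
      const uint64_t Begin = Range.BaseAddress + Entry.Offset;
      if (Addr >= Begin && Addr < Begin + Entry.Size)
        return Entry.ID;
    }
  }
  return -1;
}
```

With a single function-relative base address, a block placed in a separate cold section could not be described this way; one base address per range is what makes the lookup work for split functions.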
2024-01-16[BasicBlockSections] Always keep the entry block in the beginning of the function. (#74696)Rahman Lavaee1-3/+7
BasicBlockSections must enforce placing the entry block at the beginning of the function regardless of the basic block sections profile.
2024-01-09Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#77182)Nick Anderson1-4/+4
Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader
Fixes: #75380
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2024-01-05Revert 4d7c5ad58467502fcbc433591edff40d8a4d697d "[NewPM] Update CodeGenPreparePass reference in CodeGenPassBuilder (#77054)"Simon Pilgrim1-4/+4
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"
Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
2024-01-05Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)Nick Anderson1-4/+4
Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader
Fixes: #64560
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2023-10-27[BasicBlockSections] Apply path cloning with -basic-block-sections. (#68860)Rahman Lavaee1-34/+21
https://github.com/llvm/llvm-project/commit/28b912687900bc0a67cd61c374fce296b09963c4 introduced the path cloning format in the basic-block-sections profile. This PR validates and applies path clonings. A path cloning is valid if all of these conditions hold:
1. All bb ids in the path are mapped to existing blocks.
2. Each two consecutive bb ids in the path have a successor relationship in the CFG.
3. The path does not include a block with indirect branches, except possibly as the last block.
Applying a path cloning involves cloning all blocks in the path (except the first one) and setting up their branches. Once all clonings are applied, the cluster information is used to guide block layout in the modified function.
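The three validity conditions listed in this commit message can be checked in a single pass over the path. This is a minimal sketch over a toy CFG, not the code from the patch: `Block`, `CFG`, and `isValidClonePath` are hypothetical names, and the CFG is modelled as a map from bb id to successor ids plus an indirect-branch flag.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy CFG node; not LLVM's MachineBasicBlock.
struct Block {
  std::vector<unsigned> Succs;    // bb ids of the successors
  bool HasIndirectBranch = false; // true if the block ends in an indirect branch
};

using CFG = std::map<unsigned, Block>; // bb id -> block

// Checks the three conditions from the commit message for one cloning path.
bool isValidClonePath(const CFG &G, const std::vector<unsigned> &Path) {
  for (std::size_t I = 0; I < Path.size(); ++I) {
    // Condition 1: every bb id must map to an existing block.
    const auto It = G.find(Path[I]);
    if (It == G.end())
      return false;

    const bool IsLast = (I + 1 == Path.size());

    // Condition 3: only the last block may contain an indirect branch.
    if (It->second.HasIndirectBranch && !IsLast)
      return false;

    // Condition 2: consecutive ids must be connected by a CFG edge.
    if (!IsLast) {
      const std::vector<unsigned> &Succs = It->second.Succs;
      if (std::find(Succs.begin(), Succs.end(), Path[I + 1]) == Succs.end())
        return false;
    }
  }
  return true;
}
```

For a CFG with edges 1 --> 4 and 4 --> 5, the path `1 4 5` passes, while `1 5` fails condition 2 because there is no edge 1 --> 5.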
2023-10-11[BasicBlockSections] Introduce the path cloning profile format to BasicBlockSectionsProfileReader. (#67214)Rahman Lavaee1-43/+30
Following up on prior RFC (https://lists.llvm.org/pipermail/llvm-dev/2020-September/145357.html) we can now improve upon our highly-optimized basic-block-sections binary (e.g., 2% for clang) by applying path cloning. Cloning can improve performance by reducing taken branches. This patch prepares the profile format for applying cloning actions. The basic block cloning profile format extends the basic block sections profile in two ways.
1. Specifies the cloning paths with a 'p' specifier. For example, `p 1 4 5` specifies that blocks with BB ids 4 and 5 must be cloned along the edge 1 --> 4.
2. Each cloned block appears in the cluster info as `<bb_id>.<clone_id>` where `clone_id` is the id associated with this clone.
For example, the following profile specifies one cloned block (2) and determines its cluster position as well.
```
f foo
p 1 2
c 0 1 2.1 3 2 5
```
This patch keeps backward-compatibility (retains the behavior for old profile formats). This feature is only introduced for profile version >= 1.
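To make the `f` / `p` / `c` specifiers concrete, here is a rough sketch of a reader for the example profile in this commit message. It is not the `BasicBlockSectionsProfileReader` implementation: `BBRef`, `FunctionProfile`, `parseBBRef`, and `parseProfile` are hypothetical names, the input is assumed to be well formed, and representing an uncloned block with clone id 0 is an assumption of this sketch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct BBRef {
  unsigned BaseID;  // the <bb_id> part
  unsigned CloneID; // the <clone_id> part; 0 (assumed) means the original block
};

struct FunctionProfile {
  std::string Name;                              // from the 'f' line
  std::vector<std::vector<unsigned>> ClonePaths; // one entry per 'p' line
  std::vector<std::vector<BBRef>> Clusters;      // one entry per 'c' line
};

// Parses "<bb_id>" or "<bb_id>.<clone_id>".
BBRef parseBBRef(const std::string &Tok) {
  const std::string::size_type Dot = Tok.find('.');
  BBRef Ref;
  Ref.BaseID = static_cast<unsigned>(std::stoul(Tok.substr(0, Dot)));
  Ref.CloneID = Dot == std::string::npos
                    ? 0u
                    : static_cast<unsigned>(std::stoul(Tok.substr(Dot + 1)));
  return Ref;
}

// Reads a profile line by line; lines before the first 'f' line are ignored.
std::vector<FunctionProfile> parseProfile(const std::string &Text) {
  std::vector<FunctionProfile> Out;
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    std::istringstream Fields(Line);
    std::string Spec, Tok;
    if (!(Fields >> Spec))
      continue; // blank line
    if (Spec == "f") {
      Out.emplace_back();
      Fields >> Out.back().Name;
    } else if (Out.empty()) {
      continue;
    } else if (Spec == "p") {
      std::vector<unsigned> Path;
      while (Fields >> Tok)
        Path.push_back(static_cast<unsigned>(std::stoul(Tok)));
      Out.back().ClonePaths.push_back(Path);
    } else if (Spec == "c") {
      std::vector<BBRef> Cluster;
      while (Fields >> Tok)
        Cluster.push_back(parseBBRef(Tok));
      Out.back().Clusters.push_back(Cluster);
    }
  }
  return Out;
}
```

Fed the three-line example, the sketch yields one function `foo` with one clone path (`1 2`) and one cluster of six entries, the third of which is clone 1 of block 2.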
2023-08-24Reland "[CodeGen] Fix unconditional branch duplication issue in bbsections"Daniel Hoekwater1-1/+2
Reverted in 4c8d056f50342d5401f5930ed60e5e48b211c3fb because it broke buildbot `llvm-clang-x86_64-expensive-checks-debian` due to the AArch64 test generating invalid code. The issue still exists, but it's fixed in D156767, so the AArch64 test should be added there. Differential Revision: https://reviews.llvm.org/D158674
2023-08-24Revert "[CodeGen] Fix unconditional branch duplication issue in bbsections"Daniel Hoekwater1-2/+1
This reverts commit 994eb5adc40cd001d82d0f95d18d1827b57e496c. Breaks buildbot `llvm-clang-x86_64-expensive-checks-debian` https://lab.llvm.org/buildbot/#/builders/16/builds/53620
2023-08-24[CodeGen] Fix unconditional branch duplication issue in bbsectionsDaniel Hoekwater1-1/+2
If an end section basic block ends in an unconditional branch to its fallthrough, BasicBlockSections will duplicate the unconditional branch. The duplicate doesn't break x86, but removing it is a (slight) size optimization and, more importantly, prevents AArch64 builds from breaking.
Ex:
```
bb1 (bbsections Hot):
  jmp bb2
bb2 (bbsections Cold):
  /* do work... */
```
After running sortBasicBlocksAndUpdateBranches():
```
bb1 (bbsections Hot):
  jmp bb2
  jmp bb2
bb2 (bbsections Cold):
  /* do work... */
```
Differential Revision: https://reviews.llvm.org/D158674
2023-08-23Revert "[BasicBlockSections] avoid insertting redundant branch to fall through blocks"Rahman Lavaee1-12/+2
This reverts commit ab53109166c0345a79cbd6939cf7bc764a982856 which was committed by mistake.
2023-08-22[BasicBlockSections] avoid insertting redundant branch to fall through blocksRahman Lavaee1-2/+12
2023-08-20[Propeller] Deprecate Codegen paths for SHT_LLVM_BB_ADDR_MAP version 1.Rahman Lavaee1-13/+5
This patch removes the `getBBIDOrNumber` which was introduced to allow emitting version 1. Reviewed By: shenhan Differential Revision: https://reviews.llvm.org/D158299
2023-08-18[CodeGen] Use the TII hook for Noop insertion in BBSections (NFC)Daniel Hoekwater1-3/+1
Refactor BasicBlockSections to use the target-specific noop insertion hook from TargetInstrInfo instead of building it ourselves. Using the TII hook is both cleaner and makes it easier to extend BBSections to non-X86 targets. Differential Revision: https://reviews.llvm.org/D158303
2023-06-27[Propeller] Match debug info filenames from profiles to distinguish internal linkage functions with the same names.Rahman Lavaee1-4/+10
Basic block sections profiles are ingested based on the function name. However, conflicts may occur when internal linkage functions with the same symbol name are linked into the binary (for instance static functions defined in different modules). Currently, these functions cannot be optimized unless we use `-funique-internal-linkage-names` (D89617) to enforce unique symbol names.
However, we have found that `-funique-internal-linkage-names` does not play well with inline assembly code which refers to the symbol via its symbol name. For example, the Linux kernel does not build with this option.
This patch implements a new feature which allows differentiating profiles based on the debug info filenames associated with each function. When specified, the given path is compared against the debug info filename of the matching function and the profile is ingested only when the debug info filenames match. Backward-compatibility is guaranteed as omitting the specifiers from the profile would allow them to be matched by function name only. Also specifiers can be included for a subset of functions only.
Reviewed By: shenhan
Differential Revision: https://reviews.llvm.org/D146770
2023-05-09[IRGen] Change annotation metadata to support inserting tuple of strings into annotation metadata array.Zain Jaffal1-1/+1
Annotation metadata supports adding singular annotation strings to annotation block. This patch adds the ability to insert a tuple of strings into the metadata array. The idea here is that each tuple of strings represents a piece of information that can be all related. It makes it easier to parse through related metadata information given it will be contained in one tuple. For example in remarks any pass that implements annotation remarks can have different type of remarks and pass additional information for each. The original behaviour of annotation remarks is preserved here and we can mix tuple annotations and single annotations for the same instruction.
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D148328
2023-02-14Move global namespace cl::opt inside llvm::Fangrui Song1-1/+1
2023-01-17[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.Rahman Lavaee1-36/+44
Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

#### Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.

Hence, MBB numbers are not suitable and we need something else.

#### Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks.
We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

#### Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
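The difference between a volatile block number and a fixed ID can be shown with a toy model. This is only a sketch of the idea described in this commit message, not LLVM code: `ToyBlock` and `ToyFunction` are hypothetical, and only the per-function counter incremented at block creation corresponds to the mechanism the message describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a machine basic block.
struct ToyBlock {
  unsigned ID = 0; // fixed: assigned once at creation, never changed
  int Number = 0;  // volatile: position in the function, changes on renumbering
};

// Hypothetical stand-in for a machine function.
class ToyFunction {
  unsigned NextBBID = 0; // scoped to the function, like the counter in the message
  std::vector<std::unique_ptr<ToyBlock>> Blocks;

public:
  // Assigns the next fixed ID to the new block and then increments the counter.
  ToyBlock *createBlock() {
    auto B = std::make_unique<ToyBlock>();
    B->ID = NextBBID++;
    B->Number = static_cast<int>(Blocks.size());
    Blocks.push_back(std::move(B));
    return Blocks.back().get();
  }

  void eraseBlock(const ToyBlock *Target) {
    Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(),
                                [Target](const std::unique_ptr<ToyBlock> &P) {
                                  return P.get() == Target;
                                }),
                 Blocks.end());
  }

  // Reassigns Number by layout position; IDs are left untouched.
  void renumberBlocks() {
    for (std::size_t I = 0; I < Blocks.size(); ++I)
      Blocks[I]->Number = static_cast<int>(I);
  }
};
```

After creating three blocks, erasing the second, and renumbering, the third block's `Number` drops from 2 to 1 while its `ID` stays 2. A profile keyed on `ID` therefore still refers to the same block, which is the property the message relies on.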
2022-12-20[llvm] Use std::optional instead of OptionalKazu Hirata1-5/+5
This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-13Revert "[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number."Rahman Lavaee1-42/+34
This reverts commit 6015a045d768feab3bae9ad9c0c81e118df8b04a. Differential Revision: https://reviews.llvm.org/D139952
2022-12-07[llvm] Don't include Optional.h (NFC)Kazu Hirata1-1/+0
These source files no longer use Optional<T>. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-06[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.Rahman Lavaee1-34/+42
Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

#### Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.

Hence, MBB numbers are not suitable and we need something else.

#### Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks.
We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

#### Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
2022-11-26[CodeGen] Use std::optional in BasicBlockSections.cpp (NFC)Kazu Hirata1-1/+2
This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-07-22Add a nop instruction if a section starts with landing pad for function splitterARCHIT SAXENA1-4/+2
This change adds a nop instruction if a section starts with a landing pad. This change is like [D73739](https://reviews.llvm.org/D73739), which avoids zero offset landing pads in basic block sections.

Detailed description:
The current machine function splitter can create sections which start with a landing pad themselves. This places the landing pad at offset zero from LPStart.
```
    .section    .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
    .cfi_startproc
    .cfi_personality 3, __gxx_personality_v0
    .cfi_lsda 3, .Lexception5
    .cfi_def_cfa %rsp, 16
.Ltmp11:        <--- This is a Landing pad and also LP Start as it is start of this section
    movq    %rax, %rdi      <--- first instruction is at offset 0 from LPStart
    callq   _Unwind_Resume@PLT
```
This will cause landing pad entries to become zero (.Ltmp11-foo10.cold)
```
.Lcst_begin4:
    .uleb128 .Ltmp9-.Lfunc_begin2           # >> Call Site 1 <<
    .uleb128 .Ltmp10-.Ltmp9                 #   Call between .Ltmp9 and .Ltmp10
    .uleb128 .Ltmp11-foo10.cold    <---This is zero    #     jumps to .Ltmp11
    .byte   3                               #   On action: 2
    .uleb128 .Ltmp10-.Lfunc_begin2          # >> Call Site 2 <<
    .uleb128 .Lfunc_end9-.Ltmp10            #   Call between .Ltmp10 and .Lfunc_end9
    .byte   0                               #     has no landing pad
    .byte   0                               #   On action: cleanup
    .p2align    2
```
The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. This change adds a nop instruction at the start of such sections so that such a case can be avoided.

Output:
```
    .section    .text.split.foo10,"ax",@progbits
foo10.cold:                             # %lpad
    .cfi_startproc
    .cfi_personality 3, __gxx_personality_v0
    .cfi_lsda 3, .Lexception5
    .cfi_def_cfa %rsp, 16
    nop        <--- new instruction that is added
.Ltmp11:
    movq    %rax, %rdi
    callq   _Unwind_Resume@PLT
```
Reviewed By: modimo, snehasish, rahmanl
Differential Revision: https://reviews.llvm.org/D130133
2022-07-17[CodeGen] Qualify auto variables in for loops (NFC)Kazu Hirata1-1/+1
2022-06-28[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.Rahman Lavaee1-1/+1
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type.
The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. A use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end labels. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary). The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
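The aggregation step mentioned in this commit message ("offsets must be aggregated along with basic block sizes") can be sketched as a running sum. The names below (`EncodedBB`, `decodeOffsets`, `uleb128Size`) are hypothetical and this is not the decoder in `ELFFile::decodeBBAddrMap`; it only illustrates why the deltas stay small and how function-relative offsets are recovered from them.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical decoded form of one basic block entry in the delta encoding.
struct EncodedBB {
  uint64_t OffsetFromPrevEnd; // gap between the previous block's end and this block's begin
  uint64_t Size;              // size of this block in bytes
};

// Recovers the offset of each block relative to the function address by
// accumulating gaps and sizes.
std::vector<uint64_t> decodeOffsets(const std::vector<EncodedBB> &Entries) {
  std::vector<uint64_t> Offsets;
  uint64_t PrevEnd = 0; // the function address itself is offset 0
  for (const EncodedBB &E : Entries) {
    const uint64_t Begin = PrevEnd + E.OffsetFromPrevEnd;
    Offsets.push_back(Begin);
    PrevEnd = Begin + E.Size;
  }
  return Offsets;
}

// Number of bytes a value occupies in ULEB128 (7 payload bits per byte).
unsigned uleb128Size(uint64_t Value) {
  unsigned Bytes = 0;
  do {
    Value >>= 7;
    ++Bytes;
  } while (Value != 0);
  return Bytes;
}
```

A block that starts 300 bytes into its function needs two ULEB128 bytes when its function-relative offset is stored directly, but its gap from the previous block is usually 0 or a few padding bytes, which fits in one byte. That is the source of the size reduction the message reports.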
2022-06-26[llvm] Don't use Optional::hasValue (NFC)Kazu Hirata1-3/+2
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
2022-06-25Revert "Don't use Optional::hasValue (NFC)"Kazu Hirata1-2/+3
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25Don't use Optional::hasValue (NFC)Kazu Hirata1-3/+2
2022-06-20[llvm] Don't use Optional::getValue (NFC)Kazu Hirata1-1/+1
2022-06-20[llvm] Don't use Optional::hasValue (NFC)Kazu Hirata1-1/+1
2022-05-26Reland "[Propeller] Promote functions with propeller profiles to .text.hot."Rahman Lavaee1-153/+16
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554. The major change here is using `addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests. Differential Revision: https://reviews.llvm.org/D126518
2022-05-26Revert "[Propeller] Promote functions with propeller profiles to .text.hot."Rahman Lavaee1-16/+153
This reverts commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
2022-05-26[Propeller] Promote functions with propeller profiles to .text.hot.Rahman Lavaee1-153/+16
Today, text section prefixes (none, .unlikely, .hot, and .unknown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` is used, Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely). This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`. The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the function's text prefix. `BasicBlockSectionsProfileReader` will be used both by the `BasicBlockSections` pass and `CodeGenPrepare`. Differential Revision: https://reviews.llvm.org/D122930
2022-03-16Cleanup codegen includesserge-sans-paille1-1/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10Revert "Cleanup codegen includes"Nico Weber1-0/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10Cleanup codegen includesserge-sans-paille1-1/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2022-02-24Revert "Encode address offsets of basic blocks relative to the end of the previous basic blocks."Rahman Lavaee1-1/+1
This reverts commit 029283c1c0d8d06fbf000f5682c56b8595a1101f. The code in `ELFFile::decodeBBAddrMap` was not changed in the submitted patch. Differential Revision: https://reviews.llvm.org/D120457
2022-02-22Encode address offsets of basic blocks relative to the end of the previous basic blocks.Rahman Lavaee1-1/+1
Conceptually, the new encoding emits the offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, the offsets must be aggregated along with basic block sizes to calculate the final relative-to-function offsets of basic blocks. This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 25% reduction in the size of the bb-address-map section (reduction from about 9MB to 7MB). Reviewed By: tmsriram, jhenderson Differential Revision: https://reviews.llvm.org/D106421
2021-07-30Explain the symbols of basic block clusters with an example in the header comments.Rahman Lavaee1-3/+15
This prevents confusion with the ``labels`` option. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D107128
2021-03-15Change void getNoop(MCInst &NopInst) to MCInst getNop()Fangrui Song1-3/+2
Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.
2021-01-29Detect Source Drift with Propeller.Sriraman Tallam1-0/+38
Source Drift happens when the sources are updated after profiling the binary but before building the final optimized binary. If the source has changed since the profiles were obtained, optimizing basic blocks might be sub-optimal. This only applies to BasicBlockSection::List as it creates clusters of basic blocks using basic block ids. Source drift can invalidate these groupings, leading to sub-optimal code generation with regard to performance. PGO source drift for a particular function can be detected using function metadata added in D95495. When source drift is detected, basic block clusters are disabled by default; they can be re-enabled with the -mllvm option bbsections-detect-source-drift=false. Differential Revision: https://reviews.llvm.org/D95593
2020-10-14[llvm] Set the default for -bbsections-cold-text-prefix to .text.split.Snehasish Kumar1-3/+3
After using this for a while, we find that it is generally useful to have it set to .text.split. by default, removing the need for an additional -mllvm option. Differential Revision: https://reviews.llvm.org/D88997
2020-10-08Introduce and use a new section type for the bb_addr_map section.Rahman Lavaee1-3/+3
This patch lets the bb_addr_map (renamed to __llvm_bb_addr_map) section use a special section type (SHT_LLVM_BB_ADDR_MAP) instead of SHT_PROGBITS. This would help parsers, dumpers and other tools to use the sh_type ELF field to identify this section rather than relying on string comparison on the section name. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D88199