aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
AgeCommit message (Collapse)AuthorFilesLines
2021-01-26Add -fbinutils-version= to gate ELF features on the specified binutils versionFangrui Song1-3/+5
There are two use cases. Assembler We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some features are supported by latest GNU as, but we have to use MCAsmInfo::useIntegratedAs() because the newer versions have not been widely adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26). Linker We want to use features supported only by LLD or very new GNU ld, or don't want to work around older GNU ld. We currently can't represent that "we don't care about old GNU ld". You can find such workarounds in a few other places, e.g. Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276), R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969) Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001; GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available). This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table). This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc. It changes one codegen place in SHF_MERGE to demonstrate its usage. `-fbinutils-version=2.35` means the produced object file does not care about GNU ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced assembly can be consumed by GNU as>=2.35, but older versions may not work. `-fbinutils-version=none` means that we can use all ELF features, regardless of GNU as/ld support. Both clang and llc need `parseBinutilsVersion`. Such command line parsing is usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen), however, ClangCodeGen does not depend on LLVMCodeGen. So I add `parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget). Differential Revision: https://reviews.llvm.org/D85474
2021-01-20[AArch64] Add support for the GNU ILP32 ABIAmanieu d'Antras1-5/+14
Add the aarch64[_be]-*-gnu_ilp32 targets to support the GNU ILP32 ABI for AArch64. The needed codegen changes were mostly already implemented in D61259, which added support for the watchOS ILP32 ABI. The main changes are: - Wiring up the new target to enable ILP32 codegen and MC. - ILP32 va_list support. - ILP32 TLSDESC relocation support. There was existing MC support for ELF ILP32 relocations from D25159 which could be enabled by passing "-target-abi ilp32" to llvm-mc. This was changed to check for "gnu_ilp32" in the target triple instead. This shouldn't cause any issues since the existing support was slightly broken: it was generating ELF64 objects instead of the ELF32 object files expected by the GNU ILP32 toolchain. This target has been tested by running the full rustc testsuite on a big-endian ILP32 system based on the GCC ILP32 toolchain. Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D94143
2021-01-12[llvm] Remove redundant string initialization (NFC)Kazu Hirata1-1/+1
Identified with readability-redundant-string-init.
2021-01-02[PowerPC] Add the LLVM triple for powerpcle [1/5]Brandon Bergren1-0/+1
Add a triple for powerpcle-*-*. This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations: 1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs. Such a loader is implemented as a freestanding ELF32 LSB binary. 2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls. 3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93918
2020-12-16[XCOFF][AIX] Emit EH information in traceback tablediggerlin1-0/+23
SUMMARY: In order for the runtime on AIX to find the compact unwind section(EHInfo table), we would need to set the following on the traceback table: The 6th byte's longtbtable field to true to signal there is an Extended TB Table Flag. The Extended TB Table Flag to be 0x08 to signal there is an exception handling info presents. Emit the offset between ehinfo TC entry and TOC base after all other optional portions of traceback table. The patch is authored by Jason Liu. Reviewers: David Tenty, Digger Lin Differential Revision: https://reviews.llvm.org/D92766
2020-12-14[NFC] cleanup cg-profile emission on TargetLowerinngZequan Wu1-84/+2
Differential Revision: https://reviews.llvm.org/D93150
2020-12-10[CSSPGO] Pseudo probe encoding and emission.Hongtao Yu1-0/+24
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.  **ELF object emission** The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. **Assembling** Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq *%rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` **Disassembling** We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878
2020-12-10Revert "[CSSPGO] Pseudo probe encoding and emission."Mitch Phillips1-24/+0
This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1. Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10[CSSPGO] Pseudo probe encoding and emission.Hongtao Yu1-0/+24
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.  **ELF object emission** The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. **Assembling** Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq *%rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` **Disassembling** We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878
2020-12-08[CodeGen] Add text section prefix for COFF object filePan, Tao1-2/+6
Text section prefix is created in CodeGenPrepare, it's file format independent implementation, text section name is written into object file in TargetLoweringObjectFile, it's file format dependent implementation, port code of adding text section prefix to text section name from ELF to COFF. Different with ELF that use '.' as concatenation character, COFF use '$' as concatenation character. That is, concatenation character is variable, so split concatenation character from text section prefix. Text section prefix is existing feature of ELF, it can help to reduce icache and itlb misses, it's also make possible aggregate other compilers e.g. v8 created same prefix sections. Furthermore, the recent feature Machine Function Splitter (basic block level text prefix section) is based on text section prefix. Reviewed By: pengfei, rnk Differential Revision: https://reviews.llvm.org/D92073
2020-12-02[XCOFF][AIX] Generate LSDA data and compact unwind section on AIXjasonliu1-1/+5
Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX do not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality rountine, because the interoperability is not there yet on AIX. 3. AIX do not use eh_frame sections. Instead, it would use a eh_info section (compat unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455
2020-11-19[llvm][IR] Add dso_local_equivalent ConstantLeonard Chan1-0/+19
The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit. In this patch: - Addition of `dso_local_equivalent` and handling it - Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details. Differential Revision: https://reviews.llvm.org/D77248
2020-11-13[CGProfile] allows bitcast in metadata node storing function pointersYuanfang Chen1-1/+1
For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become bitcast. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D88433
2020-11-09[XCOFF] Enable explicit sections on AIXjasonliu1-15/+33
Implement mechanism to allow explicit sections to be generated on AIX. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D88615
2020-11-03[CodeGen] Fix regression from D83655Jessica Clarke1-1/+2
Arm EHABI has a null LSDASection as it does its own thing, so we should continue to return null in that case rather than try and cast it.
2020-11-02[AsmPrinter] Split up .gcc_except_tableFangrui Song1-0/+36
MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table: * if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat` * otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits` This ensures that (a) non-prevailing copies are discarded and (b) .gcc_except_table associated to discarded text sections can be discarded by a .gcc_except_table-aware linker (GNU ld, but not gold or LLD) This patches matches the GCC behavior. If -fno-unique-section-names is specified, we don't append the suffix. If -ffunction-sections is additionally specified, use `.section ...,unique`. Note, if clang driver communicates that the linker is LLD and we know it is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table costs, at least in the -fno-unique-section-names case. We cannot use it on GNU ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER components in an output section https://sourceware.org/bugzilla/show_bug.cgi?id=26256 For RISC-V -mrelax, this patch additionally fixes an assembler-linker interaction problem: because a section is shrinkable, the length of a call-site code range is not a constant. Relocations referencing the associated text section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a discarded section group member from outside the group is disallowed by the ELF specification (PR46675): ``` // a.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int main() { return comdat(); } // b.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int foo() { return comdat(); } clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section: ``` -fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations. Reviewed By: jrtc27, rahmanl Differential Revision: https://reviews.llvm.org/D83655
2020-10-14[llvm] Set the default for -bbsections-cold-text-prefix to .text.split.Snehasish Kumar1-1/+1
After using this for a while, we find that it is generally useful to have it set to .text.split. by default, removing the need for an additional -mllvm option. Differential Revision: https://reviews.llvm.org/D88997
2020-10-02[XCOFF] Enable -fdata-sections on AIXjasonliu1-21/+29
Summary: Some design decision worth noting about: I've noticed a recent mailing discussing about why string literal is not affected by -fdata-sections for ELF target: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html But on AIX, our linker could not split the mergeable string like other target. So I think it would make more sense for us to emit separate csect for every mergeable string in -fdata-sections mode, as there might not be other ways for linker to do garbage collection on unused mergeable string. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88339
2020-09-29[CodeGen] emit CG profile for COFF object fileZequan Wu1-10/+55
Differential Revision: https://reviews.llvm.org/D87811
2020-09-27Revert "Reland [CodeGen] emit CG profile for COFF object file"Arthur Eubanks1-14/+52
This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac. This still causes link errors, see https://crbug.com/1130780.
2020-09-24[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different ↵Snehasish Kumar1-1/+2
section. This change adds an option to basic block sections to allow cold clusters to be assigned a custom text prefix. With a custom prefix such as ".text.split." (D87840), lld can place them in a separate output section. The benefits are - * Empirically shown to improve icache and itlb metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely. * Mitigates against poor profiles, eg samplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at the section granularity. * Enables section granularity hotness monitoring (checking on the decisions made during compilation vs sample data from production). Differential Revision: https://reviews.llvm.org/D87813
2020-09-24Reland [CodeGen] emit CG profile for COFF object fileZequan Wu1-52/+14
This reverts commit 90242caca2074dab5a9b76e5bc36d9fafd2179a7. Error fixed at f5435399e823746bbe1737b95c853d77a42e1ac3 Differential Revision: https://reviews.llvm.org/D87811
2020-09-22Revert "[CodeGen] emit CG profile for COFF object file"Reid Kleckner1-14/+52
This reverts commit 91aed9bf975f1e4346cc8f4bdefc98436386ced2, it is causing link errors.
2020-09-18[COFF] Move per-global .drective emission from AsmPrinter to TLOFCOFFReid Kleckner1-25/+60
This changes the order of output sections and the output assembly, but is otherwise NFC. It simplifies the TLOF interface by removing two COFF-only methods.
2020-09-18[CodeGen] emit CG profile for COFF object fileZequan Wu1-52/+14
I forgot to add emission of CG profile for COFF object file, when adding the support (https://reviews.llvm.org/D81775) Differential Revision: https://reviews.llvm.org/D87811
2020-08-25[TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOCFangrui Song1-1/+2
There are two ways .llvmbc can be produced: * clang -c -fembed-bitcode=all (which also produces .llvmcmd) * LTO backend: ld.lld -mllvm -lto-embed-bitcode or -plugin-opt=-lto-embed-bitcode .llvmbc and .llvmcmd have the SHF_ALLOC flag, so they can be dropped by --gc-sections. This patch sets SectionKind::Metadata to drop the SHF_ALLOC flag. This is conceptually correct: the two sections are not part of the process image, so SHF_ALLOC is not appropriate. `test/LTO/X86/embed-bitcode.ll`: changed `llvm-objcopy -O binary --only-section` to `llvm-objcopy --dump-section`. `-O binary` does not dump non-SHF_ALLOC sections. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D86374
2020-08-11[AIX][XCOFF] change the operand of branch instruction from symbol name to ↵diggerlin1-22/+13
qualified symbol name for function declarations SUMMARY: 1. in the patch , remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF there are XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass; in the MCSectionXCOFF class, these attribute only used in the XCOFFObjectWriter, (asm path do not need the StorageClass) we need get the value of StorageClass, Type,MappingClass before we invoke the getXCOFFSection every time. actually , we can get the StorageClass of the MCSectionXCOFF from it's delegated symbol. 2. we also change the oprand of branch instruction from symbol name to qualify symbol name. for example change bl .foo extern .foo to bl .foo[PR] extern .foo[PR] 3. and if there is reference indirect call a function bar. we also add extern .bar[PR] Reviewers: Jason liu, Xiangling Liao Differential Revision: https://reviews.llvm.org/D84765
2020-08-10[XCOFF][AIX] Use TE storage mapping class when large code model is enabledjasonliu1-2/+5
Summary: Use TE SMC instead of TC SMC in large code model mode, so that large code model TOC entries could get placed after all the small code model TOC entries, which reduces the chance of TOC overflow. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D85455
2020-08-10[AIX] Static init frontend recovery and backend supportXiangling Liao1-4/+4
On the frontend side, this patch recovers AIX static init implementation to use the linkage type and function names Clang chooses for sinit related function. On the backend side, this patch sets correct linkage and function names on aliases created for sinit/sterm functions. Differential Revision: https://reviews.llvm.org/D84534
2020-08-06[XCOFF][AIX] Put each jump table in an independent section if ↵jasonliu1-4/+11
-ffunction-sections is specified If a function is in a unique section, putting all jump tables in .rodata will prevent functions that have a jump table to get garbage collect by the linker. Therefore, we need to put jump table into a unique section as well. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D84761
2020-08-03[MC] Set sh_link to 0 if the associated symbol is undefinedFangrui Song1-1/+1
Part of https://bugs.llvm.org/show_bug.cgi?id=41734 LTO can drop externally available definitions. Such AssociatedSymbol is not associated with a symbol. ELFWriter::writeSection() will assert. Allow a SHF_LINK_ORDER section to have sh_link=0. We need to give sh_link a syntax, a literal zero in the linked-to symbol position, e.g. `.section name,"ao",@progbits,0` Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D72899
2020-07-30[XCOFF][AIX] Enable -ffunction-sectionsjasonliu1-3/+24
Summary: This patch implements -ffunction-sections on AIX. This patch focuses on assembly generation. Follow-on patch needs to handle: 1. -ffunction-sections implication for jump table. 2. Object file generation path and associated testing. Differential Revision: https://reviews.llvm.org/D83875
2020-07-23Fix -Wparentheses warning - add missing brackets around the entire assertion ↵Simon Pilgrim1-5/+5
condition
2020-07-22[XCOFF] Enable symbol alias for AIXjasonliu1-5/+13
Summary: AIX assembly's .set directive is not usable for aliasing purpose. We need to use extra-label-at-defintion strategy to generate symbol aliasing on AIX. Reviewed By: DiggerLin, Xiangling_L Differential Revision: https://reviews.llvm.org/D83252
2020-07-09[SVE] Remove calls to VectorType::getNumElements from CodeGenChristopher Tetreault1-1/+1
Reviewers: efriedma, fpetrogalli, sdesmalen, RKSimon, arsenm Reviewed By: RKSimon Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82210
2020-07-06[XCOFF][AIX] Give symbol an internal name when desired symbol name contains ↵jasonliu1-1/+1
invalid character(s) Summary: When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table. Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L Differential Revision: https://reviews.llvm.org/D82481
2020-06-29[Alignment][NFC] migrate DataLayout::getPreferredAlignmentGuillaume Chatelet1-8/+8
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82752
2020-06-02Options for Basic Block Sections, enabled in D68063 and D73674.Sriraman Tallam1-1/+1
This patch adds clang options: -fbasic-block-sections={all,<filename>,labels,none} and -funique-basic-block-section-names. LLVM Support for basic block sections is already enabled. + -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic block sections for all or a subset of basic blocks. "labels" only enables basic block symbols. + -funique-basic-block-section-names: Enables unique section names for basic block sections, disabled by default. Differential Revision: https://reviews.llvm.org/D68049
2020-05-29[AIX] Emit AvailableExternally Linkage on AIXXiangling Liao1-5/+3
Since on AIX, our strategy is to not use -u to suppress any undefined symbols, we need to emit .extern for the symbols with AvailableExternally linkage. Differential Revision: https://reviews.llvm.org/D80642
2020-05-27[NFC][XCOFF][AIX] Return function entry point symbol with dedicate functionjasonliu1-0/+8
Use getFunctionEntryPointSymbol whenever possible to enclose the implementation detail and reduce duplicate logic. Differential Revision: https://reviews.llvm.org/D80402
2020-05-24[TargetLoweringObjectFileImpl] Use llvm::transformOrivej Desh1-1/+1
Fixes a build issue with libc++ configured with _LIBCPP_RAW_ITERATORS (ADL not effective) ``` llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1602:3: error: no matching function for call to 'transform' transform(HexString.begin(), HexString.end(), HexString.begin(), tolower); ^~~~~~~~~ ``` Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D80475
2020-05-21[Target] Use Align in TargetLoweringObjectFile::getSectionForConstant.Craig Topper1-14/+14
Differential Revision: https://reviews.llvm.org/D80363
2020-05-12[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for ↵Fangrui Song1-2/+6
-fno-unique-section-names GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee) .text : { *(.text.unlikely .text.*_unlikely .text.unlikely.*) *(.text.exit .text.exit.*) *(.text.startup .text.startup.*) *(.text.hot .text.hot.*) *(SORT(.text.sorted.*)) *(.text .stub .text.* .gnu.linkonce.t.*) /* .gnu.warning sections are handled specially by elf.em. */ *(.gnu.warning) } Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions. gold's `-z keep-text-section-prefix` has the same problem. In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem. In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit"` will go to the output section `.text` instead of `.text.exit` when linked by lld. To address the problem, append a dot to become `.text.exit.` Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D79600
2020-04-30[AIX] emit .extern and .weak directive linkagediggerlin1-3/+7
SUMMARY: emit .extern and .weak directive linkage Reviewers: hubert.reinterpretcast, Jason Liu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D76932
2020-04-24Use .text.unlikely and .text.eh prefixes for MachineBasicBlock sections.Snehasish Kumar1-16/+16
Summary: Instead of adding a ".unlikely" or ".eh" suffix for machine basic blocks, this change updates the behaviour to use an appropriate prefix instead. This allows lld to group basic block sections together when -z,keep-text-section-prefix is specified and matches the behaviour observed in gcc. Reviewers: tmsriram, mtrofin, efriedma Reviewed By: tmsriram, efriedma Subscribers: eli.friedman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78742
2020-04-17[XCOFF][AIX] Fix getSymbol to return the correct qualname when necessaryjasonliu1-0/+34
Summary: AIX symbol have qualname and unqualified name. The stock getSymbol could only return unqualified name, which leads us to patch many caller side(lowerConstant, getMCSymbolForTOCPseudoMO). So we should try to address this problem in the callee side(getSymbol) and clean up the caller side instead. Note: this is a "mostly" NFC patch, with a fix for the original lowerConstant behavior. Differential Revision: https://reviews.llvm.org/D78045
2020-04-16[MC][ELF] Put explicit section name symbols into entry size compatible sectionsbd1976llvm1-45/+134
Ensure that symbols explicitly* assigned a section name are placed into a section with a compatible entry size. This is done by creating multiple sections with the same name** if incompatible symbols are explicitly given the name of an incompatible section, whilst: - Avoiding using uniqued sections where possible (for readability and to maximize compatibly with assemblers). - Creating as few SHF_MERGE sections as possible (for efficiency). Given that each symbol is assigned to a section in a single pass, we must decide which section each symbol is assigned to without seeing the properties of all symbols. A stable and easy to understand assignment is desirable. The following rules facilitate this: The "generic" section for a given section name will be mergeable if the name is a mergeable "default" section name (such as .debug_str), a mergeable "implicit" section name (such as .rodata.str2.2), or MC has already created a mergeable "generic" section for the given section name (e.g. in response to a section directive in inline assembly). Otherwise, the "generic" section for a given name is non-mergeable; and, non-mergeable symbols are assigned to the "generic" section, while mergeable symbols are assigned to uniqued sections. Terminology: "default" sections are those always created by MC initially, e.g. .text or .debug_str. "implicit" sections are those created normally by MC in response to the symbols that it encounters, i.e. in the absence of an explicit section name assignment on the symbol, e.g. a function foo might be placed into a .text.foo section. "generic" sections are those that are referred to when a unique section ID is not supplied, e.g. if there are multiple unique .bob sections then ".quad .bob" will reference the generic .bob section. Typically, the generic section is just the first section of a given name to be created. Default sections are always generic. * Typically, section names might be explicitly assigned in source code using a language extension e.g. a section attribute: _attribute_ ((section ("section-name"))) - https://clang.llvm.org/docs/AttributeReference.html ** I refer to such sections as unique/uniqued sections. In assembly the ", unique," assembly syntax is used to express such sections. Fixes https://bugs.llvm.org/show_bug.cgi?id=43457. See https://reviews.llvm.org/D68101 for previous discussions leading to this patch. Some minor fixes were required to LLVM's tests, for tests had been using the old behavior - which allowed for explicitly assigning globals with incompatible entry sizes to a section. This fix relies on the ",unique ," assembly feature. This feature is not available until bintuils version 2.35 (https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the integrated assembler is not being used then we avoid using this feature for compatibility and instead try to place mergeable symbols into non-mergeable sections or issue an error otherwise. Differential Revision: https://reviews.llvm.org/D72194
2020-04-15[MC] Rename MCSection*::getSectionName() to getName(). NFCFangrui Song1-2/+2
A pending change will merge MCSection*::getName() to MCSection::getName().
2020-04-15[NFC] correct "thier" to "their"Josh Stone1-1/+1
2020-04-14[WebAssembly] Emit .llvmcmd and .llvmbc as custom sectionsSam Clegg1-0/+8
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45362 Differential Revision: https://reviews.llvm.org/D77115