author | H.J. Lu <hjl.tools@gmail.com> | 2022-08-19 11:50:41 -0700 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2025-08-13 12:34:13 -0700 |
commit | 5cf1b9a03ec5b617af8c50c1e9c0d223083fd7f2 (patch) | |
tree | 88698f601ab9c4b01e87089f12695ae2f7c1c5ae /libgfortran/generated/shape_i4.c | |
parent | 90238c0f172b9ea7189cf295584632217792b55a (diff) | |
x86-64: Remove redundant TLS calls
For TLS calls:
1. UNSPEC_TLS_GD:
   (parallel [
     (set (reg:DI 0 ax)
          (call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")))
                   (const_int 0 [0])))
     (unspec:DI [(symbol_ref:DI ("e") [flags 0x50])
                 (reg/f:DI 7 sp)] UNSPEC_TLS_GD)
     (clobber (reg:DI 5 di))])
2. UNSPEC_TLS_LD_BASE:
   (parallel [
     (set (reg:DI 0 ax)
          (call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")))
                   (const_int 0 [0])))
     (unspec:DI [(reg/f:DI 7 sp)] UNSPEC_TLS_LD_BASE)])
3. UNSPEC_TLSDESC:
   (parallel [
     (set (reg/f:DI 104)
          (plus:DI (unspec:DI [
                      (symbol_ref:DI ("_TLS_MODULE_BASE_") [flags 0x10])
                      (reg:DI 114)
                      (reg/f:DI 7 sp)] UNSPEC_TLSDESC)
                   (const:DI (unspec:DI [
                                (symbol_ref:DI ("e") [flags 0x1a])
                              ] UNSPEC_DTPOFF))))
     (clobber (reg:CC 17 flags))])
   (parallel [
     (set (reg:DI 101)
          (unspec:DI [(symbol_ref:DI ("e") [flags 0x50])
                      (reg:DI 112)
                      (reg/f:DI 7 sp)] UNSPEC_TLSDESC))
     (clobber (reg:CC 17 flags))])
they return the same value for the same input value.  Even so, multiple
calls with the same input value may be generated for simple programs like:
    void a(long *);
    int b(void);
    void c(void);
    static __thread long e;
    long
    d(void)
    {
      a(&e);
      if (b())
        c();
      return e;
    }
When compiled with -O2 -fPIC -mtls-dialect=gnu2, the following code is
generated:
        .type   d, @function
d:
.LFB0:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        leaq    e@TLSDESC(%rip), %rbx
        movq    %rbx, %rax
        call    *e@TLSCALL(%rax)
        addq    %fs:0, %rax
        movq    %rax, %rdi
        call    a@PLT
        call    b@PLT
        testl   %eax, %eax
        jne     .L8
        movq    %rbx, %rax
        call    *e@TLSCALL(%rax)
        popq    %rbx
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        movq    %fs:(%rax), %rax
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        .cfi_restore_state
        call    c@PLT
        movq    %rbx, %rax
        call    *e@TLSCALL(%rax)
        popq    %rbx
        .cfi_def_cfa_offset 8
        movq    %fs:(%rax), %rax
        ret
        .cfi_endproc
There are 3 "call *e@TLSCALL(%rax)" instructions, and they all return the
same value.  Rename the remove_redundant_vector_load pass to the x86_cse
pass and, for 64-bit, extend it to also remove redundant TLS calls,
generating:
d:
.LFB0:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        leaq    e@TLSDESC(%rip), %rax
        movq    %fs:0, %rdi
        call    *e@TLSCALL(%rax)
        addq    %rax, %rdi
        movq    %rax, %rbx
        call    a@PLT
        call    b@PLT
        testl   %eax, %eax
        jne     .L8
        movq    %fs:(%rbx), %rax
        popq    %rbx
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        .cfi_restore_state
        call    c@PLT
        movq    %fs:(%rbx), %rax
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
with only one "call *e@TLSCALL(%rax)". This reduces the number of
__tls_get_addr calls in libgcc.a by 72%:
    __tls_get_addr calls    before    after
    libgcc.a                   868      243
gcc/
PR target/81501
* config/i386/i386-features.cc (x86_cse_kind): Add X86_CSE_TLS_GD,
X86_CSE_TLS_LD_BASE and X86_CSE_TLSDESC.
(redundant_load): Renamed to ...
(redundant_pattern): This.
(ix86_place_single_vector_set): Replace redundant_load with
redundant_pattern.
(replace_tls_call): New.
(ix86_place_single_tls_call): Likewise.
(pass_remove_redundant_vector_load): Renamed to ...
(pass_x86_cse): This. Add val, def_insn, mode, scalar_mode, kind,
x86_cse, candidate_gnu_tls_p, candidate_gnu2_tls_p and
candidate_vector_p.
(pass_x86_cse::candidate_gnu_tls_p): New.
(pass_x86_cse::candidate_gnu2_tls_p): Likewise.
(pass_x86_cse::candidate_vector_p): Likewise.
(remove_redundant_vector_load): Renamed to ...
(pass_x86_cse::x86_cse): This. Extend to remove redundant TLS
calls.
(make_pass_remove_redundant_vector_load): Renamed to ...
(make_pass_x86_cse): This.
* config/i386/i386-passes.def: Replace
pass_remove_redundant_vector_load with pass_x86_cse.
* config/i386/i386-protos.h (ix86_tls_get_addr): New.
(make_pass_remove_redundant_vector_load): Renamed to ...
(make_pass_x86_cse): This.
* config/i386/i386.cc (ix86_tls_get_addr): Remove static.
* config/i386/i386.h (machine_function): Add
tls_descriptor_call_multiple_p.
* config/i386/i386.md (tls64): New attribute.
(@tls_global_dynamic_64_<mode>): Set tls_descriptor_call_multiple_p.
(@tls_local_dynamic_base_64_<mode>): Likewise.
(@tls_dynamic_gnu2_64_<mode>): Likewise.
(*tls_global_dynamic_64_<mode>): Set tls64 attribute to gd.
(*tls_local_dynamic_base_64_<mode>): Set tls64 attribute to ld_base.
(*tls_dynamic_gnu2_lea_64_<mode>): Set tls64 attribute to lea.
(*tls_dynamic_gnu2_call_64_<mode>): Set tls64 attribute to call.
(*tls_dynamic_gnu2_combine_64_<mode>): Set tls64 attribute to
combine.
gcc/testsuite/
PR target/81501
* g++.target/i386/pr81501-1.C: New test.
* gcc.target/i386/pr81501-1a.c: Likewise.
* gcc.target/i386/pr81501-1b.c: Likewise.
* gcc.target/i386/pr81501-2a.c: Likewise.
* gcc.target/i386/pr81501-2b.c: Likewise.
* gcc.target/i386/pr81501-3.c: Likewise.
* gcc.target/i386/pr81501-4a.c: Likewise.
* gcc.target/i386/pr81501-4b.c: Likewise.
* gcc.target/i386/pr81501-5.c: Likewise.
* gcc.target/i386/pr81501-6a.c: Likewise.
* gcc.target/i386/pr81501-6b.c: Likewise.
* gcc.target/i386/pr81501-7.c: Likewise.
* gcc.target/i386/pr81501-8a.c: Likewise.
* gcc.target/i386/pr81501-8b.c: Likewise.
* gcc.target/i386/pr81501-9a.c: Likewise.
* gcc.target/i386/pr81501-9b.c: Likewise.
* gcc.target/i386/pr81501-10a.c: Likewise.
* gcc.target/i386/pr81501-10b.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>