author | Alan Modra <amodra@gmail.com> | 2018-01-13 18:53:41 +1030 |
---|---|---|
committer | Alan Modra <amodra@gmail.com> | 2018-01-17 18:51:04 +1030 |
commit | 9e390558cef76767a98123994c422d0642d86cf8 (patch) | |
tree | fa46b2f2c764e30be6e6e512fe29a01de0b4ad04 /gold | |
parent | 78742b93a5fef4b88b1391817e7bcc6738000044 (diff) | |
PowerPC PLT stub tidy
This is in preparation for the next patch adding Spectre variant 2
mitigation for PowerPC and PowerPC64. Besides tidying code involved
in stub output (to reduce the number of places where bctr is output),
the patch adds some user-visible features:
1) PowerPC64 ELFv2 global entry stubs are now aligned under the
control of --plt-align, with a default alignment of 32 bytes.
2) PowerPC64 __glink_PLTresolve is no longer padded out with nops.
3) PowerPC32 PLT stubs are aligned under the control of --plt-align,
with the default alignment being 16 bytes as before.
4) The PowerPC32 branch/nop table emitted before __glink_PLTresolve
is now smaller in many cases. It was sized incorrectly when the
__tls_get_addr_opt stub was used, and unnecessarily included space
for local ifuncs.
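The alignment rule behind items 1) and 3) can be sketched as a standalone program. This is only an illustration modelled on the gold side of this commit (Stub_table::stub_align and plt_call_align in the diff below); PltAlignOption and round_up are hypothetical names introduced here, and the defaults in the bfd linker may differ from gold's.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the parsed --plt-align state; gold itself
// reads this from parameters->options().
struct PltAlignOption
{
  bool user_set;         // was --plt-align given on the command line?
  unsigned int p2align;  // log2 of the requested alignment
};

// Stub alignment in bytes, following the patched gold logic:
// without --plt-align, 64-bit uses 32 and 32-bit uses 4; with it,
// the user value applies but never drops below one instruction.
unsigned int
stub_align(int size, const PltAlignOption& opt)
{
  unsigned int min_align = 4;
  if (!opt.user_set)
    return size == 64 ? 32 : min_align;
  unsigned int user_align = 1u << opt.p2align;
  return std::max(user_align, min_align);
}

// Round BYTES up to a multiple of ALIGN, which must be a power of two.
// Equivalent to the diff's (bytes + align - 1) & -align.
unsigned int
round_up(unsigned int bytes, unsigned int align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (bytes + align - 1) & ~(align - 1);
}
```

For example, a 20-byte stub rounded with round_up(20, stub_align(64, opt)) occupies 32 bytes when --plt-align is not given.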
bfd/
* elf32-ppc.c (GLINK_ENTRY_SIZE): Add parameters, handle
__tls_get_addr_opt, and alignment sizing.
(TLS_GET_ADDR_GLINK_SIZE): Delete.
(is_nonpic_glink_stub): Don't use GLINK_ENTRY_SIZE.
(ppc_elf_get_synthetic_symtab): Recognize stubs spaced at 4, 6,
or 8 insns.
(ppc_elf_link_hash_table_create): Init new ppc_elf_params field.
(allocate_dynrelocs): Use new GLINK_ENTRY_SIZE.
(ppc_elf_size_dynamic_sections): Likewise. Size branch table
by PLT reloc count.
(write_glink_stub): Handle __tls_get_addr_opt stub.
Pad out to size given by GLINK_ENTRY_SIZE.
(ppc_elf_relocate_section): Adjust write_glink_stub call.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Write PLTresolve without using
insn array since so many need rewriting.
* elf32-ppc.h (struct ppc_elf_params): Add plt_stub_align.
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Rename from
GLINK_CALL_STUB_SIZE. Add htab param and evaluate to size without
nops. Adjust all uses.
(ppc64_elf_get_synthetic_symtab): Don't use GLINK_CALL_STUB_SIZE
in glink_vma calculation.
(struct ppc_link_hash_table): Add global_entry section pointer.
(create_linkage_sections): Create separate section for global
entry stubs.
(PPC_LO, PPC_HI, PPC_HA): Move earlier.
(size_global_entry_stubs): Handle sizing for aligned stubs.
(ppc64_elf_size_dynamic_sections): Handle global_entry alloc,
and don't stash end of glink branch table in rawsize.
(ppc_build_one_stub): Rewrite stub size calculations.
(build_global_entry_stubs): Use new section.
(ppc64_elf_build_stubs): Don't pad __glink_PLTresolve with nops.
Build lazy link stubs out to end of section. Build global entry
stubs in new section.
gold/
* options.h (plt_align): Support for PowerPC32 too.
* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
(Stub_table::plt_call_size, branch_stub_size): Tidy.
(Stub_table::plt_call_align): Implement using stub_align.
(Output_data_glink::global_entry_align): New function.
(Output_data_glink::global_entry_off): New function.
(Output_data_glink::global_entry_address): Use global_entry_off.
(Output_data_glink::pltresolve_size): New function, replacing
pltresolve_size_ constant. Update all uses.
(Output_data_glink::add_global_entry): Align offset.
(Output_data_glink::set_final_data_size): Use global_entry_align.
(Stub_table::do_write): Don't pad __glink_PLTresolve with nops.
Tidy stub output. Use global_entry_off.
ld/
* emultempl/ppc32elf.em (params): Init new field.
(enum ppc32_opt): New enum to define OPTION_* values. Add
OPTION_PLT_ALIGN and OPTION_NO_PLT_ALIGN.
(PARSE_AND_LIST_LONGOPTS): Handle new options.
(PARSE_AND_LIST_ARGS_CASES): Likewise.
(PARSE_AND_LIST_OPTIONS): Likewise. Break up help output.
* emultempl/ppc64elf.em (ppc_add_stub_section): Init alignment
correctly for negative --plt-stub-align.
* testsuite/ld-powerpc/elfv2exe.d,
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/relbrlt.d,
* testsuite/ld-powerpc/relbrlt.s,
* testsuite/ld-powerpc/tlsexe.d,
* testsuite/ld-powerpc/tlsexe.r,
* testsuite/ld-powerpc/tlsexe32.d,
* testsuite/ld-powerpc/tlsexe32.g,
* testsuite/ld-powerpc/tlsexe32.r,
* testsuite/ld-powerpc/tlsexetoc.d,
* testsuite/ld-powerpc/tlsexetoc.r,
* testsuite/ld-powerpc/tlsopt5_32.d,
* testsuite/ld-powerpc/tlsso.d,
* testsuite/ld-powerpc/tlstocso.d: Update for changed stub order.
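As a rough illustration of the gold entry "(Output_data_glink::pltresolve_size): New function, replacing pltresolve_size_ constant", the size computation can be written as a free function. The arithmetic is taken from the gold/powerpc.cc hunk in this commit; passing size and abiversion as plain parameters is an assumption made here so the sketch compiles on its own.

```cpp
#include <cassert>

// Size in bytes of __glink_PLTresolve.  In gold this is a member of
// Output_data_glink that consults the target; here SIZE (32 or 64) and
// ABIVERSION are ordinary arguments for illustration only.
int
pltresolve_size(int size, int abiversion)
{
  assert(size == 32 || size == 64);
  if (size == 64)
    return 8 + (abiversion < 2 ? 11 * 4 : 14 * 4);
  return 16 * 4;
}
```

With these inputs the function gives 52 bytes for 64-bit ELFv1, 64 bytes for 64-bit ELFv2, and 64 bytes for 32-bit, compared with the previous fixed constant of 16*4 = 64.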
Diffstat (limited to 'gold')
-rw-r--r-- | gold/ChangeLog | 16 |
-rw-r--r-- | gold/options.h | 2 |
-rw-r--r-- | gold/powerpc.cc | 225 |
3 files changed, 149 insertions, 94 deletions
diff --git a/gold/ChangeLog b/gold/ChangeLog
index 6494b20..fff66e1 100644
--- a/gold/ChangeLog
+++ b/gold/ChangeLog
@@ -1,3 +1,19 @@
+2018-01-17 Alan Modra <amodra@gmail.com>
+
+        * options.h (plt_align): Support for PowerPC32 too.
+        * powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
+        (Stub_table::plt_call_size, branch_stub_size): Tidy.
+        (Stub_table::plt_call_align): Implement using stub_align.
+        (Output_data_glink::global_entry_align): New function.
+        (Output_data_glink::global_entry_off): New function.
+        (Output_data_glink::global_entry_address): Use global_entry_off.
+        (Output_data_glink::pltresolve_size): New function, replacing
+        pltresolve_size_ constant. Update all uses.
+        (Output_data_glink::add_global_entry): Align offset.
+        (Output_data_glink::set_final_data_size): Use global_entry_align.
+        (Stub_table::do_write): Don't pad __glink_PLTrelsolve with nops.
+        Tidy stub output. Use global_entry_off.
+
 2018-01-15 Cary Coutant <ccoutant@gmail.com>

         PR gold/22694
diff --git a/gold/options.h b/gold/options.h
index feb60cc..b39d5ff 100644
--- a/gold/options.h
+++ b/gold/options.h
@@ -1101,7 +1101,7 @@ class General_options
               NULL, N_("(ARM only) Ignore for backward compatibility"));

   DEFINE_var(plt_align, options::TWO_DASHES, '\0', 0, "5",
-             N_("(PowerPC64 only) Align PLT call stubs to fit cache lines"),
+             N_("(PowerPC only) Align PLT call stubs to fit cache lines"),
              N_("[=P2ALIGN]"), true, int, int, options::parse_uint, false);

   DEFINE_bool(plt_localentry, options::TWO_DASHES, '\0', false,
diff --git a/gold/powerpc.cc b/gold/powerpc.cc
index 94efcdf..9ed5b21 100644
--- a/gold/powerpc.cc
+++ b/gold/powerpc.cc
@@ -3524,7 +3524,7 @@ Target_powerpc<size, big_endian>::do_relax(int pass,

   if (this->glink_ != NULL)
     {
-      int stub_size = this->glink_->pltresolve_size;
+      int stub_size = this->glink_->pltresolve_size();
       Address value = -stub_size;
       if (size == 64)
         {
@@ -3580,7 +3580,7 @@ Target_powerpc<size, big_endian>::do_plt_fde_location(const Output_data* plt,
       // There are two FDEs for a position independent glink.
       // The first covers the branch table, the second
       // __glink_PLTresolve at the end of glink.
-      off_t resolve_size = this->glink_->pltresolve_size;
+      off_t resolve_size = this->glink_->pltresolve_size();
       if (oview[9] == elfcpp::DW_CFA_nop)
         len -= resolve_size;
       else
@@ -4391,9 +4391,9 @@ class Stub_table : public Output_relaxed_input_section
   unsigned int
   stub_align() const
   {
-    if (size == 32)
-      return 16;
-    unsigned int min_align = 32;
+    unsigned int min_align = 4;
+    if (!parameters->options().user_set_plt_align())
+      return size == 64 ? 32 : min_align;
     unsigned int user_align = 1 << parameters->options().plt_align();
     return std::max(user_align, min_align);
   }
@@ -4425,9 +4425,8 @@ class Stub_table : public Output_relaxed_input_section
     if (size == 32)
       {
         const Symbol* gsym = p->first.sym_;
-        if (this->targ_->is_tls_get_addr_opt(gsym))
-          return 12 * 4;
-        return 4 * 4;
+        return (4 * 4
+                + (this->targ_->is_tls_get_addr_opt(gsym) ? 8 * 4 : 0));
       }

     bool is_iplt;
@@ -4460,10 +4459,8 @@ class Stub_table : public Output_relaxed_input_section
   unsigned int
   plt_call_align(unsigned int bytes) const
   {
-    unsigned int align = 1 << parameters->options().plt_align();
-    if (align > 1)
-      bytes = (bytes + align - 1) & -align;
-    return bytes;
+    unsigned int align = this->stub_align();
+    return (bytes + align - 1) & -align;
   }

   // Return long branch stub size.
@@ -4473,9 +4470,10 @@ class Stub_table : public Output_relaxed_input_section
     Address loc = this->stub_address() + this->last_plt_size_ + p->second;
     if (p->first.dest_ - loc + (1 << 25) < 2 << 25)
       return 4;
-    if (size == 64 || !parameters->options().output_is_position_independent())
-      return 16;
-    return 32;
+    unsigned int bytes = 16;
+    if (size == 32 && parameters->options().output_is_position_independent())
+      bytes += 16;
+    return bytes;
   }

   // Write out stubs.
@@ -4884,7 +4882,6 @@ class Output_data_glink : public Output_section_data
  public:
   typedef typename elfcpp::Elf_types<size>::Elf_Addr Address;
   static const Address invalid_address = static_cast<Address>(0) - 1;
-  static const int pltresolve_size = 16*4;

   Output_data_glink(Target_powerpc<size, big_endian>* targ)
     : Output_section_data(16), targ_(targ), global_entry_stubs_(),
@@ -4900,12 +4897,35 @@ class Output_data_glink : public Output_section_data
   Address
   find_global_entry(const Symbol*) const;

+  unsigned int
+  global_entry_align(unsigned int off) const
+  {
+    unsigned int align = 1 << parameters->options().plt_align();
+    if (!parameters->options().user_set_plt_align())
+      align = size == 64 ? 32 : 4;
+    return (off + align - 1) & -align;
+  }
+
+  unsigned int
+  global_entry_off() const
+  {
+    return this->global_entry_align(this->end_branch_table_);
+  }
+
   Address
   global_entry_address() const
   {
     gold_assert(this->is_data_size_valid());
-    unsigned int global_entry_off = (this->end_branch_table_ + 15) & -16;
-    return this->address() + global_entry_off;
+    return this->address() + this->global_entry_off();
+  }
+
+  int
+  pltresolve_size() const
+  {
+    if (size == 64)
+      return (8
+              + (this->targ_->abiversion() < 2 ? 11 * 4 : 14 * 4));
+    return 16 * 4;
   }

  protected:
@@ -4977,10 +4997,11 @@ template<int size, bool big_endian>
 void
 Output_data_glink<size, big_endian>::add_global_entry(const Symbol* gsym)
 {
+  unsigned int off = this->global_entry_align(this->ge_size_);
   std::pair<typename Global_entry_stub_entries::iterator, bool> p
-    = this->global_entry_stubs_.insert(std::make_pair(gsym, this->ge_size_));
+    = this->global_entry_stubs_.insert(std::make_pair(gsym, off));
   if (p.second)
-    this->ge_size_ += 16;
+    this->ge_size_ = off + 16;
 }

 template<int size, bool big_endian>
@@ -5007,11 +5028,11 @@ Output_data_glink<size, big_endian>::set_final_data_size()
             total += 4 * (count - 1);

           total += -total & 15;
-          total += this->pltresolve_size;
+          total += this->pltresolve_size();
         }
       else
         {
-          total += this->pltresolve_size;
+          total += this->pltresolve_size();

           // space for branch table
           total += 4 * count;
@@ -5024,7 +5045,7 @@ Output_data_glink<size, big_endian>::set_final_data_size()
         }
     }
   this->end_branch_table_ = total;
-  total = (total + 15) & -16;
+  total = this->global_entry_align(total);
   total += this->ge_size_;

   this->set_data_size(total);
@@ -5175,7 +5196,7 @@ Stub_table<size, big_endian>::do_write(Output_file* of)
                     = ((pltoff - this->targ_->first_plt_entry_offset())
                        / this->targ_->plt_entry_size());
                   Address glinkoff
-                    = (this->targ_->glink_section()->pltresolve_size
+                    = (this->targ_->glink_section()->pltresolve_size()
                        + pltindex * 8);
                   if (pltindex > 32768)
                     glinkoff += (pltindex - 32768) * 4;
@@ -5441,26 +5462,24 @@ Stub_table<size, big_endian>::do_write(Output_file* of)

                   Address off = plt_addr - got_addr;
                   if (ha(off) == 0)
-                    {
-                      write_insn<big_endian>(p + 0, lwz_11_30 + l(off));
-                      write_insn<big_endian>(p + 4, mtctr_11);
-                      write_insn<big_endian>(p + 8, bctr);
-                    }
+                    write_insn<big_endian>(p, lwz_11_30 + l(off));
                   else
                     {
-                      write_insn<big_endian>(p + 0, addis_11_30 + ha(off));
-                      write_insn<big_endian>(p + 4, lwz_11_11 + l(off));
-                      write_insn<big_endian>(p + 8, mtctr_11);
-                      write_insn<big_endian>(p + 12, bctr);
+                      write_insn<big_endian>(p, addis_11_30 + ha(off));
+                      p += 4;
+                      write_insn<big_endian>(p, lwz_11_11 + l(off));
                     }
                 }
               else
                 {
-                  write_insn<big_endian>(p + 0, lis_11 + ha(plt_addr));
-                  write_insn<big_endian>(p + 4, lwz_11_11 + l(plt_addr));
-                  write_insn<big_endian>(p + 8, mtctr_11);
-                  write_insn<big_endian>(p + 12, bctr);
+                  write_insn<big_endian>(p, lis_11 + ha(plt_addr));
+                  p += 4;
+                  write_insn<big_endian>(p, lwz_11_11 + l(plt_addr));
                 }
+              p += 4;
+              write_insn<big_endian>(p, mtctr_11);
+              p += 4;
+              write_insn<big_endian>(p, bctr);
             }
         }

@@ -5479,23 +5498,29 @@ Stub_table<size, big_endian>::do_write(Output_file* of)
                 write_insn<big_endian>(p, b | (delta & 0x3fffffc));
               else if (!parameters->options().output_is_position_independent())
                 {
-                  write_insn<big_endian>(p + 0, lis_12 + ha(bs->first.dest_));
-                  write_insn<big_endian>(p + 4, addi_12_12 + l(bs->first.dest_));
-                  write_insn<big_endian>(p + 8, mtctr_12);
-                  write_insn<big_endian>(p + 12, bctr);
+                  write_insn<big_endian>(p, lis_12 + ha(bs->first.dest_));
+                  p += 4;
+                  write_insn<big_endian>(p, addi_12_12 + l(bs->first.dest_));
                 }
               else
                 {
                   delta -= 8;
-                  write_insn<big_endian>(p + 0, mflr_0);
-                  write_insn<big_endian>(p + 4, bcl_20_31);
-                  write_insn<big_endian>(p + 8, mflr_12);
-                  write_insn<big_endian>(p + 12, addis_12_12 + ha(delta));
-                  write_insn<big_endian>(p + 16, addi_12_12 + l(delta));
-                  write_insn<big_endian>(p + 20, mtlr_0);
-                  write_insn<big_endian>(p + 24, mtctr_12);
-                  write_insn<big_endian>(p + 28, bctr);
+                  write_insn<big_endian>(p, mflr_0);
+                  p += 4;
+                  write_insn<big_endian>(p, bcl_20_31);
+                  p += 4;
+                  write_insn<big_endian>(p, mflr_12);
+                  p += 4;
+                  write_insn<big_endian>(p, addis_12_12 + ha(delta));
+                  p += 4;
+                  write_insn<big_endian>(p, addi_12_12 + l(delta));
+                  p += 4;
+                  write_insn<big_endian>(p, mtlr_0);
                 }
+              p += 4;
+              write_insn<big_endian>(p, mtctr_12);
+              p += 4;
+              write_insn<big_endian>(p, bctr);
             }
         }
       if (this->need_save_res_)
@@ -5563,8 +5588,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
           write_insn<big_endian>(p, ld_11_11 + 8), p += 4;
         }
       write_insn<big_endian>(p, bctr), p += 4;
-      while (p < oview + this->pltresolve_size)
-        write_insn<big_endian>(p, nop), p += 4;
+      gold_assert(p == oview + this->pltresolve_size());

       // Write lazy link call stubs.
       uint32_t indx = 0;
@@ -5590,7 +5614,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)

       Address plt_base = this->targ_->plt_section()->address();
       Address iplt_base = invalid_address;
-      unsigned int global_entry_off = (this->end_branch_table_ + 15) & -16;
+      unsigned int global_entry_off = this->global_entry_off();
       Address global_entry_base = this->address() + global_entry_off;
       typename Global_entry_stub_entries::const_iterator ge;
       for (ge = this->global_entry_stubs_.begin();
@@ -5631,7 +5655,7 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)

       // Write out pltresolve branch table.
       p = oview;
-      unsigned int the_end = oview_size - this->pltresolve_size;
+      unsigned int the_end = oview_size - this->pltresolve_size();
       unsigned char* end_p = oview + the_end;
       while (p < end_p - 8 * 4)
         write_insn<big_endian>(p, b + end_p - p), p += 4;
@@ -5639,68 +5663,85 @@ Output_data_glink<size, big_endian>::do_write(Output_file* of)
         write_insn<big_endian>(p, nop), p += 4;

       // Write out pltresolve call stub.
+      end_p = oview + oview_size;
       if (parameters->options().output_is_position_independent())
         {
           Address res0_off = 0;
           Address after_bcl_off = the_end + 12;
           Address bcl_res0 = after_bcl_off - res0_off;

-          write_insn<big_endian>(p + 0, addis_11_11 + ha(bcl_res0));
-          write_insn<big_endian>(p + 4, mflr_0);
-          write_insn<big_endian>(p + 8, bcl_20_31);
-          write_insn<big_endian>(p + 12, addi_11_11 + l(bcl_res0));
-          write_insn<big_endian>(p + 16, mflr_12);
-          write_insn<big_endian>(p + 20, mtlr_0);
-          write_insn<big_endian>(p + 24, sub_11_11_12);
+          write_insn<big_endian>(p, addis_11_11 + ha(bcl_res0));
+          p += 4;
+          write_insn<big_endian>(p, mflr_0);
+          p += 4;
+          write_insn<big_endian>(p, bcl_20_31);
+          p += 4;
+          write_insn<big_endian>(p, addi_11_11 + l(bcl_res0));
+          p += 4;
+          write_insn<big_endian>(p, mflr_12);
+          p += 4;
+          write_insn<big_endian>(p, mtlr_0);
+          p += 4;
+          write_insn<big_endian>(p, sub_11_11_12);
+          p += 4;

           Address got_bcl = g_o_t + 4 - (after_bcl_off + this->address());

-          write_insn<big_endian>(p + 28, addis_12_12 + ha(got_bcl));
+          write_insn<big_endian>(p, addis_12_12 + ha(got_bcl));
+          p += 4;
           if (ha(got_bcl) == ha(got_bcl + 4))
             {
-              write_insn<big_endian>(p + 32, lwz_0_12 + l(got_bcl));
-              write_insn<big_endian>(p + 36, lwz_12_12 + l(got_bcl + 4));
+              write_insn<big_endian>(p, lwz_0_12 + l(got_bcl));
+              p += 4;
+              write_insn<big_endian>(p, lwz_12_12 + l(got_bcl + 4));
             }
           else
             {
-              write_insn<big_endian>(p + 32, lwzu_0_12 + l(got_bcl));
-              write_insn<big_endian>(p + 36, lwz_12_12 + 4);
+              write_insn<big_endian>(p, lwzu_0_12 + l(got_bcl));
+              p += 4;
+              write_insn<big_endian>(p, lwz_12_12 + 4);
             }
-          write_insn<big_endian>(p + 40, mtctr_0);
-          write_insn<big_endian>(p + 44, add_0_11_11);
-          write_insn<big_endian>(p + 48, add_11_0_11);
-          write_insn<big_endian>(p + 52, bctr);
-          write_insn<big_endian>(p + 56, nop);
-          write_insn<big_endian>(p + 60, nop);
+          p += 4;
+          write_insn<big_endian>(p, mtctr_0);
+          p += 4;
+          write_insn<big_endian>(p, add_0_11_11);
+          p += 4;
+          write_insn<big_endian>(p, add_11_0_11);
         }
       else
         {
           Address res0 = this->address();

-          write_insn<big_endian>(p + 0, lis_12 + ha(g_o_t + 4));
-          write_insn<big_endian>(p + 4, addis_11_11 + ha(-res0));
+          write_insn<big_endian>(p, lis_12 + ha(g_o_t + 4));
+          p += 4;
+          write_insn<big_endian>(p, addis_11_11 + ha(-res0));
+          p += 4;
           if (ha(g_o_t + 4) == ha(g_o_t + 8))
-            write_insn<big_endian>(p + 8, lwz_0_12 + l(g_o_t + 4));
+            write_insn<big_endian>(p, lwz_0_12 + l(g_o_t + 4));
           else
-            write_insn<big_endian>(p + 8, lwzu_0_12 + l(g_o_t + 4));
-          write_insn<big_endian>(p + 12, addi_11_11 + l(-res0));
-          write_insn<big_endian>(p + 16, mtctr_0);
-          write_insn<big_endian>(p + 20, add_0_11_11);
+            write_insn<big_endian>(p, lwzu_0_12 + l(g_o_t + 4));
+          p += 4;
+          write_insn<big_endian>(p, addi_11_11 + l(-res0));
+          p += 4;
+          write_insn<big_endian>(p, mtctr_0);
+          p += 4;
+          write_insn<big_endian>(p, add_0_11_11);
+          p += 4;
           if (ha(g_o_t + 4) == ha(g_o_t + 8))
-            write_insn<big_endian>(p + 24, lwz_12_12 + l(g_o_t + 8));
+            write_insn<big_endian>(p, lwz_12_12 + l(g_o_t + 8));
           else
-            write_insn<big_endian>(p + 24, lwz_12_12 + 4);
-          write_insn<big_endian>(p + 28, add_11_0_11);
-          write_insn<big_endian>(p + 32, bctr);
-          write_insn<big_endian>(p + 36, nop);
-          write_insn<big_endian>(p + 40, nop);
-          write_insn<big_endian>(p + 44, nop);
-          write_insn<big_endian>(p + 48, nop);
-          write_insn<big_endian>(p + 52, nop);
-          write_insn<big_endian>(p + 56, nop);
-          write_insn<big_endian>(p + 60, nop);
+            write_insn<big_endian>(p, lwz_12_12 + 4);
+          p += 4;
+          write_insn<big_endian>(p, add_11_0_11);
+        }
+      p += 4;
+      write_insn<big_endian>(p, bctr);
+      p += 4;
+      while (p < end_p)
+        {
+          write_insn<big_endian>(p, nop);
+          p += 4;
         }
-      p += 64;
     }

   of->write_output_view(off, oview_size, oview);
@@ -8161,7 +8202,7 @@ Target_powerpc<size, big_endian>::do_finalize_sections(
               this->glink_->finalize_data_size();
               odyn->add_section_plus_offset(elfcpp::DT_PPC64_GLINK,
                                             this->glink_,
-                                            (this->glink_->pltresolve_size
+                                            (this->glink_->pltresolve_size()
                                              - 32));
             }
           if (this->has_localentry0_ || this->has_tls_get_addr_opt_)
@@ -10187,8 +10228,6 @@ Target_selector_powerpc<64, false> target_selector_ppc64le;

 // Instantiate these constants for -O0
 template<int size, bool big_endian>
-const int Output_data_glink<size, big_endian>::pltresolve_size;
-template<int size, bool big_endian>
 const typename Output_data_glink<size, big_endian>::Address
   Output_data_glink<size, big_endian>::invalid_address;
 template<int size, bool big_endian>
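The offset bookkeeping in the add_global_entry hunk above can be tried out in isolation. The following is a minimal sketch, not gold code: GlobalEntryAllocator is a hypothetical class, symbols are keyed by name rather than by const Symbol*, and the alignment is passed in directly instead of being derived from --plt-align.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical model of the global entry stub offsets kept by
// Output_data_glink.  Each stub is 16 bytes and starts on an ALIGN
// boundary, as in the patched add_global_entry.
class GlobalEntryAllocator
{
 public:
  explicit GlobalEntryAllocator(unsigned int align)
    : align_(align), ge_size_(0)
  {
    // ALIGN must be a power of two for the mask below to be valid.
    assert(align != 0 && (align & (align - 1)) == 0);
  }

  // Return the stub offset for SYM, allocating a slot on first use.
  unsigned int
  add(const std::string& sym)
  {
    unsigned int off
      = (this->ge_size_ + this->align_ - 1) & ~(this->align_ - 1);
    std::pair<std::map<std::string, unsigned int>::iterator, bool> p
      = this->offsets_.insert(std::make_pair(sym, off));
    if (p.second)
      this->ge_size_ = off + 16;  // only grow for a newly inserted symbol
    return p.first->second;
  }

  // Bytes used so far, up to the end of the last stub.
  unsigned int
  size() const
  { return this->ge_size_; }

 private:
  unsigned int align_;
  unsigned int ge_size_;
  std::map<std::string, unsigned int> offsets_;
};
```

With a 32-byte alignment, successive new symbols land at offsets 0, 32, 64 and so on, while a repeated symbol returns its existing offset and leaves the size unchanged.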