Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch supports using pcrel instructions in TLS code sequences. A
number of new relocations are needed, gas operand modifiers to
generate those relocations, and new TLS optimisation. For
optimisation it turns out that the new pcrel GD and LD sequences can
be distinguished from the non-pcrel GD and LD sequences by there being
different relocations on the new sequence. The final "add ra,rb,13"
on IE sequences similarly needs a new relocation, or as I chose, a
modification of R_PPC64_TLS. On pcrel IE code, the R_PPC64_TLS points
one byte into the "add" instruction rather than being on the
instruction boundary.
GD:
pla 3,z@got@tlsgd@pcrel # R_PPC64_GOT_TLSGD34
bl __tls_get_addr@notoc(z@tlsgd) # R_PPC64_TLSGD and R_PPC64_REL24_NOTOC
edited to IE
pld 3,z@got@tprel@pcrel
add 3,3,13
edited to LE
paddi 3,13,z@tprel
nop
LD:
pla 3,z@got@tlsld@pcrel # R_PPC64_GOT_TLSLD34
bl __tls_get_addr@notoc(z@tlsld) # R_PPC64_TLSLD and R_PPC64_REL24_NOTOC
..
paddi 9,3,z2@dtprel
pld 10,z3@got@dtprel@pcrel
add 10,10,3
edited to LE
paddi 3,13,0x1000
nop
IE:
pld 9,z@got@tprel@pcrel # R_PPC64_GOT_TPREL34
add 3,9,z@tls@pcrel # R_PPC64_TLS at insn+1
ldx 4,9,z@tls@pcrel
lwax 5,9,z@tls@pcrel
stdx 5,9,z@tls@pcrel
edited to LE
paddi 9,13,z@tprel
nop
ld 4,0(9)
lwa 5,0(9)
std 5,0(9)
LE:
paddi 10,13,z@tprel
include/
* elf/ppc64.h (R_PPC64_TPREL34, R_PPC64_DTPREL34),
(R_PPC64_GOT_TLSGD34, R_PPC64_GOT_TLSLD34),
(R_PPC64_GOT_TPREL34, R_PPC64_GOT_DTPREL34): Define.
(IS_PPC64_TLS_RELOC): Include new tls relocs.
bfd/
* reloc.c (BFD_RELOC_PPC64_TPREL34, BFD_RELOC_PPC64_DTPREL34),
(BFD_RELOC_PPC64_GOT_TLSGD34, BFD_RELOC_PPC64_GOT_TLSLD34),
(BFD_RELOC_PPC64_GOT_TPREL34, BFD_RELOC_PPC64_GOT_DTPREL34),
(BFD_RELOC_PPC64_TLS_PCREL): New pcrel tls relocs.
* elf64-ppc.c (ppc64_elf_howto_raw): Add howtos for pcrel tls relocs.
(ppc64_elf_reloc_type_lookup): Translate pcrel tls relocs.
(must_be_dyn_reloc, dec_dynrel_count): Add R_PPC64_TPREL64.
(ppc64_elf_check_relocs): Support pcrel tls relocs.
(ppc64_elf_tls_optimize, ppc64_elf_relocate_section): Likewise.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Map "tls@pcrel", "got@tlsgd@pcrel",
"got@tlsld@pcrel", "got@tprel@pcrel", and "got@dtprel@pcrel".
(fixup_size, md_assemble): Handle pcrel tls relocs.
(ppc_force_relocation, ppc_fix_adjustable): Likewise.
(md_apply_fix, tc_gen_reloc): Likewise.
ld/
* testsuite/ld-powerpc/tlsgd.d,
* testsuite/ld-powerpc/tlsgd.s,
* testsuite/ld-powerpc/tlsie.d,
* testsuite/ld-powerpc/tlsie.s,
* testsuite/ld-powerpc/tlsld.d,
* testsuite/ld-powerpc/tlsld.s: New tests.
* testsuite/ld-powerpc/powerpc.exp: Run them.
|
|
include/
* elf/ppc64.h (R_PPC64_PLTSEQ_NOTOC, R_PPC64_PLTCALL_NOTOC),
(R_PPC64_PCREL_OPT, R_PPC64_D34, R_PPC64_D34_LO, R_PPC64_D34_HI30),
(R_PPC64_D34_HA30, R_PPC64_PCREL34, R_PPC64_GOT_PCREL34),
(R_PPC64_PLT_PCREL34, R_PPC64_PLT_PCREL34_NOTOC),
(R_PPC64_ADDR16_HIGHER34, R_PPC64_ADDR16_HIGHERA34),
(R_PPC64_ADDR16_HIGHEST34, R_PPC64_ADDR16_HIGHESTA34),
(R_PPC64_REL16_HIGHER34, R_PPC64_REL16_HIGHERA34),
(R_PPC64_REL16_HIGHEST34, R_PPC64_REL16_HIGHESTA34),
(R_PPC64_D28, R_PPC64_PCREL28): Define.
bfd/
* reloc.c (BFD_RELOC_PPC64_D34, BFD_RELOC_PPC64_D34_LO),
(BFD_RELOC_PPC64_D34_HI30, BFD_RELOC_PPC64_D34_HA30),
(BFD_RELOC_PPC64_PCREL34, BFD_RELOC_PPC64_GOT_PCREL34),
(BFD_RELOC_PPC64_PLT_PCREL34),
(BFD_RELOC_PPC64_ADDR16_HIGHER34, BFD_RELOC_PPC64_ADDR16_HIGHERA34),
(BFD_RELOC_PPC64_ADDR16_HIGHEST34, BFD_RELOC_PPC64_ADDR16_HIGHESTA34),
(BFD_RELOC_PPC64_REL16_HIGHER34, BFD_RELOC_PPC64_REL16_HIGHERA34),
(BFD_RELOC_PPC64_REL16_HIGHEST34, BFD_RELOC_PPC64_REL16_HIGHESTA34),
(BFD_RELOC_PPC64_D28, BFD_RELOC_PPC64_PCREL28): New reloc enums.
* elf64-ppc.c (PNOP): Define.
(ppc64_elf_howto_raw): Add reloc howtos for new relocations.
(ppc64_elf_reloc_type_lookup): Translate new bfd reloc numbers.
(ppc64_elf_ha_reloc): Adjust addend for highera34 and highesta34
relocs.
(ppc64_elf_prefix_reloc): New function.
(struct ppc_link_hash_table): Add notoc_plt.
(is_branch_reloc): Add R_PPC64_PLTCALL_NOTOC.
(is_plt_seq_reloc): Add R_PPC64_PLT_PCREL34,
R_PPC64_PLT_PCREL34_NOTOC, and R_PPC64_PLTSEQ_NOTOC.
(ppc64_elf_check_relocs): Handle pcrel got and plt relocs. Set
has_pltcall for section on seeing R_PPC64_PLTCALL_NOTOC. Handle
possible need for dynamic relocs on non-pcrel powerxx relocs.
(dec_dynrel_count): Handle non-pcrel powerxx relocs.
(ppc64_elf_inline_plt): Handle R_PPC64_PLTCALL_NOTOC.
(toc_adjusting_stub_needed): Likewise.
(ppc64_elf_tls_optimize): Handle R_PPC64_PLTSEQ_NOTOC.
(ppc64_elf_relocate_section): Handle new powerxx relocs.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Support @pcrel, @got@pcrel,
@plt@pcrel, @higher34, @highera34, @highest34, and @highesta34.
(fixup_size): Handle new powerxx relocs.
(md_assemble): Warn for @pcrel on non-prefix insns.
Accept @l, @h and @ha on prefix insns, and infer reloc without
any @ suffix. Translate powerxx relocs to suit DQ and DS field
instructions. Include operand tests as well as opcode test to
translate BFD_RELOC_HI16_S to BFD_RELOC_PPC_16DX_HA.
(ppc_fix_adjustable): Return false for pcrel GOT and PLT relocs.
(md_apply_fix): Handle new powerxx relocs.
* config/tc-ppc.h (TC_FORCE_RELOCATION_SUB_LOCAL): Accept
BFD_RELOC_PPC64_ADDR16_HIGHER34, BFD_RELOC_PPC64_ADDR16_HIGHERA34,
BFD_RELOC_PPC64_ADDR16_HIGHEST34, BFD_RELOC_PPC64_ADDR16_HIGHESTA34,
BFD_RELOC_PPC64_D34, and BFD_RELOC_PPC64_D28.
* testsuite/gas/ppc/prefix-reloc.d,
* testsuite/gas/ppc/prefix-reloc.s: New test.
* testsuite/gas/ppc/ppc.exp: Run it.
|
|
|
|
There are occasions where someone might want to build a 64-bit
pc-relative offset from 16-bit pieces. This adds the necessary REL16
relocs corresponding to existing ADDR16 relocs that can be used to
build 64-bit absolute values.
include/
* elf/ppc64.h (R_PPC64_REL16_HIGH, R_PPC64_REL16_HIGHA),
(R_PPC64_REL16_HIGHER, R_PPC64_REL16_HIGHERA),
(R_PPC64_REL16_HIGHEST, R_PPC64_REL16_HIGHESTA): Define.
(R_PPC64_LO_DS_OPT, R_PPC64_16DX_HA): Bump value.
bfd/
* reloc.c (BFD_RELOC_PPC64_REL16_HIGH, BFD_RELOC_PPC64_REL16_HIGHA),
(BFD_RELOC_PPC64_REL16_HIGHER, BFD_RELOC_PPC64_REL16_HIGHERA),
(BFD_RELOC_PPC64_REL16_HIGHEST, BFD_RELOC_PPC64_REL16_HIGHESTA):
Define.
* elf64-ppc.c (ppc64_elf_howto_raw): Add new REL16 howtos.
(ppc64_elf_reloc_type_lookup): Translate new REL16 relocs.
(ppc64_elf_check_relocs, ppc64_elf_relocate_section): Handle them.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.h (TC_FORCE_RELOCATION_SUB_LOCAL): Allow ADDR16
HIGH, HIGHA, HIGHER, HIGHERA, HIGHEST, and HIGHESTA relocs.
Group 16-bit relocs.
* config/tc-ppc.c (md_apply_fix): Translate those ADDR16 relocs
to REL16 when pcrel. Sort relocs.
|
|
This adds support for ".localentry 1", a new st_other
STO_PPC64_LOCAL_MASK encoding that signifies a function with a single
entry point like ".localentry 0", but unlike a ".localentry 0"
function does not preserve r2.
include/
* elf/ppc64.h: Specify byte offset to local entry for values
of two to six in STO_PPC64_LOCAL_MASK. Clarify r2 return
value for such functions when entering via global entry point.
Specify meaning of a value of one in STO_PPC64_LOCAL_MASK.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Use a ppc_stub_long_branch_r2off
for calls to symbols with STO_PPC64_LOCAL_MASK bits set to 1.
gas/
* config/tc-ppc.c (ppc_elf_localentry): Allow .localentry values
of 1 and 7 to directly set value into STO_PPC64_LOCAL_MASK bits.
ld/testsuite/
* ld-powerpc/elfv2.s: Add .localentry f5,1 testcase.
* ld-powerpc/elfv2exe.d: Update.
* ld-powerpc/elfv2so.d: Update.
|
|
In addition to the existing relocs we need two more to mark all
instructions in the call sequence, PLTCALL on the call itself (plus
the toc restore insn for ppc64), and PLTSEQ on others. All
relocations in a particular sequence have the same symbol.
Example ppc64 ELFv2 assembly:
.reloc .,R_PPC64_PLTSEQ,puts
std 2,24(1)
addis 12,2,puts@plt@ha # .reloc .,R_PPC64_PLT16_HA,puts
ld 12,puts@plt@l(12) # .reloc .,R_PPC64_PLT16_LO_DS,puts
.reloc .,R_PPC64_PLTSEQ,puts
mtctr 12
.reloc .,R_PPC64_PLTCALL,puts
bctrl
ld 2,24(1)
Example ppc32 -fPIC assembly:
addis 12,30,puts+32768@plt@ha # .reloc .,R_PPC_PLT16_HA,puts+0x8000
lwz 12,12,puts+32768@plt@l # .reloc .,R_PPC_PLT16_LO,puts+0x8000
.reloc .,R_PPC_PLTSEQ,puts+32768
mtctr 12
.reloc .,R_PPC_PLTCALL,puts+32768
bctrl
Marking sequences like this allows the linker to convert them to nops
and a direct call if the target symbol turns out to be local.
When the call is __tls_get_addr, each relocation shown above is paired
with an R_PPC*_TLSLD or R_PPC*_TLSGD reloc to additionally mark the
sequence for possible TLS optimization. The TLSLD or TLSGD relocs are
emitted first.
include/
* elf/ppc.h (R_PPC_PLTSEQ, R_PPC_PLTCALL): Define.
* elf/ppc64.h (R_PPC64_PLTSEQ, R_PPC64_PLTCALL): Define.
bfd/
* elf32-ppc.c (ppc_elf_howto_raw): Add PLTSEQ and PLTCALL howtos.
(is_plt_seq_reloc): New function.
(ppc_elf_check_relocs): Handle PLTSEQ and PLTCALL relocs.
(ppc_elf_tls_optimize): Handle inline plt call sequence.
(ppc_elf_relax_section): Handle PLTCALL reloc.
(ppc_elf_relocate_section): Nop out inline plt call sequence when
resolving locally.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_PLTSEQ and
R_PPC64_PLTCALL entries. Comment R_PPC64_TOCSAVE.
(has_tls_get_addr_call): Correct comment.
(is_branch_reloc): Add PLTCALL.
(is_plt_seq_reloc): New function.
(ppc64_elf_check_relocs): Handle PLT16_LO_DS reloc. Set
has_tls_reloc for R_PPC64_TLSGD and R_PPC64_TLSLD. Create plt
entry for R_PPC64_PLTCALL.
(ppc64_elf_tls_optimize): Handle inline plt call sequence.
(ppc_type_of_stub): Handle PLTCALL reloc.
(toc_adjusting_stub_needed): Likewise.
(ppc64_elf_relocate_section): Set "can_plt_call" for PLTCALL
reloc insn. Nop out inline plt call sequence when resolving
locally. Handle __tls_get_addr inline plt call optimization.
elfcpp/
* powerpc.h (R_POWERPC_PLTSEQ, R_POWERPC_PLTCALL): Define.
gold/
* powerpc.cc (Target_powerpc::Track_tls::maybe_skip_tls_get_addr_call):
Handle inline plt sequence relocs.
(Stub_table::Plt_stub_key::Plt_stub_key): Likewise.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
|
|
|
|
ELFv2 functions with localentry:0 are those with a single entry point,
ie. global entry == local entry, and that have no requirement on r2 or
r12, and guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions. The
optimization is attractive. The TOC pointer load-hit-store is a major
reason why calls to small functions that need no register saves, or
with shrink-wrap, no register saves on a fast path, are slow on
powerpc64le.
To be safe, this optimization needs ld.so support to check that the
run-time matches link-time function implementation. If a function
in a shared library with st_other localentry non-zero is called
without saving and restoring r2, r2 will be trashed on return, leading
to segfaults. For that reason the optimization does not happen for
weak functions since a weak definition is a fairly solid hint that the
function will likely be overridden. I'm also not enabling the
optimization by default unless glibc-2.26 is detected, which should
have the ld.so checks implemented.
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add has_plt_localentry0.
(ppc64_elf_merge_symbol_attribute): Merge localentry bits from
dynamic objects.
(is_elfv2_localentry0): New function.
(ppc64_elf_tls_setup): Default params->plt_localentry0.
(plt_stub_size): Adjust size for tls_get_addr_opt stub.
(build_tls_get_addr_stub): Use a simpler stub when r2 is not saved.
(ppc64_elf_size_stubs): Leave stub_type as ppc_stub_plt_call for
optimized localentry:0 stubs.
(ppc64_elf_build_stubs): Save r2 in ELFv2 __glink_PLTresolve.
(ppc64_elf_relocate_section): Leave nop unchanged for optimized
localentry:0 stubs.
(ppc64_elf_finish_dynamic_sections): Set PPC64_OPT_LOCALENTRY in
DT_PPC64_OPT.
* elf64-ppc.h (struct ppc64_elf_params): Add plt_localentry0.
include/
* elf/ppc64.h (PPC64_OPT_LOCALENTRY): Define.
ld/
* emultempl/ppc64elf.em (params): Init plt_localentry0 field.
(enum ppc64_opt): New, replacing OPTION_* defines. Add
OPTION_PLT_LOCALENTRY, and OPTION_NO_PLT_LOCALENTRY.
(PARSE_AND_LIST_*): Support --plt-localentry and --no-plt-localentry.
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/powerpc.exp (TLS opt 5): Use --no-plt-localentry.
* testsuite/ld-powerpc/tlsopt5.d: Update.
|
|
This came up because I was looking at ld/tmpdir/addpcis.o and noticed
the odd addends on REL16DX_HA. They ought to both be -4. The error
crept in due REL16DX_HA howto being pc-relative (as indeed it should
be), and code at gas/write.c:1001 after this comment
/* Make it pc-relative. If the back-end code has not
selected a pc-relative reloc, cancel the adjustment
we do later on all pc-relative relocs. */
*not* cancelling the pc-relative adjustment. So I've made a dummy
non-relative split reloc so that the generic code handles this, rather
than attempting to add hacks later in md_apply_fix which would not be
very robust. Having the new internal reloc also makes it easy to
support
addpcis rx,sym@ha
as an equivalent to
addpcis rx,(sym-0f)@ha
0:
The patch also fixes overflow checking, which must test whether the
addi will overflow too since @l relocs don't have any overflow check.
Lastly, since I was poking at md_apply_fix, I arranged to have the
generic gas/write.c code emit errors for subtraction expressions where
we lack reloc support.
include/
* elf/ppc64.h (R_PPC64_16DX_HA): New. Expand fake reloc comment.
* elf/ppc.h (R_PPC_16DX_HA): Likewise.
bfd/
* reloc.c (BFD_RELOC_PPC_16DX_HA): New.
* elf64-ppc.c (ppc64_elf_howto_raw <R_PPC64_16DX_HA>): New howto.
(ppc64_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc64_elf_ha_reloc): Correct overflow test on REL16DX_HA.
(ppc64_elf_relocate_section): Likewise.
* elf32-ppc.c (ppc_elf_howto_raw <R_PPC_16DX_HA>): New howto.
(ppc_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc_elf_check_relocs): Handle R_PPC_16DX_HA to pacify gcc.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c (md_assemble): Use BFD_RELOC_PPC_16DX_HA for addpcis.
(md_apply_fix): Remove fx_subsy check. Move code converting to
pcrel reloc earlier and handle BFD_RELOC_PPC_16DX_HA. Remove code
emiiting errors on seeing fx_pcrel set on unexpected relocs, as
that is done now by the generic code via..
* config/tc-ppc.h (TC_FORCE_RELOCATION_SUB_LOCAL): ..this. Define.
(TC_VALIDATE_FIX_SUB): Define.
ld/
* testsuite/ld-powerpc/addpcis.d: Define ext1 and ext2 at
limits of addpcis range.
|
|
|
|
|
|
Add a new relocation that marks large-model entry code, for edit back
to medium-model.
include/elf/
* ppc64.h (R_PPC64_ENTRY): Define.
bfd/
* reloc.c (BFD_RELOC_PPC64_ENTRY): New.
* elf64-ppc.c (reloc_howto_type ppc64_elf_howto_raw): Add
entry for R_PPC64_ENTRY.
(LD_R2_0R12, ADD_R2_R2_R12, LIS_R2, ADDIS_R2_R12): Define.
(ppc64_elf_reloc_type_lookup): Handle R_PPC64_ENTRY.
(ppc64_elf_relocate_section): Edit code at R_PPC64_ENTTY. Use
new insn defines.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
|
|
include/opcode/
* ppc.h (PPC_OPCODE_POWER9): New define.
(PPC_OPCODE_VSX3): Likewise.
opcodes/
* ppc-dis.c (ppc_opts): Add "power9" and "pwr9" entries.
Add PPC_OPCODE_VSX3 to the vsx entry.
(powerpc_init_dialect): Set default dialect to power9.
* ppc-opc.c (insert_dcmxs, extract_dcmxs, insert_dxd, extract_dxd,
insert_dxdn, extract_dxdn, insert_l0, extract_l0, insert_l1,
extract_l1 insert_xtq6, extract_xtq6): New static functions.
(insert_esync): Test for illegal L operand value.
(DCMX, DCMXS, DXD, NDXD, L0, L1, RC, FC, UIM6, X_R, RIC, PRS, XSQ6,
XTQ6, LRAND, IMM8, DQX, DQX_MASK, DX, DX_MASK, VXVAPS_MASK, VXVA,XVA,
XX2VA, XVARC, XBF_MASK, XX2UIM4_MASK, XX2BFD_MASK, XX2DCMXS_MASK,
XVA_MASK, XRLA_MASK, XBFRARB_MASK, XLRAND_MASK, POWER9, PPCVEC3,
PPCVSX3): New defines.
(powerpc_opcodes) <ps_cmpu0, ps_cmpo0, ps_cmpu1, ps_cmpo1, fcmpu,
fcmpo, ftdiv, ftsqrt>: Use XBF_MASK.
<mcrxr>: Use XBFRARB_MASK.
<addpcis, bcdcfn., bcdcfsq., bcdcfz., bcdcpsgn., bcdctn., bcdctsq.,
bcdctz., bcds., bcdsetsgn., bcdsr., bcdtrunc., bcdus., bcdutrunc.,
cmpeqb, cmprb, cnttzd, cnttzd., cnttzw, cnttzw., copy, copy_first,
cp_abort, darn, dtstsfi, dtstsfiq, extswsli, extswsli., ldat, ldmx,
lwat, lxsd, lxsibzx, lxsihzx, lxssp, lxv, lxvb16x, lxvh8x, lxvl, lxvll,
lxvwsx, lxvx, maddhd, maddhdu, maddld, mcrxrx, mfvsrld, modsd, modsw,
modud, moduw, msgsync, mtvsrdd, mtvsrws, paste, paste., paste_last,
rmieg, setb, slbieg, slbsync, stdat, stop, stwat, stxsd, stxsibx,
stxsihx, stxssp, stxv, stxvb16x, stxvh8x, stxvl, stxvll, stxvx,
subpcis, urfid, vbpermd, vclzlsbb, vcmpneb, vcmpneb., vcmpneh,
vcmpneh., vcmpnew, vcmpnew., vcmpnezb, vcmpnezb., vcmpnezh, vcmpnezh.,
vcmpnezw, vcmpnezw., vctzb, vctzd, vctzh, vctzlsbb, vctzw, vextractd,
vextractub, vextractuh, vextractuw, vextsb2d, vextsb2w, vextsh2d,
vextsh2w, vextsw2d, vextublx, vextubrx, vextuhlx, vextuhrx, vextuwlx,
vextuwrx, vinsertb, vinsertd, vinserth, vinsertw, vmul10cuq,
vmul10ecuq, vmul10euq, vmul10uq, vnegd, vnegw, vpermr, vprtybd,
vprtybq, vprtybw, vrldmi, vrldnm, vrlwmi, vrlwnm, vslv, vsrv, wait,
xsabsqp, xsaddqp, xsaddqpo, xscmpeqdp, xscmpexpdp, xscmpexpqp,
xscmpgedp, xscmpgtdp, xscmpnedp, xscmpoqp, xscmpuqp, xscpsgnqp,
xscvdphp, xscvdpqp, xscvhpdp, xscvqpdp, xscvqpdpo, xscvqpsdz,
xscvqpswz, xscvqpudz, xscvqpuwz, xscvsdqp, xscvudqp, xsdivqp,
xsdivqpo, xsiexpdp, xsiexpqp, xsmaddqp, xsmaddqpo, xsmaxcdp,
xsmaxjdp, xsmincdp, xsminjdp, xsmsubqp, xsmsubqpo, xsmulqp, xsmulqpo,
xsnabsqp, xsnegqp, xsnmaddqp, xsnmaddqpo, xsnmsubqp, xsnmsubqpo,
xsrqpi, xsrqpix, xsrqpxp, xssqrtqp, xssqrtqpo, xssubqp, xssubqpo,
xststdcdp, xststdcqp, xststdcsp, xsxexpdp, xsxexpqp, xsxsigdp,
xsxsigqp, xvcmpnedp, xvcmpnedp., xvcmpnesp, xvcmpnesp., xvcvhpsp,
xvcvsphp, xviexpdp, xviexpsp, xvtstdcdp, xvtstdcsp, xvxexpdp,
xvxexpsp, xvxsigdp, xvxsigsp, xxbrd, xxbrh, xxbrq, xxbrw, xxextractuw,
xxinsertw, xxperm, xxpermr, xxspltib>: New instructions.
<doze, nap, sleep, rvwinkle, waitasec, lxvx, stxvx>: Disable on POWER9.
<tlbiel, tlbie, sync, slbmfev, slbmfee>: Add additional operands.
include/elf/
* ppc.h (R_PPC_REL16DX_HA): New reloction.
* ppc64.h (R_PPC64_REL16DX_HA): Likewise.
bfd/
* elf32-ppc.c (ppc_elf_howto_raw): Add R_PPC_REL16DX_HA.
(ppc_elf_reloc_type_lookup): Handle R_PPC_REL16DX_HA.
(ppc_elf_addr16_ha_reloc): Likewise.
(ppc_elf_check_relocs): Likewise.
(ppc_elf_relocate_section): Likewise.
(is_insn_dq_form): Handle lxv and stxv instructions.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_REL16DX_HA.
(ppc64_elf_reloc_type_lookup): Handle R_PPC64_REL16DX_HA.
(ppc64_elf_ha_reloc): Likewise.
(ppc64_elf_check_relocs): Likewise.
(ppc64_elf_relocate_section): Likewise.
* bfd-in2.h: Regenerate.
* libbfd.h: Likewise.
* reloc.c (BFD_RELOC_PPC_REL16DX_HA): New.
elfcpp/
* powerpc.h (R_POWERPC_REL16DX_HA): Define.
gas/
* doc/as.texinfo (Target PowerPC): Document -mpower9 and -mpwr9.
* doc/c-ppc.texi (PowerPC-Opts): Likewise.
* config/tc-ppc.c (md_show_usage): Likewise.
(md_assemble): Handle BFD_RELOC_PPC_REL16DX_HA.
(md_apply_fix): Likewise.
(ppc_handle_align): Handle power9's group ending nop.
gas/testsuite/
* gas/ppc/altivec3.s: New test.
* gas/ppc/altivec3.d: Likewise.
* gas/ppc/vsx3.s: Likewise.
* gas/ppc/vsx3.d: Likewise.
* gas/ppc/power9.s: Likewise.
* gas/ppc/power9.d: Likewise.
* gas/ppc/ppc.exp: Run them.
* gas/ppc/power8.s <lxvx, lxvd2x, stxvx, stxvd2x>: Add new tests.
* gas/ppc/power8.d: Likewise.
* gas/ppc/vsx.s: <lxvx, stxvx>: Rename invalid mnemonics ...
<lxvd2x, stxvd2x>: ...to this.
* gas/ppc/vsx.d: Likewise.
gold/
* gold/powerpc.cc (Powerpc_relocate_functions::addr16_dq): New function.
(Powerpc_relocate_functions::addr16dx_ha): Likewise.
(Target_powerpc::Scan::local): Handle R_POWERPC_REL16DX_HA.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
ld/testsuite/
* ld-powerpc/addpcis.d: New test.
* ld-powerpc/addpcis.s: New test.
* ld-powerpc/powerpc.exp: Run it.
|
|
|
|
|
|
This adds support for "func@localentry", an expression that returns the
ELFv2 local entry point address of function "func". I've excluded
dynamic relocation support because that obviously would require glibc
changes.
include/elf/
* ppc64.h (R_PPC64_REL24_NOTOC, R_PPC64_ADDR64_LOCAL): Define.
bfd/
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_ADDR64_LOCAL entry.
(ppc64_elf_reloc_type_lookup): Support R_PPC64_ADDR64_LOCAL.
(ppc64_elf_check_relocs): Likewise.
(ppc64_elf_relocate_section): Likewise.
* Add BFD_RELOC_PPC64_ADDR64_LOCAL.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Support @localentry.
(md_apply_fix): Support R_PPC64_ADDR64_LOCAL.
ld/testsuite/
* ld-powerpc/elfv2-2a.s, ld-powerpc/elfv2-2b.s: New files.
* ld-powerpc/elfv2-2exe.d, ld-powerpc/elfv2-2so.d: New files.
* ld-powerpc/powerpc.exp: Run new test.
elfcpp/
* powerpc.h (R_PPC64_REL24_NOTOC, R_PPC64_ADDR64_LOCAL): Define.
gold/
* powerpc.cc (Target_powerpc::Scan::local, global): Support
R_PPC64_ADDR64_LOCAL.
(Target_powerpc::Relocate::relocate): Likewise.
|
|
This removes the DT_PPC_TLSOPT/DT_PPC64_TLSOPT dynamic tag and replaces
it with DT_PPC_OPT/DT_PPC64_OPT tag to provide the same functionality
and more. This isn't backwards compatible, but the TLSOPT tag hasn't
been used since the tls optimisation support was never submitted to
glibc.
/include/elf/
* ppc.h (DT_PPC_TLSOPT): Delete.
(DT_PPC_OPT, PPC_OPT_TLS): Define.
* ppc64.h (DT_PPC64_TLSOPT): Delete.
(DT_PPC64_OPT, PPC64_OPT_TLS, PPC64_OPT_MULTI_TOC): Define.
bfd/
* elf32-ppc.c (ppc_elf_size_dynamic_sections): Use new DT_PPC_OPT
tag to specify tls optimisation.
* elf64-ppc.c (ppc64_elf_size_dynamic_sections): Likewise.
(ppc64_elf_finish_dynamic_sections): Specify whether multiple
toc pointers are used via DT_PPC64_OPT.
binutils/
* readelf.c (get_ppc_dynamic_type): Replace PPC_TLSOPT with PPC_OPT.
(get_ppc64_dynamic_type): Replace PPC64_TLSOPT with PPC64_OPT.
|
|
This defines the ELF symbol st_other field used to encode the number
of instructions between a function "global entry" and its "local entry",
and adds support related to the local entry offset.
include/elf/
* ppc64.h (STO_PPC64_LOCAL_BIT, STO_PPC64_LOCAL_MASK): Define.
(ppc64_decode_local_entry, ppc64_encode_local_entry): New functions.
(PPC64_LOCAL_ENTRY_OFFSET, PPC64_SET_LOCAL_ENTRY_OFFSET): Define.
bfd/
* elf64-ppc.c (struct ppc_stub_hash_entry): Add "other".
(stub_hash_newfunc): Init new ppc_stub_hash_entry field, and one
we forgot, "plt_ent".
(ppc64_elf_add_symbol_hook): Check ELFv1 objects don't have
st_other bits only valid in ELFv2.
(ppc64_elf_merge_symbol_attribute): New function.
(ppc_type_of_stub): Add local_off param to test branch range.
(ppc_build_one_stub): Adjust destinations for ELFv2 locals.
(ppc_size_one_stub, toc_adjusting_stub_needed): Similarly.
(ppc64_elf_size_stubs): Pass local_off to ppc_type_of_stub.
Set "other" field.
(ppc64_elf_relocate_section): Adjust destination for ELFv2 local
calls.
gas/
* config/tc-ppc.c (md_pseudo_table): Add .localentry.
(ppc_elf_localentry): New function.
(ppc_force_relocation): Force relocs on all branches to localenty
symbols.
(ppc_fix_adjustable): Don't reduce such symbols to section+offset.
binutils/
* readelf.c (get_ppc64_symbol_other): New function.
(get_symbol_other): Use it for EM_PPC64.
|
|
Defines bits in ELF e_flags to differentiate ELFv2 objects from ELFv2,
adds .abiversion directive to explicitly choose the ABI, and code to
check and automatically select ABI.
include/elf/
* ppc64.h (EF_PPC64_ABI): Define.
bfd/
* elf64-ppc.c (abiversion, set_abiversion): New functions.
(ppc64_elf_get_synthetic_symtab): Handle ELFv2 objects without .opd.
(struct ppc_link_hash_table): Add opd_abi.
(ppc64_elf_check_relocs): Check no .opd with ELFv2.
(ppc64_elf_merge_private_bfd_data): New function.
(ppc64_elf_print_private_bfd_data): New function.
(ppc64_elf_tls_setup): Set htab->opd_abi.
(ppc64_elf_size_dynamic_sections): Don't emit OPD related dynamic
tags for ELFv2.
(ppc_build_one_stub): Use R_PPC64_IRELATIVE for ELFv2 ifunc.
(ppc64_elf_finish_dynamic_symbol): Likewise
binutils/
* readelf.c (get_machine_flags): Display ABI version for EM_PPC64.
gas/
* config/tc-ppc.c: Include elf/ppc64.h.
(ppc_abiversion): New variable.
(md_pseudo_table): Add .abiversion.
(ppc_elf_abiversion, ppc_elf_end): New functions.
* config/tc-ppc.h (md_end): Define.
|
|
This changes the behaviour of @h and @ha on PowerPC64 to report errors
on 32-bit overflow. The motivation for this change is that on
PowerPC64, most uses of @h and @ha modifiers and their corresponding
relocations are to build up 32-bit offsets. We'd like to know when
such offsets overflow. Only rarely do people use @h or @ha with the
high 32-bit modifiers to build a 64-bit constant. Those uses will now
need to use two new modifiers, @high and @higha, if the constant isn't
known at assembly time. For now, we won't report overflow at assembly
time..
This also fixes an error when applying some of the HIGHER and HIGHEST
relocations.
include/elf/
* ppc64.h (R_PPC64_ADDR16_HIGH, R_PPC64_ADDR16_HIGHA,
R_PPC64_TPREL16_HIGH, R_PPC64_TPREL16_HIGHA,
R_PPC64_DTPREL16_HIGH, R_PPC64_DTPREL16_HIGHA): New.
(IS_PPC64_TLS_RELOC): Match new tls relocs.
bfd/
* reloc.c (BFD_RELOC_PPC64_ADDR16_HIGH, BFD_RELOC_PPC64_ADDR16_HIGHA,
BFD_RELOC_PPC64_TPREL16_HIGH, BFD_RELOC_PPC64_TPREL16_HIGHA,
BFD_RELOC_PPC64_DTPREL16_HIGH, BFD_RELOC_PPC64_DTPREL16_HIGHA): New.
* elf64-ppc.c (ppc64_elf_howto_raw): Add entries for new relocs.
Make all _HA and _HI relocs report signed overflow.
(ppc64_elf_reloc_type_lookup): Handle new relocs.
(must_be_dyn_reloc, ppc64_elf_check_relocs): Likewise.
(dec_dynrel_count, ppc64_elf_relocate_section): Likewise.
(ppc64_elf_relocate_section): Don't apply 0x8000 adjust to
R_PPC64_TPREL16_HIGHER, R_PPC64_TPREL16_HIGHEST,
R_PPC64_DTPREL16_HIGHER, and R_PPC64_DTPREL16_HIGHEST.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c (SEX16): Don't mask.
(REPORT_OVERFLOW_HI): Define as zero.
(ppc_elf_suffix): Support @high, @higha, @dtprel@high, @dtprel@higha,
@tprel@high, and @tprel@higha modifiers.
(md_assemble): Ignore X_unsigned when applying 16-bit insn fields.
Add (disabled) code to check @h and @ha reloc overflow for powerpc64.
Handle new relocs.
(md_apply_fix): Similarly.
elfcpp/
* powerpc.h (R_PPC64_ADDR16_HIGH, R_PPC64_ADDR16_HIGHA,
R_PPC64_TPREL16_HIGH, R_PPC64_TPREL16_HIGHA,
R_PPC64_DTPREL16_HIGH, R_PPC64_DTPREL16_HIGHA): Define.
gold/
* powerpc.cc (Target_powerpc::Scan::check_non_pic): Handle new relocs.
(Target_powerpc::Scan::global, local): Likewise.
(Target_powerpc::Relocate::relocate): Likewise. Check for overflow
on all ppc64 @h and @ha relocs.
|
|
* ppc64.h (R_PPC64_TOCSAVE): Add.
bfd/
* elf64-ppc.c (ppc64_elf_howto_table): Add R_PPC64_TOCSAVE entry.
(struct ppc_link_hash_table): Add tocsave_htab.
(struct tocsave_entry): New.
(tocsave_htab_hash, tocsave_htab_eq, tocsave_find): New functions.
(ppc64_elf_link_hash_table_create): Create tocsave_htab..
(ppc64_elf_link_hash_table_free): ..and delete it.
(build_plt_stub): Always put STD_R2_40R1 first.
(ppc64_elf_size_stubs): Check for R_PPC64_TOCSAVE following reloc
on plt call. If present add prologue nop location to tocsave_htab.
(ppc64_elf_relocate_section): Convert prologue nop to std. Skip
first insn of plt call stub when R_PPC64_TOCSAVE present.
|
|
* ppc64.h (R_PPC64_LO_DS_OPT): Define.
bfd/
* elf64-ppc.c (toc_skip_enum): Define.
(ppc64_elf_edit_toc): Use two low bits of skip array as markers.
Optimize largetoc sequences.
(adjust_toc_syms): Update for skip array change.
(ppc64_elf_relocate_section): Handle R_PPC64_LO_DS_OPT.
ld/
* emultempl/ppc64elf.em (prelim_size_sections): New function.
(ppc_before_allocation): Use it. Size sections before toc edit too.
|
|
|
|
* ppc.h (DT_PPC_TLSOPT): Define.
* ppc64.h (DT_PPC64_TLSOPT): Define.
bfd/
* elf32-ppc.c (TLS_GET_ADDR_GLINK_SIZE): Define.
(ADD_3_12_2, BEQLR, CMPWI_11_0, LWZ_11_3, LWZ_12_3): Define.
(MR_0_3, MR_3_0): Define.
(struct ppc_elf_link_hash_table): Add no_tls_get_addr_opt.
(ppc_elf_select_plt_layout): Save emit_stub_syms param earlier.
(ppc_elf_tls_setup): Add no_tls_get_addr_opt param and save to hash
table. Check for presense of __tls_get_addr_opt
(allocate_dynrelocs): Increase glink entry size for __tls_get_addr.
(ppc_elf_size_dynamic_sections): Add DT_PPC_TLS_OPT tag.
(write_glink_stub): Add param p.
(ppc_elf_relocate_section): Adjust write_glink_stub call.
(ppc_elf_finish_dynamic_symbol): Emit special glink call stub for
__tls_get_addr.
* elf32-ppc.h (ppc_elf_tls_setup): Update prototype.
* elf64-ppc.c (struct ppc_link_hash_table): Add no_tls_get_addr_opt.
(ppc64_elf_tls_setup): Add no_tls_get_addr_opt param and save to hash
table. Check for presense of __tls_get_addr_opt.
(ppc64_elf_size_dynamic_sections): Add DT_PPC64_TLS_OPT tag.
(LD_R11_0R3, LD_R12_0R3, MR_R0_R3, CMPDI_R11_0, ADD_R3_R12_R13,
BEQLR, MR_R3_R0, MFLR_R11, STD_R11_0R1, BCTRL, LD_R11_0R1,
LD_R2_0R1, MTLR_R11): Define.
(build_tls_get_addr_stub): New function.
(ppc_build_one_stub): Call it.
(ppc_size_one_stub): Add extra size for __tls_get_addr stub.
(ppc64_elf_relocate_section): Don't change nop to ld 2,40(1) for
__tls_get_addr plt call.
* elf64-ppc.h (ppc64_elf_tls_setup): Update prototype.
binutils/
* readelf.c (get_ppc_dynamic_type): Add TLSOPT.
(get_ppc64_dynamic_type): Likewise.
ld/
* emultempl/ppc32elf.em (no_tls_get_addr_opt): New var.
(ppc_before_allocation): Pass to ppc_elf_tls_setup.
(OPTION_NO_TLS_GET_ADDR_OPT): Define. Redefine other options in
terms of previous option.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS): Add
--no-tls-get-addr-optimize.
(PARSE_AND_LIST_ARGS_CASES): Handle it.
* emultempl/ppc64elf.em (no_tls_get_addr_opt): New var.
(ppc_before_allocation): Pass to ppc64_elf_tls_setup.
(OPTION_NO_TLS_GET_ADDR_OPT): Define.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS): Add
--no-tls-get-addr-optimize.
(PARSE_AND_LIST_ARGS_CASES): Handle it.
ld/testsuite/
* ld-powerpc/tlslib.s: Delete dot-symbol entry syms. Add
__tls_get_addr_opt.
* ld-powerpc/tlslib32.s: Add __tls_get_addr_opt.
* ld-powerpc/oldtlslib.s: New file, old-abi version of tlslib.s.
* ld-powerpc/powerpc.exp: Build old-abi library and use it in
two new link tests.
* ld-powerpc/tlsexe.d: Update for new __tls_get_addr stub.
* ld-powerpc/tlsexe.g, * ld-powerpc/tlsexe.r, *ld-powerpc/tlsexe32.d,
* ld-powerpc/tlsexe32.g, * ld-powerpc/tlsexe32.r,
* ld-powerpc/tlsexetoc.d, * ld-powerpc/tlsexetoc.g,
* ld-powerpc/tlsexetoc.r: Likewise.
|
|
R_PPC64_REL16_HI, R_PPC64_REL16_HA.
|
|
|
|
* ppc.h (R_PPC_TLSGD, R_PPC_TLSLD): Add new relocs.
* ppc64.h (R_PPC64_TLSGD, R_PPC64_TLSLD): Add new relocs.
bfd/
* reloc.c (BFD_RELOC_PPC_TLSGD, BFD_RELOC_PPC_TLSLD): New.
* section.c (struct bfd_section): Add has_tls_get_addr_call.
(BFD_FAKE_SECTION): Init new flag.
* ecoff.c (bfd_debug_section): Likewise.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
* elf32-ppc.c (ppc_elf_howto_raw): Add R_PPC_TLSGD and R_PPC_TLSLD.
(ppc_elf_reloc_type_lookup): Handle new relocs.
(ppc_elf_check_relocs): Set has_tls_get_addr_call on finding such
without marker relocs.
(ppc_elf_tls_optimize): Allow out-of-order __tls_get_addr relocs
if section has no old-style calls.
(ppc_elf_relocate_section): Set tls_mask for non-tls relocs too.
Don't try to optimize new-style __tls_get_addr call when handling
arg setup relocs. Instead do so for R_PPC_TLSGD and R_PPC_TLSLD
relocs.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_TLSGD, R_PPC64_TLSLD.
(ppc64_elf_reloc_type_lookup): Handle new relocs.
(ppc64_elf_check_relocs): Set has_tls_get_addr_call on finding such
without marker relocs.
(ppc64_elf_tls_optimize): Allow out-of-order __tls_get_addr relocs
if section has no old-style calls. Set toc_ref for new relocs as
appropriate.
(ppc64_elf_relocate_section): Set tls_mask for non-tls relocs too.
Don't try to optimize new-style __tls_get_addr call when handling
arg setup relocs. Instead do so for R_PPC_TLSGD and R_PPC_TLSLD
relocs.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Error if ppc32 tls got relocs
have non-zero addend.
(md_assemble): Parse args of __tls_get_addr calls.
(md_apply_fix): Handle BFD_RELOC_PPC_TLSGD and BFD_RELOC_PPC_TLSLD.
ld/testsuite/
* ld-powerpc/tlsmark.s, * ld-powerpc/tlsmark.d: New test.
* ld-powerpc/tlsmark32.s, * ld-powerpc/tlsmark32.d: New test.
* ld-powerpc/powerpc.exp: Run them.
|
|
|
|
|
|
* ppc64.h: Likewise.
|
|
* pcc64.h: ..here. New file.
(R_PPC64_REL30): Rename from R_PPC64_ADDR30.
|