From 7bd374a44d1db21b54a9a52ecde1d064cdaa8cd1 Mon Sep 17 00:00:00 2001
From: "Maciej W. Rozycki"
Date: Wed, 27 Jul 2016 17:27:55 +0100
Subject: MIPS/GAS: Implement microMIPS branch/jump compaction

Convert microMIPS branches and jumps whose delay slot would be filled by
a generated NOP instruction to the corresponding compact form where one
exists, in a manner similar to MIPS16 JR->JRC and JALR->JALRC swap.

Do so even where the transformation switches from a 16-bit to a 32-bit
branch encoding for no benefit in code size reduction, as this is still
advantageous. This is because a branch/NOP pair takes 2 pipeline slots
or a 2-cycle completion latency except in superscalar implementations,
whereas a compact branch may or may not stall on its target fetch, so it
will at most have a 2-cycle completion latency and may have only 1 even
in scalar implementations, and in superscalar implementations it is
expected to have no worse latency than a branch/NOP pair has. Also it
won't stall, and therefore won't take the extra latency cycle, in the
not-taken case.

Technically this is the same as MIPS16 compaction: for the qualifying
instruction encodings the APPEND_ADD_COMPACT machine code generation
method is selected where APPEND_ADD_WITH_NOP otherwise would be, and it
tells the code generator in `append_insn' to convert the regular form of
an instruction to its corresponding compact form. For this the opcode
is tweaked as necessary and the microMIPS opcode table is scanned for
the matching updated instruction. A non-$0 `rt' operand to BEQ and BNE
instructions is moved to the `rs' operand field of BEQZC and BNEZC
encodings as required.

Unlike with MIPS16 compaction however we need to handle out-of-distance
branch relaxation as well. We do this by deferring the generation of
any delay-slot NOP required to relaxation made in `md_convert_frag', by
converting the APPEND_ADD_WITH_NOP machine code generation to APPEND_ADD
where a relaxed instruction is recorded.
Relaxation then, depending on actual code produced, chooses either to
use a compact branch or jump encoding or, if no compact encoding is
possible, to emit the outstanding NOP.

For code simplicity's sake the relaxation pass is retained even if the
principle of preferring a compact encoding to a 16-bit branch/NOP pair
means, in the absence of out-of-range branch relaxation, that a single
compact branch machine code instruction will eventually be produced from
a given assembly source instruction.

	gas/
	* config/tc-mips.c (RELAX_MICROMIPS_ENCODE): Add `nods' flag.
	(RELAX_MICROMIPS_RELAX32, RELAX_MICROMIPS_TOOFAR16)
	(RELAX_MICROMIPS_MARK_TOOFAR16, RELAX_MICROMIPS_CLEAR_TOOFAR16)
	(RELAX_MICROMIPS_TOOFAR32, RELAX_MICROMIPS_MARK_TOOFAR32)
	(RELAX_MICROMIPS_CLEAR_TOOFAR32): Shift bits.
	(get_append_method): Also return APPEND_ADD_COMPACT for
	microMIPS instructions.
	(find_altered_mips16_opcode): Exclude macros from matching.
	Factor code out...
	(find_altered_opcode): ... to this new function.
	(find_altered_micromips_opcode): New function.
	(frag_branch_delay_slot_size): Likewise.
	(append_insn): Handle microMIPS branch/jump compaction.
	(macro_start): Likewise.
	(relaxed_micromips_32bit_branch_length): Likewise.
	(md_convert_frag): Likewise.
	* testsuite/gas/mips/micromips.s: Add conditional explicit NOPs
	for delay slot filling.
	* testsuite/gas/mips/micromips-b16.s: Add explicit NOPs for
	delay slot filling.
	* testsuite/gas/mips/micromips-size-1.s: Likewise.
	* testsuite/gas/mips/micromips.l: Adjust line numbers.
	* testsuite/gas/mips/micromips-warn.l: Likewise.
	* testsuite/gas/mips/micromips-size-1.l: Likewise.
	* testsuite/gas/mips/micromips.d: Adjust padding.
	* testsuite/gas/mips/micromips-trap.d: Likewise.
	* testsuite/gas/mips/micromips-insn32.d: Likewise.
	* testsuite/gas/mips/micromips-noinsn32.d: Likewise.
	* testsuite/gas/mips/micromips@beq.d: Update patterns for
	branch/jump compaction.
	* testsuite/gas/mips/micromips@bge.d: Likewise.
	* testsuite/gas/mips/micromips@bgeu.d: Likewise.
	* testsuite/gas/mips/micromips@blt.d: Likewise.
	* testsuite/gas/mips/micromips@bltu.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-4.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-4-64.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-5.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-5pic.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-5-64.d: Likewise.
	* testsuite/gas/mips/micromips@branch-misc-5pic-64.d: Likewise.
	* testsuite/gas/mips/micromips@jal-svr4pic-local.d: Likewise.
	* testsuite/gas/mips/micromips@jal-svr4pic-local-n32.d: Likewise.
	* testsuite/gas/mips/micromips@jal-svr4pic-local-n64.d: Likewise.
	* testsuite/gas/mips/micromips@loc-swap.d: Likewise.
	* testsuite/gas/mips/micromips@loc-swap-dis.d: Likewise.
	* testsuite/gas/mips/micromips@relax.d: Likewise.
	* testsuite/gas/mips/micromips@relax-at.d: Likewise.
	* testsuite/gas/mips/micromips@relax-swap3.d: Likewise.
	* testsuite/gas/mips/branch-extern-2.d: Likewise.
	* testsuite/gas/mips/branch-extern-4.d: Likewise.
	* testsuite/gas/mips/branch-section-2.d: Likewise.
	* testsuite/gas/mips/branch-section-4.d: Likewise.
	* testsuite/gas/mips/branch-weak-2.d: Likewise.
	* testsuite/gas/mips/branch-weak-5.d: Likewise.
	* testsuite/gas/mips/micromips-branch-absolute.d: Likewise.
	* testsuite/gas/mips/micromips-branch-absolute-n32.d: Likewise.
	* testsuite/gas/mips/micromips-branch-absolute-n64.d: Likewise.
	* testsuite/gas/mips/micromips-branch-absolute-addend.d:
	Likewise.
	* testsuite/gas/mips/micromips-branch-absolute-addend-n32.d:
	Likewise.
	* testsuite/gas/mips/micromips-branch-absolute-addend-n64.d:
	Likewise.
	* testsuite/gas/mips/micromips-compact.d: New test.
	* testsuite/gas/mips/mips.exp: Run the new test.

	ld/
	* testsuite/ld-mips-elf/micromips-branch-absolute.d: Update
	patterns for branch compaction.
	* testsuite/ld-mips-elf/micromips-branch-absolute-addend.d:
	Likewise.

	opcodes/
	* micromips-opc.c (micromips_opcodes): Reorder "bc" next to "b",
	"beqzc" next to "beq", "bnezc" next to "bne" and "jrc" next to
	"j".
---
 ld/ChangeLog                                                |  7 +++++++
 ld/testsuite/ld-mips-elf/micromips-branch-absolute-addend.d | 13 +++++--------
 ld/testsuite/ld-mips-elf/micromips-branch-absolute.d        | 13 +++++--------
 3 files changed, 17 insertions(+), 16 deletions(-)
(limited to 'ld')

diff --git a/ld/ChangeLog b/ld/ChangeLog
index 8c4e5d4..99bb6df 100644
--- a/ld/ChangeLog
+++ b/ld/ChangeLog
@@ -1,3 +1,10 @@
+2016-07-27 Maciej W. Rozycki
+
+	* testsuite/ld-mips-elf/micromips-branch-absolute.d: Update
+	patterns for branch compaction.
+	* testsuite/ld-mips-elf/micromips-branch-absolute-addend.d:
+	Likewise.
+
 2016-07-27 Nick Clifton
 
 	* testsuite/ld-gc/personality.d: Use "target cfi" to restrict the
diff --git a/ld/testsuite/ld-mips-elf/micromips-branch-absolute-addend.d b/ld/testsuite/ld-mips-elf/micromips-branch-absolute-addend.d
index ec78ff9..fc3bd03 100644
--- a/ld/testsuite/ld-mips-elf/micromips-branch-absolute-addend.d
+++ b/ld/testsuite/ld-mips-elf/micromips-branch-absolute-addend.d
@@ -8,15 +8,12 @@ Disassembly of section \.text:
 \.\.\.
-[0-9a-f]+ <[^>]*> 9400 2c54 b 0*123468ac
-[0-9a-f]+ <[^>]*> 0c00 nop
-[0-9a-f]+ <[^>]*> 4060 2c51 bal 0*123468ac
+[0-9a-f]+ <[^>]*> 40e0 2c54 bc 0*123468ac
+[0-9a-f]+ <[^>]*> 4060 2c52 bal 0*123468ac
 [0-9a-f]+ <[^>]*> 0000 0000 nop
-[0-9a-f]+ <[^>]*> 4020 2c4d bltzal zero,0*123468ac
+[0-9a-f]+ <[^>]*> 4020 2c4e bltzal zero,0*123468ac
 [0-9a-f]+ <[^>]*> 0000 0000 nop
-[0-9a-f]+ <[^>]*> 9402 2c49 beqz v0,0*123468ac
-[0-9a-f]+ <[^>]*> 0c00 nop
-[0-9a-f]+ <[^>]*> b402 2c46 bnez v0,0*123468ac
-[0-9a-f]+ <[^>]*> 0c00 nop
+[0-9a-f]+ <[^>]*> 40e2 2c4a beqzc v0,0*123468ac
+[0-9a-f]+ <[^>]*> 40a2 2c48 bnezc v0,0*123468ac
 [0-9a-f]+ <[^>]*> 0c00 nop
 \.\.\.
diff --git a/ld/testsuite/ld-mips-elf/micromips-branch-absolute.d b/ld/testsuite/ld-mips-elf/micromips-branch-absolute.d
index f07ad1b..ad44f5a 100644
--- a/ld/testsuite/ld-mips-elf/micromips-branch-absolute.d
+++ b/ld/testsuite/ld-mips-elf/micromips-branch-absolute.d
@@ -8,15 +8,12 @@ Disassembly of section \.text:
 \.\.\.
-[0-9a-f]+ <[^>]*> 9400 0118 b 0+001234
-[0-9a-f]+ <[^>]*> 0c00 nop
-[0-9a-f]+ <[^>]*> 4060 0115 bal 0+001234
+[0-9a-f]+ <[^>]*> 40e0 0118 bc 0+001234
+[0-9a-f]+ <[^>]*> 4060 0116 bal 0+001234
 [0-9a-f]+ <[^>]*> 0000 0000 nop
-[0-9a-f]+ <[^>]*> 4020 0111 bltzal zero,0+001234
+[0-9a-f]+ <[^>]*> 4020 0112 bltzal zero,0+001234
 [0-9a-f]+ <[^>]*> 0000 0000 nop
-[0-9a-f]+ <[^>]*> 9402 010d beqz v0,0+001234
-[0-9a-f]+ <[^>]*> 0c00 nop
-[0-9a-f]+ <[^>]*> b402 010a bnez v0,0+001234
-[0-9a-f]+ <[^>]*> 0c00 nop
+[0-9a-f]+ <[^>]*> 40e2 010e beqzc v0,0+001234
+[0-9a-f]+ <[^>]*> 40a2 010c bnezc v0,0+001234
 [0-9a-f]+ <[^>]*> 0c00 nop
 \.\.\.
--
cgit v1.1