author | Andrew Burgess <aburgess@redhat.com> | 2022-06-21 20:23:35 +0100 |
---|---|---|
committer | Andrew Burgess <aburgess@redhat.com> | 2022-10-02 11:58:27 +0100 |
commit | d4ce49b7ac077a9882d6a5e689e260300045ca88 (patch) | |
tree | eee06ae927cf9296680c33ce93873f9108a050e8 /gdb/doc | |
parent | d309a8f9b34d8fd570dc8c7189eb6790b9afd4e3 (diff) | |
gdb: disassembler opcode display formatting
This commit changes the format of 'disassemble /r' to match GNU
objdump. Specifically, GDB will now display the instruction bytes
as 'objdump --wide --disassemble' does.
Here is an example for RISC-V before this patch:
(gdb) disassemble /r 0x0001018e,0x0001019e
Dump of assembler code from 0x1018e to 0x1019e:
0x0001018e <call_me+66>: 03 26 84 fe lw a2,-24(s0)
0x00010192 <call_me+70>: 83 25 c4 fe lw a1,-20(s0)
0x00010196 <call_me+74>: 61 65 lui a0,0x18
0x00010198 <call_me+76>: 13 05 85 6a addi a0,a0,1704
0x0001019c <call_me+80>: f1 22 jal 0x10368 <printf>
End of assembler dump.
And here's an example after this patch:
(gdb) disassemble /r 0x0001018e,0x0001019e
Dump of assembler code from 0x1018e to 0x1019e:
0x0001018e <call_me+66>: fe842603  lw a2,-24(s0)
0x00010192 <call_me+70>: fec42583  lw a1,-20(s0)
0x00010196 <call_me+74>: 6561      lui a0,0x18
0x00010198 <call_me+76>: 6a850513  addi a0,a0,1704
0x0001019c <call_me+80>: 22f1      jal 0x10368 <printf>
End of assembler dump.
There are two differences here. First, the instruction bytes after
the patch are grouped based on the size of the instruction, and are
byte-swapped to little-endian order.
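As a rough illustration of this first difference, the Python sketch
below regroups memory-order bytes the way the new /r output shows them.
The helper names format_grouped and format_bytewise, and the group_size
parameter, are hypothetical and are not part of GDB. The sketch assumes
a little-endian target where each group spans a whole instruction, as
in the RISC-V example above.

```python
def format_grouped(raw: bytes, group_size: int) -> str:
    """Render memory-order bytes as space-separated little-endian groups.

    Hypothetical helper for illustration only; this is not GDB's code.
    """
    if group_size <= 0 or len(raw) % group_size != 0:
        raise ValueError("raw length must be a multiple of group_size")
    groups = []
    for i in range(0, len(raw), group_size):
        chunk = raw[i:i + group_size]
        # Reverse the chunk so the most significant byte is printed first.
        groups.append(chunk[::-1].hex())
    return " ".join(groups)


def format_bytewise(raw: bytes) -> str:
    """Render bytes one at a time, in memory order (the pre-patch layout)."""
    return " ".join(f"{b:02x}" for b in raw)
```

With the bytes of the first instruction above, format_bytewise gives
the pre-patch text "03 26 84 fe", and format_grouped with a group size
of 4 gives the post-patch text "fe842603".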
Second, after the patch, GDB uses the bytes-per-line hint from
libopcodes to add whitespace padding after the opcode bytes. This
means that in most cases the instructions are nicely aligned.
It is still possible for a very long instruction to intrude into the
disassembled text space. The next example is x86-64, before the
patch:
(gdb) disassemble /r main
Dump of assembler code for function main:
0x0000000000401106 <+0>: 55 push %rbp
0x0000000000401107 <+1>: 48 89 e5 mov %rsp,%rbp
0x000000000040110a <+4>: c7 87 d8 00 00 00 01 00 00 00 movl $0x1,0xd8(%rdi)
0x0000000000401114 <+14>: b8 00 00 00 00 mov $0x0,%eax
0x0000000000401119 <+19>: 5d pop %rbp
0x000000000040111a <+20>: c3 ret
End of assembler dump.
And after the patch:
(gdb) disassemble /r main
Dump of assembler code for function main:
0x0000000000401106 <+0>:  55                    push %rbp
0x0000000000401107 <+1>:  48 89 e5              mov %rsp,%rbp
0x000000000040110a <+4>:  c7 87 d8 00 00 00 01 00 00 00 movl $0x1,0xd8(%rdi)
0x0000000000401114 <+14>: b8 00 00 00 00        mov $0x0,%eax
0x0000000000401119 <+19>: 5d                    pop %rbp
0x000000000040111a <+20>: c3                    ret
End of assembler dump.
Most instructions are aligned, except for the very long instruction.
Notice too that for x86-64 libopcodes doesn't request that GDB group
the instruction bytes. This matches the behaviour of objdump.
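The alignment behaviour described above can be sketched in a few lines
of Python. The helpers pad_opcodes and format_line, and the width rule
of three columns per byte of the hint, are assumptions made for this
illustration and are not GDB's actual computation. The point is only
that a fixed-width field aligns short instructions, while a long
instruction overflows the field and pushes its mnemonic to the right.

```python
def pad_opcodes(opcode_text: str, bytes_per_line: int) -> str:
    """Left-justify the opcode text in a field sized from a bytes-per-line hint.

    Assumption for illustration: each byte occupies three columns ("xx ").
    """
    width = bytes_per_line * 3
    # str.ljust never truncates, so over-long text is returned unchanged.
    return opcode_text.ljust(width)


def format_line(opcode_text: str, asm: str, bytes_per_line: int) -> str:
    """Join the padded opcode field and the disassembly text with one space."""
    return pad_opcodes(opcode_text, bytes_per_line) + " " + asm
```

With a hint of 7 (chosen arbitrarily for this sketch), the one-byte
"55" and the three-byte "48 89 e5" both place their mnemonic in the
same column, whereas the ten-byte movl from the example does not fit
in the field and so is not aligned with the others.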
In case the user really wants the old behaviour, I have added a new
modifier, 'disassemble /b', which displays the instruction bytes one
byte at a time. For x86-64, which never groups instruction bytes, /b and /r are
equivalent, but for RISC-V, using /b gets the old layout back (except
that the whitespace for alignment is still present). Consider our
original RISC-V example, this time using /b:
(gdb) disassemble /b 0x0001018e,0x0001019e
Dump of assembler code from 0x1018e to 0x1019e:
0x0001018e <call_me+66>: 03 26 84 fe  lw a2,-24(s0)
0x00010192 <call_me+70>: 83 25 c4 fe  lw a1,-20(s0)
0x00010196 <call_me+74>: 61 65        lui a0,0x18
0x00010198 <call_me+76>: 13 05 85 6a  addi a0,a0,1704
0x0001019c <call_me+80>: f1 22        jal 0x10368 <printf>
End of assembler dump.
Obviously, this patch is a potentially significant change to the
behaviour of /r. I could have added /b with the new behaviour and
left /r alone. However, personally, I feel the new behaviour is
significantly better than the old, hence, I made /r be what I consider
the "better" behaviour.
The reason I prefer the new behaviour is that, when I use /r, I almost
always want to manually decode the instruction for some reason, and
having the bytes displayed in "instruction order" rather than memory
order just makes this easier.
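To make the manual-decoding point concrete, here is a small Python
sketch that splits a 32-bit RISC-V I-type instruction word into its
fields, following the standard I-type layout from the RISC-V ISA. The
function name decode_itype is hypothetical. The input is the hex text
exactly as the new /r output prints it, so no byte reordering is needed
before parsing it as a number.

```python
def decode_itype(word: int) -> dict:
    """Split a 32-bit RISC-V I-type instruction word into its fields."""
    imm = (word >> 20) & 0xFFF
    if imm & 0x800:
        # Sign-extend the 12-bit immediate.
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": imm,
    }


# "fe842603" is the /r text for 'lw a2,-24(s0)' in the example above.
fields = decode_itype(int("fe842603", 16))
```

For this word the fields come out as opcode 0x03 (LOAD), funct3 2 (lw),
rd 12 (a2), rs1 8 (s0) and an immediate of -24, matching the
disassembly. Starting from the /b text, the four bytes would first have
to be reversed by hand.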
The 'record instruction-history' command also takes a /r modifier, and
has been modified in the same way as disassemble; /r gets the new
behaviour, and /b has been added to retain the old behaviour.
Finally, the MI command -data-disassemble is unchanged in behaviour:
this command now requests the raw bytes of the instruction, which is
equivalent to the /b modifier. This means that the MI output will
remain backward compatible.
Diffstat (limited to 'gdb/doc')
-rw-r--r-- | gdb/doc/gdb.texinfo | 48 |
1 files changed, 41 insertions, 7 deletions
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 238a49b..596e587 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -7945,7 +7945,10 @@ are printed in execution order.
 
 It can also print mixed source+disassembly if you specify the the
 @code{/m} or @code{/s} modifier, and print the raw instructions in hex
-as well as in symbolic form by specifying the @code{/r} modifier.
+as well as in symbolic form by specifying the @code{/r} or @code{/b}
+modifier. The behaviour of the @code{/m}, @code{/s}, @code{/r}, and
+@code{/b} modifiers are the same as for the @kbd{disassemble} command
+(@pxref{disassemble,,@kbd{disassemble}}).
 
 The current position marker is printed for the instruction at the
 current program counter value. This instruction can appear multiple
@@ -9859,6 +9862,7 @@ After @code{info line}, using @code{info line} again without
 specifying a location will display information about the next source
 line.
 
+@anchor{disassemble}
 @table @code
 @kindex disassemble
 @cindex assembly instructions
@@ -9869,16 +9873,17 @@ line.
 @itemx disassemble /m
 @itemx disassemble /s
 @itemx disassemble /r
+@itemx disassemble /b
 This specialized command dumps a range of memory as machine
 instructions. It can also print mixed source+disassembly by specifying
-the @code{/m} or @code{/s} modifier and print the raw instructions in hex
-as well as in symbolic form by specifying the @code{/r} modifier.
-The default memory range is the function surrounding the
+the @code{/m} or @code{/s} modifier and print the raw instructions in
+hex as well as in symbolic form by specifying the @code{/r} or @code{/b}
+modifier. The default memory range is the function surrounding the
 program counter of the selected frame. A single argument to this
 command is a program counter value; @value{GDBN} dumps the function
-surrounding this value. When two arguments are given, they should
-be separated by a comma, possibly surrounded by whitespace. The
-arguments specify a range of addresses to dump, in one of two forms:
+surrounding this value. When two arguments are given, they should be
+separated by a comma, possibly surrounded by whitespace. The arguments
+specify a range of addresses to dump, in one of two forms:
 
 @table @code
 @item @var{start},@var{end}
@@ -9916,6 +9921,35 @@ Dump of assembler code from 0x32c4 to 0x32e4:
 End of assembler dump.
 @end smallexample
 
+The following two examples are for RISC-V, and demonstrates the
+difference between the @code{/r} and @code{/b} modifiers. First with
+@code{/b}, the bytes of the instruction are printed, in hex, in memory
+order:
+
+@smallexample
+(@value{GDBP}) disassemble /b 0x00010150,0x0001015c
+Dump of assembler code from 0x10150 to 0x1015c:
+ 0x00010150 <call_me+4>: 22 dc sw s0,56(sp)
+ 0x00010152 <call_me+6>: 80 00 addi s0,sp,64
+ 0x00010154 <call_me+8>: 23 26 a4 fe sw a0,-20(s0)
+ 0x00010158 <call_me+12>: 23 24 b4 fe sw a1,-24(s0)
+End of assembler dump.
+@end smallexample
+
+In contrast, with @code{/r} the bytes of the instruction are displayed
+in the instruction order, for RISC-V this means that the bytes have been
+swapped to little-endian order:
+
+@smallexample
+(@value{GDBP}) disassemble /r 0x00010150,0x0001015c
+Dump of assembler code from 0x10150 to 0x1015c:
+ 0x00010150 <call_me+4>: dc22 sw s0,56(sp)
+ 0x00010152 <call_me+6>: 0080 addi s0,sp,64
+ 0x00010154 <call_me+8>: fea42623 sw a0,-20(s0)
+ 0x00010158 <call_me+12>: feb42423 sw a1,-24(s0)
+End of assembler dump.
+@end smallexample
+
 Here is an example showing mixed source+assembly for Intel x86
 with @code{/m} or @code{/s}, when the program is stopped just after
 function prologue in a non-optimized function with no inline code.