aboutsummaryrefslogtreecommitdiff
path: root/gdb/doc
diff options
context:
space:
mode:
authorAndrew Burgess <andrew.burgess@embecosm.com>2021-09-17 18:12:34 +0100
committerAndrew Burgess <aburgess@redhat.com>2022-06-15 09:44:54 +0100
commit15e15b2d9cd3b1db68f99cd3b047352142ddfd1c (patch)
tree6bc04a49dbf8d60839ec0d73638eee4803acd559 /gdb/doc
parente4ae302562aba1bd166919d76341fb631e2d470a (diff)
downloadgdb-15e15b2d9cd3b1db68f99cd3b047352142ddfd1c.zip
gdb-15e15b2d9cd3b1db68f99cd3b047352142ddfd1c.tar.gz
gdb-15e15b2d9cd3b1db68f99cd3b047352142ddfd1c.tar.bz2
gdb/python: implement the print_insn extension language hook
This commit extends the Python API to include disassembler support. The motivation for this commit was to provide an API by which the user could write Python scripts that would augment the output of the disassembler. To achieve this I have followed the model of the existing libopcodes disassembler, that is, instructions are disassembled one by one. This does restrict the type of things that it is possible to do from a Python script, i.e. all additional output has to fit on a single line, but this was all I needed, and creating something more complex would, I think, require greater changes to how GDB's internal disassembler operates. The disassembler API is contained in the new gdb.disassembler module, which defines the following classes: DisassembleInfo Similar to libopcodes disassemble_info structure, has read-only properties: address, architecture, and progspace. And has methods: __init__, read_memory, and is_valid. Each time GDB wants an instruction disassembled, an instance of this class is passed to a user written disassembler function, by reading the properties, and calling the methods (and other support methods in the gdb.disassembler module) the user can perform and return the disassembly. Disassembler This is a base-class which user written disassemblers should inherit from. This base class provides base implementations of __init__ and __call__ which the user written disassembler should override. DisassemblerResult This class can be used to hold the result of a call to the disassembler, it's really just a wrapper around a string (the text of the disassembled instruction) and a length (in bytes). The user can return an instance of this class from Disassembler.__call__ to represent the newly disassembled instruction. The gdb.disassembler module also provides the following functions: register_disassembler This function registers an instance of a Disassembler sub-class as a disassembler, either for one specific architecture, or, as a global disassembler for all architectures. builtin_disassemble This provides access to GDB's builtin disassembler. A common use case that I see is augmenting the existing disassembler output. The user code can call this function to have GDB disassemble the instruction in the normal way. The user gets back a DisassemblerResult object, which they can then read in order to augment the disassembler output in any way they wish. This function also provides a mechanism to intercept the disassemblers reads of memory, thus the user can adjust what GDB sees when it is disassembling. The included documentation provides a more detailed description of the API. There is also a new CLI command added: maint info python-disassemblers This command is defined in the Python gdb.disassemblers module, and can be used to list the currently registered Python disassemblers.
Diffstat (limited to 'gdb/doc')
-rw-r--r--gdb/doc/gdb.texinfo45
-rw-r--r--gdb/doc/python.texi328
2 files changed, 373 insertions, 0 deletions
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 3a8cf3f..2178b47 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -39680,6 +39680,51 @@ packet history.
@item maint info jit
Print information about JIT code objects loaded in the current inferior.
+@anchor{maint info python-disassemblers}
+@kindex maint info python-disassemblers
+@item maint info python-disassemblers
+This command is defined within the @code{gdb.disassembler} Python
+module (@pxref{Disassembly In Python}), and will only be present after
+that module has been imported. To force the module to be imported do
+the following:
+
+@smallexample
+(@value{GDBP}) python import gdb.disassembler
+@end smallexample
+
+This command lists all the architectures for which a disassembler is
+currently registered, and the name of the disassembler. If a
+disassembler is registered for all architectures, then this is listed
+last against the @samp{GLOBAL} architecture.
+
+If one of the disassemblers would be selected for the architecture of
+the current inferior, then this disassembler will be marked.
+
+The following example shows a situation in which two disassemblers are
+registered, initially the @samp{i386} disassembler matches the current
+architecture, then the architecture is changed, now the @samp{GLOBAL}
+disassembler matches.
+
+@smallexample
+@group
+(@value{GDBP}) show architecture
+The target architecture is set to "auto" (currently "i386").
+(@value{GDBP}) maint info python-disassemblers
+Architecture Disassember Name
+i386 Disassembler_1 (Matches current architecture)
+GLOBAL Disassembler_2
+@end group
+@group
+(@value{GDBP}) set architecture arm
+The target architecture is set to "arm".
+(@value{GDBP}) maint info python-disassemblers
+quit
+Architecture Disassember Name
+i386 Disassembler_1
+GLOBAL Disassembler_2 (Matches current architecture)
+@end group
+@end smallexample
+
@kindex set displaced-stepping
@kindex show displaced-stepping
@cindex displaced stepping support
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index aaf7666..75804ef 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -222,6 +222,7 @@ optional arguments while skipping others. Example:
* Registers In Python:: Python representation of registers.
* Connections In Python:: Python representation of connections.
* TUI Windows In Python:: Implementing new TUI windows.
+* Disassembly In Python:: Instruction Disassembly In Python
@end menu
@node Basic Python
@@ -599,6 +600,7 @@ such as those used by readline for command input, and annotation
related prompts are prohibited from being changed.
@end defun
+@anchor{gdb_architecture_names}
@defun gdb.architecture_names ()
Return a list containing all of the architecture names that the
current build of @value{GDBN} supports. Each architecture name is a
@@ -3287,6 +3289,7 @@ single address space, so this may not match the architecture of a
particular frame (@pxref{Frames In Python}).
@end defun
+@anchor{gdbpy_inferior_read_memory}
@findex Inferior.read_memory
@defun Inferior.read_memory (address, length)
Read @var{length} addressable memory units from the inferior, starting at
@@ -6575,6 +6578,331 @@ corner), and @var{button} specifies which mouse button was used, whose
values can be 1 (left), 2 (middle), or 3 (right).
@end defun
+@node Disassembly In Python
+@subsubsection Instruction Disassembly In Python
+@cindex python instruction disassembly
+
+@value{GDBN}'s builtin disassembler can be extended, or even replaced,
+using the Python API. The disassembler related features are contained
+within the @code{gdb.disassembler} module:
+
+@deftp {class} gdb.disassembler.DisassembleInfo
+Disassembly is driven by instances of this class. Each time
+@value{GDBN} needs to disassemble an instruction, an instance of this
+class is created and passed to a registered disassembler. The
+disassembler is then responsible for disassembling an instruction and
+returning a result.
+
+Instances of this type are usually created within @value{GDBN},
+however, it is possible to create a copy of an instance of this type,
+see the description of @code{__init__} for more details.
+
+This class has the following properties and methods:
+
+@defvar DisassembleInfo.address
+A read-only integer containing the address at which @value{GDBN}
+wishes to disassemble a single instruction.
+@end defvar
+
+@defvar DisassembleInfo.architecture
+The @code{gdb.Architecture} (@pxref{Architectures In Python}) for
+which @value{GDBN} is currently disassembling, this property is
+read-only.
+@end defvar
+
+@defvar DisassembleInfo.progspace
+The @code{gdb.Progspace} (@pxref{Progspaces In Python,,Program Spaces
+In Python}) for which @value{GDBN} is currently disassembling, this
+property is read-only.
+@end defvar
+
+@defun DisassembleInfo.is_valid ()
+Returns @code{True} if the @code{DisassembleInfo} object is valid,
+@code{False} if not. A @code{DisassembleInfo} object will become
+invalid once the disassembly call for which the @code{DisassembleInfo}
+was created, has returned. Calling other @code{DisassembleInfo}
+methods, or accessing @code{DisassembleInfo} properties, will raise a
+@code{RuntimeError} exception if it is invalid.
+@end defun
+
+@defun DisassembleInfo.__init__ (info)
+This can be used to create a new @code{DisassembleInfo} object that is
+a copy of @var{info}. The copy will have the same @code{address},
+@code{architecture}, and @code{progspace} values as @var{info}, and
+will become invalid at the same time as @var{info}.
+
+This method exists so that sub-classes of @code{DisassembleInfo} can
+be created, these sub-classes must be initialized as copies of an
+existing @code{DisassembleInfo} object, but sub-classes might choose
+to override the @code{read_memory} method, and so control what
+@value{GDBN} sees when reading from memory
+(@pxref{builtin_disassemble}).
+@end defun
+
+@defun DisassembleInfo.read_memory (length, offset)
+This method allows the disassembler to read the bytes of the
+instruction to be disassembled. The method reads @var{length} bytes,
+starting at @var{offset} from
+@code{DisassembleInfo.address}.
+
+It is important that the disassembler read the instruction bytes using
+this method, rather than reading inferior memory directly, as in some
+cases @value{GDBN} disassembles from an internal buffer rather than
+directly from inferior memory, calling this method handles this
+detail.
+
+Returns a buffer object, which behaves much like an array or a string,
+just as @code{Inferior.read_memory} does
+(@pxref{gdbpy_inferior_read_memory,,Inferior.read_memory}). The
+length of the returned buffer will always be exactly @var{length}.
+
+If @value{GDBN} is unable to read the required memory then a
+@code{gdb.MemoryError} exception is raised (@pxref{Exception
+Handling}).
+
+This method can be overridden by a sub-class in order to control what
+@value{GDBN} sees when reading from memory
+(@pxref{builtin_disassemble}). When overriding this method it is
+important to understand how @code{builtin_disassemble} makes use of
+this method.
+
+While disassembling a single instruction there could be multiple calls
+to this method, and the same bytes might be read multiple times. Any
+single call might only read a subset of the total instruction bytes.
+
+If an implementation of @code{read_memory} is unable to read the
+requested memory contents, for example, if there's a request to read
+from an invalid memory address, then a @code{gdb.MemoryError} should
+be raised.
+
+Raising a @code{MemoryError} inside @code{read_memory} does not
+automatically mean a @code{MemoryError} will be raised by
+@code{builtin_disassemble}. It is possible the @value{GDBN}'s builtin
+disassembler is probing to see how many bytes are available. When
+@code{read_memory} raises the @code{MemoryError} the builtin
+disassembler might be able to perform a complete disassembly with the
+bytes it has available, in this case @code{builtin_disassemble} will
+not itself raise a @code{MemoryError}.
+
+Any other exception type raised in @code{read_memory} will propagate
+back and be available re-raised by @code{builtin_disassemble}.
+@end defun
+@end deftp
+
+@deftp {class} Disassembler
+This is a base class from which all user implemented disassemblers
+must inherit.
+
+@defun Disassembler.__init__ (name)
+The constructor takes @var{name}, a string, which should be a short
+name for this disassembler.
+@end defun
+
+@defun Disassembler.__call__ (info)
+The @code{__call__} method must be overridden by sub-classes to
+perform disassembly. Calling @code{__call__} on this base class will
+raise a @code{NotImplementedError} exception.
+
+The @var{info} argument is an instance of @code{DisassembleInfo}, and
+describes the instruction that @value{GDBN} wants disassembling.
+
+If this function returns @code{None}, this indicates to @value{GDBN}
+that this sub-class doesn't wish to disassemble the requested
+instruction. @value{GDBN} will then use its builtin disassembler to
+perform the disassembly.
+
+Alternatively, this function can return a @code{DisassemblerResult}
+that represents the disassembled instruction, this type is described
+in more detail below.
+
+The @code{__call__} method can raise a @code{gdb.MemoryError}
+exception (@pxref{Exception Handling}) to indicate to @value{GDBN}
+that there was a problem accessing the required memory, this will then
+be displayed by @value{GDBN} within the disassembler output.
+
+Ideally, the only three outcomes from invoking @code{__call__} would
+be a return of @code{None}, a successful disassembly returned in a
+@code{DisassemblerResult}, or a @code{MemoryError} indicating that
+there was a problem reading memory.
+
+However, as an implementation of @code{__call__} could fail due to
+other reasons, e.g.@: some external resource required to perform
+disassembly is temporarily unavailable, then, if @code{__call__}
+raises a @code{GdbError}, the exception will be converted to a string
+and printed at the end of the disassembly output, the disassembly
+request will then stop.
+
+Any other exception type raised by the @code{__call__} method is
+considered an error in the user code, the exception will be printed to
+the error stream according to the @kbd{set python print-stack} setting
+(@pxref{set_python_print_stack,,@kbd{set python print-stack}}).
+@end defun
+@end deftp
+
+@deftp {class} DisassemblerResult
+This class is used to hold the result of calling
+@w{@code{Disassembler.__call__}}, and represents a single disassembled
+instruction. This class has the following properties and methods:
+
+@defun DisassemblerResult.__init__ (@var{length}, @var{string})
+Initialize an instance of this class, @var{length} is the length of
+the disassembled instruction in bytes, which must be greater than
+zero, and @var{string} is a non-empty string that represents the
+disassembled instruction.
+@end defun
+
+@defvar DisassemblerResult.length
+A read-only property containing the length of the disassembled
+instruction in bytes, this will always be greater than zero.
+@end defvar
+
+@defvar DisassemblerResult.string
+A read-only property containing a non-empty string representing the
+disassembled instruction.
+@end defvar
+@end deftp
+
+The following functions are also contained in the
+@code{gdb.disassembler} module:
+
+@defun register_disassembler (disassembler, architecture)
+The @var{disassembler} must be a sub-class of
+@code{gdb.disassembler.Disassembler} or @code{None}.
+
+The optional @var{architecture} is either a string, or the value
+@code{None}. If it is a string, then it should be the name of an
+architecture known to @value{GDBN}, as returned either from
+@code{gdb.Architecture.name}
+(@pxref{gdbpy_architecture_name,,gdb.Architecture.name}), or from
+@code{gdb.architecture_names}
+(@pxref{gdb_architecture_names,,gdb.architecture_names}).
+
+The @var{disassembler} will be installed for the architecture named by
+@var{architecture}, or if @var{architecture} is @code{None}, then
+@var{disassembler} will be installed as a global disassembler for use
+by all architectures.
+
+@cindex disassembler in Python, global vs.@: specific
+@cindex search order for disassembler in Python
+@cindex look up of disassembler in Python
+@value{GDBN} only records a single disassembler for each architecture,
+and a single global disassembler. Calling
+@code{register_disassembler} for an architecture, or for the global
+disassembler, will replace any existing disassembler registered for
+that @var{architecture} value. The previous disassembler is returned.
+
+If @var{disassembler} is @code{None} then any disassembler currently
+registered for @var{architecture} is deregistered and returned.
+
+When @value{GDBN} is looking for a disassembler to use, @value{GDBN}
+first looks for an architecture specific disassembler. If none has
+been registered then @value{GDBN} looks for a global disassembler (one
+registered with @var{architecture} set to @code{None}). Only one
+disassembler is called to perform disassembly, so, if there is both an
+architecture specific disassembler, and a global disassembler
+registered, it is the architecture specific disassembler that will be
+used.
+
+@value{GDBN} tracks the architecture specific, and global
+disassemblers separately, so it doesn't matter in which order
+disassemblers are created or registered; an architecture specific
+disassembler, if present, will always be used in preference to a
+global disassembler.
+
+You can use the @kbd{maint info python-disassemblers} command
+(@pxref{maint info python-disassemblers}) to see which disassemblers
+have been registered.
+@end defun
+
+@anchor{builtin_disassemble}
+@defun builtin_disassemble (info)
+This function calls back into @value{GDBN}'s builtin disassembler to
+disassemble the instruction identified by @var{info}, an instance, or
+sub-class, of @code{DisassembleInfo}.
+
+When the builtin disassembler needs to read memory the
+@code{read_memory} method on @var{info} will be called. By
+sub-classing @code{DisassembleInfo} and overriding the
+@code{read_memory} method, it is possible to intercept calls to
+@code{read_memory} from the builtin disassembler, and to modify the
+values returned.
+
+It is important to understand that, even when
+@code{DisassembleInfo.read_memory} raises a @code{gdb.MemoryError}, it
+is the internal disassembler itself that reports the memory error to
+@value{GDBN}. The reason for this is that the disassembler might
+probe memory to see if a byte is readable or not; if the byte can't be
+read then the disassembler may choose not to report an error, but
+instead to disassemble the bytes that it does have available.
+
+If the builtin disassembler is successful then an instance of
+@code{DisassemblerResult} is returned from @code{builtin_disassemble},
+alternatively, if something goes wrong, an exception will be raised.
+
+A @code{MemoryError} will be raised if @code{builtin_disassemble} is
+unable to read some memory that is required in order to perform
+disassembly correctly.
+
+Any exception that is not a @code{MemoryError}, that is raised in a
+call to @code{read_memory}, will pass through
+@code{builtin_disassemble}, and be visible to the caller.
+
+Finally, there are a few cases where @value{GDBN}'s builtin
+disassembler can fail for reasons that are not covered by
+@code{MemoryError}. In these cases, a @code{GdbError} will be raised.
+The contents of the exception will be a string describing the problem
+the disassembler encountered.
+@end defun
+
+Here is an example that registers a global disassembler. The new
+disassembler invokes the builtin disassembler, and then adds a
+comment, @code{## Comment}, to each line of disassembly output:
+
+@smallexample
+class ExampleDisassembler(gdb.disassembler.Disassembler):
+ def __init__(self):
+ super().__init__("ExampleDisassembler")
+
+ def __call__(self, info):
+ result = gdb.disassembler.builtin_disassemble(info)
+ length = result.length
+ text = result.string + "\t## Comment"
+ return gdb.disassembler.DisassemblerResult(length, text)
+
+gdb.disassembler.register_disassembler(ExampleDisassembler())
+@end smallexample
+
+The following example creates a sub-class of @code{DisassembleInfo} in
+order to intercept the @code{read_memory} calls, within
+@code{read_memory} any bytes read from memory have the two 4-bit
+nibbles swapped around. This isn't a very useful adjustment, but
+serves as an example.
+
+@smallexample
+class MyInfo(gdb.disassembler.DisassembleInfo):
+ def __init__(self, info):
+ super().__init__(info)
+
+ def read_memory(self, length, offset):
+ buffer = super().read_memory(length, offset)
+ result = bytearray()
+ for b in buffer:
+ v = int.from_bytes(b, 'little')
+ v = (v << 4) & 0xf0 | (v >> 4)
+ result.append(v)
+ return memoryview(result)
+
+class NibbleSwapDisassembler(gdb.disassembler.Disassembler):
+ def __init__(self):
+ super().__init__("NibbleSwapDisassembler")
+
+ def __call__(self, info):
+ info = MyInfo(info)
+ return gdb.disassembler.builtin_disassemble(info)
+
+gdb.disassembler.register_disassembler(NibbleSwapDisassembler())
+@end smallexample
+
@node Python Auto-loading
@subsection Python Auto-loading
@cindex Python auto-loading