aboutsummaryrefslogtreecommitdiff
path: root/gdb/python
diff options
context:
space:
mode:
authorAndrew Burgess <aburgess@redhat.com>2023-01-24 15:35:45 +0000
committerAndrew Burgess <aburgess@redhat.com>2023-05-16 10:30:47 +0100
commit4de4e48514fc47aeb4ca95cd4091e2a333fbe9e1 (patch)
treee6af2471378a753f74f665b601f21c15958c5f77 /gdb/python
parent0af2f233330024e0e9b4697d510c7030e518e64c (diff)
downloadgdb-4de4e48514fc47aeb4ca95cd4091e2a333fbe9e1.zip
gdb-4de4e48514fc47aeb4ca95cd4091e2a333fbe9e1.tar.gz
gdb-4de4e48514fc47aeb4ca95cd4091e2a333fbe9e1.tar.bz2
gdb/python: extend the Python Disassembler API to allow for styling
This commit extends the Python Disassembler API to allow for styling of the instructions. Before this commit the Python Disassembler API allowed the user to do two things: - They could intercept instruction disassembly requests and return a string of their choosing, this string then became the disassembled instruction, or - They could call builtin_disassemble, which would call back into libopcode to perform the disassembly. As libopcode printed the instruction GDB would collect these print requests and build a string. This string was then returned from the builtin_disassemble call, and the user could modify or extend this string as needed. Neither of these approaches allowed for, or preserved, disassembler styling, which is now available within libopcodes for many of the more popular architectures GDB supports. This commit aims to fill this gap. After this commit a user will be able to do the following things: - Implement a custom instruction disassembler entirely in Python without calling back into libopcodes, the custom disassembler will be able to return styling information such that GDB will display the instruction fully styled. All of GDB's existing style settings will affect how instructions coming from the Python disassembler are displayed in the expected manner. - Call builtin_disassemble and receive a result that represents how libopcode would like the instruction styled. The user can then adjust or extend the disassembled instruction before returning the result to GDB. Again, the instruction will be styled as expected. To achieve this I will add two new classes to GDB, DisassemblerTextPart and DisassemblerAddressPart. Within builtin_disassemble, instead of capturing the print calls from libopcodes and building a single string, we will now create either a text part or address part and store these parts in a vector. The DisassemblerTextPart will capture a small piece of text along with the associated style that should be used to display the text. This corresponds to the disassembler calling disassemble_info::fprintf_styled_func, or for disassemblers that don't support styling disassemble_info::fprintf_func. The DisassemblerAddressPart is used when libopcodes requests that an address be printed, and takes care of printing the address and associated symbol, this corresponds to the disassembler calling disassemble_info::print_address_func. These parts are then placed within the DisassemblerResult when builtin_disassemble returns. Alternatively, the user can directly create parts by calling two new methods on the DisassembleInfo class: DisassembleInfo.text_part and DisassembleInfo.address_part. Having created these parts the user can then pass these parts when initializing a new DisassemblerResult object. Finally, when we return from Python to gdbpy_print_insn, one way or another, the result being returned will have a list of parts. Back in GDB's C++ code we walk the list of parts and call back into GDB's core to display the disassembled instruction with the correct styling. The new API lives in parallel with the old API. Any existing code that creates a DisassemblerResult using a single string immediately creates a single DisassemblerTextPart containing the entire instruction and gives this part the default text style. This is also what happens if the user calls builtin_disassemble for an architecture that doesn't (yet) support libopcode styling. This matches up with what happens when the Python API is not involved, an architecture without disassembler styling support uses the old libopcodes printing API (the API that doesn't pass style info), and GDB just prints everything using the default text style. The reason that parts are created by calling methods on DisassembleInfo, rather than calling the class constructor directly, is DisassemblerAddressPart. Ideally this part would only hold the address which the part represents, but in order to support backwards compatibility we need to be able to convert the DisassemblerAddressPart into a string. To do that we need to call GDB's internal print_address function, and to do that we need an gdbarch. What this means is that the DisassemblerAddressPart needs to take a gdb.Architecture object at creation time. The only valid place a user can pull this from is from the DisassembleInfo object, so having the DisassembleInfo act as a factory ensures that the correct gdbarch is passed over each time. I implemented both solutions (the one presented here, and an alternative where parts could be constructed directly), and this felt like the cleanest solution. Reviewed-By: Eli Zaretskii <eliz@gnu.org> Reviewed-By: Tom Tromey <tom@tromey.com>
Diffstat (limited to 'gdb/python')
-rw-r--r--gdb/python/py-disasm.c871
1 files changed, 820 insertions, 51 deletions
diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
index f246a09..85d936e 100644
--- a/gdb/python/py-disasm.c
+++ b/gdb/python/py-disasm.c
@@ -56,6 +56,49 @@ struct disasm_info_object
extern PyTypeObject disasm_info_object_type
CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("disasm_info_object");
+/* Implement gdb.disassembler.DisassembleAddressPart type. An object of
+ this type represents a small part of a disassembled instruction; a part
+ that is an address that should be printed using a call to GDB's
+ internal print_address function. */
+
+struct disasm_addr_part_object
+{
+ PyObject_HEAD
+
+ /* The address to be formatted. */
+ bfd_vma address;
+
+ /* A gdbarch. This is only needed in the case where the user asks for
+ the DisassemblerAddressPart to be converted to a string. When we
+ return this part to GDB within a DisassemblerResult then GDB will use
+ the gdbarch from the initial disassembly request. */
+ struct gdbarch *gdbarch;
+};
+
+extern PyTypeObject disasm_addr_part_object_type
+ CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("disasm_addr_part_object");
+
+/* Implement gdb.disassembler.DisassembleTextPart type. An object of
+ this type represents a small part of a disassembled instruction; a part
+ that is a piece of test along with an associated style. */
+
+struct disasm_text_part_object
+{
+ PyObject_HEAD
+
+ /* The string that is this part. */
+ std::string *string;
+
+ /* The style to use when displaying this part. */
+ enum disassembler_style style;
+};
+
+extern PyTypeObject disasm_text_part_object_type
+ CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("disasm_text_part_object");
+
+extern PyTypeObject disasm_part_object_type
+ CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("PyObject");
+
/* Implement gdb.disassembler.DisassemblerResult type, an object that holds
the result of calling the disassembler. This is mostly the length of
the disassembled instruction (in bytes), and the string representing the
@@ -68,9 +111,9 @@ struct disasm_result_object
/* The length of the disassembled instruction in bytes. */
int length;
- /* A buffer which, when allocated, holds the disassembled content of an
- instruction. */
- string_file *content;
+ /* A vector containing all the parts of the disassembled instruction.
+ Each part will be a DisassemblerPart sub-class. */
+ std::vector<gdbpy_ref<>> *parts;
};
extern PyTypeObject disasm_result_object_type
@@ -88,7 +131,7 @@ static bool python_print_insn_enabled = false;
placed in the application_data field of the disassemble_info that is
used when we call gdbarch_print_insn. */
-struct gdbpy_disassembler : public gdb_printing_disassembler
+struct gdbpy_disassembler : public gdb_disassemble_info
{
/* Constructor. */
gdbpy_disassembler (disasm_info_object *obj, PyObject *memory_source);
@@ -109,6 +152,27 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
unsigned int len,
struct disassemble_info *info) noexcept;
+ /* Callback used as the disassemble_info's fprintf_func callback. The
+ DIS_INFO pointer is a pointer to a gdbpy_disassembler object. */
+ static int fprintf_func (void *dis_info, const char *format, ...) noexcept
+ ATTRIBUTE_PRINTF(2,3);
+
+ /* Callback used as the disassemble_info's fprintf_styled_func callback.
+ The DIS_INFO pointer is a pointer to a gdbpy_disassembler. */
+ static int fprintf_styled_func (void *dis_info,
+ enum disassembler_style style,
+ const char *format, ...) noexcept
+ ATTRIBUTE_PRINTF(3,4);
+
+ /* Helper used by fprintf_func and fprintf_styled_func. This function
+ creates a new DisassemblerTextPart and adds it to the disassembler's
+ parts list. The actual disassembler is accessed through DIS_INFO,
+ which is a pointer to the gdbpy_disassembler object. */
+ static int vfprintf_styled_func (void *dis_info,
+ enum disassembler_style style,
+ const char *format, va_list args) noexcept
+ ATTRIBUTE_PRINTF(3,0);
+
/* Return a reference to an optional that contains the address at which a
memory error occurred. The optional will only have a value if a
memory error actually occurred. */
@@ -118,9 +182,9 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
/* Return the content of the disassembler as a string. The contents are
moved out of the disassembler, so after this call the disassembler
contents have been reset back to empty. */
- std::string release ()
+ std::vector<gdbpy_ref<>> release ()
{
- return m_string_file.release ();
+ return std::move (m_parts);
}
/* If there is a Python exception stored in this disassembler then
@@ -147,8 +211,10 @@ struct gdbpy_disassembler : public gdb_printing_disassembler
private:
- /* Where the disassembler result is written. */
- string_file m_string_file;
+ /* The list of all the parts that make up this disassembled instruction.
+ This is populated as a result of the callbacks from libopcodes as the
+ instruction is disassembled. */
+ std::vector<gdbpy_ref<>> m_parts;
/* The DisassembleInfo object we are disassembling for. */
disasm_info_object *m_disasm_info_object;
@@ -286,6 +352,38 @@ disasmpy_set_memory_error_for_address (CORE_ADDR address)
PyErr_SetObject (gdbpy_gdb_memory_error, address_obj);
}
+/* Create a new DisassemblerTextPart and return a gdbpy_ref wrapper for
+ the new object. STR is the string content of the part and STYLE is the
+ style to be used when GDB displays this part. */
+
+static gdbpy_ref<>
+make_disasm_text_part (std::string &&str, enum disassembler_style style)
+{
+ PyTypeObject *type = &disasm_text_part_object_type;
+ disasm_text_part_object *text_part
+ = (disasm_text_part_object *) type->tp_alloc (type, 0);
+ text_part->string = new std::string (str);
+ text_part->style = style;
+
+ return gdbpy_ref<> ((PyObject *) text_part);
+}
+
+/* Create a new DisassemblerAddressPart and return a gdbpy_ref wrapper for
+ the new object. GDBARCH is the architecture used when formatting the
+ address, and ADDRESS is the numerical address to be displayed. */
+
+static gdbpy_ref<>
+make_disasm_addr_part (struct gdbarch *gdbarch, CORE_ADDR address)
+{
+ PyTypeObject *type = &disasm_addr_part_object_type;
+ disasm_addr_part_object *addr_part
+ = (disasm_addr_part_object *) type->tp_alloc (type, 0);
+ addr_part->address = address;
+ addr_part->gdbarch = gdbarch;
+
+ return gdbpy_ref<> ((PyObject *) addr_part);
+}
+
/* Ensure that a gdb.disassembler.DisassembleInfo is valid. */
#define DISASMPY_DISASM_INFO_REQUIRE_VALID(Info) \
@@ -298,21 +396,135 @@ disasmpy_set_memory_error_for_address (CORE_ADDR address)
} \
} while (0)
-/* Initialise OBJ, a DisassemblerResult object with LENGTH and CONTENT.
+/* Implement DisassembleInfo.text_part method. Creates and returns a new
+ DisassemblerTextPart object. */
+
+static PyObject *
+disasmpy_info_make_text_part (PyObject *self, PyObject *args,
+ PyObject *kwargs)
+{
+ disasm_info_object *obj = (disasm_info_object *) self;
+ DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+ static const char *keywords[] = { "style", "string", NULL };
+ int style_num;
+ const char *string;
+ if (!gdb_PyArg_ParseTupleAndKeywords (args, kwargs, "is", keywords,
+ &style_num, &string))
+ return nullptr;
+
+ if (style_num < 0 || style_num > ((int) dis_style_comment_start))
+ {
+ PyErr_SetString (PyExc_ValueError,
+ _("Invalid disassembler style."));
+ return nullptr;
+ }
+
+ if (strlen (string) == 0)
+ {
+ PyErr_SetString (PyExc_ValueError,
+ _("String must not be empty."));
+ return nullptr;
+ }
+
+ gdbpy_ref<> text_part
+ = make_disasm_text_part (std::string (string),
+ (enum disassembler_style) style_num);
+ return text_part.release ();
+}
+
+/* Implement DisassembleInfo.address_part method. Creates and returns a
+ new DisassemblerAddressPart object. */
+
+static PyObject *
+disasmpy_info_make_address_part (PyObject *self, PyObject *args,
+ PyObject *kwargs)
+{
+ disasm_info_object *obj = (disasm_info_object *) self;
+ DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+ static const char *keywords[] = { "address", NULL };
+ CORE_ADDR address;
+ PyObject *address_object;
+ if (!gdb_PyArg_ParseTupleAndKeywords (args, kwargs, "O", keywords,
+ &address_object))
+ return nullptr;
+
+ if (get_addr_from_python (address_object, &address) < 0)
+ return nullptr;
+
+ return make_disasm_addr_part (obj->gdbarch, address).release ();
+}
+
+/* Return a string representation of TEXT_PART. The returned string does
+ not include any styling. */
+
+static std::string
+disasmpy_part_to_string (const disasm_text_part_object *text_part)
+{
+ gdb_assert (text_part->string != nullptr);
+ return *(text_part->string);
+}
+
+/* Return a string representation of ADDR_PART. The returned string does
+ not include any styling. */
+
+static std::string
+disasmpy_part_to_string (const disasm_addr_part_object *addr_part)
+{
+ string_file buf;
+ print_address (addr_part->gdbarch, addr_part->address, &buf);
+ return buf.release ();
+}
+
+/* PARTS is a vector of Python objects, each is a sub-class of
+ DisassemblerPart. Create a string by concatenating the string
+ representation of each part, and return this new string.
+
+ Converting an address part requires that we call back into GDB core,
+ which could throw an exception. As such, calls to this function should
+ be wrapped with a try/catch. */
+
+static std::string
+disasmpy_parts_list_to_string (const std::vector<gdbpy_ref<>> &parts)
+{
+ std::string str;
+ for (auto p : parts)
+ {
+ if (Py_TYPE (p.get ()) == &disasm_text_part_object_type)
+ {
+ disasm_text_part_object *text_part
+ = (disasm_text_part_object *) p.get ();
+ str += disasmpy_part_to_string (text_part);
+ }
+ else
+ {
+ gdb_assert (Py_TYPE (p.get ()) == &disasm_addr_part_object_type);
+
+ disasm_addr_part_object *addr_part
+ = (disasm_addr_part_object *) p.get ();
+ str += disasmpy_part_to_string (addr_part);
+ }
+ }
+
+ return str;
+}
+
+/* Initialise OBJ, a DisassemblerResult object with LENGTH and PARTS.
OBJ might already have been initialised, in which case any existing
- content should be discarded before the new CONTENT is moved in. */
+ content should be discarded before the new PARTS are moved in. */
static void
disasmpy_init_disassembler_result (disasm_result_object *obj, int length,
- std::string content)
+ std::vector<gdbpy_ref<>> &&parts)
{
- if (obj->content == nullptr)
- obj->content = new string_file;
+ if (obj->parts == nullptr)
+ obj->parts = new std::vector<gdbpy_ref<>>;
else
- obj->content->clear ();
+ obj->parts->clear ();
obj->length = length;
- *(obj->content) = std::move (content);
+ *(obj->parts) = std::move (parts);
}
/* Implement gdb.disassembler.builtin_disassemble(). Calls back into GDB's
@@ -375,9 +587,19 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
}
else
{
- std::string content = disassembler.release ();
- if (!content.empty ())
- PyErr_SetString (gdbpy_gdberror_exc, content.c_str ());
+ auto content = disassembler.release ();
+ std::string str;
+
+ try
+ {
+ str = disasmpy_parts_list_to_string (content);
+ }
+ catch (const gdb_exception &except)
+ {
+ GDB_PY_HANDLE_EXCEPTION (except);
+ }
+ if (!str.empty ())
+ PyErr_SetString (gdbpy_gdberror_exc, str.c_str ());
else
PyErr_SetString (gdbpy_gdberror_exc,
_("Unknown disassembly error."));
@@ -393,10 +615,10 @@ disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
gdb_assert (!disassembler.memory_error_address ().has_value ());
/* Create a DisassemblerResult containing the results. */
- std::string content = disassembler.release ();
PyTypeObject *type = &disasm_result_object_type;
gdbpy_ref<disasm_result_object> res
((disasm_result_object *) type->tp_alloc (type, 0));
+ auto content = disassembler.release ();
disasmpy_init_disassembler_result (res.get (), length, std::move (content));
return reinterpret_cast<PyObject *> (res.release ());
}
@@ -510,6 +732,88 @@ disasmpy_info_progspace (PyObject *self, void *closure)
return pspace_to_pspace_object (obj->program_space).release ();
}
+/* Helper function called when the libopcodes disassembler produces some
+ output. FORMAT and ARGS are used to create a string which GDB will
+ display using STYLE. The string is either added as a new
+ DisassemblerTextPart to the list of parts being built in the current
+ gdbpy_disassembler object (accessed through DIS_INFO). Or, if the last
+ part in the gdbpy_disassembler is a text part in the same STYLE, then
+ the new string is appended to the previous part.
+
+ The merging behaviour make the Python API a little more user friendly,
+ some disassemblers produce their output character at a time, there's no
+ particular reason for this, it's just how they are implemented. By
+ merging parts with the same style we make it easier for the user to
+ analyse the disassembler output. */
+
+int
+gdbpy_disassembler::vfprintf_styled_func (void *dis_info,
+ enum disassembler_style style,
+ const char *format,
+ va_list args) noexcept
+{
+ gdb_disassemble_info *di = (gdb_disassemble_info *) dis_info;
+ gdbpy_disassembler *dis
+ = gdb::checked_static_cast<gdbpy_disassembler *> (di);
+
+ if (!dis->m_parts.empty ()
+ && Py_TYPE (dis->m_parts.back ().get ()) == &disasm_text_part_object_type
+ && (((disasm_text_part_object *) dis->m_parts.back ().get ())->style
+ == style))
+ {
+ std::string *string
+ = ((disasm_text_part_object *) dis->m_parts.back ().get ())->string;
+ string_vappendf (*string, format, args);
+ }
+ else
+ {
+ std::string str = string_vprintf (format, args);
+ if (str.size () > 0)
+ {
+ gdbpy_ref<> text_part
+ = make_disasm_text_part (std::move (str), style);
+ dis->m_parts.emplace_back (std::move (text_part));
+ }
+ }
+
+ /* Something non -ve. */
+ return 0;
+}
+
+/* Disassembler callback for architectures where libopcodes doesn't
+ created styled output. In these cases we format all the output using
+ the (default) text style. */
+
+int
+gdbpy_disassembler::fprintf_func (void *dis_info,
+ const char *format, ...) noexcept
+{
+ va_list args;
+ va_start (args, format);
+ vfprintf_styled_func (dis_info, dis_style_text, format, args);
+ va_end (args);
+
+ /* Something non -ve. */
+ return 0;
+}
+
+/* Disassembler callback for architectures where libopcodes does create
+ styled output. Just creates a new text part with the given STYLE. */
+
+int
+gdbpy_disassembler::fprintf_styled_func (void *dis_info,
+ enum disassembler_style style,
+ const char *format, ...) noexcept
+{
+ va_list args;
+ va_start (args, format);
+ vfprintf_styled_func (dis_info, style, format, args);
+ va_end (args);
+
+ /* Something non -ve. */
+ return 0;
+}
+
/* This implements the disassemble_info read_memory_func callback and is
called from the libopcodes disassembler when the disassembler wants to
read memory.
@@ -615,11 +919,24 @@ disasmpy_result_str (PyObject *self)
{
disasm_result_object *obj = (disasm_result_object *) self;
- gdb_assert (obj->content != nullptr);
- gdb_assert (obj->content->size () > 0);
+ /* These conditions are all enforced when the DisassemblerResult object
+ is created. */
+ gdb_assert (obj->parts != nullptr);
+ gdb_assert (obj->parts->size () > 0);
gdb_assert (obj->length > 0);
- return PyUnicode_Decode (obj->content->c_str (),
- obj->content->size (),
+
+ std::string str;
+
+ try
+ {
+ str = disasmpy_parts_list_to_string (*obj->parts);
+ }
+ catch (const gdb_exception &except)
+ {
+ GDB_PY_HANDLE_EXCEPTION (except);
+ }
+
+ return PyUnicode_Decode (str.c_str (), str.size (),
host_charset (), nullptr);
}
@@ -642,6 +959,39 @@ disasmpy_result_string (PyObject *self, void *closure)
return disasmpy_result_str (self);
}
+/* Implement DisassemblerResult.parts method. Returns a list of all the
+ parts that make up this result. There should always be at least one
+ part, so the returned list should never be empty. */
+
+static PyObject *
+disasmpy_result_parts (PyObject *self, void *closure)
+{
+ disasm_result_object *obj = (disasm_result_object *) self;
+
+ /* These conditions are all enforced when the DisassemblerResult object
+ is created. */
+ gdb_assert (obj->parts != nullptr);
+ gdb_assert (obj->parts->size () > 0);
+ gdb_assert (obj->length > 0);
+
+ gdbpy_ref<> result_list (PyList_New (obj->parts->size ()));
+ if (result_list == nullptr)
+ return nullptr;
+ Py_ssize_t idx = 0;
+ for (auto p : *obj->parts)
+ {
+ gdbpy_ref<> item = gdbpy_ref<>::new_reference (p.get ());
+ PyList_SET_ITEM (result_list.get (), idx, item.release ());
+ ++idx;
+ }
+
+ /* This should follow naturally from the obj->parts list being
+ non-empty. */
+ gdb_assert (PyList_Size (result_list.get()) > 0);
+
+ return result_list.release ();
+}
+
/* Implement DisassemblerResult.__init__. Takes two arguments, an
integer, the length in bytes of the disassembled instruction, and a
string, the disassembled content of the instruction. */
@@ -649,11 +999,12 @@ disasmpy_result_string (PyObject *self, void *closure)
static int
disasmpy_result_init (PyObject *self, PyObject *args, PyObject *kwargs)
{
- static const char *keywords[] = { "length", "string", NULL };
+ static const char *keywords[] = { "length", "string", "parts", NULL };
int length;
- const char *string;
- if (!gdb_PyArg_ParseTupleAndKeywords (args, kwargs, "is", keywords,
- &length, &string))
+ const char *string = nullptr;
+ PyObject *parts_list = nullptr;
+ if (!gdb_PyArg_ParseTupleAndKeywords (args, kwargs, "i|zO", keywords,
+ &length, &string, &parts_list))
return -1;
if (length <= 0)
@@ -663,17 +1014,85 @@ disasmpy_result_init (PyObject *self, PyObject *args, PyObject *kwargs)
return -1;
}
- if (strlen (string) == 0)
+ if (parts_list == Py_None)
+ parts_list = nullptr;
+
+ if (string != nullptr && parts_list != nullptr)
{
- PyErr_SetString (PyExc_ValueError,
- _("String must not be empty."));
+ PyErr_Format (PyExc_ValueError,
+ _("Cannot use 'string' and 'parts' when creating %s."),
+ Py_TYPE (self)->tp_name);
return -1;
}
- disasm_result_object *obj = (disasm_result_object *) self;
- disasmpy_init_disassembler_result (obj, length, std::string (string));
+ if (string != nullptr)
+ {
+ if (strlen (string) == 0)
+ {
+ PyErr_SetString (PyExc_ValueError,
+ _("String must not be empty."));
+ return -1;
+ }
+
+ disasm_result_object *obj = (disasm_result_object *) self;
+ std::vector<gdbpy_ref<>> content;
+ gdbpy_ref<> text_part
+ = make_disasm_text_part (std::string (string), dis_style_text);
+ content.emplace_back (text_part.release ());
+ disasmpy_init_disassembler_result (obj, length, std::move (content));
+ }
+ else
+ {
+ if (!PySequence_Check (parts_list))
+ {
+ PyErr_SetString (PyExc_TypeError,
+ _("'parts' argument is not a sequence"));
+ return -1;
+ }
+
+ Py_ssize_t parts_count = PySequence_Size (parts_list);
+ if (parts_count <= 0)
+ {
+ PyErr_SetString (PyExc_ValueError,
+ _("'parts' list must not be empty."));
+ return -1;
+ }
+
+ disasm_result_object *obj = (disasm_result_object *) self;
+ std::vector<gdbpy_ref<>> content (parts_count);
+
+ struct gdbarch *gdbarch = nullptr;
+ for (Py_ssize_t i = 0; i < parts_count; ++i)
+ {
+ gdbpy_ref<> part (PySequence_GetItem (parts_list, i));
+
+ if (part == nullptr)
+ return -1;
+
+ if (Py_TYPE (part.get ()) == &disasm_addr_part_object_type)
+ {
+ disasm_addr_part_object *addr_part
+ = (disasm_addr_part_object *) part.get ();
+ gdb_assert (addr_part->gdbarch != nullptr);
+ if (gdbarch == nullptr)
+ gdbarch = addr_part->gdbarch;
+ else if (addr_part->gdbarch != gdbarch)
+ {
+ PyErr_SetString (PyExc_ValueError,
+ _("Inconsistent gdb.Architectures used "
+ "in 'parts' sequence."));
+ return -1;
+ }
+ }
+
+ content[i] = std::move (part);
+ }
+
+ disasmpy_init_disassembler_result (obj, length, std::move (content));
+ }
return 0;
+
}
/* Implement __repr__ for the DisassemblerResult type. */
@@ -683,12 +1102,12 @@ disasmpy_result_repr (PyObject *self)
{
disasm_result_object *obj = (disasm_result_object *) self;
- gdb_assert (obj->content != nullptr);
+ gdb_assert (obj->parts != nullptr);
- return PyUnicode_FromFormat ("<%s length=%d string=\"%s\">",
+ return PyUnicode_FromFormat ("<%s length=%d string=\"%U\">",
Py_TYPE (obj)->tp_name,
obj->length,
- obj->content->string ().c_str ());
+ disasmpy_result_str (self));
}
/* Implement memory_error_func callback for disassemble_info. Extract the
@@ -712,16 +1131,22 @@ gdbpy_disassembler::print_address_func (bfd_vma addr,
{
gdbpy_disassembler *dis
= static_cast<gdbpy_disassembler *> (info->application_data);
- print_address (dis->arch (), addr, dis->stream ());
+
+ gdbpy_ref<> addr_part
+ = make_disasm_addr_part (dis->arch (), addr);
+ dis->m_parts.emplace_back (std::move (addr_part));
}
/* constructor. */
gdbpy_disassembler::gdbpy_disassembler (disasm_info_object *obj,
PyObject *memory_source)
- : gdb_printing_disassembler (obj->gdbarch, &m_string_file,
- read_memory_func, memory_error_func,
- print_address_func),
+ : gdb_disassemble_info (obj->gdbarch,
+ read_memory_func,
+ memory_error_func,
+ print_address_func,
+ fprintf_func,
+ fprintf_styled_func),
m_disasm_info_object (obj),
m_memory_source (memory_source)
{ /* Nothing. */ }
@@ -932,20 +1357,39 @@ gdbpy_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
return gdb::optional<int> (-1);
}
- /* Validate the text of the disassembled instruction. */
- gdb_assert (result_obj->content != nullptr);
- std::string string (std::move (result_obj->content->release ()));
- if (strlen (string.c_str ()) == 0)
+ /* It is impossible to create a DisassemblerResult object with an empty
+ parts list. We know that each part results in a non-empty string, so
+ we know that the instruction disassembly will not be the empty
+ string. */
+ gdb_assert (result_obj->parts->size () > 0);
+
+ /* Now print out the parts that make up this instruction. */
+ for (auto &p : *result_obj->parts)
{
- PyErr_SetString (PyExc_ValueError,
- _("String attribute must not be empty."));
- gdbpy_print_stack ();
- return gdb::optional<int> (-1);
+ if (Py_TYPE (p.get ()) == &disasm_text_part_object_type)
+ {
+ disasm_text_part_object *text_part
+ = (disasm_text_part_object *) p.get ();
+ gdb_assert (text_part->string != nullptr);
+ info->fprintf_styled_func (info->stream, text_part->style,
+ "%s", text_part->string->c_str ());
+ }
+ else
+ {
+ gdb_assert (Py_TYPE (p.get ()) == &disasm_addr_part_object_type);
+ disasm_addr_part_object *addr_part
+ = (disasm_addr_part_object *) p.get ();
+ /* A DisassemblerAddressPart can only be created by calling a
+ method on DisassembleInfo, and the gdbarch is copied from the
+ DisassembleInfo into the DisassemblerAddressPart. As the
+ DisassembleInfo has its gdbarch initialised from GDBARCH in
+ this scope, and this architecture can't be changed, then the
+ following assert should hold. */
+ gdb_assert (addr_part->gdbarch == gdbarch);
+ info->print_address_func (addr_part->address, info);
+ }
}
- /* Print the disassembled instruction back to core GDB, and return the
- length of the disassembled instruction. */
- info->fprintf_func (info->stream, "%s", string.c_str ());
return gdb::optional<int> (length);
}
@@ -956,10 +1400,143 @@ static void
disasmpy_dealloc_result (PyObject *self)
{
disasm_result_object *obj = (disasm_result_object *) self;
- delete obj->content;
+ delete obj->parts;
Py_TYPE (self)->tp_free (self);
}
+/* The tp_init callback for the DisassemblerPart type. This just raises an
+ exception, which prevents the user from creating objects of this type.
+ Instead the user should create instances of a sub-class. */
+
+static int
+disasmpy_part_init (PyObject *self, PyObject *args, PyObject *kwargs)
+{
+ PyErr_SetString (PyExc_RuntimeError,
+ _("Cannot create instances of DisassemblerPart."));
+ return -1;
+}
+
+/* Return a string representing STYLE. The returned string is used as a
+ constant defined in the gdb.disassembler module. */
+
+static const char *
+get_style_name (enum disassembler_style style)
+{
+ switch (style)
+ {
+ case dis_style_text: return "STYLE_TEXT";
+ case dis_style_mnemonic: return "STYLE_MNEMONIC";
+ case dis_style_sub_mnemonic: return "STYLE_SUB_MNEMONIC";
+ case dis_style_assembler_directive: return "STYLE_ASSEMBLER_DIRECTIVE";
+ case dis_style_register: return "STYLE_REGISTER";
+ case dis_style_immediate: return "STYLE_IMMEDIATE";
+ case dis_style_address: return "STYLE_ADDRESS";
+ case dis_style_address_offset: return "STYLE_ADDRESS_OFFSET";
+ case dis_style_symbol: return "STYLE_SYMBOL";
+ case dis_style_comment_start: return "STYLE_COMMENT_START";
+ }
+
+ gdb_assert_not_reached ("unknown disassembler style");
+}
+
+/* Implement DisassemblerTextPart.__repr__ method. */
+
+static PyObject *
+disasmpy_text_part_repr (PyObject *self)
+{
+ disasm_text_part_object *obj = (disasm_text_part_object *) self;
+
+ gdb_assert (obj->string != nullptr);
+
+ return PyUnicode_FromFormat ("<%s string='%s', style='%s'>",
+ Py_TYPE (obj)->tp_name,
+ obj->string->c_str (),
+ get_style_name (obj->style));
+}
+
+/* Implement DisassemblerTextPart.__str__ attribute. */
+
+static PyObject *
+disasmpy_text_part_str (PyObject *self)
+{
+ disasm_text_part_object *obj = (disasm_text_part_object *) self;
+
+ return PyUnicode_Decode (obj->string->c_str (), obj->string->size (),
+ host_charset (), nullptr);
+}
+
+/* Implement DisassemblerTextPart.string attribute. */
+
+static PyObject *
+disasmpy_text_part_string (PyObject *self, void *closure)
+{
+ return disasmpy_text_part_str (self);
+}
+
+/* Implement DisassemblerTextPart.style attribute. */
+
+static PyObject *
+disasmpy_text_part_style (PyObject *self, void *closure)
+{
+ disasm_text_part_object *obj = (disasm_text_part_object *) self;
+
+ LONGEST style_val = (LONGEST) obj->style;
+ return gdb_py_object_from_longest (style_val).release ();
+}
+
+/* Implement DisassemblerAddressPart.__repr__ method. */
+
+static PyObject *
+disasmpy_addr_part_repr (PyObject *self)
+{
+ disasm_addr_part_object *obj = (disasm_addr_part_object *) self;
+
+ return PyUnicode_FromFormat ("<%s address='%s'>",
+ Py_TYPE (obj)->tp_name,
+ core_addr_to_string_nz (obj->address));
+}
+
+/* Implement DisassemblerAddressPart.__str__ attribute. */
+
+static PyObject *
+disasmpy_addr_part_str (PyObject *self)
+{
+ disasm_addr_part_object *obj = (disasm_addr_part_object *) self;
+
+ std::string str;
+ try
+ {
+ string_file buf;
+ print_address (obj->gdbarch, obj->address, &buf);
+ str = buf.release ();
+ }
+ catch (const gdb_exception &except)
+ {
+ GDB_PY_HANDLE_EXCEPTION (except);
+ }
+
+ return PyUnicode_Decode (str.c_str (), str.size (),
+ host_charset (), nullptr);
+}
+
+/* Implement DisassemblerAddressPart.string attribute. */
+
+static PyObject *
+disasmpy_addr_part_string (PyObject *self, void *closure)
+{
+ return disasmpy_addr_part_str (self);
+}
+
+/* Implement DisassemblerAddressPart.address attribute. */
+
+static PyObject *
+disasmpy_addr_part_address (PyObject *self, void *closure)
+{
+ disasm_addr_part_object *obj = (disasm_addr_part_object *) self;
+
+ return gdb_py_object_from_longest (obj->address).release ();
+}
+
/* The get/set attributes of the gdb.disassembler.DisassembleInfo type. */
static gdb_PyGetSetDef disasm_info_object_getset[] = {
@@ -982,6 +1559,14 @@ Read LEN octets for the instruction to disassemble." },
{ "is_valid", disasmpy_info_is_valid, METH_NOARGS,
"is_valid () -> Boolean.\n\
Return true if this DisassembleInfo is valid, false if not." },
+ { "text_part", (PyCFunction) disasmpy_info_make_text_part,
+ METH_VARARGS | METH_KEYWORDS,
+ "text_part (STRING, STYLE) -> DisassemblerTextPart\n\
+Create a new text part, with contents STRING styled with STYLE." },
+ { "address_part", (PyCFunction) disasmpy_info_make_address_part,
+ METH_VARARGS | METH_KEYWORDS,
+ "address_part (ADDRESS) -> DisassemblerAddressPart\n\
+Create a new address part representing ADDRESS." },
{nullptr} /* Sentinel */
};
@@ -992,6 +1577,28 @@ static gdb_PyGetSetDef disasm_result_object_getset[] = {
"Length of the disassembled instruction.", nullptr },
{ "string", disasmpy_result_string, nullptr,
"String representing the disassembled instruction.", nullptr },
+ { "parts", disasmpy_result_parts, nullptr,
+ "List of all the separate disassembly parts", nullptr },
+ { nullptr } /* Sentinel */
+};
+
+/* The get/set attributes of the gdb.disassembler.DisassemblerTextPart type. */
+
+static gdb_PyGetSetDef disasmpy_text_part_getset[] = {
+ { "string", disasmpy_text_part_string, nullptr,
+ "String representing a text part.", nullptr },
+ { "style", disasmpy_text_part_style, nullptr,
+ "The style of this text part.", nullptr },
+ { nullptr } /* Sentinel */
+};
+
+/* The get/set attributes of the gdb.disassembler.DisassemblerAddressPart type. */
+
+static gdb_PyGetSetDef disasmpy_addr_part_getset[] = {
+ { "string", disasmpy_addr_part_string, nullptr,
+ "String representing an address part.", nullptr },
+ { "address", disasmpy_addr_part_address, nullptr,
+ "The address of this address part.", nullptr },
{ nullptr } /* Sentinel */
};
@@ -1046,6 +1653,13 @@ gdbpy_initialize_disasm ()
PyObject *dict = PyImport_GetModuleDict ();
PyDict_SetItemString (dict, "_gdb.disassembler", gdb_disassembler_module);
+ for (int i = 0; i <= (int) dis_style_comment_start; ++i)
+ {
+ const char *style_name = get_style_name ((enum disassembler_style) i);
+ if (PyModule_AddIntConstant (gdb_disassembler_module, style_name, i) < 0)
+ return -1;
+ }
+
disasm_info_object_type.tp_new = PyType_GenericNew;
if (PyType_Ready (&disasm_info_object_type) < 0)
return -1;
@@ -1062,6 +1676,32 @@ gdbpy_initialize_disasm ()
(PyObject *) &disasm_result_object_type) < 0)
return -1;
+ disasm_part_object_type.tp_new = PyType_GenericNew;
+ if (PyType_Ready (&disasm_part_object_type) < 0)
+ return -1;
+
+ if (gdb_pymodule_addobject (gdb_disassembler_module, "DisassemblerPart",
+ (PyObject *) &disasm_part_object_type) < 0)
+ return -1;
+
+ disasm_addr_part_object_type.tp_new = PyType_GenericNew;
+ if (PyType_Ready (&disasm_addr_part_object_type) < 0)
+ return -1;
+
+ if (gdb_pymodule_addobject (gdb_disassembler_module,
+ "DisassemblerAddressPart",
+ (PyObject *) &disasm_addr_part_object_type) < 0)
+ return -1;
+
+ disasm_text_part_object_type.tp_new = PyType_GenericNew;
+ if (PyType_Ready (&disasm_text_part_object_type) < 0)
+ return -1;
+
+ if (gdb_pymodule_addobject (gdb_disassembler_module,
+ "DisassemblerTextPart",
+ (PyObject *) &disasm_text_part_object_type) < 0)
+ return -1;
+
return 0;
}
@@ -1152,3 +1792,132 @@ PyTypeObject disasm_result_object_type = {
disasmpy_result_init, /* tp_init */
0, /* tp_alloc */
};
+
+/* Describe the gdb.disassembler.DisassemblerPart type. This type exists
+ only as an abstract base-class for the various part sub-types. The
+ init method for this type throws an error. As such we don't both to
+ provide a tp_repr method for this parent class. */
+
+PyTypeObject disasm_part_object_type = {
+ PyVarObject_HEAD_INIT (nullptr, 0)
+ "gdb.disassembler.DisassemblerPart", /*tp_name*/
+ sizeof (PyObject), /*tp_basicsize*/
+ 0, /*tp_itemsize*/
+ 0, /*tp_dealloc*/
+ 0, /*tp_print*/
+ 0, /*tp_getattr*/
+ 0, /*tp_setattr*/
+ 0, /*tp_compare*/
+ 0, /*tp_repr*/
+ 0, /*tp_as_number*/
+ 0, /*tp_as_sequence*/
+ 0, /*tp_as_mapping*/
+ 0, /*tp_hash */
+ 0, /*tp_call*/
+ 0, /*tp_str*/
+ 0, /*tp_getattro*/
+ 0, /*tp_setattro*/
+ 0, /*tp_as_buffer*/
+ Py_TPFLAGS_DEFAULT, /*tp_flags*/
+ "GDB object, representing part of a disassembled instruction", /* tp_doc */
+ 0, /* tp_traverse */
+ 0, /* tp_clear */
+ 0, /* tp_richcompare */
+ 0, /* tp_weaklistoffset */
+ 0, /* tp_iter */
+ 0, /* tp_iternext */
+ 0, /* tp_methods */
+ 0, /* tp_members */
+ 0, /* tp_getset */
+ 0, /* tp_base */
+ 0, /* tp_dict */
+ 0, /* tp_descr_get */
+ 0, /* tp_descr_set */
+ 0, /* tp_dictoffset */
+ disasmpy_part_init, /* tp_init */
+ 0, /* tp_alloc */
+};
+
+/* Describe the gdb.disassembler.DisassemblerTextPart type. */
+
+PyTypeObject disasm_text_part_object_type = {
+ PyVarObject_HEAD_INIT (nullptr, 0)
+ "gdb.disassembler.DisassemblerTextPart", /*tp_name*/
+ sizeof (disasm_text_part_object_type), /*tp_basicsize*/
+ 0, /*tp_itemsize*/
+ 0, /*tp_dealloc*/
+ 0, /*tp_print*/
+ 0, /*tp_getattr*/
+ 0, /*tp_setattr*/
+ 0, /*tp_compare*/
+ disasmpy_text_part_repr, /*tp_repr*/
+ 0, /*tp_as_number*/
+ 0, /*tp_as_sequence*/
+ 0, /*tp_as_mapping*/
+ 0, /*tp_hash */
+ 0, /*tp_call*/
+ disasmpy_text_part_str, /*tp_str*/
+ 0, /*tp_getattro*/
+ 0, /*tp_setattro*/
+ 0, /*tp_as_buffer*/
+ Py_TPFLAGS_DEFAULT, /*tp_flags*/
+ "GDB object, representing a text part of an instruction", /* tp_doc */
+ 0, /* tp_traverse */
+ 0, /* tp_clear */
+ 0, /* tp_richcompare */
+ 0, /* tp_weaklistoffset */
+ 0, /* tp_iter */
+ 0, /* tp_iternext */
+ 0, /* tp_methods */
+ 0, /* tp_members */
+ disasmpy_text_part_getset, /* tp_getset */
+ &disasm_part_object_type, /* tp_base */
+ 0, /* tp_dict */
+ 0, /* tp_descr_get */
+ 0, /* tp_descr_set */
+ 0, /* tp_dictoffset */
+ 0, /* tp_init */
+ 0, /* tp_alloc */
+};
+
+/* Describe the gdb.disassembler.DisassemblerAddressPart type. */
+
+PyTypeObject disasm_addr_part_object_type = {
+ PyVarObject_HEAD_INIT (nullptr, 0)
+ "gdb.disassembler.DisassemblerAddressPart", /*tp_name*/
+ sizeof (disasm_addr_part_object), /*tp_basicsize*/
+ 0, /*tp_itemsize*/
+ 0, /*tp_dealloc*/
+ 0, /*tp_print*/
+ 0, /*tp_getattr*/
+ 0, /*tp_setattr*/
+ 0, /*tp_compare*/
+ disasmpy_addr_part_repr, /*tp_repr*/
+ 0, /*tp_as_number*/
+ 0, /*tp_as_sequence*/
+ 0, /*tp_as_mapping*/
+ 0, /*tp_hash */
+ 0, /*tp_call*/
+ disasmpy_addr_part_str, /*tp_str*/
+ 0, /*tp_getattro*/
+ 0, /*tp_setattro*/
+ 0, /*tp_as_buffer*/
+ Py_TPFLAGS_DEFAULT, /*tp_flags*/
+ "GDB object, representing an address part of an instruction", /* tp_doc */
+ 0, /* tp_traverse */
+ 0, /* tp_clear */
+ 0, /* tp_richcompare */
+ 0, /* tp_weaklistoffset */
+ 0, /* tp_iter */
+ 0, /* tp_iternext */
+ 0, /* tp_methods */
+ 0, /* tp_members */
+ disasmpy_addr_part_getset, /* tp_getset */
+ &disasm_part_object_type, /* tp_base */
+ 0, /* tp_dict */
+ 0, /* tp_descr_get */
+ 0, /* tp_descr_set */
+ 0, /* tp_dictoffset */
+ 0, /* tp_init */
+ 0, /* tp_alloc */
+};