author Andrew Burgess <aburgess@redhat.com> 2021-07-13 14:44:27 -0400
committer Andrew Burgess <aburgess@redhat.com> 2023-06-05 13:25:08 +0100
commit baab375361c365afee2577c94cbbd3fdd443d6da
tree 971366ab82a21e507d43b242a78bfa9db7526e1f
parent f4afd6cb1b72760d1c8af4bc82c74c289aa1ecf7
gdb: building inferior strings from within GDB
History Of This Patch
=====================

This commit aims to address PR gdb/21699. There have now been a
couple of attempts to fix this issue. Simon originally posted two
patches back in 2021:

  https://sourceware.org/pipermail/gdb-patches/2021-July/180894.html
  https://sourceware.org/pipermail/gdb-patches/2021-July/180896.html

Pedro then posted a version of his own:

  https://sourceware.org/pipermail/gdb-patches/2021-July/180970.html

After this the conversation halted. Then in 2023 I (Andrew) also took
a look at this bug and posted two versions:

  https://sourceware.org/pipermail/gdb-patches/2023-April/198570.html
  https://sourceware.org/pipermail/gdb-patches/2023-April/198680.html

The approach taken in my first patch was pretty similar to what Simon
originally posted back in 2021. My second attempt was only a slight
variation on the first.

Pedro then pointed out his older patch, and so we arrive at this
patch. The GDB changes here are mostly Pedro's work, but updated by
me (Andrew); any mistakes are mine.

The tests here are a combination of everyone's work, and the commit
message is new, but copies bits from everyone's earlier work.

Problem Description
===================

Bug PR gdb/21699 makes the observation that using $_as_string with
GDB's printf can cause GDB to print unexpected data from the
inferior. The reproducer is pretty simple:

  #include <stddef.h>
  #include <string.h>

  static char arena[100];

  /* Override malloc() so value_coerce_to_target() gets a known
     pointer, and we know we'll see an error if $_as_string() gives
     a string that isn't null terminated. */
  void
  *malloc (size_t size)
  {
    memset (arena, 'x', sizeof (arena));
    if (size > sizeof (arena))
      return NULL;
    return arena;
  }

  int
  main ()
  {
    return 0;
  }

And then in a GDB session:

  $ gdb -q test
  Reading symbols from /tmp/test...
  (gdb) start
  Temporary breakpoint 1 at 0x4004c8: file test.c, line 17.
  Starting program: /tmp/test

  Temporary breakpoint 1, main () at test.c:17
  17        return 0;
  (gdb) printf "%s\n", $_as_string("hello")
  "hello"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  (gdb) quit

The problem above is caused by how value_cstring is used within
py-value.c, but once we understand the issue it turns out that
value_cstring is used in an unexpected way in many places within GDB.

Within py-value.c we have a null-terminated C-style string. We then
pass a pointer to this string, along with the length of this string
(so not including the null-character), to value_cstring.

In value_cstring GDB allocates an array value of the given character
type, and copies in the requested number of characters. However,
value_cstring does not add a null-character of its own. This means
that the value created by calling value_cstring is only
null-terminated if the null-character is included in the passed-in
length. In py-value.c this is not the case, and indeed, in most uses
of value_cstring, this is not the case.

When GDB tries to print one of these strings the value contents are
pushed to the inferior, and then read back as a C-style string, that
is, GDB reads inferior memory until it finds a null-terminator. For
the py-value.c case, no null-terminator is pushed into the inferior,
so GDB will continue reading inferior memory until a null-terminator
is found, with unpredictable results.

Patch Description
=================

The first thing this patch does is better define what the arguments
for the two functions value_cstring and value_string should represent.
The comments in the header file are updated to describe whether the
length argument should, or should not, include a null-character.
Also, the data argument is changed to type gdb_byte. The functions as
they currently exist will handle wide-characters, in which case more
than one 'char' would be needed for each character. As such, using
gdb_byte seems to make more sense.

To avoid adding casts throughout GDB, I've also added an overload that
still takes a 'char *', but asserts that the character type being used
is of size '1'.

The value_cstring function is now responsible for adding a null
character at the end of the string value it creates.

However, once we start looking at how value_cstring is used, we
realise there's another, related, problem. Not every language's
strings are null terminated. Fortran and Ada strings, for example,
are just an array of characters. GDB already has the function
value_string which can be used to create such values.

Consider this example using current GDB:

  (gdb) set language ada
  (gdb) p $_gdb_setting("arch")
  $1 = (97, 117, 116, 111)
  (gdb) ptype $
  type = array (1 .. 4) of char
  (gdb) p $_gdb_maint_setting("test-settings string")
  $2 = (0)
  (gdb) ptype $
  type = array (1 .. 1) of char

This shows two problems. First, the $_gdb_setting and
$_gdb_maint_setting functions are calling value_cstring using the
builtin_char character type, rather than a language appropriate type.
Second, the two calls are inconsistent about the null character. In
the first call, the 'arch' case, the value_cstring call doesn't
include the null character, so the returned array only contains the
expected characters. But in the $_gdb_maint_setting example we do end
up including the null-character, even though this is not expected for
Ada strings.

This commit adds a new language method language_defn::value_string.
This function takes a pointer and length and creates a language
appropriate value that represents the string. For C, C++, etc this
will be a null-terminated string (by calling value_cstring), and for
Fortran and Ada this can be a bounded array of characters with no null
terminator. Additionally, this new language_defn::value_string
function is responsible for selecting a language appropriate character
type.

After this commit the only calls to value_cstring are from the C
expression evaluator and from the default language_defn::value_string.
And the only calls to value_string are from Fortran, Ada, and
Objective-C related code.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21699
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Co-Authored-By: Pedro Alves <pedro@palves.net>
Approved-By: Simon Marchi <simon.marchi@efficios.com>
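
The failure mode described in the commit message can be modelled outside GDB. The sketch below is not GDB code: fake_inferior, old_style_cstring and new_style_cstring are hypothetical names invented for illustration, and the "inferior memory" is just a buffer pre-filled with 'x', mirroring the reproducer's arena. It only aims to show why a string value pushed without a terminator reads back with trailing garbage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for inferior memory: 100 bytes of 'x' followed by
// a single NUL, so that reads in this model always stop somewhere.
struct fake_inferior
{
  std::vector<char> memory;

  fake_inferior () : memory (101, 'x') { memory[100] = '\0'; }

  // Copy a value's contents to the start of memory, as pushing a value
  // to the inferior would.
  void push (const std::vector<char> &contents)
  {
    assert (contents.size () < memory.size ());
    std::memcpy (memory.data (), contents.data (), contents.size ());
  }

  // Read back a C-style string: scan until the first NUL byte.
  std::string read_c_string () const
  {
    return std::string (memory.data ());
  }
};

// Models the behaviour before the patch: copy exactly LEN characters and
// add nothing, so the result is terminated only if LEN covered a NUL.
std::vector<char> old_style_cstring (const char *ptr, std::size_t len)
{
  return std::vector<char> (ptr, ptr + len);
}

// Models the behaviour after the patch: COUNT excludes the terminator,
// and the function appends one itself.
std::vector<char> new_style_cstring (const char *ptr, std::size_t count)
{
  std::vector<char> contents (ptr, ptr + count);
  contents.push_back ('\0');
  return contents;
}
```

Pushing old_style_cstring ("hello", 5) into a fresh fake_inferior and reading it back yields "hello" followed by the remaining 'x' bytes, while new_style_cstring ("hello", 5) reads back as just "hello".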
Diffstat (limited to 'gdb/value.h')
-rw-r--r-- gdb/value.h 41
1 files changed, 39 insertions, 2 deletions
diff --git a/gdb/value.h b/gdb/value.h
index 508367a..512d05e 100644
--- a/gdb/value.h
+++ b/gdb/value.h
@@ -1183,11 +1183,48 @@ class scoped_value_mark
   bool m_freed = false;
 };
 
-extern struct value *value_cstring (const char *ptr, ssize_t len,
+/* Create not_lval value representing a NULL-terminated C string. The
+   resulting value has type TYPE_CODE_ARRAY. The string passed in should
+   not include embedded null characters.
+
+   PTR points to the string data; COUNT is number of characters (does
+   not include the NULL terminator) pointed to by PTR, each character is of
+   type (and size of) CHAR_TYPE. */
+
+extern struct value *value_cstring (const gdb_byte *ptr, ssize_t count,
                                     struct type *char_type);
-extern struct value *value_string (const char *ptr, ssize_t len,
+
+/* Specialisation of value_cstring above. In this case PTR points to
+   single byte characters. CHAR_TYPE must have a length of 1. */
+inline struct value *value_cstring (const char *ptr, ssize_t count,
+                                    struct type *char_type)
+{
+  gdb_assert (char_type->length () == 1);
+  return value_cstring ((const gdb_byte *) ptr, count, char_type);
+}
+
+/* Create a not_lval value with type TYPE_CODE_STRING, the resulting value
+   has type TYPE_CODE_STRING.
+
+   PTR points to the string data; COUNT is number of characters pointed to
+   by PTR, each character has the type (and size of) CHAR_TYPE.
+
+   Note that string types are like array of char types with a lower bound
+   defined by the language (usually zero or one). Also the string may
+   contain embedded null characters. */
+
+extern struct value *value_string (const gdb_byte *ptr, ssize_t count,
                                    struct type *char_type);
 
+/* Specialisation of value_string above. In this case PTR points to
+   single byte characters. CHAR_TYPE must have a length of 1. */
+inline struct value *value_string (const char *ptr, ssize_t count,
+                                   struct type *char_type)
+{
+  gdb_assert (char_type->length () == 1);
+  return value_string ((const gdb_byte *) ptr, count, char_type);
+}
+
 extern struct value *value_array (int lowbound, int highbound,
                                   struct value **elemvec);
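
The counting convention in the new comments (COUNT is in characters, not bytes, and excludes the terminator) together with the single-byte overload can be sketched in isolation. This is a standalone model, not GDB's implementation: byte_t, char_type_model and make_cstring are hypothetical names, and the choice to append a zeroed character of the full character width is an assumption of the model.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using byte_t = std::uint8_t;   // stand-in for gdb_byte

// Hypothetical stand-in for a character type; only its size matters here.
struct char_type_model
{
  std::size_t length;   // size of one character in bytes
};

// Primary entry point: COUNT is a number of characters, not bytes, and
// excludes the terminator.  One zeroed character is appended.
std::vector<byte_t>
make_cstring (const byte_t *ptr, std::size_t count,
              const char_type_model &type)
{
  std::size_t nbytes = count * type.length;
  std::vector<byte_t> contents (ptr, ptr + nbytes);
  contents.resize (nbytes + type.length, 0);   // terminator, full width
  return contents;
}

// Convenience overload for callers holding plain 'char' data.  It is only
// valid for single-byte character types, which it asserts before
// forwarding to the byte-based version.
std::vector<byte_t>
make_cstring (const char *ptr, std::size_t count,
              const char_type_model &type)
{
  assert (type.length == 1);
  return make_cstring (reinterpret_cast<const byte_t *> (ptr), count, type);
}
```

With a one-byte character type, make_cstring ("hi", 2, type) produces three bytes; with a four-byte character type and two characters of data, the byte-based overload produces twelve bytes, the last four being zero.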