Age | Commit message (Collapse) | Author | Files | Lines |
|
Fix test-case gdb.server/file-transfer.exp for remote host using
gdb_remote_download and host_standard_output_file.
Tested on x86_64-linux.
|
|
When running test-case gdb.server/stop-reply-no-thread-multi.exp with
host+target board local-remote-host-native, I run into a time-out:
...
(gdb) PASS: gdb.server/stop-reply-no-thread-multi.exp: target-non-stop=off: \
to_disable=: disconnect
builtin_spawn /usr/bin/ssh -t -l vries 127.0.0.1 gdbserver --once \
localhost:2346 stop-reply-no-thread-multi^M
Process stop-reply-no-thread-multi created; pid = 32600^M
Listening on port 2346^M
set remote threads-packet off^M
FAIL: gdb.server/stop-reply-no-thread-multi.exp: target-non-stop=off: \
to_disable=: set remote threads-packet off (timeout)
...
This is due to this line in ${board}_spawn:
...
set board_info($board,fileid) $spawn_id
...
We have the following series of events:
- gdb is spawned, setting fileid
- a few gdb commands (set height etc) are send using fileid, arrive at gdb and
are successful
- gdbserver is spawned, overwriting fileid
- the next gdb command is sent using fileid, so it's send
to gdbserver instead of gdb, and we run into the timeout.
There is some notion of current gdb, tracked in both gdb_spawn_id and fileid
of the host board (see switch_gdb_spawn_id). And because the host and target
board are the same, spawning something on the target overwrites the fileid on
host, and consequently the current gdb.
Fix this by only setting fileid when spawning gdb.
Tested on x86_64-linux.
Now gdb.server/*.exp passes for host+target board local-remote-host-native,
except for file-transfer.exp.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29734
|
|
remote-gdbserver-on-localhost
With test-case gdb.server/non-existing-program.exp and native, I have reliably:
...
(gdb) builtin_spawn gdbserver stdio non-existing-program^M
stdin/stdout redirected^M
/bin/bash: line 0: exec: non-existing-program: not found^M
During startup program exited with code 127.^M
Exiting^M
PASS: gdb.server/non-existing-program.exp: gdbserver exits cleanly
...
But with target board remote-gdbserver-on-localhost I sometimes have:
...
(gdb) builtin_spawn /usr/bin/ssh -t -l remote-target localhost gdbserver \
stdio non-existing-program^M
stdin/stdout redirected^M
/bin/bash: line 0: exec: non-existing-program: not found^M
During startup program exited with code 127.^M
Exiting^M
Connection to localhost closed.^M^M
PASS: gdb.server/non-existing-program.exp: gdbserver exits cleanly
...
and sometimes the exact same output, but a FAIL instead.
Fix this by replacing "Exiting\r\n$" with "Exiting\r\n" in the regexps.
Tested on x86_64-linux.
|
|
Proc allow_rust_tests returns 0 when there's no rust compiler, but that gives
the wrong answer for gdb.rust/expr.exp, which doesn't require it.
Fix this by using can_compile rust in the test-cases that need it, and just
returning 1 in allow_rust_tests.
Tested on x86_64-linux.
|
|
If I deinstall the rust compiler, I get:
...
gdb compile failed, default_target_compile: Can't find rustc --color never.
UNTESTED: gdb.rust/watch.exp: failed to prepare
...
Fix this by adding can_compile rust, and using it in allow_rust_tests, such
that we have instead:
...
UNSUPPORTED: gdb.rust/watch.exp: require failed: allow_rust_tests
...
Since the rest of the code in allow_rust_tests is also about availability of
the rust compiler, move it to can_compile.
Tested on x86_64-linux.
|
|
With test-case gdb.rust/watch.exp and remote host I run into:
...
Executing on host: gcc watch.rs -g -lm -o watch (timeout = 300)
...
ld:watch.rs: file format not recognized; treating as linker script
ld:watch.rs:1: syntax error
...
UNTESTED: gdb.rust/watch.exp: failed to prepare
...
The problem is that find_rustc returns "" for remote host, so we fall back to gcc, which fails.
Fix this by returning 0 in allow_rust_tests for remote host.
Tested on x86_64-linux.
|
|
Fix gnat_runtime_has_debug_info for remote host by checking for
allow_ada_tests.
This fixes an error for test-case gdb.testsuite/gdb-caching-proc.exp and
remote host.
Tested on x86_64-linux.
|
|
Fix test-case gdb.stabs/exclfwd.exp for remote host using include_file.
Tested on x86_64-linux.
|
|
Fix test-case gdb.stabs/weird.exp for remote host by not using an absolute
destfile argument to gdb_remote_download, which doesn't work well with remotedir.
|
|
Fix test-case gdb.gdb/unittest.exp for remote host, by:
- disabling the completion tests if readline is not used, and
- not using with_gdb_cwd $dir for remote host (because it does
not support changing to ".").
Tested on x86_64-linux.
|
|
In do_self_tests we try to find out the location of the gdb to debug, which
will then be copied and renamed to xgdb.
In principle, the host board specifies the location of GDB, on host.
With remote host, we could upload that gdb from host to build/target, but we
would miss the data directory (which is listed as the reason to skip
do_self_tests for remote target).
We could fix that by instead taking the gdb from build instead, but that
wouldn't work with installed testing.
It seems easier to just skip this on remote host.
It could be made to work for the "[is_remote host] && [is_remote target]
&& host == target" scenario (see board local-remote-host-native.exp), but
that doesn't seem worth the effort.
Tested on x86_64-linux.
|
|
A user here at AdaCore noticed that, when debugging a certain program,
a stack frame reported line 34358, where it should have been line
99894.
After debugging a bit, I discovered:
(top) p (99894 & ~65536)
$60 = 34358
That line, symbol::line is too narrow.
This patch widens the member and changes all the uses that currently
use the narrower type.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
While working on 128-bit integer support, I found one spot in Ada that
needed a fix as well.
|
|
This changes gdb to use scalar arithmetic for expression evaluation.
I suspect this patch is not truly complete, as there may be code paths
that still don't correctly handle 128-bit integers. However, many
things do work now.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30190
|
|
Fix test-case gdb.asm/asm-source.exp for remote host using
host_standard_output_file and gdb_remote_download.
Tested on x86_64-linux.
|
|
Fix test-case gdb.dwarf2/imported-unit-bp-c.exp on remote by removing a
downloaded source file.
Tested on x86_64-linux.
|
|
Declare test-case gdb.dwarf2/gdb-add-index-symlink.exp unsupported for remote
host, because the current implementation of gdb_ensure_index doesn't support it.
Tested on x86_64-linux.
|
|
Fix test-case gdb.dwarf2/gdb-index-cxx.exp for remote host using
host_standard_output_file.
Tested on x86_64-linux.
|
|
Fix test-case gdb.dwarf2/enqueued-cu-base-addr.exp for remote host by using
$testfile instead $binfile.
Tested on x86_64-linux.
|
|
Fix test-case gdb.dwarf2/gdb-index.exp on remote host using
gdb_remote_download and host_standard_output_file.
Also declare the test-case unsupported with readnow.
Tested on x86_64-linux.
|
|
Fix test-case gdb.dwarf2/per-bfd-sharing.exp for remote host using
gdb_remote_download.
Likewise in a few other test-cases.
Tested on x86_64-linux.
|
|
For test-case gdb.base/index-cache.exp and remote host, this:
...
lassign [remote_exec host sh "-c \"rm $cache_dir/*.gdb-index\""] ret
...
gives us:
...
Executing on host: sh -c rm /tmp/tmp.m3L7m2AVkL/*.gdb-index (timeout = 300)
builtin_spawn -ignore SIGHUP sh -c rm /tmp/tmp.m3L7m2AVkL/*.gdb-index^M
rm: missing operand^M
Try 'rm --help' for more information.^M
FAIL: gdb.dwarf2/per-bfd-sharing.exp: couldn't remove files in temporary cache dir
...
Fix this using quote_for_host. Likewise in gdb.dwarf2/per-bfd-sharing.exp.
Tested on x86_64-linux.
|
|
A few test-cases in gdb.dwarf2 use something like:
...
additional_flags=\"-DFOO=BAR + 10\"
...
which doesn't work on remote host.
Fix this by introducing a new proc quote_for_host that also works for remote
host, such that we have:
...
additional_flags=[quote_for_host -DFOO=BAR + 10]
...
Tested on x86_64-linux.
|
|
Proc have_index is mostly used with $binfile, which gives problems
for remote host.
Fix this by using "file tail" on the proc argument.
Tested on x86_64-linux.
|
|
|
|
For test-case gdb.dlang/dlang-start.exp, I run into:
...
UNSUPPORTED: gdb.dlang/dlang-start.exp: require failed: can_compile d
...
My distro has no support for gdc, but I'd like to have the test-case
running and passing, so let's rewrite the test-case using dwarf assembly
and add it alongside (rather than replacing it, because it's good to use
actual compiler output if we have it available).
My distro does have a package providing dmd, so let's mimic that debug info in
the dwarf assembly. This gives us:
...
(gdb) start ^M
Temporary breakpoint 1 at 0x4004ab^M
Starting program: dlang-start-2 ^M
^M
Temporary breakpoint 1, 0x00000000004004ab in _Dmain ()^M
...
The "_Dmain ()" should probably be "D main", I've filed PR30276 about that.
Also add a "show language" to check that we automatically set the language
correctly to D.
Note that the dwarf assembly also describes main, otherwise the test-case
doesn't function as regression test for commit 47fe57c9281 ("Fix "start" for
D, Rust, etc").
Tested on x86_64-linux.
|
|
On openSUSE Leap 15.4, I get:
...
Running gdb.dlang/dlang-start.exp ...
gdb compile failed, default_target_compile: Can't find gdc.
UNTESTED: gdb.dlang/dlang-start.exp: failed to prepare
...
Fix this by:
- introducing a new proc can_compile, and
- requiring "can_compile d" in the test-case,
such that I have instead:
...
Running gdb.dlang/dlang-start.exp ...
UNSUPPORTED: gdb.dlang/dlang-start.exp: require failed: can_compile d
...
Tested on x86_64-linux, on openSUSE Leap 15.4 and Fedora 37.
|
|
While trying to use gdb_can_simple_compile with a d program, I ran into:
...
/data/vries/gdb/f37/build/gdb/testsuite/temp/105856/can_compile_d-105856.d: \
error: module 'can_compile_d-105856' has non-identifier characters in \
filename, use module declaration instead
...
The d compiler has a problem with the filename can_compile_d-105856.d, which
contains the pid. The pid is added by gdb_simple_compile:
...
set obj [standard_temp_file $name-[pid].$postfix]
...
but it's unnecessary because standard_temp_file already uses the pid.
Fix this by removing "[pid]" in all calls to standard_temp_file.
Tested on x86_64-linux.
|
|
Simon pointed out that with gdb.dap/*.exp and target board
native-gdbserver, we run into problems.
I see for each test-case:
...
+++ run
Traceback (most recent call last):
File "startup.py", line 146, in exec_and_log
output = gdb.execute(cmd, from_tty=True, to_string=True)
gdb.error: Don't know how to run. Try "help target".
...
Likewise with target board native-extended-gdbserver.
Fix this by:
- adding a new proc allow_dap_tests,
- using it in all the gdb.dap tests, and
- bailing out if GDBFLAGS/INTERNAL_GDBFLAGS contains
"set auto-connect-native-target off".
Tested on x86_64-linux.
Reported-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The type-allocation patches introduced a small regression that was
picked up by the AdaCore internal test suite. Previously, the name of
a range type was preserved by resolve_dynamic_range, but now it is
not. This patch changes this code to preserve the name.
Reviewed-By: Bruno Larsen <blarsen@redhat.com>
|
|
The evaluate command supports a "context" parameter which tells the
adapter the context in which an evaluation occurs. One of the
supported values is "repl", which we took to mean evaluation of a gdb
command. That is what this patch implements.
Note that some gdb commands probably will not work correctly with the
rest of the protocol. For example if the user types "continue",
confusion may result.
This patch requires the earlier patch to fix up scopes in DAP.
|
|
Since commit 6d263fe46e0 ("Avoid bad breakpoints with --gc-sections"), there
was a silent regression on openSUSE Leap 15.4 for test-case
gdb.cp/m-static.exp, from:
...
(gdb) info variable everywhere^M
All variables matching regular expression "everywhere":^M
^M
File /home/vries/tmp.local-remote-host-native/m-static.h:^M
8: const int gnu_obj_4::everywhere;^M
(gdb)
...
to:
...
(gdb) info variable everywhere^M
All variables matching regular expression "everywhere":^M
^M
File /data/vries/gdb/src/gdb/testsuite/gdb.cp/m-static.h:^M
8: const int gnu_obj_4::everywhere;^M
^M
File /data/vries/gdb/src/gdb/testsuite/gdb.cp/m-static1.cc:^M
8: const int gnu_obj_4::everywhere;^M
(gdb)
...
Another regression was found due to that commit, and it was fixed in commit
99d679e7b30 ("[gdb/symtab] Fix "file index out of range" complaint") by
limiting the scope of the fix in the original commit.
Fix this regression by yet further limiting the scope of that fix, making sure
that this bit in dwarf_decode_lines is executed again for m-static1.cc:
...
/* Make sure a symtab is created for every file, even files
which contain only variables (i.e. no code with associated
line numbers). */
...
Tested on x86_64-linux.
PR symtab/30265
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30265
|
|
In proc mi_expect_stop there's a proc argument reason that's handled like so:
...
set r "reason=\"$reason\","
...
That's fine for say:
...
set reason "foo"
...
for which this evaluates to:
...
set r "reason=\"foo\","
...
But there are more complex uses, for instance:
...
set reason "breakpoint-hit\",disp=\"keep\",bkptno=\"$decimal"
...
which evaluates to:
...
set r "\"breakpoint-hit\",disp=\"keep\",bkptno=\"$decimal\""
...
Note how in this reason argument, the first two '\"' seems to form a pair
surrounding ',disp=', which is not the case, which is confusing.
Fix this by only adding the quotes in mi_expect_stop if the string doesn't
already contain quotes, such that we have the more readable:
...
set reason "\"breakpoint-hit\",disp=\"keep\",bkptno=\"$decimal\""
...
Tested on x86_64-linux.
|
|
In commit 722c4596034 ("[gdb/testsuite] Fix gdb.cp/*.exp for remote host"), I
needed to change ".*/" into "(.*/)?" in:
...
gdb_test "info variable everywhere" \
"File .*/m-static\[.\]h.*const int gnu_obj_4::everywhere;"
...
However, due to the fact that I got this output:
...
(gdb) info variable everywhere^M
All variables matching regular expression "everywhere":^M
^M
File /data/vries/gdb/src/gdb/testsuite/gdb.cp/m-static.h:^M
8: const int gnu_obj_4::everywhere;^M
^M
File /data/vries/gdb/src/gdb/testsuite/gdb.cp/m-static1.cc:^M
8: const int gnu_obj_4::everywhere;^M
...
I decided to make the matching somewhat stricter, to make sure that the two
matched lines were subsequent.
The commit turned out to be more strict than intended, and caused a regression
on Ubuntu 20.04, where the output was instead:
...
(gdb) info variable everywhere^M
All variables matching regular expression "everywhere":^M
^M
File /data/vries/gdb/src/gdb/testsuite/gdb.cp/m-static.h:^M
8: const int gnu_obj_4::everywhere;^M
...
At that point I realized I'm looking at a bug (filed as PR symtab/30265),
which manifests on openSUSE Leap 15.4 for native and readnow, and on Ubuntu
20.04 for readnow, but not for native.
Before my commit, the test-case passed whether the bug manifested or not.
After my commit, the test-case only passed when the bug manifested.
Fix the test-case regression by reverting to the situation before the commit:
pass whether the bug manifests or not. We could add an xfail for the PR, but
I'm expecting a fix soon, so that doesn't look worth the effort.
Tested on x86_64-linux, both on openSUSE Leap 15.4 and Ubuntu 20.04, both with
native and readnow.
Reported-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Simon reported that doing:
...
$ while make check-parallel TESTS='gdb.opencl/*.exp' -j 100; do true; done
...
could run into:
...
ERROR: remote_download to target of \
/data/vries/gdb/src/gdb/testsuite/lib/opencl_kernel.cl to opencl_kernel.cl: \
cp: cannot create regular file 'opencl_kernel.cl': File exists
...
Fix this by using gdb_remote_download (instead of plain remote_download) in
allow_opencl_test, which takes care of:
- downloading to a location which is safe for parallel testing, by
using standard_output_file, and
- cleaning up the downloaded file, meaning we can remove the corresponding
"remote_file target delete ${clprogram}" lines in allow_opencl_test.
Tested on x86_64-linux.
Reported-by: Simon Marchi <simon.marchi@efficios.com>
|
|
change
Commit 904d9b02a185 ("gdb: make "maintenance info line-table" show
relocated addresses again") changed the format of that command, but
failed to adjust some test cases that relied on it. This patch fixes
it.
The failures fixed are:
FAIL: gdb.base/maint.exp: maint info line-table w/o a file name
FAIL: gdb.dwarf2/dw2-out-of-range-end-of-seq.exp: END with address 1 eliminated
FAIL: gdb.dwarf2/dw2-ranges-base.exp: count END markers in line table
Change-Id: I946580d5e100f1beeac99a9e90d7819c6bb4ac6c
|
|
Fix test-case gdb.cp/cp-relocate.exp for remote host using
gdb_remote_download.
Tested on x86_64-linux.
|
|
When running test-cases gdb.cp/annota{2,3}.exp with target board
native-extended-gdbserver, we run into a few FAILs, due to the test-cases
trying to match inferior output together with gdb output.
Fix this by ignoring the inferior output in this case.
Tested on x86_64-linux.
|
|
Fix a few test-cases in gdb.cp/*.exp for remote host using new proc
include_file.
Tested on x86_64-linux.
|
|
Commit 1acc9dca423f ("Change linetables to be objfile-independent")
changed "maintenance info line-table" to print unrelocated addresses
instead of relocated. This breaks a few tests on systems where that
matters. The ones I see are:
Running /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.base/consecutive.exp ...
FAIL: gdb.base/consecutive.exp: stopped at bp, 2nd instr (missing hex prefix)
Running /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.base/async.exp ...
FAIL: gdb.base/async.exp: stepi&
FAIL: gdb.base/async.exp: nexti&
FAIL: gdb.base/async.exp: finish&
These tests run "maintenance info line-table" to record the address of
some lines, and then use these addresses in expected patterns. It
therefore expects these addresses to match the runtime addresses,
therefore the relocated addresses.
Add back the relocated addresses, next to the unrelocated addresses,
like so:
INDEX LINE REL-ADDRESS UNREL-ADDRESS IS-STMT PROLOGUE-END
0 6 0x0000555555555119 0x0000000000001119 Y
1 7 0x000055555555511d 0x000000000000111d Y
2 8 0x0000555555555123 0x0000000000001123 Y
3 END 0x0000555555555125 0x0000000000001125 Y
The unrelocated addresses can always be useful trying to map this
information with a DWARF info dump.
Adjust the is_stmt_addresses proc in the testsuite to match the new
output.
Change-Id: I59558f167e13e63421c9e0f2cad192e7c95c10cf
|
|
Make sure the result of each remote_exec in gdb/testsuite/boards/*.exp is
checked.
Tested on x86_64-linux.
|
|
In a recent commit I forgot to add a double quote before chmod here:
...
remote_exec build $rsh_cmd chmod go-rx ."
...
Fix it by adding the missing double quote.
|
|
Looking at the implementation of ${board}_file in remote-stdio-gdbserver.exp,
I don't see a relevant difference with the implementation of standard_file
in dejagnu.
Simplify the board by removing ${board}_file.
Tested on x86_64-linux, by running gdb.testsuite/board-sanity.exp.
|
|
Some boards in gdb/testsuite/boards use the hardcoded ipv4 address "127.0.0.1".
Use instead "localhost".
Tested on x86_64-linux.
|
|
Fix test-case gdb.xml/tdesc-regs.exp for remote host by using appropriate
filenames.
Tested on x86_64-linux.
|
|
Fix test-case gdb.xml/tdesc-reload.exp for remote host by using appropriate
filenames.
Tested on x86_64-linux.
|
|
In commit ff581559f9d ("[gdb/testsuite] Add gdb.testsuite/board-sanity.exp") I
removed handling of HOST_DIR in local-remote-host-native.exp to fix FAILs
in test-case gdb.testsuite/board-sanity.exp.
Reintroduce handling of HOST_DIR using remotedir, now that using remotedir for
a host board no longer make compilation fail due to commit 80d6c79866f
("[gdb/testsuite] Handle remotedir in remote_upload").
This fixes an XFAIL in gdb.testsuite/board-sanity.exp, introduced in commit
3741934fdb0 ("[gdb/testsuite] Set remotedir by default in some boards").
Tested on x86_64-linux.
|
|
The type allocation changes introduced a failure in python-helper.exp
that I did not notice. The bug is that, with these patches,
arch-allocated integer types have a TYPE_SPECIFIC_INT object attached.
This patch updates the test to allow this.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30253
|
|
Dejagnu's remotedir implementation has support in remote_exec and
remote_download, but not remote_upload.
Consider the following scenario:
- downloading an executable to target,
- running it,
- uploading a file produced by the executable
while assuming remote target user remote-target with homedir
/home/remote-target and remotedir set to /home/remote-target/tmp.
Concretely, it looks like this:
...
# binfile == "$outputs/gdb.abc/a.out"
set target_binfile [remote_download target $binfile]
# target_binfile == "/home/remote-target/tmp/a.out"
remote_exec target $target_binfile
# Running $target_binfile produced /home/remote-target/tmp/result.txt.
set result [remote_upload target /home/remote-target/tmp/result.txt \
$outputs/gdb.abc/result.txt]
# result == $outputs/gdb.abc/result.txt.
...
Add a remote_upload implementation that also handles remotedir in lib/gdb.exp,
overriding dejagnu's remote_upload, such that we can simplify the
remote_upload call to:
...
set result [remote_upload target result.txt $outputs/gdb.abc/result.txt]
...
Tested on x86_64-linux.
PR testsuite/30250
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30250
|
|
In some cases GDB will fail when attempting to complete a command that
involves a rust symbol, the failure can manifest as a crash.
The problem is caused by the completion_match_for_lcd object being
left containing invalid data during calls to cp_symbol_name_matches_1.
The first question to address is why we are calling a C++ support
function when handling a rust symbol. That's due to GDB's auto
language detection for msymbols, in some cases GDB can't tell if a
symbol is a rust symbol, or a C++ symbol.
The test application contains symbols for functions which are
statically linked in from various rust support libraries. There's no
DWARF for these symbols, so all GDB has is the msymbols built from the
ELF symbol table.
Here's the problematic symbol that leads to our crash:
mangled: _ZN4core3str21_$LT$impl$u20$str$GT$5parse17h5111d2d6a50d22bdE
demangled: core::str::<impl str>::parse
As an msymbol this is initially created with language auto, then GDB
eventually calls symbol_find_demangled_name, which loops over all
languages calling language_defn::sniff_from_mangled_name, the first
language that can demangle the symbol gets assigned as the language
for that symbol.
Unfortunately, there's overlap in the mangled symbol names,
some (legacy) rust symbols can be demangled as both rust and C++, see
cplus_demangle in libiberty/cplus-dem.c where this is mentioned.
And so, because we check the C++ language before we check for rust,
then the msymbol is (incorrectly) given the C++ language.
Now it's true that is some cases we might be able to figure out that a
demangled symbol is not actually a valid C++ symbol, for example, in
our case, the construct '::<impl str>::' is not, I believe, valid in a
C++ symbol, we could look for ':<' and '>:' and refuse to accept this
as a C++ symbol.
However, I'm not sure it is always possible to tell that a demangled
symbol is rust or C++, so, I think, we have to accept that some times
we will get this language detection wrong.
If we accept that we can't fix the symbol language detection 100% of
the time, then we should make sure that GDB doesn't crash when it gets
the language wrong, that is what this commit addresses.
In our test case the user tries to complete a symbol name like this:
(gdb) complete break pars
This results in GDB trying to find all symbols that match 'pars',
eventually we consider our problematic symbol, and we end up with a
call stack that looks like this:
#0 0x0000000000f3c6bd in strncmp_iw_with_mode
#1 0x0000000000706d8d in cp_symbol_name_matches_1
#2 0x0000000000706fa4 in cp_symbol_name_matches
#3 0x0000000000df3c45 in compare_symbol_name
#4 0x0000000000df3c91 in completion_list_add_name
#5 0x0000000000df3f1d in completion_list_add_msymbol
#6 0x0000000000df4c94 in default_collect_symbol_completion_matches_break_on
#7 0x0000000000658c08 in language_defn::collect_symbol_completion_matches
#8 0x0000000000df54c9 in collect_symbol_completion_matches
#9 0x00000000009d98fb in linespec_complete_function
#10 0x00000000009d99f0 in complete_linespec_component
#11 0x00000000009da200 in linespec_complete
#12 0x00000000006e4132 in complete_address_and_linespec_locations
#13 0x00000000006e4ac3 in location_completer
In cp_symbol_name_matches_1 we enter a loop, this loop repeatedly
tries to match the demangled problematic symbol name against the user
supplied text ('pars'). Each time around the loop another component
of the symbol name is stripped off, thus, we check 'pars' against
these options:
core::str::<impl str>::parse
str::<impl str>::parse
<impl str>::parse
parse
As soon as we get a match the cp_symbol_name_matches_1 exits its loop
and returns. In our case, when we're looking for 'pars', the match
occurs on the last iteration of the loop, when we are comparing to
'parse'.
Now the problem here is that cp_symbol_name_matches_1 uses the
strncmp_iw_with_mode, and inside strncmp_iw_with_mode we allow for
skipping over template parameters. This allows GDB to match the
symbol name 'foo<int>(int,int)' if the user supplies 'foo(int,'.
Inside strncmp_iw_with_mode GDB will record any template arguments
that it has skipped over inside the completion_match_for_lcd object
that is passed in as an argument.
And so, when GDB tries to match against '<impl str>::parse', the first
thing it sees is '<impl str>', GDB assumes this is a template argument
and records this as a skipped region within the
completion_match_for_lcd object. After '<impl str>' GDB sees a ':'
character, which doesn't match with the 'pars' the user supplied, so
strncmp_iw_with_mode returns a value indicating a non-match. GDB then
removes the '<impl str>' component from the symbol name and tries
again, this time comparing to 'parse', which does match.
Having found a match, then in cp_symbol_name_matches_1 we record the
match string, and the full symbol name within the
completion_match_result object, and return.
The problem here is that the skipped region, the '<impl str>' that we
recorded in the penultimate loop iteration was never discarded, its
still there in our returned result.
If we look at what the pointers held in the completion_match_result
that cp_symbol_name_matches_1 returns, this is what we see:
core::str::<impl str>::parse
| \________/ |
| | '--- completion match string
| '---skip range
'--- full symbol name
When GDB calls completion_match_for_lcd::finish, GDB tries to create a
string using the completion match string (parse), but excluding the
skip range, as the stored skip range is before the start of the
completion match string, then GDB tries to do some weird string
creation, which will cause GDB to crash.
The reason we don't often see this problem in C++ is that for C++
symbols there is always some non-template text before the template
argument. This non-template text means GDB is likely to either match
the symbol, or reject the symbol without storing a skip range.
However, notice, I did say, we don't often see this problem. Once I
understood the issue, I was able to reproduce the crash using a pure
C++ example:
template<typename S>
struct foo
{
template<typename T>
foo (int p1, T a)
{
s = 0;
}
S s;
};
int
main ()
{
foo<int> obj (2.3, 0);
return 0;
}
Then in GDB:
(gdb) complete break foo(int
The problem here is that the C++ symbol for the constructor looks like
this:
foo<int>::foo<double>(int, double)
When GDB enters cp_symbol_name_matches_1 the symbols it examines are:
foo<int>::foo<double>(int, double)
foo<double>(int, double)
The first iteration of the loop will match the 'foo', then add the
'<int>' template argument will be added as a skip range. When GDB
find the ':' after the '<int>' the first iteration of the loop fails
to match, GDB removes the 'foo<int>::' component, and starts the
second iteration of the loop.
Again, GDB matches the 'foo', and now adds '<double>' as a skip
region. After that the '(int' successfully matches, and so the second
iteration of the loop succeeds, but, once again we left the '<int>' in
place as a skip region, even though this occurs before the start of
our match string, and this will cause GDB to crash.
This problem was reported to the mailing list, and a solution
discussed in this thread:
https://sourceware.org/pipermail/gdb-patches/2023-January/195166.html
The solution proposed here is similar to one proposed by the original
bug reported, but implemented in a different location within GDB.
Instead of placing the fix in strncmp_iw_with_mode, I place the fix in
cp_symbol_name_matches_1. I believe this is a better location as it
is this function that implements the loop, and it is this loop, which
repeatedly calls strncmp_iw_with_mode, that should be resetting the
result object state (I believe).
What I have done is add an assert to strncmp_iw_with_mode that the
incoming result object is empty.
I've also added some other asserts in related code, in
completion_match_for_lcd::mark_ignored_range, I make some basic
assertions about the incoming range pointers, and in
completion_match_for_lcd::finish I also make some assertions about how
the skip ranges relate to the match pointer.
There's two new tests. The original rust example that was used in the
initial bug report, and a C++ test. The rust example depends on which
symbols are pulled in from the rust libraries, so it is possible that,
at some future date, the problematic symbol will disappear from this
test program. The C++ test should be more reliable, as this only
depends on symbols from within the C++ source code.
Since I originally posted this patch to the mailing list, the
following patch has been merged:
commit 6e7eef72164c00d6a5a7b0bce9fa01f5481f33cb
Date: Sun Mar 19 09:13:10 2023 -0600
Use rust_demangle to fix a crash
This solves the problem of a rust symbol ending up in the C++ specific
code by changing the order languages are sorted. However, this new
commit doesn't address the issue in the C++ code which was fixed with
this commit.
Given that the C++ issue is real, and has a reproducer, I'm still
going to merge this fix. I've left the discussion of rust in this
commit message as I originally wrote it, but it should be read within
the context of GDB prior to commit 6e7eef72164c00d6a5a7.
Co-Authored-By: Zheng Zhan <zzlossdev@163.com>
|