Age | Commit message (Collapse) | Author | Files | Lines |
|
The use of user namespaces is required for normal users to use mount
namespaces. Consider trying this as an unprivileged user:
$ unshare --mount /bin/true
unshare: unshare failed: Operation not permitted
The problem here is that an unprivileged user doesn't have the
required permissions to create a new mount namespace. If, instead, we
do this:
$ unshare --mount --map-root-user /bin/true
then this will succeed. The new option causes unshare to create a
user namespace in which the unprivileged user is mapped to UID/GID 0,
and so gains all privileges (inside the namespace), the user is then
able to create the mount namespace as required.
So, how does this relate to GDB?
When a user attaches to a process running in a separate mount
namespace, GDB makes use of a separate helper process (see
linux_mntns_get_helper in nat/linux-namespaces.c), which will then use
the `setns` function to enter (or try to enter) the mount namespace of
the process GDB is attaching too. The helper process will then handle
file I/O requests received from GDB, and return the results back to
GDB, this allows GDB to access files within the mount namespace.
The problem here is that, switching to a mount namespace requires that
a process hold CAP_SYS_CHROOT and CAP_SYS_ADMIN capabilities within
its user namespace (actually it's a little more complex, see 'man 2
setns'). Assuming GDB is running as an unprivileged user, then GDB
will not have the required permissions.
However, if GDB enters the user namespace that the `unshare` process
created, then the current user will be mapped to UID/GID 0, and will
have the required permissions.
And so, this patch extends linux_mntns_access_fs (in
nat/linux-namespace.c) to first try and switch to the user namespace
of the inferior before trying to switch to the mount namespace. If
the inferior does have a user namespace, and does have elevated
privileges within that namespace, then this first switch by GDB will
mean that the second step, into the mount namespace, will succeed.
If there is no user namespace, or the inferior doesn't have elevated
privileges within the user namespace, then the switch into the mount
namespace will fail, just as it currently does, and the user will need
to give elevated privileges to GDB via some other mechanism (e.g. run
as root).
This work was originally posted here:
https://inbox.sourceware.org/gdb-patches/20230321120126.1418012-1-benjamin@sipsolutions.net
I (Andrew Burgess) have made some cleanups to the code to comply with
GDB's coding standard, and the test is entirely mine. This commit
message is also entirely mine -- the original message was very terse
and required the reader to understand how the various namespaces
work and interact. The above is my attempt to document what I now
understand about the problem being fixed.
I've left the original author in place as the core of the GDB change
itself is largely as originally presented, but any inaccuracies in the
commit message, or problems with the test, are all mine.
Co-Authored-by: Andrew Burgess <aburgess@redhat.com>
|
|
This commit works around a problem introduced by commit:
commit e58beedf2c8a1e0c78e0f57aeab3934de9416bfc
Date: Tue Jan 23 16:00:59 2024 +0000
gdb: attach to a process when the executable has been deleted
The above commit extended GDB for Linux, so that, of the executable
for a process had been deleted, GDB would instead try to use
/proc/PID/exe as the executable.
This worked by updating linux_proc_pid_to_exec_file to introduce the
/proc/PID/exe fallback. However, the result of
linux_proc_pid_to_exec_file is then passed to exec_file_find to
actually find the executable, and exec_file_find, will take into
account the sysroot. In addition, if GDB is attaching to a process in
a different MNT and/or PID namespace then the executable lookup is
done within that namespace.
This all means two things:
1. Just because linux_proc_pid_to_exec_file cannot see the
executable doesn't mean that GDB is actually going to fail to
find the executable, and
2. returning /proc/PID/exe isn't useful if we know GDB is then going
to look for this within a sysroot, or within some other
namespace (where PIDs might be different).
There was an initial attempt to fix this issue here:
https://inbox.sourceware.org/gdb-patches/20250511141517.2455092-4-kilger@sec.in.tum.de/
This proposal addresses the issue in PR gdb/32955, which is all about
the namespace side of the problem. The fix in this original proposal
is to check the MNT namespace inside linux_proc_pid_to_exec_file, and
for the namespace problem this is fine. But we should also consider
the sysroot problem.
And for the sysroot problem, the fix cannot fully live inside
linux_proc_pid_to_exec_file, as linux_proc_pid_to_exec_file is shared
between GDB and gdbserver, and gdbserver has no sysroot.
And so, I propose a slightly bigger change.
Now, linux_proc_pid_to_exec_file takes a flag which indicates if
GDB (or gdbserver) will look for the inferior executable in the
local file system, where local means the same file system as GDB (or
gdbserver) is running in.
This local file system check is true if:
1. The MNT namespace of the inferior is the same as for GDB, and
2. for GDB only, the sysroot must either be empty, or 'target:'.
If the local file system check is false then GDB (or gdbserver) is
going to look elsewhere for the inferior executable, and so, falling
back to /proc/PID/exe should not be done, as GDB will end up looking
for this file in the sysroot, or within the alternative MNT
namespace (which in also likely to be a different PID namespace).
Now this is all a bit of a shame really. It would be nice if
linux_proc_pid_to_exec_file could return /proc/PID/exe in such a way
that exec_file_find would know that the file should NOT be looked for
in the sysroot, or in the alternative namespace. But fixing that
problem would be a much bigger change, so for now lets just disable
the /proc/PID/exe fallback for cases where it might not work.
For testing, the sysroot case is now tested.
I don't believe we have any alternative namespace testing. It would
certainly be interesting to add some, but I'm not proposing any with
this patch, so the code for checking the MNT namespace has been tested
manually by me, but isn't covered by a new test I'm adding here.
Author of the original fix is listed as co-author here. Credit for
identifying the original problem, and proposing a solution belongs to
them.
Co-Authored-By: Fabian Kilger <kilger@sec.in.tum.de>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32955
|
|
Currently gdbserver uses the require_int() function to parse the
requested offset (in vFile::pread packet and the like). This function
allows integers up to 0x7fffffff (to fit in 32-bit int), however the
offset (for the pread system call) has an off_t type which can be
larger than 32-bit.
This patch allows require_int() function to parse offset up to the
maximum value implied by the off_t type.
Approved-By: Pedro Alves <pedro@palves.net>
Change-Id: I3691bcc1ab1838c0db7f8b82d297d276a5419c8c
|
|
`pre-commit run --all-files` found this.
Change-Id: I8db09b12cf184d32351ff2c579bdaa8cf6f80ac3
|
|
Change the messages to reflect that these numbers includes type units,
not only compile units.
Change-Id: Id2f511d4666e5cf92112be917d72ff76791b7e1d
Approved-by: Kevin Buettner <kevinb@redhat.com>
|
|
This commit adds a new gdb.warning() function. This function takes a
string and then calls GDB's internal warning() function. This will
display the string as a warning.
Using gdb.warning() means that the message will get the new emoji
prefix if the user has that feature turned on. Also, the message will
be sent to gdb.STDERR without the user having to remember to print to
the correct stream.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This commit continues the work of the previous two commits.
In the following commits I added the target_fileio_stat function, and
the target_ops::fileio_stat member function:
* 08a115cc1c4 gdb: add target_fileio_stat, but no implementations yet
* 3055e3d2f13 gdb: add GDB side target_ops::fileio_stat implementation
* 6d45af96ea5 gdbserver: add gdbserver support for vFile::stat packet
* 22836ca8859 gdb: check for multiple matching build-id files
Unfortunately I messed up, despite being called 'stat' these function
actually performed an 'lstat'. The 'lstat' is the correct (required)
implementation, it's the naming that is wrong.
Additionally, to support remote targets, these commit added the
vFile::stat packet, which again, performed an 'lstat'.
In the previous two commits I changed the GDB code to replace 'stat'
with 'lstat' in the fileio function names. I then added a new
vFile:lstat packet which GDB now uses instead of vFile:stat.
And that just leaves the vFile:stat packet which is, right now,
performing an 'lstat'.
Now, clearly when I wrote this code I fully intended for this packet
to perform an lstat, it's the lstat that I needed. But now, I think,
we should "fix" vFile:stat to actually perform a 'stat'.
This is risky. This is a change in remote protocol behaviour.
Reasons why this might be OK:
- vFile:stat was only added in GDB 16, so it's not been "in the
wild" for too long yet. If we're quick, we might be able to "fix"
this before anyone realises I messed up.
- The documentation for vFile:stat is pretty vague. It certainly
doesn't explicitly say "this does an lstat". Most implementers
would (I think), given the name, start by assuming this should be
a 'stat' (given the name). Only if they ran the full GDB
testsuite, or examined GDB's implementation, would they know to
use lstat.
Reasons why this might not be OK:
- Some other debug client could be connecting to gdbserver, sending
vFile:stat and expecting to get lstat behaviour. This would break
after this patch.
- Some other remote server might have implemented vFile:stat
support, and either figured out, or copied, the lstat behaviour
from gdbserver. This remote server would technically be wrong
after this commit, but as GDB no longer uses vFile:stat, then this
will only become a problem if/when GDB or some other client starts
to use vFile:stat in the future.
Given the vague documentation for vFile:stat, and that it was only
added in GDB 16, I think we should fix it now to perform a 'stat', and
that is what this commit does.
The change in behaviour is documented in the NEWS file. I've improved
the vFile:stat documentation in the manual to better explain what is
expected from this packet, and I've extended the existing test to
cover vFile:stat.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
In the following commits I added the target_fileio_stat function, and
the target_ops::fileio_stat member function:
* 08a115cc1c4 gdb: add target_fileio_stat, but no implementations yet
* 3055e3d2f13 gdb: add GDB side target_ops::fileio_stat implementation
* 6d45af96ea5 gdbserver: add gdbserver support for vFile::stat packet
* 22836ca8859 gdb: check for multiple matching build-id files
Unfortunately I messed up, despite being called 'stat' these function
actually performed an 'lstat'. The 'lstat' is the correct (required)
implementation, it's the naming that is wrong.
In the previous commit I fixed the naming within GDB, renaming 'stat'
to 'lstat' throughout.
However, in order to support target_fileio_stat (as was) on remote
targets, the above patches added the vFile:stat packet, which actually
performed an 'lstat' call. This is really quite unfortunate, and I'd
like to do as much as I can to try and clean up this mess. But I'm
mindful that changing packets is not really the done thing.
So, this commit doesn't change anything.
Instead, this commit adds vFile:lstat as a new packet.
Currently, this packet is handled identically as vFile:stat, the
packet performs an 'lstat' call.
I then update GDB to send the new vFile:lstat instead of vFile:stat
for the remote_target::fileio_lstat implementation.
After this commit GDB will never send the vFile:stat packet.
However, I have retained the 'set remote hostio-stat-packet' control
flag, just in case someone was trying to set this somewhere.
Then there's one test in the testsuite which used to disable the
vFile:stat packet, that test is updated to now disable vFile:lstat.
There's a new test that does a more direct test of vFile:lstat. This
new test can be extended to also test vFile:stat, but that is left for
the next commit.
And so, after this commit, GDB sends the new vFile:lstat packet in
order to implement target_ops::fileio_lstat. The new packet is more
clearly documented than vFile:stat is. But critically, this change
doesn't risk breaking any other clients or servers that implement
GDB's remote protocol.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
With MSYS2 and test-case gdb.ada/assign_1.exp, we get:
...
(gdb) dir^M
Reinitialize source path to empty? (y or n) \
[answered Y; input not from terminal]^M^M
Source directories searched: $cdir;$cwd^M^M
(gdb)
...
GDB automatically answers the query, because interactive-mode is off:
...
(gdb) show interactive-mode^M
Debugger's interactive mode is auto (currently off).^M^M
...
The correct value is on, because GDB was started in a terminal.
For some reason, the auto value of interactive-mode is off instead. According
to this patch [1], gdb doesn't recognize the pipes used by DejaGnu testsuite
as an interactive setup.
Fix this by adding "set interactive-mode on" to INTERNAL_GDBFLAGS, such that
we get:
...
(gdb) dir^M
Reinitialize source path to empty? (y or n) y^M
Source directories searched: $cdir;$cwd^M^M
(gdb)
...
and no longer need fixes like commit be740e7cc62 ("testsuite: skip
confirmation in 'gdb_reinitialize_dir'")
The fix is essentially the same as in aforementioned patch.
For consistency, we apply the fix for all platforms.
Co-Authored-By: Pierre Muller <muller@sourceware.org>
Approved-By: Tom Tromey <tom@tromey.com>
[1] https://sourceware.org/legacy-ml/gdb-patches/2013-09/msg00940.html
|
|
With MSYS2 and default TERM=xterm-256color (as well as with xterm and ansi), I
get:
...
builtin_spawn gdb -q ...
^[[6n(gdb) ERROR: GDB never initialized.
...
This is not specific to gdb, other tools produce the same CSI sequence, and
consequently we run into trouble in other places (like get_compiler_info).
Fix this by default-setting TERM to dumb.
We do this for all platforms, to avoid test-cases passing on one platform but
failing on another.
For test-cases that set TERM to something other than dumb, handle the CSI
sequence in default_gdb_start.
Approved-By: Tom Tromey <tom@tromey.com>
PR testsuite/33072
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33072
|
|
amd_dbgapi_target_breakpoint::check_status
ROCgdb handles target events very slowly when running a test case like
this, where a breakpoint is preset on HipTest::vectorADD:
for (int i=0; i < numDevices; ++i) {
HIPCHECK(hipSetDevice(i));
hipLaunchKernelGGL(HipTest::vectorADD, dim3(blocks), dim3(threadsPerBlock), 0, stream[i],
static_cast<const int*>(A_d[i]), static_cast<const int*>(B_d[i]), C_d[i], N);
}
What happens is:
- A kernel is launched
- The internal runtime breakpoint is hit during the second
hipLaunchKernelGGL call, which causes
amd_dbgapi_target_breakpoint::check_status to be called
- Meanwhile, all waves of the kernel hit the breakpoint on vectorADD
- amd_dbgapi_target_breakpoint::check_status calls process_event_queue,
which pulls the thousand of breakpoint hit events from the kernel
- As part of handling the breakpoint hit events, we write the PC of the
waves that stopped to decrement it. Because the forward progress
requirement is not disabled, this causes a suspend/resume of the
queue each time, which is time-consuming.
The stack trace where this all happens is:
#32 0x00007ffff6b9abda in amd_dbgapi_write_register (wave_id=..., register_id=..., offset=0, value_size=8, value=0x7fffea9fdcc0) at /home/smarchi/src/amd-dbgapi/src/register.cpp:587
#33 0x00005555588c0bed in amd_dbgapi_target::store_registers (this=0x55555c7b1d20 <the_amd_dbgapi_target>, regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2504
#34 0x000055555a5186a1 in target_store_registers (regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/target.c:3973
#35 0x0000555559fab831 in regcache::raw_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:890
#36 0x0000555559fabd2b in regcache::cooked_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:915
#37 0x0000555559fc3ca5 in regcache::cooked_write<unsigned long, void> (this=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:850
#38 0x0000555559fab09a in regcache_cooked_write_unsigned (regcache=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:858
#39 0x0000555559fb0678 in regcache_write_pc (regcache=0x507000002240, pc=0x7ffff62bd900) at /home/smarchi/src/wt/amd/gdb/regcache.c:1460
#40 0x00005555588bb37d in process_one_event (event_id=..., event_kind=AMD_DBGAPI_EVENT_KIND_WAVE_STOP) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1873
#41 0x00005555588bbf7b in process_event_queue (process_id=..., until_event_kind=AMD_DBGAPI_EVENT_KIND_BREAKPOINT_RESUME) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2006
#42 0x00005555588b1aca in amd_dbgapi_target_breakpoint::check_status (this=0x511000140900, bs=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:890
#43 0x0000555558c50080 in bpstat_stop_status (aspace=0x5070000061b0, bp_addr=0x7fffed0b9ab0, thread=0x518000026c80, ws=..., stop_chain=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/breakpoint.c:6126
#44 0x000055555984f4ff in handle_signal_stop (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:7169
#45 0x000055555984b889 in handle_inferior_event (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:6621
#46 0x000055555983eab6 in fetch_inferior_event () at /home/smarchi/src/wt/amd/gdb/infrun.c:4750
#47 0x00005555597caa5f in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/wt/amd/gdb/inf-loop.c:42
#48 0x00005555588b838e in handle_target_event (client_data=0x0) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1513
Fix that performance problem by disabling the forward progress
requirement in amd_dbgapi_target_breakpoint::check_status, before
calling process_event_queue, so that we can process all events
efficiently.
Since the same performance problem could theoritically happen any time
process_event_queue is called with forward progress requirement enabled,
add an assert to ensure that forward progress requirement is disabled
when process_event_queue is invoked. This makes it necessary to add a
require_forward_progress call to amd_dbgapi_finalize_core_attach. It
looks a bit strange, since core files don't have execution, but it
doesn't hurt.
Add a test that replicates this scenario. The test launches a kernel
that hits a breakpoint (with an always false condition) repeatedly.
Meanwhile, the host process loads an unloads a code object, causing
check_status to be called.
Bug: SWDEV-482511
Change-Id: Ida86340d679e6bd8462712953458c07ba3fd49ec
Approved-by: Lancelot Six <lancelot.six@amd.com>
|
|
When running test-case gdb.python/py-source-styling-2.exp with TERM=dumb, I
get:
...
(gdb) set style enabled on^M
warning: The current terminal doesn't support styling. \
Styled output might not appear as expected.^M
(gdb) FAIL: $exp: set style enabled on
...
Fix this by using with_ansi_styling_terminal on clean_restart.
Tested on x86_64-linux.
|
|
Setting a BP on a line like this would incorrectly yield two BP locations:
01 void two () { {int var = 0;} }
(gdb) break 1
Breakpoint 1 at 0x1164: main.cpp:1. (2 locations)
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
1.1 y 0x0000000000001164 in two() at main.cpp:1
1.2 y 0x0000000000001164 in two() at main.cpp:1
In this case decode_digits_ordinary () returns two SALs, exactly matching the
requested line. One for the entry PC and one for the prologue end PC. This
was
tested with GCC, CLANG and ICPX. Subsequent code tries to skip the prologue
on these PCs, which in turn makes them the same.
To fix this, ignore SALs with the same PC and program space when adding to the
list of SALs.
This will then properly set only one location:
(gdb) break 1
Breakpoint 1 at 0x1164: file main.cpp, line 1
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000001164 in two() at main.cpp:1
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
The Windows port does not support multi-process debugging. Testcases
that want to exercise multi-process currently FAIL and some hit
cascading timeouts. Add a new allow_multi_inferior_tests procedure,
meant to be used with require, and sprinkle it throughout testcases as
needed.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I4a10d8f04f9fa10f4b751f140ad0a6d31fbd9dfb
|
|
Cygwin debugging does not support follow fork. There is currently no
interface between the debugger and the Cygwin runtime to be able to
intercept forks and execs. Consequently, testcases that try to
exercise fork/exec all FAIL, and several hit long cascading timeouts.
Add a new allow_fork_tests procedure, meant to be used with require,
and sprinkle it throughout testcases that exercise fork.
Note that some tests currently are skipped on targets other than
Linux, with something like:
# Until "set follow-fork-mode" and "catch vfork" are implemented on
# other targets...
#
if {![istarget "*-linux*"]} {
continue
}
However, some BSD ports also support fork debugging nowadays, and the
testcases were never adjusted... That is why the new allow_fork_tests
procedure doesn't look for linux.
With this patch, on Cygwin, I get this:
$ make check TESTS="*/*fork*.exp"
...
=== gdb Summary ===
# of expected passes 6
# of untested testcases 1
# of unsupported tests 31
Reviewed-By: Keith Seitz <keiths@redhat.com>
Change-Id: I0c5e8c574d1f61b28d370c22a0b0b6bc3efaf978
|
|
Running gdb.multi/attach-no-multi-process.exp on Cygwin, where
GDBserver does not support non-stop mode, I see:
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=off: info threads
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: attach to the program via remote (timeout)
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: info threads (timeout)
Let's ignore the first "info threads" fail. The timeouts look like
this:
builtin_spawn /home/alves/gdb-cache-cygwin/gdb/../gdbserver/gdbserver --once --multi localhost:2346
Listening on port 2346
target extended-remote localhost:2346
Remote debugging using localhost:2346
Non-stop mode requested, but remote does not support non-stop
(gdb) gdb_do_cache: can_spawn_for_attach ( )
builtin_spawn /home/alves/gdb/build-cygwin-testsuite/outputs/gdb.multi/attach-no-multi-process/attach-no-multi-process
attach 14540
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: attach to the program via remote (timeout)
info threads
FAIL: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: info threads (timeout)
Note the "Non-stop mode requested, but remote does not support
non-stop" line.
The intro to gdb_target_cmd_ext says:
# gdb_target_cmd_ext
# Send gdb the "target" command. Returns 0 on success, 1 on failure, 2 on
# unsupported.
That's perfect here, we can just use gdb_target_cmd_ext instead of
gdb_target_cmd, and check for 2 (unsupported). That's what this patch
does.
However gdb_target_cmd_ext incorrectly returns 1 instead of 2 for the
case where the remote target says it does not support non-stop. That
is also fixed by this patch.
With this, we no longer get those timeout fails. We get instead:
target extended-remote localhost:2346
Remote debugging using localhost:2346
Non-stop mode requested, but remote does not support non-stop
(gdb) UNSUPPORTED: gdb.multi/attach-no-multi-process.exp: target_non_stop=on: non-stop RSP
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I1ab3162f74200c6c02a17a0600b102d2d12db236
|
|
On Cygwin, starting an inferior under GDB, and detaching it, quitting
GDB, and then closing the shell, like so:
(gdb) start
(gdb) detach
(gdb) quit
# close shell
... hangs the parent shell of GDB (not GDB!) until the inferior
process that was detached (as it is still using the same terminal GDB
was using) exits too.
This leads to odd failures in gdb.base/watchpoint-hw-attach.exp like
so:
detach
Detaching from program: .../outputs/gdb.base/watchpoint-hw-attach/watchpoint-hw-attach, process 16580
[Inferior 1 (process 16580) detached]
(gdb) FAIL: gdb.base/watchpoint-hw-attach.exp: detach
Fix this by converting the testcase to spawn the inferior outside GDB,
with spawn_wait_for_attach.
With this patch, the testcase passes cleanly on Cygwin, for me.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: I8e3884073a510d6fd2fff611e1d26fc808adc4fa
|
|
Running gdb.cp/cpexprs.exp on x86-64 GNU/Linux, I see:
break base::~base
Breakpoint 117 at 0x555555555d90: file .../src/gdb/testsuite/gdb.cp/cpexprs.cc, line 135.
(gdb) continue
Continuing.
Breakpoint 117, base::~base (this=0x7fffffffd0f8, __in_chrg=<optimized out>) at .../src/gdb/testsuite/gdb.cp/cpexprs.cc:135
135 ~base (void) { } // base::~base
(gdb) PASS: gdb.cp/cpexprs.exp: continue to base::~base
Here, the breakpoint only got one location because both the in-charge
and the not-in-charge dtors are identical and got the same address:
$ nm -A ./testsuite/outputs/gdb.cp/cpexprs/cpexprs| c++filt |grep "~base"
./testsuite/outputs/gdb.cp/cpexprs/cpexprs:0000000000001d84 W base::~base()
./testsuite/outputs/gdb.cp/cpexprs/cpexprs:0000000000001d84 W base::~base()
While on Cygwin, we get two locations for the same breakpoint, which
the testcase isn't expecting:
break base::~base
Breakpoint 117 at 0x100402678: base::~base. (2 locations)
(gdb) continue
Continuing.
Thread 1 "cpexprs" hit Breakpoint 117.1, base::~base (this=0x7ffffcaf8, __in_chrg=<optimized out>) at .../src/gdb/testsuite/gdb.cp/cpexprs.cc:135
135 ~base (void) { } // base::~base
(gdb) FAIL: gdb.cp/cpexprs.exp: continue to base::~base
We got two locations because the in-charge and the not-in-charge dtors
have different addresses:
$ nm -A outputs/gdb.cp/cpexprs/cpexprs.exe | c++filt | grep "~base"
outputs/gdb.cp/cpexprs/cpexprs.exe:0000000100402680 T base::~base()
outputs/gdb.cp/cpexprs/cpexprs.exe:0000000100402690 T base::~base()
On Cygwin, we also see the typical failure due to not expecting the
inferior to be multi-threaded:
(gdb) continue
Continuing.
[New Thread 628.0xe08]
Thread 1 "cpexprs" hit Breakpoint 200, test_function (argc=1, argv=0x7ffffcc20) at .../src/gdb/testsuite/gdb.cp/cpexprs.cc:336
336 derived d;
(gdb) FAIL: gdb.cp/cpexprs.exp: continue to test_function for policyd3::~policyd
Both issues are fixed by this patch, and now the testcase passes
cleanly on Cygwin, for me.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Change-Id: If7eb95d595f083f36dfebf9045c0fc40ef5c5df1
|
|
I noticed on Cygwin, gdb.thread/thread-execl.exp would hang, (not that
surprising since we can't follow-exec on Cygwin). Looking at the
process list running on the machine, we end up with a thread-execl.exe
process constantly respawning another process [1].
We see the same constant-reexec if we launch gdb.thread/thread-execl
manually on the shell:
$ ./testsuite/outputs/gdb.threads/thread-execl/thread-execl
# * doesn't exit, constantly re-execing *
^C
Prevent this leftover constantly-re-execing scenario by making the
testcase program only exec once. We now get:
$ ./testsuite/outputs/gdb.threads/thread-execl/thread-execl
$ # exits immediately after one exec.
On Cygwin, the testcase now fails reasonably quickly, and doesn't
leave stale processes behind.
Still passes cleanly on x86-64 GNU/Linux.
[1] Cygwin's exec emulation spawns a new Windows process for the new
image.
Approved-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I0de1136cf2ef7e89465189bc43489a2139a80efb
|
|
Cygwin supports dumping ELF cores via a dumper.exe utility, see
https://www.cygwin.com/cygwin-ug-net/dumper.html.
When I run a testcase that has the "kernel" generate a corefile, like
gdb.base/corefile.exp, Cygwin invokes dumper.exe correctly and
generates an ELF core file, however, the testsuite doesn't find the
generated core:
Running /home/alves/gdb/src/gdb/testsuite/gdb.base/corefile.exp ...
WARNING: can't generate a core file - core tests suppressed - check ulimit -c
The file is correctly put under $coredir, e.g., like so:
outputs/gdb.base/corefile/coredir.8926/corefile.exe.core
The problem is in this line in core_find:
foreach i "${coredir}/core ${coredir}/core.coremaker.c ${binfile}.core" {
Note that that isn't looking for "${binfile}.core" inside
${coredir}... That is fixed in this patch.
However, that still isn't sufficient for Cygwin + dumper, as in that
case the core is going to be called foo.exe.core, not foo.core. Fix
that by looking for foo.exe.core in the core dir as well.
With this, gdb.base/corefile.exp and other tests that use core_find
now run. They don't pass cleanly, but at least now they're exercised.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Ic807dd2d7f22c5df291360a18c1d4fbbbb9b993e
|
|
The gdb.base/sigall.exp testcase has many FAILs on Cygwin currently.
From:
Thread 1 "sigall" received signal SIGPWR, Power fail/restart.
0x00007ffeac9ed134 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) FAIL: gdb.base/sigall.exp: get signal LOST
we see two issues. The test is expecting "Program received ..." which
only appears if the inferior is single-threaded. All Cygwin inferiors
are multi-threaded, because both Windows and the Cygwin runtime spawn
a few helper threads.
And then, SIGLOST is the same as SIGPWR on Cygwin. The testcase
already knows to treat them the same on SPARC64 GNU/Linux. We just
need to extend the relevant code to treat Cygwin the same.
With this, the test passes cleanly on Cygwin.
Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Ie3553d043f4aeafafc011347b6cb61ed58501667
|
|
The gdb.arch/amd64-watchpoint-downgrade.exp testcase is assuming the
output of debugging a single-thread program, like so, on e.g.,
GNU/Linux:
starti
Starting program: .../gdb.arch/amd64-watchpoint-downgrade/amd64-watchpoint-downgrade
warning: watchpoint 1 downgraded to software watchpoint
Program stopped.
0x00007ffff7fe32b0 in _start () from /lib64/ld-linux-x86-64.so.2
However, on Cygwin, where all inferiors are multi-threaded (because
both Windows and the Cygwin runtime spawn a few helper threads), we
get:
starti
Starting program: .../gdb.arch/amd64-watchpoint-downgrade/amd64-watchpoint-downgrade
[New Thread 4652.0x17e4]
warning: watchpoint 1 downgraded to software watchpoint
Thread 1 stopped.
0x00007ffbfc1c0911 in ntdll!LdrInitShimEngineDynamic () from C:/Windows/SYSTEM32/ntdll.dll
This commit adjusts the testcase to work with either output.
(Note GDB may print a thread name after the thread number.)
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I3aedfec04924ea3fb3bb87ba3251e2b720f8d59c
|
|
On Cygwin, all inferiors are multi-threaded, because both Windows and
the Cygwin runtime spawn a few helper threads. Adjust the
gdb.base/bp-permanent.exp testcase to work with either single- or
multi-threaded inferiors.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I28935b34fc9f739c2a5490e83aa4995d29927be2
|
|
Currently on Cygwin, I get:
Running /home/alves/gdb/src/gdb/testsuite/gdb.base/bp-cond-failure.exp ...
FAIL: gdb.base/bp-cond-failure.exp: access_type=char: cond_eval=auto: multi-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=char: cond_eval=auto: single-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=short: cond_eval=auto: multi-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=short: cond_eval=auto: single-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=int: cond_eval=auto: multi-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=int: cond_eval=auto: single-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=long long: cond_eval=auto: multi-loc: continue
FAIL: gdb.base/bp-cond-failure.exp: access_type=long long: cond_eval=auto: single-loc: continue
On GNU/Linux, we see:
Breakpoint 2.1, foo () at .../src/gdb/testsuite/gdb.base/bp-cond-failure.c:21
21 return 0; /* Multi-location breakpoint here. */
(gdb) PASS: gdb.base/bp-cond-failure.exp: access_type=char: cond_eval=auto: multi-loc: continue
While on Cygwin, we see:
Thread 1 "bp-cond-failure" hit Breakpoint 2.1, foo () at .../src/gdb/testsuite/gdb.base/bp-cond-failure.c:21
21 return 0; /* Multi-location breakpoint here. */
(gdb) FAIL: gdb.base/bp-cond-failure.exp: access_type=char: cond_eval=auto: multi-loc: continue
The difference is the "Thread 1" part in the beginning of the quoted
output. It appears on Cygwin, but not on Linux. That's because on
Cygwin, all inferiors are multi-threaded, because both Windows and the
Cygwin runtime spawn a few helper threads.
Fix this by adjusting the gdb.base/bp-cond-failure.exp testcase to
work with either single- or multi-threaded inferiors.
The testcase passes cleanly for me after this.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Change-Id: I5ff11d06ac1748d044cef025f1e78b8f84ad3349
|
|
On s390x-linux, with test-case gdb.ada/dyn-bit-offset.exp and gcc 7.5.0 I get:
...
(gdb) print spr^M
$1 = (discr => 3, array_field => (-5, -6, -7), field => -6, another_field => -6)^M
(gdb) FAIL: $exp: print spr
print spr.field^M
$2 = -6^M
(gdb) FAIL: $exp: print spr.field
...
On x86_64-linux, with the same compiler version I get:
...
(gdb) print spr^M
$1 = (discr => 3, array_field => (-5, -6, -7), field => -4, another_field => -4)^M
(gdb) XFAIL: $exp: print spr
print spr.field^M
$2 = -4^M
(gdb) PASS: $exp: print spr.field
...
In both cases, we're hitting the same compiler problem, but it manifests
differently on little and big endian.
Make sure the values seen for both little and big endian trigger xfails
for both tests.
Printing spr.field gives the expected value -4 for x86_64, but that's an
accident. Change the actual spr.field value to -5, to make sure
that we get the same number of xfails on x86_64 and s390x.
Finally, make the xfails conditional on the compiler version.
Tested using gcc 7.5.0 on both x86_64-linux and s390x-linux.
Approved-By: Andrew Burgess <aburgess@redhat.com>
PR testsuite/33042
https://sourceware.org/bugzilla/show_bug.cgi?id=33042
|
|
When the convenience variable $_linker_namespace was introduced, I meant
for it to print the namespace of the frame that where the user was
stopped. However, due to confusing what "current_frame" and
"selected_frame" meant, it instead printed the namespace of the
lowermost frame.
This commit updates the code to follow my original intent. Since the
variable was never in a GDB release, updating the behavior should not
cause any disruption. It also adds a test to verify the functionality.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Internal AdaCore testing using -gdwarf-4 found a spot where GCC will
emit a negative DW_AT_bit_offset. However, my recent signed/unsigned
changes assumed that this value had to be positive.
I feel this bug somewhat invalidates my previous thinking about how
DWARF attributes should be handled.
In particular, both GCC and LLVM at understand that a negative bit
offset can be generated -- but for positive offsets they might use a
smaller "data" form, which is expected not to be sign-extended. LLVM
has similar code but GCC does:
if (bit_offset < 0)
add_AT_int (die, DW_AT_bit_offset, bit_offset);
else
add_AT_unsigned (die, DW_AT_bit_offset, (unsigned HOST_WIDE_INT) bit_offset);
What this means is that this attribute is "signed but default
unsigned".
To fix this, I've added a new attribute::confused_constant method.
This should be used when a constant value might be signed, but where
narrow forms (e.g., DW_FORM_data1) should *not* cause sign extension.
I examined the GCC and LLVM DWARF writers to come up with the list of
attributes where this applies, namely DW_AT_bit_offset,
DW_AT_const_value and DW_AT_data_member_location (GCC only, but LLVM
always emits it as unsigned, so we're safe here).
This patch corrects the bug and imports the relevant test case.
Regression tested on x86-64 Fedora 41.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32680
Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118837
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Turns out that using 'set debug breakpoint on' will trigger an
assertion for 'catch' style breakpoints, e.g.:
(gdb) file /tmp/hello.x
Reading symbols from /tmp/hello.x...
(gdb) catch exec
Catchpoint 1 (exec)
(gdb) set debug breakpoint on
(gdb) start
[breakpoint] dump_condition_tokens: Tokens: { INFERIOR: "1" }
Temporary breakpoint 2 at 0x401198: file /tmp/hello.c, line 18.
[breakpoint] update_global_location_list: insert_mode = UGLL_MAY_INSERT
Starting program: /tmp/hello.x
[breakpoint] update_global_location_list: insert_mode = UGLL_MAY_INSERT
../../gdb-16.1/gdb/gdbarch-gen.c:1764: internal-error: gdbarch_addr_bit: Assertion `gdbarch != NULL' failed.
.... etc ...
The problem is that catch breakpoints don't set the
bp_location::gdbarch member variable, they a "dummy" location added
with a call to add_dummy_location (breakpoint.c).
The breakpoint_location_address_str function (which is only used for
breakpoint debug output) relies on bp_location::gdbarch being set in
order to call the paddress function.
I considered trying to ensure that the bp_location::gdbarch variable
is always set to sane value. For example, in add_dummy_location I
tried copying the gdbarch from the breakpoint object, and this does
work for the catchpoint case, but for some of the watchpoint cases,
even the breakpoint object has no gdbarch value set.
Now this seemed a little suspect, but, the more I thought about it, I
wondered if "fixing" the gdbarch was allowing me to solve the wrong
problem.
If the gdbarch was set, then this would allow us to print the address
field of the bp_location, which is going to be 0, after all, as this
is a dummy location, which has no address.
But does it really make sense to print the address 0? For some
targets, 0 is a valid address. But that wasn't an address we actually
selected, it's just the default value for dummy locations.
And we already have a helper function bl_address_is_meaningful, which
returns false for dummy locations.
So, I propose that in breakpoint_location_address_str, we use
bl_address_is_meaningful to detect dummy locations, and skip the
address printing code in that case.
For testing, I temporarily changed insert_bp_location so that
breakpoint_location_address_str was always called, even when
breakpoint debugging was off. I then ran the whole testsuite.
Without the fixes included in this commit I saw lots of assertion
failures, but with the fixes from this commit in place, I now see no
assertion failures.
I've added a new test which reveals the original assertion failure.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
For some reason, when testing GDB on Cygwin, I get:
child process exited abnormally
while executing
"exec sh -c "exec > /dev/null 2>&1 && (kill -2 -$spid || kill -2 $spid)""
(procedure "close_wait_program" line 20)
invoked from within
"close_wait_program $shell_id $pid"
(procedure "standard_close" line 23)
invoked from within
"standard_close "Windows-ROCm""
("eval" body line 1)
invoked from within
"eval ${try}_${proc} \"$dest\" $args"
(procedure "call_remote" line 42)
invoked from within
"call_remote "" close $host"
(procedure "remote_close" line 3)
invoked from within
"remote_close host"
(procedure "log_and_exit" line 30)
invoked from within
"log_and_exit"
When that happens from within clean_restart, clean_restart doesn't
clear the gdb_spawn_id variable, and then when clean_restart starts up
a new GDB, that sees that gdb_spawn_id is already set, so it doesn't
actually spawn a new GDB, and so clean_restart happens to reuse the
same GDB (!). Many tests happen to actually work OK with this, but
some don't, and the failure modes can be head-scratching.
Of course, the failure to close GDB should be fixed, but when it
happens, I think it's good to not end up with the current weird state.
Connecting the "child process exit abnormally" errors at the end of a
testcase run with weird FAILs in other testcases took me a while (as
in, weeks!), it wasn't obvious to me immediately.
Thus, this patch makes default_gdb_exit more resilient to failed
closes, so that gdb_spawn_id is unset even is closing GDB fails, and
we move on to start a new GDB.
Approved-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I9ec95aa61872a40095775534743525e0ad2097d2
|
|
The testcase added by this patch has a gdb_test_multiple call that
wants to match different lines of output that all have a common
prefix, and do different actions on each. Instead of a single regular
expression with alternatives, it's clearer code if the different
expressions are handled with different "-re", like so:
gdb_test_multiple "command" "" -lbl {
-re "^command(?=\r\n)" {
exp_continue
}
-re "^\r\nprefix foo(?=\r\n)" {
# Some action for "foo".
exp_continue
}
-re "^\r\nprefix bar(?=\r\n)" {
# Some action for "bar".
exp_continue
}
-re "^\r\nprefix \[^\r\n\]*(?=\r\n)" {
# Some action for all others.
exp_continue
}
-re "^\r\n$::gdb_prompt $" {
gdb_assert {$all_prefixes_were_seen} $gdb_test_name
}
}
Above, the leading anchors in the "^\r\nprefix..." matches are needed
to avoid too-eager matching due to the common prefix. Without the
anchors, if the expect output buffer happens to contain at least:
"\r\nprefix xxx\r\nprefix foo\r\n"
... then the "prefix foo" pattern match inadvertently consumes the
first "prefix xxx" line.
Without the anchor in the prompt match, like:
-re "\r\n$::gdb_prompt $" {
gdb_assert {$all_prefixes_were_seen} $gdb_test_name
}
Or the equivalent:
-re -wrap "" {
gdb_assert {$all_prefixes_were_seen} $gdb_test_name
}
... then if the expect buffer contains:
"\r\nmeant-to-be-matched-by-lbl\r\nprefix foo\r\n$gdb_prompt "
... then the prompt regexp matches this, consuming the "prefix" line
inadvertently, and we get a FAIL. The built-in regexp matcher for
-lbl doesn't get a chance to match the
"\r\nmeant-to-be-matched-by-lbl\r\n" part, because the built-in prompt
match appears first within gdb_test_multiple.
By adding the anchor to the prompt regexp, we avoid that problem.
However, the same expect output buffer contents will still match the
built-in prompt regexp. That is what is fixed by this patch. It
makes it so that if -lbl is specified, the built-in prompt regexp has
a leading anchor.
Original idea for turning this into a gdb.testsuite/ testcase by Tom
de Vries <tdevries@suse.de>.
Approved-By: Tom de Vries <tdevries@suse.de>
Change-Id: Ic2571ec793d856a89ee0d533ec363e2ac6036ea2
|
|
With test-case gdb.multi/attach-while-running.exp usually I get:
...
(gdb) run &^M
Starting program: attach-while-running ^M
(gdb) PASS: $exp: run &
[Thread debugging using libthread_db enabled]^M
Using host libthread_db library "/lib64/libthread_db.so.1".^M
add-inferior^M
[New inferior 2]^M
Added inferior 2 on connection 1 (native)^M
(gdb) PASS: $exp: add-inferior
...
or:
...
(gdb) run &
Starting program: attach-while-running
(gdb) PASS: $exp: run &
add-inferior
[New inferior 2]
Added inferior 2 on connection 1 (native)
(gdb) PASS: $exp: add-inferior
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
...
but sometimes I run into:
...
(gdb) run &
Starting program: attach-while-running
(gdb) PASS: $exp: run &
add-inferior
[New inferior 2]
Added inferior 2 on connection 1 (native)
(gdb) [Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
FAIL: $exp: add-inferior (timeout)
...
Fix this by using -no-prompt-anchor.
Tested on x86_64-linux.
|
|
Based on feedback from IRC and PR solib/32959, this commit renames the
recently introduced convenience variable $_current_linker_namespace to
the shorter name $_linker_namespace. This makes it more in line with
existing convenience variables such as $_thread and $_inferior, and
faster to type.
It should be ok to simply change the name because the variable was never
released to the public, so there should be no existing scripts depending
on the name.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32959
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Based on IRC feedback since commit 6a0da68c036a85a46415aa0dada2421eee7c2269
gdb: add convenience variables around linker namespace debugging
This commit changes the type of the _current_linker_namespace variable
to be a simple integer. This makes it easier to use for expressions,
like breakpoint conditions or printing from a specific namespace once
that is supported, at the cost of making namespace IDs slightly less
consistent.
This is based on PR solib/32960, where no negative feedback was given
for the suggestion.
The commit also changes the usage of "linkage namespaces" to "linker
namespaces" in the NEWS file, to reduce chance of confusion from an end
user.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32960
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
With test-case gdb.base/bp-permanent.exp and gcc 15 I run into:
...
gdb compile failed, bp-permanent.c: In function 'test_signal_nested':
bp-permanent.c:118:20: error: passing argument 2 of 'signal' from \
incompatible pointer type [-Wincompatible-pointer-types]
118 | signal (SIGALRM, test_signal_nested_handler);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| void (*)(void)
In file included from bp-permanent.c:20:
/usr/include/signal.h:88:57: note: expected '__sighandler_t' \
{aka 'void (*)(int)'} but argument is of type 'void (*)(void)'
...
Fix this by adding an int parameter to test_signal_nested_handler.
Tested on x86_64-linux.
PR testsuite/32756
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32756
|
|
A commit I recently pushed:
commit 0b5023cc71d3af8b18e10e6599a3f9381bc15265
Date: Sat Apr 12 09:15:53 2025 +0100
gdb/python/guile: user created prefix commands get help list
can trigger a segfault if a user tries to create nested prefix
commands. For example, this will trigger a crash:
(gdb) python gdb.ParameterPrefix("prefix-1", gdb.COMMAND_NONE)
(gdb) python gdb.ParameterPrefix("prefix-1 prefix-2", gdb.COMMAND_NONE)
Fatal signal: Segmentation fault
... etc ...
If the user adds an actual parameter under 'prefix-1' before creating
'prefix-2', then everything is fine:
(gdb) python gdb.ParameterPrefix("prefix-1", gdb.COMMAND_NONE)
(gdb) python gdb.Parameter('prefix-1 param-1', gdb.COMMAND_NONE, gdb.PARAM_BOOLEAN)
(gdb) python gdb.ParameterPrefix("prefix-1 prefix-2", gdb.COMMAND_NONE)
The mistake in the above patch is in how gdbpy_parse_command_name is
used. The BASE_LIST output argument from this function points to the
list of commands for the prefix, not to the prefix command itself.
So when gdbpy_parse_command_name is called for 'prefix-1 prefix-2',
BASE_LIST points to the list of commands associated with 'prefix-1',
not to the actual 'prefix-1' cmd_list_element.
Back in cmdpy_init, from where gdbpy_parse_command_name was called, I
was walking back from the first entry in BASE_LIST to figure out if
this was a "show" prefix command or not. However, if BASE_LIST is
empty then there is no first item, and this would trigger the
segfault.
The solution it to extend gdbpy_parse_command_name to also return the
prefix cmd_list_element in addition to the existing values. With this
done, and cmdpy_init updated, the segfault is now avoided.
There's a new test that would trigger the crash without the patch.
And, of course, the above commit also broke guile in the exact same
way. And the fix is exactly the same. And there's a guile test too.
NOTE: We should investigate possibly sharing some of this boiler plate
helper code between Python and Guile. But not in this commit.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Since commit d462550c91c ("gdb/testsuite: also compile foll-exec.exp as C++"),
we run into:
...
Running gdb.base/exec-invalid-sysroot.exp ...
gdb compile failed, foll-exec.c: In function 'main':
foll-exec.c:35:52: error: 'EXECD_PROG' undeclared (first use in this function)
printf ("foll-exec is about to execlp(%s)...\n", EXECD_PROG);
^~~~~~~~~~
foll-exec.c:35:52: note: each undeclared identifier is reported only once \
for each function it appears in
...
Fix this by default-defining EXECD_PROG to "execd-prog".
Tested on x86_64-linux.
|
|
For a long time, Fedora GDB has carried a test that performs some
basic testing that GDB can handle 'catch exec' related commands for a
C++ executable.
The exact motivation for this test has been lost in the mists of time,
but looking at the test script, the concern seems to be that GDB would
have problems inserting C++ related internal breakpoints if a non C++
process is execd from a C++ one.
There's no actual GDB fix associated with the Fedora test. This
usually means that the issue was fixed upstream long ago. This patch
does seem to date from around 2010ish (or maybe earlier).
Having a look through the upstream tests, I cannot see anything that
covers this sort of thing (C++ to C exec calls), and I figure it
cannot hurt to have some additional testing in this area, and so I
wrote this patch.
I've taken the existing foll-exec.exp test, which compiles a C
executable and then execs a different C executable, and split it into
two copies.
We now have foll-exec-c.exp and foll-exec-c++.exp. These tests
compile a C and C++ executable respectively. Then within each of
these scripts both a C and C++ helper application is built, which can
then be execd from the main test executable.
And so, we now cover 4 cases, the initial executable can be C or C++,
and the execd process can be C or C++.
As expected, everything passes. This is just increasing test
coverage.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Consider GDB's builtin prefix set/show prefix sub-commands, if they
are invoked with no sub-command name then they work like this:
(gdb) show print
print address: Printing of addresses is on.
print array: Pretty formatting of arrays is off.
print array-indexes: Printing of array indexes is off.
print asm-demangle: Demangling of C++/ObjC names in disassembly listings is off.
... cut lots of lines ...
(gdb) set print
List of set print subcommands:
set print address -- Set printing of addresses.
set print array -- Set pretty formatting of arrays.
set print array-indexes -- Set printing of array indexes.
set print asm-demangle -- Set demangling of C++/ObjC names in disassembly listings.
... cut lots of lines ...
Type "help set print" followed by set print subcommand name for full documentation.
Type "apropos word" to search for commands related to "word".
Type "apropos -v word" for full documentation of commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb)
That is 'show print' lists the values of all settings under the
'print' prefix, and 'set print' lists the help text for all settings
under the 'set print' prefix.
Now, if we try to create something similar using the Python API:
(gdb) python gdb.ParameterPrefix("my-prefix", gdb.COMMAND_NONE)
(gdb) python gdb.Parameter("my-prefix foo", gdb.COMMAND_OBSCURE, gdb.PARAM_BOOLEAN)
(gdb) show my-prefix
(gdb) set my-prefix
Neither 'show my-prefix' or 'set my-prefix' gives us the same details
relating to the sub-commands that we get with the builtin prefix
commands.
This commit aims to address this.
Currently, in cmdpy_init, when a new command is created, we always set
the commands callback function to cmdpy_function. It is within
cmdpy_function that we spot that the command is a prefix command, and
that there is no gdb.Command.invoke method, and so return early.
This commit changes things so that the rules are now:
1. For NON prefix commands, we continue to use cmdpy_function.
2. For prefix commands that do have a gdb.Command.invoke
method (i.e. can handle unknown sub-commands), continue to use
cmdpy_function.
3. For all other prefix commands, don't use cmdpy_function, instead
use GDB's normal callback function for set/show prefixes.
This requires splitting the current call to add_prefix_cmd into either
a call to add_prefix_cmd, add_show_prefix_cmd, or
add_basic_prefix_cmd, as appropriate.
After these changes, we now see this:
(gdb) python gdb.ParameterPrefix("my-prefix", gdb.COMMAND_NONE) │
(gdb) python gdb.Parameter("my-prefix foo", gdb.COMMAND_OBSCURE, gdb.PARAM_BOOLEAN)
(gdb) show my-prefix │
my-prefix foo: The current value of 'my-prefix foo' is "off".
(gdb) set my-prefix
List of "set my-prefix" subcommands:
set my-prefix foo -- Set the current value of 'my-prefix foo'.
Type "help set my-prefix" followed by subcommand name for full documentation.
Type "apropos word" to search for commands related to "word".
Type "apropos -v word" for full documentation of commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb)
Which matches how a prefix defined within GDB would act.
I have made the same changes to the Guile API.
|
|
This removes a trailing backslash from a comment in
dw2-ranges-psym-warning.exp. This backslash causes Emacs to try to
reindent the next line. This happens because comments are weird in
Tcl -- they are not exactly syntactic and the backslash still acts as
a line-continuation marker here.
|
|
In Ada, a field can have a dynamic bit offset in its enclosing record.
In DWARF 3, this was handled using a dynamic
DW_AT_data_member_location, combined with a DW_AT_bit_offset -- this
combination worked out ok because in practice GNAT only needs a
dynamic byte offset with a fixed offset within the byte.
However, this approach was deprecated in DWARF 4 and then removed in
DWARF 5. No replacement approach was given, meaning that in strict
mode there is no way to express this.
This is a DWARF bug, see
https://dwarfstd.org/issues/250501.1.html
In a discussion on the DWARF mailing list, a couple people mentioned
that compilers could use the obvious extension of a dynamic
DW_AT_data_bit_offset. I've implemented this for LLVM:
https://github.com/llvm/llvm-project/pull/141106
In preparation for that landing, this patch implements support for
this construct in gdb.
New in v2: renamed some constants and added a helper method, per
Simon's review.
New in v3: more renamings.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
It is possible, when creating a shared memory segment (i.e. with
shmget), that the id of the segment will be zero.
When looking at the segment in /proc/PID/smaps, the inode field of the
entry holds the shared memory segment id.
And so, it can be the case that an entry (in the smaps file) will have
an inode of zero.
When GDB generates a core file, with the generate-core-file (or its
gcore alias) command, the shared memory segment should be written into
the core file.
Fedora GDB has, since 2008, carried a patch that tests this case.
There is no fix for GDB associated with the test, and unfortunately,
the motivation for the test has been lost to the mists of time. This
likely means that a fix was merged upstream without a suitable test,
but I've not been able to find and relevant commit. The test seems to
be checking that the shared memory segment with id zero, is being
written to the core file.
While looking at this test and trying to work out if it should be
posted upstream, I saw that GDB does appear to write the shared memory
segment into the core file (as expected), which is good. However, GDB
still isn't getting this case exactly right, there appears to be no
NT_FILE entry for the shared memory mapping if the mapping had an id
of zero.
In gcore_memory_sections (gcore.c) we call back into linux-tdep.c (via
the gdbarch_find_memory_regions call) to correctly write the shared
memory segment into the core file, however, in
linux_make_mappings_corefile_notes, when we use
linux_find_memory_regions_full to create the NT_FILE note, we call
back in to dump_note_entry_p for each mapping, and in here we reject
any mapping with a zero inode.
The result of this, is that, for a shared memory segment with a
non-zero id, after loading the core file, the shared memory segment
will appear in the 'proc info mappings' output. But, for a shared
memory segment with a zero id, the segment will not appear in the
'proc info mappings' output.
I initially tried just dropping the inode check in this function (see
previous commit 1e21c846c27, which I then reverted in commit
998165ba99a.
The problem with dropping the inode check is that the special kernel
mappings, e.g. '[vvar]' would now get a NT_FILE entry. In fact, any
special entry except '[vdso]' and '[vsyscall]' which are specifically
checked for in dump_note_entry_p would get a NT_FILE entry, which is
not correct.
So, instead, I propose that if the inode is zero, and the filename
starts with '[' and finished with ']' then we should not create a
NT_FILE entry. But otherwise a zero inode should not prevent a
NT_FILE entry being created.
The test for this change is a bit tricky. The original Fedora
test (mentioned above) has a loop that tries to grab the shared memory
mapping with id zero. This was, unfortunately, not very reliable.
I tried to make this more reliable by going multi-threaded, and
waiting for longer, see my proposal here:
https://inbox.sourceware.org/gdb-patches/0d389b435cbb0924335adbc9eba6cf30b4a2c4ee.1741776651.git.aburgess@redhat.com
But this was still not great. On further testing this was only
passing (i.e. managing to find the shared memory mapping with id zero)
about 60% of the time.
However, I realised that GDB finds the shared memory id by reading the
/proc/PID/smaps file. But we don't really _need_ the shared memory id
for anything, we just use the value (as an inode) to decide if the
segment should be included in the core file or not. The id isn't even
written to the core file. So, if we could intercept the read of the
smaps file, then maybe, we could lie to GDB, and tell it that the id
was zero, and then see how GDB handles this.
And luckily, we can do that using a preload library!
We already have a test that uses a preload library to modify GDB, see
gdb.threads/attach-slow-waitpid.exp.
So, I have created a new preload library. This one intercepts open,
open64, close, read, and pread. When GDB attempts to open
/proc/PID/smaps, the library spots this and loads the file contents
into a memory buffer. The buffer is then modified to change the id of
any shared memory mapping to zero. Any reads from this file are
served from the modified memory buffer.
I tested on x86-64, AArch64, PPC, s390, and ARM, all running various
versions of GNU/Linux. The requirement for open64() came from my ARM
testing. The other targets used plain open().
And so, the test is now simple. Start GDB with the preload library in
place, start the inferior and generate a core file. Then restart GDB,
load the core file, and check the shared memory mapping was included.
This test will fail with an unpatched GDB, and succeed with the patch
applied.
Tested-By: Guinevere Larsen <guinevere@redhat.com>
|
|
Suppose a function returns a struct and a method of that struct is
called. E.g.:
struct S
{
int a;
int get () { return a; }
};
S f ()
{
S s;
s.a = 42;
return s;
}
...
int z = f().get();
...
GDB is able to evaluate the expression:
(gdb) print f().get()
$1 = 42
However, type-checking the expression fails:
(gdb) ptype f().get()
Attempt to take address of value not located in memory.
This happens because the `get` function takes an implicit `this`
pointer, which in this case is the value returned by `f()`, and GDB
wants to get an address for that value, as if passing the implicit
this pointer. However, during type-checking, the struct value
returned by `f()` is a `not_lval`.
A similar issue exists for union types, where methods called on
temporary union objects would fail type-checking in the same way.
Address the problems by handling `TYPE_CODE_STRUCT` and
`TYPE_CODE_UNION` in `evaluate_subexp_for_address_base`.
With this change, for struct's method call, we get
(gdb) ptype f().get()
type = int
Add new test cases to file gdb.cp/chained-calls.exp to test this change.
Regression-tested in X86-64 Linux.
|
|
With the quoted filename completion work that I did last year the
deprecated_filename_completer function will now only complete a single
word as a filename, for example:
(gdb) save breakpoints /tm<TAB>
The 'save breakpoints' command uses the deprecated_filename_completer
completion function. In the above '/tm' will complete to '/tmp/' as
expected. However, if you try this:
(gdb) save breakpoints /tmp/ /tm<TAB>
The second '/tm' will not complete for GDB 16.x, but will complete
with GDB 15.x as GDB 15.x is before my changes were merged.
What's actually happening here is that, before my changes, the
filename completion was breaking words on white space, so in the above
the first '/tmp/' and the second '/tm' are seen as separate words for
completion, the second word is therefore seen as the start of a new
filename.
After my changes, deprecated_filename_completer allows spaces to be
part of the filename, so in the above, GDB is actually trying to
complete a filename '/tmp/ /tm' which likely doesn't exist, and so
completion stops.
This change for how deprecated_filename_completer works makes sense,
commands like 'save breakpoints' take their complete command arguments
and treat it as a single filename, so given this:
(gdb) save breakpoints /tmp/ /tm<ENTER>
GDB really will try to save breakpoints to a file called '/tmp/ /tm',
weird as that may seem. How GDB interprets the command arguments
didn't change with my completion patches, I simply brought completion
into line with how GDB interprets the arguments.
The patches I'm talking about here are this set:
* 4076f962e8c gdb: split apart two different types of filename completion
* dc22ab49e9b gdb: deprecated filename_completer and associated functions
* 35036875913 gdb: improve escaping when completing filenames
* 1d1df753977 gdb: move display of completion results into completion_result class
* bbbfe4af4fb gdb: simplify completion_result::print_matches
* 2bebc9ee270 gdb: add match formatter mechanism for 'complete' command output
* f2f866c6ca8 gdb: apply escaping to filenames in 'complete' results
* 8f87fcb1daf gdb: improve gdb_rl_find_completion_word for quoted words
* 67b8e30af90 gdb: implement readline rl_directory_rewrite_hook callback
* 1be3b2e82f7 gdb: extend completion of quoted filenames to work in brkchars phase
* 9dedc2ac713 gdb: fix for completing a second filename for a command
* 4339a3ffc39 gdb: fix filename completion in the middle of a line
Bug PR gdb/32982 identifies a problem with the shell command;
completion broke between 15.x and 16.x. The shell command also uses
deprecated_filename_completer for completion. But consider a shell
command line:
(gdb) shell ls /tm<TAB>
The arguments to the shell command are 'ls /tm' at the point <TAB> is
pressed. Under the old 15.x completion GDB would split the words on
white space and then try to complete '/tm' as a filename.
Under the 16.x model, GDB completes all the arguments as a single
filename, that is 'ls /tm', which is unlikely to match any filenames,
and so completion fails.
The fix is to write a custom completion function for the shell_command
function (cli/cli-cmds.c), this custom completion function will skip
forward to find the last word in the arguments, and then try to
complete that, so in the above example, GDB will skip over 'ls ', and
then tries to complete '/tm', which is exactly what we want.
Given that the filenames passed to the shell command are forwarded to
an actual shell, I have switched over the new quoted filename
completion function for the shell command, this means that white space
within a filename will be escaped with a backslash by the completion
function, which is likely what the user wants, this means the filename
will arrive in the (actual) shell as a single word, rather than
splitting on white space and arriving as two words.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32982
Reviewed-By: Tom Tromey <tom@tromey.com>
|
|
DAP requests have a "defer_stop_events" option that is intended to
defer the emission of any "stopped" event until after the current
request completes. This was needed to handle async continues like
"finish &".
However, I noticed that sometimes DAP tests can fail, because a stop
event does arrive before the response to the "stepOut" request. I've
only noticed this when the machine is fairly loaded -- for instance
when I'm regression-testing a series, it may occur in some of the
tests mid-series.
I believe the problem is that the implementation in the "request"
function is incorrect -- the flag is set when "request" is invoked,
but instead it must be deferred until the request itself is run. That
is, the setting must be captured in one of the wrapper functions.
Following up on this, Simon pointed out that introducing a delay
before sending a request's response will cause test case failures.
That is, there's a race here that is normally hidden.
Investigation showed that that deferred requests can't force event
deferral. This patch implements this; but more testing showed many
more race failures. Some of these are due to how the test suite is
written.
Anyway, in the end I took the radical approach of deferring all events
by default. Most DAP requests are asynchronous by nature, so this
seemed ok. The only case I found that really required this is
pause.exp, where the test (rightly) expects to see a 'continued' event
while performing an inferior function call.
I went through all events and all requests and tried to convince
myself that this patch will cause acceptable behavior in every case.
However, it's hard to be completely sure about this approach. Maybe
there are cases that do still need an event before the response, but
we just don't have tests for them.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32685
Acked-By: Simon Marchi <simon.marchi@efficios.com>
|
|
On openSUSE Tumbleweed ppc64le-linux using gcc 14.3.0, with a gdb 16.3 based
package and test-case gdb.ada/finish-var-size.exp, I run into:
...
(gdb) finish^M
Run till exit from #0 pck.get (value=true) at pck.adb:19^M
0x0000000100004a20 in p () at finish-var-size/p.adb:18^M
18 V : Result_T := Get (True);^M
Value returned is $1 = <error reading variable: \
Cannot access memory at address 0x0>^M
(gdb) FAIL: gdb.ada/finish-var-size.exp: finish
...
Function pck.get returns type Result_T:
...
type Array_Type is array (1 .. 64) of Integer;
type Maybe_Array (Defined : Boolean := False) is
record
Arr : Array_Type;
Arr2 : Array_Type;
end record;
type Result_T (Defined : Boolean := False) is
record
case Defined is
when False =>
Arr : Maybe_Array;
when True =>
Payload : Boolean;
end case;
end record;
...
and uses r3 as address of the return value, which means
RETURN_VALUE_STRUCT_CONVENTION, but while executing finish_command we do:
...
return_value
= gdbarch_return_value_as_value (gdbarch,
read_var_value (sm->function, nullptr,
callee_frame),
val_type, nullptr, nullptr, nullptr);
...
and get:
...
(gdb) p return_value
$1 = RETURN_VALUE_REGISTER_CONVENTION
...
This is caused by this check in ppc64_sysv_abi_return_value:
...
/* In the ELFv2 ABI, aggregate types of up to 16 bytes are
returned in registers r3:r4. */
if (tdep->elf_abi == POWERPC_ELF_V2
&& valtype->length () <= 16
...
which succeeds because valtype->length () == 0.
Fix this by also checking for !TYPE_HAS_DYNAMIC_LENGTH (valtype).
[ I also tested a version of this patch using "!is_dynamic_type (valtype)"
instead, but ran into a regression in test-case gdb.ada/variant-record.exp,
because type T:
...
Length : constant Positive := 8;
subtype Name_T is String (1 .. Length);
type A_Record_T is
record
X1 : Natural;
X2 : Natural;
end record;
type Yes_No_T is (Yes, No);
type T (Well : Yes_No_T := Yes) is
record
case Well is
when Yes =>
Name : Name_T;
when No =>
Unique_Name : A_Record_T;
end case;
end record;
...
while being dynamic, also has a non-zero size, and is small enough to be
returned in registers r3:r4. ]
Fixing this causes the test-case to fail with the familiar:
...
warning: Cannot determine the function return value.
Try compiling with -fvar-tracking.
...
and indeed using -fvar-tracking makes the test-case pass.
Tested on ppc64le-linux.
PR tdep/33000
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33000
|
|
Building current GDB on Cygwin, fails like so:
/home/pedro/gdb/src/gdbsupport/run-time-clock.cc: In function ‘void get_run_time(user_cpu_time_clock::time_point&, system_cpu_time_clock::time_point&, run_time_scope ’:
/home/pedro/gdb/src/gdbsupport/run-time-clock.cc:52:13: error: ‘RUSAGE_THREAD’ was not declared in this scope; did you mean ‘SIGEV_THREAD’?
52 | who = RUSAGE_THREAD;
| ^~~~~~~~~~~~~
| SIGEV_THREAD
Cygwin does not implement RUSAGE_THREAD. Googling around, I see
Cygwin is not alone, other platforms don't support it either. For
example, here is someone suggesting an alternative for darwin/macos:
https://stackoverflow.com/questions/5652463/equivalent-to-rusage-thread-darwin
Fix this by falling back to process scope if thread scope can't be
supported. I chose this instead of returning zero usage or some other
constant, because if gdb is built without threading support, then
process-scope run time usage is the right info to return.
But instead of falling back silently, print a warning (just once),
like so:
(gdb) maint set per-command time on
⚠️ warning: per-thread run time information not available on this platform
... so that developers on other platforms at least have a hint
upfront.
This new warning also shows on platforms that don't have getrusage in
the first place, but does not show if the build doesn't support
threading at all.
New tests are added to gdb.base/maint.exp, to expect the warning, and
also to ensure other "mt per-command" sub commands don't trigger the
new warning.
Change-Id: Ie01b916b62f87006f855e31594a5ac7cf09e4c02
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I was a bit confused about the -lbl option in gdb_test_multiple, and needed
to read its implementation to determine that it would be useful for my
needs. Explicitly mention what the option does and why it's useful to
hopefully help other developers.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
The Linaro CI runs the GDB testsuite using the read1 tool, which
significantly increases the time it takes DejaGNU to read inferior output.
On top of that sometimes the test machine has higher than normal load,
which causes tests to run even slower.
Because gdb.base/default.exp tests some verbose commands such as "info
set", it sometimes times out while waiting for the complete command
output when running in the Linaro CI environment.
Fix this problem by consuming each line of output from the most verbose
commands with gdb_test_multiple's -lbl (line-by-line) option — which
causes DejaGNU to reset the timeout after each match — and also by
breaking up regular expressions that try to match more than 2000
characters (the default Expect buffer size) in one go into multiple
matches.
Some tests use the same regular expression, so I created a procedure for
them. This is the case for "i" / "info", "info set" / "show", and "set
print" tests.
The tests for "show print" don't actually test their output, so this
patch improves them by checking some lines of the output.
Reviewed-By: Keith Seitz <keiths@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
Add a dwarf assembly test-case using a DW_FORM_strx in a .dwo file.
Tested on x86_64-linux.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
make-check-all.sh
I forgot to run test-case gdb.dwarf2/dw-form-strx-out-of-bounds.exp with
make-check-all.sh, and consequently failed to notice that it fails with for
instance target board fission-dwp.
The test-case does:
...
source $srcdir/$subdir/dw-form-strx.exp.tcl
...
and in that tcl file, prepare_for_testing fails, so a -1 is returned, but
that is ignored by the source command.
Fix this by using require, but rather that testing the result of the source
command, communicate success by setting a global variable
prepare_for_testing_done.
Likewise in gdb.dwarf2/dw-form-strx.exp.
Also, the test-case gdb.dwarf2/dw-form-strx-out-of-bounds.exp fails for target
board readnow, because the DWARF error occurs during a different command than
expected.
Fix this by just skipping the test-case in that case.
Tested on x86_64-linux.
Reported-by: Simon Marchi <simark@simark.ca>
Approved-By: Tom Tromey <tom@tromey.com>
|