In the commit:
commit 4793f551a5aa68522fd5fbbb7e8f621148f410cd
Date: Mon Nov 27 13:33:17 2023 +0000
gdb: allow use of ~ in 'save gdb-index' command
I added a test which has a directory name within the GDB command;
because I failed to give the test a better name, the directory name
then appears in the test name.
Fixed in this commit.
|
|
'@' is a special symbol that introduces a command in GNU Texinfo.
If the GDBINIT or GDBINIT_DIR path during configuration
included an '@' character, the makeinfo command would fail,
as it interpreted the '@' in the path as the start of a command
when expanding the path in the docs.
This patch simply escapes any '@' characters in the path,
by replacing them with '@@'. This was already done for the
bugurl variable.
This was detected because the 'Jenkins' tool sometimes puts
an '@' in the workspace path.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
|
|
I found a "fall-through" comment in gdb/remote.c that was incorrect --
the code here cannot in fact fall through.
|
|
This commit enables the early initialization commands (92e4e97a9f5) to
modify the number of threads used by gdb's thread pool.
The motivation here is to prevent gdb from spawning a detrimental number
of threads on many-core systems under environments with restrictive
ulimits.
Before this commit, gdb sized the thread pool as follows:
1. Thread pool size is initialized to 0.
2. After the maintenance commands are defined, the thread pool size is
set to the number of system cores (if it has not already been set).
3. Using early initialization commands, the thread pool size can be
changed using "maint set worker-threads".
4. After the first prompt, the thread pool size can be changed as in the
previous step.
Therefore, after step 2, gdb has potentially launched hundreds of threads
on a many-core system.
After this change, steps 2 and 3 are reversed so there is an opportunity
to set the required number of threads without needing to default to the
number of system cores first.
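As a concrete example (the path below is the standard early-initialisation
file location, and the thread count is only an illustration), a user on a
many-core system could put the following in ~/.config/gdb/gdbearlyinit so
that the cap is applied before any worker threads are created:
  # Limit (or disable, with 0) the worker thread pool at startup.
  maint set worker-threads 2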
There does exist a configure option (added in 261b07488b9) to disable
multithreading, but that does not help with a gdb that has already been
built and deployed.
Additionally, the default number of worker threads is clamped at eight
to control the number of worker threads spawned on many-core systems.
This value was chosen as testing recorded on bugzilla issue 29959
indicates that parallel efficiency drops past this point.
GDB was built with GCC 13.
No test suite regressions detected. Compilers: GCC, ACfL, Intel, Intel
LLVM, NVHPC; Platforms: x86_64, aarch64.
The scenario that interests me the most involves preventing GDB from
spawning any worker threads at all. This was tested by counting the
number of clones observed by strace:
strace -e clone,clone3 gdb/gdb -q \
--early-init-eval-command="maint set worker-threads 0" \
-ex q ./gdb/gdb |& grep --count clone
The new test relies on "gdb: install CLI uiout while processing early
init files" developed by Andrew Burgess. That patch will need to be
pushed prior to this change.
The clamping was tested on machines with both 16 cores and a single
core. "maint show worker-threads" correctly reported eight and one
respectively.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The next commit wants to use a 'show' command within an early
initialisation file, despite these commands not being in the list of
acceptable commands for use within an early initialisation file.
The problem we run into is that the early initialisation files are
processed before GDB has installed the top level interpreter. The
interpreter is responsible for installing the default uiout (accessed
through current_uiout), and as a result code that depends on
uiout (e.g. 'show' commands) will end up dereferencing a nullptr, and
crashing GDB.
I did consider moving the interpreter installation before the early
initialisation, and this would work fine except for the new DAP
interpreter, which relies on having Python available during its
initialisation. This means we can't install the interpreter until
after Python has been initialised, and the early initialisation
handling has to occur before Python is set up -- that's the whole point
of this feature (to allow customisation of how Python is set up).
So, what I propose is that early within captured_main_1, we install a
temporary cli_ui_out as the current_uiout. This will remain in place
until the top-level interpreter is installed, at which point the
temporary will be replaced.
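In rough outline the idea looks something like this (a sketch only, not
the exact code; 'startup_uiout' is an invented name):
  /* Early in captured_main_1, before any early initialisation files are
     processed, point current_uiout at a temporary CLI uiout.  */
  static cli_ui_out startup_uiout (gdb_stdout);
  current_uiout = &startup_uiout;
Later, when the top-level interpreter is installed, it supplies its own
uiout and this temporary is no longer used.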
What this means is that current_uiout will no longer be nullptr;
instead, any commands within an early initialisation file that trigger
output will perform that output in a CLI style.
I propose that we don't update the documentation for early
initialisation files; the user advice remains that only 'set' and
'source' commands are acceptable. But now, if a user does try a
'show' command, then instead of crashing, GDB will do something
predictable.
I've not added a test in this commit. The next commit relies on this
patch and will serve as a test.
Tested-By: Richard Bunt <richard.bunt@linaro.org>
|
|
I noticed that after resizing to a narrow window, I got:
...
┌────────────────┐
│ │
│[ No Source Avail
able ] │
│ │
└────────────────┘
...
Fix this by adding two new functions:
- tui_win_info::display_string (int y, int x, const char *str)
- tui_win_info::display_string (const char *str)
that make sure that borders are not overwritten, which get us instead:
...
┌────────────────┐
│ │
│[ No Source Avai│
│ │
│ │
└────────────────┘
...
Tested on x86_64-linux.
|
|
When using GDB on native linux, it can happen that, while attempting
to detach an inferior, the inferior has already exited or been
killed, yet is still in the list of lwps. Should that happen, the
assert in x86_linux_update_debug_registers in
gdb/nat/x86-linux-dregs.c will trigger. The line in question looks
like this:
gdb_assert (lwp_is_stopped (lwp));
For this case, the lwp isn't stopped - it's dead.
The bug which brought this problem to my attention is one in which the
pwntools library uses GDB to debug a process; as the script is
shutting things down, it kills the process that GDB is debugging and
also sends GDB a SIGTERM signal, which causes GDB to detach all
inferiors prior to exiting. Here's a link to the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2192169
The following shell command mimics part of what the pwntools
reproducer script does (with regard to shutting things down), but
reproduces the bug much less reliably. I have found it necessary to
run the command a bunch of times before seeing the bug. (I usually
see it within 5-10 repetitions.) If you choose to try this command,
make sure that you have no running "cat" or "gdb" processes first!
cat </dev/zero >/dev/null & \
(sleep 5; (kill -KILL `pgrep cat` & kill -TERM `pgrep gdb`)) & \
sleep 1 ; \
gdb -q -iex 'set debuginfod enabled off' -ex 'set height 0' \
-ex c /usr/bin/cat `pgrep cat`
So, basically, the idea here is to kill both gdb and cat at roughly
the same time. If we happen to attempt the detach before the process
lwp has been deleted from GDB's (linux native) LWP data structures,
then the assert will trigger. The relevant part of the backtrace
looks like this:
#8 0x00000000008a83ae in x86_linux_update_debug_registers (lwp=0x1873280)
at gdb/nat/x86-linux-dregs.c:146
#9 0x00000000008a862f in x86_linux_prepare_to_resume (lwp=0x1873280)
at gdb/nat/x86-linux.c:81
#10 0x000000000048ea42 in x86_linux_nat_target::low_prepare_to_resume (
this=0x121eee0 <the_amd64_linux_nat_target>, lwp=0x1873280)
at gdb/x86-linux-nat.h:70
#11 0x000000000081a452 in detach_one_lwp (lp=0x1873280, signo_p=0x7fff8ca3441c)
at gdb/linux-nat.c:1374
#12 0x000000000081a85f in linux_nat_target::detach (
this=0x121eee0 <the_amd64_linux_nat_target>, inf=0x16e8f70, from_tty=0)
at gdb/linux-nat.c:1450
#13 0x000000000083a23b in thread_db_target::detach (
this=0x1206ae0 <the_thread_db_target>, inf=0x16e8f70, from_tty=0)
at gdb/linux-thread-db.c:1385
#14 0x0000000000a66722 in target_detach (inf=0x16e8f70, from_tty=0)
at gdb/target.c:2526
#15 0x0000000000a8f0ad in kill_or_detach (inf=0x16e8f70, from_tty=0)
at gdb/top.c:1659
#16 0x0000000000a8f4fa in quit_force (exit_arg=0x0, from_tty=0)
at gdb/top.c:1762
#17 0x000000000070829c in async_sigterm_handler (arg=0x0)
at gdb/event-top.c:1141
My colleague, Andrew Burgess, has done some recent work on other
problems with detach. Upon hearing of this problem, he came up with a test
case which reliably reproduces the problem and tests for a few other
problems as well. In addition to testing detach when the inferior has
terminated due to a signal, it also tests detach when the inferior has
exited normally. Andrew observed that the linux-native-only
"checkpoint" command would be affected too, so the test also tests
those cases when there's an active checkpoint.
For the LWP exit / termination case with no checkpoint, that's handled
via newly added checks of the waitstatus in detach_one_lwp in
linux-nat.c.
For the checkpoint detach problem, I chose to pass the lwp_info
to linux_fork_detach in linux-fork.c. With that in place, suitable
tests were added before attempting a PTRACE_DETACH operation.
I added a few asserts at the beginning of linux_fork_detach and
modified the caller code so that the newly added asserts shouldn't
trigger. (That's what the 'pid == inferior_ptid.pid' check is about
in gdb/linux-nat.c.)
Lastly, I'll note that the checkpoint code needs some work with regard
to background execution. This patch doesn't attempt to fix that
problem, but it doesn't make it any worse. It does slightly improve
the situation with detach because, due to the check noted above,
linux_fork_detach() won't be called for the wrong inferior when there
are multiple inferiors. (There are at least two other problems with
the checkpoint code when there are multiple inferiors. See:
https://sourceware.org/bugzilla/show_bug.cgi?id=31065)
This commit also adds a new test,
gdb.base/process-dies-while-detaching.exp. Andrew Burgess is the
primary author of this test case. Its design is similar to that of
gdb.threads/main-thread-exit-during-detach.exp, which was also written
by Andrew.
This test checks that GDB correctly handles several cases that can
occur when GDB attempts to detach an inferior process. The process
can exit or be terminated (e.g. via SIGKILL) prior to GDB's event
loop getting a chance to remove it from GDB's internal data
structures. To complicate things even more, detach works differently
when a checkpoint (created via GDB's "checkpoint" command) exists for
the inferior. This test checks all four possibilities: process exit
with no checkpoint, process termination with no checkpoint, process
exit with a checkpoint, and process termination with a checkpoint.
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
|
|
On Linux, threads are treated much like separate processes by the
kernel. In particular, it's possible to ptrace just a single thread.
If gdb tries to attach to a multi-threaded inferior, where a non-main
thread is already being traced (e.g., by strace), then gdb will get
into an infinite loop attempting to attach.
This patch fixes this problem by having the attach fail if ptrace
fails to attach to any thread of the inferior.
|
|
This changes linux_proc_attach_tgid_threads to use gdb_dir_up. This
makes it robust against exceptions.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
linux_proc_attach_tgid_threads computes a file name, and then
re-computes it for a warning. It is better to reuse the
already-computed name here.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
Fix two spots in aarch64-linux-tdep.c that build regcache_map_entry
arrays without a null terminator. The null terminators are needed for
regcache::transfer_regset and regcache_map_entry_size to work properly.
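For illustration, a correctly terminated array has an all-zero entry at
the end (this is a made-up example, not one of the actual arrays from
aarch64-linux-tdep.c; a regcache_map_entry holds a count, a register
number, and a size):
  static const struct regcache_map_entry example_regmap[] =
    {
      { 32, AARCH64_V0_REGNUM, 16 },  /* 32 vector regs, 16 bytes each.  */
      { 1, AARCH64_FPSR_REGNUM, 4 },
      { 1, AARCH64_FPCR_REGNUM, 4 },
      { 0 }                           /* Null terminator.  */
    };
Without the trailing { 0 }, code that walks the array has no way to know
where it ends.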
Change-Id: I3224a314e1360b319438f32de8c81e70ab42e105
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
Approved-By: Luis Machado <luis.machado@arm.com>
|
|
regcache::transfer_regset iterates over an array of regcache_map_entry,
transferring the registers (between regcache and buffer) described by
those entries. It stops either when it reaches the end of the
regcache_map_entry array (marked by a null entry) or (it seems like the
intent is) when it reaches the end of the buffer (in which case not all
described registers are transferred).
I said "seems like the intent is", because there appears to be a small
bug. transfer_regset is made of two loops:
foreach regcache_map_entry:
foreach register described by the regcache_map_entry:
if the register doesn't fit in the remainder of the buffer:
break
transfer register
When stopping because we have reached the end of the buffer, the break
only breaks out of the inner loop.
This problem causes some failures when I run tests such as
gdb.arch/aarch64-sme-core-3.exp (on AArch64 Linux, in qemu). This is
partly due to aarch64_linux_iterate_over_regset_sections failing to add
a null terminator in its regcache_map_entry array, but I think there is
still a problem in transfer_regset.
The sequence leading to the crash is:
- The `regcache_map_entry za_regmap` object built in
aarch64_linux_iterate_over_regset_sections does not have a null
terminator.
- When the target does not have a ZA register,
aarch64_linux_collect_za_regset calls `regcache->collect_regset` with
a size of 0 (it's actually pointless, but still it should work).
- transfer_regset gets called with a buffer size of 0.
- transfer_regset detects that the register to transfer wouldn't fit in
0 bytes, so it breaks out of the inner loop.
- The outer loop tries to read the next regcache_map_entry, but
there isn't one, and we start reading garbage.
Obviously, this would get fixed by making
aarch64_linux_iterate_over_regset_sections use a null terminator (which
is what the following patch does). But I think that when detecting that
there is not enough buffer left for the current register,
transfer_regset should return, not only break out of the inner loop.
This is a kind of contrived scenario, but imagine we have these two
regcache_map_entry objects:
- 2 registers of 8 bytes
- 2 registers of 4 bytes
For some reason, the caller passes a buffer of 12 bytes.
transfer_regset will detect that the second 8 byte register does not
fit, and break out of the inner loop. However, it will then go try the
next regcache_map_entry. It will see that it can fit one 4 byte
register in the remaining buffer space, and transfer it from/to there.
This is very likely not an expected behavior, we wouldn't expect to
read/write this sequence of registers from/to the buffer.
In this example, we don't know whether passing a 12-byte buffer makes
sense or whether it is a size computation bug in the caller, but I think
that exiting as soon as a register doesn't fit is the sane thing to do.
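The following self-contained sketch (simplified structures, not GDB's
actual code) demonstrates that 12-byte scenario and the difference
between breaking out of the inner loop and returning:
  #include <cstdio>
  #include <cstddef>

  struct map_entry { int count; int size; };  /* simplified regcache_map_entry */

  static void
  transfer (const map_entry *map, size_t buf_size, bool return_on_overflow)
  {
    size_t offset = 0;
    for (; map->count != 0; map++)
      for (int i = 0; i < map->count; i++)
        {
          if (offset + map->size > buf_size)
            {
              if (return_on_overflow)
                return;            /* proposed: stop the whole transfer */
              break;               /* current: only leaves the inner loop */
            }
          printf ("transfer %d bytes at offset %zu\n", map->size, offset);
          offset += map->size;
        }
  }

  int
  main ()
  {
    /* Two 8-byte registers, then two 4-byte registers; 12-byte buffer.  */
    const map_entry map[] = { { 2, 8 }, { 2, 4 }, { 0, 0 } };
    transfer (map, 12, false);  /* transfers 8 bytes, then surprisingly 4 more */
    transfer (map, 12, true);   /* transfers only the first 8 bytes */
    return 0;
  }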
Change-Id: Ia349627d2e5d281822ade92a8e7a4dea4f839e07
Reviewed-By: John Baldwin <jhb@FreeBSD.org>
Reviewed-By: Luis Machado <luis.machado@arm.com>
|
|
I noticed that the interpreters node in the docs links to the DAP
protocol docs, but I thought the DAP node ought to as well. This
patch adds a bit of introductory text there.
Approved-By: Eli Zaretskii <eliz@gnu.org>
|
|
Commit a3da2e7e550c4fe79128b5e532dbb90df4d4f418 introduced
regressions when testing using the READ1 mechanism. The reason for that
is the new failure path in proc test_gdb_complete_tab_unique, which
looks for GDB suggesting more than what the test inputted, but not the
correct answer followed by a white space. Consider the following case:
int foo(int bar, int baz);
Sending the command "break foo<tab>" to GDB will return
break foo(int, int)
which easily fits the buffer in normal testing, so everything works, but
when reading one character at a time, the test will find the partial
"break foo(int, " and assume that there was a mistake, so we get a
spurious FAIL.
That failure path was added because we wanted to avoid forcing a
completion failure to be detected through a timeout, which was previously
necessary because there is no way to verify that the output is done. The
motivation was that, while trying to solve a different problem, I kept
getting reading errors, and testing completion was frustrating.
This commit implements a better way to avoid that frustration, by first
testing gdb's complete command and, only if that passes, testing tab
completion. The difference is that when testing with the complete
command, we can tell that the output is over when we receive the GDB
prompt again, so we don't need to rely on timeouts. With this, the
change to test_gdb_complete_tab_unique has been removed, as that test
will only be run and fail in the very unlikely scenario that tab
completion is different from command completion.
Approved-By: Andrew Burgess <aburgess@redhat.com>
|
|
This patch simplifies pd_activate () in aix-thread.c in case of a failure
to start a thread session, since pd_activate () now has return type void.
This patch also fixes the shadowed declaration warning in sync_threadlists.
|
|
Fix these two warnings, when building on macos:
CXX cp-name-parser.o
/Users/smarchi/src/binutils-gdb/gdb/cp-name-parser.y:1644:7: error: fallthrough annotation does not directly precede switch label
[[fallthrough]];
^
CXX dbxread.o
/Users/smarchi/src/binutils-gdb/gdb/dbxread.c:2809:7: error: fallthrough annotation does not directly precede switch label
[[fallthrough]];
^
In these two cases, we have a [[fallthrough]] followed by a regular
label, followed by a case label. Move the [[fallthrough]] below the
regular label.
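An illustrative before/after sketch (made-up code, not the actual
snippets from cp-name-parser.y or dbxread.c):
  /* Before: the attribute is followed by a plain label, which clang
     rejects.  */
  case 1:
    step_one ();
    [[fallthrough]];
  shared:
  case 2:
    step_two ();
    break;

  /* After: the attribute sits below the plain label, directly before
     the case label.  */
  case 1:
    step_one ();
  shared:
    [[fallthrough]];
  case 2:
    step_two ();
    break;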
Change-Id: If4a3145139e050bdb6950c7f239badd5778e6f64
Approved-By: Tom Tromey <tom@tromey.com>
|
|
In gdb/configure the line:
...
$development || tentative_python_cflags="$tentative_python_cflags -DNDEBUG"
...
intends to ensure that -DNDEBUG is added to the python flags of a release
build.
However, when building gdb-14-branch we have:
...
configure:22024: checking compiler flags for python code
...
configure:22047: result: -fno-strict-aliasing -fwrapv
...
This is a regression since commit db6878ac553 ("Move sourcing of development.sh
to GDB_AC_COMMON"), which introduced a reference before assignment:
...
$development || tentative_python_cflags="$tentative_python_cflags -DNDEBUG"
...
. $srcdir/../bfd/development.sh
...
and consequently -DNDEBUG is never added.
[ This was not obvious to me, but apparently evaluating an empty or undefined
variable in this context is similar to using ':' or 'true', so the line is
evaluated as:
...
true || tentative_python_cflags="$tentative_python_cflags -DNDEBUG"
... ]
Fix this by moving GDB_AC_COMMON up in gdb/configure.ac, similar to how that
was done for gdbserver/configure.ac in commit db6878ac553.
[ Unfortunately, the move might introduce issues similar to the one we're
fixing, and I'm not sure how to check for this. Shellcheck doesn't detect
this type of problem. FWIW, I did run shellcheck (using arguments -xa, in the
src/gdb directory to make sure ../bfd/development.sh is taken into account)
before and after and observed that the number of lines/words/chars in the
shellcheck output is identical. ]
Built & tested on top of trunk.
Also built on top of gdb-14-branch, and observed this in gdb/config.log:
...
configure:25214: checking compiler flags for python code
...
configure:25237: result: -fno-strict-aliasing -fwrapv -DNDEBUG
...
Approved-By: Tom Tromey <tom@tromey.com>
PR build/31099
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31099
|
|
procfs.c doesn't currently compile on Solaris:
/vol/src/gnu/gdb/hg/master/dist/gdb/procfs.c: In member function ‘virtual int procfs_target::can_use_hw_breakpoint(bptype, int, int)’:
/vol/src/gnu/gdb/hg/master/dist/gdb/procfs.c:3017:9: error: ‘ptr_type’ was not declared in this scope; did you mean ‘var_types’?
3017 | type *ptr_type
| ^~~~~~~~
| var_types
This was caused by this patch:
commit 99d9c3b92ca96a7425cbb6b1bf453ede9477a2ee
Author: Simon Marchi <simon.marchi@efficios.com>
Date: Fri Sep 29 14:24:38 2023 -0400
gdb: remove target_gdbarch
Partially undoing it restores the build.
Tested on amd64-pc-solaris2.11.
|
|
C++17 makes the second parameter to static_assert optional, so we can
remove gdb_static_assert now.
|
|
index-write.c has a comment indicating that C++17's try_emplace could
be used. This patch makes the change.
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This changes the various gdb-related directories to use
-Wimplicit-fallthrough=5, meaning that only the fallthrough attribute
can be used in switches -- special 'fallthrough' comments will no
longer be usable.
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This changes gdb to use the C++17 [[fallthrough]] attribute rather
than special comments.
This was mostly done by script, but I neglected a few spellings and so
also fixed it up by hand.
I suspect this fixes the bug mentioned below, by switching to a
standard approach that, presumably, clang supports.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=23159
Approved-By: John Baldwin <jhb@FreeBSD.org>
Approved-By: Luis Machado <luis.machado@arm.com>
Approved-By: Pedro Alves <pedro@palves.net>
|
|
This commit makes the gdb.Command.complete methods more verbose when
it comes to error handling.
Prior to this commit, if any commands implemented in Python
implemented the complete method, and if there were any errors
encountered when calling that complete method, then GDB would silently
hide the error and continue as if there were no completions.
This makes it difficult to debug any errors encountered when writing
completion methods, and encourages the idea that Python extensions can
be broken, and GDB will just silently work around them.
I don't think this is a good idea. GDB should encourage extensions to
be written correctly, and robustly, and one way in which GDB can (I
think) support this, is by pointing out when an extension goes wrong.
In this commit I've gone through the Python command completion code,
and added calls to gdbpy_print_stack() or gdbpy_print_stack_or_quit()
in places where we were either clearing the Python error, or, in some
cases, just not handling the error at all.
One thing I have not changed is in cmdpy_completer (py-cmd.c) where we
process the list of completions returned from the Command.complete
method; this routine includes a call to gdbpy_is_string to check a
possible completion is a string; if not, the completion is ignored.
I was tempted to remove this check, attempt to complete each result to
a string, and display an error if the conversion fails. After all,
returning anything but a string is surely a mistake by the extension
author.
However, the docs clearly say that only strings within the returned
list will be considered as completions. Anything else is ignored. As
such, and to avoid (what I think is pretty unlikely) breakage of
existing code, I've retained the gdbpy_is_string check.
After the gdbpy_is_string check we call python_string_to_host_string;
if this call fails then I now print the error, where before we
ignored the error. I think this is OK; if GDB thinks something is a
string, but still can't convert it to a string, then I think it's OK
to display the error in that case.
Another case which I was a little unsure about was in
cmdpy_completer_helper, and the call to PyObject_CallMethodObjArgs,
which is when we actually call Command.complete. Previously, if this
call resulted in an exception then we would ignore this and just
pretend there were no completions.
Of all the changes, this is possibly the one with the biggest
potential for breaking existing scripts, but it is also, I think, the
most useful change. If the user code is wrong in some way, such that
an exception is raised, then previously the user would have no obvious
feedback about this breakage. Now GDB will print the exception for
them, making it, I think, much easier to debug their extension. But,
if there is user code in the wild that relies on raising an exception
as a means to indicate there are no completions .... well, that code
is going to break after this commit. I think we can live with this
though; the 'exception means no completions' behaviour was never
documented.
I also added a new error() call if the PyObject_CallMethodObjArgs call
raises an exception. This causes the completion mechanism within GDB
to stop. Within GDB the completion code is called twice, the first
time to compute the word break characters, and then a second time to
compute the actual completions.
If PyObject_CallMethodObjArgs raises an exception when computing the
word break character, and we print it by calling
gdbpy_print_stack_or_quit(), but then carry on as if
PyObject_CallMethodObjArgs had returned no completions, GDB will
call the Python completion code again, which results in another call
to PyObject_CallMethodObjArgs, which might raise the same exception
again. This results in the Python exception being printed twice.
By throwing a C++ exception after the failed
PyObject_CallMethodObjArgs call, the completion mechanism is aborted,
and no completions are offered. But importantly, the Python exception
is only printed once. I think this gives a much better user
experience.
I've added some tests to cover this case, as I think this is the most
likely case that a user will run into.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I spotted I made a small mistake in this commit:
commit aff250145af6c7a8ea9332bc1306c1219f4a63db
Date: Fri Nov 24 12:04:36 2023 +0000
gdb: generate gdb-index identically regardless of work thread count
In this commit I added a new proc in testsuite/lib/gdb.exp called
gdb_get_worker_threads. This proc uses gdb_test_multiple with two
possible patterns. One pattern is anchored with '^', while the other
is missing the '^' which it could use.
This commit adds the missing '^'.
|
|
Fixes this issue, introduced by f9582a22dba7 ("[gdb] Fix segfault in
for_each_block, part 1"):
CXX darwin-nat.o
/Users/smarchi/src/binutils-gdb/gdb/darwin-nat.c:1169:7: error: no matching function for call to 'breakpoint_inserted_here_p'
if (breakpoint_inserted_here_p (inf->aspace, pc))
^~~~~~~~~~~~~~~~~~~~~~~~~~
Change-Id: I3bb6be75b650319f0fa1dbdceb379b18531da96c
|
|
DAP specifies a "process" event that is sent when a process is started
or attached to. gdb was not emitting this (several DAP clients appear
to ignore it entirely), but it looked easy and harmless to implement.
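For reference, a "process" event as described by the DAP specification
looks roughly like this on the wire (the field values here are invented):
  { "type": "event", "event": "process",
    "body": { "name": "/tmp/a.out",
              "systemProcessId": 1234,
              "isLocalProcess": true,
              "startMethod": "attach" } }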
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30473
|
|
I noticed in gdb/tui/tui-stack.c a source-level micro-optimization where
strlen with a string literal argument:
...
strlen ("bla")
...
is replaced with sizeof:
...
sizeof ("bla") - 1
...
The benefit of this is that the optimization is also done at O0, but the
drawback is that it makes the expression harder to read.
Use const std::string to encapsulate the string literals, and use
std::string::size () instead.
I tried making the string names (PROC_PREFIX, LINE_PREFIX, PC_PREFIX and
SINGLE_KEY) lower-case, but that clashed with a pre-existing pc_prefix, so
I've left them upper-case.
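A small sketch of the resulting style (the literal shown is only an
example, not necessarily the exact prefix text used in tui-stack.c):
  /* Before: the length is computed with sizeof at every use.  */
  size_t prefix_len = sizeof ("In: ") - 1;

  /* After: encapsulate the literal and ask it for its size.  */
  static const std::string PROC_PREFIX ("In: ");
  size_t prefix_len = PROC_PREFIX.size ();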
Tested on x86_64-linux.
Tested-By: Alexandra Petlanova Hajkova <ahajkova@redhat.com>
|
|
The make-check-all.sh script (gdb/testsuite/make-check-all.sh) is
great; it makes it super easy to run some test(s) using all the
available board files.
This commit aims to make this script even easier to access by adding a
check-all-boards target to the GDB Makefile. This new target checks
for (and requires) a number of environment variables, so the target
should be used like this:
make check-all-boards GDB_TARGET_USERNAME=remote-target \
GDB_HOST_USERNAME=remote-host \
TESTS="gdb.base/break.exp"
Where GDB_TARGET_USERNAME and GDB_HOST_USERNAME are the user names
that should be passed to the make-check-all.sh --target-user and
--host-user command line options respectively.
My personal intention is to set these variables in my environment, so
all I'll need to do is:
make check-all-boards TESTS="gdb.base/break.exp"
The make rule always passes --keep-results to the make-check-all.sh
script, as I find that the most useful. It's super frustrating to run
the tests and realise you forgot that option and the results have been
discarded.
|
|
I have been making more use of the make-check-all.sh script to run
tests against all boards.
But one thing is pretty annoying. When a test fails on some random
board, I have to run make-check-all.sh with --verbose and --dry-run in
order to see what RUNTESTFLAGS I should be using.
I always run with --keep-results on, so, in this commit, I propose
that, when --keep-results is on, the 'make check' command will be
written out to a file within the stored results directory, like:
check-all/BOARD_NAME/make-check.sh
then, if I want to rerun a test, I can just:
sh check-all/BOARD_NAME/make-check.sh
and the test will be re-run for me.
|
|
Similar to the previous commit, this commit ensures that the dwarf-5
index files are generated identically as the number of worker-threads
changes.
Building the dwarf-5 index makes use of a closed hash table, the
bucket_hash local within debug_names::build(). Entries are added to
bucket_hash from m_name_to_value_set, which, in turn, is populated
by calls to debug_names::insert() in write_debug_names. The insert
calls are ordered based on the entries within the cooked_index, and
the ordering within cooked_index depends on the number of worker
threads that GDB is using.
My proposal is to sort each chain within the bucket_hash closed hash
table prior to using this to build the dwarf-5 index.
The buckets within bucket_hash will always have the same ordering (for
a given GDB build with a given executable), and by sorting the chains
within each bucket, we can be sure that GDB will see each entry in a
deterministic order.
I've extended the index creation test to cover this case.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
It was observed that changing the number of worker threads that GDB
uses (maintenance set worker-threads NUM) would have an impact on the
layout of the generated gdb-index.
The cause seems to be how the CUs are distributed between threads;
symbols that appear in multiple CUs can then be encountered earlier or
later depending on whether a particular CU moves between threads.
I certainly found this behaviour was reproducible when generating an
index for GDB itself, like:
gdb -q -nx -nh -batch \
-eiex 'maint set worker-threads NUM' \
-ex 'save gdb-index /tmp/'
And then setting different values for NUM will change the generated
index.
Now, the question is: does this matter?
I would like to suggest that yes, this does matter. At Red Hat we
generate a gdb-index as part of the build process, and we would
ideally like to have reproducible builds: for the same source,
compiled with the same tool-chain, we should get the exact same output
binary. And we do .... except for the index.
Now we could simply force GDB to only use a single worker thread when
we build the index, but, I don't think the idea of reproducible builds
is that strange, so I think we should ensure that our generated
indexes are always reproducible.
To achieve this, I propose that we add an extra step when building the
gdb-index file. After constructing the initial symbol hash table
contents, we will pull all the symbols out of the hash, sort them,
then re-insert them in sorted order. This will ensure that the
structure of the generated hash will remain consistent (given the same
set of symbols).
I've extended the existing index-file test to check that the generated
index doesn't change if we adjust the number of worker threads used.
Given that this test is already rather slow, I've only made one change
to the worker-thread count. Maybe this test should be changed to use
a smaller binary, which is quicker to load, and for which we could
then try many different worker thread counts.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Make static the functions add_index_entry, find_slot, and hash_expand,
member functions of the mapped_symtab class.
Fold an additional snippet of code from write_gdbindex into
mapped_symtab::minimize; this code relates to minimisation, so this
seems like a good home for it.
Make the n_elements, data, and m_string_obstack member variables of
mapped_symtab private. Provide a new obstack() member function to
provide access to the obstack when needed, and also add member
functions begin(), end(), cbegin(), and cend() so that the
mapped_symtab class can be treated like a contained and iterated
over.
I've also taken this opportunity to split out the logic for whether
the hash table (m_data) needs expanding, this is the new function
hash_needs_expanding. This will be useful in a later commit.
There should be no user visible changes after this commit.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
I noticed in passing that our algorithm for generating the gdb-index
file is incorrect. When building the hash table in add_index_entry we
count every incoming entry and rehash when the number of entries gets
too large. However, some of the incoming entries will be duplicates,
which don't actually result in new items being added to the hash
table.
As a result, we grow the gdb-index hash table far too often.
With an unmodified GDB, generating a gdb-index for GDB, I see a file
size of 90M, with a hash usage (in the generated index file) of just
2.6%.
With a patched GDB, generating a gdb-index for the _same_ GDB binary,
I now see a gdb-index file size of 30M, with a hash usage of 41.9%.
This is a 67% reduction in gdb-index file size.
Obviously, not every gdb-index file is going to see such big savings,
however, the larger a program, and the more symbols that are
duplicated between compilation units, the more GDB would over count,
and so, over-grow the index.
The gdb-index hash table we create has a minimum size of 1024, and
then we grow the hash when it is 75% full, doubling the hash table at
that time. Given this, we expect that either:
a. The hash table is size 1024, and less than 75% full, or
b. The hash table is between 37.5% and 75% full (a table that has just
been doubled at the 75% point starts out 37.5% full).
I've included a test that checks some of these constraints -- I've not
bothered to check the upper limit, as an over-full hash table isn't
really a problem here, but if the fill percentage is less than 37.5%
then this indicates that we've done something wrong (obviously, I also
check for the 1024 minimum size).
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Split out the code that makes a copy of the GDB executable ready for
self testing into a new proc. A later commit in this series wants to
load the GDB executable into GDB (for creating an on-disk debug
index), but doesn't need to make use of the full do_self_tests proc.
There should be no changes in what is tested after this commit.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Add proper support for option completion to the 'save gdb-index'
command. Update save_gdb_index_command function to make use of the
new option_def data structures for parsing the '-dwarf-5' option.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
Add a call to gdb_tilde_expand in the save_gdb_index_command function,
this means that we can now do:
(gdb) save gdb-index ~/blah/
Previously this wouldn't work.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
The previous commit describes PR gdb/30547, a segfault when running test-case
gdb.base/vfork-follow-parent.exp on powerpc64 (likewise on s390x).
The root cause for the segmentation fault is that linux_is_uclinux gives an
incorrect result: it returns true instead of false.
So, why does linux_is_uclinux:
...
int
linux_is_uclinux (void)
{
CORE_ADDR dummy;
return (target_auxv_search (AT_NULL, &dummy) > 0
&& target_auxv_search (AT_PAGESZ, &dummy) == 0);
...
return true?
This is because ppc_linux_target_wordsize returns 4 instead of 8, causing
ppc_linux_nat_target::auxv_parse to misinterpret the auxv vector.
So, why does ppc_linux_target_wordsize:
...
int
ppc_linux_target_wordsize (int tid)
{
int wordsize = 4;
/* Check for 64-bit inferior process. This is the case when the host is
64-bit, and in addition the top bit of the MSR register is set. */
long msr;
errno = 0;
msr = (long) ptrace (PTRACE_PEEKUSER, tid, PT_MSR * 8, 0);
if (errno == 0 && ppc64_64bit_inferior_p (msr))
wordsize = 8;
return wordsize;
}
...
return 4?
Specifically, we get this result because tid == 0, so we get
errno == ESRCH.
The tid == 0 is caused by the switch_to_no_thread in
handle_vfork_child_exec_or_exit:
...
/* Switch to no-thread while running clone_program_space, so
that clone_program_space doesn't want to read the
selected frame of a dead process. */
scoped_restore_current_thread restore_thread;
switch_to_no_thread ();
inf->pspace = new program_space (maybe_new_address_space ());
...
but moving the maybe_new_address_space call to before that gives us the
same result. The tid is no longer 0, but we still get ESRCH because the
thread has exited.
Fix this in handle_vfork_child_exec_or_exit by doing the
maybe_new_address_space call in the context of the vfork parent.
Tested on top of trunk on x86_64-linux and ppc64le-linux.
Tested on top of gdb-14-branch on ppc64-linux.
Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
PR gdb/30547
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
|
|
When running test-case gdb.base/vfork-follow-parent.exp on powerpc64 (likewise
on s390x), I run into:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: \
exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: \
resolution_method=schedule-multiple: print unblock_parent = 1
continue^M
Continuing.^M
Reading symbols from vfork-follow-parent-exit...^M
^M
^M
Fatal signal: Segmentation fault^M
----- Backtrace -----^M
0x1027d3e7 gdb_internal_backtrace_1^M
src/gdb/bt-utils.c:122^M
0x1027d54f _Z22gdb_internal_backtracev^M
src/gdb/bt-utils.c:168^M
0x1057643f handle_fatal_signal^M
src/gdb/event-top.c:889^M
0x10576677 handle_sigsegv^M
src/gdb/event-top.c:962^M
0x3fffa7610477 ???^M
0x103f2144 for_each_block^M
src/gdb/dcache.c:199^M
0x103f235b _Z17dcache_invalidateP13dcache_struct^M
src/gdb/dcache.c:251^M
0x10bde8c7 _Z24target_dcache_invalidatev^M
src/gdb/target-dcache.c:50^M
...
or similar.
The root cause for the segmentation fault is that linux_is_uclinux gives an
incorrect result: it should always return false, given that we're running on a
regular linux system, but instead it returns first true, then false.
In more detail, the segmentation fault happens as follows:
- a program space with an address space is created
- a second program space is about to be created. maybe_new_address_space
is called, and because linux_is_uclinux returns true, maybe_new_address_space
returns false, and no new address space is created
- a second program space with the same address space is created
- a program space is deleted. Because linux_is_uclinux now returns false,
gdbarch_has_shared_address_space (current_inferior ()->arch ()) returns
false, and the address space is deleted
- when gdb uses the address space of the remaining program space, we run into
the segfault, because the address space is deleted.
Hardcoding linux_is_uclinux to false makes the test-case pass.
We leave addressing the root cause for the following commit in this series.
For now, prevent the segmentation fault by making the address space a refcounted
object.
This was already suggested here [1]:
...
A better solution might be to have the address spaces be reference counted
...
Tested on top of trunk on x86_64-linux and ppc64le-linux.
Tested on top of gdb-14-branch on ppc64-linux.
Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
PR gdb/30547
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
[1] https://sourceware.org/pipermail/gdb-patches/2023-October/202928.html
|
|
If a target provides a target description including registers from the
XSAVE extended region, but does not provide an XSAVE layout, use a
fallback XSAVE layout based on the included registers. This fallback
layout matches GDB's behavior in earlier releases which assumes the
layout from Intel CPUs.
This fallback layout is currently only used for remote targets since
native targets which support XSAVE provide an explicit layout derived
from CPUID.
PR gdb/30912
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30912
Approved-By: Simon Marchi <simon.marchi@efficios.com>
|
|
We have a target board cc-with-gdb-index that uses the gdb-add-index script to
add a .gdb_index index to an exec.
There is however an alternative way of adding a .gdb_index: the index-cache.
Add a new target board cc-with-index-cache.
This is not superfluous for two reasons:
- there is functionality that gdb-add-index doesn't support, but the
index-cache does: the index-cache can add an index to an exec with a
.gnu_debugaltlink (note that when using the cc-with-gdb-index board this
case is quietly ignored), and
- using the index-cache is excercised in only a few test-cases, and having
this target board extends the test coverage to the entire test suite. This
is for instance relevant because the index-cache is written by a worker
thread in the background, so we can check more thoroughly for data races
(see PR symtab/30837).
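For reference, the new board is selected like any other board file, along
the lines of (the test name here is just an example):
  make check TESTS="gdb.base/break.exp" \
    RUNTESTFLAGS="--target_board=cc-with-index-cache"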
Tested on x86_64-linux.
Shell script changes checked with shellcheck.
Approved-By: Tom Tromey <tom@tromey.com>
|
|
This changes serial_readchar to throw an exception rather than trying
to set and preserve errno.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
This changes serial_send_break and serial_write to throw exceptions
rather than attempt to set errno and return an error indicator. This
lets us correctly report failures on Windows.
Both functions had to be converted in a single patch because one
implementation of send_break works via write.
This also introduces remote_serial_send_break to handle error checking
when attempting to send a break; such errors were previously ignored.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
remote.c assumes that a failure to open the serial connection will set
errno. This is somewhat true, because the Windows code tries to set
errno appropriately -- but only somewhat, because it isn't clear that
the "pex" code sets it, and the tcp code seems to do the wrong thing.
It seems better to simply have the serial open functions throw on
error.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
remote.c has this code:
if (serial_setbaudrate (rs->remote_desc, baud_rate))
{
/* The requested speed could not be set. Error out to
top level after closing remote_desc. Take care to
set remote_desc to NULL to avoid closing remote_desc
more than once. */
serial_close (rs->remote_desc);
rs->remote_desc = NULL;
perror_with_name (name);
The perror here cannot be correct, because if serial_setbaudrate did
set errno, it may be obscured by serial_close.
This patch changes serial_setbaudrate to throw an exception instead.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
This introduces throw_winerror_with_name, a Windows analog of
perror_with_name, and changes various places in gdb to call it.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
The ClearCommBreak documentation says:
If the function fails, the return value is zero.
ser_windows_send_break inverts this check. This has never been
noticed because the caller doesn't check the result.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30770
|
|
While working on cancellation, I noticed that a DAP 'pause' request
would set the "do not emit the continue" flag. This meant that a
subsequent request that should provoke a 'continue' event would
instead suppress the event.
I then tried writing a more obvious test case for this, involving an
inferior call -- and discovered that gdb.events.cont does not fire for
an inferior call.
This patch installs a new event listener for gdb.events.inferior_call
and arranges for this to emit continue and stop events when
appropriate. It also fixes the original bug, by adding a check to
exec_and_expect_stop.
|
|
Make it return a bool and adjust a few comparisons where it's used.
Change-Id: Ic77d23b0dcfcfc9195dfe65e4c7ff9cf3229f6fb
|