|
the frame cache
The test gdb.base/frame-view.exp fails like this on AArch64:
frame^M
#0 baz (z1=hahaha, /home/simark/src/binutils-gdb/gdb/value.c:4056: internal-error: value_fetch_lazy_register: Assertion `next_frame != NULL' failed.^M
A problem internal to GDB has been detected,^M
further debugging may prove unreliable.^M
FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error)
The sequence of events leading to this is the following:
- When we create the user frame (the "select-frame view" command), we
create a sentinel frame just for our user-created frame, in
create_new_frame. This sentinel frame has the same id as the regular
sentinel frame.
- When printing the frame, after doing the "select-frame view" command,
the argument's pretty printer is invoked, which does an inferior
function call (this is the point of the test). This clears the frame
cache, including the "real" sentinel frame, which sets the
sentinel_frame global to nullptr.
- Later in the frame-printing process (when printing the second
argument), the auto-reinflation mechanism re-creates the user frame
by calling create_new_frame again, creating its own special sentinel
frame again. However, note that the "real" sentinel frame, the
sentinel_frame global, is still nullptr. If the selected frame had
been a regular frame, we would have called get_current_frame at some
point during the reinflation, which would have re-created the "real"
sentinel frame. But it's not the case when reinflating a user frame.
- Deep down the stack, something wants to fill in the unwind stop
reason for frame 0, which requires trying to unwind frame 1. This
leads us to trying to unwind the PC of frame 1:
#0 gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955
#1 0x000000000134569c in dwarf2_tailcall_sniffer_first (this_frame=..., tailcall_cachep=0xffff773fcae0, entry_cfa_sp_offsetp=0xfffff7f7d450)
at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390
#2 0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089
#3 0x00000000013562b0 in dwarf2_frame_unwind_stop_reason (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101
#4 0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2281
#5 0x0000000001993034 in get_prev_frame_always (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2376
#6 0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:3051
#7 0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356
#8 0x000000000132122c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", op_end=0xffff8d8883ee "\217\002")
at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110
#9 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
#10 0x000000000131d68c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811
#11 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
#12 0x0000000001314c3c in dwarf_expr_context::evaluate (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, per_cu=0xffff90b03700, frame=..., addr_info=0x0,
type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078
#13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980,
subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513
#14 0x00000000014a0100 in dwarf2_evaluate_loc_desc (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true)
at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557
#15 0x00000000014aa584 in locexpr_read_variable (symbol=0xffff8f6cd770, frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052
- AArch64 defines a special "prev register" function,
aarch64_dwarf2_prev_register, to handle unwinding the PC. This
function does
frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM);
- frame_unwind_register_unsigned ultimately creates a lazy register
value, saving the frame id of this_frame->next. this_frame is the
user-created frame, to this_frame->next is the special sentinel frame
we created for it. So the saved ID is the sentinel frame ID.
- When time comes to un-lazify the value, value_fetch_lazy_register
calls frame_find_by_id, to find the frame with the ID we saved.
- frame_find_by_id sees it's the sentinel frame ID, so returns the
sentinel_frame global, which is, if you remember, nullptr.
- We hit the `gdb_assert (next_frame != NULL)` assertion in
value_fetch_lazy_register.
The issues I see here are:
- The ID of the sentinel frame created for the user-created frame is
not distinguishable from the ID of the regular sentinel frame. So
there's no way frame_find_by_id could find the right frame, in
value_fetch_lazy_register.
- Even if they had distinguishable IDs, sentinel frames created for
user frames are not registered anywhere, so there's no easy way
frame_find_by_id could find it.
This patch addresses these two issues:
- Give sentinel frames created for user frames their own distinct IDs
- Register sentinel frames in the frame cache, so they can be found
with frame_find_by_id.
I initially had this split in two patches, but I then found that it was
easier to explain as a single patch.
Rergarding the first part of the change: with this patch, the sentinel
frames created for user frames (in create_new_frame) still have
stack_status == FID_STACK_SENTINEL, but their code_addr and stack_addr
fields are now filled with the addresses used to create the user frame.
This ensures this sentinel frame ID is different from the "target"
sentinel frame ID, as well as any other "user" sentinel frame ID. If
the user tries to create the same frame, with the same addresses,
multiple times, create_sentinel_frame just reuses the existing frame.
So we won't end up with multiple user sentinels with the same ID.
Regular "target" sentinel frames remain with code_addr and stack_addr
unset.
The concrete changes for that part are:
- Remove the sentinel_frame_id constant, since there isn't one
"sentinel frame ID" now. Add the frame_id_build_sentinel function
for building sentinel frame IDs and a is_sentinel_frame_id function
to check if a frame id represents a sentinel frame.
- Replace the sentinel_frame_id check in frame_find_by_id with a
comparison to `frame_id_build_sentinel (0, 0)`. The sentinel_frame
global is meant to contain a reference to the "target" sentinel, so
the one with addresses (0, 0).
- Add stack and code address parameters to create_sentinel_frame, to be
able to create the various types of sentinel frames.
- Adjust get_current_frame to create the regular "target" sentinel.
- Adjust create_new_frame to create a sentinel with the ID specific to
the created user frame.
- Adjust sentinel_frame_prev_register to get the sentinel frame ID from
the frame_info object, since there isn't a single "sentinel frame ID"
now.
- Change get_next_frame_sentinel_okay to check for a
sentinel-frame-id-like frame ID, rather than for sentinel_frame
specifically, since this function could be called with another
sentinel frame (and we would want the assert to catch it).
The rest of the change is about registering the sentinel frame in the
frame cache:
- Change frame_stash_add's assertion to allow sentinel frame levels
(-1).
- Make create_sentinel_frame add the frame to the frame cache.
- Change the "sentinel_frame != NULL" check in reinit_frame_cache for a
check that the frame stash is not empty. The idea is that if we only
have some user-created frames in the cache when reinit_frame_cache is
called, we probably want to emit the frames invalid annotation. The
goal of that check is to avoid unnecessary repeated annotations, I
suppose, so the "frame cache not empty" check should achieve that.
After this change, I think we could theoritically get rid of the
sentienl_frame global. That sentinel frame could always be found by
looking up `frame_id_build_sentinel (0, 0)` in the frame cache.
However, I left the global there to avoid slowing the typical case down
for nothing. I however, noted in its comment that it is an
optimization.
With this fix applied, the gdb.base/frame-view.exp now passes for me on
AArch64. value_of_register_lazy now saves the special sentinel frame ID
in the value, and value_fetch_lazy_register is able to find that
sentinel frame after the frame cache reinit and after the user-created
frame was reinflated.
Tested-By: Alexandra Petlanova Hajkova <ahajkova@redhat.com>
Tested-By: Luis Machado <luis.machado@arm.com>
Change-Id: I8b77b3448822c8aab3e1c3dda76ec434eb62704f
|
|
Currently, despite having a smart pointer for frame_infos, GDB may
attempt to use an invalidated frame_info_ptr, which would cause internal
errors to happen. One such example has been documented as PR
python/28856, that happened when printing frame arguments calls an
inferior function.
To avoid failures, the smart wrapper was changed to also cache the frame
id, so the pointer can be reinflated later. For this to work, the
frame-id stuff had to be moved to their own .h file, which is included
by frame-info.h.
Frame_id caching is done explicitly using the prepare_reinflate method.
Caching is done manually so that only the pointers that need to be saved
will be, and reinflating has to be done manually using the reinflate
method because the get method and the -> operator must not change
the internals of the class. Finally, attempting to reinflate when the
pointer is being invalidated causes the following assertion errors:
check_ptrace_stopped_lwp_gone: assertion `lp->stopped` failed.
get_frame_pc: Assertion `frame->next != NULL` failed.
As for performance concerns, my personal testing with `time make
chec-perf GDB_PERFTEST_MODE=run` showed an actual reduction of around
10% of time running.
This commit also adds a testcase that exercises the python/28856 bug with
7 different triggers, run, continue, step, backtrace, finish, up and down.
Some of them can seem to be testing the same thing twice, but since this
test relies on stale pointers, there is always a chance that GDB got lucky
when testing, so better to test extra.
Regression tested on x86_64, using both gcc and clang.
Approved-by: Tom Tomey <tom@tromey.com>
|