From 9247f052848f3961402754ebfbc1bc28cf0857a5 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath
Date: Mon, 4 Nov 2024 02:42:05 -0600
Subject: Fix AIX core dump while main thread exits.

Consider the test case:

#include <iostream>
#include <pthread.h>
#include <unistd.h>

void *thread_main(void *)
{
  std::cout << getpid() << std::endl;
  sleep(20);
  return nullptr;
}

int main(void)
{
  pthread_t thread;
  pthread_create(&thread, nullptr, thread_main, nullptr);
  pthread_join(thread, nullptr);
  return 0;
}

In this program, main creates a thread that sleeps for 20 seconds.

Debugging it with GDB gives:

Reading symbols from ./test...
(gdb) b main
Breakpoint 1 at 0x10000934: file test.c, line 11.
(gdb) r
Starting program: /read_only_gdb/binutils-gdb/gdb/test

Breakpoint 1, main () at test.c:11
11 pthread_create(&thread, nullptr, thread_main, nullptr);
(gdb) c
Continuing.
15335884
[New Thread 258 (tid 31130079)]

Thread 2 received signal SIGINT, Interrupt.
[Switching to Thread 258 (tid 31130079)]
0xd0611d70 in _p_nsleep () from /usr/lib/libpthread.a(_shr_xpg5.o)
(gdb) thread 1
[Switching to thread 1 (Thread 1 (tid 25493845))]
(gdb) c
Continuing.
[Thread 1 (tid 25493845) exited]
[Thread 258 (tid 31130079) exited]
inferior.c:405: internal-error: find_inferior_pid: Assertion `pid != 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----

There are two bugs here.  One is the core dump.  The other is that the
main thread's information is not captured.

While debugging the first part, I found that the cause was the last for
loop in sync_threadlists ().  Once both threads exit, we delete them as
below:

for (struct thread_info *it : all_threads ())
  {
    if (in_queue_threads.count (priv->pdtid) == 0
        && in_thread_list (proc_target, it->ptid)
        && pid == it->ptid.pid ())
      {
        delete_thread (it);
        data->exited_threads.insert (priv->pdtid);

But once these two threads are deleted, all_threads () yields one more
thread whose tid and pid are 0.

(gdb) c
Continuing.
In for loop 8782296 is pid, 19857879 is tid
[Thread 1 (tid 19857879) exited]
In for loop 8782296 is pid, 30933401 is tid
[Thread 258 (tid 30933401) exited]
In for loop 0 is pid, 0 is tid
[Inferior 1 (process 8782296) exited normally]
(gdb) q

I added a printf to the for loop mentioned above for explanation.  As
you can see, the loop is entered a third time with 0 as the pid.  The
reason is that the threads are removed but not deleted, since they are
not deletable ().  Hence we use the all_threads_safe () iterator
instead.

The second part of the bug is the missing information for the main
thread.  Andrew was right here
(https://sourceware.org/pipermail/gdb-patches/2024-September/211875.html)
Thank you Andrew.

The thread had loaded, but the ptrace () call made when we tried to
fetch_regs_kernel_thread failed, with errno set to EPERM.

if (!ptrace32 (PTT_READ_GPRS, tid, (uintptr_t) gprs32, 0, NULL))
  memset (gprs32, 0, sizeof (gprs32));

Hence all registers were set to 0 and we did not get the required
information.  This issue will be fixed within the AIX ptrace call.

Approved By: Ulrich Weigand
---
 gdb/aix-thread.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index 9e6952b..4a050cd 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -854,7 +854,7 @@ sync_threadlists (pid_t pid)
      thread exits and gets into a PST_UNKNOWN state. So this thread
      will not run in the above for loop. Therefore the below for loop
      is to manually delete such threads. */
-  for (struct thread_info *it : all_threads ())
+  for (struct thread_info *it : all_threads_safe ())
     {
       aix_thread_info *priv = get_aix_thread_info (it);
       if (in_queue_threads.count (priv->pdtid) == 0
-- 
cgit v1.1
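
Note on the fix: the one-line change matters because the loop body calls delete_thread () on the element it is currently visiting. The sketch below illustrates the general idea behind a deletion-safe walk: remember the successor before the current element can go away. It is a minimal, standalone illustration using std::list, not GDB's implementation. The names fake_thread and remove_threads_of are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical stand-in for a thread entry; this is not GDB's thread_info.
struct fake_thread
{
  int pid;
  int tid;
};

// Erase every entry belonging to PID while walking THREADS.
// The successor is saved before the current node is erased, so the
// walk never advances through an iterator that has been invalidated.
// Returns the number of entries removed.
static int
remove_threads_of (std::list<fake_thread> &threads, int pid)
{
  int removed = 0;
  for (auto it = threads.begin (); it != threads.end (); )
    {
      auto next = std::next (it);  // Saved before IT can be invalidated.
      if (it->pid == pid)
        {
          threads.erase (it);      // Invalidates IT only; NEXT stays valid.
          ++removed;
        }
      it = next;
    }
  return removed;
}
```

With a plain range-based for loop, the increment would be applied to the element that was just erased, which is undefined behaviour for std::list and is the same class of hazard as deleting a thread while iterating the thread list.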