
GDB hangs during remote debugging, library version mismatches

I'm using linux and am trying to remote debug a program.

I launch gdbserver on the target, from .xinitrc with

gdbserver localhost:9134 /root/game/game

On my local pc, which I'm running inside a virtualbox vm, I connect to the target from gdb with

target remote 192.168.1.20:9134

and it connects fine. I can set a breakpoint at main with

b main

and then I can continue and it will break there. I can single-step for a while until it reaches the call to SDL_Init(), from which it never returns to gdb. If I don't single-step to SDL_Init but instead set a breakpoint further on in the program, the program starts up and runs normally (so it gets past SDL_Init). But when it reaches the breakpoint, it freezes on the target machine and gdb on my local machine never shows a prompt. The entire thing hangs and must be restarted. The target is not completely frozen, however: the mouse pointer still moves and you can ping it, but the gdb connection no longer works. So it seems that the graphics system somehow interferes with this, since breakpoints before the graphics system init work, but those after it do not.

I've tried setting the remotetimeout setting to 500 seconds and it exhibits the same behaviour. When I ping the remote target from my local pc the reported time is around 0.3 to 0.4 ms. So that doesn't seem out of the ordinary, but I wouldn't rule out any other misconfigured network settings on my part.

It's on a legacy system (but hey, it still makes money) with gdbserver version 6.8-19.fc10 and gdb version 6.8-29.fc10. Upgrading versions, while a very large headache, could be possible but probably should not be necessary (any upgrades I make to my pc have to also be done to a state regulator's system, as they use the gdb setup for their testing purposes, but it's not impossible). Remote debugging was working in the past before I took over the project, and no one who worked on it before is still around to ask. The gdbserver version definitely worked, as I'm using the exact program used previously.

Update 1:
I updated the gdb version on the host machine to version 7.0.1 and it still exhibits the same behavior. I couldn't do version 8 as it needs a C++11 compiler and the legacy system is before that time.

Update 2:
I've tried this in another virtual machine and I even built a fresh dedicated linux install (so no vm), rebuilt the software, and I get the same behavior. So it appears the issue is probably on the target machine's configuration.

Update 3:
I dug out a serial cable and finally got the remote debugging setup via serial. It still doesn't work, but it gives me more error messages. I get the error

gdbserver: error initializing thread_db library: version mismatch between libthread_db and libpthread

which I think makes sense, since my breakpoints quit working after the graphics system is initialized, which involves creating some threads. After googling that error, I've tried using set solib-absolute-prefix, set solib-search-path, and set sysroot to point at the root directory of a copy of the target's filesystem on the host machine (on the host, that is /fw_dev/fgs/cf/initrd/expand, which contains the filesystem that the initrd is made from). But then when I try to set breakpoints, I get Error accessing memory address 0xb5eb60: Input/output error. I've also tried setting those variables to the lib subdirectory, which doesn't work either. I also tried just copying the local thread libraries from the host's /lib directory to /lib on the target, but then X Windows won't even start.
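
For reference, the sysroot attempt described above could be captured in a small GDB command file so that the settings are applied in the same order every time. This is only a sketch: the helper name make_gdb_script is hypothetical, and it assumes the program sits at the same path inside the filesystem copy as it does on the target. It sets the sysroot before connecting, so that shared libraries are looked up under the copy rather than in the host's /lib.

```shell
#!/bin/sh
# Hypothetical helper: print a GDB command file for a remote session.
#   $1 = root of the target filesystem copy on the host
#   $2 = path of the program as seen on the target (assumed to exist under $1 too)
#   $3 = host:port that gdbserver listens on
make_gdb_script() {
    sysroot=${1%/}   # drop a trailing slash so the paths join cleanly
    program=$2
    remote=$3
    cat <<EOF
set sysroot $sysroot
file $sysroot$program
target remote $remote
EOF
}

# Values taken from the question; redirect the output to a file to use it.
make_gdb_script /fw_dev/fgs/cf/initrd/expand /root/game/game 192.168.1.20:9134
```

Saving the output as, say, remote.gdb and starting gdb with gdb -x remote.gdb would replay the three commands. It does not by itself explain the Input/output error, but it rules out typing the settings in a different order between attempts.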

Update #4:
I tried launching gdb from the root of the copy of the target filesystem on the host (/fw_dev/fgs/cf/initrd/expand), and gdb still hangs on breakpoints but I no longer get the error message about libthread_db and libpthread mismatches, so back to the drawing board.

Update #5:
Maybe I'm getting to where I should ask this as a separate question, but I compiled gdb, then ran gdb on itself. I then used file to set it to the program on the host, set the remote target, set my breakpoints, and ran continue. When I get to the breakpoint, gdb hangs as always. But now when I press ctrl-c in gdb, I get this backtrace

#0  0x00110416 in __kernel_vsyscall ()
#1  0x00b3f39d in ___newselect_nocancel () from /lib/libc.so.6
#2  0x08203b9a in ser_base_wait_for (scb=0x96a2930, timeout=1) at ser-base.c:206
#3  0x08203c89 in do_ser_base_readchar (scb=0x96a2930, timeout=-1) at ser-base.c:256
#4  0x08204046 in generic_readchar (scb=0x96a2930, timeout=-1, do_readchar=0x8203c60 <do_ser_base_readchar>) at ser-base.c:326
#5  0x082040b0 in ser_base_readchar (scb=0x96a2930, timeout=-1) at ser-base.c:391
#6  0x081ecac2 in serial_readchar (scb=0x96a2930, timeout=-1) at serial.c:376
#7  0x080c4357 in readchar (timeout=<value optimized out>) at remote.c:5922
#8  0x080c5e35 in getpkt_or_notif_sane_1 (buf=0x839f140, sizeof_buf=0x839f144, forever=1, expecting_notif=0) at remote.c:6440
#9  0x080d1e0a in getpkt_sane (ops=0x839f180, ptid=..., status=0xbffff388, options=0) at remote.c:6534
#10 remote_wait_as (ops=0x839f180, ptid=..., status=0xbffff388, options=0) at remote.c:4736
#11 remote_wait (ops=0x839f180, ptid=..., status=0xbffff388, options=0) at remote.c:4843
#12 0x08184d4b in target_wait (ptid=..., status=0xbffff388, options=0) at target.c:2098
#13 0x0815daf2 in wait_for_inferior (treat_exec_as_sigtrap=0) at infrun.c:2028
#14 0x0815ddd4 in proceed (addr=4294967295, siggnal=TARGET_SIGNAL_DEFAULT, step=0) at infrun.c:1652
#15 0x08153729 in continue_1 (all_threads=0) at infcmd.c:668
#16 0x08153ea2 in continue_command (args=0x0, from_tty=0) at infcmd.c:760
#17 0x0808e9e8 in execute_command (p=0x83b89a1 "", from_tty=0) at top.c:453
#18 0x0816b028 in command_handler (command=0x83b89a0 "c") at event-top.c:511
#19 0x0816bd5a in command_line_handler (rl=0x8ce83e8 "\340&\266\b\340\230\321\b") at event-top.c:736
#20 0x0822d5a5 in rl_callback_read_char () at callback.c:205
#21 0x0816b17b in rl_callback_read_char_wrapper (client_data=0x0) at event-top.c:178
#22 0x0816ac54 in handle_file_event (data=...) at event-loop.c:812
#23 0x08169e6b in process_event () at event-loop.c:394
#24 0x0816aba4 in gdb_do_one_event (data=0x0) at event-loop.c:459
#25 0x0816500b in catch_errors (func=0x816a950 <gdb_do_one_event>, func_args=0x0, errstring=0x82ccc3d "", mask=6) at exceptions.c:510
#26 0x080f072a in tui_command_loop (data=0x0) at ./tui/tui-interp.c:153
#27 0x08165684 in current_interp_command_loop () at interps.c:291
#28 0x0808653b in captured_command_loop (data=0x0) at ./main.c:226
#29 0x0816500b in catch_errors (func=0x8086530 <captured_command_loop>, func_args=0x0, errstring=0x82ccc3d "", mask=6) at exceptions.c:510
#30 0x08085ecc in captured_main (data=0xbffff7a4) at ./main.c:902
#31 0x0816500b in catch_errors (func=0x80853d0 <captured_main>, func_args=0xbffff7a4, errstring=0x82ccc3d "", mask=6) at exceptions.c:510
#32 0x080851d1 in gdb_main (args=0xbffff7a4) at ./main.c:911
#33 0x08085195 in main (argc=128, argv=0x0) at gdb.c:33

So it seems gdb is hanging inside __kernel_vsyscall(). Doing a diff between libc.so.6 in the /lib directory on the host and libc.so.6 on the target reveals differences. I've tried using LD_PRELOAD and LD_LIBRARY_PATH, but that backtrace always shows /lib/libc.so.6 instead of pointing to the copy that the target has. Maybe I'm not setting them correctly, but I've tried setting them in gdb with set env and also setting them on the command line and exporting them, to no effect. I also tried putting the libc from the host computer onto the target machine, and it won't even boot; it gets a segfault in libc. So how do I get gdb to load different libraries?
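
The library comparison mentioned above can be repeated for the thread libraries as well, since the earlier gdbserver error named libthread_db and libpthread. The following is a rough sketch, not a diagnosis: compare_libs is a hypothetical helper, and the file names libpthread.so.0 and libthread_db.so.1 are the usual glibc sonames, assumed here rather than taken from the question.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named library is byte-identical
# under two filesystem roots.
#   $1 = first root (for example / on the host)
#   $2 = second root (for example the copy of the target filesystem)
#   remaining arguments = file names to look up under <root>/lib
compare_libs() {
    root_a=${1%/}
    root_b=${2%/}
    shift 2
    for lib in "$@"; do
        a="$root_a/lib/$lib"
        b="$root_b/lib/$lib"
        if [ ! -e "$a" ] || [ ! -e "$b" ]; then
            echo "$lib: missing"
        elif cmp -s "$a" "$b"; then
            echo "$lib: same"
        else
            echo "$lib: differs"
        fi
    done
}

# Host root versus the target filesystem copy named in the question.
compare_libs / /fw_dev/fgs/cf/initrd/expand libc.so.6 libpthread.so.0 libthread_db.so.1
```

A "differs" line only says the two files are not byte-identical. It does not say which side is wrong, and it does not check that libthread_db and libpthread on the same machine belong together.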

Update #6:
So I made a bootable USB key using the target system's disk image as the base. I made minimal changes to it to get it to run on a standard PC, and added gdb and gdb's requisite libraries to it. So now libc is the same on both host and target machines, and it still hangs on me.

Final: While I know gdb 6.8 worked in the past, I can't figure out the configuration. After upgrading both gdb and gdbserver to 7.12, it worked.

"Upgrading versions, while a very large headache, could be possible but probably should not be necessary..."

This is the right option. All of the other issues you are encountering are because of this.

"I've tried this in another virtual machine and I even built a fresh dedicated linux install (so no vm), rebuilt the software, and I get the same behavior. So it appears the issue is probably on the target machine's configuration."

You should build on the same version, architecture, etc. as the system you are attempting to deploy your code to.

"But then when I try to set breakpoints, I get Error accessing memory address 0xb5eb60: Input/output error."

Per this answer,

Can be caused by 32/64 bit mixups. Check, for example, that you didn't attach to a 32-bit binary with a 64-bit process' ID, or vice versa.
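
One way to look for such a mixup is to read the class byte of each ELF file involved (the program, gdbserver, and the libraries). A minimal sketch follows, assuming a POSIX shell with od available; elf_class is a hypothetical helper name. The file utility reports the same information where it is installed.

```shell
#!/bin/sh
# Hypothetical helper: print whether a file is a 32-bit or 64-bit ELF object.
# An ELF file starts with the bytes 7f 45 4c 46; the byte at offset 4
# (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.
elf_class() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 0
    fi
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Example: check the shell binary itself.
elf_class /bin/sh
```

Running it on the program under the target filesystem copy and on the gdb and gdbserver binaries would show whether all of them are the same class.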

"I also tried putting the libc from the host computer onto the target machine, and it won't even boot, it gets a segfault in libc."

Don't do that. As you've found out, it won't work.

"So how do I get gdb to load different libraries?"

Per this question, you can use LD_LIBRARY_PATH.

Here are some suggestions. Have you tried attaching strace to gdbserver to see what kind of activity is going on during the hang? As others have said, it could be a good way to get one step closer to figuring out the problem. You can do that with the following command on the target machine:

strace -p `pidof gdbserver`

Also sending a CONT signal to gdbserver may help when it hangs:

kill -CONT `pidof gdbserver`
