
debugging C program with gdb

I'm trying to test a scheduler that I wrote. I schedule two processes - both are infinite while loops (just while(1) statements). When I run the program sometimes it segfaults after like ten seconds (sometimes 5 sec, sometimes 15 or more). Sometimes it doesn't segfault at all and runs as expected. I have a log file which shows me that both processes are scheduled as expected before the segfault occurs. I'm trying to debug the errors using gdb but it's not being very helpful. Here's what I got with backtrace:

#0  0x00007ffff7ff1000 in ?? ()
#1  0x000000000000002b in ?? ()
#2  0x00007ffff78b984a in new_do_write () from /lib64/libc.so.6
#3  0x000000000061e3d0 in ?? ()
#4  0x0000000000000000 in ?? ()

I don't really understand #2.

I think this may be a stack overflow related error. However, I only malloc twice in the whole process - both times when I'm setting up the two processes, I malloc a pcb block in the pcb table I wrote. Has anyone run into similar issues before? Could this be something with how I'm setting/swapping the contexts in the scheduler? Why does it segfault sometimes, and sometimes not?

You didn't say how you obtained the stack trace that you show in the question.

It is very likely that the stack trace is bogus not because the stack is corrupt, but because you've invoked GDB incorrectly, e.g. by specifying the wrong executable when attaching to the process or examining a core dump.

One common mistake is to build the executable with -O2 (let's call this executable E1), then rebuild it with -g (let's call this E2), and then try to analyze a core or a live process that is running E1 while giving GDB E2 as the symbol file.

Don't do that, it doesn't work and isn't expected to work.

Since your stack seems corrupted, you're probably correct that you have a stack buffer overflow somewhere. Without the code, it's a little difficult to tell.

But this has nothing to do with your malloc calls. Overflowing dynamically allocated buffers would corrupt the heap, not the stack.

What you'll probably need to look at is local variables that aren't big enough for the data you're trying to copy into them, like:

char xyzzy[5];
strcpy (xyzzy, "this is a bad idea");  /* writes 19 bytes into a 5-byte buffer */

Or passing a buffer (again, most likely on the stack) to a system call that writes more data into it than the buffer has room for.
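Both cases come down to writing more bytes than a local buffer can hold. As a minimal sketch of the bounded alternatives (the helper names copy_bounded and read_bounded are made up for illustration and are not from the question's code), something like this keeps every write within the size you actually pass in:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy src into dst without ever writing more than
 * dst_size bytes. Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf writes at most dst_size bytes, including the terminating NUL,
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}

/* Hypothetical helper: read at most cap - 1 bytes from fd into buf and
 * NUL-terminate the result. Returns what read() returned. */
ssize_t read_bounded(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    /* Tell the system call the real capacity, leaving room for the NUL. */
    ssize_t n = read(fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

With char xyzzy[5], calling copy_bounded(xyzzy, sizeof xyzzy, "this is a bad idea") truncates the string instead of trampling whatever sits next to xyzzy on the stack. Passing sizeof on the array itself (not on a pointer to it) is what keeps the size and the buffer in sync.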

Those are the most likely causes, though in theory any undefined behaviour on your part could cause this. If the solution is not evident from this answer, you'll probably need to post the code that caused it. Try to trim it down as much as possible when you do, so that it's the shortest complete program that exhibits the bug.

Often you'll find by doing that, the problem becomes evident :-)
