
Is it possible to debug core dumps when using Java JNI?

My application is mostly Java but, for certain calculations, uses a C++ library. Our environment is Java 1.6 running on RedHat 3 (soon to be RedHat 5).

My problem is that the C++ library is not thread-safe. To work around this, we run multiple, single-threaded "worker" processes and give them work to do from a central Work Manager, also written in C++. Our Java application calls the C++ Work Manager via a third-party product.

For various reasons, we want to re-write the C++ Work Manager and workers. I'm in favour of writing them all in Java, using JNI in each worker to call the C++ library.

The main problem is what happens if the C++ library core dumps. Unfortunately, this is quite common, and we need to be able to see which line in our C++ library caused the problem, e.g. by examining a backtrace in something like GDB.

My colleagues believe that it will be impossible to analyse the core dumps, because tools like GDB don't understand core files produced by Java.

I hope that they're wrong, but I need to be sure before pushing my ideas further.

What is the best way to analyse a core dump produced from Java/JNI?

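Yes, it is. Every time the JVM crashes because of a SIGSEGV in the JNI part, it writes a fatal error log, usually named hs_err_pid<PID>.log. By default the log goes to the working directory of the Java process, so you will only find it in $JAVA_HOME/bin if the JVM was started from there. Note that this log is a text report, not the core dump itself.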

You can get more info here, and here. Here is a somewhat related Stack Overflow question.
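As a small illustration of what that log contains: HotSpot error logs normally carry a "# Problematic frame:" header, followed by a line naming the library and function where the crash happened. The shell sketch below prints that line. The function name problematic_frame is made up for this example, and the exact layout of the log may differ between JVM versions.

```shell
# Hypothetical helper: print the line that follows the
# "# Problematic frame:" header in a HotSpot fatal error log.
# Usage: problematic_frame hs_err_pid12345.log
problematic_frame() {
    awk '
        /^# Problematic frame:/ {
            # The frame description is on the next line, if there is one.
            if ((getline line) > 0) print line
            exit
        }
    ' "$1"
}
```

A line starting with "# C" points at native code, which is the case where opening the core file in gdb pays off.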

To read the core file into gdb, you have to pass the Java virtual machine executable as the first argument, followed by the core file. That is

gdb /usr/local/jdk1.8.0_66/bin/java core

It will very likely tell you that a ton of symbols are not found (which is normal, these are the JVM symbols). However, the JNI call that crashed might appear in your stack trace if you type 'bt'. An example from my own situation, where I had a crash in a native library I wrote, is:

(gdb) bt
#0  0x00007fd61dfcd107 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fd61dfce4e8 in __GI_abort () at abort.c:89
#2  0x00007fd61d8d3795 in os::abort(bool) ()
   from /usr/local/jdk1.8.0_66/jre/lib/amd64/server/libjvm.so
#3  0x00007fd61da71e23 in VMError::report_and_die() ()
   from /usr/local/jdk1.8.0_66/jre/lib/amd64/server/libjvm.so
#4  0x00007fd61d8d8fbf in JVM_handle_linux_signal ()
   from /usr/local/jdk1.8.0_66/jre/lib/amd64/server/libjvm.so
#5  0x00007fd61d8cf753 in signalHandler(int, siginfo*, void*) ()
   from /usr/local/jdk1.8.0_66/jre/lib/amd64/server/libjvm.so
#6  <signal handler called>

Frames 0 to 6 are all part of the crash handling itself: a signal was caught and dispatched. Although we have no debug details for these JVM functions, that does not matter. Starting at frame 7, we are in the JNI library we wrote, and if it still has symbols attached you will see them.

#7  0x00007fd5ff43bf7e in FftResampler::resample(Complex const*, int)
    ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#8  0x00007fd5ff43ddcf in TimeStretcher::rescaleEnvelopeSlow(PeakMap const*, Peak*) ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#9  0x00007fd5ff43e4a5 in TimeStretcher::transferPeak(Frame*, Frame*)
    ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#10 0x00007fd5ff43e679 in TimeStretcher::transferPeaks(Channel*) ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#11 0x00007fd5ff43eb3a in TimeStretcher::putStereo(float const*, int)
    ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#12 0x00007fd5ff43edbf in TimeStretcher::processStereo(float const*, int, float*) ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so
#13 0x00007fd5ff43b45d in Java_org_yellowcouch_bpmdj_mixedit_audio_JavaTimeStretcher_processStereo ()
   from /I/home/werner/BpmDj/NextGen/Beta/Desktop/test/libzathras-46703-64.so

And from frame 14 onward we are back in Java land, where gdb cannot resolve the frames of code generated by the JVM and shows ?? instead.

#14 0x00007fd6097a29e1 in ?? ()
#15 0x00007fd5d6ee6580 in ?? ()
#16 0x00000000853f53e8 in ?? ()
#17 0x00000000d803c340 in ?? ()
#18 0x00000000d80564e8 in ?? ()
#19 0x00007fd61e773609 in _L_unlock_554 ()
   from /lib/x86_64-linux-gnu/libpthread.so.0

So you see that it is quite possible to get useful information out of the core files through gdb. Just don't forget to pass the JVM executable as the first argument.
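The same invocation can be scripted so that a backtrace is captured without an interactive session. This is only a sketch: the function name jvm_backtrace and its exit codes are invented for this example, and it assumes gdb is on the PATH. The -batch and -ex options make gdb run the given command and then exit.

```shell
# Hypothetical helper: print a native backtrace from a JVM core file.
# Usage: jvm_backtrace /path/to/bin/java /path/to/core
jvm_backtrace() {
    java_bin=$1
    core_file=$2
    if [ ! -x "$java_bin" ]; then
        echo "java executable not found: $java_bin" >&2
        return 2
    fi
    if [ ! -r "$core_file" ]; then
        echo "core file not readable: $core_file" >&2
        return 3
    fi
    # The JVM executable comes first, then the core file.
    # "bt" prints the stack of the thread gdb selects when it loads
    # the core, which is normally the thread that crashed.
    gdb -batch -ex "bt" "$java_bin" "$core_file"
}
```

Keep in mind that a core file is only written if the core size limit allows it, so the JVM would have to be started from a shell in which something like `ulimit -c unlimited` has been run.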

It is possible that gdb doesn't find the native library itself. In that case you might want to load the symbols manually as follows:

(gdb) symbol-file libzathras-46703-64.so

If you want even more information, compile your C/C++ code with debug info turned on. With the gcc and mingw compilers you generally do this by adding -g to the command line options. The backtrace then looks as follows, including argument values and source line numbers.

#7  FftResampler::resample (this=this@entry=0x7f4bf8f36100, 
    cpx=cpx@entry=0x7f4bf8ed1ea0, n=<optimized out>)
    at timestretcher.cpp:347
#8  0x00007f4c51605dcf in TimeStretcher::rescaleEnvelopeSlow (
    this=0x7f4bf8ec1e10, table=0x7f4bf90f4c20, borders=0x7f4bf8fd27a0)
    at timestretcher.cpp:878
#9  0x00007f4c516064a5 in TimeStretcher::transferPeak (
    this=this@entry=0x7f4bf8ec1e10, 
    prevFrame=prevFrame@entry=0x7f4bf8fde6f0, 
    frame=frame@entry=0x7f4bf8fb2650) at timestretcher.cpp:718
#10 0x00007f4c51606679 in TimeStretcher::transferPeaks (
    this=this@entry=0x7f4bf8ec1e10, 
    channel=channel@entry=0x7f4bf8ec9e90) at timestretcher.cpp:687
#11 0x00007f4c51606b3a in TimeStretcher::putStereo (
    this=this@entry=0x7f4bf8ec1e10, in=in@entry=0x7f4bf8eb9e00, 
    time=time@entry=-1395) at timestretcher.cpp:1483
#12 0x00007f4c51606dbf in TimeStretcher::processStereo (
    this=this@entry=0x7f4bf8ec1e10, in=in@entry=0x7f4bf8eb9e00, 
    time=time@entry=-1395, out=0x7f4bf90f4c60)
    at timestretcher.cpp:1567
#13 0x00007f4c5160345d in Java_org_yellowcouch_bpmdj_mixedit_audio_JavaTimeStretcher_processStereo (env=0x7f4bf90f71f8, obj=<optimized out>, 
    handle=139964275465728, in=0x7f4bed136468, inIdx=<optimized out>, 
    time=-1395, out=0x7f4bed136480) at timestretcher-jni.cpp:69
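For reference, a debug build of a JNI library with g++ could look roughly like the command this sketch prints. The helper name jni_debug_build_cmd, the JDK path and the file names in the usage line are placeholders for this example, and the include/linux directory assumes a Linux JDK layout.

```shell
# Hypothetical helper: print a g++ command line that builds a JNI
# shared library with debug info (-g).
# Usage: jni_debug_build_cmd JAVA_HOME OUTPUT_LIB SOURCE...
jni_debug_build_cmd() {
    java_home=$1
    out_lib=$2
    shift 2
    # -g       keep debug info, so gdb can show files and line numbers
    # -fPIC    position-independent code, needed for a shared library
    # -shared  produce a .so that System.loadLibrary can load
    printf '%s\n' "g++ -g -fPIC -shared -I$java_home/include -I$java_home/include/linux -o $out_lib $*"
}

# Example: jni_debug_build_cmd /opt/jdk libdemo.so demo-jni.cpp demo.cpp
```

The `<optimized out>` entries in the backtrace above come from compiler optimisation. Lowering the optimisation level (for example with -O0) in a debug build usually makes more of those values visible.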
