
Java memory leak with native C and Fortran code

I'm working on an old Java program that includes a native library with Fortran calls.

So I have Java that calls C via JNI, and the C code then calls Fortran.

In production we get an out-of-memory error like this:

Native memory allocation (malloc) failed to allocate 120000 bytes for jfloat in C:\BUILD_AREA\jdk6_37\hotspot\src\share\vm\prims\jni.cpp

I suspect it's a memory leak.

I'm new at the company, and I would like to work on Linux, but they have me working on Windows :( In production we use a .so file because we are on Solaris, and I use a DLL on Windows (logically).

First, I tried to reproduce the production problem. I created a unit test that loads the DLL and calls the Java class that calls the native method many times. When I ran it, I saw in processExplorer.exe that the memory grew by up to 2 MB every 2 seconds, and I got the same exception as in production.

I was happy to have reproduced the problem, and I could say that it came from the C or Fortran code.

Next, I removed the call to Fortran so that my Java only called C. This test let me see whether the problem was coming from C or from Fortran.

And the result was that the memory didn't move! Cool! I could say that I didn't have any problem with malloc/free in C.

So, I decided to learn a little Fortran to look through the code. :)

I learned that in Fortran we can use the allocate and deallocate keywords to manage memory. But my code doesn't contain these keywords. :(

After all of this, someone gave me access to Solaris so I could launch my JUnit test that calls Java->JNI->C->Fortran using the .so instead of the DLL.

And surprise - the memory didn't move!!! I don't have any problem under Solaris or RedHat.

I'm stuck because the problem exists in production, but I can't reproduce it clearly. :(

Why do I see a difference between the DLL and the .so? The code (Java/C/Fortran) is exactly the same, because I am the one who compiles it.

How can I investigate further?

I tried taking a memory dump under Windows, where I reproduced the problem, but I don't see anything in it.

Is the problem in the JVM? Or can the problem be in the object passed to C via JNI?

Thanks a lot for helping me with this problem.

Info: I'm using 64-bit Windows 7.

PS: I'm French, so excuse my English. I try to do my best each time. ;)

Here is the header of the C code:

    #ifndef unix 
       __MINGW_IMPORT void modlin_OM(float pmt[], float abaque[][], float don[][], float cond[], float res[][], int flag[]) ; 
    #else 
       extern void modlin_om_(float * pmt, float * abaque, float * don, float * cond, float * res, int * flag) ; 
    #endif

and then the method:

    JNIEXPORT jint JNICALL Java_TrtModlin_modlin_1OM
      (JNIEnv * env, jobject obj,
       jfloatArray pmtPar,
       jobjectArray abaquePar, jobjectArray donPar, jfloatArray condPar, jobjectArray resPar, jintArray flagPar)
    {

then some code, and the call to the Fortran routine:

    #ifndef unix
       modlin_OM(pmt, abaque, don, cond, res, & iFlag) ;
    #else
       modlin_om_(pmt, abaque, don, cond, res, & iFlag) ;
    #endif

As I said before, I tested the call to C with these lines removed, and the memory didn't grow :( I also tested with a free(someVar) line removed, and the memory grew because the free was no longer done. That's why I conclude that my C code is OK with respect to malloc/free.
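The malloc/free check described above can be made more systematic with a pair of counting wrappers. This is only a minimal sketch: `tracked_malloc`, `tracked_free`, `tracked_live_blocks` and `simulated_native_call` are hypothetical names, not part of the original code, and the 30000-float buffer is chosen only because 30000 * 4 bytes matches the 120000 bytes in the error message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not taken from the original code. */
static long live_blocks = 0;   /* blocks allocated but not yet freed */

/* malloc wrapper that counts each successful allocation. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        live_blocks++;
    }
    return p;
}

/* free wrapper that counts each release; a NULL pointer stays a no-op. */
void tracked_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);   /* more frees than mallocs is a bug */
        live_blocks--;
        free(p);
    }
}

/* Number of blocks currently held through the wrappers. */
long tracked_live_blocks(void)
{
    return live_blocks;
}

/* Stand-in for one native call: allocates a work buffer, uses it, frees it. */
int simulated_native_call(size_t n)
{
    float *buf = tracked_malloc(n * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        buf[i] = (float)i;
    }
    tracked_free(buf);
    return 0;
}
```

If the counter returns to zero after every call, the C layer's own allocations are balanced. The counter only sees memory that goes through the wrappers, so it cannot detect buffers allocated by the JVM on behalf of JNI calls or by the Fortran runtime.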

Analysing a memory problem is always complex. In my experience there are two approaches:

1) You try to reproduce it. This supposes you have the source code and an idea of the root cause.

2) You observe the production crashes: their frequency, their correlation with other events, etc.

This can help determine whether it is a memory leak or not (it could be high consumption under business load...)

In your particular case, I notice the following points:

  1. The behavior of code can differ between operating systems. This is very rare for Java code (it would take a JVM bug). It is frequent with native code (for example, forgetting to close a ZIP causes a memory leak on Linux but not on Windows...)

  2. In your C header (*.h), abaque, don and res are 'float * *' on Windows and 'float *' on Unix. It could be a bug in your C headers, or it could mean that the C implementations do not expect the same argument types on each operating system (which seems strange to me...)

In the second case, the fact that you compile your C headers on Windows (which is not the target) could explain why you don't generate correct JNI stubs (a typical cross-compilation issue)... From here we can make many assumptions, simple or very complex...
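To illustrate why the two prototypes are not interchangeable: a C `float **` is an array of row pointers, whereas a Fortran routine expects each two-dimensional array as one contiguous block stored column by column. The sketch below shows one way to build such a block. The name `flatten_column_major` is hypothetical, and the real shapes of abaque, don and res are not given in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Copy a row-pointer matrix (rows[i][j]) into one contiguous block laid out
 * column by column, the order Fortran uses for an array dimensioned
 * (n_rows, n_cols). Returns NULL on allocation failure or an empty shape.
 * The caller owns the result and must free() it after the Fortran call.
 * Hypothetical helper; not taken from the original code.
 */
float *flatten_column_major(float *const *rows, size_t n_rows, size_t n_cols)
{
    if (rows == NULL || n_rows == 0 || n_cols == 0) {
        return NULL;
    }
    float *flat = malloc(n_rows * n_cols * sizeof *flat);
    if (flat == NULL) {
        return NULL;
    }
    for (size_t j = 0; j < n_cols; j++) {
        for (size_t i = 0; i < n_rows; i++) {
            assert(rows[i] != NULL);            /* every row must be valid */
            flat[j * n_rows + i] = rows[i][j];  /* Fortran element (i+1, j+1) */
        }
    }
    return flat;
}
```

A buffer built this way could be passed where the Unix prototype expects a `float *`. Every such buffer needs a matching `free()` once the call returns; a missing one would leak on every invocation.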

Good luck!

I faced the same problem in my application and came up with a solution. Create the native objects, but set the references to null after using them, so that the unreferenced objects can be collected by the GC.

