EhCache Weblogic Deadlock?

Question

There is a web service client that uses Ehcache to cache some results and avoid too many web service calls.

The server that hosts this client (WebLogic OSB) simply hangs as soon as there is a little traffic on it. It freezes and does not even write anything to the logs.

The full thread dump is here:

http://pastebin.com/rdVyxjNc

One entry is not clear to me: the thread below is parking to wait for <0x8a03c9c0>, but I can't find any other reference to 0x8a03c9c0 in the thread dump.

Do you see anything in the thread dump that might cause this server to freeze?

Thanks

searchByTemplate.data" prio=3 tid=0x0115b400 nid=0x5e waiting on condition [0x5ef7f000..0x5ef7fbf0]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x8a03c9c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
        ...

Answer 1

The thread you highlighted is actually available to process a request, so it is not the problem: it is simply parked in DelayQueue.take(), waiting for work. WebLogic Oracle Service Bus relies on XQuery for XML manipulation, and XQuery is known to be both CPU and memory intensive when used against large payloads.
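To illustrate why a thread parked like this is idle rather than stuck, here is a minimal sketch (not taken from the poster's application) that reproduces the same state. The class `IdleDelayQueueWorker` and the `PendingJob` task are hypothetical names invented for the example. A worker calls `DelayQueue.take()` while the only queued element is not yet due, so it parks on the queue's internal condition object with a timeout, which a thread dump reports as TIMED_WAITING (parking).

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class IdleDelayQueueWorker {

    /** Hypothetical task that becomes due a fixed time after it is created. */
    static final class PendingJob implements Delayed {
        private final long dueAtNanos;

        PendingJob(long delayMillis) {
            this.dueAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(dueAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    /** Starts a worker blocked in take() and returns the state it settles in. */
    static Thread.State observeWorkerState() {
        final DelayQueue<PendingJob> queue = new DelayQueue<>();
        queue.add(new PendingJob(60_000)); // head of the queue is not due for a minute

        Thread worker = new Thread(() -> {
            try {
                queue.take(); // parks until the head element expires
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "idle-worker");
        worker.setDaemon(true);
        worker.start();

        // Poll for up to five seconds until the worker has parked.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        Thread.State state = worker.getState();
        while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            state = worker.getState();
        }
        worker.interrupt(); // release the worker so it can exit
        return state;
    }

    public static void main(String[] args) {
        System.out.println("worker state: " + observeWorkerState());
    }
}
```

The address in "parking to wait for" identifies the condition object owned by the queue. No other thread holds it as a lock, which is consistent with it appearing only once in the dump.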

I just analyzed your thread dump. It clearly shows a high CPU pattern: multiple threads are parsing XML and trying to allocate memory in data structures such as ArrayList.

I suspect two possible scenarios at the source of the "hang" problem:

Excessive garbage collection & OldGen space depletion

HotSpot JVM 1.6+ thread dumps include a Java heap utilization summary at the bottom. Here we can see that the OldGen space is at 92% and the eden space is at 100%. This reinforces the pattern we see in the thread dump.

PSYoungGen total 466944K, used 233472K [0xd1000000, 0xfbc00000, 0xfbc00000)
eden space 233472K, 100% used [0xd1000000,0xdf400000,0xdf400000)
from space 233472K, 0% used [0xed800000,0xed800000,0xfbc00000)
to space 233472K, 0% used [0xdf400000,0xdf400000,0xed800000)
ParOldGen total 1400832K, used 1297110K [0x7b800000, 0xd1000000, 0xd1000000)
object space 1400832K, 92% used [0x7b800000,0xcaab5ac8,0xd1000000)
PSPermGen total 262144K, used 167570K [0x6b800000, 0x7b800000, 0x7b800000)
object space 262144K, 63% used [0x6b800000,0x75ba4ab0,0x7b800000)
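The percentages in this summary are just used size over total size, and the same figures can be read from inside a running JVM. The sketch below is an illustration only: `HeapPoolOccupancy` and its method names are made up for the example, while the `java.lang.management` calls are standard. Pool names depend on the collector in use, so the code lists every heap pool instead of assuming a name such as "PS Old Gen".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class HeapPoolOccupancy {

    /** Whole-number occupancy, truncated the same way the dump summary appears to be. */
    static long percentUsed(long used, long capacity) {
        return capacity <= 0 ? -1 : used * 100 / capacity;
    }

    /** One line per heap memory pool of the current JVM. */
    static String describeHeapPools() {
        StringBuilder out = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip code cache, metaspace and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            out.append(String.format("%-25s used=%dK committed=%dK (%d%%)%n",
                    pool.getName(),
                    usage.getUsed() / 1024,
                    usage.getCommitted() / 1024,
                    percentUsed(usage.getUsed(), usage.getCommitted())));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Figures copied from the ParOldGen line of the dump above (in K).
        System.out.println("ParOldGen from dump: " + percentUsed(1297110, 1400832) + "%");
        System.out.print(describeHeapPools());
    }
}
```

Logging these values periodically, or sampling them over JMX, shows whether the old generation ever drops back after a full collection or stays pinned near its limit.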

Culprit thread consuming your CPU and/or Java Heap memory

In this scenario, one or a few threads are stuck in non-stop processing, such as an XQuery that never returns, causing a CPU surge and JVM contention.
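One way to look for such a thread is to sample per-thread CPU time twice and compare. The sketch below does this from inside the JVM with the standard `ThreadMXBean`; it is an assumption-laden illustration, not the procedure from the answer, which would more typically combine an OS-level per-thread CPU listing with a thread dump. `ThreadCpuSampler` and its methods are hypothetical names. The `nidToDecimal` helper converts the hexadecimal `nid` from a dump line into the decimal native thread id that OS tools print.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreadCpuSampler {

    /** Converts a dump value such as "0x5e" to decimal for matching OS tool output. */
    static long nidToDecimal(String nid) {
        return Long.decode(nid);
    }

    /** CPU milliseconds burned by each live thread during the sampling interval. */
    static Map<String, Long> sampleCpuMillis(long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return result; // this JVM cannot report per-thread CPU time
        }

        Map<Long, Long> before = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            before.put(id, threads.getThreadCpuTime(id)); // nanoseconds, -1 if the thread died
        }

        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        for (long id : threads.getAllThreadIds()) {
            Long start = before.get(id);
            long end = threads.getThreadCpuTime(id);
            ThreadInfo info = threads.getThreadInfo(id);
            if (start == null || start < 0 || end < 0 || info == null) {
                continue; // thread started or ended during the interval
            }
            // The id here is the Java thread id, not the native nid shown in the dump.
            result.put(info.getThreadName() + " (id=" + id + ")", (end - start) / 1_000_000);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("nid 0x5e = native thread " + nidToDecimal("0x5e"));
        sampleCpuMillis(1000).forEach((name, ms) -> System.out.println(ms + " ms  " + name));
    }
}
```

Threads that report close to the full interval as CPU time are the candidates to look up by name in the thread dump.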

My recommendations:

Enable -verbose:gc. This will let you assess the health and footprint of the Java heap along with the garbage collection frequency.
Perform a CPU% per thread analysis the next time you see the problem. This will tell you whether specific Service Bus requests are consuming your CPU and/or Java heap memory.
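As a complement to the -verbose:gc log, the cumulative collection counts and times are also exposed through the standard `GarbageCollectorMXBean`. The following is a rough sketch under the assumption that you can run code or a JMX client against the affected JVM; `GcOverheadProbe` and its methods are hypothetical names. A large share of uptime spent in collections points to the first scenario above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverheadProbe {

    /** Share of elapsed time spent in garbage collection, as a percentage. */
    static double gcTimePercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    /** Sum of the accumulated collection time across all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-25s collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.2f%%%n", gcTimePercent(totalGcMillis(), uptime));
    }
}
```

These counters are totals since JVM start, so take two readings some minutes apart and compare the difference to see the current collection overhead.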

The following articles from my blog may help with your next analysis phase:

Verbose GC analysis: JVM Verbose GC tutorial

Java high CPU troubleshooting: Java High CPU troubleshooting

Regards, PH

solution1
2 ACCEPTED 2012-12-15 02:02:35