
Linux OOM Killer and Java Process

In our production environment, the Tomcat process is frequently killed by the Linux OOM killer.

/var/log/messages shows "java invoked oom-killer" and "comm: java Not tainted".

The JVM is started with -Xms20480m -Xmx20480m on a 32 GB box.

I also see the JVM crash shown below.

Is the OOM causing this crash, or are the two unrelated? How can I debug this issue?

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f4c3230aad7, pid=16248, tid=139964439296320
#
# JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x674ad7]  JVM_Clone+0x97
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  T H R E A D  ---------------

Current thread (0x0000000001313800):  JavaThread "http-bio-14080-exec-17" daemon [_thread_in_native, id=18943, stack(0x00007f4c029f7000,0x00007f4c02af8000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000

Registers:
RAX=0x0000000000010000, RBX=0x80000001000004db, RCX=0x0000000302bd6fb0, RDX=0x00007f4c32acda50
RSP=0x00007f4c02af2c88, RBP=0x00007f4c02af2d70, RSI=0x00007f4c02af2d80, RDI=0x00000000013139e8
R8 =0x0000000000000001, R9 =0x00000002f8191228, R10=0x00007f4c2e231658, R11=0x00007f4c2e231638
R12=0x0000000302bd6fb0, R13=0x00000007e0002420, R14=0x0000000000000000, R15=0x0000000001313800
RIP=0x00007f4c3230aad7, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000000
  TRAPNO=0x000000000000000d

Top of Stack: (sp=0x00007f4c02af2c88)
0x00007f4c02af2c88:   00007f4c3230aa63 00000000129196d1
0x00007f4c02af2c98:   00000002f817b620 00000000013139e8
0x00007f4c02af2ca8:   00007f4c02af2d18 00000002f8191080
0x00007f4c02af2cb8:   0000000000000009 0000000001313800
0x00007f4c02af2cc8:   0000000000000000 0000000001313800
0x00007f4c02af2cd8:   00000000138f5509 0000000000000000
0x00007f4c02af2ce8:   00007f4c2e382ae8 0000000000000000
0x00007f4c02af2cf8:   00007f4c02af2cf8 00000002f817b620
0x00007f4c02af2d08:   00000002f8190fd8 00000007e01ab278
0x00007f4c02af2d18:   00000002f8190ec0 00000007e0002420
0x00007f4c02af2d28:   0000000000000000 00000007e0002420
0x00007f4c02af2d38:   0000000000000000 0000000000000000
0x00007f4c02af2d48:   00000007e0002420 0000000000000000
0x00007f4c02af2d58:   00007f4c02af2de0 0000000000000000
0x00007f4c02af2d68:   0000000001313800 00007f4c02af2dc0
0x00007f4c02af2d78:   00007f4c2e2316c6 0000000302bd6fb0
0x00007f4c02af2d88:   0000000000000000 00007f4c02af2e08
0x00007f4c02af2d98:   00007f4c2e1bc8e1 00007f4c02af2db8
0x00007f4c02af2da8:   00007f4c2e1bc8e1 00000002f8190fd8
0x00007f4c02af2db8:   00000002f817b620 00007f4c02af2e28
0x00007f4c02af2dc8:   00007f4c2e1bc233 00000007e1ec4e9f
0x00007f4c02af2dd8:   00007f4c2e1bc233 0000000302bd6fb0
0x00007f4c02af2de8:   00007f4c02af2de8 00000007e1b7a5db
0x00007f4c02af2df8:   00007f4c02af2e30 00000007e37575a8
0x00007f4c02af2e08:   0000000000000000 00000007e1b7a5e8
0x00007f4c02af2e18:   00007f4c02af2de0 00007f4c02af2e40
0x00007f4c02af2e28:   00007f4c02af2ed0 00007f4c2edaaf34
0x00007f4c02af2e38:   00007f4c2edaaf34 00007f4c0000000a
0x00007f4c02af2e48:   00000007e416ec20 0000000000000000
0x00007f4c02af2e58:   00000007e416e2b8 00007f4c02af2ed0
0x00007f4c02af2e68:   00007f4c2e1bc233 00007f4c02af2ed0
0x00007f4c02af2e78:   00007f4c2e1bc233 000000000000000a 

Instructions: (pc=0x00007f4c3230aad7)
0x00007f4c3230aab7:   85 60 02 00 00 06 00 00 00 4c 89 6d b0 4c 8b 23
0x00007f4c3230aac7:   4d 85 e4 0f 84 68 02 00 00 49 8b 9d 20 01 00 00
0x00007f4c3230aad7:   48 83 7b 10 f7 0f 87 6e 02 00 00 48 8b 43 10 48
0x00007f4c3230aae7:   8d 50 08 48 3b 53 18 0f 87 9c 02 00 00 48 89 53 

Register to memory mapping:

RAX=0x0000000000010000 is an unknown value
RBX=0x80000001000004db is an unknown value
RCX=0x0000000302bd6fb0 is an oop
[Lcom.mycompany.MyClass$Type; 
 - klass: 'com/mycompany/MyClass$Type'[]
 - length: 16
RDX=0x00007f4c32acda50: <offset 0xe37a50> in /usr/java/jre64-1.7.0_45/jre/lib/amd64/server/libjvm.so at 0x00007f4c31c96000
RSP=0x00007f4c02af2c88 is pointing into the stack for thread: 0x0000000001313800
RBP=0x00007f4c02af2d70 is pointing into the stack for thread: 0x0000000001313800
RSI=0x00007f4c02af2d80 is pointing into the stack for thread: 0x0000000001313800
RDI=0x00000000013139e8 is an unknown value
R8 =0x0000000000000001 is an unknown value
R9 =
[error occurred during error reporting (printing register info), id 0xb]

Stack: [0x00007f4c029f7000,0x00007f4c02af8000],  sp=0x00007f4c02af2c88,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x674ad7]  JVM_Clone+0x97
J  java.lang.Object.clone()Ljava/lang/Object;
j  com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J  java.lang.Object.clone()Ljava/lang/Object;
j  com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3

/var/log/messages output:

myhostname kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
myhostname kernel: java cpuset=/ mems_allowed=0
myhostname kernel: Pid: 32307, comm: java Not tainted 2.6.39-400.209.1.el5uek #1
myhostname kernel: Call Trace:
myhostname kernel:  [<ffffffff811136b4>] dump_header+0x94/0xe0
myhostname kernel:  [<ffffffff811137fd>] oom_kill_process+0x6d/0x160
myhostname kernel:  [<ffffffff811139ec>] out_of_memory+0xfc/0x210
myhostname kernel:  [<ffffffff811187ec>] __alloc_pages_slowpath+0x64c/0x660
myhostname kernel:  [<ffffffff811189b4>] __alloc_pages_nodemask+0x1b4/0x200
myhostname kernel:  [<ffffffff8111b140>] ? __do_page_cache_readahead+0xe0/0x170
myhostname kernel:  [<ffffffff81150893>] alloc_pages_current+0xb3/0x120
myhostname kernel:  [<ffffffff811100da>] __page_cache_alloc+0x9a/0xb0
myhostname kernel:  [<ffffffff8111097f>] page_cache_read+0x4f/0xb0
myhostname kernel:  [<ffffffff81111a54>] filemap_fault+0x174/0x270
myhostname kernel:  [<ffffffff81137a2c>] __do_fault+0x5c/0x550
myhostname kernel:  [<ffffffff81137fc6>] do_linear_fault+0x36/0x40
myhostname kernel:  [<ffffffff81510b6e>] ? call_function_interrupt+0xe/0x20
myhostname kernel:  [<ffffffff81138044>] handle_pte_fault+0x74/0x190
myhostname kernel:  [<ffffffff815106ae>] ? apic_timer_interrupt+0xe/0x20
myhostname kernel:  [<ffffffff8113828f>] handle_mm_fault+0x12f/0x1b0
myhostname kernel:  [<ffffffff8150b1cd>] do_page_fault+0x17d/0x4b0
myhostname kernel:  [<ffffffff8117b821>] ? user_path_at+0x11/0x20
myhostname kernel:  [<ffffffff81170516>] ? vfs_fstatat+0x56/0x90
myhostname kernel:  [<ffffffff8117067b>] ? vfs_stat+0x1b/0x20
myhostname kernel:  [<ffffffff81507cd5>] page_fault+0x25/0x30
myhostname kernel: Mem-Info:
myhostname kernel: Node 0 DMA per-cpu:
myhostname kernel: CPU    0: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    1: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    2: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    3: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    4: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    5: hi:    0, btch:   1 usd:   0
myhostname kernel: Node 0 DMA32 per-cpu:
myhostname kernel: CPU    0: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    1: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    2: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    3: hi:  186, btch:  31 usd:  11
myhostname kernel: CPU    4: hi:  186, btch:  31 usd:  30
myhostname kernel: CPU    5: hi:  186, btch:  31 usd:   0
myhostname kernel: Node 0 Normal per-cpu:
myhostname kernel: CPU    0: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    1: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    2: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    3: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    4: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    5: hi:  186, btch:  31 usd:   0
myhostname kernel: active_anon:4372468 inactive_anon:275213 isolated_anon:0
myhostname kernel:  active_file:17 inactive_file:21 isolated_file:0
myhostname kernel:  unevictable:6002 dirty:24 writeback:4 unstable:0
myhostname kernel:  free:38369 slab_reclaimable:3708 slab_unreclaimable:9265
myhostname kernel:  mapped:1253 shmem:67 pagetables:10880 bounce:0
myhostname kernel: Node 0 DMA free:15880kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 3000 32290 32290
myhostname kernel: Node 0 DMA32 free:119288kB min:2136kB low:2668kB high:3204kB active_anon:30224kB inactive_anon:9152kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:16kB writeback:4kB mapped:36kB shmem:0kB slab_reclaimable:40kB slab_unreclaimable:232kB kernel_stack:24kB pagetables:48kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:67 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 0 29290 29290
myhostname kernel: Node 0 Normal free:17564kB min:20856kB low:26068kB high:31284kB active_anon:17459648kB inactive_anon:1091700kB active_file:120kB inactive_file:88kB unevictable:24008kB isolated(anon):0kB isolated(file):0kB present:29992960kB mlocked:24008kB dirty:80kB writeback:12kB mapped:4976kB shmem:268kB slab_reclaimable:14792kB slab_unreclaimable:36828kB kernel_stack:2768kB pagetables:43472kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:14972 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 0 0 0
myhostname kernel: Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
myhostname kernel: Node 0 DMA32: 458*4kB 494*8kB 284*16kB 55*32kB 7*64kB 4*128kB 5*256kB 7*512kB 7*1024kB 6*2048kB 20*4096kB = 119288kB
myhostname kernel: Node 0 Normal: 3333*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 17428kB
myhostname kernel: 115783 total pagecache pages
myhostname kernel: 114466 pages in swap cache
myhostname kernel: Swap cache stats: add 2183060, delete 2068594, find 1860109/1985849
myhostname kernel: Free swap  = 0kB
myhostname kernel: Total swap = 2097148kB
myhostname kernel: 8388592 pages RAM
myhostname kernel: 134806 pages reserved
myhostname kernel: 6845 pages shared
myhostname kernel: 8210059 pages non-shared

It's more likely that the OOM condition is causing the crash. A Java process can use more memory than the -Xmx value because -Xmx only caps the Java heap, not native memory. Native usage grows if, for example, you use a JNI library that allocates heavily outside the heap, or if you memory-map files.
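As a rough illustration of memory that lives outside the -Xmx heap, the sketch below sums the JVM's NIO buffer pools (direct and mapped buffers) through the standard BufferPoolMXBean, available since Java 7. The class name OffHeapBufferReport and the 16 MiB test allocation are made up for the example. It only covers NIO buffers; memory obtained by a JNI library through malloc will not show up here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper: reports memory held by NIO buffer pools, which is not part of the Java heap.
final class OffHeapBufferReport {

    /** Sums the estimated bytes used by all NIO buffer pools (typically "direct" and "mapped"). */
    static long offHeapBufferBytes() {
        long total = 0;
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            long used = pool.getMemoryUsed(); // -1 when the JVM cannot estimate it
            if (used > 0) {
                total += used;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = offHeapBufferBytes();
        // A direct buffer is allocated in native memory, outside the -Xmx heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = offHeapBufferBytes();
        System.out.println("NIO buffer bytes before: " + before);
        System.out.println("NIO buffer bytes after:  " + after);
        System.out.println("Direct buffer capacity:  " + direct.capacity());
    }
}
```

If these pools stay small while the resident size of the process keeps growing, the extra memory is probably coming from somewhere else, such as native libraries or thread stacks.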

You should try adding some log statements to your Java code that print how much memory the JVM thinks it is using (Runtime.totalMemory(), Runtime.freeMemory(), etc.). Then compare these values with what top reports for the process. If top shows much more than the JVM reports, something outside the heap is consuming the memory.
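A minimal sketch of that kind of logging, using only java.lang.Runtime. The class name HeapUsageLogger and the output format are made up for the example; in Tomcat you would probably call describeHeap() from a scheduled task and send the result to your normal logger.

```java
// Hypothetical helper: formats the JVM's own view of its heap usage.
final class HeapUsageLogger {

    private static final long MIB = 1024L * 1024L;

    /** Returns one log line with used, committed and maximum heap in MiB. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();      // heap currently reserved by the JVM
        long used = committed - rt.freeMemory(); // part of it occupied by objects
        long max = rt.maxMemory();              // upper bound, roughly the -Xmx value
        return String.format("heap used=%d MiB, committed=%d MiB, max=%d MiB",
                used / MIB, committed / MIB, max / MIB);
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

The value to compare against is the RES column that top shows for the java process. With -Xms20480m the committed heap should sit near 20 GiB, so a resident size far above that points at native memory.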

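This may be the Linux limit on the number of threads.

The default limit is commonly 1024, and it can be raised to values such as 65535.

The per-user limit on processes and threads is shown and changed with ulimit -u. The command ulimit -c unlimited, which the crash log suggests, only removes the limit on core dump size so that a core file can be written.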
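To see how close the JVM is to such a limit, a sketch like the following prints the number of live Java threads and, on Linux, the "Max processes" line from /proc/self/limits, which is the limit that ulimit -u controls. The class name ThreadLimitCheck is made up for the example, and the /proc lookup is assumed to be available only on Linux, so the method returns null elsewhere.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: compares live JVM threads with the process limit seen by this JVM.
final class ThreadLimitCheck {

    /** Number of live Java threads, daemon and non-daemon. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Returns the "Max processes" line from /proc/self/limits, or null if it is unavailable. */
    static String maxProcessesLine() {
        Path limits = Paths.get("/proc/self/limits");
        if (!Files.isReadable(limits)) {
            return null; // not Linux, or /proc is not mounted
        }
        try {
            for (String line : Files.readAllLines(limits, StandardCharsets.UTF_8)) {
                if (line.startsWith("Max processes")) {
                    return line.trim();
                }
            }
        } catch (IOException e) {
            // Treat a read failure the same as "unavailable".
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("Live JVM threads: " + liveThreads());
        String limit = maxProcessesLine();
        System.out.println(limit != null ? limit : "Max processes: not available on this system");
    }
}
```

When the thread limit is actually hit, the JVM normally reports java.lang.OutOfMemoryError: unable to create new native thread, which is a different symptom from the kernel OOM killer messages in /var/log/messages.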
