
Analyze the killed Java process in a Hadoop cluster

A Java program was executing in a Hadoop cluster. It got an OutOfMemoryError during execution and the process stopped. I want to analyze the killed Java process for memory and other details. Where can I find the log files of the killed process? I used the sar utility to analyze memory, but it shows only system memory, not per-process memory.

You can set the hs_err file location with the JVM argument -XX:ErrorFile=<your location>/hs_err_pid<pid>.log.
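For example (a sketch; the path and jar name are placeholders, and %p is the JVM's built-in substitution token for the process id, which saves hard-coding the pid):

java -XX:ErrorFile=/var/log/java/hs_err_pid%p.log -jar your-app.jar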

First, perhaps your JVM has not been configured with enough heap size for your application.

That said, I have a small recommendation based on my experience: enable these flags to investigate which objects are consuming too much space.

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=path

The first flag tells the JVM to dump the heap when an OutOfMemoryError occurs, and the second one sets the path where the JVM saves that dump file.
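To see the pair of flags in action, here is a minimal throwaway program (hypothetical, not from the original question) that exhausts a small heap; launched as shown in the comment, the JVM writes a java_pid<pid>.hprof file into the directory given by -XX:HeapDumpPath:

// Run with, e.g.:
//   java -Xmx64m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp OomDemo
import java.util.ArrayList;
import java.util.List;

public class OomDemo {
    public static void main(String[] args) {
        List<byte[]> hog = new ArrayList<>();   // keep every allocation reachable
        while (true) {
            hog.add(new byte[1024 * 1024]);     // 1 MB per iteration until OOM
        }
    }
}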

After getting the heap dump file, use the Eclipse Memory Analyzer Tool (https://www.eclipse.org/mat/) to look for possible memory leaks caused by the application.

In addition, it is important to measure the GC process, which you can do with these flags:

-XX:+PrintReferenceGC
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintAdaptiveSizePolicy
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
-Xloggc:/some/where/gc.log
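One Hadoop-specific caveat: in a cluster these options must reach the task JVMs, not just the client JVM. A hedged sketch assuming Hadoop 2.x MapReduce and a driver that uses ToolRunner (the property names are standard; heap sizes and paths are placeholders):

hadoop jar your-job.jar YourDriver \
  -Dmapreduce.map.java.opts="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -XX:+PrintGCDetails -Xloggc:/tmp/gc-map.log" \
  -Dmapreduce.reduce.java.opts="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"

With YARN log aggregation enabled (yarn.log-aggregation-enable), the task stdout/stderr, including the OutOfMemoryError stack trace, can then be retrieved with yarn logs -applicationId <application id>.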

The gc.log can be analyzed using this online tool: http://gceasy.io/
