
HDFS Datanode crashes with OutOfMemoryError

I'm seeing repeated crashes of the HDFS DataNodes in my Cloudera cluster due to an OutOfMemoryError:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/hdfs_hdfs-DATANODE-e26e098f77ad7085a5dbf0d369107220_pid18551.hprof ...
Heap dump file created [2487730300 bytes in 16.574 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/usr/lib64/cmf/service/common/killparent.sh"
#   Executing /bin/sh -c "/usr/lib64/cmf/service/common/killparent.sh"...
18551 TS   19 ?        00:25:37 java
Wed Aug  7 11:44:54 UTC 2019
JAVA_HOME=/usr/lib/jvm/java-openjdk
using /usr/lib/jvm/java-openjdk as JAVA_HOME
using 5 as CDH_VERSION
using /run/cloudera-scm-agent/process/3087-hdfs-DATANODE as CONF_DIR
using  as SECURE_USER
using  as SECURE_GROUP
CONF_DIR=/run/cloudera-scm-agent/process/3087-hdfs-DATANODE
CMF_CONF_DIR=/etc/cloudera-scm-agent
4194304

When analyzing the heap dump, the biggest suspects appear to be millions of ScanInfo instances, apparently queued in the ExecutorService of the class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.

[Image: Eclipse MAT tool showing the dominator tree]

When I inspect the content of each ScanInfo runnable object, I don't see anything unusual:

[Image: ScanInfo instance content]

Apart from this and a somewhat high block count in HDFS, I have no other information beyond the fact that different DataNodes in my cluster crash at random.

Any idea why these objects keep queueing up in the DirectoryScanner thread pool?
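For readers unfamiliar with how a backlog like this can pin heap in general, here is a minimal sketch. It is not Hadoop's DirectoryScanner code. The class name QueueGrowthSketch and the method backlogBehindStalledWorker are hypothetical, and whether DirectoryScanner's pool actually behaves this way is an assumption. It only shows that a fixed-size pool backed by an unbounded queue keeps every submitted task reachable while its single worker is busy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Hadoop source.
public class QueueGrowthSketch {

    /** Submits `tasks` small tasks behind one stalled worker and returns the backlog size. */
    public static int backlogBehindStalledWorker(int tasks) {
        // One worker thread and an unbounded queue, the same shape as
        // Executors.newFixedThreadPool(1).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            // The first task is handed straight to the single worker and stalls there.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Every later task is parked in the queue, so it stays strongly
            // reachable from the pool until the worker gets to it.
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                pool.execute(() -> { int unused = id; });
            }
            return pool.getQueue().size();
        } finally {
            release.countDown();
            pool.shutdownNow(); // discard the backlog and stop the worker
        }
    }

    public static void main(String[] args) {
        System.out.println("queued tasks: " + backlogBehindStalledWorker(100_000));
    }
}
```

In a heap dump, a backlog of this kind shows up as a large number of small objects retained by the executor's work queue, which is the general pattern a dominator tree would surface.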

You can try running the command below once.

$ hadoop dfsadmin -finalizeUpgrade

The -finalizeUpgrade command removes the previous version of the NameNode's and DataNodes' storage directories.
