
DataNode automatically getting restarted in the CDH5 cluster

Original post: 2014-08-31 13:40:52 | Tags: hadoop, hdfs, cloudera

We have set up a cluster with 6 slave nodes. I am trying to see how replication happens when one of the DataNodes dies.

I logged into one of the slaves and killed the DataNode using the kill -9 command. After some time the DataNode was restarted automatically and HDFS returned to a healthy status. I verified the restart by checking that the PID of the DataNode had changed.
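The PID check described above can be scripted. The following is a minimal sketch, not something from the original post: it assumes `pgrep` is available on the node and that the DataNode's command line contains its main class name, and the helper names `find_pids` and `was_restarted` are made up for illustration.

```python
import subprocess


def find_pids(pattern):
    """Return the set of PIDs whose full command line matches `pattern`.

    Relies on `pgrep -f`, which is assumed to be installed on the node.
    """
    result = subprocess.run(
        ["pgrep", "-f", pattern], capture_output=True, text=True
    )
    # pgrep prints one PID per line and prints nothing when there is no match.
    return {int(pid) for pid in result.stdout.split()}


def was_restarted(pids_before, pids_after):
    """True if a matching process now runs under a PID that was not seen before."""
    return bool(pids_after - pids_before)
```

One way to use it: call `find_pids("org.apache.hadoop.hdfs.server.datanode.DataNode")` before the `kill -9`, call it again a minute or so later, and pass both results to `was_restarted`. An empty second result means the DataNode is still down rather than restarted.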

I don't see any documentation on this DataNode behavior. Is this an Apache Hadoop feature or a Cloudera CDH feature? Any reference to the documentation is appreciated.

1 solution

Since the PID of the DataNode has changed, I don't think this is a behavior of the DataNode itself. If you are managing your cluster with Cloudera Manager, there is an option to restart the DataNode daemon if it fails (Automatically Restart Process). This option is set by default. When the DataNode process fails or is killed, the Cloudera SCM agent starts the DataNode daemon again because the automatic restart option is set.
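To illustrate why the PID changes, here is a minimal sketch of the general supervise-and-restart pattern. It is not the Cloudera SCM agent's code; the function name `supervise` and its parameters are hypothetical. The point is only that a separate supervising process relaunches the daemon, so each launch gets a fresh PID.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.1):
    """Run `cmd` and relaunch it whenever it exits, up to `max_restarts` times.

    Returns the list of PIDs, one per launch, so the caller can see
    that every restart produced a new process.
    """
    pids = []
    for attempt in range(max_restarts + 1):
        if attempt:
            time.sleep(delay)  # brief back-off before relaunching
        proc = subprocess.Popen(cmd)
        pids.append(proc.pid)
        proc.wait()  # returns when the child exits or is killed (e.g. kill -9)
    return pids
```

For example, `supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2)` launches a child that fails immediately and returns three PIDs, one for the first start and one for each restart.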

To find the automatic restart option: choose the HDFS service -> go to the Configuration section -> search for automatic restart.

This feature is available in the CM 4.x releases as well.