
Restarting datanodes after reformatting namenode in a hadoop cluster

Using the basic configuration provided in the official Hadoop setup documentation, I can run a Hadoop cluster and submit MapReduce jobs.

The problem is that whenever I stop all the daemons and reformat the namenode, the datanode does not start when I subsequently start all the daemons again.

I've been looking around for a solution, and it appears the cause is that formatting only formats the namenode; the datanode's disk space needs to be erased as well.

How can I do this? What changes do I need to make to my config files? After those changes are made, how do I delete the correct files when formatting the namenode again?

Specifically, check whether you have configured the following two parameters, which can be defined in hdfs-site.xml:

dfs.name.dir: Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.

dfs.data.dir: Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
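For example, hdfs-site.xml might define the two parameters like this (the /data/hadoop paths are illustrative, not from the original question; substitute locations appropriate for your environment):

```xml
<!-- hdfs-site.xml: illustrative paths, adjust to your environment -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/dfs/data</value>
  </property>
</configuration>
```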

If you have provided specific directory locations for the above two parameters, then you need to delete those directories as well before formatting the namenode.

If you have not provided the above two parameters, then by default the data is created under the directory given by the following parameter:

hadoop.tmp.dir, which can be configured in core-site.xml.
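A minimal core-site.xml fragment setting this base directory might look like the following (the /data/hadoop/tmp value is a hypothetical example):

```xml
<!-- core-site.xml: illustrative value, adjust to your environment -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>
```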

Again, if you have specified this parameter, then you need to remove that directory before formatting the namenode.

If you have not defined it either, then by default it is created at /tmp/hadoop-${user.name} (e.g. /tmp/hadoop-hadoop for a user named hadoop), so you need to remove this directory.

Summary: you have to delete the name node and data node directories before formatting the system. By default they are created under /tmp/.
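Putting the steps above together, here is a minimal sketch of the reformat procedure. The paths are hypothetical examples; on a real cluster, substitute the values of dfs.name.dir and dfs.data.dir (or hadoop.tmp.dir) from your own configuration, and uncomment the Hadoop commands:

```shell
#!/bin/sh
# Hypothetical storage locations -- replace with the directories
# configured in your hdfs-site.xml / core-site.xml.
NAME_DIR=/tmp/hadoop-demo/dfs/name
DATA_DIR=/tmp/hadoop-demo/dfs/data

# 1. Stop all daemons first (uncomment on a real cluster):
# stop-dfs.sh

# 2. Wipe the old namenode metadata and datanode block storage,
#    so the datanode does not keep state from the previous format.
rm -rf "$NAME_DIR" "$DATA_DIR"

# 3. Reformat the namenode and restart (uncomment on a real cluster):
# hdfs namenode -format
# start-dfs.sh
```

Deleting the datanode directory is the key step: it removes the storage metadata left over from the previous namenode format, which is what prevents the datanode from starting.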

