Hadoop Pseudo-Distributed Operation error: Protocol message tag had invalid wire type
I am setting up a Hadoop 2.6.0 Single Node Cluster. I am following the hadoop-common/SingleCluster documentation. I work on Ubuntu 14.04. So far I have managed to run Standalone Operation successfully.
I face an error when trying to perform Pseudo-Distributed Operation. I managed to start the NameNode daemon and the DataNode daemon. jps output:
martakarass@marta-komputer:/usr/local/hadoop$ jps
4963 SecondaryNameNode
4785 DataNode
8400 Jps
martakarass@marta-komputer:/usr/local/hadoop$
But when I try to make the HDFS directories required to execute MapReduce jobs, I receive the following error:
martakarass@marta-komputer:/usr/local/hadoop$ bin/hdfs dfs -mkdir /user
15/05/01 20:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "marta-komputer/127.0.0.1"; destination host is: "localhost":9000;
martakarass@marta-komputer:/usr/local/hadoop$
(I believe I can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning at this point.)
When it comes to the Hadoop config files, I changed only the files mentioned in the documentation. I have:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I managed to connect to localhost:
martakarass@marta-komputer:~$ ssh localhost
martakarass@localhost's password:
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-45-generic x86_64)
* Documentation: https://help.ubuntu.com/
Last login: Fri May 1 20:28:58 2015 from localhost
I formatted the filesystem:
martakarass@marta-komputer:/usr/local/hadoop$ bin/hdfs namenode -format
15/05/01 20:30:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = marta-komputer/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0
(...)
15/05/01 20:30:24 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.0.1
************************************************************/
/etc/hosts:
127.0.0.1 localhost
127.0.0.1 marta-komputer
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
/etc/hostname:
marta-komputer
This is a set of steps I followed on Ubuntu when facing exactly the same problem, but with 2.7.1; the steps shouldn't differ much for previous and future versions (I'd believe).
/etc/hosts file:
127.0.0.1 localhost <computer-name>
# 127.0.1.1 <computer-name>
<ip-address> <computer-name>
# Rest of file with no changes
*.xml configuration files (displaying contents inside the <configuration> tag):
For core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
<!-- set value to a directory you want with an absolute path -->
<property>
<name>hadoop.tmp.dir</name>
<value>"set/a/directory/on/your/machine/"</value>
<description>A base for other temporary directories</description>
</property>
For hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
For yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
For mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
$HADOOP_CONF_DIR: This is a good opportunity to verify that you are indeed using this configuration. In the folder where your .xml files reside, view the contents of the script hadoop-env.sh and make sure $HADOOP_CONF_DIR is pointing at the right directory.
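A quick way to check (a sketch, assuming the usual etc/hadoop layout inside your Hadoop directory; adjust paths to your installation):
grep HADOOP_CONF_DIR etc/hadoop/hadoop-env.sh
echo $HADOOP_CONF_DIR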
NameNode binds ports 50070 and 8020 on my standard distribution, and DataNode binds ports 50010, 50020, 50075 and 43758. Run sudo lsof -i to be certain no other services are using them for some reason.
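For example, to check just the ports involved here (a sketch; 9000 is the port from fs.defaultFS in the question):
sudo lsof -i :9000
sudo lsof -i :8020
sudo lsof -i :50070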
At this point, if you have changed the value of hadoop.tmp.dir, you should reformat the NameNode with hdfs namenode -format. If not, remove the temporary files already present in the tmp directory you are using (default /tmp/).
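A sketch of that cleanup, assuming the default hadoop.tmp.dir of /tmp/hadoop-<username> (verify the path before deleting anything):
sbin/stop-dfs.sh
rm -rf /tmp/hadoop-<username>
bin/hdfs namenode -format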
In /sbin/, start the NameNode and DataNode using the start-dfs.sh script, and YARN with start-yarn.sh, then evaluate the output of jps:
./start-dfs.sh
./start-yarn.sh
At this point, if NameNode, DataNode, NodeManager and ResourceManager are all running, you should be set to go!
If any of these hasn't started, share the log output for us to re-evaluate.
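For reference, a healthy jps listing at that point looks roughly like this (PIDs will differ):
2817 NameNode
2961 DataNode
3172 SecondaryNameNode
3325 ResourceManager
3462 NodeManager
3810 Jps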
Remove 127.0.0.1 localhost from /etc/hosts and change your core-site.xml as follows:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://marta-komputer:9000</value>
</property>
</configuration>
And you can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning.
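After editing /etc/hosts, it is worth confirming that the hostname resolves to the address you expect before restarting the daemons (a sketch; getent ships with Ubuntu):
getent hosts marta-komputer
ping -c 1 marta-komputer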
Make these changes in /etc/hosts:
1. Change:
127.0.0.1 localhost
to
127.0.0.1 localhost marta-komputer
2. Delete:
127.0.0.1 marta-komputer
3. Add:
your-system-ip marta-komputer
To find your system IP, type this in a terminal:
ifconfig
(find your IP address there) or type this:
ifdata -pa eth0
Your final /etc/hosts file should look like:
127.0.0.1 localhost marta-komputer
your-system-ip marta-komputer
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Change core-site.xml (that is where fs.defaultFS is set):
1. Change:
hdfs://localhost:9000
to
hdfs://marta-komputer:9000
Now, stop and start the Hadoop processes.
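A sketch of that restart, run from the Hadoop installation directory:
sbin/stop-yarn.sh
sbin/stop-dfs.sh
sbin/start-dfs.sh
sbin/start-yarn.sh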
Your jps command should list these processes:
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
If it does not list all of these processes, check the respective logs for errors.
UPDATE:
If the problem persists, it might be due to a permission issue.
UPDATE II:
sudo mkdir -p /usr/local/hdfs/namenode
sudo mkdir -p /usr/local/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hdfs/namenode
sudo chown -R hduser:hadoop /usr/local/hdfs/datanode
Add these properties to hdfs-site.xml:
dfs.datanode.data.dir with value /usr/local/hdfs/datanode
dfs.namenode.name.dir with value /usr/local/hdfs/namenode
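Spelled out as XML, inside the <configuration> tag of hdfs-site.xml (a sketch using the directories created above):
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hdfs/datanode</value>
</property>
After changing these directories, reformat the NameNode (hdfs namenode -format) before starting the daemons.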
I got this error when uploading files from Java code to HDFS. The problem was that I was using the Hadoop 1 jars to connect to a Hadoop 2 installation. I'm not sure what the problem is in your case, but if you configured Hadoop 1 earlier, something is surely getting mixed up.
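For reference, a minimal sketch of that kind of upload with the Hadoop 2 client API (the hostname, port and paths are placeholders; the key point is that the hadoop-client jars on the classpath must match the cluster's major version):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Must match the fs.defaultFS of the cluster you are connecting to
        conf.set("fs.defaultFS", "hdfs://marta-komputer:9000");
        FileSystem fs = FileSystem.get(conf);
        // Copy a local file into HDFS; both paths are placeholders
        fs.copyFromLocalFile(new Path("/tmp/example.txt"), new Path("/user/example.txt"));
        fs.close();
    }
}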