
Hadoop Pseudo-Distributed Operation error: Protocol message tag had invalid wire type

I am setting up a Hadoop 2.6.0 Single Node Cluster. I am following the hadoop-common/SingleCluster documentation and working on Ubuntu 14.04. So far I have managed to run the Standalone Operation successfully.

I face an error when trying to perform the Pseudo-Distributed Operation. I managed to start the NameNode daemon and the DataNode daemon. jps output:

martakarass@marta-komputer:/usr/local/hadoop$ jps
4963 SecondaryNameNode
4785 DataNode
8400 Jps
martakarass@marta-komputer:/usr/local/hadoop$ 

But when I try to make the HDFS directories required to execute MapReduce jobs, I receive the following error:

martakarass@marta-komputer:/usr/local/hadoop$ bin/hdfs dfs -mkdir /user
15/05/01 20:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "marta-komputer/127.0.0.1"; destination host is: "localhost":9000; 
martakarass@marta-komputer:/usr/local/hadoop$ 

(I believe I can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning at this point.)


When it comes to Hadoop config files, I changed only the files mentioned in the documentation. I have:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

I managed to connect to localhost:

martakarass@marta-komputer:~$ ssh localhost
martakarass@localhost's password: 
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-45-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

Last login: Fri May  1 20:28:58 2015 from localhost

I formatted the filesystem:

martakarass@marta-komputer:/usr/local/hadoop$  bin/hdfs namenode -format
15/05/01 20:30:21 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = marta-komputer/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
(...)
15/05/01 20:30:24 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.0.1
************************************************************/

/etc/hosts:

127.0.0.1       localhost
127.0.0.1       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

/etc/hostname:

marta-komputer

This is a set of steps I followed on Ubuntu when facing exactly the same problem, but with 2.7.1; the steps shouldn't differ much for previous and future versions, I'd believe.

1) Format of my /etc/hosts file:

    127.0.0.1    localhost   <computer-name>
    # 127.0.1.1    <computer-name>
    <ip-address>    <computer-name>

    # Rest of file with no changes
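To confirm that these entries resolve the way you expect, a quick check from the shell can help (a minimal sketch; getent is available on stock Ubuntu, and marta-komputer stands in for your actual hostname):

    # Which address does the machine's own hostname resolve to?
    getent hosts marta-komputer

    # localhost should still resolve to 127.0.0.1
    getent hosts localhost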

2) *.xml configuration files (displaying contents inside the <configuration> tag):

  • For core-site.xml (a quick way to verify the value Hadoop actually picks up is sketched after this list):

      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost/</value>
      </property>
      <!-- set value to a directory you want with an absolute path -->
      <property>
          <name>hadoop.tmp.dir</name>
          <value>"set/a/directory/on/your/machine/"</value>
          <description>A base for other temporary directories</description>
      </property>
  • For hdfs-site.xml:

      <property>
          <name>dfs.replication</name>
          <value>1</value>
      </property>
  • For yarn-site.xml:

      <property>
          <name>yarn.resourcemanager.hostname</name>
          <value>localhost</value>
      </property>
      <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
      </property>
  • For mapred-site.xml:

      <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
      </property>
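To verify the value Hadoop actually picks up from these files, the getconf helper can be run from the installation directory (a quick sanity check; the printed value should match whatever you put in fs.defaultFS):

    # Print the effective value of fs.defaultFS
    bin/hdfs getconf -confKey fs.defaultFS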

3) Verify $HADOOP_CONF_DIR:

This is a good opportunity to verify that you are indeed using this configuration. In the folder where your .xml files reside, view the contents of the hadoop-env.sh script and make sure $HADOOP_CONF_DIR is pointing at the right directory.
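For example (a minimal sketch, assuming the default layout under /usr/local/hadoop; adjust the path to your installation):

    # How the env script sets the configuration directory
    grep -n "HADOOP_CONF_DIR" /usr/local/hadoop/etc/hadoop/hadoop-env.sh

    # What the current shell session exports, if anything
    echo $HADOOP_CONF_DIR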

4) Check your ports:

NameNode binds ports 50070 and 8020 on my standard distribution, and DataNode binds ports 50010, 50020, 50075 and 43758. Run sudo lsof -i to be certain no other services are using them for some reason.
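For example, to check the ports most relevant to this error (a hedged sketch; the port numbers are the defaults listed above plus the 9000 used in fs.defaultFS):

    # Is anything already listening on the NameNode RPC/HTTP ports?
    sudo lsof -i :8020
    sudo lsof -i :50070
    sudo lsof -i :9000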

5) Format if necessary:

At this point, if you have changed the value of hadoop.tmp.dir, you should reformat the NameNode with hdfs namenode -format. If not, remove the temporary files already present in the tmp directory you are using (default /tmp/):
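A possible cleanup sequence, assuming the default temporary location /tmp/hadoop-<username> (that path pattern is an assumption; use your own hadoop.tmp.dir if you set one):

    # Stop HDFS before touching its storage directories
    sbin/stop-dfs.sh

    # Remove stale NameNode/DataNode data (assumed default location)
    rm -rf /tmp/hadoop-$(whoami)

    # Reformat the NameNode
    bin/hdfs namenode -format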

6) Start Nodes and YARN:

In /sbin/, start the NameNode and DataNode by using the start-dfs.sh script, start YARN with start-yarn.sh, and evaluate the output of jps:

    ./start-dfs.sh   
    ./start-yarn.sh

At this point, if NameNode, DataNode, NodeManager and ResourceManager are all running, you should be good to go!

If any of these hasn't started, share the log output for us to re-evaluate.
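The daemon logs normally sit in the logs/ directory of the installation; a quick way to look at the most relevant ones (the file names follow the default hadoop-<user>-<daemon>-<host>.log scheme, which is an assumption about your setup):

    # Last lines of the NameNode and DataNode logs
    tail -n 50 logs/hadoop-*-namenode-*.log
    tail -n 50 logs/hadoop-*-datanode-*.log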

Remove 127.0.0.1 localhost from /etc/hosts and change your core-site.xml as follows:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://marta-komputer:9000</value>
    </property>
</configuration>

And you can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning.

Make these changes in /etc/hosts:

1. Change:

127.0.0.1    localhost

to

127.0.0.1    localhost    marta-komputer

2. Delete:

127.0.0.1    marta-komputer

3. Add:

your-system-ip    marta-komputer

To find your system IP, type this in a terminal:

ifconfig

(find your IP address here) or type this:

ifdata -pa eth0
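If ifdata is not installed (it ships with the moreutils package), either of these should show the same information (interface names may differ on your machine):

# prints all addresses assigned to this host
hostname -I

# or list interfaces with their addresses
ip addr show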

Your final /etc/hosts file should look like:

127.0.0.1       localhost       marta-komputer
your-system-ip       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Change core-site.xml:

1. Change:

hdfs://localhost:9000

to

hdfs://marta-komputer:9000

Now, stop and start the Hadoop processes.
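For example, from the Hadoop installation directory (a minimal restart sequence; paths assume the default sbin/ layout):

sbin/stop-dfs.sh
sbin/start-dfs.sh

# confirm the daemons came back up
jps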

Your jps command should list these processes:

Namenode
Datanode
TaskTracker
SecondaryNameNode

If it does not list all these processes, check respective logs for errors.

UPDATE:

  1. Follow this tutorial here

  2. If the problem persists, it might be due to a permission issue.

UPDATE II:

  1. Create directories and change permissions for the namenode and datanode:

sudo mkdir -p /usr/local/hdfs/namenode

sudo mkdir -p /usr/local/hdfs/datanode

sudo chown -R hduser:hadoop /usr/local/hdfs/namenode

sudo chown -R hduser:hadoop /usr/local/hdfs/datanode

  2. Add these properties in hdfs-site.xml:

dfs.datanode.data.dir with value /usr/local/hdfs/datanode

dfs.namenode.name.dir with value /usr/local/hdfs/namenode

  3. Stop and start the Hadoop processes.
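A short verification sketch after these changes (hduser:hadoop is the user/group from the commands above; adjust it to the account that actually runs Hadoop):

# Confirm ownership and permissions of the new storage directories
ls -ld /usr/local/hdfs/namenode /usr/local/hdfs/datanode

# Reformat and restart so the new directories are picked up
bin/hdfs namenode -format
sbin/stop-dfs.sh
sbin/start-dfs.sh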

I got this error when uploading files to HDFS from Java code. The problem was that I was using the Hadoop 1 jars to connect to a Hadoop 2 installation. I'm not sure what the problem is in your case, but if you configured Hadoop 1 earlier, that is likely what is messing things up.
