hadoop-2.2.0 mapreduce not working on ubuntu

I've installed Hadoop 2.2.0 on 64-bit Ubuntu 12.04.3 (Precise) and set up the configuration XML files as suggested in a blog post (http://tuliodomingos.blogspot.com.es/2013/04/installing-apache-hadoop-in-ubuntu-linux.html, if you're interested).

The aim is to have a "single node cluster" for DFS and MapReduce.

Because some library is missing, I often get the following message, but I don't think it is causing the problems:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[I tried a build from Maven but got super confused about what was actually going on. There seemed to be iteration after iteration of compilation, and I had no idea what was happening.]

Anyway, with my downloaded (non-Maven) Hadoop, the distributed file system seems to behave itself. However, when I try to run the WordCount MapReduce examples as per the tutorials, I get stuck. The jobs are submitted OK, but they never seem to actually run. The attached "mr_output.txt" is what is returned in the terminal.

Also, looking at the local monitoring sites (sorry, I can't post these images), one thing I notice is that they indicate zero active nodes. I don't understand what is going on, considering that DFS operations are all good.

Also, here is the output of hdfs dfsadmin -report:

13/11/06 14:08:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 412849389568 (384.50 GB)
Present Capacity: 134156435456 (124.94 GB)
DFS Remaining: 134152601600 (124.94 GB)
DFS Used: 3833856 (3.66 MB)
DFS Used%: 0.00%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost)
Hostname: rimmer-Inspiron-7520
Decommission Status : Normal
Configured Capacity: 412849389568 (384.50 GB)
DFS Used: 3833856 (3.66 MB)
Non DFS Used: 278692954112 (259.55 GB)
DFS Remaining: 134152601600 (124.94 GB)
DFS Used%: 0.00%
DFS Remaining%: 32.49%
Last contact: Wed Nov 06 14:08:18 EST 2013

If I try to invoke "yarn resourcemanager" or "yarn nodemanager", I get a very long stream of messages. The error I can see is:

13/11/06 14:15:11 FATAL nodemanager.NodeManager: Error starting NodeManager
java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

This is despite "yarn.nodemanager.aux-services" being set to "mapreduce.shuffle" in the file "yarn-site.xml".

I've gone through the official docs a bunch of times and also hit Google and the forums pretty hard. Any wisdom greatly appreciated.

Best,

Kieran

For some reason, the valid format for service names changed between Hadoop 2.1.0 and 2.2.0.

The correct value is now mapreduce_shuffle instead of mapreduce.shuffle.

cf. http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html
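As a rough illustration of the rename, the sketch below parses a sample yarn-site.xml and flags any aux-service name that breaks the rule quoted in the NodeManager error (only a-zA-Z0-9_, not starting with a number). The regular expression is my reading of that error message, not code taken from Hadoop, and the helper name invalid_aux_services and the inline sample file are made up for this example. The ShuffleHandler class property follows the form shown in the Hadoop 2.2.0 documentation linked above.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: this pattern restates the rule from the error message
# ("only a-zA-Z0-9_ and can not start with numbers"); it is not Hadoop's own code.
VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Hypothetical yarn-site.xml content using the 2.2.0-style service name.
YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>"""


def invalid_aux_services(xml_text):
    """Return the aux-service names in xml_text that break the naming rule."""
    root = ET.fromstring(xml_text)
    bad = []
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() != "yarn.nodemanager.aux-services":
            continue
        # The value may hold several comma-separated service names.
        for name in prop.findtext("value", "").split(","):
            name = name.strip()
            if name and not VALID_NAME.match(name):
                bad.append(name)
    return bad


print(invalid_aux_services(YARN_SITE))  # prints []
```

Running the same check on a file that still contains mapreduce.shuffle returns that name, because the dot is not an allowed character.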

Even after changing the value of "yarn.nodemanager.aux-services" to "mapreduce_shuffle", there would still be problems getting the namenode up.

It seems that Hadoop 2.2.0 was shipped to work out of the box on 32-bit machines only, due to a folder structure change from 1.2.0: the $HADOOP_INSTALL/lib directory now has only one set of libraries (those which work on 32-bit systems only).

Earlier, in 1.2.0, that libraries directory contained two subdirectories, "Linux-amd64-64" and "Linux-i386-32", corresponding to the 64-bit and 32-bit architectures respectively.
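One way to see which architecture a shipped native library targets is to read its ELF header: the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The following is a minimal sketch of that check; the function name elf_bits is made up for this example, and the exact location of the native library under $HADOOP_INSTALL/lib is an assumption you would need to confirm on your own installation.

```python
def elf_bits(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    # An ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 is EI_CLASS: 1 means 32-bit, 2 means 64-bit.
    return {1: 32, 2: 64}.get(header[4])
```

Calling elf_bits on the libhadoop shared object in your installation and comparing the result with the output of uname -m would show whether the library matches the machine.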

There is a discussion about the 32-bit native libraries here:

https://issues.apache.org/jira/browse/HADOOP-9911

There is also a page suggesting that you can compile from source on x64:

http://blog.csdn.net/focusheart/article/details/14058153

PS: I haven't been able to compile it without errors, though. The issue on the JIRA thread above is unresolved as well.

EDIT: Because of all the above, everything except the namenode is up and running, which is why you would see the nodemanager, resourcemanager, secondarynamenode (as far as I know, it can't "replace" a namenode) and datanode up and running.

Have you tried running in standalone mode only first?
