繁体   English   中英

实施Hadoop和MongoDB连接器

[英]Implementing the Hadoop and MongoDB connector

我第一次使用Hadoop,因为我计划将其用于MongoDB。 安装Hadoop之后,我尝试遵循本教程并实现其示例http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-hadoop/

一切正常,直到我调用此命令

bash examples/treasury_yield/run_job.sh

然后我得到以下消息

14/03/11 17:52:45 INFO util.MongoTool: Created a conf: 'Configuration: core-defa
ult.xml, core-site.xml, src/examples/hadoop-local.xml, src/examples/mongo-defaul
ts.xml' on {class com.mongodb.hadoop.examples.treasury.TreasuryYieldXMLConfig} a
s job named '<unnamed MongoTool job>'
14/03/11 17:52:46 INFO util.MongoTool: Mapper Class: class com.mongodb.hadoop.ex
amples.treasury.TreasuryYieldMapper
14/03/11 17:52:46 INFO util.MongoTool: Setting up and running MapReduce job in f
oreground, will wait for results.  {Verbose? true}
14/03/11 17:52:47 WARN fs.FileSystem: "localhost:9100" is a deprecated filesyste
m name. Use "hdfs://localhost:9100/" instead.
14/03/11 17:52:47 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop
.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-goncalopereira/mapre
d/staging/goncalopereira/.staging/job_201403111752_0001/job.jar could only be re
plicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBloc
k(FSNamesystem.java:1639)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.jav
a:736)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma
tion.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryI
nvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocat
ionHandler.java:62)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock
(DFSClient.java:3686)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStrea
m(DFSClient.java:3546)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClien
t.java:2749)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFS
Client.java:2989)

14/03/11 17:52:47 WARN hdfs.DFSClient: Error Recovery for block null bad datanod
e[0] nodes == null
14/03/11 17:52:47 WARN hdfs.DFSClient: Could not get block locations. Source fil
e "/tmp/hadoop-goncalopereira/mapred/staging/goncalopereira/.staging/job_2014031
11752_0001/job.jar" - Aborting...
14/03/11 17:52:47 INFO mapred.JobClient: Cleaning up the staging area hdfs://loc
alhost:9100/tmp/hadoop-goncalopereira/mapred/staging/goncalopereira/.staging/job
_201403111752_0001
14/03/11 17:52:47 ERROR security.UserGroupInformation: PriviledgedActionExceptio
n as:goncalopereira cause:org.apache.hadoop.ipc.RemoteException: java.io.IOExcep
tion: File /tmp/hadoop-goncalopereira/mapred/staging/goncalopereira/.staging/job
_201403111752_0001/job.jar could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBloc
k(FSNamesystem.java:1639)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.jav
a:736)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma
tion.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

14/03/11 17:52:47 ERROR util.MongoTool: Exception while executing job...
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-gon
calopereira/mapred/staging/goncalopereira/.staging/job_201403111752_0001/job.jar
 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBloc
k(FSNamesystem.java:1639)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.jav
a:736)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma
tion.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryI
nvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocat
ionHandler.java:62)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock
(DFSClient.java:3686)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStrea
m(DFSClient.java:3546)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClien
t.java:2749)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFS
Client.java:2989)
14/03/11 17:52:47 ERROR hdfs.DFSClient: Failed to close file /tmp/hadoop-goncalo
pereira/mapred/staging/goncalopereira/.staging/job_201403111752_0001/job.jar
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-gon
calopereira/mapred/staging/goncalopereira/.staging/job_201403111752_0001/job.jar
 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBloc
k(FSNamesystem.java:1639)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.jav
a:736)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma
tion.java:1149)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryI
nvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocat
ionHandler.java:62)
        at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock
(DFSClient.java:3686)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStrea
m(DFSClient.java:3546)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClien
t.java:2749)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFS
Client.java:2989)

如您所料,这对于像我这样的新手来说有点不知所措。 我认为这是Hadoop的问题,但不能完全确定是什么问题。 我真的希望这里有人可以指出正确的方向。

嗨,我已使用mongoDBConnector通过此链接将hadoop与mongodb连接

hadoop与mongodb的连接

您需要专注于此错误:

ERROR security.UserGroupInformation: PriviledgedActionExceptio n as:goncalopereira cause:org.apache.hadoop.ipc.RemoteException: java.io.IOExcep tion: File /tmp/hadoop-goncalopereira/mapred/staging/goncalopereira/.staging/job _201403111752_0001/job.jar could only be replicated to 0 nodes, instead of 1

  1. 检查路径上是否存在该jar。

  2. 检查是否要启动dataNode,因为它需要花费一些时间才能启动。

确保您的hadoop已正确安装,并尝试仅针对hadoop运行示例数据集,而不必将MangoDB变成图片。 这将区分出问题所在。 希望能帮助到你。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM