MongoDB Hadoop connector streaming not running

I want to launch the MongoDB Hadoop Streaming connector, so I downloaded a compatible version of Hadoop (2.2.0); see https://github.com/mongodb/mongo-hadoop/blob/master/README.md#apache-hadoop-22

I cloned the mongo-hadoop git repository and changed hadoopRelease in build.sbt to 2.2:

$ cat build.sbt
name := "mongo-hadoop"

organization := "org.mongodb"

hadoopRelease in ThisBuild := "2.2"

Then I ran:

$ ./sbt package
$ ./sbt mongo-hadoop-streaming/assembly
$ cp core/target/mongo-hadoop-core_2.2.0-1.2.0.jar ../hadoop-2.2.0/lib/
$ cp mongo-2.7.3.jar ../hadoop-2.2.0/lib/ # Previously downloaded
$ cd ../hadoop-2.2.0/
$ ./bin/hadoop jar ../mongo-hadoop/streaming/target/mongo-hadoop-streaming-assembly-1.1.0.jar -mapper ...

And I get this:

Exception in thread "main" java.lang.ClassNotFoundException: com.mongodb.hadoop.streaming.MongoStreamJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:249)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

I don't understand why. I tried almost every version that is supposed to support streaming, but I always get the same error!

I should add that I am on Mac OS X. Thanks!

That is actually a bug that will be fixed in an upcoming release. The need for that main class was removed, but the Main-Class entry in the generated manifest was not. You can work around it by removing the Main-Class entry from the manifest in the streaming jar. If you run the script below in the directory where your streaming jar is, passing the jar's file name as the first argument, it'll fix that for you:

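#!/bin/sh
# Usage: run from the directory containing the streaming jar,
# passing the jar's file name as the first argument.
set -e

JAR=$1
M=META-INF/MANIFEST.MF

mkdir tmp
cd tmp

# Unpack the jar into the scratch directory.
jar xf "../${JAR}"

# Drop the stale Main-Class entry; write the edited manifest outside the unpacked tree.
sed -e '/Main-Class/d' "${M}" > ../manifest.new
rm "${M}"

# Rebuild the jar in place from the edited manifest plus all extracted files.
jar cfm "../${JAR}" ../manifest.new .

cd ..
rm -r tmp manifest.new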

It's not super pretty, but it should get you over the hump. We'll try to get a formal 1.2.1 release out soonish. Here's the Jira ticket in the meantime: https://jira.mongodb.org/browse/HADOOP-121
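
To make the manifest edit itself concrete, here is a minimal sketch that applies the same kind of sed filter to a stand-alone manifest file. The function name strip_main_class and the sample manifest contents are hypothetical, used only for illustration; the real manifest lives inside the streaming jar at META-INF/MANIFEST.MF.

```shell
#!/bin/sh
# Hypothetical helper (not part of mongo-hadoop): print the manifest
# given as $1 with any Main-Class line removed.
strip_main_class() {
    sed -e '/^Main-Class:/d' "$1"
}

# Illustrative sample manifest, written to a temp file so the example
# runs on its own. The contents are an assumption, not a real build output.
sample=$(mktemp /tmp/manifest.XXXXXX)
printf 'Manifest-Version: 1.0\nMain-Class: com.mongodb.hadoop.streaming.MongoStreamJob\nCreated-By: example\n' > "$sample"

# Prints the manifest without the Main-Class line.
strip_main_class "$sample"
rm -f "$sample"
```

After patching a real jar, printing its manifest (for example with unzip -p on the jar and META-INF/MANIFEST.MF) should show that the Main-Class line is gone.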
