How to use external jars in Cloudera Hadoop?
I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google protobuf jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.
This causes unresolved compilation problems at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their service?
Thanks.
Yes, you can run MR jobs with external jars. Be sure to add any dependencies to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples:
You can use the following to add all the jar dependencies from the current directory and the lib directory:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`
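One caveat with the one-liner above: when a glob matches nothing, the shell leaves the literal pattern (for example lib/*.jar) in the classpath, and an empty HADOOP_CLASSPATH produces a leading colon. A minimal sketch of a more defensive variant follows. The function name build_classpath is made up for illustration and is not part of Hadoop, and it assumes jar paths without spaces or colons.

```shell
#!/usr/bin/env bash
# build_classpath DIR...: print a colon-separated list of the *.jar files
# found in the given directories. Hypothetical helper, not part of Hadoop.
build_classpath() {
  local out="" dir jar
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # An unmatched glob is left as a literal pattern, so skip non-files.
      [ -f "$jar" ] || continue
      out="${out:+$out:}$jar"
    done
  done
  printf '%s\n' "$out"
}

# Append to any existing value without leaving a stray leading colon,
# and leave the variable alone when no jars were found.
extra=$(build_classpath . lib)
if [ -n "$extra" ]; then
  export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$extra"
fi
```

Run it from the directory that holds your job jar and its lib folder, the same place you would run the one-liner.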
Bear in mind that when starting a job through hadoop jar, you'll also need to pass it the jars of any dependencies through -libjars. I like to use:
hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]
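Two details are easy to miss here. First, -libjars is one of Hadoop's generic options, so it only takes effect when the job's main class parses them, typically by implementing Tool and launching through ToolRunner. Second, it has to come after the class name and before the job's own arguments. The sketch below only prints the command so the ordering can be checked before submitting. The function submit_cmd and all the names passed to it are placeholders, not taken from the original post.

```shell
#!/usr/bin/env bash
# submit_cmd JAR CLASS LIBJARS [ARGS...]: print the hadoop command line
# instead of running it. Hypothetical helper for illustration only.
submit_cmd() {
  local jar="$1" class="$2" libjars="$3"
  shift 3
  # Generic options (-libjars) sit between the class name and the job arguments.
  echo hadoop jar "$jar" "$class" -libjars "$libjars" "$@"
}

# Placeholder jar, class, and paths; substitute your own.
submit_cmd my-job.jar com.example.MyJob lib/a.jar,lib/b.jar input output
```

Once the printed line looks right, run it directly, or drop the echo from the function.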
NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while the -libjars list needs to be comma-separated (,).
EDIT: If you need your classpath to take precedence, to ensure your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:
export HADOOP_USER_CLASSPATH_FIRST=true
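To check which protobuf jar would win, you can look at the order of entries in the classpath. Below is a rough sketch that prints the first classpath entry matching a pattern. The helper first_match and the sample classpath are made up for illustration; on a real cluster you would feed it the output of the hadoop classpath command instead.

```shell
#!/usr/bin/env bash
# first_match CLASSPATH PATTERN: print the first colon-separated entry
# whose path contains PATTERN. Hypothetical helper, not part of Hadoop.
first_match() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -- "$2" | head -n 1
}

# Made-up classpath standing in for the output of `hadoop classpath`.
sample="lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar"
first_match "$sample" protobuf
```

On a cluster, first_match "$(hadoop classpath)" protobuf should show your 2.5 jar if the setting took effect. Note that the real classpath may contain wildcard entries such as a lib/* directory, which hide individual jar names, so an empty result does not prove the older jar is absent.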