
How to use external jars in Cloudera Hadoop?

I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google Protocol Buffers (protobuf) jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.

This causes unresolved compilation problems at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their service?

Thanks.

Yes, you can run MR jobs with external jars.

Be sure to add any dependencies to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples.

You can use the following to add all the jar dependencies from the current directory and the lib directory:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`

Bear in mind that when starting a job through hadoop jar, you also need to pass it the jars of any dependencies with -libjars. This is a generic option, so it is only honored when the job's main class parses generic options (for example by running through ToolRunner). I like to use:

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while the -libjars list needs to be comma-separated (,).
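If you would rather not repeat the echo/sed pipeline with two different delimiters, a small shell function can do the joining. This is only a sketch: join_jars is a hypothetical helper name, not something Hadoop ships, and it assumes the jars sit directly inside the directory you pass in.

```shell
# join_jars DELIM DIR: print every *.jar directly inside DIR, joined by DELIM.
# Hypothetical helper for illustration; not part of Hadoop.
join_jars() {
  delim=$1
  dir=$2
  out=
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded pattern when DIR contains no jars.
    [ -e "$jar" ] || continue
    # Prefix the delimiter only once the list is non-empty.
    out=${out:+$out$delim}$jar
  done
  printf '%s\n' "$out"
}

# Intended use (commented out because it needs a Hadoop install):
#   export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(join_jars : lib)"
#   hadoop jar <jar> <class> -libjars "$(join_jars , lib)" [args...]
```

Unlike the echo approach, this prints nothing when the directory has no jars, and it copes with spaces in file names as long as the result is quoted.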

EDIT: If you need your classpath to take precedence, so that your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:

export HADOOP_USER_CLASSPATH_FIRST=true
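To sanity-check the ordering, you can look at which matching entry comes first in a colon-separated classpath string. The function below is a rough sketch; first_on_classpath is a made-up name and the paths in the comment are hypothetical. It only does a plain substring match on each entry.

```shell
# first_on_classpath CLASSPATH PATTERN: print the first ':'-separated entry
# of CLASSPATH that contains PATTERN as a fixed string (prints nothing if
# no entry matches). Hypothetical helper for illustration only.
first_on_classpath() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -F -- "$2" | head -n 1
}

# Example with made-up paths:
#   first_on_classpath "/opt/app/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar" protobuf
# Against a real install (commented out because it needs Hadoop):
#   first_on_classpath "$(hadoop classpath)" protobuf
```

One caveat: the output of hadoop classpath may contain wildcard entries such as a directory followed by /*, which this check does not expand, so an empty result does not prove the jar is absent.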
