How do I use external jars in Cloudera Hadoop?

Question

I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google Protocol Buffers (protobuf) jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.

This causes unresolved compilation problems at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their service?

Thanks.

Answer 1

Yes, you can run MR jobs with external jars.

Be sure to add every dependency to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples.

You can use the following to add all the jar dependencies from the current directory and the lib directory:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`

Bear in mind that when you start a job through hadoop jar, you also need to pass the jars of any dependencies with -libjars. I like to use:

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while the -libjars value must be comma-separated (,).
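As a rough illustration of the delimiter difference, here is a minimal sketch that builds both lists from the same glob patterns. The helper name join_jars and the variables CP_JARS and LIB_JARS are made up for this example and are not part of Hadoop. Unlike the echo/sed one-liners, it tolerates file names containing spaces and skips patterns that match nothing.

```shell
#!/bin/sh
# join_jars SEP FILE...: print the existing paths joined by SEP.
# Hypothetical helper for illustration; it is not shipped with Hadoop.
join_jars() {
    sep=$1
    shift
    out=""
    for jar in "$@"; do
        # A glob that matched nothing stays unexpanded; skip such entries.
        [ -e "$jar" ] || continue
        if [ -z "$out" ]; then
            out=$jar
        else
            out="$out$sep$jar"
        fi
    done
    printf '%s\n' "$out"
}

# Colon-separated for HADOOP_CLASSPATH, comma-separated for -libjars.
CP_JARS=$(join_jars : ./*.jar ./lib/*.jar)
LIB_JARS=$(join_jars , ./lib/*.jar)
echo "HADOOP_CLASSPATH addition: $CP_JARS"
echo "-libjars value: $LIB_JARS"
```

The results could then be used as export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$CP_JARS" and hadoop jar <jar> <class> -libjars "$LIB_JARS" [args...]. Note that -libjars is a generic option handled by GenericOptionsParser, so it only takes effect if the job's driver class parses generic options (for example by running through ToolRunner).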

EDIT: If you need your classpath entries to be searched first, so that your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:

export HADOOP_USER_CLASSPATH_FIRST=true
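
One way to sanity-check the ordering is to look for the first classpath entry that mentions protobuf. The sketch below is only an illustration: first_match is a hypothetical helper, and the jar paths in the example call are made up.

```shell
#!/bin/sh
# first_match PATTERN CLASSPATH: print the first colon-separated entry of
# CLASSPATH that contains PATTERN (hypothetical helper, not part of Hadoop).
first_match() {
    pattern=$1
    # Split the classpath into one entry per line and keep the first hit.
    printf '%s\n' "$2" | tr ':' '\n' | grep -- "$pattern" | head -n 1
}

# Example with made-up paths: the user-supplied 2.5 jar is listed first.
first_match protobuf "lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar"
```

On a real client this could be run as first_match protobuf "$(hadoop classpath)". Bear in mind that the printed classpath often contains wildcard directory entries such as some/dir/*, which hide the individual jar names, so a match on your own jar shows only where it sits in the list, not that no bundled copy precedes it.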

Answer 1
2 2013-04-20 20:06:55
