简体   繁体   中英

how to use external jars in Cloudera hadoop?

i have a cloudera hadoop version 4 installed on my cluster. It comes packaged with google protobuffer jar version 2.4. in my application code i use protobuffer classes compiled with protobuffer version 2.5.

This causes unresolved compilation problems at run time. Is there a way to run the map reduce jobs with an external jar or am i stuck until cloudera upgrades their service?

Thanks.

Yes you can run MR jobs with external jars.

Be sure to add any dependencies to both the HADOOP_CLASSPATH and -libjars upon submitting a job like in the following examples:

You can use the following to add all the jar dependencies from current and lib directories:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar`:`echo lib/*.jar | sed 's/ /:/g'`

Bear in mind that when starting a job through hadoop jar you'll need to also pass it the jars of any dependencies through use of -libjars . I like to use:

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The sed commands require a different delimiter character; the HADOOP_CLASSPATH is : separated and the -libjars need to be , separated.

EDIT: If you need your classpath to be interpreted first to ensure your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:

export HADOOP_USER_CLASSPATH_FIRST=true

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM