Can't get rid of java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.ql.io.RCFileInputFormat
I'm trying to run a simple MapReduce job that reads data from an RCFile.
I run the code with the hadoop command:
hadoop jar MRJobRCFile.jar MRJobRCFile <inputRCfile> <outputfile>
Despite adding the hive-exec jar to the Hadoop classpath, I still get this error.
export HADOOP_CLASSPATH=/opt/cmr/hadoopinstall/hive-0.10.0-cdh4.4.0/lib/hive-exec-0.10.0-cdh4.4.0.jar
How else can I add the jar?
I checked which classes the JVM loads using -verbose:class:
[Loaded org.apache.hadoop.hive.ql.io.RCFileInputFormat from file:/opt/cmr/hadoopinstall/hive-0.10.0-cdh4.4.0/lib/hive-exec-0.10.0-cdh4.4.0.jar]
So RCFileInputFormat is being loaded by the JVM.
Any idea how to proceed on this issue?
Error:
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.ql.io.RCFileInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1649)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.ql.io.RCFileInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1617)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java
Resolved by adding the hive-exec jar to both HADOOP_CLASSPATH and the distributed cache.
Adding the jar to the distributed cache makes it available to the remote map and reduce task JVMs. Adding it to HADOOP_CLASSPATH makes it available to the client JVM (created by the hadoop jar command). The stack trace above comes from org.apache.hadoop.mapred.Child, i.e. a task JVM, which is why setting HADOOP_CLASSPATH alone was not enough even though -verbose:class showed the class being loaded on the client.
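One common way to do both steps from the command line is sketched below. It is only a sketch: it assumes the MRJobRCFile driver parses Hadoop's generic options (for example by running through ToolRunner), because -libjars is ignored otherwise. The jar path is the one from the question, and the two positional arguments stand in for the input RCFile and the output path.

```shell
# Path copied from the question; adjust it to your installation.
HIVE_EXEC_JAR=/opt/cmr/hadoopinstall/hive-0.10.0-cdh4.4.0/lib/hive-exec-0.10.0-cdh4.4.0.jar

# Client JVM (started by `hadoop jar`): append the jar to HADOOP_CLASSPATH,
# keeping any entries that are already set.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$HIVE_EXEC_JAR"

# Task JVMs: -libjars ships the jar to the tasks through the distributed cache.
# ASSUMPTION: the driver handles generic options (e.g. via ToolRunner);
# if it does not, -libjars is treated as an ordinary argument and has no effect.
# $1 and $2 are placeholders for the input RCFile and the output path.
if command -v hadoop >/dev/null 2>&1; then
  hadoop jar MRJobRCFile.jar MRJobRCFile -libjars "$HIVE_EXEC_JAR" "$1" "$2"
else
  echo "hadoop not found on PATH; skipping job submission" >&2
fi
```

If the driver does not go through ToolRunner, the jar has to be added to the distributed cache from the job code instead, or bundled into the job jar.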