ClassNotFoundException when running WordCount example in Eclipse
I'm trying to run the example code for the WordCount MapReduce job. I'm running it on Hadoop 1.2.1, from Eclipse. Here is the code I try to run:
package mypackage;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.job.tracker", "maprfs://my_address");
        conf.set("fs.default.name", "hdfs://my_address");

        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}
Unfortunately, running this code ends with the following error:
13/11/04 13:27:53 INFO mapred.JobClient: Task Id : attempt_201310311611_0005_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: com.rf.hadoopspikes.WordCount$Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:718)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
I understand that the WordCount class cannot be found, but I have no idea how to make this work. Any ideas?
When running this directly from Eclipse, you need to make sure the classes have been bundled into a jar file (which Hadoop then copies up to HDFS). Your error most probably relates to the fact that your jar hasn't been built, or that at runtime the classes are being loaded from the output directory rather than the bundled jar.
Try exporting the classes into a jar file, and then run your WordCount class from that jar. You could also look into using the Eclipse Hadoop plugin, which I think handles all of this for you. A final option would be to bundle the jar and then launch it from the command line (as outlined in the various Hadoop tutorials).