MapReduce doesn't produce an output

I want to run a simple MapReduce job on a text file, but it doesn't produce any output. This is my code:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

I receive these errors when I run the jar file:

17/05/07 23:10:53 WARN mapred.LocalJobRunner: job_local973452829_0001
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: /app/hadoop/tmp%20/mapred/local/localRunner/hduser/jobcache/job_local973452829_0001/attempt_local973452829_0001_m_000000_0/output/file.out.index
at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:193)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:70)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:123)
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:101)
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher

What's the problem in my code? I'm on Ubuntu 14.04 with Hadoop 2.4.

Try running the job with a command of this form:

hadoop jar (jar file name) (input name) (output name)

Also, did you get any warnings when you exported the jar file?
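
Since the `main` method in the question reads `args[0]` and `args[1]` without checking them, a small guard can make a mistyped command fail with a clear message instead of an exception. The following is only a sketch: `WordCountArgs` and its usage text are hypothetical names for illustration, not part of Hadoop or of the original code.

```java
// Hypothetical helper: validates the two path arguments the WordCount driver expects.
class WordCountArgs {
    static final String USAGE = "Usage: WordCount <input path> <output path>";

    // Returns null when the arguments look usable, otherwise an error message.
    static String check(String[] args) {
        if (args.length != 2) {
            return USAGE;
        }
        if (args[0].trim().isEmpty() || args[1].trim().isEmpty()) {
            return "Input and output paths must not be empty. " + USAGE;
        }
        return null;
    }

    public static void main(String[] args) {
        String error = check(args);
        if (error != null) {
            System.err.println(error);
            System.exit(2);
        }
        System.out.println("input=" + args[0] + ", output=" + args[1]);
    }
}
```

In the driver, the same check could run as the first statement of `main`, before the `Job` is created.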

This is part of your error:

Caused by: java.io.FileNotFoundException: /app/hadoop/tmp%20/mapred/local/localRunner/hduser/jobcache/job_local973452829_0001/attempt_local973452829_0001_m_000000_0/output/file.out.index

I'm guessing it's a problem with the configuration property hadoop.tmp.dir in your core-site.xml: Hadoop is unable to store the temporary output files (from the Mapper) on your disk. Note the tmp%20 in the path; %20 is a URL-encoded space, so the configured value may contain a stray trailing space.

You can remove that property so that Hadoop creates its own temporary directory to store the intermediate output, or set it to a directory with appropriate permissions.
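
Before editing core-site.xml, it may help to check the directory value itself. The sketch below is a minimal, hypothetical helper (`TmpDirCheck` is not part of Hadoop) that uses only the Java standard library. It decodes the path fragment from the stack trace, where `%20` is a URL-encoded space, and reports whether a candidate value has surrounding whitespace, does not exist, or is not writable. The default value used in `main` is an assumption inferred from the trace, not a confirmed setting.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: sanity-checks a candidate hadoop.tmp.dir value.
class TmpDirCheck {

    // True if the configured value starts or ends with whitespace.
    static boolean hasSurroundingWhitespace(String value) {
        return !value.equals(value.trim());
    }

    // Decodes percent-escapes such as %20 so a hidden space becomes visible.
    static String decode(String pathFromLog) {
        try {
            return URLDecoder.decode(pathFromLog, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Returns a short diagnosis for the given directory value.
    static String diagnose(String value) {
        if (hasSurroundingWhitespace(value)) {
            return "value has leading or trailing whitespace";
        }
        Path dir = Paths.get(value);
        if (!Files.isDirectory(dir)) {
            return "directory does not exist";
        }
        if (!Files.isWritable(dir)) {
            return "directory is not writable by this user";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Path fragment copied from the stack trace in the question.
        String fromLog = "/app/hadoop/tmp%20/mapred/local";
        System.out.println("[" + decode(fromLog) + "]");

        // Pass the hadoop.tmp.dir value as the first argument;
        // the default here is an assumed value with a trailing space.
        String value = args.length > 0 ? args[0] : "/app/hadoop/tmp ";
        System.out.println(diagnose(value));
    }
}
```

Run it as the same user that launches the job (hduser in the trace), since the writability check depends on that user's permissions.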

