Hadoop MapReduce reducer does not start

The map phase runs and then the job just quits without ever running the reducer. The job alternately prints "Hello from mapper." and "Writing CellWithTotalAmount", and that's it. The output directory it creates is empty.

I've checked at least a dozen other "reducer won't start" questions and have not found an answer. I've checked that the output types of map match the input types of reduce, that reduce takes an Iterable, that the correct output classes have been set, etc.

Job config

public class HoursJob {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
          System.err.println("Usage: HoursJob <input path> <output path>");
          System.exit(-1);
        }

        Job job = Job.getInstance();
        job.setJarByClass(HoursJob.class);
        job.setJobName("Hours job");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(HoursMapper.class);
        job.setReducerClass(HoursReducer.class);

        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(CellWithTotalAmount.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        int ret = job.waitForCompletion(true) ? 0 : 1;
        System.exit(ret);
    }
}

Mapper

public class HoursMapper 
        extends Mapper<LongWritable, Text, IntWritable, CellWithTotalAmount> {
    static double BEGIN_LONG = -74.913585;
    static double BEGIN_LAT = 41.474937;
    static double GRID_LENGTH = 0.011972;
    static double GRID_HEIGHT = 0.008983112;

    @Override
    public void map(LongWritable key, Text value, Mapper.Context context)
            throws IOException, InterruptedException {

        System.out.println("Hello from mapper.");
        String recordString = value.toString();
        try {
            DEBSFullRecord record = new DEBSFullRecord(recordString);
            Date pickupDate = record.getPickup();
            Calendar calendar = GregorianCalendar.getInstance();
            calendar.setTime(pickupDate);
            int pickupHour = calendar.get(Calendar.HOUR_OF_DAY);
            int cellX = (int)
                ((record.getPickupLongitude() - BEGIN_LONG) / GRID_LENGTH) + 1;
            int cellY = (int)
                ((BEGIN_LAT - record.getPickupLatitude()) / GRID_HEIGHT) + 1;

            CellWithTotalAmount hourInfo = 
                new CellWithTotalAmount(cellX, cellY, record.getTotal());
            context.write(new IntWritable(pickupHour), hourInfo);
        } catch (Exception ex) {
            System.out.println(
                "Cannot parse: " + recordString + "due to the " + ex);
        }
    }
}

Reducer

public class HoursReducer 
        extends Reducer<IntWritable, CellWithTotalAmount, Text, NullWritable> {
    @Override
    public void reduce(IntWritable key, Iterable<CellWithTotalAmount> values, 
            Context context) throws IOException, InterruptedException {
        System.out.println("Hello from reducer.");
        int[][] cellRideCounters = getCellRideCounters(values);
        CellWithRideCount cellWithMostRides = 
            getCellWithMostRides(cellRideCounters);

        int[][] cellTotals = getCellTotals(values);
        CellWithTotalAmount cellWithGreatestTotal = 
            getCellWithGreatestTotal(cellTotals);

        String output = key + " "
            + cellWithMostRides.toString() + " "
            + cellWithGreatestTotal.toString();

        context.write(new Text(output), NullWritable.get());
    }

    //omitted for brevity
}

Custom writable class

public class CellWithTotalAmount implements Writable {
    public int cellX;
    public int cellY;
    public double totalAmount;

    public CellWithTotalAmount(int cellX, int cellY, double totalAmount) {
        this.cellX = cellX;
        this.cellY = cellY;
        this.totalAmount = totalAmount;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        System.out.println("Writing CellWithTotalAmount");
        out.writeInt(cellX);
        out.writeInt(cellY);
        out.writeDouble(totalAmount);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        System.out.println("Reading CellWithTotalAmount");
        cellX = in.readInt();
        cellY = in.readInt();
        totalAmount = in.readDouble();
    }

    @Override
    public String toString() {
        return cellX + " " + cellY + " " + totalAmount;
    }
}
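
One thing that may be worth checking in the class above: Hadoop creates Writable instances reflectively and then calls readFields on them, which requires a no-argument constructor, and CellWithTotalAmount only declares the three-argument one. The sketch below illustrates that write/readFields round trip using only the JDK. CellSketch and WritableRoundTrip are hypothetical stand-ins (they do not implement Hadoop's Writable), so treat this as an illustration of the contract rather than a confirmed diagnosis of the job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for CellWithTotalAmount. It does not implement
// Hadoop's Writable so that the sketch runs with the JDK alone.
class CellSketch {
    int cellX;
    int cellY;
    double totalAmount;

    // No-argument constructor: needed when an instance is created
    // reflectively and then populated through readFields.
    CellSketch() {
    }

    CellSketch(int cellX, int cellY, double totalAmount) {
        this.cellX = cellX;
        this.cellY = cellY;
        this.totalAmount = totalAmount;
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(cellX);
        out.writeInt(cellY);
        out.writeDouble(totalAmount);
    }

    void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        cellX = in.readInt();
        cellY = in.readInt();
        totalAmount = in.readDouble();
    }

    @Override
    public String toString() {
        return cellX + " " + cellY + " " + totalAmount;
    }
}

class WritableRoundTrip {
    // Serializes a cell, then rebuilds it the way a framework would:
    // reflective no-argument construction followed by readFields.
    static CellSketch roundTrip(CellSketch original) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            original.write(out);
        }
        // Throws NoSuchMethodException if the no-argument constructor is missing.
        CellSketch copy = CellSketch.class.getDeclaredConstructor().newInstance();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new CellSketch(3, 7, 12.5)));
    }
}
```

Removing the empty CellSketch() constructor makes the reflective getDeclaredConstructor() call fail, which is the kind of failure that would show up after the map-side "Writing CellWithTotalAmount" messages but before any "Reading CellWithTotalAmount" output.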

I think exceptions are being thrown in the reduce function, so the framework cannot complete the job properly.

    public class HoursReducer 
            extends Reducer<IntWritable, CellWithTotalAmount, Text, NullWritable> {
        @Override
        public void reduce(IntWritable key, Iterable<CellWithTotalAmount> values, 
                Context context) throws IOException, InterruptedException {
            System.out.println("Hello from reducer.");
            try {
                int[][] cellRideCounters = getCellRideCounters(values);
                // Check the counters before using them; this condition is a
                // guess, see the notes after the code.
                if (cellRideCounters.length > 0 && cellRideCounters[0].length > 0) {
                    CellWithRideCount cellWithMostRides = 
                        getCellWithMostRides(cellRideCounters);

                    int[][] cellTotals = getCellTotals(values);
                    CellWithTotalAmount cellWithGreatestTotal = 
                        getCellWithGreatestTotal(cellTotals);

                    String output = key + " "
                        + cellWithMostRides.toString() + " "
                        + cellWithGreatestTotal.toString();

                    context.write(new Text(output), NullWritable.get());
                }
            } catch (Exception e) {
                // Print any exception thrown inside reduce, then return.
                e.printStackTrace();
                return;
            }
        }

        // helper methods omitted, as in the question
    }

  • Add a try-catch to surface any exceptions thrown in the reduce function.
  • Return from the function in the catch block.

Also add an if statement before calling getCellWithMostRides(..); I think the issue is there. I guessed at the condition, so change it to whatever fits your case.
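
A related point to keep in mind once reduce does run: the values Iterable that Hadoop passes to reduce can only be traversed once, so a second helper that loops over the same values would see nothing. Below is a minimal sketch of collecting ride counts and totals in a single pass. RideCell, SinglePassTotals, GRID_SIZE and oneShot are hypothetical names for illustration only, and the sketch uses plain JDK types instead of the Hadoop classes from the question.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for CellWithTotalAmount, using plain JDK types.
class RideCell {
    final int x;
    final int y;
    final double total;

    RideCell(int x, int y, double total) {
        this.x = x;
        this.y = y;
        this.total = total;
    }
}

class SinglePassTotals {
    // Hypothetical grid dimension, chosen only for this example.
    static final int GRID_SIZE = 4;

    // Fills both per-cell ride counts and per-cell totals in one traversal,
    // so the values are never iterated a second time.
    static void accumulate(Iterable<RideCell> values,
                           int[][] counts, double[][] totals) {
        for (RideCell cell : values) {
            counts[cell.x][cell.y]++;
            totals[cell.x][cell.y] += cell.total;
        }
    }

    // Wraps an iterator in an Iterable that always hands back the same
    // iterator, emulating values that can be consumed only once.
    static <T> Iterable<T> oneShot(Iterator<T> iterator) {
        return () -> iterator;
    }

    public static void main(String[] args) {
        Iterable<RideCell> values = oneShot(Arrays.asList(
                new RideCell(1, 2, 10.0),
                new RideCell(1, 2, 5.5),
                new RideCell(0, 3, 1.0)).iterator());

        int[][] counts = new int[GRID_SIZE][GRID_SIZE];
        double[][] totals = new double[GRID_SIZE][GRID_SIZE];
        accumulate(values, counts, totals);

        System.out.println(counts[1][2] + " rides, total " + totals[1][2]);
        // A second traversal finds nothing left to read.
        System.out.println("values left: " + values.iterator().hasNext());
    }
}
```

In the reducer from the question this would mean merging getCellRideCounters and getCellTotals into one loop, or copying the needed fields into a list first if two passes are really required.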
