How do I get each reduce task's execution time programmatically in Hadoop?

I am running a simple MapReduce job in Hadoop. In plain Java I can measure elapsed time by calling System.currentTimeMillis() at the start and at the end. How do I get the same measurement (endTime - startTime) for the map and reduce tasks in MapReduce? I tried the following code, with job.setNumReduceTasks(4):

Edited:

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // process values; the timing below covers this single reduce() call
    long start = System.currentTimeMillis();
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();

    System.out.println(" time Taken " + (end - start));
}

But the result was:

time Taken 1
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 ----------
 ----------

But I set the number of reduce tasks to 4, and the output above shows the time taken to process each key-value pair instead.

After adding the setup() and cleanup() methods:

public void run(Context context) throws IOException, InterruptedException {
    start = System.currentTimeMillis();
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}

I have set the number of reducers to 4 using job.setNumReduceTasks(4), but only one timestamp is printed. Am I doing anything wrong here?

To find the total time of a reducer you can:

  1. Add a long field to the reducer class to hold the start time.
  2. Set the start time in the setup() method of the reducer.
  3. Get the end time in the cleanup() method of the reducer, and subtract the stored start time from it to get the total time.
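
The three steps above can be sketched as follows. To keep the example runnable without a cluster, it uses a hypothetical stand-in base class (TaskLifecycle) that mirrors the setup / reduce-per-key / cleanup loop from the run() method quoted in the question; it is not Hadoop's own Reducer. The class names, the elapsedMillis field, and the sample input are assumptions made for illustration only. In a real job, the same field and the two overrides would presumably go into the class that extends org.apache.hadoop.mapreduce.Reducer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReducerTimingSketch {

    // Hypothetical stand-in for the reducer lifecycle (NOT the Hadoop class).
    static abstract class TaskLifecycle {
        protected void setup() {}
        protected abstract void reduce(String key, List<Integer> values);
        protected void cleanup() {}

        // Same shape as the run() loop in the question:
        // setup once, reduce once per key, cleanup once.
        final void run(Map<String, List<Integer>> grouped) {
            setup();
            try {
                for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
                    reduce(e.getKey(), e.getValue());
                }
            } finally {
                cleanup();
            }
        }
    }

    static class TimedSumReducer extends TaskLifecycle {
        private long startMillis;          // step 1: field holding the start time
        long elapsedMillis = -1;           // assumed field, kept so callers can read the result
        final Map<String, Integer> sums = new LinkedHashMap<>();

        @Override
        protected void setup() {
            startMillis = System.currentTimeMillis();   // step 2: record the start
        }

        @Override
        protected void reduce(String key, List<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            sums.put(key, sum);
        }

        @Override
        protected void cleanup() {
            // step 3: end time minus stored start time = total time for this task
            elapsedMillis = System.currentTimeMillis() - startMillis;
            System.out.println("Reduce task took " + elapsedMillis + " ms");
        }
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1, 2, 3));
        grouped.put("b", Arrays.asList(4, 5));

        TimedSumReducer reducer = new TimedSumReducer();
        reducer.run(grouped);
        System.out.println(reducer.sums);
    }
}
```

Because setup() and cleanup() each run once per task rather than once per key, this prints one duration per reducer instance instead of one line per key as in the first attempt.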
