
How do I get each reduce task's execution time programmatically in Hadoop?

I am running a simple MapReduce job in Hadoop. In plain Java I can measure elapsed time by calling System.currentTimeMillis() at the start and at the end. How do I get the same measurement in MapReduce, that is, (endTime - startTime) for each map task and for each reduce task? I tried the following code, with job.setNumReduceTasks(4):

Edited:

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // process values
    long start = System.currentTimeMillis();
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();

    System.out.println(" time Taken " + (end - start));
}

But the result was:

time Taken 1
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 ----------
 ----------

I set the number of reduce tasks to 4, but this prints the time taken to process each key and its values rather than the time taken by each reduce task.

After moving the timing into run(), around the setup() and cleanup() calls:

public void run(Context context) throws IOException, InterruptedException {
    start = System.currentTimeMillis();
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}

I have set the number of reducers to 4 using job.setNumReduceTasks(4), but it shows only one timestamp. Am I doing anything wrong here?

To find the total time of a reducer you can:

  1. Add a long variable to the class that will hold the start time.
  2. Set the start time in the setup() method of the reducer.
  3. Get the end time in the cleanup() method of the reducer, and subtract the stored start time from it to get the total time.
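
The three steps above can be sketched in plain Java. This is a minimal stand-in and not the Hadoop API: the class name TimedReducerSketch and its getters are hypothetical, and the Context parameter is left out so that the block compiles without Hadoop on the classpath. In a real job the class would extend org.apache.hadoop.mapreduce.Reducer, and setup(), reduce() and cleanup() would each take a Context.

```java
// Plain-Java stand-in for the reducer lifecycle: setup -> reduce per key -> cleanup.
// Hypothetical class; a real reducer would extend org.apache.hadoop.mapreduce.Reducer.
class TimedReducerSketch {
    private long startMillis;          // step 1: field that holds the start time
    private long elapsedMillis = -1;   // filled in by cleanup(); -1 means "not finished"
    private int keysSeen;

    // Step 2: runs once per reduce task, before the first key.
    void setup() {
        startMillis = System.currentTimeMillis();
    }

    // Runs once per key; it contains no timing code on purpose.
    int reduce(String key, int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        keysSeen++;
        return sum;
    }

    // Step 3: runs once per reduce task, after the last key.
    void cleanup() {
        elapsedMillis = System.currentTimeMillis() - startMillis;
        System.out.println("reduce task handled " + keysSeen
                + " keys in " + elapsedMillis + " ms");
    }

    long getElapsedMillis() {
        return elapsedMillis;
    }

    int getKeysSeen() {
        return keysSeen;
    }

    // Drives the lifecycle in the same order as the run() method shown in the question.
    public static void main(String[] args) {
        TimedReducerSketch task = new TimedReducerSketch();
        task.setup();
        try {
            task.reduce("apple", new int[] {1, 2, 3});
            task.reduce("pear", new int[] {4});
        } finally {
            task.cleanup();
        }
    }
}
```

Because setup() and cleanup() each run once per reduce task, a job with four reduce tasks should produce four such lines. On a cluster each task normally writes its System.out output to its own task log, so the lines do not appear together on the client console.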
