I am running a simple MapReduce job in Hadoop. In plain Java I can measure elapsed time by calling System.currentTimeMillis() at the start and at the end of a block.
How do I get the same measurement (endTime - startTime) for the map phase and for the reduce phase of a MapReduce job? I tried the following code, with job.setNumReduceTasks(4):
Edited:
public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // process values
    long start = System.currentTimeMillis();
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();
    System.out.println(" time Taken " + (end - start));
}
But the result was:
time Taken 1
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
----------
----------
I set the number of reduce tasks to 4, but this output shows the time taken for each key and its values (one line per reduce() call), not the time taken by each reduce task.
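Each reduce() call handles a single key, and summing a few integers usually finishes in well under a millisecond, which would explain the 0 values above. Below is a minimal plain-Java sketch (no Hadoop dependency) that accumulates elapsed time across many short calls with System.nanoTime() instead. The class CallTimer and all of its methods are hypothetical names made up for this illustration, and sum() is only a stand-in for the body of reduce().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: accumulates elapsed time over many short calls.
class CallTimer {
    private long totalNanos = 0;
    private long calls = 0;
    private long startedAt = 0;

    // Marks the beginning of one timed call.
    void start() {
        startedAt = System.nanoTime();
    }

    // Adds the time since start() to the running total.
    void stop() {
        totalNanos += System.nanoTime() - startedAt;
        calls++;
    }

    long totalNanos() {
        return totalNanos;
    }

    long calls() {
        return calls;
    }

    // Stand-in for the body of reduce(): sums the values of one key.
    static int sum(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        CallTimer timer = new CallTimer();
        // Three groups of values, one group per (imaginary) key.
        List<List<Integer>> groups = Arrays.asList(
                Arrays.asList(1, 1, 1),
                Arrays.asList(1, 1),
                Arrays.asList(1));
        for (List<Integer> values : groups) {
            timer.start();
            sum(values);
            timer.stop();
        }
        System.out.println("calls=" + timer.calls()
                + " totalNanos=" + timer.totalNanos());
    }
}
```

In a real reducer, start() and stop() would wrap the body of reduce(), and the total would be printed once from cleanup() rather than once per key.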
Next, I overrode run() so that the measurement covers setup(), every reduce() call, and cleanup():
// start and end are assumed to be long fields declared on the reducer class (not shown)
public void run(Context context) throws IOException, InterruptedException {
    start = System.currentTimeMillis();
    setup(context);
    try {
        // one reduce() call per key assigned to this task
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}
I have set the number of reducers to 4 using job.setNumReduceTasks(4), but only one timing line is printed. Am I doing anything wrong here?
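For context, run() is invoked once per reduce task, so four reduce tasks should each produce their own "End- Start" line. On a cluster each task usually writes to its own stdout log, so any single log would contain only one such line; that is one possible explanation, not a confirmed diagnosis. The plain-Java sketch below simulates the situation: keys are assigned to tasks with the same formula that Hadoop's default HashPartitioner uses, and each simulated task is timed separately. The class TaskTimingSketch and its methods are hypothetical, and String.hashCode() is only a stand-in for the hash of a Text key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; none of these names are Hadoop APIs.
class TaskTimingSketch {
    // Same formula as Hadoop's default HashPartitioner, applied to a String
    // (a real job would hash the Text key instead).
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups keys by the reduce task that would receive them.
    static List<List<String>> assign(List<String> keys, int numReduceTasks) {
        List<List<String>> perTask = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            perTask.add(new ArrayList<>());
        }
        for (String key : keys) {
            perTask.get(partition(key, numReduceTasks)).add(key);
        }
        return perTask;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "apple", "banana", "cherry", "date", "elder", "fig");
        List<List<String>> perTask = assign(keys, 4);
        // One timing per simulated task, mirroring one run() call per task.
        for (int task = 0; task < perTask.size(); task++) {
            long start = System.nanoTime();
            int processed = 0;
            for (String key : perTask.get(task)) {
                processed++; // stand-in for one reduce() call on this key
            }
            long elapsed = System.nanoTime() - start;
            System.out.println("task " + task + " keys=" + processed
                    + " nanos=" + elapsed);
        }
    }
}
```

In a real reducer, including context.getTaskAttemptID() in the printed message would show which task produced each line when the per-task logs are inspected.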