Hadoop: Measuring shuffle time from Java

Is there a way to get the shuffle time of each reduce task from the client side using the Hadoop API (Hadoop 1.2.1)? I can get the execution time of the reduce tasks from the JobClient using the getReduceTaskReports(JobID jobID) method, but I wonder whether there is a way to get the percentage of that time that corresponds to the shuffle phase. Thanks in advance.

The solution to the problem was to use Apache Rumen ( http://hadoop.apache.org/docs/r1.2.1/rumen.html ). This framework lets you retrieve the job history logs in JSON format. With simple JSON parsing I was able to extract the information I needed.
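
Once the timestamps have been pulled out of the Rumen JSON, the shuffle share of a reduce attempt is simple arithmetic. The following is a minimal sketch, not the code used in the original answer. It assumes each reduce attempt record exposes a start time, a shuffle-finished time and a finish time in milliseconds (in Rumen traces these are believed to appear as `startTime`, `shuffleFinished` and `finishTime`, but check your own trace output). The class `ShuffleTime` and its methods are hypothetical names introduced for illustration, and the JSON parsing itself is left out.

```java
/**
 * Hypothetical helper (not part of Hadoop or Rumen) that derives the shuffle
 * duration of one reduce attempt from timestamps already read from a trace.
 * All timestamps are assumed to be epoch milliseconds.
 */
final class ShuffleTime {

    private ShuffleTime() {
        // static helpers only
    }

    /** Milliseconds between the attempt start and the end of its shuffle phase. */
    static long shuffleMillis(long startTime, long shuffleFinished) {
        if (startTime < 0 || shuffleFinished < startTime) {
            // Covers unset values (for example -1) and inconsistent records.
            throw new IllegalArgumentException(
                "invalid timestamps: start=" + startTime
                    + ", shuffleFinished=" + shuffleFinished);
        }
        return shuffleFinished - startTime;
    }

    /** Fraction (0.0 to 1.0) of the attempt's total run time spent in shuffle. */
    static double shuffleFraction(long startTime, long shuffleFinished, long finishTime) {
        long shuffle = shuffleMillis(startTime, shuffleFinished);
        long total = finishTime - startTime;
        if (total <= 0 || shuffleFinished > finishTime) {
            throw new IllegalArgumentException(
                "invalid timestamps: start=" + startTime
                    + ", shuffleFinished=" + shuffleFinished
                    + ", finish=" + finishTime);
        }
        return (double) shuffle / total;
    }
}
```

For an attempt that starts at 1000 ms, finishes its shuffle at 4000 ms and ends at 11000 ms, `shuffleMillis` gives 3000 and `shuffleFraction` gives 0.3, i.e. 30% of the attempt was spent shuffling.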
