The Spark web UI displays useful information about the total and active numbers of cores and tasks. How can I get this information programmatically in Java Spark so that I can display job progress to end users?
I did read about the "append /json/" trick to extract JSON versions of web UI pages from the master, and I can get the total number of cores that way...
But all the information about active cores and tasks seems to be in the driver UI pages. I tried the "/json/" trick on the driver UI pages and it just redirects me back to the HTML pages.
Looks like we have discovered two different ways to reveal this information:
1) Retrieve the JavaSparkStatusTracker from the JavaSparkContext (thank you Sai):
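import org.apache.spark.SparkStageInfo;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.JavaSparkStatusTracker;

JavaSparkContext javaSparkContext = ...;
JavaSparkStatusTracker javaSparkStatusTracker = javaSparkContext.statusTracker();
for (int stageId : javaSparkStatusTracker.getActiveStageIds()) {
    // getStageInfo returns null if the stage info is no longer available
    SparkStageInfo sparkStageInfo = javaSparkStatusTracker.getStageInfo(stageId);
    if (sparkStageInfo == null) {
        continue;
    }
    int numTasks = sparkStageInfo.numTasks();
    int numActiveTasks = sparkStageInfo.numActiveTasks();
    int numFailedTasks = sparkStageInfo.numFailedTasks();
    int numCompletedTasks = sparkStageInfo.numCompletedTasks();
    ...
}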
2) Consult the REST API available from the driver JVM:
https://spark.apache.org/docs/latest/monitoring.html#rest-api
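For the REST route, here is a minimal sketch of what a client could look like. It assumes Java 11+ (for java.net.http), a driver UI reachable at its default port 4040, and an application ID obtained elsewhere (for example from javaSparkContext.sc().applicationId()). The class name SparkRestStatus and its helper methods are hypothetical names chosen for illustration. The executors endpoint is described in the monitoring docs linked above; the sketch just prints the raw JSON, which you would then parse with whatever JSON library you already use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Spark: queries the driver's REST API.
public class SparkRestStatus {

    // Builds the URL of the executors endpoint for one application.
    // driverBase is assumed to look like "http://localhost:4040".
    static String executorsUrl(String driverBase, String appId) {
        String base = driverBase;
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1); // drop trailing slashes
        }
        return base + "/api/v1/applications/" + appId + "/executors";
    }

    // Performs a plain GET and returns the response body (JSON text).
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: SparkRestStatus <driverBaseUrl> <appId>");
            return;
        }
        // Prints the raw JSON array with one entry per executor.
        System.out.println(fetch(executorsUrl(args[0], args[1])));
    }
}
```

Run it with the driver URL and application ID as arguments, e.g. `java SparkRestStatus http://localhost:4040 <app-id>`. Polling this endpoint from outside the driver JVM is useful when the progress display lives in a separate process; inside the driver, the status tracker from option 1 avoids the HTTP round trip.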