I have installed Apache Spark 2.3.1 and need to check which script is more efficient.
Questions:
1. How do I monitor Apache Spark script execution?
2. Which of these two scripts is more efficient?
rdd = sc.textFile("Readme.txt")
1:
rdd.flatMap(lambda x: x.split(" ")).countByValue()
2:
words = rdd.flatMap(lambda x: x.split(" "))
result = words.map(lambda x: (x, 1)).reduceByKey(lambda x, y: x + y)
Use the Spark Web UI. It contains the information you need to monitor performance, including times, executor statistics, stage statistics, task statistics, resource statistics, and so on.
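As for efficiency, both scripts produce the same word counts; the practical difference is where the aggregation happens. `countByValue()` is an action that returns the counts to the driver as a dict, while `reduceByKey` aggregates in parallel on the executors and returns an RDD, which scales better for high-cardinality data. A minimal local sketch in plain Python (no Spark required; the sample lines are made up for illustration) mimicking the two approaches:

```python
from collections import Counter

# Stand-in for rdd = sc.textFile("Readme.txt")
lines = ["to be or", "not to be"]

# Script 1: flatMap + countByValue() — conceptually, all words are
# counted and the full result dict is returned to the driver.
words = [w for line in lines for w in line.split(" ")]
counts_1 = dict(Counter(words))

# Script 2: map to (word, 1) pairs, then reduceByKey(lambda x, y: x + y) —
# conceptually, pairs with the same key are merged by summing values,
# and in Spark this merge runs in parallel on the executors.
pairs = [(w, 1) for w in words]
counts_2 = {}
for key, value in pairs:
    counts_2[key] = counts_2.get(key, 0) + value

print(counts_1 == counts_2)  # both approaches yield identical counts
```

In real PySpark, `countByValue()` is fine for small vocabularies, but the `reduceByKey` version keeps the result distributed, so you can continue transforming it or write it out without pulling everything to the driver.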