
Spark Application - High "Executor Computing Time"

I have a Spark application that has now been running for 46 hours. While the majority of its jobs complete within 25 seconds, specific jobs take hours. Some details are provided below:

Task Time   Shuffle Read (size / records)   Shuffle Write (size / records)
7.5 h       2.2 MB / 257402                 2.9 MB / 128601

There are other similarly long task times, of course, with values of 11.3 h, 10.6 h, 9.4 h, etc. Each of them spends the bulk of its activity time on "rdd at DataFrameFunctions.scala:42". The stage details reveal that the executor's time is spent on "Executor Computing Time". This executor runs on DataNode 1, where CPU utilization is quite normal, at about 13%. The other boxes (4 more worker nodes) have very nominal CPU utilization.

When the Shuffle Read is within 5000 records, the job is extremely fast and completes within 25 seconds, as stated previously. Nothing is appended to the logs (spark/hadoop/hbase), and nothing is noticed at the /tmp or /var/tmp locations that would indicate some disk-related activity is in progress.
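One quick way to confirm that a handful of tasks are dominating a stage is to compare each task's shuffle-read record count (visible per task in the Spark UI) against the median. A minimal sketch in plain Python, with illustrative numbers loosely modeled on the figures above:

```python
from statistics import median

def skewed_tasks(record_counts, factor=10.0):
    """Return indices of tasks whose shuffle-read record count
    is more than `factor` times the median -- a rough skew check."""
    med = median(record_counts)
    return [i for i, n in enumerate(record_counts) if n > med * factor]

# Hypothetical per-task record counts copied from a stage's task table.
per_task = [120, 95, 257402, 110]
hot = skewed_tasks(per_task)  # flags the 257402-record task
```

A stage where one or two task indices are flagged while the rest sit near the median is the classic signature of key skew rather than a slow node.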

I am clueless about what is going wrong and have been struggling with this for quite some time now. The software versions used are as follows:

Hadoop    : 2.7.2
Zookeeper : 3.4.9
Kafka     : 2.11-0.10.1.1
Spark     : 2.1.0
HBase     : 1.2.6
Phoenix   : 4.10.0

Some configurations from the spark-defaults file:

spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://SDCHDPMAST1:8111/data1/spark-event
spark.history.fs.logDirectory    hdfs://SDCHDPMAST1:8111/data1/spark-event
spark.yarn.jars                  hdfs://SDCHDPMAST1:8111/user/appuser/spark/share/lib/*.jar
spark.driver.maxResultSize       5G
spark.deploy.zookeeper.url       SDCZKPSRV01

spark.executor.memory                   12G
spark.driver.memory                     10G
spark.executor.heartbeatInterval        60s
spark.network.timeout                   300s
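
Not part of the original configuration, but two shuffle-related settings sometimes worth experimenting with for stages like this (values are illustrative, not tuned for this cluster; note that raising the partition count alone will not split a single hot key):

```
spark.sql.shuffle.partitions    400
spark.speculation               true
```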

Is there any way I can reduce the time spent on "Executor Computing Time"?

The job performed on this specific dataset is skewed. Because of the skew, jobs are taking longer than expected.
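A common mitigation for this kind of skew is to "salt" the hot keys: append a random suffix so one heavy key spreads across many shuffle buckets, aggregate on the salted key, then strip the salt and merge the partial results. The sketch below shows the two-stage idea in plain Python (no Spark dependency; all names and data are illustrative) -- in the real application the same shape would be expressed with Spark's RDD or DataFrame API:

```python
import random

def salt_key(key, buckets, rng):
    """Append a random salt so one hot key spreads over several buckets."""
    return f"{key}_{rng.randrange(buckets)}"

def skew_aware_count(records, buckets):
    """Two-stage aggregation: count per salted key, then merge per real key."""
    rng = random.Random(42)
    partial = {}
    for key in records:                     # stage 1: aggregate on salted keys
        salted = salt_key(key, buckets, rng)
        partial[salted] = partial.get(salted, 0) + 1
    merged = {}
    for salted, n in partial.items():       # stage 2: strip salt and combine
        key = salted.rsplit("_", 1)[0]
        merged[key] = merged.get(key, 0) + n
    return merged

# One hot key next to a few normal ones, mimicking a skewed shuffle.
data = ["hot_key"] * 1000 + ["a", "b", "b"]
counts = skew_aware_count(data, buckets=8)
```

The first stage spreads the 1000 "hot_key" records over up to 8 buckets, so no single reducer (executor, in Spark terms) processes them all; the second, much smaller pass recombines the partial counts.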
