
Slower job scheduling in Spark on YARN

I have two MapR clusters with the following configuration:

Cluster 1: hosted on AWS, 3 nodes with 32 GB of memory and 32 cores each
Cluster 2: hosted on bare-metal servers, 8 nodes with 128 GB of memory and 32 cores each

I'm running the following piece of PySpark code through YARN on both clusters:

df = hc.sql("select * from hive_table")
df.registerTempTable("df")
df.cache().count()  # count() forces the cache to be materialized
for _ in range(100):
    result = hc.sql('select xxxx from df')
    result.write.saveAsTable('some hive table', mode='append')

The above code submits 100 new jobs in Spark (running on top of YARN). On cluster 1, the whole operation completes in 30 minutes, but on cluster 2, which is the bigger one, the same operation takes 90 minutes. Upon checking, I found that although each job takes about the same time (slightly faster on cluster 2), the time between jobs is much higher on cluster 2 than on cluster 1.

Possible reasons:

  1. Latency between the driver and the executor nodes? I'm running in yarn-client mode.
  2. Low driver memory, or is the way I launch Spark on YARN wrong?

How do I submit the jobs?

Cluster 1: /opt/mapr/spark/spark-1.6.1/bin/spark-submit --master yarn --deploy-mode client --num-executors 10 --executor-memory 10g --executor-cores 5 --driver-memory 10g --driver-cores 10 --conf spark.driver.maxResultSize="0" --conf spark.default.parallelism="100" --queue default

Cluster 2: /opt/mapr/spark/spark-1.6.1/bin/spark-submit --master yarn --deploy-mode client --num-executors 10 --executor-memory 80g --executor-cores 28 --driver-memory 25g --driver-cores 25 --conf spark.driver.maxResultSize="0" --conf spark.default.parallelism="100" --queue default

PS: I only pasted part of the code. There are other modules in the code. Overall, cluster 2 processes the code 3x faster than cluster 1, so I don't think the issue is with the 'general' speed.

My question is specifically about the time between jobs. For example, the code above runs 100 Spark SQL jobs; each job takes on average 2s on cluster 2 and 5s on cluster 1. The time between jobs is much higher on cluster 2 than on cluster 1.
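As a rough back-of-the-envelope check, the average gap between jobs can be estimated from the figures above. This is only a sketch: it assumes the 30 and 90 minute totals cover nothing but the 100-job loop, and the helper name `avg_gap_seconds` is made up for illustration.

```python
def avg_gap_seconds(total_seconds, n_jobs, avg_job_seconds):
    """Average time per job that is NOT spent inside the job itself."""
    return (total_seconds - n_jobs * avg_job_seconds) / n_jobs

# Figures quoted in the question; assumes the totals cover only the loop.
cluster1_gap = avg_gap_seconds(30 * 60, 100, 5)  # 30 min total, 5s per job
cluster2_gap = avg_gap_seconds(90 * 60, 100, 2)  # 90 min total, 2s per job

print(cluster1_gap, cluster2_gap)
```

Under that assumption the gap comes out at roughly 13s per job on cluster 1 and 52s per job on cluster 2, so almost all of the wall-clock time is spent between jobs rather than in them.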

In your pseudo-code I don't see any driver-related actions (assuming that the executors save the data to a distributed FS).

Please note that:

  1. You call df.cache(), but it doesn't seem like you are using the cached df.
  2. Your yarn-client configuration seems to be problematic.

It looks like you are trying to use more executor memory and cores than are available.

In cluster #1, there are 3 nodes with 32GB of RAM, and your submit options are: --num-executors 10 --executor-memory 10g

In the best case you'll have 9 executors with 10GB of RAM each, at most 3 executors per node. I'd assume that you'll be able to run only 2 executors per node: of the 32GB of RAM, more than 2GB will be used by YARN, overhead, etc., so less than 30GB will be left ==> 2 containers of 10GB each.

==> Cluster #1 will have 6 to 9 executors

In cluster #2, there are 5 nodes with 128GB of RAM, and your submit options are: --num-executors 10 --executor-memory 80g

In the best case you'll have 5 executors with 80GB of RAM each, one executor per node.
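The estimates for both clusters can be reproduced with a small calculation. This is a sketch under assumptions that do not come from your post: about 3GB per node reserved for the OS and YARN daemons, a per-executor overhead of max(384MB, 10% of executor memory) (which, as far as I know, is the default spark.yarn.executor.memoryOverhead in Spark 1.6), and a per-node core limit that only applies if YARN is configured to account for vcores. The function names are made up, and YARN's rounding of container sizes is ignored.

```python
def yarn_container_mb(executor_memory_mb, overhead_fraction=0.10, min_overhead_mb=384):
    """Executor heap plus the off-heap overhead requested from YARN (assumed default)."""
    overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead_mb

def max_executors(nodes, node_mem_gb, node_cores, executor_mem_gb,
                  executor_cores, requested, reserved_gb=3):
    """Upper bound on executors YARN can place, limited by memory and cores per node."""
    usable_mb = (node_mem_gb - reserved_gb) * 1024  # reserved_gb is an assumption
    by_memory = usable_mb // yarn_container_mb(executor_mem_gb * 1024)
    by_cores = node_cores // executor_cores  # only enforced if YARN tracks vcores
    per_node = min(by_memory, by_cores)
    return min(requested, nodes * per_node)

# Node counts as used in this answer; executor settings from the spark-submit lines.
cluster1 = max_executors(nodes=3, node_mem_gb=32, node_cores=32,
                         executor_mem_gb=10, executor_cores=5, requested=10)
cluster2 = max_executors(nodes=5, node_mem_gb=128, node_cores=32,
                         executor_mem_gb=80, executor_cores=28, requested=10)

print(cluster1, cluster2)
```

With these assumptions the result is 6 executors on cluster #1 and 5 on cluster #2, even though 10 were requested on both.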

Since cluster #1 has more executors (even though they are smaller), it is possible that it will run faster (depending on your application).

Reducing executor memory and cores in cluster #2, together with increasing the number of executors, should provide better performance.
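One way to pick smaller executors is sketched below. It uses a common rule of thumb, not anything specific to your setup: about 5 cores per executor, 1 core and 8GB per node left for the OS and YARN, and 10% of each container kept for memory overhead. All of these values, and the helper name `suggest_executors`, are assumptions for illustration; the right numbers depend on your workload and YARN limits.

```python
def suggest_executors(nodes, node_mem_gb, node_cores, cores_per_executor=5,
                      reserved_mem_gb=8, reserved_cores=1, overhead_fraction=0.10):
    """Heuristic executor sizing; every default here is an assumption."""
    per_node = (node_cores - reserved_cores) // cores_per_executor
    container_gb = (node_mem_gb - reserved_mem_gb) / per_node
    # Leave room inside each container for the off-heap overhead.
    heap_gb = int(container_gb / (1 + overhead_fraction))
    return {
        "num_executors": nodes * per_node,
        "executor_cores": cores_per_executor,
        "executor_memory_gb": heap_gb,
    }

# Cluster #2 with the node count used in this answer.
plan = suggest_executors(nodes=5, node_mem_gb=128, node_cores=32)
print(plan)
```

For cluster #2 this suggests something like 30 executors with 5 cores and 18GB each, which would map onto --num-executors, --executor-cores and --executor-memory. Treat it as a starting point for experiments, not a tuned configuration.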
