
Spark Mesos cluster mode is slower than local mode

I submitted the same jar in both local mode and Mesos cluster mode, and found that for some identical stages, local mode takes only a few milliseconds to finish while cluster mode takes seconds!

Here is one example, stage 659:

local mode: 659 Streaming job from [output operation 1, batch time 17:45:50] map at KafkaHelper.scala:35 +details 2016/03/22 17:46:31 11 ms

Mesos cluster mode: 659 Streaming job from [output operation 1, batch time 18:01:20] map at KafkaHelper.scala:35 +details 2016/03/22 18:09:33 3 s

I also see in the Spark UI that Mesos cluster mode consistently takes 4 seconds to finish the foreachRDD jobs. Why is that? Are there any spark-submit options that can help with this?

Thanks a lot in advance!

That behavior depends on multiple factors. You don't specify what kind of job you run, in which cluster mode, and with which settings. If Spark is not installed on the slaves, you'll see overhead because the distribution has to be downloaded first, etc.

Furthermore, the jars you're using need to be distributed to the executors, which adds some startup time as well.
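To make the settings involved concrete, here is a minimal sketch of a submit command, not a confirmed fix for the timings in the question. The host names, paths, and class name are hypothetical placeholders. `spark.executor.uri` and `spark.mesos.coarse` are documented Spark-on-Mesos properties; whether they help depends on the Spark version and on how the cluster is set up. The script only prints the command so it can be reviewed first.

```shell
#!/usr/bin/env bash
# All values below are hypothetical placeholders; substitute your own.
MASTER="mesos://dispatcher.example.com:7077"   # assumed MesosClusterDispatcher address
SPARK_TGZ="hdfs://namenode.example.com/spark/spark-bin.tgz"  # assumed location of a Spark distribution archive
APP_JAR="hdfs://namenode.example.com/apps/streaming-app.jar" # assumed application jar, reachable from every node
MAIN_CLASS="com.example.StreamingApp"          # assumed main class

submit_args=(
  --master "$MASTER"
  --deploy-mode cluster
  --class "$MAIN_CLASS"
  # Coarse-grained mode keeps long-running executors instead of
  # launching a separate Mesos task for every Spark task.
  --conf "spark.mesos.coarse=true"
  # Tells executors where to fetch the Spark distribution from
  # when Spark is not pre-installed on the slaves.
  --conf "spark.executor.uri=$SPARK_TGZ"
  "$APP_JAR"
)

# Dry run: print the command. Remove 'echo' to actually submit.
echo spark-submit "${submit_args[@]}"
```

If Spark is already installed on every slave, `spark.mesos.executor.home` can point at that installation instead of setting `spark.executor.uri`, which avoids the download entirely.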

As said, this all depends on how you run Spark on Mesos.

See
