
Spark Streaming uses fewer executors than expected

I am using Spark Streaming to process some events. It is deployed in standalone mode with 1 master and 3 workers. I have set the number of cores per executor to 4 and the total number of cores to 24, which means 6 executors are spawned in total. I have set spread-out to true, so each worker machine gets 2 executors. My batch interval is 1 second. I have also repartitioned each batch to 21 partitions; the remaining 3 cores are used by receivers. From the event timeline I observe that only 3 of the executors are being used, while the other 3 sit idle. As far as I know, there is no parameter in Spark standalone mode to specify the number of executors. How do I make Spark use all the available executors?
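For reference, a minimal sketch of the setup described above, assuming a standalone cluster; the app name and host details are illustrative, not taken from the original post:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Illustrative configuration matching the setup in the question.
val conf = new SparkConf()
  .setAppName("event-processor")      // hypothetical app name
  .set("spark.executor.cores", "4")   // 4 cores per executor
  .set("spark.cores.max", "24")       // 24 cores in total => 6 executors
// Note: spark.deploy.spreadOut=true is configured on the standalone
// master itself, which spreads the 6 executors as 2 per worker.
val ssc = new StreamingContext(conf, Seconds(1))  // 1-second batch interval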

Probably your stream does not have enough partitions to fill all executors on every 1-second minibatch. Try repartition(24) as the first streaming transformation to use the full power of the Spark cluster.
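A minimal sketch of what that could look like, assuming three socket receivers as input (the host names, port, and per-record processing are placeholders):

// Hypothetical input: three receivers, matching the 3 receiver cores above.
val streams = (1 to 3).map(i => ssc.socketTextStream(s"host$i", 9999))
val unified = ssc.union(streams)

// Repartition each minibatch to 24 partitions so tasks are spread
// across every core in the cluster, not only the receiver-local ones.
unified.repartition(24).foreachRDD { rdd =>
  rdd.foreach(event => println(event))  // placeholder processing
}

ssc.start()
ssc.awaitTermination()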

