
Standalone Cluster Mode: how does spark allocate spark.executor.cores?

I'm searching for how and where Spark allocates cores per executor in the source code. Is it possible to control programmatically allocated cores in standalone cluster mode?

Regards, Matteo

Spark allows configuration options to be passed through the .set method on the SparkConf class.

Here's some Scala code that sets up a new Spark configuration:

new SparkConf()
  .setAppName("App Name")
  .setMaster("local[2]")              // local mode with 2 threads; use a spark://... URL for standalone
  .set("spark.executor.cores", "2")   // cores requested per executor

Documentation about the different configuration options:

http://spark.apache.org/docs/1.6.1/configuration.html#execution-behavior

I haven't looked through the source code exhaustively, but I think this is the spot in the source code where the executor cores are defined prior to allocation:

https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350/core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala
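
For orientation only, here is a rough, hypothetical sketch of the kind of per-executor bookkeeping that class holds; the names below are illustrative, not the actual fields in the linked file:

// Hypothetical, simplified view of the per-executor state the scheduler backend tracks.
// See the linked ExecutorData.scala for the real definition.
class ExecutorBookkeeping(
    val executorHost: String,
    val totalCores: Int,   // cores granted to this executor
    var freeCores: Int     // cores currently free to schedule tasks on
)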

In standalone mode, you have the following options:

a. While starting the cluster, you can specify how many CPU cores to allot to Spark applications. This can be set either as the environment variable SPARK_WORKER_CORES or passed as an argument to the worker start script (-c or --cores).

b. Care should be taken (if other applications also share resources like cores) not to allow Spark to take all the cores. This can be set using the spark.cores.max parameter.

c. You can also pass --total-executor-cores <numCores> to the spark shell (see the sketch after this list).
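
A minimal sketch of option (b) set programmatically, with the other two options noted in comments; the master URL, app name, and values are placeholders, not taken from the original answer:

import org.apache.spark.SparkConf

// Option a is set when starting workers, e.g. via the SPARK_WORKER_CORES
// environment variable or the -c/--cores flag of the worker start script.
// Option c is the --total-executor-cores <numCores> flag of spark-shell/spark-submit.
val cappedConf = new SparkConf()
  .setAppName("Capped App")                // placeholder app name
  .setMaster("spark://master-host:7077")   // placeholder standalone master URL
  .set("spark.cores.max", "8")             // option b: cap on total cores for this application
  .set("spark.executor.cores", "2")        // cores requested per executor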

For more info, you can look here.
