Why is the number of Spark executors reduced using custom settings on EMR?
I'm running Spark 1.6 in cluster mode on EMR 4.3.0 with the following settings:
[
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.cores": "16"
    }
  },
  {
    "classification": "spark",
    "properties": {
      "maximizeResourceAllocation": "true"
    }
  }
]
With the following instances:
master: 1 * m3.xlarge
core: 2 * m3.xlarge
When I test the number of executors with:
val numExecutors = sc.getExecutorStorageStatus.size - 1
I only get 2.
Are the EMR settings for Spark somehow being overridden?
OK, here is the problem: you are setting the number of cores for each executor, not the number of executors, e.g. "spark.executor.cores": "16".
And since you are on AWS EMR, this also means that you are running Spark on YARN.
By default, the number of executor instances is 2 (spark.executor.instances is the property that defines the number of executors).
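If you want more executors, one option is to set spark.executor.instances explicitly in the same spark-defaults classification. A minimal sketch, where the values 4 and 2 are purely illustrative, so pick numbers that fit your instance types:
[
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.instances": "4",
      "spark.executor.cores": "2"
    }
  }
]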
Note: this property is incompatible with spark.dynamicAllocation.enabled. If both spark.dynamicAllocation.enabled and spark.executor.instances are specified, dynamic allocation is turned off and the specified number of spark.executor.instances is used.
Thus you get the following:
scala> val numExecutors = sc.getExecutorStorageStatus.size - 1
numExecutors: Int = 2
This means that you are actually using two executors, one per slave node, each operating on only one core.
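To see which settings actually took effect, you can inspect the resolved configuration from the running SparkContext. A quick sketch in spark-shell (sc.getConf and SparkConf.getOption are standard Spark APIs; getOption returns None for a property that was never set, in which case the built-in default applies):
scala> val conf = sc.getConf
scala> conf.getOption("spark.executor.instances")          // None here, so the default of 2 applies
scala> conf.getOption("spark.executor.cores")              // should reflect the "16" from the classification
scala> conf.getOption("spark.dynamicAllocation.enabled")   // set explicitly? then executor.instances wins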