
What is the relationship between a Spark executor and a YARN container when using Spark on YARN?

What is the relationship between a Spark executor and a YARN container when using Spark on YARN?
For example, if I set executor-memory = 20G and the YARN container memory = 10G, does one executor contain two containers?

A Spark executor runs inside a YARN container. YARN containers are allocated by the ResourceManager on demand, and each container hosts exactly one Spark executor. Executors are the processes that run the tasks. Each executor is started on a worker node (a NodeManager host, which is often also an HDFS DataNode).

In your case, setting executor-memory = 20G means that each executor asks YARN for a container of at least 20 GB (Spark adds a memory overhead on top of the executor heap), and the executor runs inside that container. This amount is per executor, not per worker node: a node can host several executors if it has enough memory. An executor is never split across several containers.

So, for example, if you run 8 executors (say, one on each node of an 8-node cluster), your job gets 8 * 20 GB of executor memory in total.
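As a rough illustration of the memory request described above, here is a minimal Python sketch. It assumes Spark's documented default overhead (10% of executor memory, with a 384 MB floor) and assumes the scheduler rounds requests up to a multiple of yarn.scheduler.minimum-allocation-mb, which can differ by scheduler. The function name and parameters are hypothetical, and the default limits are the stock YARN values, not necessarily those of your cluster.

```python
MIN_OVERHEAD_MB = 384   # Spark's documented minimum executor memory overhead
OVERHEAD_PERCENT = 10   # Spark's documented default overhead factor (0.10)


def container_request_mb(executor_memory_mb,
                         min_allocation_mb=1024,
                         max_allocation_mb=8192):
    """Estimate the YARN container size for one executor (hypothetical helper).

    Assumes default overhead settings and rounding up to a multiple of
    yarn.scheduler.minimum-allocation-mb.
    """
    overhead_mb = max(MIN_OVERHEAD_MB,
                      executor_memory_mb * OVERHEAD_PERCENT // 100)
    requested_mb = executor_memory_mb + overhead_mb

    # A request larger than yarn.scheduler.maximum-allocation-mb cannot be
    # satisfied; the executor is not split across several containers.
    if requested_mb > max_allocation_mb:
        raise ValueError(
            f"request of {requested_mb} MB exceeds the maximum "
            f"allocation of {max_allocation_mb} MB"
        )

    # Round up to the next multiple of the minimum allocation.
    multiples = (requested_mb + min_allocation_mb - 1) // min_allocation_mb
    return multiples * min_allocation_mb


if __name__ == "__main__":
    # 20 GB executor with a 32 GB maximum allocation: 20480 + 2048 overhead.
    print(container_request_mb(20480, max_allocation_mb=32768))
```

With a 10 GB maximum allocation, as in the question, the same call raises an error instead of returning two 10 GB containers. This matches the point that one executor always maps to one container.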

Below are three configuration options in yarn-site.xml that you can experiment with to see the differences:

yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.resource.memory-mb
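To see how these three settings interact, the following Python sketch estimates how many containers of a given size fit on one node. It is an approximation under stated assumptions: the helper name is hypothetical, only memory is considered (not vcores), requests are assumed to be rounded up to a multiple of the minimum allocation, and the defaults are the stock YARN values.

```python
def containers_per_node(container_mb,
                        nodemanager_memory_mb=8192,   # yarn.nodemanager.resource.memory-mb
                        min_allocation_mb=1024,       # yarn.scheduler.minimum-allocation-mb
                        max_allocation_mb=8192):      # yarn.scheduler.maximum-allocation-mb
    """Estimate how many containers of container_mb fit on one NodeManager.

    Hypothetical helper: memory only, ignores vcores and other running jobs.
    """
    # Requests above the maximum allocation are never granted.
    if container_mb > max_allocation_mb:
        return 0

    # Assumed rounding: up to the next multiple of the minimum allocation.
    multiples = (container_mb + min_allocation_mb - 1) // min_allocation_mb
    granted_mb = multiples * min_allocation_mb

    # Each granted container consumes granted_mb of the node's YARN memory.
    return nodemanager_memory_mb // granted_mb


if __name__ == "__main__":
    # A 22 GB container on a node offering 64 GB to YARN.
    print(containers_per_node(22528, nodemanager_memory_mb=65536,
                              max_allocation_mb=32768))
```

Because each executor occupies one container, the result is also an upper bound on the number of executors per node for that container size.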

When running Spark on YARN, each Spark executor runs as a YARN container. This means the number of containers will always match the number of executors created by the Spark application, e.g. via the --num-executors parameter of spark-submit.

https://stackoverflow.com/a/38348175/9605741

In YARN mode, each executor runs in one container. The number of executors is the same as the number of containers allocated from YARN (except in cluster mode, which allocates one more container to run the driver).


 