
How to solve yarn container sizing issue on spark?

I want to launch some pyspark jobs on YARN. I have 2 nodes, with 10 GB each. I am able to open up the pyspark shell like so: pyspark

Now, when I try to launch a very simple example:

import random
NUM_SAMPLES = 1000

# sample a random point in the unit square; True if it lands inside the unit circle
def inside(p):
    x, y = random.random(), random.random()
    return x*x + y*y < 1

# Monte-Carlo estimate of Pi (Python 2 syntax; sc is provided by the pyspark shell)
count = sc.parallelize(xrange(0, NUM_SAMPLES)) \
             .filter(inside).count()
print "Pi is roughly %f" % (4.0 * count / NUM_SAMPLES)

As a result, I get a very long Spark log with the error output. The most important information is:

ERROR cluster.YarnScheduler: Lost executor 1 on <ip>: Container marked as failed: <containerID> on host: <ip>. Exit status 1.  Diagnostics: Exception from container-launch.  ......

Later on in the logs I see...

ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 1 times: aborting job
INFO cluster.YarnClientSchedulerBackend: Asked to remove non-existent executor 1
INFO spark.ExecutorAllocationManager: Existing executor 1 has been removed (new total is 0)

From what I gather from the logs above, this seems to be a container sizing issue in YARN.

My yarn-site.xml file has the following settings:

yarn.scheduler.maximum-allocation-mb = 10240
yarn.nodemanager.resource.memory-mb = 10240

and my spark-defaults.conf contains:

spark.yarn.executor.memoryOverhead=2048
spark.driver.memory=3g

If there are any other settings you'd like to know about, please let me know.

How do I set the container size in yarn appropriately?
(bounty on the way for someone who can help me with this)

Let me first explain the basic set of properties required to tune your spark application on a YARN cluster.

Note: A container in YARN is equivalent to an executor in Spark, so for understandability you can consider them the same.

In yarn-site.xml:

yarn.nodemanager.resource.memory-mb is the total memory available to the cluster from a given node.

yarn.nodemanager.resource.cpu-vcores is the total number of CPU vcores available to the cluster from a given node.

yarn.scheduler.maximum-allocation-mb is the maximum memory in MB that can be allocated per YARN container.

yarn.scheduler.maximum-allocation-vcores is the maximum number of vcores that can be allocated per YARN container.

Example: If a node has 16GB and 8 vcores and you would like to contribute 14GB and 6 vcores to the cluster (for containers), then set the properties as shown below:

yarn.nodemanager.resource.memory-mb : 14336 (14GB)

yarn.nodemanager.resource.cpu-vcores : 6

And, to create containers with 2GB and 1 vcore each, set these properties:

yarn.scheduler.maximum-allocation-mb : 2049

yarn.scheduler.maximum-allocation-vcores : 1

Note: Even though there is enough memory (14GB) to create 7 containers of 2GB, the above config will only create 6 containers of 2GB, so only 12GB of the 14GB will be utilized by the cluster. This is because there are only 6 vcores available to the cluster.
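For reference, here is a minimal sketch of what the example above could look like in yarn-site.xml. The property names are the standard YARN ones; the values are just the example figures from above, so adjust them to your own nodes:

<configuration>
  <!-- memory and vcores this node contributes to the cluster (for containers) -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>6</value>
  </property>
  <!-- upper bound for a single container -->
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2049</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>1</value>
  </property>
</configuration>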

Now, on the Spark side:

The properties below specify the memory to be requested per executor/container:

spark.driver.memory

spark.executor.memory

The properties below specify the vcores to be requested per executor/container:

spark.driver.cores

spark.executor.cores

IMP: All of Spark's memory and vcore properties should be less than or equal to what the YARN configuration allows, i.e. they must fit within the maximum-allocation limits above.

The property below specifies the total number of executors/containers that can be used for your Spark application from the YARN cluster:

spark.executor.instances

This property should be less than the total number of containers available in the YARN cluster.

Once the YARN configuration is complete, Spark should request containers that can be allocated within the YARN limits. That means if YARN is configured to allocate a maximum of 2GB per container and Spark requests a container with 3GB of memory, the job will halt or fail because YARN cannot satisfy Spark's request.
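As a rough sanity check, you can write the request out and compare it against the YARN caps. The snippet below is a sketch in spark-defaults.conf form, assuming the 2049 MB / 1 vcore caps from the example above; the overhead value is illustrative, not a recommendation, and the default overhead depends on your Spark version:

# spark-defaults.conf (illustrative values)
spark.executor.memory               1536mb
spark.yarn.executor.memoryOverhead  512
# requested container size = 1536 MB heap + 512 MB overhead = 2048 MB
#   2048 MB <= yarn.scheduler.maximum-allocation-mb (2049)  -> YARN can grant it
spark.executor.cores                1
#   1 vcore <= yarn.scheduler.maximum-allocation-vcores (1) -> YARN can grant it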

Now for your use case: usually cluster tuning is based on the workload, but the config below should be more suitable.

Memory available: 10GB * 2 nodes. Vcores available: 5 * 2 vcores [Assumption].

In yarn-site.xml [on both nodes]:

yarn.nodemanager.resource.memory-mb : 10240

yarn.nodemanager.resource.cpu-vcores : 5

yarn.scheduler.maximum-allocation-mb : 2049

yarn.scheduler.maximum-allocation-vcores : 1

Using the above config, you can create a maximum of 5 containers on each node (10 across the cluster), with 2GB and 1 vcore per container: each node contributes 10240MB and 5 vcores, and a 2GB / 1-vcore container consumes 2048MB and 1 vcore, so both the memory and the vcore limits allow 5 containers per node.

Spark config:

spark.driver.memory 1536mb

spark.yarn.driver.memoryOverhead 512mb

spark.executor.memory 1536mb

spark.yarn.executor.memoryOverhead 512mb

spark.driver.cores 1

spark.executor.cores 1

spark.executor.instances 9 (of the 10 available containers, one is reserved for the YARN application master)
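If you prefer passing these on the command line instead of editing spark-defaults.conf, a roughly equivalent launch of the pyspark shell could look like the sketch below. It assumes yarn-client mode, as in the question; depending on your Spark version the master may be spelled yarn or yarn-client, and the flag values simply mirror the config above:

pyspark --master yarn \
  --driver-memory 1536m \
  --executor-memory 1536m \
  --executor-cores 1 \
  --num-executors 9 \
  --conf spark.yarn.executor.memoryOverhead=512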

Please feel free to play around with these configurations to suit your needs.
