Yarn Cluster Configuration: 8 Nodes 8 cores per Node 8 GB RAM per Node 1TB HardDisk per Node
Executor memory & No of Executors
Executor memory and no of executors/node are interlinked so you would first start selecting Executor memory or No of executors then based on your choice you can follow this to set properties to get desired results
In YARN these properties would affect number of containers (/executors in Spark) that can be instantiated in a NodeManager based on spark.executor.cores, spark.executor.memory
property values (along with executor memory overhead)
For example, if a cluster with 10 nodes (RAM: 16 GB, cores: 6) and set with following yarn properties
yarn.scheduler.maximum-allocation-mb=10GB
yarn.nodemanager.resource.memory-mb=10GB
yarn.scheduler.maximum-allocation-vcores=4
yarn.nodemanager.resource.cpu-vcores=4
Then with spark properties spark.executor.cores=2, spark.executor.memory=4GB you can expect 2 Executors/Node so total you'll get 19 executors + 1 container for Driver
If the spark properties are spark.executor.cores=3, spark.executor.memory=8GB
then you will get 9 Executor (only 1 Executor/Node) + 1 container for Driver link
Driver memory
spark.driver.memory
—Maximum size of each Spark driver's Java heap memory
spark.yarn.driver.memoryOverhead
—Amount of extra off-heap memory that can be requested from YARN, per driver. This, together with spark.driver.memory, is the total memory that YARN can use to create a JVM for a driver process.
Spark driver memory does not impact performance directly , but it ensures that the Spark jobs run without memory constraints at the driver . Adjust the total amount of memory allocated to a Spark driver by using the following formula, assuming the value of yarn.nodemanager.resource.memory-mb
is X:
These numbers are for the sum of spark.driver.memory
and spark.yarn.driver.memoryOverhead
. Overhead should be 10-15% of the total.
You can also follow this Cloudera link for tuning Spark jobs
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.