
What is the best way to calculate --executor-memory, --num-executors, and --executor-cores in Spark?

I have a cluster with 1 master and 5 slave nodes, each with 32 cores and 64 GB of memory.

Is there any pattern for calculating the following parameters when submitting a Spark job to YARN?

--executor-memory --num-executors --executor-cores

Suppose we have the following hardware; then the Spark settings can be calculated as follows.

  • 6 nodes
  • 16 cores per node
  • 64 GB of RAM per node

Calculations:

  • 5 cores per executor, for max HDFS throughput
  • Reserve 1 core and 1 GB per node for the Hadoop/YARN daemons and the OS, leaving 15 cores and 63 GB per node
  • The cluster has 6 × 15 = 90 usable cores in total
  • 90 cores ÷ 5 cores per executor = 18 executors
  • Each node runs 18 ÷ 6 = 3 executors
  • 63 GB ÷ 3 = 21 GB per executor; 21 × (1 − 0.07) ≈ 19 GB once the off-heap overhead is deducted
  • Reserve 1 executor for the ApplicationMaster ⇒ 17 executors (the sketch after this list reproduces the arithmetic)
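
A minimal sketch of this sizing arithmetic in Python, assuming the example hardware above; the 1-core/1-GB reservations are the rule of thumb from this answer, not a Spark requirement:

    # Sizing sketch for the example cluster: 6 nodes, 16 cores, 64 GB each.
    nodes = 6
    cores_per_node = 16
    ram_per_node_gb = 64

    cores_per_executor = 5                                # rule of thumb for HDFS throughput
    usable_cores = nodes * (cores_per_node - 1)           # 1 core/node for Hadoop/YARN daemons -> 90
    total_executors = usable_cores // cores_per_executor  # 90 / 5 = 18
    executors_per_node = total_executors // nodes         # 3 executors on each node

    usable_ram_gb = ram_per_node_gb - 1                   # 1 GB/node for the OS -> 63
    per_executor_gb = usable_ram_gb / executors_per_node  # 63 / 3 = 21
    heap_gb = int(per_executor_gb * (1 - 0.07))           # ~19 after the 7% off-heap overhead

    num_executors = total_executors - 1                   # leave one slot for the ApplicationMaster
    print(num_executors, cores_per_executor, heap_gb)     # 17 5 19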

Answer:

  • 17 executors in total
  • 19 GB of memory per executor
  • 5 cores per executor

  • Number of executors (--num-executors)
  • Cores for each executor (--executor-cores)
  • Memory for each executor (--executor-memory)
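
Putting the three flags together, a submission using the derived values might look like the following sketch (my_app.py and the subprocess wrapper are illustrative placeholders, not part of the original answer):

    import shlex
    import subprocess

    # Hypothetical spark-submit call on YARN with the values derived above.
    cmd = shlex.split(
        "spark-submit --master yarn"
        " --num-executors 17 --executor-cores 5 --executor-memory 19g"
        " my_app.py"  # placeholder application
    )
    subprocess.run(cmd, check=True)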

--executor-memory controls the heap size:

  • Note: some overhead (controlled by spark.yarn.executor.memoryOverhead) is needed for off-heap memory; the default is max(384 MB, 0.07 × spark.executor.memory).
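
A small sketch of that rule, assuming the 7% figure quoted above (newer Spark releases default to 10% and renamed the property to spark.executor.memoryOverhead):

    # The YARN container must hold the executor heap plus the off-heap
    # overhead of max(384 MB, 7% of the heap).
    def yarn_container_mb(executor_memory_mb: int) -> int:
        overhead_mb = max(384, int(0.07 * executor_memory_mb))
        return executor_memory_mb + overhead_mb

    print(yarn_container_mb(19 * 1024))  # a 19g heap needs 20817 MB from YARN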

Spark Memory Management
