
How to restrict processing to a specified number of cores in Spark standalone

We have tried various combinations of settings, but mpstat shows that all or most CPUs are always in use (on a single 8-core system).

The following have been tried:

Setting the master to:

local[2]

Passing

conf.set("spark.cores.max", "2")

in the Spark configuration

Also using

--total-executor-cores 2

and

--executor-cores 2

In all cases

mpstat -A

shows that all of the CPUs are being used, and not just by the master.

So I am at a loss at present. We do need to limit usage to a specified number of CPUs.
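For reference, the same limits expressed programmatically through SparkConf look roughly like the sketch below. This is only a sketch: the standalone master URL and app name are placeholders, and spark.cores.max / spark.executor.cores are the conf equivalents of --total-executor-cores / --executor-cores.

from pyspark import SparkConf, SparkContext

# Local mode: local[2] caps Spark's task threads at 2
# (other JVM threads, e.g. GC, may still show up on other cores in mpstat)
conf_local = SparkConf().setMaster("local[2]").setAppName("core-limit-test")

# Standalone cluster: the master URL below is a placeholder
conf_standalone = (SparkConf()
                   .setMaster("spark://master-host:7077")
                   .setAppName("core-limit-test")
                   .set("spark.cores.max", "2")         # same setting as --total-executor-cores 2
                   .set("spark.executor.cores", "2"))   # same setting as --executor-cores 2

sc = SparkContext(conf=conf_local)   # or conf=conf_standalone
print(sc.getConf().toDebugString())  # print the settings the context actually received
sc.stop()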

I had the same problem with memory size: I wanted to increase it, but none of the above worked for me either. Based on this user post I was able to resolve my problem, and I think the same approach should also work for the number of cores:

from pyspark import SparkConf, SparkContext

# In Jupyter you have to stop the current context first
sc.stop()

# Create new config
conf = (SparkConf().set("spark.cores.max", "2"))

# Create new context
sc = SparkContext(conf=conf)
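To double-check that the new context actually picked up the limit, one could inspect it afterwards (a sketch, assuming the sc created above):

# Verify the limit on the freshly created context
print(sc.getConf().get("spark.cores.max"))   # should print '2'
print(sc.defaultParallelism)                 # parallelism Spark derives from the cores it was given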

Hope this helps. And please, if you have resolved your problem, post your solution as an answer to this question so we can all benefit from it :)

Cheers

Apparently Spark standalone ignores the spark.cores.max setting. That setting does work in YARN.
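For completeness, a per-application core budget on YARN is usually expressed as executor count times cores per executor; a hypothetical sketch (the "yarn" master assumes a properly configured Hadoop client environment):

from pyspark import SparkConf, SparkContext

# Illustrative only: 1 executor x 2 cores = 2 cores for the whole application
conf = (SparkConf()
        .setMaster("yarn")
        .set("spark.executor.instances", "1")
        .set("spark.executor.cores", "2"))

sc = SparkContext(conf=conf)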
