
How to restrict processing to a specified number of cores in Spark standalone

We have tried various combinations of settings, but mpstat shows that all or most CPUs are always in use (on a single 8-core system).

The following have been tried:

Setting the master to:

local[2]

Passing

conf.set("spark.cores.max", "2")

in the Spark configuration

Also using

--total-executor-cores 2

and

--executor-cores 2

In all cases

mpstat -A

shows that all of the CPUs are being used, and not just by the master.

So I am at a loss at present. We do need to limit usage to a specified number of CPUs.
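For reference, the same limits expressed programmatically through SparkConf look roughly like the sketch below. This is only a sketch: the standalone master URL and app name are placeholders, and spark.cores.max / spark.executor.cores are the conf equivalents of --total-executor-cores / --executor-cores.

from pyspark import SparkConf, SparkContext

# Local mode: local[2] caps Spark's task threads at 2
# (other JVM threads, e.g. GC, may still show up on other cores in mpstat)
conf_local = SparkConf().setMaster("local[2]").setAppName("core-limit-test")

# Standalone cluster: the master URL below is a placeholder
conf_standalone = (SparkConf()
                   .setMaster("spark://master-host:7077")
                   .setAppName("core-limit-test")
                   .set("spark.cores.max", "2")         # same setting as --total-executor-cores 2
                   .set("spark.executor.cores", "2"))   # same setting as --executor-cores 2

sc = SparkContext(conf=conf_local)   # or conf=conf_standalone
print(sc.getConf().toDebugString())  # print the settings the context actually received
sc.stop()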

I had the same problem with memory size: I wanted to increase it, but none of the above worked for me either. Based on this user post I was able to resolve my problem, and I think the same approach should also work for the number of cores:

from pyspark import SparkConf, SparkContext

# In Jupyter you have to stop the current context first
sc.stop()

# Create new config
conf = (SparkConf().set("spark.cores.max", "2"))

# Create new context
sc = SparkContext(conf=conf)
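To double-check that the new context actually picked up the limit, one could inspect it afterwards (a sketch, assuming the sc created above):

# Verify the limit on the freshly created context
print(sc.getConf().get("spark.cores.max"))   # should print '2'
print(sc.defaultParallelism)                 # parallelism Spark derives from the cores it was given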

Hope this helps. And please, if you have resolved your problem, post your solution as an answer to this question so we can all benefit from it :)

Cheers

Apparently Spark standalone ignores the spark.cores.max setting. That setting does work in YARN.
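For completeness, a per-application core budget on YARN is usually expressed as executor count times cores per executor; a hypothetical sketch (the "yarn" master assumes a properly configured Hadoop client environment):

from pyspark import SparkConf, SparkContext

# Illustrative only: 1 executor x 2 cores = 2 cores for the whole application
conf = (SparkConf()
        .setMaster("yarn")
        .set("spark.executor.instances", "1")
        .set("spark.executor.cores", "2"))

sc = SparkContext(conf=conf)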
