
Apache Spark standalone mode: number of cores

I'm trying to understand the basics of Spark internals. The Spark documentation for submitting applications in local mode says this about the spark-submit --master setting:

local[K] Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine).

local[*] Run Spark locally with as many worker threads as logical cores on your machine.

Since all the data is stored on a single local machine, it does not benefit from distributed operations on RDDs.

How does it benefit, and what is going on internally when Spark utilizes several logical cores?
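For reference, here is a minimal sketch of the setting in question (the object and app names are illustrative); setting the master programmatically via SparkConf is equivalent to passing --master to spark-submit:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LocalModeExample {
  def main(args: Array[String]): Unit = {
    // Equivalent to `spark-submit --master "local[4]"`:
    // run locally with 4 worker threads.
    val conf = new SparkConf()
      .setMaster("local[4]") // or "local[*]" for one thread per logical core
      .setAppName("LocalModeExample")
    val sc = new SparkContext(conf)

    println(s"defaultParallelism = ${sc.defaultParallelism}") // 4 under local[4]
    sc.stop()
  }
}
```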

Spark will allocate additional worker threads for processing the data in parallel. Despite being limited to a single machine, it can still take advantage of the high degree of parallelism available in modern servers.
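Concretely, each partition of an RDD is processed as a task, and in local[K] up to K tasks run concurrently, each on its own worker thread inside the single JVM. A quick sketch that makes the threads visible (the object name is illustrative; in local mode the println from each task goes to the driver console):

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object ThreadDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[4]").setAppName("ThreadDemo"))

    // 8 partitions -> 8 tasks; local[4] runs them 4 at a time,
    // each on its own "Executor task launch worker" thread.
    sc.parallelize(1 to 8, numSlices = 8)
      .foreach { _ =>
        println(s"partition ${TaskContext.getPartitionId()} " +
                s"on thread ${Thread.currentThread().getName}")
      }

    sc.stop()
  }
}
```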

If you have a reasonably sized data set, say one with a dozen partitions, you can measure the time it takes to run with local[1] vs. local[n] (where n is the number of cores in your machine). You can also see the difference in how fully your machine is utilized. With only one core designated for use, Spark will use 100% of one core (plus some extra for garbage collection). With 4 cores and local[4], it will use 400% CPU (all 4 cores), and execution time can be significantly shortened (although typically not by a full 4x).
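A rough sketch of that experiment (the partition count and workload are arbitrary; note that only one SparkContext can be active per JVM, so each run is stopped before the next):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TimingDemo {
  def main(args: Array[String]): Unit = {
    for (master <- Seq("local[1]", "local[4]")) {
      val sc = new SparkContext(
        new SparkConf().setMaster(master).setAppName("TimingDemo"))

      val start = System.nanoTime()
      // A CPU-bound job spread over 12 partitions.
      val sum = sc.parallelize(1 to 12, numSlices = 12)
        .map { _ => (1 to 20000000).foldLeft(0L)(_ + _) }
        .reduce(_ + _)
      val secs = (System.nanoTime() - start) / 1e9

      println(f"$master%-9s sum=$sum took $secs%.2f s")
      sc.stop() // release the context before starting the next run
    }
  }
}
```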
