
Spark driver node and worker node for a Spark application in Standalone cluster

I want to understand: when a Spark application is submitted, which node will act as the driver node and which nodes will act as worker nodes?

For example, suppose I have a Standalone cluster of 3 nodes.

When the first Spark application (app1) is submitted, the Spark framework will randomly choose one of the nodes as the driver node and the other nodes as worker nodes. This applies only to app1. During its execution, if another Spark application (app2) is submitted, Spark can again randomly choose one node as the driver node and the other nodes as worker nodes. This applies only to app2. So while both Spark applications are executing, there can be a situation where two different nodes are master nodes. Please correct me if I misunderstand.
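For concreteness, here is a rough sketch (not from the original post) of how this scenario could be reproduced: two applications submitted to the same standalone cluster in cluster deploy mode, so the Master picks a worker to host each driver. The master URL, jar paths, and class names below are placeholders.

```scala
import org.apache.spark.launcher.SparkLauncher

object SubmitTwoApps {
  def main(args: Array[String]): Unit = {
    // In "cluster" deploy mode the standalone Master chooses a worker
    // to host each application's driver, so app1's and app2's drivers
    // may land on different nodes. All paths and names are placeholders.
    def submit(jar: String, mainClass: String) =
      new SparkLauncher()
        .setAppResource(jar)
        .setMainClass(mainClass)
        .setMaster("spark://master-host:7077") // fixed Master URL
        .setDeployMode("cluster")
        .launch() // returns the java.lang.Process of the spark-submit child

    val app1 = submit("/path/to/app1.jar", "com.example.App1")
    val app2 = submit("/path/to/app2.jar", "com.example.App2")
    app1.waitFor(); app2.waitFor()
  }
}
```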

You're on the right track. Spark has a notion of a Worker node, which is used for computation. Each such Worker can run any number of Executor processes. If Spark assigns the driver to run on an arbitrary Worker, that doesn't prevent the same Worker from running additional Executor processes that carry out the computation.
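To illustrate the worker/executor split, here is a minimal sketch of one application asking the standalone cluster for several executors; the Master is free to place more than one of them on the same worker, including the worker hosting the driver. The app name and resource numbers are illustrative, not prescriptive.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: one application requesting executors from a standalone
// cluster. The Master may place several executor processes on the same
// worker node.
val spark = SparkSession.builder()
  .appName("executor-layout-sketch")
  .config("spark.executor.cores", "2")   // cores per executor process
  .config("spark.executor.memory", "2g") // memory per executor process
  .config("spark.cores.max", "6")        // total cores for this app
  .getOrCreate()

// Each partition is computed by a task inside some executor,
// regardless of which node happens to host the driver.
val counts = spark.sparkContext.parallelize(1 to 1000, numSlices = 6)
  .map(_ % 10)
  .countByValue()

println(counts)
spark.stop()
```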

As for your example, Spark doesn't select a Master node. The Master node is fixed in the environment. What Spark does choose is where to run the driver, which is where the SparkContext will live for the lifetime of the application. Basically, if you swap "Master" for "Driver", your answer is correct.
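A sketch of that distinction: the Master URL is baked into the configuration, while the driver simply lives wherever the application's main process runs (the submitting machine in client deploy mode, or a Master-chosen worker in cluster deploy mode). "master-host" below is a placeholder.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The Master URL is fixed by the cluster setup; it is not chosen per app.
// The driver is wherever this main process runs: the submitting machine
// in client deploy mode, or a worker picked by the standalone Master in
// cluster deploy mode.
val conf = new SparkConf()
  .setAppName("driver-location-sketch")
  .setMaster("spark://master-host:7077") // fixed Master, placeholder host

val sc = new SparkContext(conf) // the SparkContext lives with the driver
println(s"Driver is running on: ${java.net.InetAddress.getLocalHost.getHostName}")
sc.stop()
```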
