
Two separate images to run Spark in client mode on Kubernetes with Python and Apache Spark 3.2.0?

I built a Python image for Apache Spark 3.2.0 by running this script from the distribution folder:

./bin/docker-image-tool.sh -r <repo> -t my-tag -p ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
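If the image also needs to be pushed to a registry, the same tool has a push step; a minimal sketch, assuming the same repo and tag as the build above:

./bin/docker-image-tool.sh -r <repo> -t my-tag push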

I can create a container on K8s with spark-submit just fine. My goal is to run spark-submit configured for client mode (rather than local mode) and have additional containers created for the executors.

Does the image I created allow for this, or do I need to build a second image (without the -p option) with docker-image-tool.sh and configure it in a different container?

It turns out that only one image is needed if you're running PySpark. In client mode, Spark spawns the executor pods for you once you issue the spark-submit command. A big improvement over Spark 2.4!
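For reference, here is a minimal sketch of a client-mode PySpark session against Kubernetes using that single image. The API server address, namespace, executor count, and driver host below are placeholder assumptions, not values from the original post:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Kubernetes API server address (placeholder)
    .master("k8s://https://<k8s-apiserver>:6443")
    .appName("client-mode-example")
    # In client mode only the executors run in pods; they use the image built above
    .config("spark.kubernetes.container.image", "<repo>/spark-py:my-tag")
    .config("spark.kubernetes.namespace", "default")
    .config("spark.executor.instances", "2")
    # The driver runs locally, so executor pods must be able to reach it
    .config("spark.driver.host", "<driver-reachable-host>")
    .getOrCreate()
)

print(spark.range(1000).selectExpr("sum(id) AS total").collect())
spark.stop()

Once getOrCreate() runs, the Kubernetes scheduler backend requests the executor pods itself; no second image built without the -p option is required.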
