Spark on kubernetes: Executor pods not able to start while creating SparkContext
I am trying to run Spark on Kubernetes with interactive commands run through the Spark shell or a Jupyter interface. I have built custom images for both the driver pod and the executor pods, and I use the code below to spin up a SparkContext:
import pyspark

conf = pyspark.SparkConf()
conf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
conf.set(
    "spark.kubernetes.container.image",
    "<Repo>/<IMAGENAME>:latest")
conf.set("spark.kubernetes.namespace", "default")
# Authentication certificate and token (required to create worker pods):
conf.set(
    "spark.kubernetes.authenticate.caCertFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt")
conf.set(
    "spark.kubernetes.authenticate.oauthTokenFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/token")
conf.set(
    "spark.kubernetes.authenticate.driver.serviceAccountName",
    "spark-master")
conf.set("spark.executor.instances", "2")
conf.set("spark.driver.host", "spark-test-jupyter")
conf.set("spark.executor.memory", "1g")
conf.set("spark.executor.cores", "1")
conf.set("spark.driver.blockManager.port", "7777")
conf.set("spark.driver.bindAddress", "0.0.0.0")
conf.set("spark.driver.port", "29416")
sc = pyspark.SparkContext(conf=conf)
The driver launches executor pods, but the 2 executor pods start, error out, and then a fresh pair of pods repeats the same cycle. Logs below:
pyspark-shell-1620894878554-exec-8 0/1 Pending 0 0s
pyspark-shell-1620894878554-exec-8 0/1 ContainerCreating 0 0s
pyspark-shell-1620894878528-exec-7 1/1 Running 0 1s
pyspark-shell-1620894878554-exec-8 1/1 Running 0 2s
pyspark-shell-1620894878528-exec-7 0/1 Error 0 4s
pyspark-shell-1620894878554-exec-8 0/1 Error 0 4s
pyspark-shell-1620894878528-exec-7 0/1 Terminating 0 5s
pyspark-shell-1620894878528-exec-7 0/1 Terminating 0 5s
pyspark-shell-1620894878554-exec-8 0/1 Terminating 0 5s
pyspark-shell-1620894878554-exec-8 0/1 Terminating 0 5s
pyspark-shell-1620894883595-exec-9 0/1 Pending 0 0s
pyspark-shell-1620894883595-exec-9 0/1 Pending 0 0s
pyspark-shell-1620894883595-exec-9 0/1 ContainerCreating 0 0s
pyspark-shell-1620894883623-exec-10 0/1 Pending 0 0s
pyspark-shell-1620894883623-exec-10 0/1 Pending 0 0s
pyspark-shell-1620894883623-exec-10 0/1 ContainerCreating 0 0s
pyspark-shell-1620894883595-exec-9 1/1 Running 0 1s
pyspark-shell-1620894883623-exec-10 1/1 Running 0 3s
This goes on endlessly until stopped.
What could be going wrong here?
Your spark.driver.host should be the DNS name of the driver's service, so something like spark-test-jupyter.default.svc.cluster.local.
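The executors resolve the driver via this host name, so it must be a name that is resolvable from inside the cluster; a bare pod or service name like spark-test-jupyter is not enough from another pod's network namespace. A minimal sketch of building the fully qualified in-cluster DNS name (the helper function and names are illustrative, taken from the question's config):

```python
# Kubernetes service DNS follows <service>.<namespace>.svc.cluster.local;
# this illustrative helper builds that name for use as spark.driver.host.
def service_fqdn(service: str, namespace: str = "default") -> str:
    return f"{service}.{namespace}.svc.cluster.local"

driver_host = service_fqdn("spark-test-jupyter", "default")
print(driver_host)  # spark-test-jupyter.default.svc.cluster.local

# In the question's SparkConf, this would replace the bare service name:
# conf.set("spark.driver.host", driver_host)
```

This assumes a Service named spark-test-jupyter exists in the default namespace and exposes the driver port (29416) and block manager port (7777) configured above.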