
Exposing Spark Worker (stdout stderr) logs in Kubernetes

I have a Spark cluster with one Master and 4 Workers running in a 3-node Kubernetes cluster. The Spark UI and the Master are exposed through Ingress/NodePort and are hence accessible from outside the Kubernetes cluster.

However, the worker ports are not exposed. Because of this, the worker logs (stdout and stderr) are not accessible through the UI. The log URLs redirect to <Worker1_Pod_IP:8080>, <Worker2_Pod_IP:8080>, and so on.

My setup has two worker pods running on the same machine. So even if I expose the workers via NodePort, the ports will conflict, because the same port would be assigned to both workers on that machine. The Spark History Server only provides event logs, not worker logs.

How can this be solved? Is there a way the NodePort value can be assigned dynamically for the workers?

I believe you are talking about SPARK_WORKER_WEBUI_PORT and not SPARK_WORKER_PORT (described in the Spark standalone documentation), since the latter is assigned a random port.

This is a bit tricky because you can only expose a given port once per node. If you have two Spark workers per node, you could create two Deployments for your workers, one that exposes SPARK_WORKER_WEBUI_PORT on 8081 and another on 8082, and also make sure that only one pod from each Deployment is scheduled per node, as sketched below.
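For illustration, here is a minimal sketch of one of the two worker Deployments. The SPARK_WORKER_WEBUI_PORT variable is the standard Spark setting; everything else (names, image, labels, replica count) is only an assumption to make the example concrete, not something taken from your setup:

```yaml
# Sketch only: names, image, and labels are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-worker-a              # first worker group, web UI on 8081
spec:
  replicas: 2                       # e.g. two workers in this group, one per node
  selector:
    matchLabels:
      app: spark-worker-a
  template:
    metadata:
      labels:
        app: spark-worker-a
    spec:
      affinity:
        podAntiAffinity:            # keep this group's pods on different nodes
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: spark-worker-a
            topologyKey: kubernetes.io/hostname
      containers:
      - name: spark-worker
        image: your-registry/spark-worker:latest   # hypothetical image
        env:
        - name: SPARK_WORKER_WEBUI_PORT            # web UI port for this group
          value: "8081"
        ports:
        - containerPort: 8081
---
# The second Deployment (e.g. spark-worker-b) would be identical except for the
# name/labels and SPARK_WORKER_WEBUI_PORT set to "8082".
```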

You can pass these values in the container environment variables, and then expose each group's web UI port with its own NodePort Service, as in the sketch below.
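A matching NodePort Service per worker group would then make the worker web UIs reachable from outside the cluster. Again, this is only a sketch: the Service names and the nodePort values (30081/30082) are arbitrary assumptions within the default NodePort range:

```yaml
# Sketch only: Service names and nodePort values are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: spark-worker-a-webui
spec:
  type: NodePort
  selector:
    app: spark-worker-a             # matches the first worker Deployment
  ports:
  - port: 8081                      # SPARK_WORKER_WEBUI_PORT of group A
    targetPort: 8081
    nodePort: 30081                 # external port on every node
---
apiVersion: v1
kind: Service
metadata:
  name: spark-worker-b-webui
spec:
  type: NodePort
  selector:
    app: spark-worker-b             # matches the second worker Deployment
  ports:
  - port: 8082                      # SPARK_WORKER_WEBUI_PORT of group B
    targetPort: 8082
    nodePort: 30082
```

With this layout, each node runs at most one worker from each group, so <Node_IP>:30081 and <Node_IP>:30082 map unambiguously to the two workers on that machine.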
