
Spark UI stdout/stderr links point to executors' internal addresses

Environment: AWS EMR, YARN cluster.

Description: In the Spark UI, on the Environment and Executors tabs, the stdout and stderr links point to the internal addresses of the executors. Following them would require exposing the executor machines. Shouldn't those links point to the master instead, with the master acting as a proxy that serves these files internally, rather than exposing the internal machines?

I have tried setting the SPARK_PUBLIC_DNS and SPARK_LOCAL_IP variables to the master's IP address. I also tried the properties spark.yarn.appMasterEnv.SPARK_LOCAL_IP and spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS, but neither seems to work.
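For reference, the attempted properties would be passed at submit time roughly like this (a sketch only; `<master-ip>` and `my_app.py` are placeholders, and in my testing this does not change the stdout/stderr links):

```
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.SPARK_LOCAL_IP=<master-ip> \
  --conf spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS=<master-ip> \
  my_app.py
```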

[Screenshot of the error]

Any suggestions?

Spark recommends setting the event log directory to HDFS so that the logs can be accessed from anywhere:

https://spark.apache.org/docs/latest/configuration.html

So what you need to do is set this in spark-defaults.conf:

spark.eventLog.enabled  true
spark.eventLog.dir      hdfs:///somewhere

The first property lets you consult the logs after the application has completed. The second tells Spark where to write the event logs.
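With event logging enabled, a Spark history server can then serve the application UI from a single, stable address instead of the executors' internal ones. A minimal sketch, assuming the history server reads the same HDFS directory configured above (`spark.history.fs.logDirectory` and `start-history-server.sh` are standard Spark pieces, but the exact paths depend on your installation):

```
# spark-defaults.conf on the machine running the history server;
# this directory must match spark.eventLog.dir
spark.history.fs.logDirectory  hdfs:///somewhere
```

Then start the server with the script shipped with Spark:

```
$SPARK_HOME/sbin/start-history-server.sh
```

The history server UI is then available on its own port (18080 by default), so users never need direct access to the executor hosts.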
