
How to retain Spark executor logs in Yarn after a Spark application crashes

I am trying to find the root cause of a recent Spark application failure in production. While a Spark application is running, I can check the directory configured by the NodeManager's yarn.nodemanager.log-dir property to get the Spark executor container logs.
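For example, the configured log directory can be looked up on a NodeManager host like this (the /etc/hadoop/conf path is an assumption; the config location varies by distribution):

grep -A1 "yarn.nodemanager.log-dir" /etc/hadoop/conf/yarn-site.xml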

That directory has container logs for both of the running Spark applications.

Here is a view of the container logs:

drwx--x--- 3 yarn yarn  51 Jul 19 09:04 application_1467068598418_0209
drwx--x--- 5 yarn yarn 141 Jul 19 09:04 application_1467068598418_0210

But when an application is killed, both application log directories are deleted automatically. I have set all the log retention settings in Yarn to a very large number, but these logs are still deleted as soon as the Spark applications crash.
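(For reference, the retention settings in question are typically these Yarn properties, all set in yarn-site.xml; the values below are placeholders for a large retention window:

"yarn.nodemanager.log.retain-seconds","604800"
"yarn.log-aggregation.retain-seconds","604800"
"yarn.nodemanager.delete.debug-delay-sec","3600"

The last one delays the NodeManager's cleanup of local container directories after an application finishes, which is commonly used for exactly this kind of debugging.)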

Question: How can we retain these Spark application logs in Yarn for debugging when a Spark application crashes for some reason?

The following location has the executor logs (USER and APPLICATION_ID are placeholders for the submitting user and the application ID):

HADOOP_USER_NAME=mapred hadoop fs -ls /hadoop/log/yarn/user/USER/logs/APPLICATION_ID
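Once the logs have been aggregated there, they can also be fetched with the standard Yarn CLI, for example for the first application from the listing above:

yarn logs -applicationId application_1467068598418_0209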

Also, set the following properties:

"yarn.log-aggregation-enable","false"
"spark.eventLog.enabled", "true"    
"spark.eventLog.dir","hdfs:///user/spark/applicationHistory" 
