
How to retain Spark executor logs in Yarn after a Spark application crashes

I am trying to find the root cause of a recent Spark application failure in production. While the Spark application is running, I can check the NodeManager's yarn.nodemanager.log-dir property to locate the Spark executor container logs.

The container log directory has logs for both of the running Spark applications.

Here is the view of the container logs:

drwx--x--- 3 yarn yarn  51 Jul 19 09:04 application_1467068598418_0209
drwx--x--- 5 yarn yarn 141 Jul 19 09:04 application_1467068598418_0210
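For reference, here is a minimal sketch of how those logs can be inspected while the application is still running. The local directory below is an assumption (check the value of yarn.nodemanager.log-dir in yarn-site.xml on the NodeManager host), and the application ID is taken from the listing above:

# Assumed local log dir; your yarn.nodemanager.log-dir may point elsewhere
LOG_DIR=/var/log/hadoop-yarn/containers
APP_ID=application_1467068598418_0210
ls ${LOG_DIR}/${APP_ID}                            # one sub-directory per container
tail -f ${LOG_DIR}/${APP_ID}/container_*/stderr    # executor output goes to stdout/stderr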

But when the application is killed, both application log directories are automatically deleted. I have set all the log retention settings in Yarn to very large values, yet these logs are still deleted as soon as the Spark application crashes.
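For context, the settings that usually govern how long the NodeManager keeps those local container logs are sketched below; the values are only illustrative and have to be configured in yarn-site.xml on every NodeManager (the config path is an assumption):

# Illustrative values only; set them in yarn-site.xml and restart the NodeManagers.
#   yarn.nodemanager.log.retain-seconds      e.g. 604800   (used only when log aggregation is disabled)
#   yarn.nodemanager.delete.debug-delay-sec  e.g. 3600     (delays deletion of a finished container's local dirs and logs)
#   yarn.log-aggregation.retain-seconds      e.g. 2592000  (retention of aggregated logs in HDFS)
# Quick check of what is currently configured:
grep -A 1 -E "retain-seconds|delete.debug-delay-sec|log-aggregation-enable" /etc/hadoop/conf/yarn-site.xml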

Question: How can we retain these Spark application logs in Yarn for debugging when the Spark application crashes for some reason?

The following location has the executor logs:

HADOOP_USER_NAME=mapred hadoop fs -ls /hadoop/log/yarn/user/USER/logs/APPLICATION_ID
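Once the logs have been aggregated there, they can also be pulled back with the YARN CLI. A minimal sketch, assuming log aggregation is enabled and using one of the application IDs from the question:

# Fetch the aggregated container logs for one finished application
yarn logs -applicationId application_1467068598418_0209 > application_1467068598418_0209.log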

Also, set the following properties:

"yarn.log-aggregation-enable","false"
"spark.eventLog.enabled", "true"    
"spark.eventLog.dir","hdfs:///user/spark/applicationHistory" 
