
Where are logs in Spark on YARN?

I'm new to Spark. I can now run Spark 0.9.1 on YARN (2.0.0-cdh4.2.1), but there are no logs after execution.

The following command is used to run a Spark example. However, the logs do not show up in the history server the way they do for a normal MapReduce job.

SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop2.0.0-cdh4.2.1.jar \
./bin/spark-class org.apache.spark.deploy.yarn.Client --jar ./spark-example-1.0.0.jar \
--class SimpleApp --args yarn-standalone  --num-workers 3 --master-memory 1g \
--worker-memory 1g --worker-cores 1

Where can I find the logs/stderr/stdout?

Is there somewhere to set the configuration? I did find this output on the console:

14/04/14 18:51:52 INFO Client: Command for the ApplicationMaster: $JAVA_HOME/bin/java -server -Xmx640m -Djava.io.tmpdir=$PWD/tmp org.apache.spark.deploy.yarn.ApplicationMaster --class SimpleApp --jar ./spark-example-1.0.0.jar --args 'yarn-standalone' --worker-memory 1024 --worker-cores 1 --num-workers 3 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr

In this line, notice 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr

Where can LOG_DIR be set?

You can access the logs with the command:

yarn logs -applicationId <application ID> [OPTIONS]

General options are:

  • appOwner <Application Owner> - AppOwner (assumed to be current user if not specified)
  • containerId <Container ID> - ContainerId (must be specified if node address is specified)
  • nodeAddress <Node Address> - NodeAddress in the format nodename:port (must be specified if container id is specified)

Examples:

yarn logs -applicationId application_1414530900704_0003                                      
yarn logs -applicationId application_1414530900704_0003 myuserid

// the user ids are different
yarn logs -applicationId <appid> --appOwner <userid>
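
When the logs need to be fetched from a script, the invocations above can be assembled programmatically. The following is only a minimal sketch: build_yarn_logs_command and fetch_yarn_logs are hypothetical helper names, the options are passed with a single leading dash in the same style as -applicationId, and the pairing rule for containerId and nodeAddress is taken from the option descriptions above.

```python
import subprocess
from typing import List, Optional


def build_yarn_logs_command(
    application_id: str,
    app_owner: Optional[str] = None,
    container_id: Optional[str] = None,
    node_address: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `yarn logs` (hypothetical helper)."""
    # Per the option descriptions, each of these two requires the other.
    if (container_id is None) != (node_address is None):
        raise ValueError("containerId and nodeAddress must be specified together")

    command = ["yarn", "logs", "-applicationId", application_id]
    if app_owner is not None:
        command += ["-appOwner", app_owner]
    if container_id is not None:
        command += ["-containerId", container_id, "-nodeAddress", node_address]
    return command


def fetch_yarn_logs(application_id: str, **options) -> str:
    """Run `yarn logs` and return its stdout; needs the yarn CLI on PATH."""
    command = build_yarn_logs_command(application_id, **options)
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

Only build_yarn_logs_command can be exercised without a cluster; fetch_yarn_logs assumes it runs on a machine where the yarn client is installed and configured.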

A nice article on this question:

Running Spark on YARN - see the section "Debugging your Application". It gives a decent explanation with all the required examples.

The only thing you need to do to get a correctly working history server for Spark is to close your Spark context in your application. Otherwise, the application history server does not see the application as COMPLETE and does not show anything (the history UI is accessible, but the application does not appear in it).
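
One way to make sure the context is always closed, even when the job fails, is to wrap it in a small context manager. This is a sketch rather than anything from the Spark documentation: stopping is a hypothetical helper, and it only assumes that the object it wraps has a stop() method, as SparkContext does.

```python
from contextlib import contextmanager


@contextmanager
def stopping(context):
    """Yield `context` and always call its stop() method afterwards."""
    try:
        yield context
    finally:
        # Runs on normal exit and when the body raises an exception.
        context.stop()


# Usage sketch (assumes PySpark is installed; not executed here):
#
#   from pyspark import SparkContext
#   with stopping(SparkContext(appName="SimpleApp")) as sc:
#       print(sc.parallelize(range(10)).sum())
```

In a Scala application the same effect is usually obtained with a try/finally block around the job that calls sc.stop() in the finally clause.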

None of the answers makes it crystal clear where to look for the logs (although together they do, in pieces), so I am putting it all together.

If log aggregation is turned on (with the yarn.log-aggregation-enable property in yarn-site.xml), then do this:

yarn logs -applicationId <app ID>
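
If you are not sure whether aggregation is enabled, the property can be read from yarn-site.xml. The sketch below assumes the usual Hadoop configuration layout (a configuration element containing property entries with name and value children); log_aggregation_enabled is a hypothetical helper, and treating a missing property as disabled is an assumption made for this example.

```python
import xml.etree.ElementTree as ET


def log_aggregation_enabled(yarn_site_xml: str) -> bool:
    """Return True if yarn.log-aggregation-enable is true in the XML text."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name == "yarn.log-aggregation-enable":
            value = prop.findtext("value", default="").strip().lower()
            return value == "true"
    # Property not present: treated as disabled in this sketch (assumption).
    return False


# Usage sketch: the location of yarn-site.xml depends on the installation.
#
#   with open("/path/to/yarn-site.xml") as f:
#       print(log_aggregation_enabled(f.read()))
```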

However, if it is not turned on, then you need to log on to the data node machine and look at:

$HADOOP_HOME/logs/userlogs/application_1474886780074_XXXX/

application_1474886780074_XXXX is the application ID.

It logs to:

/var/log/hadoop-yarn/containers/[application id]/[container id]/stdout

The logs are on every node that your Spark job runs on.
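
Since the layout is [application id]/[container id]/stdout, the local log files on a node can be listed with a few lines of code. This is a minimal sketch: find_container_logs is a hypothetical helper, and the log root passed to it (for example /var/log/hadoop-yarn/containers or $HADOOP_HOME/logs/userlogs, as mentioned above) depends on how the cluster is configured.

```python
from pathlib import Path
from typing import Dict, List


def find_container_logs(log_root: str, application_id: str) -> Dict[str, List[Path]]:
    """Map each container directory name to the log files found inside it."""
    app_dir = Path(log_root) / application_id
    logs: Dict[str, List[Path]] = {}
    if not app_dir.is_dir():
        # No directory for this application under the given root on this node.
        return logs
    for container_dir in sorted(p for p in app_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in container_dir.iterdir() if f.is_file())
        logs[container_dir.name] = files
    return logs


# Usage sketch, run on one of the nodes (paths are assumptions):
#
#   found = find_container_logs("/var/log/hadoop-yarn/containers",
#                               "application_1474886780074_XXXX")
#   for container, files in found.items():
#       print(container, [f.name for f in files])
```

Because each node only holds the logs of the containers it ran, the helper has to be run on every node of interest.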
