I'm running the following code:

.map { x =>
  Logger.fatal("Hello World")
  x._2
}
It's a Spark Streaming application running on YARN. I updated log4j and provided the configuration with spark-submit (using --files). My log4j configuration was loaded, which I can see from the logs, and it was applied to the driver's logs (I see only my log level and my pattern there); however, logs from the executors are not available. I can't find "Hello World" in the logs. I also checked ${yarn.nodemanager.log-dirs} and it's empty, which looks strange. Where are my logs?
Thank you in advance.
According to the official Spark documentation (link), there are two ways YARN manages logging:

If log aggregation is turned on (via the yarn.log-aggregation-enable config), container logs are deleted from the local machines (executors) and copied to an HDFS directory. These logs can be viewed from anywhere on the cluster with the yarn logs command, in the following manner:

yarn logs -applicationId <app ID>
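With aggregation enabled, you can pull the executor output from any node once the application has finished and filter it for the missing log line. A sketch (the application ID below is a placeholder; substitute the one printed by spark-submit or shown in the ResourceManager UI):

```shell
# Fetch all aggregated container logs for the application
# and look for the executor-side log statement.
yarn logs -applicationId application_1487085229000_0001 | grep "Hello World"
```

Note that aggregation only happens after the containers finish, so for a long-running streaming job the logs may not be available this way until the application stops.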
If log aggregation is not turned on, logs are maintained locally on each machine under YARN_APP_LOGS_DIR, which is usually configured to /tmp/logs or $HADOOP_HOME/logs/userlogs depending on the Hadoop version and installation. According to the documentation, viewing logs for a container requires going to the host that contains them and looking in this directory.
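Without aggregation, the logs stay on the individual NodeManager hosts. A quick way to locate them on an executor host (the paths below are the common defaults mentioned above and may differ on your installation):

```shell
# Check where this NodeManager is configured to write container logs
grep -A 1 "yarn.nodemanager.log-dirs" "$HADOOP_HOME/etc/hadoop/yarn-site.xml"

# A common default location; each application gets a directory
# containing per-container stdout/stderr files.
ls "$HADOOP_HOME/logs/userlogs"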
I found the solution: the proper log4j configuration must be set in the following way when submitting the application:
--files /opt/spark/conf/log4j.properties
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
where

spark.driver.extraJavaOptions -> sets up the log configuration for the driver
spark.executor.extraJavaOptions -> sets up the log configuration for the executor(s)

Note that --files ships log4j.properties into each container's working directory, which is why the -Dlog4j.configuration values can refer to it by its bare file name.
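Putting the pieces together, a full submit command might look like the sketch below. The main class and jar name are placeholders for illustration; only the --files and --conf lines come from the solution above:

```shell
# Placeholder class and jar; substitute your own application.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /opt/spark/conf/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.StreamingApp \
  my-streaming-app.jar
```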