简体   繁体   中英

How to tail yarn logs?

I am submitting a Spark Job using below command. I want to tail the yarn log using application Id similar to tail command operation in Linux box.

export SPARK_MAJOR_VERSION=2
nohup spark-submit --class "com.test.TestApplication" --name TestApp --queue queue1 --properties-file application.properties --files "hive-site.xml,tez-site.xml,hbase-site.xml,application.properties" --master yarn --deploy-mode cluster Test-app.jar > /tmp/TestApp.log &

Not easily.

"YARN logs" aren't really in YARN, they are actually on the executor nodes of Spark. If YARN log aggregation is enabled, then logs are in HDFS, and available from Spark History server.

The industry deployment pattern is to configure the Spark log4j properties to write to a file with a log forwarder (like Filebeat, Splunk, Fluentd), then those processes collect data into a search engine like Solr, Elasticsearch, Graylog, Splunk, etc. From these tools, you can approximately tail/search/analyze log messages outside of a CLI.

If by " Yarn logs " , you mean your executors' logs, you can see it easily or tail it if you have access to the executor machine where your yarn job is submitted. You have to do just :

yarn logs -applicationId <you app ID>

on the executor machine. You can watch master logs in the yarn UI if you have configured it properly.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM