How to tail yarn logs?

Question

I am submitting a Spark job using the command below. I want to tail the YARN log by application ID, similar to how the tail command works on a Linux box.

export SPARK_MAJOR_VERSION=2
nohup spark-submit --class "com.test.TestApplication" --name TestApp --queue queue1 --properties-file application.properties --files "hive-site.xml,tez-site.xml,hbase-site.xml,application.properties" --master yarn --deploy-mode cluster Test-app.jar > /tmp/TestApp.log &

Answer 1

Not easily.

"YARN logs" aren't really stored in YARN; they live on the Spark executor nodes. If YARN log aggregation is enabled, the logs are in HDFS and available from the Spark History Server.

A common industry deployment pattern is to configure the Spark log4j properties to write to a file and run a log forwarder (such as Filebeat, Splunk, or Fluentd) that ships the data into a search engine such as Solr, Elasticsearch, Graylog, or Splunk. From these tools, you can approximately tail, search, and analyze log messages outside of a CLI.

Answer 2

If by "YARN logs" you mean your executors' logs, you can view or tail them easily, provided you have access to the executor machine where your YARN job is submitted. Just run:

yarn logs -applicationId <your app ID>

on the executor machine. You can also watch the master logs in the YARN UI if you have configured it properly.
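The command above needs the application ID. As a minimal sketch, assuming the spark-submit client output ends up in a file such as the /tmp/TestApp.log from the question (which may require redirecting stderr as well, e.g. with 2>&1), the ID can be pulled out with grep. The function name extract_app_id is hypothetical, and the pattern simply matches the usual application_<timestamp>_<sequence> shape.

```shell
# Print the first YARN application ID found in a log file.
# Hypothetical helper: assumes the ID appears somewhere in the file
# in the usual form application_<digits>_<digits>.
extract_app_id() {
  # -o prints only the matching part, -m 1 stops after the first matching line;
  # head keeps a single ID if that line happens to contain several.
  grep -o -m 1 'application_[0-9][0-9]*_[0-9][0-9]*' "$1" | head -n 1
}
```

For example, `yarn logs -applicationId "$(extract_app_id /tmp/TestApp.log)"` would then fetch the logs without copying the ID by hand. If the client log is not available, `yarn application -list` is another way to look the ID up.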

Answer 3

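To print only the last 1000 bytes of the stdout log (a negative -size value reads from the end of the file):

yarn logs -applicationId application_1648123761230_0106 -log_files stdout -size -1000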

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/use_the_yarn_cli_to_view_logs_for_running_applications.html
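This command prints a snapshot rather than following the log. One rough way to approximate tail -f is to re-run it on an interval, much like watch does. The sketch below assumes a yarn CLI that supports -log_files and -size as used above; the function name poll_yarn_log and its defaults are hypothetical, and each poll reprints the last N bytes instead of only the new lines.

```shell
# Re-run `yarn logs` periodically to approximate following a log.
# Usage: poll_yarn_log <application-id> [bytes] [interval-seconds] [max-polls]
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
poll_yarn_log() (
  app_id="$1"
  bytes="${2:-1000}"      # how many trailing bytes to print on each poll
  interval="${3:-5}"      # seconds to wait between polls
  max_polls="${4:-0}"     # 0 means keep polling until interrupted (Ctrl-C)
  count=0

  while :; do
    # A negative -size asks for the last N bytes of each matching log file.
    yarn logs -applicationId "$app_id" -log_files stdout -size "-${bytes}"
    count=$((count + 1))
    if [ "$max_polls" -gt 0 ] && [ "$count" -ge "$max_polls" ]; then
      break
    fi
    sleep "$interval"
  done
)
```

Calling `poll_yarn_log application_1648123761230_0106 1000 10` would reprint the last 1000 bytes of stdout every 10 seconds. Each poll starts a new JVM for the yarn client, so very short intervals are likely to be wasteful.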

2 answers

solution1
2 2019-01-23 20:52:19

solution2
0 2019-01-23 21:15:58

solution3
0 2022-05-04 11:14:34
