I have a single-node Spark setup on machine A, and I run spark-submit from another machine, B. This is how I run spark-submit:
spark-submit \
--class com.foo.misc.spark.WordCount \
--master yarn \
--deploy-mode cluster \
--executor-memory 1G \
--num-executors 5 \
wordcount.jar \
file:///root/input01.txt \
hdfs://os74gcc52-c6cfd5d5:9000/test/output9
This works fine, and I can see output9 generated with the word counts.
However, in the terminal where I ran spark-submit, I cannot find my own log messages. All I see is Spark's log output, like:
2018-11-07 15:41:36 INFO Client:54 - Application report for application_1541562152848_0010 (state: RUNNING)
2018-11-07 15:41:37 INFO Client:54 - Application report for application_1541562152848_0010 (state: RUNNING)
2018-11-07 15:41:38 INFO Client:54 - Application report for application_1541562152848_0010 (state: RUNNING)
2018-11-07 15:41:39 INFO Client:54 - Application report for application_1541562152848_0010 (state: RUNNING)
2018-11-07 15:41:40 INFO Client:54 - Application report for application_1541562152848_0010 (state: FINISHED)
This is how I log in WordCount.java:
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
...
public class WordCount {
    private static final Logger log = LogManager.getLogger(WordCount.class);

    public static void main(String[] args) {
        log.warn("start foooooooooooooooooooo");
        ...
Is this because I'm using cluster deploy mode, or is something else going on?
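Get the application ID of the Spark job from the ResourceManager (it also appears in the client output, e.g. application_1541562152848_0010). Use the yarn logs command with that application ID to fetch the logs. The messages you wrote through the log manager will be there. In cluster mode the driver runs inside a YARN container rather than in the spark-submit process, which is why its log output does not reach your terminal. If the application is submitted in client mode, I think these messages can be seen on the console while the job runs.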
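As a rough sketch of that lookup: the snippet below pulls the application ID out of one of the client report lines from the question and wraps the yarn logs call in a small helper. The helper name fetch_app_logs is made up for illustration. It assumes the yarn CLI on the machine is configured for the same cluster, and that log aggregation is enabled so the logs of a finished application can still be fetched.

```shell
# Sample line copied from the spark-submit client output in the question.
line='2018-11-07 15:41:40 INFO Client:54 - Application report for application_1541562152848_0010 (state: FINISHED)'

# Extract the application id (application_<cluster timestamp>_<sequence>).
app_id=$(printf '%s\n' "$line" | grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | head -n 1)
echo "$app_id"

# Hypothetical helper: fetch the container logs (driver and executors)
# for a given application id. Defined here but not called, since it
# needs a configured Hadoop/YARN client to run.
fetch_app_logs() {
  yarn logs -applicationId "$1"
}
```

On a machine with the Hadoop client set up, something like fetch_app_logs "$app_id" | grep 'start foo' should then show the line written by log.warn in the driver, provided the logs are available.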