
Apache Spark Stderr and Stdout

I am running spark-1.0.0, connected to a standalone Spark cluster with one master and two slaves. I ran wordcount.py via spark-submit; it reads data from HDFS and writes the results back to HDFS. So far everything is fine and the results are correctly written to HDFS. But what concerns me is that when I check stdout for each worker, it is empty. I don't know whether it is supposed to be empty? And I got the following in stderr:

stderr log page for Some(app-20140704174955-0002)

Spark Executor Command: "java" "-cp" "::/usr/local/spark-1.0.0/conf:/usr/local/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.2.1.jar:/usr/local/hadoop/conf" "-XX:MaxPermSize=128m" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@master:54477/user/CoarseGrainedScheduler" "0" "slave2" "1" "akka.tcp://sparkWorker@slave2:41483/user/Worker" "app-20140704174955-0002"
========================================


14/07/04 17:50:14 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave2:33758] -> [akka.tcp://spark@master:54477] disassociated! Shutting down.
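For context, the job is the classic word count. A minimal sketch of what such a wordcount.py can look like on Spark 1.0 (the HDFS paths, namenode port, and app name here are assumptions for illustration, not taken from the actual job):

# wordcount.py - illustrative sketch, not the asker's actual script
from pyspark import SparkContext

sc = SparkContext(appName="WordCount")

# Read from HDFS (assumed namenode address and input path)
lines = sc.textFile("hdfs://master:9000/input/data.txt")

# Classic word count: split lines, pair each word with 1, sum per word
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

# Write results back to HDFS (assumed output path; must not already exist)
counts.saveAsTextFile("hdfs://master:9000/output/wordcount")

sc.stop()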

Spark always writes everything, even INFO-level messages, to stderr. People seem to do this to avoid stdout buffering, which makes log output less predictable. It is an acceptable practice when it is known that an application's stdout will never be consumed by shell scripts, which is why it is especially common for logging.
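The stdout/stderr split described here is easy to demonstrate outside of Spark. A small Python sketch (illustrative only, not Spark code): real results go to stdout where a pipeline can consume them, while diagnostics go to stderr:

# stdout is for program output; stderr is for diagnostics.
# Running `python demo.py | wc -l` counts only the result line,
# because the log line goes to stderr.
import logging

# logging.StreamHandler (used by basicConfig) defaults to sys.stderr
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s - %(message)s")
log = logging.getLogger("demo")

log.info("diagnostic message: goes to stderr")  # invisible to pipelines
print("result line: goes to stdout")            # what a shell script consumes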

Try this in a log4j.properties file passed to Spark (or modify the default configuration under $SPARK_HOME/conf):

# Log to stdout and stderr
log4j.rootLogger=INFO, stdout, stderr

# Send TRACE - INFO level to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Threshold=TRACE
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.filter.filter1=org.apache.log4j.varia.LevelRangeFilter
log4j.appender.stdout.filter.filter1.levelMin=TRACE
log4j.appender.stdout.filter.filter1.levelMax=INFO
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Send WARN or higher to stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Threshold=WARN
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Change this to set Spark log level
log4j.logger.org.apache.spark=WARN
log4j.logger.org.apache.spark.util=ERROR
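To have Spark actually pick this file up for the driver and executors, one common approach (the path here is an assumption; adjust it to your deployment) is to point log4j at it through the extra Java options on spark-submit, e.g. --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" and --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties", shipping the file to the workers with --files if it is not already present on every node.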

Also, the progress bars shown at INFO level are sent to stderr.

Disable with

spark.ui.showConsoleProgress=false
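This property can go in conf/spark-defaults.conf or be passed on the command line as --conf spark.ui.showConsoleProgress=false. Note that the console progress bar was added in a Spark release later than the 1.0.0 used in the question, so it may not apply to that exact setup.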
