
Using Apache Flume to write logs from a MapReduce job into HDFS

I am trying to write logs from a MapReduce job into HDFS, using Apache Flume NG.

My environment:

  • Java 6
  • Log4j 1.2.16
  • Apache Hadoop 2.3.0
  • Apache Flume 1.4.0

Problem #1

I have created a simple MapReduce job as a Maven project and I use logger.info() in my classes. When my job completes I can see my log messages in the task's syslog file.

I would like to create my own log4j configuration and also write the logs to the console. How can I do this? Where do I have to put the log4j.properties file? Should I modify the global Hadoop conf/log4j.properties?

Problem #2

I would like to write the logs to HDFS, but I don't want to tail -f the syslog file and ship its whole content. I only want the logs from my own classes - the messages produced by logger.info().

Is this possible with Apache Flume NG? Or is there an easier way to do this?

My idea was to configure a Flume Log4j Appender in log4j.properties (pointing at, for example, localhost, port 44444). In the Flume NG configuration I would bind an Avro source to the same address and write the logs to HDFS through a memory channel.
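Roughly, I was thinking of something like this (just a sketch; the agent name, HDFS path and port are placeholders):

    # log4j.properties (in my job) - send my log events to the Flume Avro source
    log4j.rootLogger = INFO, flume
    log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
    log4j.appender.flume.Hostname = localhost
    log4j.appender.flume.Port = 44444

    # flume.conf - Avro source -> memory channel -> HDFS sink
    agent1.sources = avro-source
    agent1.channels = mem-channel
    agent1.sinks = hdfs-sink

    agent1.sources.avro-source.type = avro
    agent1.sources.avro-source.bind = 0.0.0.0
    agent1.sources.avro-source.port = 44444
    agent1.sources.avro-source.channels = mem-channel

    agent1.channels.mem-channel.type = memory
    agent1.channels.mem-channel.capacity = 10000

    agent1.sinks.hdfs-sink.type = hdfs
    agent1.sinks.hdfs-sink.channel = mem-channel
    agent1.sinks.hdfs-sink.hdfs.path = hdfs://namenode:8020/flume/mr-logs/%Y-%m-%d
    agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent1.sinks.hdfs-sink.hdfs.writeFormat = Text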

Is this a good solution?

Problem #1

Which console? Remember that the map and reduce tasks run in separate JVMs, usually on different nodes, so there is no single console. If you only want the logs from the driver, that is a simple log4j configuration.
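For the driver, a minimal log4j.properties on the driver's classpath is enough, something along these lines (the appender name and pattern are just an illustration):

    # log4j.properties - assumed to be on the driver's classpath
    log4j.rootLogger = INFO, console

    # Write log events to the driver's console (stderr)
    log4j.appender.console = org.apache.log4j.ConsoleAppender
    log4j.appender.console.Target = System.err
    log4j.appender.console.layout = org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern = %d{ISO8601} [%t] %-5p %c - %m%n

The task JVMs, however, keep using the log4j configuration that the NodeManagers ship with the job, which is why your messages end up in each task's syslog file.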

Problem #2

What you are attempting is generally a good solution. A Flume appender is available in the Log4j project: the Log4J 2 Flume Appender.

See http://logging.apache.org/log4j/2.x/manual/appenders.html#FlumeAppender. The other option is the Kite SDK.
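For illustration, the Log4J 2 Flume Appender is configured in log4j2.xml roughly like this (host, port and pattern are just example values; if no layout is given the appender falls back to an RFC5424 layout - see the manual linked above for all options):

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration status="warn">
      <Appenders>
        <!-- Ships log events as Avro to a Flume agent -->
        <Flume name="eventLogger" compress="false">
          <Agent host="localhost" port="44444"/>
          <PatternLayout pattern="%d [%t] %-5p %c - %m%n"/>
        </Flume>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="eventLogger"/>
        </Root>
      </Loggers>
    </Configuration>

Note that this means moving from Log4j 1.2 to Log4j 2; if you stay on Log4j 1.x, the equivalent is the Log4jAppender shipped with Flume itself, as in your sketch.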
