繁体   English   中英

无法将数据从水槽接收到hdfs hadoop以获取日志

[英]Unable to ingest data from flume to hdfs hadoop for logs

我正在使用以下配置将数据从日志文件推送到hdfs。

agent.channels.memory-channel.type = memory
agent.channels.memory-channel.capacity=5000
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt
agent.sources.tail-source.channels = memory-channel
agent.sinks.log-sink.channel = memory-channel
agent.sinks.log-sink.type = logger
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.batchSize=10
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data/log.txt
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.channels = memory-channel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink
agent.channels = memory-channel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink

我没有收到错误消息,但仍然无法在hdfs中找到输出。 关于中断,我可以看到接收器中断异常以及该日志文件的一些数据。 我正在运行以下命令:flume-ng代理--conf / etc / flume-ng / conf / --conf文件/etc/flume-ng/conf/flume.conf -Dflume.root.logger = DEBUG,控制台- n代理人

我有一个类似的问题

就我而言,它现在可以正常工作了下面是conf文件:

#Exec Source
execAgent.sources=e
execAgent.channels=memchannel
execAgent.sinks=HDFS
#channels
execAgent.channels.memchannel.type=file
execAgent.channels.memchannel.capacity = 20000
execAgent.channels.memchannel.transactionCapacity = 1000
#Define Source
execAgent.sources.e.type=org.apache.flume.source.ExecSource
execAgent.sources.e.channels=memchannel
execAgent.sources.e.shell=/bin/bash -c
execAgent.sources.e.fileHeader=false
execAgent.sources.e.fileSuffix=.txt
execAgent.sources.e.command=cat /home/sample.txt
#Define Sink
execAgent.sinks.HDFS.type=hdfs
execAgent.sinks.HDFS.hdfs.path=hdfs://localhost:8020/user/flume/
execAgent.sinks.HDFS.hdfs.fileType=DataStream
execAgent.sinks.HDFS.hdfs.writeFormat=Text
execAgent.sinks.HDFS.hdfs.batchSize=1000
execAgent.sinks.HDFS.hdfs.rollSize=268435
execAgent.sinks.HDFS.hdfs.rollInterval=0
#Bind Source Sink Channel
execAgent.sources.e.channels=memchannel
execAgent.sinks.HDFS.channel=memchannel
`

希望对您有所帮助。

我建议在将文件放入HDFS时使用前缀配置:

agent.sinks.hdfs-sink.hdfs.filePrefix = log.out

@bhavesh-您确定日志文件(agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt)始终附加数据吗? 由于您已将Tail命令与-F一起使用,因此仅将已更改的数据(文件内)转储到HDFS中

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM